The compiler is your best friend

(blog.daniel-beskin.com)

95 points | by based2 4 hours ago

15 comments

  • supermdguy 3 minutes ago
    > A common pattern would be to separate pure business logic from data fetching/writing. So instead of intertwining database calls with computation, you split into three separate phases: fetch, compute, store (a tiny ETL). First fetch all the data you need from a database, then you pass it to a (pure) function that produces some output, then pass the output of the pure function to a store procedure.

    Does anyone have any good resources on how to get better at doing "functional core imperative shell" style design? I've heard a lot about it, contrived examples make it seem like something I'd want, but I often find it's much more difficult in real-world cases.

    Random example from my codebase: I have a function that periodically sends out reminders for usage-based billing customers. It pulls customer metadata, checks the customer type, and then based on that it computes their latest usage charges, and then based on that it may trigger automatic balance top-ups or subscription overage emails (again, depending on the customer type). The code feels very messy and procedural, with business logic mixed with side effects, but I'm not sure where a natural separation point would be -- there's no way to "fetch all the data" up front.
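
    One possible separation point for an example like this, sketched in Rust (every type, trait, and function name below is hypothetical, not taken from the codebase described above): the pure core returns a description of the side effects to perform instead of performing them, and the shell stays a thin fetch/decide/execute loop per customer.

      enum CustomerType { Prepaid, Postpaid }

      struct Customer { id: u64, kind: CustomerType, balance_cents: i64 }
      struct Usage { charges_cents: i64 }

      // The pure core returns a *description* of side effects instead of performing them.
      enum Action {
          TopUpBalance { customer_id: u64, amount_cents: i64 },
          SendOverageEmail { customer_id: u64, charges_cents: i64 },
          Nothing,
      }

      // Hypothetical ports to the outside world; real implementations would hit the database and mailer.
      trait BillingDb {
          fn customers_with_usage_billing(&self) -> Vec<Customer>;
          fn latest_usage(&self, customer_id: u64) -> Usage;
          fn top_up(&self, customer_id: u64, amount_cents: i64);
      }
      trait Mailer {
          fn send_overage(&self, customer_id: u64, charges_cents: i64);
      }

      // Functional core: no I/O, all the customer-type logic, trivially unit-testable.
      fn decide(customer: &Customer, usage: &Usage) -> Action {
          match customer.kind {
              CustomerType::Prepaid if usage.charges_cents > customer.balance_cents => Action::TopUpBalance {
                  customer_id: customer.id,
                  amount_cents: usage.charges_cents - customer.balance_cents,
              },
              CustomerType::Postpaid if usage.charges_cents > 0 => Action::SendOverageEmail {
                  customer_id: customer.id,
                  charges_cents: usage.charges_cents,
              },
              _ => Action::Nothing,
          }
      }

      // Imperative shell: fetch, compute, store -- a tiny ETL per customer.
      fn run_reminders(db: &impl BillingDb, mailer: &impl Mailer) {
          for customer in db.customers_with_usage_billing() {
              let usage = db.latest_usage(customer.id);
              match decide(&customer, &usage) {
                  Action::TopUpBalance { customer_id, amount_cents } => db.top_up(customer_id, amount_cents),
                  Action::SendOverageEmail { customer_id, charges_cents } => mailer.send_overage(customer_id, charges_cents),
                  Action::Nothing => {}
              }
          }
      }

      fn main() {
          // The core can be exercised with plain structs and no database.
          let customer = Customer { id: 7, kind: CustomerType::Prepaid, balance_cents: 1_000 };
          let usage = Usage { charges_cents: 2_500 };
          match decide(&customer, &usage) {
              Action::TopUpBalance { amount_cents, .. } => println!("top up {amount_cents} cents"),
              _ => println!("nothing to do"),
          }
      }

    The shell cannot be made pure, but it can be made thin: the messy "depending on the customer type" logic lives entirely in `decide`.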

  • LegionMammal978 3 hours ago
    > How many times did you leave a comment on some branch of code stating "this CANNOT happen" and thrown an exception? Did you ever find yourself surprised when eventually it did happen? I know I did, since then I at least add some logs even if I think I'm sure that it really cannot happen.

    I'm not sure what the author expects the program to do when there's an internal logic error that has no known cause and no definite recovery path. Further down the article, the author suggests bubbling up the error with a result type, but you can only bubble it up so far before you have to get rid of it one way or another. Unless you bubble everything all the way to the top, but then you've just reinvented unchecked exceptions.

    At some level, the simplest thing to do is to give up and crash if things are no longer sane. After all, there's no guarantee that 'unreachable' recovery paths won't introduce further bugs or vulnerabilities. Logging can typically be done just fine within a top-level exception handler or panic handler in many languages.

    • thatoneengineer 56 minutes ago
      Ideally, if you can convince yourself something cannot happen, you can also convince the compiler, and get rid of the branch entirely by expressing the predicate as part of the type (or a function on the type, etc.).

      Language support for that varies. Rust is great, but not perfect. Typescript is surprisingly good in many cases. Enums and algebraic type systems are your friend. It'll never be 100% but it sure helps fill a lot of holes in the swiss cheese.

      Because there's no such thing as a purely internal error in a well-constructed program. Every "logic error" has to bottom out in data from outside the code eventually-- otherwise it could be refactored to be static. Client input is wrong? Error the request! Config doesn't parse? Better specify defaults! Network call fails? Yeah, you should have a plan for that.
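
      A small Rust sketch of that idea (hypothetical names): parse outside data into an enum at the boundary, and the "impossible" branch disappears from everything downstream.

        enum PaymentMethod { Card, Invoice }

        // The only place the bad value can appear is where outside data enters:
        // client input is wrong? Error the request.
        fn parse_payment_method(raw: &str) -> Result<PaymentMethod, String> {
            match raw {
                "card" => Ok(PaymentMethod::Card),
                "invoice" => Ok(PaymentMethod::Invoice),
                other => Err(format!("unknown payment method: {other}")),
            }
        }

        // Downstream code takes the enum; there is no branch left to label "this CANNOT happen".
        fn fee_cents(method: PaymentMethod, amount_cents: i64) -> i64 {
            match method {
                PaymentMethod::Card => amount_cents / 50, // e.g. a 2% card fee
                PaymentMethod::Invoice => 0,
            }
        }

        fn main() {
            let method = parse_payment_method("card").expect("rejected at the boundary");
            println!("fee: {} cents", fee_cents(method, 10_000));
        }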

    • skydhash 2 hours ago
      A comment "this CANNOT happen" has no value on itself. Unless you've formally verified the code (including its dependencies) and have the proof linked, such comments may as well be wishes and prayers.

      Yes, sometimes the compiler or the hardware has bugs that violate the premises you're operating on, but that's rare. Most impure algorithms (those with side effects and external systems) have documented failure cases.

      • JohnFen 2 hours ago
        > A comment "this CANNOT happen" has no value in itself.

        I think it does have some value: it makes clear an assumption the programmer made. I always appreciate it when I encounter comments that clarify assumptions made.

        • addaon 2 hours ago
          But if you spell that as `assert(false)` instead of as a comment, the intent is equally clear, and the behavior when you're wrong is well-defined.
          • JohnFen 2 hours ago
            I agree that including that assert along with the comment is much better. But the comment alone is better than nothing, so isn't without value.
          • eterm 2 hours ago
            Better yet, `assert(false, message)`, with the message what you would have written in the comment.
            • addaon 2 hours ago
              `assert(false)` is pronounced "this can never happen." It's reasonable to add a comment with /why/ this can never happen, but if that's all the comment would have said, a message adds no value.
              • eterm 2 hours ago
                Oh I agree, literally `assert(false, "This cannot happen")` is useless, but ensuring a message is always there encourages something more like `assert(false, "This implies the Foo is Barred, but we have the Qux to make sure it never is")`.

                Ensuring a message encourages people to state which assumptions were violated, rather than just asserting that some unspecified assumption no longer holds.

          • zffr 1 hour ago
            At least on iOS, asserts become no-ops on release builds
            • addaon 55 minutes ago
              You can (and probably should) undef NDEBUG even for release builds.
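
              A Rust sketch of the same two points (the invariant here is made up): `assert!` keeps its check and message in release builds, while `debug_assert!` behaves like an NDEBUG-stripped assert.

                fn apply_discount(price_cents: i64, discount_cents: i64) -> i64 {
                    // Checked in release builds too; the message names the violated assumption.
                    assert!(
                        discount_cents <= price_cents,
                        "discount {discount_cents} exceeds price {price_cents}; the quote validator should have rejected this upstream"
                    );
                    // Compiled out of release builds, so only suitable for cheap sanity checks.
                    debug_assert!(price_cents >= 0, "prices are stored as non-negative cents");
                    price_cents - discount_cents
                }

                fn main() {
                    println!("{}", apply_discount(10_000, 1_500));
                }
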
        • skydhash 2 hours ago
          Such comments rot so rapidly that they're an antipattern. Those assumptions are dangerous, and I would point them out in a PR.
          • LegionMammal978 2 hours ago
            Do you not make such a tacit assumption every time you index into an array (which in almost all languages throws an exception on bounds failure)? You always have to make assumptions that things stay consistent from one statement to the next, at least locally. Unless you use formal verification, but hardly anyone has the time and resources for that.
            • skydhash 2 hours ago
              If such an error happens, that would be a compiler bug. Why? Because I usually do checks against the length of the array or have it done as part of the standard functions like `map`. I don't write such assumptions unless I'm really sure about the statements, and even then I don't.
              • LegionMammal978 1 hour ago
                > or have it done as part of the standard functions like `map`.

                Which are all well and good when they're applicable, which is not 100% of the time.

                > Because I usually do checks against the length of the array

                And what do you have your code do if such "checks" fail? Throw an assertion error? That's my whole point: I'm advocating in favor of sanity-check exceptions.

                Or does calling them "checks" instead of "assumptions" magically make them less brittle from surrounding code changes?
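
                A Rust sketch of the alternatives being debated, for a hypothetical lookup: an explicit check whose failure branch still has to do something, versus returning an Option and pushing the decision to the caller.

                  // Plain indexing (`xs[1]`) does the same bounds check but panics on failure --
                  // an implicit "this cannot happen" assumption.
                  fn second_item(xs: &[i32]) -> i32 {
                      if xs.len() > 1 {
                          xs[1]
                      } else {
                          // The question above: what should this branch do if it is ever hit?
                          unreachable!("callers are required to pass at least two items")
                      }
                  }

                  // Pushing the decision to the caller instead of asserting:
                  fn second_item_checked(xs: &[i32]) -> Option<i32> {
                      xs.get(1).copied()
                  }

                  fn main() {
                      let xs = [10, 20, 30];
                      println!("{}", second_item(&xs));
                      println!("{:?}", second_item_checked(&xs));
                  }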

              • tosapple 1 hour ago
                How does one defend against cosmic rays?

                Keep two copies or three like RAID?

                Edit: ECC RAM helps for sure, but what else?

              • awesome_dude 1 hour ago
                Do you really have code that's

                if array.Len > 2 { X = array[1] }

                For every CRUD to that array?

                That seems... not ideal

      • threethirtytwo 1 hour ago
        False, it does have value. It’s actually even better to log it or throw an exception: print(“this cannot happen.”)

        If you see it, you immediately know the class of error: it's purely a logic error, a programming mistake the programmer made. Logging it makes it explicit that your program has a logic bug.

        What if you didn’t log it? Then at runtime you will have to deduce the error from symptoms. The log tells you explicitly what the error is.

      • AnimalMuppet 2 hours ago
        Worse: you may have created the proof. You may have linked to the proof. But if anyone has touched any of the code involved since then, it still has no value unless someone has re-done the proof and linked that. (Worse, it has negative value, because it can mislead.)
        • skydhash 1 hour ago
          Not really. A quick git blame (or alternative) will give you the required information about the validity of such a proof.
          • dullcrisp 1 hour ago
            You must have some git plugin I haven’t heard about.
    • the__alchemist 2 hours ago
      This is what Rust's `unreachable!()` is for... and I feel hubris whenever I use it.
      • tialaramex 32 minutes ago
        You should prefer to write unreachable!("because ...") to explain to some future maintenance engineer (maybe yourself) why you believed this would never be reached. Since they know it was reached, they can compare what you believed against their observed facts and likely make better decisions.

        But at least telling people that the programmer believed this could never happen short-circuits their investigation considerably.

    • GabrielBRAA 2 hours ago
      Heh, recently I had to fix a bug in some code that had one of these comments. It feels like a sign of bad code or laziness. Why make a path that should not happen? I can get it when it's in some while loop that should find something to return, but in an if-else sequence it feels really wrong.
      • kccqzy 1 hour ago
        Strong disagree about laziness. If the dev is lazy they will not make a path for it. When they are not lazy they actually make a path and write a comment explaining why they think this is unreachable. Taking the time to write a comment is not a sign of laziness. It’s the complete opposite. You can debate whether the comment is detailed enough to convey why the dev thinks it’s unreachable, but it’s infinitely better than no comment and leaving the unreachability in their head.
  • kridsdale1 2 hours ago
    I really like modern Swift. It makes a lot of what this author is complaining about impossible.

    The worst file I ever inherited to work on was the ObjC class for Instagram’s User Profile page. It looked like it’d been written by a JavaScript fan. There were no types in the whole file, everything was an ‘id’ (aka void*) and there were ‘isKindOfClass’ and null checks all over the place. I wanted to quit when I saw it. (I soon did).

    • JackYoustra 1 hour ago
      Modern Swift makes this technically possible, but so cluttered that it's effectively impossible, especially compared with TypeScript.

      Swift distinguishes between inclusive and exclusive/exhaustive unions with enums vs. protocols and provides no easy or simple way to bridge between the two. If you want to define something that TypeScript provides as easily as the vertical bar, you have to write an enum definition, a protocol bridge with a type identifier, a necessarily unchecked cast back (even if you can logically prove that the type enum has a 1:1 mapping), and loads of unnecessary forwarding code. You can try to elide some of it with @dynamicMemberLookup (IIRC, it's been a couple of years), but the compiler often chokes on this, it kills autocomplete, and it explodes compile times, because Swift's type checker degrades to exponential far more frequently than other languages', especially in practice, such as in SwiftUI.

      • tizio13 53 minutes ago
        I think you’re conflating 'conciseness' with 'correctness.' The 'clutter' you're describing in Swift, like having to explicitly define an enum instead of using a vertical bar |, is exactly what makes it more robust than TS for large-scale systems.

        In TypeScript a union like string | number is structural and convenient, but it lacks semantic meaning. In Swift, by defining an enum, you give those states a name and a purpose. This forces you to handle cases exhaustively and intentionally. When you're dealing with a massive codebase, 'easy' type bridging is often how you end up back in 'id' or 'any' hell. Swift’s compiler yelling at you is usually it trying to tell you that your logic is too ambiguous to be compiled safely, which, in a safety-first language, is the compiler doing its job.

    • glenjamin 1 hour ago
      Any advice on how to learn modern Swift?

      When I tried to learn some to put together a little app, every search result for my questions was a quick blog post seemingly aimed at iOS devs who didn’t want to learn and just wanted to copy-paste the answer - usually in the form of an extension method.

  • WalterBright 10 minutes ago
    > How many times did you leave a comment on some branch of code stating "this CANNOT happen" and thrown an exception?

    My code is peppered with `assert(0)` for cases that should never happen. When one trips, I figure out why it happened and fix it.

    This is basic programming technique.

  • smj-edison 1 hour ago
    > Rust makes it possible to safely manage memory without using a garbage collector, probably one of the biggest pain points of using low-level languages like C and C++. It boils down to the fact that many of the common memory issues that we can experience, things like dangling pointers, double freeing memory, and data races, all stem from the same thing: uncontrolled sharing of mutable state.

    Minor nit: this should be mutable state and lifetimes. I worked with Rust for two years before recently working with Zig, and I have to say opt-in explicit lifetimes without XOR mutability requirements would be a nice combo.
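
    For reference, a minimal sketch of the two mechanisms being contrasted here; the lines the borrow checker rejects are left commented out so the example builds.

      fn main() {
          let mut v = vec![1, 2, 3];

          // Aliasing XOR mutability: a shared borrow and a mutable borrow of `v`
          // may not be live at the same time.
          let first = &v[0];
          // v.push(4);        // error[E0502]: cannot borrow `v` as mutable...
          println!("{first}"); // ...because `first` is still in use here.
          v.push(4);           // fine once the shared borrow is no longer used

          // Lifetimes: a reference may not outlive the data it points into.
          // let dangling = {
          //     let s = String::from("temporary");
          //     &s            // error[E0597]: `s` does not live long enough
          // };

          println!("{v:?}");
      }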

  • jez 13 minutes ago
    For another perspective on "lying to the compiler," I enjoyed the section on Loopholes in Niklaus Wirth's "Good Ideas, Through the Looking Glass"[1]. An excerpt:

    Experience showed that normal users will not shy away from using the loophole, but rather enthusiastically grab on to it as a wonderful feature that they use wherever possible. This is particularly so if manuals caution against its use.

    [...]

    The presence of a loophole facility usually points to a deficiency in the language proper, revealing that certain things could not be expressed.

    Wirth's use of loophole most closely aligns with the unchecked casts that the article uses. I don't think exceptions amount to lying to the compiler; they amount more to assuming for the sake of contradiction, which is not quite lying (e.g., AFSOC is a valid proof technique, but proofs can be wrong). Null as a form of lying is not the fault of the programmer; that's more the fault of the language, so again it doesn't feel like lying.

    [1] https://people.inf.ethz.ch/wirth/Articles/GoodIdeas.pdf

  • doug_durham 1 hour ago
    Typing is great, presuming that the developer did a thorough job of defining their type system. If they get the model wrong, or it is incomplete, then you aren't really gaining much from a strictly typed language. Every change is a fight. You are likely to hack the model to make the code compile. There is a reason that Rust is most successful at low-level code: that is where the models are concrete and simple to create. As you move up the stack, complexity increases and the ability to create a coherent model goes beyond human abilities. That's why coding isn't math or religion. Different languages and approaches for different domains.
  • onionisafruit 1 hour ago
    He mentions this is from a podcast. Anybody know what podcast? It seems like something I might like to listen to.
  • barishnamazov 2 hours ago
    The "lies" described here are essentially the definition of weakly typed programming, even in statically typed languages.

    Functional languages like ML, Haskell, and Lisp dialects have had no such lies built in for decades, and it's good to see mainstream languages (Java, TS, C++, etc.) catching up as well.

    There are also cute benefits to having strong schemas for your API -- for example, that endpoint becomes an MCP for LLMs automatically.

  • skydhash 2 hours ago
    The whole article gives a generated vibe, but I did want to point out this particular snippet

    > The compiler is always angry. It's always yelling at us for no good reason. It's only happy when we surrender to it and do what it tells us to do. Why do we agree to such an abusive relationship?

    Programming languages are a formal notation for the execution steps of a computing machine. A formal system is always built around rules, and not following the rules is an error, in this case a malformed statement/expression. It's like writing: afjdla lkwcn oqbcn. Yes, those are characters, but they're not English words.

    Apart from the syntax, which is a formal system on its own, the compiler may have additional rules (like a type system). And you can add even more rules with a static analysis tool (linter). Even though there may be false positives, failing one of those usually means that what you wrote is meaningless in some way. It may run, but it can have unexpected behavior.

    Natural language has a lot of tolerance for ambiguous statements (which people may not even be aware of if they share the same metaphor set). But a computer has none: you either follow the rules or you don't, and then you have an error.

    • xnorswap 2 hours ago
      I also don't like that phrasing. It's like complaining about guard rails while running around erratically.

      The guard rails aren't abusing you, they're helping you. They aren't "angry", they're just constraints.

      • scubbo 2 hours ago
        Right, and I suspect that was the author's intent - to evoke a sympathetic frustration that newer programmers might feel, and then to point out how the frustration is ill-aimed.
  • ZebusJesus 1 hour ago
    This was a great breakdown and very well written. I think you made one of the better arguments for Rust I've read on the internet, but you also made sure to acknowledge that large code bases are just a different beast altogether. Personally, I will say that AI has made writing code proofs or "formal verification" more accessible. Actually writing a proof for your code, or verifying code, is very hard for most programmers, which is why most programmers don't do it, but AI is making it accessible, and with formal verification of code you prevent so many problems. It will be interesting to see where programming and compilers go when "formal verification" becomes normal.
  • shevy-java 10 minutes ago
    So his view is from a programmer / developer. That's fine.

    I had an issue on my local computer system yesterday; Manjaro would not boot with a new kernel I compiled from source. It would freeze at the boot menu, which had never happened to me before. Anyway. I installed Linux Mint today and went on to actually compile a multitude of things from source. I finally finished compiling mesa, xorg-server, ffmpeg, mpv, gtk3 + gtk4 - and the prior dependencies (llvm etc...). So I am almost finished, finally.

    I had to invest quite a lot of time hunting for dependencies. The most recent one was glad2 for libplacebo. It turns out "pip install glad2" suffices here, but getting there wasn't so trivial. The project's page on the pip website was virtually useless; rather, I had first installed "pip install glad", which was too old. It also took me perhaps one full minute or more to realise it.

    I am tapping into the LFS and BLFS webpages (Linux From Scratch), which helps a lot, but it is not perfect. So much information is not described, and people have to know what they are doing. You can say this is fair, as this is more for advanced users. Ok. The problem is ... so many things that compilers do are not well described, or at the least you cannot easily find high-quality documentation. Google search is almost virtually useless now; AI just hallucinates and flat out lies to you often, or tells you things that are trivia and that you already know. We kind of lose quality here. It's as if everything got dumbed down.

    Meanwhile more and more software is required to build other software. Take mesa: now I need not only LLVM but also the whole SPIR-V stack. And shaderc. And lots more. And also Rust - why is Rust suddenly such a huge dependency? Why is there such a proliferation of programming languages? Ok, perhaps C and C++ are no longer the best languages, but WHY is the whole stack constantly expanding?

    We worship complexity. The compilers also become bigger and bigger.

    About two days ago I cloned gcc from https://github.com/gcc-mirror/gcc. The .tar.xz sits at 3.8 GB. Granted, regular tarball releases are much smaller, e.g. the 15.1.0 tar.xz at 97 MB (at https://ftp.gnu.org/gnu/gcc/?C=M;O=D). But still, these things become bigger and bigger. gcc-7.2.0.tar.xz from 9 years ago had a size of 59 MB. Almost twice the size now, in less than 10 years. And that's really just like all the other software too. We ended up worshipping more and more bloat. Nobody cares about size. Now one can say "this is just static code", but this is expanded and it just keeps on getting bigger. Look at LLVM. How to compile this beast: https://www.linuxfromscratch.org/blfs/view/svn/general/llvm.... - and this will only get bigger and bigger and bigger.

    So, back to "is the compiler your best friend"? I am not sure. We seem to have the problem of more and more complexity coming in at the same time. And everyone seems to think this is no issue. I believe there are issues. Take Slackware: basically one person maintains it. This may not be the primary reason, but Slackware slowed down a lot over the last few years. Perhaps maintaining all of that requires a team of people. Older engineers cared about size due to constraints. Now that the constraints are less important, bloat has become the default.

  • moth-fuzz 1 hour ago
    I'm not a fan of the recent trend in software development, started by the OOP craze but in the modern day largely driven by Rust advocates, of noun-based programming, where type hierarchies are the primary interface between the programmer and the code, rather than the data or the instructions. It's just so... dogmatic. Inexpressive. It ultimately feels to me like a barrier between intention and reality, another abstraction. The type system is the program, rather than the program being the program. But speaking of dogma, the author's insistence that not abiding by this noun-based programming model is a form of 'lying' is quite the accusatory stretch of language... but I digress at the notion that I might just be a hit dog hollering.
    • kccqzy 30 minutes ago
      The kind of noun-based programming you don’t like is great for large teams and large code bases, where there is an inherent communication barrier based on the number of people involved. (N choose 2 = N*(N-1)/2, so it grows quadratically.) Type hierarchies need to be the primary interface between the programmers and the code because they communicate invariants on the data more precisely than words. It is dogmatic, because that’s the only way it could work for large teams.

      When you are the only programmer, this matters way less. Just do whatever based on your personal taste.

  • mock-possum 2 hours ago
    Hello Baader-Meinhof, my old friend - while I’m familiar with the convention, I was formally introduced to the phrase “functional core, imperative shell” just the other day, and now here it is again.

    “Learn to stop worrying and love the bomb” was definitely a process I had to go through moving from JavaScript to Typescript, but I do mostly agree with the author here wrt convention. Some things, like using type names as additional levels of context - UserUUID and ItemUUID each alias UUID, which in turn is just an alias for String - have occurred to me naturally, even.
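
    A Rust sketch of that convention with hypothetical types: plain aliases only document intent, while newtype wrappers make the compiler enforce the distinction.

      type Uuid = String;

      // A plain alias (`type UserUuid = Uuid;`) would be interchangeable with any other
      // alias of Uuid. Newtype wrappers are kept distinct by the compiler.
      struct UserUuid(Uuid);
      struct ItemUuid(Uuid);

      fn charge_user(user: &UserUuid, item: &ItemUuid) {
          println!("charging user {} for item {}", user.0, item.0);
      }

      fn main() {
          let user = UserUuid("u-123".to_string());
          let item = ItemUuid("i-456".to_string());
          charge_user(&user, &item);
          // charge_user(&item, &user); // error[E0308]: mismatched types
      }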

  • pointbob 1 hour ago
    [dead]