Saving your sacred cows

📅 2023-01-09 ⏳ 8 min read

This is a rant on the “An open letter to language designers: Kill your sacred cows” post. Please read it first if you want any of this to make sense.

The post is more than 10 years old, and a lot of things have changed since then. It’s interesting to see how some of the author’s hopes still haven’t come true, and probably never will.

Also, I promised to have some Rust love in this blog, so here we go.

🔗Cows #1-4

I read through the first paragraphs duly nodding. The idea is thought-provoking: instead of plain-text files, let’s store AST-like binary structures on disk.

In terms of Rust, we could take it even further and store the MIR (with some additional metadata?), which would save us a lot of time compiling dependencies and running things like rust-analyzer. It’s interesting to consider how this would affect cross-edition compatibility.

However, I see a few obvious pitfalls:

  • You can’t save a file that doesn’t compile or can’t be parsed.
  • Saving now takes additional time, as the sources must be parsed and pre-compiled first.
    Together, these two points lead to data loss when the parser or editor crashes or is force-quit.
  • In general, compilation flows one way, e.g. source code → AST → bytecode and so on. With the proposed approach, a compiler now needs to deal with a two-way transformation:
    source code ⇄ AST (save / load), then AST → bytecode → … (compile)
    And these transformations must be lossless, otherwise a developer will see the source code change on save.
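To make the “lossless” worry concrete, here is a minimal sketch of such a round trip. Everything in it (the `Expr` type, `parse`, `render`) is a toy I made up for illustration, not any real compiler’s format: “save” parses the text into an AST, “load” prints it back in one canonical style.

```rust
// Hypothetical toy AST for expressions like `1 + 2 + 3`.
#[derive(Debug, PartialEq)]
enum Expr {
    Num(i64),
    Add(Box<Expr>, Box<Expr>),
}

// "Save": parse source text into the AST that would be stored on disk.
// Handles only integers separated by `+`; whitespace is discarded.
fn parse(src: &str) -> Option<Expr> {
    let mut terms = src.split('+').map(|t| t.trim().parse::<i64>().ok());
    let mut expr = Expr::Num(terms.next()??);
    for term in terms {
        expr = Expr::Add(Box::new(expr), Box::new(Expr::Num(term?)));
    }
    Some(expr)
}

// "Load": print the AST back as source text in one canonical style.
fn render(expr: &Expr) -> String {
    match expr {
        Expr::Num(n) => n.to_string(),
        Expr::Add(l, r) => format!("{} + {}", render(l), render(r)),
    }
}

fn main() {
    let typed = "1+2 +   3";
    let ast = parse(typed).expect("valid expression");
    let reloaded = render(&ast);
    // The formatting the developer typed does not survive the round trip...
    assert_ne!(reloaded, typed);
    // ...even though the meaning does.
    assert_eq!(parse(&reloaded), Some(ast));
    // Text that does not parse has no AST at all, so there is nothing to save.
    assert_eq!(parse("1 +"), None);
    println!("{typed:?} -> {reloaded:?}");
}
```

Keeping the original spacing would mean storing it in the AST as well, which is exactly the kind of extra complexity I’m talking about.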

Is the increased complexity worth the benefits? Maybe! My point here: it’s not that obvious whether the cows are worth killing.

🔗Cow #5. What the world needs is a better type system

This is where the author started to lose me.

No one will say: “Woah! Check out the type system on that one!”

At my previous job we had a team of people who were really into Haskell. One of them even relocated from the US to Finland to work there. I could swear I heard these exact words walking by their place 😂

The next sentence read:

What the world needs is good modules. An escape from classpath hell.

I don’t follow how modules are related to the type system. They seem rather orthogonal to me.

The ability to add a new module and know it will work without modification and without breaking something else.

Err… I’ve never been in a situation where adding a brand new module would break existing ones. Why and how could this happen?

Maybe it’s possible in highly dynamic languages like Ruby (or Scala?), where a module can redefine standard functions or operators. But I’m pretty sure it’s considered bad practice and you’ll hardly face it in the wild 🤞.

Okay, let’s go further.

My code won’t get magically faster and bug free through your awesome type system.

Emm… It literally will? Rust’s type system eliminates the majority of memory-related bugs.

In any strictly-typed language you don’t need to write tests to verify that every function correctly handles all possible input types.
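A small sketch of what I mean, in Rust. The `Shape` type and `area` function are hypothetical, made up for this example; the point is that the signature and the exhaustive `match` already cover the “what if someone passes the wrong thing” cases that would otherwise need tests.

```rust
// Hypothetical domain type, used only for illustration.
enum Shape {
    Circle { radius: f64 },
    Rect { width: f64, height: f64 },
}

// The signature rules out strings, nulls, or missing fields at compile time,
// and the `match` must cover every variant or the program will not build.
fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rect { width, height } => width * height,
    }
}

fn main() {
    let shapes = [
        Shape::Circle { radius: 1.0 },
        Shape::Rect { width: 2.0, height: 3.0 },
    ];
    for shape in &shapes {
        println!("{:.2}", area(shape));
    }
    // area("circle"); // rejected by the compiler: expected `&Shape`, found `&str`
}
```

You still need tests for the logic itself (is the formula right?), just not for the shape of the input.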

The code will also be faster at runtime, since the compiler can make amazing optimizations based on the type knowledge.
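One small, checkable instance of the compiler exploiting type knowledge is Rust’s layout optimization for `Option`. This is only a sketch of a single optimization, not a benchmark: because the types say a `Box` is never null and a `NonZeroU32` is never zero, `None` can reuse that forbidden bit pattern instead of taking extra space.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

fn main() {
    // A Box can never be null, so the null bit pattern is free to encode `None`.
    assert_eq!(size_of::<Option<Box<u64>>>(), size_of::<Box<u64>>());
    // Same idea: zero is not a valid NonZeroU32, so it encodes `None`.
    assert_eq!(size_of::<Option<NonZeroU32>>(), size_of::<u32>());
    // A plain u32 has no spare bit pattern, so the Option needs extra space.
    assert!(size_of::<Option<u32>>() > size_of::<u32>());
    println!("all layout checks passed");
}
```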

🔗Cow #6. We must not let programmers extend the syntax because they might do bad things

No, no, no, no. No. This is happening with Kotlin right now. Let’s open a Gradle build file:

plugins {
    id("application")
}

application {
    mainClass.set("org.sample.myapp.Main")
}

dependencies {
    implementation("org.sample:number-utils:1.0")
    implementation("org.sample:string-utils:1.0")
}

This is valid Kotlin code. But it doesn’t look anything like “common” code in that language:

fun main() {
    val name = "stranger"
    println("Hi, $name!")
    print("Current count:")
    for (i in 0..10) {
        print(" $i")
    }
}

Knowing Kotlin doesn’t give you the slightest hint about what Gradle is doing, even though it’s the very same language.

Giving developers the ability to easily come up with custom syntax will fracture the ecosystem. To mitigate that, people will write “best practices” (remember “Clean Code”?) for everyone to follow. We’ve walked this path before, and modern languages are pretty strict in this regard for a very good reason.

Let’s get back to the post…

We are adults.

Not everyone. There are kids just learning programming, and teenagers writing amazing and popular libraries. There are billions of people out there, and if everyone spoke a dialect of their own, nobody would understand anything. We all need common ground, a shared language to build a solid ecosystem upon.

The amazing success of JavaScript has been because it was so flexible that real world programmers could try different things and see what works, in the real world (seeing a common theme here?)

No, the amazing success of JavaScript has been because it’s the only language supported by browsers. Trying to avoid it, people came up with things like TypeScript, PureScript, Dart, Elm and hundreds of others (seeing a common typed theme here?).

Others have been trying to make the experience with JS smoother by writing frameworks like React, Angular, Vue and thousands of others (most of which failed).

🔗Cow #7. Compiled vs Interpreted. Static vs Dynamic

Nobody cares.

This is where I choked on my tea. Ask an embedded developer to write firmware in JavaScript. Ask a researcher to create a prototype in Haskell. Observe their eye twitching and hand reaching for a revolver.

People DO care about these things.

I agree, there is some middle ground where it’s not that important. But as a language designer, you must consider where and how your language will be used, so this is an important decision.

I’ll go a bit further and say that nowadays “compiled” languages hold the upper hand in the industry. It’s easier to distribute and version binaries than a set of scripts, so modern languages like Go, Rust, Nim and Zig took this approach (among other reasons).

🔗Cow #8. Garbage collection is bad

The points the author makes are completely valid:

  • GC increased programmer productivity tremendously
  • GC eliminated an entire class of bugs
  • GC systems often have bad nondeterministic performance (me: and resource consumption)

However, the solution doesn’t make sense to me. To know where GC will be triggered, you must know the load profile of an application running in production. You can’t reliably figure out, just by looking at the code, which textures will be used often and which won’t. A developer can make assumptions, but most often they’ll turn out to be false.

Detecting long-lived objects is a solved issue with modern GCs. If an object survives a couple of GC runs, it is moved to the “Old Generation” and is rarely touched by GC again. The problem isn’t there.
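For the curious, the promotion rule can be modelled in a few lines. This is a toy sketch only: the `Heap` type, the `live` flag standing in for reachability, and the threshold of two runs are my own assumptions for illustration, not how any real JVM collector is implemented.

```rust
// Toy model of generational promotion (illustrative assumptions throughout).
const PROMOTE_AFTER: u32 = 2; // "a couple of GC runs"

struct Obj {
    id: u32,
    survived: u32, // number of minor collections survived so far
    live: bool,    // stand-in for "still reachable"
}

#[derive(Default)]
struct Heap {
    young: Vec<Obj>,
    old: Vec<Obj>,
}

impl Heap {
    fn alloc(&mut self, id: u32) {
        self.young.push(Obj { id, survived: 0, live: true });
    }

    // Mark an object as unreachable, wherever it currently lives.
    fn release(&mut self, id: u32) {
        for obj in self.young.iter_mut().chain(self.old.iter_mut()) {
            if obj.id == id {
                obj.live = false;
            }
        }
    }

    // Minor collection: scans only the young generation.
    fn minor_gc(&mut self) {
        for mut obj in std::mem::take(&mut self.young) {
            if !obj.live {
                continue; // garbage: dropped here
            }
            obj.survived += 1;
            if obj.survived >= PROMOTE_AFTER {
                self.old.push(obj); // promoted, no longer scanned by minor GCs
            } else {
                self.young.push(obj);
            }
        }
    }
}

fn main() {
    let mut heap = Heap::default();
    heap.alloc(1);
    heap.alloc(2);
    heap.release(2);
    heap.minor_gc(); // 2 is collected, 1 survives its first run
    heap.alloc(3);
    heap.minor_gc(); // 1 survives its second run and is promoted
    println!(
        "young: {:?}, old: {:?}",
        heap.young.iter().map(|o| o.id).collect::<Vec<_>>(),
        heap.old.iter().map(|o| o.id).collect::<Vec<_>>()
    );
}
```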

Recently, Java almost simultaneously introduced two new GCs: Shenandoah and Epsilon. The former is state of the art: concurrent, with incredible performance and very low pauses that are not proportional to the size of the heap. The latter is a “fake GC”: it doesn’t do anything, and the heap grows forever. The need for Epsilon shows that even a very advanced GC can still cause issues.

Java also has the greatest visualization and profiling tools, and a big ecosystem built around performance analysis.

All this is not enough to say “GC is not a problem anymore”. On every job where I programmed for the JVM, I had to tune garbage collectors, look at gigabytes of GC logs, and rewrite parts of the code to spawn fewer objects.

We can rebuild this GC cow. We have the technology.

Ten years later, with tons of corporate money burnt, we still can’t. Sorry.

Imagine your car’s engine can’t burn fuel efficiently, so a lot of it pours out of the exhaust pipe. To mitigate it, a bucket is attached to the pipe, and you stop the car once in a while to transfer gas from the bucket back to the gas tank. Tuning GC is like deciding how often you need to stop and what bucket size to use. It’s fine for a prototype model, but for everyday use, I’d prefer the engine to be fixed.

🔗In memoriam

Once again, you really shouldn’t be building a new language anyway, we have more than we need. Please, please, please, don’t make a new language.

No. Go 1.0 was released in 2012, Rust 1.0 in 2015. Both came out after the post was published and this quote was written. These languages have become massively popular, and one of them found its way into the Linux kernel, infiltrating a fortress that had been unassailable for 30 years.

But nobody feels like these languages are perfect. I love Rust, but I’ll gladly jump on its successor’s train if it smooths over the major pain points, simplifies the language and fixes its inborn flaws.

Please, make a new language, if you feel like it. At the very least, you will learn a lot. At best, you’ll create a project that will help people solve their problems for decades.