
> unlike full-JVM Clojure it has a very fast startup time

I can't believe that after all these years Java still hasn't fixed its startup time.



The JVM cold-starts, loads a Hello, World program from a compressed JAR, runs it and shuts down in 40ms. But Clojure compiles quite a bit of Clojure code generating hundreds if not thousands of classes and then loads them before starting up the Clojure program. Still, there's ongoing work on the JVM (Project Leyden [1]) to speed up both startup and warmup even of such programs by caching more state.

[1]: E.g. see https://spring.io/blog/2023/10/16/runtime-efficiency-with-sp...


> The JVM cold-starts, loads a Hello, World program from a compressed JAR, runs it and shuts down in 40ms.

I'm yet to reach the X-Men-level super qualities needed to detect and work in 40 ms chunks, or even to notice a 40 ms delay.

I envy the humans who can notice such small chunks of time.


Right, but the problem is that many programs do a lot more work when they start up than Hello, World. The Clojure runtime in particular does quite a lot at startup.


I don't like how this issue is constantly dismissed out of hand. It is very obviously a massive, glaring issue which severely limits what Clojure can reasonably be used for. Nobody would ever accept a 1 second startup time for CLI applications that we use all the time like git, kubectl, npm, docker, etc.


npm, docker, kubectl: honestly, you are splitting hairs. I plan to run a process for hours, so I think 40 ms delays are something I can live with.

git: I take too long writing the commit message to worry about 40 ms.

I mean, sure, optimise code to run fast. But it's not something that a human notices.


40ms? Clojure Hello World startup time is like 600ms and up. Yes sometimes you run npm and kubectl for hours but most of my usage is just running individual commands that take less than a second.


You still don't see many scripts or CLI apps in Java though. I was wondering if that would change now that you can run a source file directly, but then wouldn't the startup be slower, because now it also has to compile?


Why can't you do some JVM equivalent of memcopy/execve the starting state of the program?

Isn't the initialization procedure (or at least the vast majority of it) exactly the same at each run?


That's pretty much the idea behind Project Leyden's "premain" work. The tricky bit is that the program startup being almost exactly the same each time isn't quite the same as being exactly the same. The JVM already caches some things and the capabilities of that mechanism are being expanded to cover more, including JITted code as well as some program computations done at initialisation.


Back when I wrote Clojure professionally, using GraalVM to generate a native executable of things like clj-kondo basically eliminated the startup latency.


Right, but Native Image comes with its own tradeoffs. Leyden's "premain" work aims to be somewhere between the situation today and Native Image.


It's a tradeoff. The startup time of Clojure on the JVM is slower than most, but the runtime is faster than most. It also needs a lot of memory to get going. This means it's optimized toward long-running programs like web servers. During development, you use the REPL interactively which makes this a non-issue, but it does take some getting used to at first.

That said, there are alternative runtimes that have different tradeoffs. For example, Babashka is a runtime for Clojure that uses GraalVM instead of the JVM as the foundation. Babashka scripts have about a 10ms startup time on my M1 MacBook Air.


My understanding is that this isn’t really a JVM thing, but I might be wrong.


Your understanding is correct. JVM startup accounts for a small portion of Clojure's startup time.[1] Most of the overhead is in compiling and loading clojure.core. Efforts have been made to remedy this issue[2] (e.g. direct linking, ahead-of-time compilation, ...), but this doesn't remove the fact that clojure.core is huge and virtually any Clojure program will be importing more than just clojure.core, so there is still a lot of var derefs to deal with, then code to compile, then classes to load, etc.

[1] https://clojure-goes-fast.com/blog/clojures-slow-start/

[2] https://clojure.org/reference/compilation
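To make the "var deref" versus "direct linking" distinction concrete, here is a rough Java sketch. It is not Clojure's actual implementation: `SketchVar`, `LinkingSketch` and their methods are hypothetical names invented for illustration, and the real var machinery is considerably more involved. It only shows the general trade-off: a call through a var reads a mutable cell every time and so sees redefinitions, while a directly linked call is bound to one static method and does not.

```java
import java.util.function.IntBinaryOperator;

// Hypothetical, heavily simplified stand-in for a var: a mutable cell holding a function.
final class SketchVar {
    private volatile IntBinaryOperator root;

    SketchVar(IntBinaryOperator root) {
        this.root = root;
    }

    // Every call through the var pays for this read.
    IntBinaryOperator deref() {
        return root;
    }

    // Redefining the function swaps the cell's contents.
    void rebind(IntBinaryOperator newRoot) {
        this.root = newRoot;
    }
}

final class LinkingSketch {
    static int addImpl(int a, int b) {
        return a + b;
    }

    // The var has to be created and populated while the class is initialized.
    static final SketchVar ADD = new SketchVar(LinkingSketch::addImpl);

    // Indirect call: look up the current function in the var, then invoke it.
    static int callThroughVar(int a, int b) {
        return ADD.deref().applyAsInt(a, b);
    }

    // "Directly linked" call: bound to the static method, no var lookup.
    static int callDirect(int a, int b) {
        return addImpl(a, b);
    }

    public static void main(String[] args) {
        System.out.println(callThroughVar(2, 3));
        System.out.println(callDirect(2, 3));
        ADD.rebind((a, b) -> a * b);              // simulate redefining the function
        System.out.println(callThroughVar(2, 3)); // sees the new definition
        System.out.println(callDirect(2, 3));     // still calls addImpl
    }
}
```

The direct call gives up the ability to see redefinitions in exchange for skipping the indirection, which is the general shape of the trade-off.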


Well, it's still partially the JVM at play. For example, if your application has big classes, and many classes, the JVM will be slow to start. This is what is happening here. Clojure is like a large-ish Java project, it has big classes, and many of them, with static initializers, that need loading at the start, and the JVM does all that slowly.

In some sense it's Clojure's fault for having an implementation that causes slow JVM startup, but it's also the JVM's fault that the way Clojure uses it causes it to take a long time to start.


The various JVM implementations have mechanisms to fix that, like JIT caches and AOT compilation, which Clojure doesn't take advantage of.

So it is indeed a Clojure issue, not a JVM one.


Can you talk more about these?


AOT compilation with PGO, available for free on GraalVM and OpenJ9.

Also available since around 2000 from commercial vendors, of which Aicas and PTC are the main survivors.

https://www.aicas.com/wp/products-services/jamaicavm/

https://www.ptc.com/en/products/developer-tools/perc

OpenJ9 also does JIT caching across executions, https://eclipse.dev/openj9/docs/aot/

OpenJDK also does caching, but at a higher level:

https://docs.oracle.com/en/java/javase/22/vm/class-data-shar...

https://wiki.openjdk.org/display/HotSpot/Application+Class+D...

Project Leyden plans to add a JIT cache similar to OpenJ9's, https://openjdk.org/projects/leyden/notes/02-shift-and-const...

Azul and OpenJ9 have cloud JIT servers, that share execution heuristics and dedicate servers for highly optimizing compilers,

https://www.azul.com/products/prime/cloud-native-compiler-fa...

https://eclipse.dev/openj9/docs/jitserver/

Finally, although technically neither Java nor a JVM, the Android Runtime (ART) combines a high-performance interpreter written in assembly, JIT, AOT compilation, and PGO sharing across devices via the Play Store (cloud profiles).

https://source.android.com/docs/core/runtime/configure


Well, quite a few people already use AOT with PGO from GraalVM to build native executables of Clojure programs. Those start stupidly fast. I never heard of anyone doing so with OpenJ9, how good is the AOT of OpenJ9?

AppCDS in OpenJDK currently has terrible ergonomics. Clojure can't really offer it. Each user must go out of their way to leverage it. So you can't really release an app that automatically leverages it; the user needs to launch it with all the command incantations, etc. And it's so sensitive to class path changes, etc. It kind of sucks to be honest. But some people still use it for prod releases, since you can set it up in a Docker image easily. But the use cases for fast startup are desktop apps, CLIs, scripts, etc. And for all those, AppCDS is super annoying to set up. See: https://ask.clojure.org/index.php/8353/can-we-use-appcds-to-...

Still, AppCDS doesn't fully solve the startup issue, because all the static initialization stuff takes a considerable amount of time, and that does not get cached by AppCDS.


When you get into REPL-driven development, the JVM startup time (which is often under a second for me anyways) is a total non-issue. You don't continuously restart your program to see changes or run tests. You can refresh all your state instantly without exiting.

But before Babashka, that was indeed a barrier to using Clojure in shell scripts. Now we have it all!


Except in Babashka you can't use Java libraries, right?

I feel a good fraction of code will dip into Java libs at least a bit - so you're limited in what libraries you can use

I think the real solution is probably Graal native - though it's not part of the official toolbox/deps.edn


In Unix it is customary to invoke tools from the Bash shell or Bash scripts. That doesn't mix well with a separate REPL.


I see your point here... I was addressing the feedback loop during development time. Babashka works well for Clojure in a Unix tool pipeline.

It's also possible to compile your JVM Clojure program yourself to a binary with GraalVM for even better performance than Babashka and even faster startup.


Clojure and Common Lisp are the two cases where I’ve never felt the need to work in bash: most projects grow a library of utilities for development that aren’t limited by the stringly-typed nature of bash or zsh


We detached this subthread from https://news.ycombinator.com/item?id=40445197.


I think, in a lot of cases when people still complain about "the slow startup time of the JVM", they're really just talking about how the big JVM GUI apps (like IDEs) struggle to get started on heavily-loaded systems. And this, I think, is mostly just due to these apps eagerly reloading the most recent workspace on startup, and so pre-allocating big chunks of memory on startup to be ready for that — which can ripple out, on systems with low free memory, as other, colder processes all bottlenecking together on the IO of having their own memory written out to swap; and/or to having dirty mmaped pages forcibly-flushed to disk en masse so that the page-cache entries they live in can be purged.

Much more rarely — mostly when talking about writing CLI tools in a JVM language — people actually are complaining about the single second-or-so it takes the JVM to start up. (Usually because they want to run this tool in a loop under xargs(1) or find(1) or something.)

This last second of startup lag is (AFAIK) quite hard to improve, as it's mostly not the JVM itself starting up, but the static methods of JVM classes being called as those classes are loaded — which can do arbitrarily much stuff before any of your own code gets control. (Due to legacy code expecting to read certain per-boot-dynamic info as static fields rather than as the results of static method calls, I believe the JRE runtime is actually required to do quite a lot of that kind of static initialization, to pre-populate all those static fields, just in case something wants to read them.)
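A small Java sketch of that mechanism, with made-up class names (`InitLog`, `HeavyTable`, `StaticInitDemo` are purely illustrative): the first use of a class triggers its static initializer, which runs to completion before the calling code gets its result back. Real startup cost comes from many such initializers chained together; this only shows the ordering.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo classes; the names are illustrative only.
final class InitLog {
    static final List<String> EVENTS = new ArrayList<>();

    private InitLog() {}
}

final class HeavyTable {
    static final int[] SQUARES = new int[1_000];

    // Runs once, when the JVM initializes HeavyTable on its first use,
    // and it can do an arbitrary amount of work.
    static {
        InitLog.EVENTS.add("HeavyTable.<clinit>");
        for (int i = 0; i < SQUARES.length; i++) {
            SQUARES[i] = i * i;
        }
    }

    static int square(int i) {
        return SQUARES[i];
    }
}

final class StaticInitDemo {
    static List<String> run() {
        InitLog.EVENTS.add("main: start");
        // The first call to HeavyTable triggers its static initializer
        // before square() itself executes.
        int value = HeavyTable.square(12);
        InitLog.EVENTS.add("main: got " + value);
        return new ArrayList<>(InitLog.EVENTS);
    }

    public static void main(String[] args) {
        for (String event : run()) {
            System.out.println(event);
        }
    }
}
```

On a first run, the log shows the initializer entry sitting between the "start" and "got" entries, and calling `run()` again does not repeat it, since a class is only initialized once per class loader.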

---

You'd think that GraalVM could inherently skip most of this, because the Graal compiler does dead-code analysis. "If nothing in your code reads one of those static fields, then nothing needs to write that field, so let's not invoke the static initializer for that field." But that's not true: static initializers are exported and called by the runtime — so they're always "alive" from the compiler's perspective. The Graal compiler isn't doing full-bore data-flow analysis to determine which static members of which classes are ever read from.

I believe GraalVM does try to work around static initializers as much as it can, by pre-evaluating and snapshotting as much of JVM runtime's static initializer code as possible by default, converting it all into const data structures embedded in the class files before the native codegen step gets run on it (see: https://www.graalvm.org/latest/reference-manual/native-image...).

It's not possible to apply this pre-baking to 100% of classes, sadly — some of these data structures need environment data like env-vars or system network config threaded into them on boot.
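As a rough illustration of that distinction (hypothetical classes, not taken from GraalVM or the JDK): the first initializer below computes the same table on every run, so in principle its result could be computed once ahead of time and stored as data, while the second depends on the environment the process is launched in and has to run at startup.

```java
// Hypothetical example classes; the names are illustrative only.
final class CrcTable {
    // Deterministic: identical on every run, so a snapshot of TABLE
    // taken ahead of time would be just as good as recomputing it.
    static final int[] TABLE = build();

    private static int[] build() {
        int[] table = new int[256];
        for (int n = 0; n < 256; n++) {
            int c = n;
            for (int k = 0; k < 8; k++) {
                c = (c & 1) != 0 ? (0xEDB88320 ^ (c >>> 1)) : (c >>> 1);
            }
            table[n] = c;
        }
        return table;
    }

    // Standard table-driven CRC-32, returned as an unsigned 32-bit value.
    static long checksum(byte[] data) {
        int crc = 0xFFFFFFFF;
        for (byte b : data) {
            crc = TABLE[(crc ^ b) & 0xFF] ^ (crc >>> 8);
        }
        return (crc ^ 0xFFFFFFFF) & 0xFFFFFFFFL;
    }
}

final class RuntimeSettings {
    // Environment-dependent: these values are only known when the
    // process actually starts, so they cannot be baked in beforehand.
    static final String HOME = System.getenv().getOrDefault("HOME", "(unset)");
    static final long STARTED_AT_NANOS = System.nanoTime();
}

final class InitKindsDemo {
    public static void main(String[] args) {
        System.out.println("crc32(\"hello\") = " + CrcTable.checksum("hello".getBytes()));
        System.out.println("HOME = " + RuntimeSettings.HOME);
    }
}
```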

(I wonder if anyone on the Graal project is working on a fully-general static-initializer optimization pass, that does something like concolic execution to bake these initializers out into either fully-constant data if possible, or if not, then constant "data templates" plus trivial initializer functions that just gather up the few at-boot context fragments, shove them into the "data template" using a low-level hook, and then memcpy the result out onto the heap and call it an object.)


They are working on PGO, and are also adding AI-based optimization algorithms; both can help.


The problem lies with Clojure implementation, not Java or the JVM.


It's not the JVM.

It's how much code you load.


I mean... That's not an issue in other runtimes, so it's kind of a JVM quirk, no?


Those other runtimes also don't do half of the features a JVM usually has to offer, and most people complaining also don't bother to actually learn the Java ecosystem, the existing set of JVM implementations, and the optimizations features made available to them.


No, that's definitely true in other non-native runtimes.

E.g. any Python project has to deal with this, like Mercurial.

Java projects tend to (insert stereotype) have a lot of code.



