Is the JVM the "next assembly"? I hope not. (repost)

Published on June 4, 2007.

This is a transplant from the original Irrational Exuberance, and was written in mid 2007: nearly two years ago.

The original title of this was "Is Java the "next assembly"?". A year or two ago it was pointed out that JVM is more appropriate than Java. Better late than never, right?

Recently I have been looking at a number of languages built on top of the JVM, and pondering their allure. I was planning on writing an article examining the code needed to speak between several of these languages and Java, but after finishing it, I realized it was an awful article, so I scrapped it.

Instead, I ended up thinking about the difficulties facing these languages, what they offer us as programmers, and how they will fit into the cycle of language development.

Advantages of JVM based languages

Access to Java libraries

This is the favorite explanation for why we ought to love JVM-based languages: they can use the plethora of Java libraries that--like low-hanging fruit--are begging to be picked.

For languages that are new (Scala) or have small user bases (Common Lisp), I think that this is a huge boon. Scala takes complete advantage of it; CL essentially has not, since ABCL is still too immature for serious usage. For many other languages, though, it actually pulls you away from the best tools and libraries written for your language (although this can be overcome, as with JRuby gradually evolving to the point of running Ruby on Rails).

I think many Java libraries can be somewhat clumsy to use from another language, though, and at this point not enough effort goes into creating language-appropriate wrappers around them. Scala--which I solidly believe is the best JVM-based language to date--has done a fairly good job of this on commonly used libraries; they have created Scalaesque wrappers for JUnit, the Swing graphics library, and other libraries. However, I would like to see other languages look into this more, as it can be unsettling to switch into procedural programming mode to play with the Java libraries.
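To make the idea of a language-appropriate wrapper concrete, here is a minimal sketch in plain Python. `JavaStyleList` is a hypothetical stand-in for a Java collection (it is not a real Java or Jython class), and `PythonicList` is an equally hypothetical wrapper. The only point is that a few lines of adapter code let the caller stay in the host language's idioms instead of calling `size()` and `get(i)` by hand.

```python
class JavaStyleList(object):
    """Hypothetical stand-in for a Java collection exposing size()/get(i)."""

    def __init__(self, *items):
        self._items = list(items)

    def size(self):
        return len(self._items)

    def get(self, index):
        return self._items[index]


class PythonicList(object):
    """Hypothetical wrapper that exposes the Java-style object via Python protocols."""

    def __init__(self, java_list):
        self._java_list = java_list

    def __len__(self):
        # len(wrapper) instead of wrapper.size()
        return self._java_list.size()

    def __getitem__(self, index):
        # Raising IndexError past the end is what lets for-loops terminate.
        if not 0 <= index < len(self):
            raise IndexError(index)
        return self._java_list.get(index)


# The caller can now use len(), indexing, and iteration as usual.
names = PythonicList(JavaStyleList("junit", "swing"))
upper_names = [name.upper() for name in names]
```

The wrapper adds no functionality of its own; it only translates between the two calling conventions, which is the kind of work the Scala wrappers mentioned above do on a larger scale.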

‘Free’ speed improvement as JVM improves

Another commonly cited upside is that the language will get speed improvements “for free” as the JVM improves in efficiency. I am skeptical of this, because I see the JVM getting increasingly complex with things like HotSpot being added, and I suspect that these complexities will help fine-tune Java performance but may easily hurt performance for other languages built on the JVM.

This argument is no different from compiling your code into C, and then stating that C compilers will continue to improve, and thus your code will get faster over time. This is true, but gcc probably isn’t the bottleneck that makes your language twelve times slower than C (looking at you, Python).

Downsides of JVM

Some languages mix poorly with Java

Java focuses on two paradigms: procedural, and object oriented. When I went through and examined the syntax in a number of languages for calling Java, it became fairly clear that languages that are strongly object oriented and procedural mix well with Java, and others don’t. Let me clarify with an example from each camp (the cordial champion is Jython, competing against the “We don’t serve your kind here” representative: SISC Scheme).

SISC Scheme

(import s2j)
(import oo)
(import generic-procedures)
(define-java-class jstring |java.lang.String|)
(define-generic-java-methods length to-class hash-code)
(define str (java-new jstring))
(length str)    ; ==> 0
(hash-code str) ; ==> 0

Jython

import java
s = java.lang.String("yes")
print s # ==> yes
s.length() # ==> 3
s.length # ==> it's a method
sb = java.lang.StringBuffer()
print sb # ==> outputs a blank line, as sb is empty
sb.append(s) # ==> yes
sb.append("no") # ==> yesno
sb.append(java.lang.String("maybe")) # ==> yesnomaybe

The Jython code is pretty much what we might expect in CPython: you’re just playing around with some objects you’ve imported from somewhere else. It is fairly clean and easy to understand: it’s intuitive.

The SISC Scheme stuff pretty much blows my mind. It is coercing Scheme into a procedural language, and I find it completely unintuitive and downright difficult to use (I repeatedly failed to get the string to initialize with a value). My point with these two examples is that languages are designed with some specific techniques or paradigms in mind, and languages that were not designed with the same paradigms as Java are likely to be strange bedfellows. Also relevant here is that languages with very different conceptions and implementations of object orientation suffer in compatibility (prototyping, generics, et al.).

Certain functionality is precluded by Java technology

This is a relatively minor complaint, but the JVM is simply incapable of performing some operations that a C compiler can perform. Much of this is the result of the safety and security checks built into the JVM, which is a good thing, but it’s still a limit on the potential power of a language built on top of it.

A fairly obvious example of this (although not particularly fatal) is that Java’s interface with filesystems is pretty limited compared to many languages that are linking into the JVM. Thus you’ll have a lot more difficulty with some scripting tasks in Jython than you would in Python.
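As a rough illustration of the kind of scripting task meant here, the following sketch uses CPython's standard `os` and `stat` modules to set owner-only permissions on a file, create a symbolic link to it, and inspect both. It assumes a POSIX filesystem, and the function and file names are made up for the example. As far as I know, the `java.io.File` API available at the time of writing exposes no direct way to create a symlink or read the full permission bits.

```python
import os
import stat
import tempfile


def make_private_script_with_alias(directory):
    """Create an owner-only executable file plus a symlink pointing at it."""
    target = os.path.join(directory, "job.sh")
    with open(target, "w") as handle:
        handle.write("#!/bin/sh\necho hello\n")
    # rwx for the owner, nothing for group or others.
    os.chmod(target, stat.S_IRWXU)
    alias = os.path.join(directory, "job-latest")
    os.symlink(target, alias)
    return target, alias


def describe(path):
    """Report the permission bits and whether the path itself is a symlink."""
    info = os.lstat(path)  # lstat inspects the link, not what it points to
    return {
        "mode": stat.S_IMODE(info.st_mode),
        "is_symlink": stat.S_ISLNK(info.st_mode),
    }


# Example usage in a throwaway directory (hypothetical names throughout).
workdir = tempfile.mkdtemp()
script_path, alias_path = make_private_script_with_alias(workdir)
script_info = describe(script_path)
alias_info = describe(alias_path)
```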

Execution speed is often worse (but not always)

For the most part, JVM based versions of languages are much slower than the C based implementations. This is true for both Python/Jython and Ruby/JRuby. This is likely the consequence of the JVM based projects being younger and less mature than those based on C. These differences may eventually be resolved, but that remains to be seen.

Scala is a real exception here: it has no C based implementation (initially they intended to have a .NET version too, but I believe their .NET implementation is lagging a ways behind), and it runs just as quickly as Java does, despite having numerous advantages (of which its syntax may or may not be one).

Application distribution more complex

Much like interpreted languages that depend on an interpreter, basing your language on the JVM means you’ll have to either bring your selected JVM with you (bulky), or deal with the hassle of integrating with non-standardized user setups by trying to use their already existing JVM.

Even installing the languages themselves can be an exercise in voodoo. Jython and Scala both installed easily, but I fought with JRuby for about half an hour and couldn’t get it to stop throwing an error message (built from trunk, and also downloaded a prebuilt version, both had the same complaint).

(Personally I lean more towards the “if you need it, take it with you” philosophy embodied by Py2App, which brings your interpreter and any needed modules along with it. This bloats program size, but it allows things to just work, which is really important when packaging products for non-gurus.)

Inevitable idiosyncrasies

This is a catch-all category, but all of the projects implementing their languages using the JVM have a lot of errata and little technicalities that can bite you if you don’t read about them ahead of time. Admittedly, all language implementations have these little potholes for you to fall into, but JVM issues are often more subtle (or limiting) than those in traditional C implementations.

Consequences of JVM Languages

The biggest consequence I see of this trend towards basing languages on the JVM is that Java components are being viewed increasingly as blocks that are glued together by a more expressive language; that Scala/Jython/JRuby is the one ring to bind them, and that Java libraries exist solely to be bound.

This seems self-defeating. If this method of conducting Java library golems with your magical scripting language wand is truly preferable, then people will stop writing in Java, and the raw material to build out of will stagnate.

Essentially, these languages will once again diverge away from Java, the flow of new Java libraries will be reduced (programmers will prefer the newer, more expressive languages), and all of this new code won’t be able to talk to each other (can we put out a bounty for a Jython to JRuby bridge? anyone?). Thus, languages built on the JVM strike me as being parasites riding the wave of Java code, which in the end will hear their own dirges not long after Java’s.

We can see here the evolution of new languages over time: a new language arrives that can take advantage of existing languages. Then other languages arise that also take advantage of the existing languages, and gradually programming in the existing languages slows to a trickle. Now we find ourselves at the beginning of a new cycle, where yet another language arises that can take advantage of the most abundant libraries and resources, but these languages too will come to an end.

The only escape I see from this cycle is to focus on building languages with robust foreign function interfaces that are flexible enough to incorporate new languages that don’t even exist at the time when the FFI is designed and implemented. Essentially, we need to ensure that these new languages can talk both ways with the languages they siphon off, otherwise they will only promote slow deaths and recurring cycles.
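One existing example of an FFI that talks both ways is Python's `ctypes`, shown here as a small sketch and not as a model for the JVM case. It assumes a POSIX system where the C library can be loaded; `c_strlen` and `c_sorted` are names made up for this illustration. In one direction Python calls C's `strlen`; in the other, C's `qsort` calls back into a Python comparison function.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system. If find_library returns None, CDLL(None)
# falls back to the symbols already loaded into the process, libc included.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Direction 1: Python calls into C.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text):
    """Measure an ASCII string with C's strlen."""
    return libc.strlen(text.encode("ascii"))


# Direction 2: C calls back into Python through a function pointer.
COMPARATOR = ctypes.CFUNCTYPE(
    ctypes.c_int, ctypes.POINTER(ctypes.c_int), ctypes.POINTER(ctypes.c_int)
)

libc.qsort.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_size_t, COMPARATOR]
libc.qsort.restype = None


def _compare_ints(left, right):
    # Called by qsort with pointers to two array elements.
    return (left[0] > right[0]) - (left[0] < right[0])


def c_sorted(values):
    """Sort a list of ints with C's qsort, using a Python comparison callback."""
    array = (ctypes.c_int * len(values))(*values)
    libc.qsort(array, len(values), ctypes.sizeof(ctypes.c_int), COMPARATOR(_compare_ints))
    return list(array)
```

Here `c_strlen("JVM")` crosses the boundary once, while `c_sorted([5, 1, 7, 33, 99])` crosses it repeatedly in both directions, since every comparison is C invoking Python.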