ULPS

Robbert Haarman

2010-12-11

Introduction

ULPS is short for Useless Loops Per Second. It is a measure of how quickly a machine executes simple instructions, similar to BogoMIPS, although the BogoMIPS rating of a system will typically be about a factor of 5 higher than its ULPS rating. In this comparison, I use ULPS to determine the overhead imposed by the runtimes of various language implementations.
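The benchmark sources are not reproduced in this article, so the following is only a minimal sketch of how a loops-per-second measurement could be structured. It is written in Python, so it would measure the overhead of the Python interpreter rather than any of the runtimes compared here. The function name measure_ulps and the loop count are assumptions made for illustration.

```python
import time


def measure_ulps(loops: int) -> float:
    """Time `loops` iterations of a counting loop and return loops per second.

    Hypothetical helper: the article's own benchmark programs may be
    structured differently.
    """
    start = time.perf_counter()
    i = 0
    while i < loops:  # the "useless" loop: it does nothing but count
        i += 1
    elapsed = time.perf_counter() - start
    return loops / elapsed


if __name__ == "__main__":
    rate = measure_ulps(5_000_000)
    # One megaulps is one million useless loops per second.
    print(f"{rate / 1e6:.1f} megaulps")
```

Running the measurement several times and reporting the mean and spread, as done in the sections below, helps to smooth out noise from other activity on the machine.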

C

First of all, I measured how many megaulps native code achieves on my system. Using a C program (ulps.c), I got about 867 megaulps (N: 10, mean: 867, standard deviation: 7.0). This compares to a BogoMIPS rating of 3619, which is roughly 4.2 times the ULPS figure. The program was compiled with gcc -Wall -s -O3 ulps.c -o ulps.
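Summary figures of the form "N, mean, standard deviation" can be computed from the individual scores as sketched below. The scores in the example are made-up values for illustration, not the measured data. The use of the sample standard deviation (dividing by N - 1) is also an assumption, since the formula behind the reported figures is not stated.

```python
import statistics


def summarize(scores):
    """Return (N, mean, standard deviation) for a list of megaulps scores."""
    n = len(scores)
    mean = statistics.mean(scores)
    # Sample standard deviation (divides by N - 1). This is an assumption:
    # the article does not say which formula produced its figures.
    stdev = statistics.stdev(scores)
    return n, mean, stdev


# Hypothetical scores for illustration only, NOT the article's measurements.
example_scores = [860, 871, 865, 874, 858, 869, 872, 863, 870, 868]

n, mean, stdev = summarize(example_scores)
print(f"N: {n}, mean: {mean:.0f}, standard deviation: {stdev:.1f}")
```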

OCaml

OCaml is a variant of ML that includes a classes-and-instances object system. It can be compiled to native code as well as bytecode, and both are considered very fast. I implemented the ULPS benchmark in OCaml (ulps.ml).

When compiled to native code with ocamlopt unix.cmxa ulps.ml -o ulps, the program yields about 867 megaulps (N: 10, mean: 867, standard deviation: 16). This is about equal to the number obtained from the C version.

When compiled to bytecode with ocamlc unix.cma ulps.ml -o ulps, the program yields 10 megaulps (all 10 scores were exactly 10). This means that there is a slowdown of about a factor of 87 between native code and the OCaml bytecode interpreter on my system.

Mono

Mono is an open source implementation of .NET. To test the performance of Mono's virtual machine, I have implemented the ULPS benchmark in C# (Ulps.cs). The program was compiled with mcs Ulps.cs.

Running the program using mono Ulps.exe yielded a score of about 324 megaulps (N: 10, mean: 324, standard deviation: 15). This means that the Mono JIT compiler brings the performance of .NET code to within a factor of 2 to 3 of native code (867 / 324 is about 2.7).

SBCL

Common Lisp is a dialect of Lisp, one of the oldest programming languages around. It is very powerful and flexible, but has a reputation for being slow. Although some implementations are indeed slow, there are also very fast implementations.

I implemented the ULPS benchmark in Common Lisp (ulps.lisp) and ran it with Steel Bank Common Lisp, one of the fastest open source implementations of Common Lisp. The program was first compiled with sbcl --eval '(compile-file "ulps.lisp")' --eval '(quit)', and then run with sbcl --noinform --load ulps.fasl --eval '(quit)'.

SBCL achieved about 886 megaulps (N: 10, mean: 886, standard deviation: 37). This is on par with the native versions produced by GCC and OCaml.
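The slowdown factors quoted in the sections above follow directly from the reported means. The sketch below recomputes them relative to the C result. The dictionary labels and the helper name slowdown_vs are illustrative choices, and the figures are the rounded means from the text, so the ratios are only as precise as those.

```python
# Mean megaulps scores as reported in the sections above.
results = {
    "C (gcc -O3)": 867,
    "OCaml native": 867,
    "OCaml bytecode": 10,
    "Mono": 324,
    "SBCL": 886,
}


def slowdown_vs(baseline, score):
    """How many times slower `score` is than `baseline`.

    A value below 1 means the implementation scored higher than the baseline.
    """
    return baseline / score


baseline = results["C (gcc -O3)"]
for name, score in results.items():
    factor = slowdown_vs(baseline, score)
    print(f"{name:15s} {score:4d} megaulps  {factor:6.2f}x")
```

Given the standard deviations reported above (7.0 for C, 16 for OCaml native code and 37 for SBCL), the small differences between the three native results are probably best read as noise rather than as a ranking.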

TurboVM

I am working on a virtual machine called TurboVM, which features a RISC instruction set that lends itself to both efficient interpretation and easy compilation to other languages, including C and native machine code.

I have not run the ULPS benchmark yet, but earlier tests indicate that the TurboVM bytecode interpreter is about a factor of 20 to 30 (depending on host architecture) slower than native code, whereas compiling TurboVM code down to native code (through C) yields performance on par with an equivalent program written in C and compiled to native code directly.

Both results are very interesting. The performance of TurboVM bytecode that has been compiled to native code shows that TurboVM is an interesting target for compilers: a compiler that targets TurboVM gets performance comparable to having its own backend for every architecture supported by TurboVM (currently, any 32-bit architecture for which there is a C compiler), for the price of a single target. The performance of the bytecode interpreter is interesting because it outperforms the OCaml bytecode interpreter by a factor of 3 to 4.
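The "factor of 3 to 4" estimate can be checked against the figures given earlier, as in the sketch below. The helper name speedup_range is hypothetical. The TurboVM slowdown of 20 to 30 comes from earlier tests rather than from the ULPS benchmark itself, so the result is indicative only.

```python
def speedup_range(reference_slowdown, low, high):
    """Return (smallest, largest) speedup over a reference interpreter.

    `reference_slowdown` is the reference interpreter's slowdown relative to
    native code; `low` and `high` bound the slowdown of the interpreter being
    compared against it.
    """
    return reference_slowdown / high, reference_slowdown / low


# OCaml bytecode: 10 megaulps versus 867 megaulps for native code.
ocaml_bytecode_slowdown = 867 / 10

# TurboVM interpreter: 20 to 30 times slower than native code (earlier tests).
smallest, largest = speedup_range(ocaml_bytecode_slowdown, 20, 30)
print(f"TurboVM interpreter vs OCaml bytecode: {smallest:.1f}x to {largest:.1f}x")
```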