Leveraging Integer Overflow to Improve Performance While Preserving Correctness

If you use the upcoming UInteger32.isEven or isOdd methods, you’ll notice that it uses a bitwise AND operation. The reason, as described in a previous post, is because it improves performance.

However, while this is straightforward for all other integer wrapper classes, UInteger32 is an exception. According to ECMAScript 3 11.10:

The production A : A @B, where @ is one of the bitwise operators in the productions above, is evaluated as follows:

  1. Evaluate A.
  2. Call GetValue(Result(1)).
  3. Evaluate B.
  4. Call GetValue(Result(3)).
  5. Call ToInt32(Result(2)).
  6. Call ToInt32(Result(4)).
  7. Apply the bitwise operator @ to Result(5) and Result(6). The result is a signed 32 bit integer.
  8. Return Result(7).

The operands (and, thus, the result type) for the bitwise AND operation in ECMAScript are converted to 32-bit signed integers. System.UInteger32 represents an unsigned 32-bit integer.

This is inconvenient because we’d obviously have to fall back to the slower modulus operation for isEven/isOdd on UInteger32. Unless…

((Math.pow(2, 32) + 1) >>> 0 | 0) & 1 // true
((Math.pow(2, 32) + 2) >>> 0 | 0) & 1 // false

We can take advantage of overflow behavior. (Note: since we’re able to get the correct result by leveraging overflow behavior, we actually don’t perform the extraneous zero-fill right shift as illustrated in the example.)

This is completely safe because:

A) Mathematically, in base 2, the last bit will always be 1 for odd numbers and 0 for even numbers… no matter how big the number is.

B) Bitwise AND will compare both bits in equal-length binary forms. Thus, no matter how big the number is, when you AND against 1, it will always be 0000001 (or zero-padded to whatever length is needed). Therefore, all the preceding bits don’t matter, because they will always be ANDed against a zero bit. The only bit that matters is the trailing bit, see A for why this will always work.

Standard Library Performance: isEven() and isOdd()

Remember what I always advise: always use the JS++ Standard Library if you can. The methods aren’t just well-tested for validity, but we also test for performance.

Checking if a number is even or odd is the classic fizzbuzz test. Most professional developers can use the modulus operator. However, that’s not always the fastest implementation.

> var t = new Date(); var x; for (var i = 0; i < 50000000; ++i) x = (i & 1) == 0; console.log(x); new Date - t;
false
87
> var t = new Date(); var x; for (var i = 0; i < 50000000; ++i) x = (i & 1) == 0; console.log(x); new Date - t;
false
90
> var t = new Date(); var x; for (var i = 0; i < 50000000; ++i) x = (i & 1) == 0; console.log(x); new Date - t;
false
86
> var t = new Date(); var x; for (var i = 0; i < 50000000; ++i) x = (i & 1) == 0; console.log(x); new Date - t;
false
86
> var t = new Date(); var x; for (var i = 0; i < 50000000; ++i) x = (i & 1) == 0; console.log(x); new Date - t;
false
86

= 87ms

> var t = new Date(); var x; for (var i = 0; i < 50000000; ++i) x = (i % 2) == 0; console.log(x); new Date - t;
false
105
> var t = new Date(); var x; for (var i = 0; i < 50000000; ++i) x = (i % 2) == 0; console.log(x); new Date - t;
false
100
> var t = new Date(); var x; for (var i = 0; i < 50000000; ++i) x = (i % 2) == 0; console.log(x); new Date - t;
false
100
> var t = new Date(); var x; for (var i = 0; i < 50000000; ++i) x = (i % 2) == 0; console.log(x); new Date - t;
false
104
> var t = new Date(); var x; for (var i = 0; i < 50000000; ++i) x = (i % 2) == 0; console.log(x); new Date - t;
false
101

= 102ms

Node.js v8.11.1 Linux x64
Core i7-4790k, 32gb RAM
var t = new Date(); var x; for (var i = 0; i < 5000000; ++i) x = (i & 1) == 0; console.log(x); console.log(new Date - t);
false debugger eval code:1:80
1948 debugger eval code:1:96
var t = new Date(); var x; for (var i = 0; i < 5000000; ++i) x = (i & 1) == 0; console.log(x); console.log(new Date - t);
false debugger eval code:1:80
2072 debugger eval code:1:96
var t = new Date(); var x; for (var i = 0; i < 5000000; ++i) x = (i & 1) == 0; console.log(x); console.log(new Date - t);
false debugger eval code:1:80
2086 debugger eval code:1:96
var t = new Date(); var x; for (var i = 0; i < 5000000; ++i) x = (i & 1) == 0; console.log(x); console.log(new Date - t);
false debugger eval code:1:80
2092 debugger eval code:1:96
var t = new Date(); var x; for (var i = 0; i < 5000000; ++i) x = (i & 1) == 0; console.log(x); console.log(new Date - t);
false debugger eval code:1:80
2102

= 2060ms

var t = new Date(); var x; for (var i = 0; i < 5000000; ++i) x = (i % 2) == 0; console.log(x); console.log(new Date - t);
false debugger eval code:1:80
2058 debugger eval code:1:96
var t = new Date(); var x; for (var i = 0; i < 5000000; ++i) x = (i % 2) == 0; console.log(x); console.log(new Date - t);
false debugger eval code:1:80
2082 debugger eval code:1:96
var t = new Date(); var x; for (var i = 0; i < 5000000; ++i) x = (i % 2) == 0; console.log(x); console.log(new Date - t);
false debugger eval code:1:80
2114 debugger eval code:1:96
var t = new Date(); var x; for (var i = 0; i < 5000000; ++i) x = (i % 2) == 0; console.log(x); console.log(new Date - t);
false debugger eval code:1:80
2102 debugger eval code:1:96
var t = new Date(); var x; for (var i = 0; i < 5000000; ++i) x = (i % 2) == 0; console.log(x); console.log(new Date - t);
false debugger eval code:1:80
2104 debugger eval code:1:96

= 2092ms

Firefox 59.0.2, Linux x64
Core i7-4790k, 32gb RAM

While the results are not statistically significant in Firefox (because it's very possible SpiderMonkey is manually optimizing this case via a pattern-matched optimization), you can get a 17% performance gain in Node.js via bitwise AND.

Due to all the layers of abstraction in JavaScript, it's not entirely evident how much faster a bitwise AND for isEven/isOdd can really be. In our benchmarks, we were able to achieve a 17% performance improvement in Node.js. As our lead engineer pointed out via email, according to this table, "for Intel Skylake-X `div` has a latency of 26 (for 32-bit integers), whereas `and` has latency 1 ("reciprocal throughput" has similar difference) so it is an order of magnitude slower, not 20% as in your tests."

Look for isEven() and isOdd() to appear in a future version of the JS++ Standard Library.

You may also be interested in reading Part II of this post which describes how we leveraged overflow behavior to improve performance while preserving correctness for UInteger32.

Code Written in JS++ Can be Significantly Faster than the Equivalent JavaScript

JS++ type guarantees don’t just mean more reliable applications. They can also mean faster applications.

People have expressed concern about the “conversion overhead” that comes with the JS++ type system. However, we’ll show you today that having code that varies in type (such as JavaScript code) can be substantially slower, and even if you lose 1ms in JS++ “conversion overhead”, you may gain 10ms, 100ms, or more from typed optimizations that you can only get with JS++.

Here are two side-by-side benchmarks:

JS++: test.jspp

import System;

double t = (new Date).getTime();
string z;
for (int i = 0; i < 5000000; ++i) {
	z += i.toString();
}
Console.log((new Date).getTime() - t);
$ time node test.jspp.js
484

real	0m0.537s
user	0m0.496s
sys	0m0.100s
$ time node test.jspp.js
487

real	0m0.536s
user	0m0.500s
sys	0m0.096s
$ time node test.jspp.js
488

real	0m0.536s
user	0m0.508s
sys	0m0.088s
$ time node test.jspp.js
486

real	0m0.534s
user	0m0.480s
sys	0m0.116s
$ time node test.jspp.js
484

real	0m0.533s
user	0m0.496s
sys	0m0.096s

JavaScript: test.js

var t = (new Date).getTime();
var z = "";
for (var i = 0; i < 5000000; ++i) {
    z += i.toString();
}
console.log((new Date).getTime() - t);
$ time node sample.js 
523

real	0m0.575s
user	0m0.572s
sys	0m0.076s
$ time node sample.js 
518

real	0m0.561s
user	0m0.556s
sys	0m0.072s
$ time node sample.js 
524

real	0m0.572s
user	0m0.544s
sys	0m0.092s
$ time node sample.js 
521

real	0m0.565s
user	0m0.560s
sys	0m0.076s
$ time node sample.js 
519

real	0m0.563s
user	0m0.540s
sys	0m0.088s

Specifications:
Core i7-4790k
Linux x64 (Debian)
32gb RAM
Node.js v.7.9.0 (Google V8)
JS++ v.0.5.2

Take a close look at the JS++ code and the JavaScript code. They are nearly identical except the JS++ code has types and an import statement.

In this particular case, JS++ is roughly 7.2% faster than JavaScript on a basic toString() benchmark. We do some special things under the hood here, but it should be clear that even with the perceived overhead of JS++ conversions and the much heavier overhead of auto-boxing, JS++ code is still noticeably faster. However, the topic of today's post is not about toString(). If we slightly change the JavaScript code:

var t = (new Date).getTime();
var z; // variable is no longer initialized to string
for (var i = 0; i < 5000000; ++i) {
	z += i.toString();
}
console.log((new Date).getTime() - t);
$ time node sample.js 
735

real	0m0.779s
user	0m0.804s
sys	0m0.072s
$ time node sample.js 
735

real	0m0.783s
user	0m0.784s
sys	0m0.092s
$ time node sample.js 
749

real	0m0.794s
user	0m0.824s
sys	0m0.088s
$ time node sample.js 
742

real	0m0.787s
user	0m0.788s
sys	0m0.092s
$ time node sample.js 
738

real	0m0.782s
user	0m0.808s
sys	0m0.064s

Suddenly, JS++ has jumped from being ~7% faster to being 52.2% faster.

First of all, without static typing, you lose correctness. The result of the above small change results in "undefined0123..." rather than "0123...". Therefore, while it may not be the best benchmark, it illustrates the point. Besides losing correctness, the drop in performance is substantial when types change. You often see JavaScript code written where the type constantly changes, and you can see the cost of this negligence compared to JS++.

The reason this happens is because V8 optimizes "optimistically". When you initialize your variable to a string, V8's JIT code generator will generate code optimized for strings. However, when the type changes, V8 has to "de-optimize", change the variable type, convert data, and more. These are expensive operations which highlight how performance can degrade. The example we showed today involved only one variable. Imagine the performance difference in an application with thousands of variables.

Finally, the JS++ type system is patent-pending. I've made my case clear about patents being only used for competition. Since most programmers immediately jump to conclusions, I will use this example: you can use a patented skateboard without getting in trouble with the law; however, you cannot make and sell a patented skateboard without getting in trouble with the law. If you're on the fence with JS++, don't expect to see our technologies in TypeScript, Flow, or any competing language any time soon. You're free to use them, and we encourage you to do so if your only criteria when selecting technologies are "open source" and "unpatented."

JS++ is free, closed-source software licensed under the BSD License, and the benchmarks above can be executed with the latest JS++ 0.5.2 release.