For Loops: Patterns and Low-Level Tour

I wanted the challenge of taking an elementary programming concept (‘for’ loops) and writing an article for advanced programmers. All of the patterns discussed today are applicable to high-level languages that have ‘for’ loops, but we will examine ‘for’ loops at a low level to discover why things are the way they are. By “low level,” I mean compilers and hardware. Examples are in C, C++, x86 assembly, and, of course, JS++. In addition to covering for loop patterns for everyday programming, I’m going to cover for loops from the perspective of optimizing compilers.

Length/Size Caching

Compilers may not automatically cache your loop length or array size just because the array you’re looping does not change. In JS++:

int[] arr = [ 1, 2, 3 ];

for (int i = 0; i < arr.length; ++i) {
    // ...
}

// is not the same as:

for (int i = 0, len = arr.length; i < len; ++i) {
    // ...
}

Now, for more advanced users, I'm going to show you why. We consider C/C++ optimizing compilers to be the state of the art. As of this writing, the latest gcc 11.2 (and clang 13) do not perform this optimization for this basic case:

// Assuming 'v' is const std::vector<int> &
for (size_t i = 0; i != v.size(); ++i) {
    printf("%zu\n", i);
}

... with special emphasis on the type being a const reference.

On -O3, the comparison on each iteration becomes:

.L3:
        mov     rsi, rbx
        mov     edi, OFFSET FLAT:.LC0
        xor     eax, eax
        add     rbx, 1
        call    printf
        mov     rax, QWORD PTR [rbp+8] # occurs
        sub     rax, QWORD PTR [rbp+0] # every
        sar     rax, 2                 # iteration
        cmp     rbx, rax
        jne     .L3

I'm using printf versus std::cout here because the generated assembly code is easier to read. Furthermore, it doesn't do any exception bookkeeping.

Now, if we cache the size:

for (size_t i = 0, sz = v.size(); i != sz; ++i) {
    printf("%zu\n", i);
}
        push    rbp
        push    rbx
        sub     rsp, 8
        mov     rbp, QWORD PTR [rdi+8] # before
        sub     rbp, QWORD PTR [rdi]   # the
        sar     rbp, 2                 # loop
        je      .L1
        xor     ebx, ebx
.L3:
        mov     rsi, rbx
        mov     edi, OFFSET FLAT:.LC0
        xor     eax, eax
        add     rbx, 1
        call    printf
        cmp     rbx, rbp
        jne     .L3

The assembly code is subtracting a start pointer from an end pointer (at a +8 byte offset, denoting a 64-bit size type). If the pointers are equal, there are zero elements, so we jump to the basic block immediately following the for loop. SAR rbp, 2 is equivalent to >> 2 (division by 4, sizeof(int)). (pointer_end - pointer_start) / sizeof(T) gives us the number of elements. We can confirm in the libstdc++ source code:

struct _Vector_impl_data
{
    pointer _M_start;
    pointer _M_finish;
    // ...
};
      /**  Returns the number of elements in the %vector.  */
      _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
      size_type
      size() const _GLIBCXX_NOEXCEPT
      { return size_type(this->_M_impl._M_finish - this->_M_impl._M_start); }

The problem, from the perspective of compilers, is that the side effect (writing to standard output) causes the optimizer to discard the loop length caching optimization. The fix is simple: use only pure functions if you want a compiler to automatically apply this optimization. The alternative is to simply cache the size if no mutations occur inside the loop.

One more reason compilers may not perform this optimization, which is important to JS++ but maybe not for C/C++, is compile times. The analysis required to prove the loop length does not change can be expensive.

There are a lot of moving parts: the language, the compiler, the CPU cache architecture, and so on. Whether or not you need this optimization at all should depend on your benchmarks. If the size is in the L1 CPU cache, it would make no practical difference unless you needed to shave 3-4 cycles per iteration (e.g. high-frequency trading). The key takeaway for general developers is that you cannot assume the compiler will do this for you—even when it seems obvious that the size or loop length never changes.

Unsigned Reverse Iteration

Oftentimes, I'm surprised when programmers don't know how to manually loop in reverse when the data type is unsigned. Yes, C++ has rbegin/rend/crbegin/crend, but I'm talking about manual reverse iteration.

First, let's see what does not work. I'll use the JS++ syntax because, in my opinion, it's the most readable for a general audience:

int[] arr = [ /* ... */ ];
for (unsigned int i = (unsigned int) arr.length - 1;
     i > 0;
     i--)
{
    // incorrect
}

The above code will never print the first element.

"No problem," you say to yourself. "I'll just compare against -1."

Wait. You're going to compare an unsigned (non-negative) value against -1? Two things can happen. In C and C++, -1 will wrap. If you're comparing a 64-bit number, -1 becomes 0xFFFFFFFFFFFFFFFF. Your loop condition will never be true (because i will never be "greater than" UINT64_MAX), and the optimizing compiler will simply eliminate the loop. In JS/JS++, your i counter variable will wrap, and you'll get an infinite loop. (The difference, then, is that C/C++ will wrap the RHS; while JS++ will wrap the LHS. JS++ does this for type system soundness reasons, which extend beyond the scope of for loops.)

The code discussed would be perfectly fine if the container size type were signed. Thus, I presented the "intuitive" (but incorrect) method of unsigned reverse iteration. Instead, the proper way to do unsigned reverse iteration looks something like this:

for (unsigned int i = arr.length; i --> 0; ) {
    // ...
}

There's a curiously-titled Stack Overflow thread on this:

What is the "-->" operator in C/C++?

The above code can be rewritten as:

for (unsigned int i = arr.length; (i--) > 0; ) {
    // ...
}

You should prefer the latter—for readability.

Interestingly, being that unsigned reverse iteration is deceptively unintuitive, it was brought up as a consideration to make all JS++ container size types signed. I was in the unsigned camp, but, besides being easier, what pushed us into the signed types direction is because JS++ is a superset of JavaScript (ECMAScript 3). If you want to see a bug in language design and specification for a very popular language (JavaScript), please read Design Notes: Why isn’t System.Array.length an ‘unsigned int’? As far as I know, JS++ was the first to uncover this design bug.

The common misconception with signed container size types is that you have to pay for two bounds checks instead of one to check for negative index accesses:

if (i < 0 && i >= arr.length) {
    // out of bounds
}

However, besides being more beginner-friendly, signed container size types are actually equally efficient to unsigned:

if ((unsigned int) i >= (unsigned int) arr.length) {
    // out of bounds
}

If the index is a negative value, it will wrap (and, thus, will always be greater than the length property cast to unsigned because length is limited to the signed System.Integer32.MAX_VALUE). In other words, the "negative index check" is redundant. The casts do not result in additional instructions.

Labeled break/continue

This brings me to an example that will not work in C or C++, because those languages do not have labeled break/continue statements.

Somebody actually emailed me to remark on how impressed he was with the quality of the JS++ documentation. He spent all these years programming and never knew about labeled break/continue in JavaScript! The following will work in JS++, JS, and Java. I'll just take it directly from the JS++ documentation:

outerLoop: for (int x = 1; x <= 5; ++x) {
    innerLoop: for (int y = 1; y <= 5; ++y) {
        break outerLoop;
    }
}

Notice we labelled the loops (outerLoop and innerLoop) at lines 1 and 2, respectively. We also provided a label (outerLoop) to the break statement at line 3. By referring to the outerLoop in our break statement, we were able to exit the outermost loop; without the label, we would not have been able to exit the outermost loop. (The innermost loop would have been exited since it is the closest loop to the break statement.)

Source: Label Statement (JS++ Documentation)

One reason we might want to break out of an outer loop is if we are searching a jagged array. Once the element is found, we break out of both loops by breaking out of the outer loop.

Looping in Row-Major Order

Given the following C code, would it take more CPU cycles to loop the columns first or the rows first?

const int rows = 64, cols = 64;
int matrix[rows][cols];

In the code above, assume int requires 32 bits of storage space (4 bytes). Furthermore, we'll assume a cache line size of 64 bytes. We can also be assured the arrays are contiguous. From the C11 language specification, 6.2.5 Types:

An array type describes a contiguously allocated nonempty set of objects with a particular member object type, called the element type.

For visualization, treat each cell as consuming 256 bytes of memory (sizeof(int) ✕ 64 cols ✕ 1 row). The matrix will look like this in memory:

0 1 2 3 4 63

Notice the "shape" of the matrix in memory is not a square, as we might expect in mathematics for a 64 x 64 matrix. Instead, each cell represents one row containing 64 columns consuming 256 bytes each.

Equipped with this visualization, let's first examine looping columns first:

for (int i = 0; i < cols; i++) {
    for (int j = 0; j < rows; j++) {
        // ...
    }
}

This code has an access pattern that looks like this:

r0
c0, c1, c2, …
r1
c0, c1, c2, …
r2
c0, c1, c2, …
r3
c0, c1, c2, …
r4
c0, c1, c2, …
r…
c0, c1, c2, …
r…
c0, c1, c2, …
r63
c0, c1, c2, …

I've prefixed the cell values with lowercase "r" to mark row 1, 2, 3, 4, and so on. Likewise, the columns are marked c0, c1, and so on. There's a problem here though. We're jumping around in memory.

Each "row" is actually divided into 64 columns, stored contiguously. You can imagine the storage as 4096 (64 x 64) columns stored contiguously in memory.

In the first iteration of the loop, we access the first column. We enter the innermost loop, which iterates the rows. We're at c0, r0. On the next iteration of the innermost loop, we're at c0, r1. Then we're at c0, r2, and so forth. When the innermost loop finishes, we start at c1, r0.

We have a cache line size of 64 bytes. At c0, r0, we can cache the first 16 columns of r0 (64 bytes / sizeof(int) = 16). Suppose we have an L1 cache hit with a cost of ~4 cycles and a cache miss of ~100 cycles. Thus, a cache miss would be roughly 25x slower. We have the first 16 columns cached (c0, c1, c2, c3, and so on for r0), but the innermost loop immediately jumps to r1. Thus, instead of getting the data we need from the cache, we have to fetch the data from DRAM... costing us 100 cycles. We have to pay for this penalty 64 times in the innermost loop.

Clearly, this would not be efficient. It is better to just access the contiguous data in the order which it is stored:

for (int i = 0; i < rows; i++) {
    for (int j = 0; j < cols; j++) {
        // ...
    }
}
c0
r0
c1
r0
c2
r0
c3
r0
c4
r0
c…
r0
c…
r0
c63
r0

Notice in the visualization, all of the columns of row 0 are accessed first. We do not move on to row 1 until all the columns of row 0 are accessed. Coincidentally, this results in fewer cache misses. We pay for a first fetch from DRAM (100 cycles) for the first 16 columns, the next accesses come from SRAM (~4 cycles), and we only pay for three more DRAM fetches for the full 64 columns. In the end, this cache-friendly code costs us ~10x fewer cycles in the innermost loop.

Loop-Invariant Code Motion

I had a family member studying computer science at a top 10 CS school. The homework assignment presented a multiple choice question, and she was asked, "Does this code do anything?" The choices were yes/no.

int main(void) {
    int x;
    for (int i = 0; i < 10; i++) x = 2;
    return 0;
}

The correct answer is "no." Loop-invariant code motion might sound intimidating or arcane, but we can break it up into its components: 1) loop invariant, and 2) code motion. First, the compiler begins by identifying "loop invariants"—essentially, code that does not change across loop iterations. The assignment to x = 2 does not change with each iteration. (If the assignment were x = i + 1 instead, it would not qualify as a "loop invariant.") The second half of the optimization, "code motion", permits us to move code (code motion = moving code) that does not change outside of the loop—if safe. I'm emphasizing "if safe" because, if the loop condition were never true (e.g. if we changed the condition to i != 0), x = 2 should never occur.

And that, in a nutshell, is loop-invariant code motion: moving code that doesn't change outside of the loop. The code should now look like this, in high-level code:

int main(void) {
    int x;
    x = 2; // moved outside loop
    for (int i = 0; i < 10; i++);
    return 0;
}

However, we can dive deeper into modern compilers. After loop-invariant code motion, the compiler can perform a second pass: dead code elimination. The for loop is no longer doing anything, and it gets eliminated. The variable x is written to but never read; thus, it can also be eliminated. The final program, in high-level code:

int main(void) {
    return 0;
}

At a low level, if we compile the dead code, gcc -S -O2 finally gives us proper code:

int main(void) {
    int x;
    x = 2; // moved outside loop
    for (int i = 0; i < 10; i++);
    return 0;
}
main:
        xor     eax, eax
        ret

In the end, my family member reported back that the correct answer was indeed "no." I asked if she gave my detailed explanation. She said, "No, because the professor would know I cheated." (?)

There's a question relating to loop-invariant code motion (via gcc -fmove-loop-invariants) on Stack Overflow dating back 5 years, but the only answer is incorrect. The proper answer is buried in the gcc documentation:

Most optimizations are only enabled if an -O level is set on the command line. Otherwise they are disabled, even if individual optimization flags are specified.

Source: GCC Manual 3.10 Options That Control Optimization

The DCE Metaprogramming Pattern: Writing JavaScript Libraries using JS++

One of the primary concerns when writing JavaScript is the issue of cross-browser compatibility. As web browsers continue to evolve and change quickly, it is important for your code to support this rapid change and to provide a consistent user experience across all web browsers and devices. However, while JavaScript libraries like Modernizr provide a web GUI interface for you to manually select and de-select which components you need, experience has shown it is better if this process is seamless and automatic—or, rather, the compiler or build tool should automatically know which components you need or do not need. This automatic process is known as dead code elimination (DCE).

In theory, DCE performs best with static typing and static structure. Since JS++ is the first sound gradually typed language, it is possible for DCE to perform optimally, such as in the case of identifying which function overloads are necessary. This is, generally, not possible with other JavaScript supersets or with JavaScript itself. In particular, given JavaScript’s most popular runtime environment is execution via JIT engines like V8 (for Google Chrome), it is natural that DCE provides advantages by reducing parse times, analysis times, and compile times for JIT environments; thus, page load times are reduced and responsiveness improves. Beyond JIT execution, avoiding the execution of irrelevant operations reduces program running time, and smaller program sizes allow websites to load faster by reducing network payloads.

In this article, we will explore writing a JavaScript library via JS++, which supports DCE, and subsequently show you how that library can be used by JavaScript developers with no knowledge of JS++. In addition to “automatic” and seamless DCE, which is built into the JS++ language, we will also explore “programmable DCE”—a metaprogramming pattern unique to JS++.

What is Dead Code Elimination (DCE)?

Dead code elimination (DCE) means that if you don’t use a module, class, or function, it will not be compiled into your final output.

Here’s a simple example:

void A() {}
void B() {}

A();

In the above example, only the function A gets called. The function B never gets called and never gets executed. Therefore, it is safe for the compiler to not include B in the final compiled output.

A real-world example of the need for dead code elimination in JavaScript (but not JS++) is jQuery. You have to include the entire jQuery library even if you only use one function.

In JS++, it is sufficient to just write the code and compile it. Dead code elimination in JS++ is seamless and automatic—it ships as a default with JS++. The JS++ compiler is able to determine that the function B is never used, and it will not compile the code for function B—by default. In most cases, if all you want is dead code elimination, this is all you need to know, but, for cross-browser and cross-device/mobile development, it is useful to explore more sophisticated DCE.

Programmable DCE: Mobile & Library Development

Arguably, the most important reason to understand DCE is for library development. We will explore library development for mobile devices by example. At the time of this writing, the HTML5 Vibrate API is not supported on the iPhone. In a very basic example, we will develop a small library that allows the user of the library to specify which phones he wants to support (iPhone or Android), and the library will provide notifications to the end user based on the features supported by the requested device(s). We will name this library: Notify.

class Notify
{
}

Save the file as Notify.jspp.

Next, let’s define the implementation:

import Externals.DOM;

class Notify
{
    private static void vibrate() {
        window.navigator.vibrate(2000);
    }
    private static void infobox() {
    	var el = document.createElement("div");
    	el.style.border = "1px solid #000";
    	el.innerText = el.textContent = "You have a new notification.";

    	document.body.appendChild(el);
    }
}

The first line, which imports the Externals.DOM module, allows us to use the JavaScript DOM API.

The method vibrate does exactly what the method name suggests: it will make the phone vibrate (for supported devices).

Finally, the infobox method creates a DIV element and inserts it into the DOM.

Both methods are private and static. The reason is because these methods are platform-specific implementation details. For the library user, we only want to expose to him whether he wants iPhone notifications or Android notifications. Here’s how we expose this to the user:

import Externals.DOM;

class Notify
{
    private static void()? iphoneNotify = null;
    private static void()? androidNotify = null;

    public Notify(int platforms) {
    }
	
    public static property int IPHONE() {
    	iphoneNotify = infobox;
    	return 1 << 0;
    }
    public static property int ANDROID() {
    	androidNotify = vibrate;
    	return 1 << 1;
    }
    
    private static void vibrate() {
        window.navigator.vibrate(2000);
    }
    private static void infobox() {
    	var el = document.createElement("div");
    	el.style.border = "1px solid #000";
    	el.innerText = el.textContent = "You have a new notification.";

    	document.body.appendChild(el);
    }
}

Here we are defining two getter methods: IPHONE and ANDROID. These two getter methods will allow the user to specify which platforms he wants to support. In order to specify the desired platforms, we instantiate the library like so:

new Notify(Notify.IPHONE | Notify.ANDROID); // iPhone *and* Android support
new Notify(Notify.IPHONE);  // iPhone support only
new Notify(Notify.ANDROID); // Android support only

You can try compiling with the variations in the instantiation and confirm that, indeed, only the specified platform code is compiled into the final output. In essence, we get “programmable DCE.” Furthermore, rather than specifying an int return type on the getter methods, one can define an enum to create a more specific type, but this is left as an exercise to the reader.

While the example we’ve explored is very basic, in real-world applications with complex dependency graphs, a library user can experience significant reductions in code size.

Exporting to JavaScript

While JavaScript cannot support DCE—and especially not the advanced DCE patterns of JS++—we can still “pre-DCE” our code before shipping it to JavaScript users. In order to do this, we should first wrap our class in a module. In Notify.jspp:

module NotifyLib
{
    class Notify
    {
        // ...
    }
}

In JS++, there is a toExternal design pattern for exposing JS++ code and libraries to JavaScript users. We need to define a `toExternal` method in our class:

module NotifyLib
{
	class Notify
	{
		// ...

		public function toExternal() {
			void() send;
			
			if (null != iphoneNotify) {
				send = iphoneNotify ?? dummy;
			}
			else if (null != androidNotify) {
				send = androidNotify ?? dummy;
			}
			else {
				send = dummy;
			}
			
			return {
				send: void() {
					send();
				}
			};
		}
		
		private static void dummy() {
			/* INTENTIONALLY EMPTY */
		}
	}
}

Notice how, in the toExternal method, we are transitioning from static to dynamic programming. We use if statements to determine, at runtime, which method to execute. In statically-typed programming languages, one would normally have the compiler resolve the method(s) to call. The purpose of the toExternal design pattern in JS++ is to facilitate complex transitions between the static and dynamic worlds.

Next, we create three files:

  1. Notify.iPhone.jspp
  2. Notify.Android.jspp
  3. Notify.All.jspp

In this tutorial, we will implement Notify.iPhone.jspp, and the other files are left as an exercise for the reader.

In Notify.iPhone.jspp:

import NotifyLib;
import Externals.JS;

auto notify = new Notify(Notify.IPHONE);
global.Notify = notify.toExternal();

First, we import the NotifyLib library. We also import Externals.JS, which defines all JavaScript (ECMAScript 3) symbols as external (such as `Math`, `Array`, `Object`, and so forth). However, Externals.JS does define one symbol that is not in the ECMAScript 3 specification: global. It gives us universal access to JavaScript’s global object, and this non-standard object was added for convenience so that JS++ users would not need to learn all the edge cases that come with trying to access JavaScript’s global scope (such as window being a DOM API object that is not defined in Node.js). Once we are able to access JavaScript’s global scope, we just export our JS++ library to it by converting it to the `external` type (via calling the toExternal method).

Compile Notify.iPhone.jspp:

> js++ Notify.iPhone.jspp Notify.jspp -o Notify.iPhone.js

Now, it should be straightforward to use the `Notify` library you developed completely in JS++ from plain JavaScript:

<!DOCTYPE html>
<html>
<head>
<title>Notify</title>
</head>
<body>
<script type="text/javascript" src="Notify.iPhone.js"></script>
<script type="text/javascript">
Notify.send();
</script>
</body>
</html>

The example for the iPhone above will insert a DOM notification on page load. (Note that, for Android devices, which will vibrate, user interaction is required before the vibration will trigger on the phone for security reasons. Keep the security restriction in mind when compiling Notify.Android.js.) As you compile the remaining files, you will observe that the file sizes are very different—reflecting how only the code for the specified platforms are shipped.

Conclusion

In this article, we have learned several advanced techniques unique to JS++, from programmable DCE to exporting an entire library to JavaScript. It should be clear JS++ is a powerful language, but, since its libraries can be used in JavaScript, there is no “vendor lock-in.”

Tips & Tricks: Object-oriented Sorting in JS++ with IComparable<T>

JS++ makes object-oriented sorting easy with the IComparable<T> interface and the Comparison enumeration for type-safe (and readable) comparisons.

Here’s the code. (Don’t worry; I’ll dissect it.)

import System;

class Employee : IComparable<Employee>
{
    private string firstName;
    private string lastName;

    public Employee(string firstName, string lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    public Comparison compare(Employee that) {
        // Sort by employee surname
        return this.lastName.compare(that.lastName);
    }

    public override string toString() {
    	return this.firstName + " " + this.lastName;
    }
}

Employee zig  = new Employee("Zig", "Ziglar");
Employee john = new Employee("John", "Smith");
Employee abe  = new Employee("Abe", "Lincoln");

Employee[] employees = [ zig, john, abe ];
employees.sort();
Console.log(employees.join(", "));

// Output:
// Abe Lincoln, John Smith, Zig Ziglar

This is beautiful, object-oriented code. All of the custom sorting logic is one line of code. Let’s break down how that happens step-by-step.

1. Implement IComparable<T>

The first step is to implement the IComparable<T> interface. The interface provides only one method to implement: compare.

compare expects the Comparison enumeration as a result. As we can see from the documentation, Comparison can have three possible results: LESS_THAN, GREATER_THAN, and EQUAL. While Java/C# expect -1, 0, and 1, JS++ gives you type-safe and readable comparisons.

IComparable<T> and Comparison form the basis for custom sorting.

2. Determine how to sort

We want to sort Employee objects based on the employee’s last name. In order to do this, we want to compare strings and sort in alphabetical order. While we can do this manually, the JS++ Standard Library already provides these comparisons for us for primitive types.

All primitive types in JS++ are auto-boxed. (Don’t worry, it gets optimized away.) In addition, all primitive types implement IComparable<T> (which provides the compare method).

Thus, since all primitive types provide the compare method, sorting is as easy as this one line of code:

return this.lastName.compare(that.lastName);

This is calling the System.String.compare method, which compares strings lexicographically (in alphabetical order). (Likewise, if you wanted to compare by employee ID number, you might declare an unsigned int and use System.UInteger32.compare.)

Thus, our sorting code and implementation of IComparable<T>.compare is just:

public Comparison compare(Employee that) {
    // Sort by employee surname
    return this.lastName.compare(that.lastName);
}

3. Define toString() Behavior

In addition, we want to be able to easily visualize our sorted arrays. Therefore, we should define how our Employee class looks when converted to a string so we can easily call System.Console.log on it.

JS++ internal types use a “unified type system” where everything inherits from System.Object. If we look at the System.Object.toString documentation, we can see that System.Object.toString is a virtual method based on its signature:

public virtual string toString()

We override it with this code:

public override string toString() {
    return this.firstName + " " + this.lastName;
}

Thus, whenever we want a string representation of our Employee object, we will get the employee’s first name followed by his last name. This will help us visualize our sorted employees.

4. Instantiate some Employees

The next lines of code instantiate the Employee class and inserts them in an array:

Employee zig  = new Employee("Zig", "Ziglar");
Employee john = new Employee("John", "Smith");
Employee abe  = new Employee("Abe", "Lincoln");

Employee[] employees = [ zig, john, abe ];

Currently, the array is unsorted, and “Zig Ziglar” will be the first element.

5. Sort the Array

Sorting is as simple as one line of code:

employees.sort();

It’s just one line of code because we implemented IComparable<T>. Instead of implementing IComparable<T>, we could have also used the other overload of Array.sort, which expects a callback:

employees.sort(Comparison(Employee a, Employee b) {
    return a.lastName.compare(b.lastName);
});

The callback allows flexibility; for example, you may choose to sort by employee first name in some cases.

Implementing IComparable<T> simply provides a default sort so you can use System.Array.sort without a callback. These are the signatures for the System.Array.sort overloads:

public T[] sort() where T: IComparable<T>
public T[] sort(Comparison(T element1, T element2) comparator)

Thus, if you do not provide a callback, you are using the overload that expects a class implementing IComparable<T>. If you try to sort objects whose respective classes do not implement the IComparable interface, you’ll receive an error:

[  ERROR  ] JSPPE5056: System.Array.sort()' can only sort classes implementing 'IComparable'. Please implement 'IComparable' for `Employee' or use 'System.Array.sort(Comparison(T element1, T element2) comparator) at line 23 char 0 at test.js++

6. Print the Result

The final step is to just print the result:

Console.log(employees.join(", "));

Et voila!

(The toString method we implemented earlier will get called for each element that gets joined. Thus, you get a readable output.)

Tips & Tricks: Overriding ‘toString’

JS++ has a default ‘toString’ method implementation but, sometimes, it is necessary to override this implementation. For example, when using Console.log, it may be desirable to be able to fully log and inspect a complex JS++ object.

In addition to the Unified External Type, there is also a “Unified Internal Type”: System.Object. All JS++ classes, including user-defined classes, inherit from System.Object. Due to auto-boxing, even primitive types such as int (wrapped by System.Integer32), inherit from System.Object.

Aside: Don’t worry about the performance implications of auto-boxing. JS++ is able to optimize auto-boxing to the point that toString is actually 7.2% faster in JS++ than JavaScript in the worst case (assuming the JavaScript variable is monomorphically-typed) and more than 50% faster for polymorphically-typed (and potentially type-unsafe) JavaScript variables as shown in benchmarks here.

System.Object has a toString method which is marked as virtual. In other words, this method can be overridden by derived classes – which are effectively all classes in JS++. Here’s an example of how to do it:

import System;

class Point
{
    int x;
    int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    override string toString() {
        return "(" + x.toString() + ", " + y.toString() + ")";
    }
}

Point p = new Point(1,2);
Console.log(p); // "(1, 2)"

You’ll notice the Console.log statement doesn’t even make an explicit toString call. The reason is because passing any JS++ object to Console.log will call the toString method on the object for you.

Tips & Tricks: fromExternal/toExternal Design Pattern

JS++ provides toString and fromString (one example) methods in the Standard Library. However, it can be argued that the external type is just as important or even more important in JS++ as string.

We introduce a design pattern for converting complex user-defined JS++ types (such as classes) to JavaScript.

toExternal

You can define a toExternal method that enables you to convert an object of an internal type to external:

import System;

class Point
{
    int x;
    int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    function toExternal() {
        return {
            x: x,
            y: y
        };
    }
}

Point p = new Point(2, 3);
var p2 = p.toExternal(); // conversion to 'external'
Console.log(p2.x); // 2
Console.log(p2.y); // 3

fromExternal

Likewise, you can convert incoming JavaScript data to a complex, user-defined JS++ type:

import System;
import System.Assert;

class Point
{
    int x;
    int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    static Point fromExternal(obj) {
        assert(typeof obj == "object", "Expected external 'object' type");
        assert(typeof obj.x == "number", "Expected incoming external to have numeric 'x' property");
        assert(typeof obj.y == "number", "Expected incoming external to have numeric 'y' property");
        return new Point(Integer32.fromString(obj.x), Integer32.fromString(obj.y));
    }
}

Point p1 = Point.fromExternal({ x: 2, y: 3 });
// Point p2 = Point.fromExternal({ x: "x", y: 3 }); // this will fail
// Point p3 = Point.fromExternal({ x: 2, y: "y" }); // this will fail

Protecting References

For functions, you don’t want to send out a direct reference to external JavaScript. Otherwise, external JavaScript code can modify the JS++ reference in unsafe ways. Therefore, you should wrap the function using closures:

class Foo
{
    void bar() {}

    function toExternal() {
        Foo self = this;

        return {
            bar: void() {
                self.bar();
            }
        };
    }
}

Furthermore, you can use the Standard Library’s System.BoxedExternal to handle this case without wrapping in a closure:

import System;

class Foo
{
    BoxedExternal bar;

    Foo() {
        this.bar = new BoxedExternal(void() {
        // ...
        });
    }
 
    function toExternal() {
        return {
            bar: this.bar
        };
    }
}

If the reference to the function accidentally escapes to external, you’ll be alerted by the compiler:

[ ERROR ] JSPPE5000: Cannot convert `System.Dictionary‘ to `external’ at line 14 char 15

However, if you actually intended to allow the function reference to escape to external, you can call the unbox method on System.BoxedExternal:

import System;

class Foo
{
    BoxedExternal bar;

    Foo() {
        this.bar = new BoxedExternal(void() {
            // ...
        });
    }
 
    function toExternal() {
        return {
            bar: this.bar.unbox()
        };
    }
}

The above code will now compile and the bar function can be passed to external code. However, unlike the code where we wrapped the function in a closure, the external code can now modify the reference to the bar function directly so be careful.

For arrays and containers, you can likewise pass a shallow copy or manually clone each element – depending on the level of trust and safety you desire.

Tips & Tricks: Only Fields are ‘private’ by Default

Programmers often complain about the verbosity of Java. Once you specify all the modifiers that must be applied, it’s not difficult to see how it can quickly become verbose:

public static void veryLongNamingConventions() {
    // ...
}

JS++ does this differently. Following the OOP principle of encapsulation, JS++ provides convenient default rules for access modifiers.

By default, only fields (variable members of classes) are private. All other class members – such as methods, getters, setters, and constructors – are public by default.

This makes it very easy to write concise code:

class Point
{
    int x, y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int getX() { return this.x; }
    int getY() { return this.y; }
}

In the above code, the fields x and y are private. Meanwhile, the constructor and the getX/getY methods are all public. We can be explicit and manually specify the access modifiers, but it’s not necessary in JS++.

Tips & Tricks: Structural Typing (in a Nominal Type System) with ‘auto’ Type Inference

By default, JS++ uses nominal typing over structural typing. However, structural typing can be achieved using auto.

The following is a contrived example but illustrates a consideration in your application when using auto:

class Foo
{
    bool isError() { return false; }
}

auto foo = new Foo();
if (foo.isError()) {
    exit(1);
}

Now, during refactoring:

class Foo
{
    bool isError() { return false; }
}
class Bar
{
    bool isError() { return true; }
}

auto foo = new Bar(); // type changed
if (foo.isError()) { // this effectively becomes static duck typing
    exit(1);
}

Since the variable foo above effectively becomes “duck typed” (or, more precisely, structurally typed), it becomes harder for the compiler to potentially catch your errors.

While the above example is contrived, it’s not difficult to see how this can crop up in more complex applications.

Metaprogramming: Programmable DCE (Dead Code Elimination) in JS++

Problem:

You’re writing a library in JS++. You need it to support as many web browsers as possible.

Some of your users’ visitors are on latest Chrome and support the latest HTML5 features. Some of your users’ visitors are from developing nations with limited mobile devices. Some of your users’ visitors are corporate users with legacy web browsers running intranet applications, and, if it isn’t broken, they won’t “fix” it.

In other words, you need to support as many web browsers as possible. However, adding polyfills for all browsers will slow down the library for all users: more code to deliver over the network, more parsing and initialization time on mobile, more code to execute at runtime, and so on.

Solution:

Programmable Dead Code Elimination (DCE).

In the same way that template metaprogramming was “accidentally” discovered in C++, programmable DCE was a technique that we never originally intended but stumbled upon. I’m filing under advanced JS/JS++ tips & tricks. You should know how to use bitmasks, getters, classes, and JS++ types (because JS++ is not a linter like TypeScript).


What is Dead Code Elimination (DCE)?

Dead code elimination (DCE) means that if you don’t use a module, class, or function, it will not be compiled into your final output.

Here’s a simple example:

void A() {}
void B() {}

A();

In the above example, only the function A gets called. The function B never gets called and never gets executed. Therefore, it is safe for the compiler to not include B in the final compiled output.

A real-world example of the need for dead code elimination in JavaScript (but not JS++) is jQuery. You have to include the entire jQuery library even if you only use one function.

Live Example: JS++ Sockets (Real-time Streaming)

The JS++ Sockets library provides low-overhead real-time streaming even for legacy web browsers. It’s “low overhead” because we avoid long polling in almost every case except for the very first Android phones (HTC Dream and the like). Therefore, you get real-time streaming without the HTTP header overhead from constant polling, and you don’t have to deal with the delays of polling. (This can be useful for multiplayer games, corporate finance applications on legacy browsers where excessive latency is undesirable, or just to save a pretty penny on your bandwidth bill.)

JS++ has had support for dead code elimination since October 2016 (version 0.4.2). However, it wasn’t until we worked on JS++ Sockets that we encountered the need for “programmable” dead code elimination. In addition, JS++ users have been asking us to stop shipping polyfills for old browsers they don’t need because all their customers are on modern browsers.

JS++ Sockets can be instantiated like so:

new Socket(
    "127.0.0.1",
    8081,
    Socket.WEBSOCKETS | Socket.IE6 | Socket.IE7
);

It’s in the bitmask where all the magic happens. If your customers don’t use IE6, don’t include the Socket.IE6 option. If your customers use ONLY IE7 (like some big companies supporting legacy code), you can specify only the IE7 option. This gives the users of your library the flexibility; they’re not shipping thousands of lines of code they don’t need.

What’s interesting is that the options in the bitmask are actually getter functions. It only looks like an enum-based bitmask. We could have just as easily made them “regular” functions, and it becomes easier to visualize how “programmable DCE” happens:

new Socket(
    "127.0.0.1",
    8081,
    Socket.WEBSOCKETS() | Socket.IE6() | Socket.IE7()
);

Remember the basic principle of DCE: if a function isn’t called, it isn’t included in the final output. When you view the above code as function calls – rather than bitmasks – you can begin to visualize how we can enable the user to customize the polyfills she wants. If the user did not call Socket.IE6(), the IE6 polyfills will not be included.

Even if your function is a 1000+ line monster with complex polyfill code, if it isn’t called, it isn’t compiled.

The Programmable DCE Pattern

In a nutshell, programmable DCE happens like this:

  1. Define a private static polyfill method.
  2. Define a public static getter that returns a bitmask-compatible value and invokes the relevant private static polyfill method.
  3. Repeat steps 1-2 for each browser/polyfill you need to support.
  4. Define a constructor that accepts bitmasked options.

Everything is perfectly encapsulated. When the user instantiates your class, they will need to use the bitmask. The bitmask values invoke getters with custom logic. The custom logic invokes the polyfills the user needs.