{"id":463,"date":"2017-11-19T05:26:54","date_gmt":"2017-11-19T05:26:54","guid":{"rendered":"http:\/\/www.onux.com\/jspp\/blog\/?p=463"},"modified":"2017-11-19T06:33:53","modified_gmt":"2017-11-19T06:33:53","slug":"design-notes-why-isnt-system-array-length-an-unsigned-int","status":"publish","type":"post","link":"https:\/\/www.onux.com\/jspp\/blog\/design-notes-why-isnt-system-array-length-an-unsigned-int\/","title":{"rendered":"Design Notes: Why isn&#8217;t System.Array.length an &#8216;unsigned int&#8217;?"},"content":{"rendered":"<p>It makes sense, doesn&#8217;t it? Array sizes should be <code>unsigned<\/code> because they can never be negative. Yet, JS++ chose to make <a href=\"https:\/\/docs.onux.com\/en-US\/Developers\/JavaScript-PP\/Standard-Library\/System\/Array\/length\" rel=\"noopener\" target=\"_blank\">System.Array&lt;T&gt;.length<\/a> return a signed 32-bit integer (<code>int<\/code>). We&#8217;ve discussed this internally, and the underlying reasons are not so simple.<\/p>\n<h3>JavaScript<\/h3>\n<p>The most important reason is that this is a bug in JavaScript and ECMAScript 3&#8217;s original design. ECMAScript 3 15.4 specifies that Array#length is an unsigned 32-bit integer. 
However, it gets a bit tricky when you view these method signatures:<\/p>\n<ul>\n<li><code>T[] Array#slice(int start)<\/code> (ES3 15.4.4.10)<\/li>\n<li><code>T[] Array#slice(int start, int end)<\/code> (ES3 15.4.4.10)<\/li>\n<li><code>T[] Array#splice(int startIndex)<\/code> (ES3 15.4.4.12)<\/li>\n<li><code>T[] Array#splice(int startIndex, int deleteCount)<\/code> (ES3 15.4.4.12)<\/li>\n<li><code>T[] Array#splice(int startIndex, int deleteCount, ...T replaceElements)<\/code> (ES3 15.4.4.12)<\/li>\n<li><code>int Array#indexOf(T element)<\/code> (ES5 15.4.4.14)<\/li>\n<li><code>int Array#indexOf(T element, int startingIndex)<\/code> (ES5 15.4.4.14)<\/li>\n<li><code>int Array#lastIndexOf(T element)<\/code> (ES5 15.4.4.15)<\/li>\n<li><code>int Array#lastIndexOf(T element, int endingIndex)<\/code> (ES5 15.4.4.15)<\/li>\n<\/ul>\n<p>All of the above deal with array indexes as <em>signed<\/em> 32-bit integers even though the specification clearly states array lengths are <em>unsigned<\/em>. Specifically, if we indexed arrays using <code>unsigned int<\/code>, we would <strong>break<\/strong> JavaScript&#8217;s <code>indexOf<\/code> and <code>lastIndexOf<\/code> (because they return <code>-1<\/code> when the element is not found). This gets further complicated because <code>Array#push<\/code> and <code>Array#unshift<\/code>, which return <code>Array#length<\/code>, return <em>unsigned<\/em> 32-bit integers.<\/p>\n<p>I did bring forward an internal proposal to index arrays as <code>unsigned int<\/code>, but I shut down my own proposal after realizing that it would break <code>indexOf<\/code> and <code>lastIndexOf<\/code>, which was just unacceptable.<\/p>\n<p>In other words, we were handicapped by JavaScript in our design (as we often are).<\/p>\n<h3>Java and C#<\/h3>\n<p>A lot of website backends are written in Java, C#, PHP, and &#8211; nowadays &#8211; JavaScript. 
JavaScript and PHP are dynamically-typed, so you don&#8217;t have to worry about signed\/unsigned, but this brings me to Java and C#.<\/p>\n<p>Java doesn&#8217;t have unsigned integer types. I actually feel like this can be a good design decision in some ways. It makes reverse array iteration intuitive and obvious: just flip the logic for forward random-access iteration around. Likewise, in C#, <a href=\"https:\/\/msdn.microsoft.com\/en-us\/library\/27b47ht3(v=vs.110).aspx\" target=\"_blank\">List&lt;T&gt;.Count<\/a> returns a <em>signed<\/em> integer (32-bit). Just as in Java, reverse iteration with a <code>for<\/code> loop is just flipping the logic around.<\/p>\n<p>With signed integers, you don&#8217;t have to worry about the loop counter wrapping around when it is decremented past zero. If you perform forward iteration with:<\/p>\n<pre class=\"brush:c\">\r\nfor (int i = 0; i < list.Count; ++i);\r\n<\/pre>\n<p>Then, intuitively, reverse iteration might look like:<\/p>\n<pre class=\"brush:c\">\r\nfor (int i = list.Count - 1; i >= 0; --i);\r\n<\/pre>\n<p>Of course, this won't work in C\/C++ if <code>i<\/code> is an unsigned type such as <code>size_t<\/code>: the condition <code>i &gt;= 0<\/code> is always true, and decrementing past zero on the final iteration wraps around to the largest representable value.<\/p>\n<p>Once again, in dynamic languages like JavaScript, you don't even have to worry about such things. It is all abstracted away by dynamic typing.<\/p>\n<h3>Reverse Array Iteration<\/h3>\n<p>Reverse array iteration over unsigned types becomes non-trivial. Anyone who has done this in C\/C++ will know what I mean. The <em>correct<\/em> way to do it has to take unsigned wraparound into account. In C and C++, array sizes are unsigned, and C doesn't have C++ reverse iterators. Here's the code in C:<\/p>\n<pre class=\"brush:c\">\r\nint arr[3] = { 1, 2, 3 };\r\nsize_t len = sizeof(arr)\/sizeof(arr[0]);\r\n\r\nfor (size_t i = len; i --> 0;) {\r\n    printf(\"%d\\n\", arr[i]);\r\n}\r\n<\/pre>\n<p>So you initialize <code>i<\/code> to the length of the array (without subtracting 1), and <code>i --&gt; 0<\/code> is more clearly written as <code>(i--) &gt; 0<\/code>: the comparison uses the old value of <code>i<\/code>, and the decrement takes effect before the loop body runs. 
Thus, inside the loop body, the highest index you access is <code>length - 1<\/code>, and the loop counts down to index zero.<\/p>\n<p>However, this isn't intuitive unless you come from a C\/C++ background, and most C\/C++ programmers are not web developers.<\/p>\n<h3>Conclusion<\/h3>\n<p>Reverse iteration in <code>for<\/code> loops may or may not be intuitive for you, but I didn't want users tearing their hair out over a basic programming exercise of iterating over an array backwards. Coupled with the fact that ECMAScript 3's original design was buggy, it only made sense to use <code>int<\/code> instead of <code>unsigned int<\/code> to avoid breaking old code from JavaScript.<\/p>\n<p>Oh, and <code>int<\/code> is just so much more pleasant to type than <code>unsigned int<\/code> with casts everywhere.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>It makes sense, doesn&#8217;t it? Array sizes should be unsigned because they can never be negative. Yet, JS++ chose to make System.Array&lt;T&gt;.length return a signed 32-bit integer (int). We&#8217;ve discussed this internally, and the underlying reasons are not so simple. 
JavaScript The most important reason is that this is a bug in JavaScript and ECMAScript &hellip; <a href=\"https:\/\/www.onux.com\/jspp\/blog\/design-notes-why-isnt-system-array-length-an-unsigned-int\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Design Notes: Why isn&#8217;t System.Array.length an &#8216;unsigned int&#8217;?&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[3,2],"tags":[],"_links":{"self":[{"href":"https:\/\/www.onux.com\/jspp\/blog\/wp-json\/wp\/v2\/posts\/463"}],"collection":[{"href":"https:\/\/www.onux.com\/jspp\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.onux.com\/jspp\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.onux.com\/jspp\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.onux.com\/jspp\/blog\/wp-json\/wp\/v2\/comments?post=463"}],"version-history":[{"count":17,"href":"https:\/\/www.onux.com\/jspp\/blog\/wp-json\/wp\/v2\/posts\/463\/revisions"}],"predecessor-version":[{"id":480,"href":"https:\/\/www.onux.com\/jspp\/blog\/wp-json\/wp\/v2\/posts\/463\/revisions\/480"}],"wp:attachment":[{"href":"https:\/\/www.onux.com\/jspp\/blog\/wp-json\/wp\/v2\/media?parent=463"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.onux.com\/jspp\/blog\/wp-json\/wp\/v2\/categories?post=463"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.onux.com\/jspp\/blog\/wp-json\/wp\/v2\/tags?post=463"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}