Sunday, May 15, 2011

Porting to JS 1.8.5 with GPSEE 0.3: Part III, Strings

SpiderMonkey 1.8.5 changed a number of internal details with respect to JSString representation. Fortunately, most of these changes do not affect the JSAPI embedder writing well-behaved applications.

The biggest changes concern how the memory backing a JSString is allocated. JSAPI no longer has functions that take memory allocated by the embedder and adopt it as the backing store for a JSString object, nor will it GC-entrain alternate (C string) representations to accommodate the embedder.

JS_GetStringBytes

The JS_GetStringBytes() function has been removed from SpiderMonkey 1.8.5. This function previously returned an approximate representation of the JSString as a C string (the rules for moving from 16- to 8-bit strings vary based on JS_CStringsAreUTF8()). This memory was allocated by the JS Garbage Collector, and the GC held a reference to it until the JSString was collected.

GPSEE 0.3 introduces gpsee_getStringBytes(), in an effort to restore this functionality. This allows fast updating of existing JS embeddings, and it makes it possible to write embeddings which continue to work where the newer interfaces (like JS_EncodeBuffer()) are not available.

Please be very careful when substituting gpsee_getStringBytes() for JS_GetStringBytes(): the memory allocation semantics are subtly different. The GPSEE version of this function will free the returned C string when the calling function returns (i.e. the string is allocated with alloca()). We suggest examining each call site, and making sure the resultant pointer does not escape the calling function; if it does, the string should be strdup'd and later free'd.

-  cstrId = JS_GetStringBytes(jstrId);
+  cstrId = gpsee_getStringBytes(cx, jstrId);
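For call sites where the pointer does escape, here is a minimal sketch of the copy-before-return pattern. It does not call into GPSEE at all: toy_record, toy_record_set_id() and toy_record_clear() are hypothetical names invented for this illustration, and the "transient" argument simply stands in for any C string whose storage disappears when the caller returns.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type that keeps an identifier after the caller returns. */
typedef struct {
  char *id;   /* heap-allocated, owned by the record */
} toy_record;

/* Copy a short-lived C string into storage owned by the record.
 * Returns 1 on success, 0 if memory could not be allocated. */
static int toy_record_set_id(toy_record *rec, const char *transient)
{
  size_t len;
  char *copy;

  assert(rec != NULL && transient != NULL);
  len = strlen(transient) + 1;      /* include the terminating NUL */
  copy = malloc(len);
  if (!copy)
    return 0;
  memcpy(copy, transient, len);
  free(rec->id);                    /* release any previous identifier */
  rec->id = copy;
  return 1;
}

/* Release the storage owned by the record. */
static void toy_record_clear(toy_record *rec)
{
  free(rec->id);
  rec->id = NULL;
}
```

In an embedding, we would expect the result of gpsee_getStringBytes() to be assigned to a local variable first and then handed to a helper like this one before the function returns; the record then owns its own copy and must free it later.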

JS_GetStringChars

This interface was removed from JSAPI, but we have re-created it with GPSEE-0.3, with one important difference: the returned pointer is now const. This means that your embedding or GPSEE modules will probably generate a lot of warnings like this:

vm.c:234: warning: assignment discards qualifiers from pointer target type

The solution is to change your variable declarations from jschar * to const jschar * whenever they hold the return value of JS_GetStringChars(). This will also help you to audit your code for any call sites which try to modify string data where it is stored by the JS engine: this has always been unsupported, but it is now more dangerous than ever.

   JSString      *str;
-  jschar        *ptr;
+  const jschar  *ptr;

   str = JS_ValueToString(cx, argv[0]);
   if (!str)
    return JS_FALSE; /* OOM */
   argv[0] = STRING_TO_JSVAL(str);                  /* Provide a temporary GC root */
   ptr = JS_GetStringChars(str);
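If a call site really does need to change the characters, one option is to copy them first and modify the copy. The following is only a sketch under stated assumptions: toy_jschar is a stand-in typedef for the 16-bit jschar, toy_upcase_copy() is a hypothetical helper, and nothing here touches the JS engine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint16_t toy_jschar;   /* stand-in for the engine's 16-bit jschar */

/* Return a malloc'd, upper-cased copy of a buffer of 16-bit characters.
 * Only ASCII a-z is changed. The source is only read, so it stays const.
 * The caller owns the result and must free() it. */
static toy_jschar *toy_upcase_copy(const toy_jschar *src, size_t len)
{
  toy_jschar *copy;
  size_t i;

  assert(src != NULL || len == 0);
  copy = malloc((len ? len : 1) * sizeof *copy);  /* never request 0 bytes */
  if (!copy)
    return NULL;
  for (i = 0; i < len; i++) {
    toy_jschar c = src[i];
    copy[i] = (c >= 'a' && c <= 'z') ? (toy_jschar)(c - 'a' + 'A') : c;
  }
  return copy;
}
```

The engine-owned buffer is never written to, so the const qualifier on the source pointer costs nothing and the compiler will flag any accidental store through it.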
Aside
The assignment to argv[0] in this sample is an example of code using the old "root-as-you-go" GC safety technique. It is no longer necessary with SpiderMonkey 1.8.5, as str is stored on the C stack, and as such will be found by the new stack-scanning conservative garbage collector.

Porting to JS 1.8.5 with GPSEE 0.3: Part II, Fat JSValues

JavaScript 1.8.5 changed the underlying representation of jsval, the fundamental data type which represents every possible value a JavaScript variable can hold.

In previous versions of SpiderMonkey, jsval was represented in JSAPI as a tagged 32-bit integer. The bottom three bits were reserved for type-tagging information (two of those tag bits were re-used for data when the jsval stored an integer), all pointers were 8-byte aligned, and floating-point numbers (and integers larger than 31 bits) were stored in the jsval as pointers to jsdouble. The type-checking and manipulation macros, e.g. JSVAL_IS_DOUBLE and INT_TO_JSVAL, abstracted the bit-shifting and masking operations required to extract the relevant tagging information, but left visible to the API consumer the fact that doubles were stored on the heap rather than in the jsval itself, unlike integers of up to 31 bits. (Recall that JavaScript does not actually have the concept of integers and floats, just "numbers" -- but we differentiate in JSAPI because C integers are so much faster to work with, and more commonly used in JavaScript programs.)
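To make the old scheme concrete, here is a simplified sketch of word-sized tagging. It is an illustration only, not the real JSAPI macros: every toy_ name and the specific tag values are assumptions chosen for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef intptr_t toy_val;              /* one machine word, like the old jsval */

#define TOY_TAG_MASK   ((toy_val)0x7)  /* low three bits carry the type tag */
#define TOY_TAG_INT    ((toy_val)0x1)  /* low bit set: the other bits are an integer */
#define TOY_TAG_DOUBLE ((toy_val)0x2)  /* tagged pointer to a heap-allocated double */

static int toy_is_int(toy_val v)    { return (v & TOY_TAG_INT) != 0; }
static int toy_is_double(toy_val v) { return (v & TOY_TAG_MASK) == TOY_TAG_DOUBLE; }

/* Shift the integer up one bit and set the integer tag.
 * With a 32-bit word only 31-bit integers survive the shift. */
static toy_val toy_from_int(int32_t i)
{
  return (toy_val)(((uintptr_t)(intptr_t)i << 1) | (uintptr_t)TOY_TAG_INT);
}

/* Undo the shift. Assumes >> on a negative value is an arithmetic shift,
 * which holds on mainstream compilers. */
static int32_t toy_to_int(toy_val v)
{
  assert(toy_is_int(v));
  return (int32_t)(v >> 1);
}

/* Box a double on the heap and tag the pointer; returns 0 if malloc fails. */
static int toy_new_double(double d, toy_val *out)
{
  double *p = malloc(sizeof *p);
  if (!p)
    return 0;
  /* The tag lives in the low bits, so the pointer must be 8-byte aligned. */
  assert(((uintptr_t)p & (uintptr_t)TOY_TAG_MASK) == 0);
  *p = d;
  *out = (toy_val)((uintptr_t)p | (uintptr_t)TOY_TAG_DOUBLE);
  return 1;
}

/* Strip the tag bits to recover the pointer to the boxed double. */
static double *toy_to_doubleptr(toy_val v)
{
  assert(toy_is_double(v));
  return (double *)((uintptr_t)v & ~(uintptr_t)TOY_TAG_MASK);
}
```

Note how every double costs a heap allocation and a pointer dereference, while small integers live entirely inside the word. In this sketch the caller frees the boxed double by hand; in the engine that storage was managed by the garbage collector.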

In SpiderMonkey 1.8.5, we have "fatvals" - again, we have a C type named jsval which encodes all possible JavaScript values, but now it is represented by a 64-bit quantity, which is a C struct in debug builds. This means that we can no longer use expressions like INT_TO_JSVAL(123) in a C switch statement, and we can no longer directly compare jsvals with the C == operator.
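The effect on C code can be sketched with a toy struct-wrapped value. This is not the engine's actual layout: toy_fatval, the tag constant and the helper names are all made up for illustration. The point is only that a struct cannot appear in a case label or on either side of ==, so code has to unpack the value first.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint64_t bits; } toy_fatval;   /* 64-bit value wrapped in a struct */

#define TOY_TAG_INT32 UINT64_C(0xFFFF0001)      /* made-up tag, kept in the upper 32 bits */

static toy_fatval toy_from_int(int32_t i)
{
  toy_fatval v;
  v.bits = (TOY_TAG_INT32 << 32) | (uint64_t)(uint32_t)i;
  return v;
}

static int toy_is_int(toy_fatval v) { return (v.bits >> 32) == TOY_TAG_INT32; }

static int32_t toy_to_int(toy_fatval v)
{
  uint32_t low = (uint32_t)(v.bits & UINT64_C(0xFFFFFFFF));
  int32_t i;
  assert(toy_is_int(v));
  memcpy(&i, &low, sizeof i);    /* reinterpret the payload without a lossy cast */
  return i;
}

/* "a == b" does not compile for structs, so compare the bits explicitly. */
static int toy_same_bits(toy_fatval a, toy_fatval b) { return a.bits == b.bits; }

static const char *toy_describe(toy_fatval v)
{
  if (!toy_is_int(v))
    return "not an integer";
  switch (toy_to_int(v)) {       /* switch on the unpacked C int, not on the value */
    case 123: return "matched 123";
    default:  return "some other integer";
  }
}
```

A bitwise comparison like toy_same_bits() only tests identity of the encoded value. It says nothing about JavaScript equality semantics, for which an embedding should rely on the engine's own comparison functions.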

The new jsval representation also now directly encodes doubles, rather than pointers to doubles on the heap. This alone yields a significant performance boost, but it causes code breakage for those using the JSDOUBLE_ macros, rather than more flexible ways of converting numbers like JS_ConvertArguments() or JS_NewNumberValue().
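As a rough illustration of why no heap allocation is needed any more, here is a sketch of storing a double directly in a 64-bit value, with other types hidden in bit patterns that would otherwise be NaNs. The names and the tag value are hypothetical, the sketch assumes 64-bit IEEE 754 doubles, and it ignores the NaN canonicalization a real engine needs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint64_t bits; } toy_boxval;

/* Made-up tag: as the upper 32 bits of a double this pattern is a NaN,
 * so no ordinary number can be mistaken for a tagged integer. */
#define TOY_BOX_TAG_INT32 UINT64_C(0xFFFF0001)

/* Store the double's own bit pattern; nothing is allocated. */
static toy_boxval toy_box_double(double d)
{
  toy_boxval v;
  memcpy(&v.bits, &d, sizeof v.bits);
  return v;
}

static toy_boxval toy_box_int(int32_t i)
{
  toy_boxval v;
  v.bits = (TOY_BOX_TAG_INT32 << 32) | (uint64_t)(uint32_t)i;
  return v;
}

static int toy_box_is_int(toy_boxval v)    { return (v.bits >> 32) == TOY_BOX_TAG_INT32; }
/* This toy has only two types, so anything that is not an int is a double. */
static int toy_box_is_double(toy_boxval v) { return !toy_box_is_int(v); }

static double toy_unbox_double(toy_boxval v)
{
  double d;
  assert(toy_box_is_double(v));
  memcpy(&d, &v.bits, sizeof d);
  return d;
}

static int32_t toy_unbox_int(toy_boxval v)
{
  uint32_t low = (uint32_t)(v.bits & UINT64_C(0xFFFFFFFF));
  int32_t i;
  assert(toy_box_is_int(v));
  memcpy(&i, &low, sizeof i);
  return i;
}
```

Because the double travels inside the value, there is no pointer to hand out, which is why code written around pointer-to-double macros stops compiling.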

GPSEE 0.3 provides macros to allow you to maintain the pre-1.8.5 pointer-to-double storage semantic for doubles.  These macros allow old code to be converted quickly, while maintaining backwards compatibility with older versions of JSAPI -- GPSEE 0.3 built against older-JSAPI preserves the new macros, manipulating the older datatypes appropriately.  The affected macros are DOUBLEPTR_TO_JSVAL and JSVAL_TO_DOUBLEPTR; they behave the same as DOUBLE_TO_JSVAL and JSVAL_TO_DOUBLE from previous versions of JSAPI.

We have also provided a replacement implementation of JS_NewDoubleValue(), which was removed from JSAPI.

Monday, April 4, 2011

Porting to JS 1.8.5 with GPSEE 0.3: Part I, Native Constructors

Introduction

SpiderMonkey 1.8.5 was released last week; with it came a whole host of JS API changes. These changes will affect any non-trivial native (C, C++) modules written for GPSEE. This post is the first of a multi-part series detailing how to easily handle these API changes.

While these API changes are a little painful from the embedder's point of view, they are certainly worth it for anybody who is interested in fast JavaScript. Here is a graph showing performance improvements over time; we started GPSEE shortly before the release of JS 1.7:

Note: JS 1.8.5 support requires GPSEE 0.3, which has not been merged with mainline code yet. The unstable (and currently unusable) branch is available at http://code.google.com/r/wes-js185/.

Porting a Native Constructor

GPSEE 0.3 defines three macros for this task:
  • GPSEE_SLOW_CONSTRUCTOR(MyConstructor): Emits a reference (function pointer) to the implementation of the constructor.
  • GPSEE_IS_SLOW_CONSTRUCTING(cx): Inside your constructor, this is equivalent to the JS 1.8.0 call JS_IsConstructing(cx). It has the value JS_TRUE when your function was invoked as a constructor (i.e. with the new keyword).
  • GPSEE_DECL_SLOW_CONSTRUCTOR(MyConstructor): This macro call replaces the function prototype, and must be followed immediately by your constructor's function body. This also emits the fast-to-slow shim function.
These macros always emit static functions. If you need to declare a constructor which is not static (not recommended), you can use the GPSEE_SLOW_CONSTRUCTOR(MyConstructor) macro to take its address and make it visible under another name.
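For readers curious how a "declare" macro can both replace a prototype and emit a shim, here is a toy version of the general technique. It is a sketch under assumptions, not the GPSEE implementation: the TOY_ macros, toy_cx, toy_val and the simplified signatures (no object argument, vp[0] as the return slot, arguments starting at vp[2]) are all invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef long toy_val;                        /* stand-in for jsval */
typedef struct { int constructing; } toy_cx; /* stand-in for JSContext */

/* New-style ("fast") native: vp[0] is the return slot, vp[2..] are the arguments. */
typedef int (*toy_fast_native)(toy_cx *cx, unsigned argc, toy_val *vp);

/* Name of the shim that the declaration macro emits for a constructor. */
#define TOY_SLOW_CONSTRUCTOR(name) name##_shim

/* Emits: a prototype for the old-style function, a static shim that unpacks
 * vp into argv/rval, and finally the old-style function header, so the
 * macro call must be followed immediately by the function body. */
#define TOY_DECL_SLOW_CONSTRUCTOR(name)                                        \
  static int name(toy_cx *cx, unsigned argc, toy_val *argv, toy_val *rval);    \
  static int name##_shim(toy_cx *cx, unsigned argc, toy_val *vp)               \
  {                                                                            \
    assert(vp != NULL);                                                        \
    return name(cx, argc, vp + 2, &vp[0]);                                     \
  }                                                                            \
  static int name(toy_cx *cx, unsigned argc, toy_val *argv, toy_val *rval)

/* An old-style body, unchanged apart from its first line. */
TOY_DECL_SLOW_CONSTRUCTOR(Sum)
{
  toy_val total = 0;
  unsigned i;

  (void)cx;
  for (i = 0; i < argc; i++)
    total += argv[i];
  *rval = total;
  return 1;
}
```

Code that registers the constructor then refers to TOY_SLOW_CONSTRUCTOR(Sum) rather than Sum, which is the same division of labour the three GPSEE macros describe. Both emitted functions are static, matching the note above.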

You can use these macros with any version of JSAPI; they will do the right thing. (Note: JSAPI backwards compatibility via GPSEE is not well-tested at this point in time. If you are supporting an embedding which can build on JS 1.8.5 and an older version, please let us know!)

Here is what we had to do to "port" the Binary constructor:

-static JSBool Binary(JSContext *cx, JSObject *obj, uintN argc, jsval *argv, jsval *rval)
+GPSEE_DECL_SLOW_CONSTRUCTOR(Binary)
 {
   /* Binary() called as function. */  
-  if (JS_IsConstructing(cx) != JS_TRUE)
+  if (GPSEE_IS_SLOW_CONSTRUCTING(cx) != JS_TRUE)
     return gpsee_throw(cx, CLASS_ID ".constructor.notFunction: Cannot call constructor as a function!");
@@ later in the same file
   JSObject *proto =
       JS_InitClass(cx,             /* JS context from which to derive runtime information */
            obj,                    /* Object to use for initializing class (constructor arg?) */
            NULL,                   /* parent_proto - Prototype object for the class */
            &binary_class,          /* clasp - Class struct to init. Defs class for use by other API funs */
-           Binary,                 /* constructor function - Scope matches obj */
+           GPSEE_SLOW_CONSTRUCTOR(Binary), /* constructor function - Scope matches obj */



Piece o' cake, eh?

Monday, December 20, 2010

Wrapped modules with GPSEE 0.2 / GSR 1.0

There has been some discussion on the CommonJS mailing list lately about experimenting with wrapped modules in a move toward a new CommonJS module standard which works better with browser-side environments.

One of these suggestions has been to wrap modules in a module.declare statement, which looks something like this:
module.declare([/* optional dependencies */],
  function (require, exports, module) {
    /*
     * Regular Modules/1.1 module goes here
     */
  })
It is possible to retrofit this style of module declaration -- without breaking Modules 1.0 -- into a system based on GSR 1.0 and GPSEE 0.2. (GSR is the script runner which ships with GPSEE)

To add this feature, simply link your gsr binary to another file name -- say, /usr/bin/gsr2 -- and create a preload file (e.g. /usr/bin/.gsr2_preload) containing the following code:
module.constructor.prototype.declare = function(deps, factory) {
  if (typeof deps === 'function')
    factory = deps;

  const ffi = require("gffi");
  const JS_GetGlobalObject = new ffi.CFunction(ffi.pointer, "JS_GetGlobalObject", ffi.pointer);
  JS_GetGlobalObject.jsapiCall = true;
  const gptr = JS_GetGlobalObject(require("vm").cx);
  const g = require("vm").objectPtrValue(gptr);

  factory(g.require, g.exports, this);
};

With this simple patch, any scripts invoked with #! /usr/bin/gsr2 will be able to use modules written in the current CommonJS Modules/1.1 idiom, or with the newly proposed module.declare() syntax.

Of course, when CommonJS blesses a new module format, we will add support for the new format, without breaking your current modules.

Wednesday, July 14, 2010

GPSEE, Valgrind and MacPorts

Running recent versions of TraceMonkey under Valgrind -- in particular, versions since the addition of the conservative stack-scanning garbage collector -- requires special code to suppress spurious noise from Valgrind.

This special code includes valgrind/valgrind.h, which in turn requires that the MacPorts include directory, /opt/local/include, be in the path searched by the C preprocessor.

Unfortunately, simply tweaking the global CFLAGS or CPPFLAGS on a project-wide basis doesn't work well with GPSEE, as we use the system iconv library. Including the header for GNU iconv, via MacPorts, will cause GPSEE to crash whenever we load a module which uses iconv, such as Binary/B or GFFI.

So, we need to include the MacPorts include directory when building TraceMonkey, but not GPSEE. We also need to pass the --enable-valgrind option to TraceMonkey's autoconf script. Here's how you do it:

# cd gpsee
# ./configure --prefix=/opt/local/gpsee --with-jsapi-build=DEBUG --with-jsapi-options=--enable-valgrind --with-jsapi-cppflags=-I/opt/local/include --with-build=DEBUG

# make build
# sudo make install


Presumably something very similar will also need to be done for users running Leopard+Homebrew or Solaris. Linux users are in the clear because they should not have iconv library/header conflicts.