Thursday, May 29, 2014

The JavaScript technology stack

Context

I've been developing with the JavaScript language since 1999, when I was involved in the development of an administrative console for clusters of calendar servers. Coming from a background in Windows development with MFC, I quickly implemented the MVC pattern in the browser with a set of frames: one frame taking all the visible space to render the interface, another frame kept hidden (with height="0") to exchange data with my service (implemented in C++ as a FastCGI module), and the frameset to save the state.
Later, while working for IBM Rational, I had the chance to discover the Dojo Toolkit in its early days (v0.4). After that experience, even as I extended my expertise to mobile (mainly native Android and cross-platform with Unity), I continued to contribute to Web projects. However, I never had the chance to start a project from scratch, always dealing with legacy code and constrained schedules... until I joined Ubisoft last February!

The presentation layer and user input handling

Most of "Web developers" are more hackers than developers: they use PHP on the server to assemble HTML components, they use JavaScript to control the application behaviour in the browser, they control the interface from any place, and they deploy often without the safety net of automated tests... Their flexibility is a real asset for editorial teams but a nightmare for the quality assurance teams!
Here are my recommendations for the development of Web applications:
  • Use the server as little as possible when preparing the views: just detect the browser user agent and the user's preferred language in order to deliver the appropriate HTML template;
  • Define all the presentation elements with HTML, CSS, and images. Don't use JavaScript to create HTML fragments on the fly, unless it is to clone parts of the HTML template;
  • Use JavaScript to inject data and behaviour: switching from one section to another, submitting a request to the server, displaying data and notifications, etc. (a sketch follows this list).
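As a minimal illustration of that last point, the behaviour can be attached to identifiers defined in the HTML template. This is a hedged sketch with Dojo modules; the 'save-button' identifier and the handler body are hypothetical:

require(['dojo/dom', 'dojo/on'], function (dom, on) {
    // 'save-button' is an identifier inserted in the designers' HTML template
    on(dom.byId('save-button'), 'click', function () {
        // Submit a request to the server, then display the data or a notification
    });
});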
The main advantage of keeping the presentation coded in HTML, CSS, and images is that it can be delegated to designers. Developers can take the materials designers have produced with Dreamweaver, for example, and insert the identifiers required to connect the JavaScript handlers. Or they provide the initial skeleton instrumented with these identifiers, and designers iterate over it freely. There are many tools to optimize the presentation layer:
  • HTML minifiers, which can go as far as dropping optional tags;
  • CSS minifiers and tools to detect unused rules, like uncss;
  • Image optimizers and sprite generators.
Designers can then focus on defining the best interface, regardless of optimization. The only limitation: don't let them introduce or depend on any JavaScript library!
As for choosing a JavaScript library, I recommend considering the following points:
  • A mechanism to define modules and dependencies. The best specification is AMD (for Asynchronous Module Definition), implemented by RequireJS, Dojo, and jQuery, among others. Note that AngularJS has its own dependency injection mechanism.
  • Until ES6 (ECMAScript 6) makes it standard, use a library that provides an implementation of Promise, especially useful for its all() method;
  • A test framework that reports on code coverage. IMHO, without code coverage statistics, it's difficult to judge the quality of the tests, and thus to determine our ability to detect regressions early...
  • Even if you cannot rely on a dispatcher that detects user agents and preferred languages, use the has.js library or an equivalent to expose methods for test purposes. Coupled with a smart build system, the exposed methods will be hidden in production (see the sketch after this list).
  • Just minifying the JavaScript code is not sufficient. It's important to have dead code removed (especially the code exposed just for test purposes). The Google Closure Compiler should be part of your tool set.
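To illustrate the last two points, here is a minimal sketch of an AMD module guarding its test-only exports with a has() flag; the module content and the 'test' flag name are hypothetical, not taken from a real project:

define(['dojo/has'], function (has) {
    // Private logic, normally unreachable from outside the module
    function _computeDiscount(cart) {
        return cart.total > 100 ? 0.1 : 0;
    }

    var module = {
        checkout: function (cart) {
            return cart.total * (1 - _computeDiscount(cart));
        }
    };

    if (has('test')) {
        // Exposed only when the test configuration defines the 'test' flag;
        // the build system can then drop this block from the production code
        module._computeDiscount = _computeDiscount;
    }

    return module;
});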

The data access layer

In the last 10 years, my back-end services were always implemented in Java, sometimes with my own REST-compliant library, sometimes with other libraries like Spring, RESTEasy, or Guice. Java is an easy language to develop with and, with all the available tooling, ramping up new developers is not difficult. Not to mention services like Google App Engine, which host low-profile applications for free.
On the other hand, Java is really verbose and not designed for asynchronicity. It also lacks support for closures. And, as with many programming languages, nothing prevents you from using patterns to clarify the code behaviour.
At Ubisoft, I've been given the opportunity to host the back-end services on Node.js. The major factor in favor of Node.js is its WebSocket support (we haven't yet decided between ws and engine.io). The second factor is related to the nature of the application: 99% of the transactions between the clients and the server are short-lived. Operations that require long computations are handled by services based on Redis and Hadoop. And finally, Node.js scales well.
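To give an idea of what the WebSocket support looks like, here is a minimal echo endpoint with the ws package; this is only a sketch, where the port and the echo behaviour are illustrative and not the application's protocol:

var WebSocketServer = require('ws').Server,
    wss = new WebSocketServer({ port: 8080 });

wss.on('connection', function (socket) {
    socket.on('message', function (message) {
        // Most transactions are short-lived: process the payload and answer immediately
        socket.send(message);
    });
});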
When I joined the project, the team had a small working environment built a-la Node.js: no clear definition of dependencies, a lot of nested callbacks to handle asynchronous behaviours, all presentation built with Jade, no commonly adopted patterns to organize the code logic, and no unit tests (obviously, as nested callbacks are nearly impossible to test!). I then rebooted the project with a better approach for the separation of concerns:
  • AMD as the format to define modules, with the Dojo loader to bootstrap the application;
  • Express to handle the RESTful entry points;
  • A layered structure to process the requests:
    • Resource: one class per entity, gathering data from the input streams and forwarding it to the service layer;
    • Service: one class per entity, possibly communicating with other services when data should be aggregated for many entities;
    • DAO: one class per entity, controlling data from the file system, from MongoDB, from MySQL or from another service over HTTP.
  • A set of classes modelling each entity; I imposed this layer to serve two needs: 1) allowing restrictions on entities (attributes can be declared mandatory or read-only, or required to match a regular expression) and 2) supporting non-destructive partial updates (a sketch follows this list).
  • Unit tests to cover 100% of the implemented logic.
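To make the model layer more concrete, here is a hedged sketch of what such an entity class could look like; the Player entity and its schema are hypothetical, not the project's code:

define([], function () {
    function Player(source) {
        this.merge(source || {});
    }

    // 1) Restrictions declared per attribute
    Player.prototype.schema = {
        id:    { readOnly: true },
        name:  { mandatory: true },
        email: { pattern: /^[^@\s]+@[^@\s]+$/ }
    };

    // 2) Non-destructive partial update: only the attributes present in
    // `source` are touched, and the declared restrictions are enforced
    Player.prototype.merge = function (source) {
        for (var key in this.schema) {
            if (!source.hasOwnProperty(key)) { continue; }
            var rule = this.schema[key];
            if (rule.readOnly && this.hasOwnProperty(key)) {
                throw new Error(key + ' is read-only');
            }
            if (rule.pattern && !rule.pattern.test(source[key])) {
                throw new Error(key + ' does not match ' + rule.pattern);
            }
            this[key] = source[key];
        }
        return this;
    };

    // Check the mandatory attributes, before an INSERT for example
    Player.prototype.validate = function () {
        for (var key in this.schema) {
            if (this.schema[key].mandatory && !this.hasOwnProperty(key)) {
                throw new Error(key + ' is mandatory');
            }
        }
        return this;
    };

    return Player;
});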
At this stage, the most complex classes are the base classes for the MongoDB and MySQL DAOs (I judge their complexity by their tests, which require 3 times more code). But with the help of Promises, the code is elegant and compact ;)
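// Excerpt from the MySqlDao base class; it assumes the AMD imports of
// all() from 'dojo/promise/all' and Deferred from 'dojo/Deferred'.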
/**
 * Select the identified resources, or all resources if neither filter nor range is specified
 *
 * @param {Object} filters bag of key/value pairs used to filter the resources to be returned
 * @param {Range} range limit on the number of results returned
 * @param {Object} order bag of key/value pairs used to order the results
 * @return {Promise} a Promise with the list of resources as the parameter of the onSuccess method
 *
 * @throws error with code 204-NO CONTENT if the selection is empty, as the parameter of the onFailure method of the promise
 */
select: function (filters, range, order) {
    return all([this._select(filters, range, order), this._count(filters)]).then(function (responses) {
        range = range || {}; // the range may be left undefined to select all resources
        range.total = responses[1];
        responses[0].range = range;
        return responses[0];
    });
},

// Helper forwarding the SELECT request to the MySql connection
_select: function (filters, range, order) {
    var query = this.getSelectQuery(filters, range, order),
        ModelClass = this.ModelClass;

    return this._getConnection().then(function (connection) {
        var dfd = new Deferred();

        connection.query(query, function (err, rows) {
            connection.release();
            if (err) {
            _forwardError(dfd, 500, 'Query to DB failed: ' + query, err);
                return;
            }

            var idx, limit = rows.length,
                entities = [];
            if (limit === 0) {
                _forwardError(dfd, 204, 'No entity matches the given criteria', 'Query with no result: ' + query);
                return;
            }
            for (idx = 0; idx < limit; idx += 1) {
                entities.push(new ModelClass(rows[idx]));
            }
            dfd.resolve(entities);
        });

        return dfd.promise;
    });
},

// Helper forwarding the COUNT request to the MySql connection
_count: function (filters) {
    var query = this.getCountQuery(filters);

    return this._getConnection().then(function (connection) {
        var dfd = new Deferred();

        connection.query(query, function (err, rows) {
            connection.release();
            if (err) {
            _forwardError(dfd, 500, 'Query to DB failed: ' + query, err);
                return;
            }

            dfd.resolve(rows[0].total);
        });

        return dfd.promise;
    });
},
Code sample: the select() method of the MySqlDao, and its two direct helpers.
A few comments on the code illustrated above:
  • Each helper wraps the callback-based asynchronicity of the MySQL plugin into a Promise (via the Deferred class);
  • The main entry point relies on the Promise all() method to convey the request result only when the responses from the two helpers are ready;
  • The method _getConnection() returns a Promise which is resolved with a connection from the MySQL pool (i.e. from mysql.createPool().getConnection());
  • The method _forwardError() is a simple helper logging the error and rejecting the Promise; at the highest level, Express uses the error code as the status for the HTTP response (a possible shape for both helpers is sketched after this list);
  • The method _select() converts each result into an instance of the specified model, transparently providing support for field validation and partial updates;
  • With the given ModelClass, this MySqlDao class acts like Java and C# generics.
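For completeness, here is a plausible shape for those two helpers; it's a hedged sketch consistent with the node-mysql pool API and the Deferred class, not the project's actual code:

// Resolves the promise with a connection taken from the MySQL pool,
// where this._pool is assumed to come from mysql.createPool()
_getConnection: function () {
    var dfd = new Deferred();
    this._pool.getConnection(function (err, connection) {
        if (err) {
            _forwardError(dfd, 500, 'Cannot get a connection from the pool', err);
            return;
        }
        dfd.resolve(connection);
    });
    return dfd.promise;
},

// Logs the failure and rejects the promise with an error carrying the code
// that Express will later use as the HTTP status
function _forwardError(dfd, code, message, origin) {
    console.error(message, origin);
    var error = new Error(message);
    error.code = code;
    dfd.reject(error);
}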

The persistence layer

I'm not a DBA and will never be one. Regarding the piece of code above, one of my colleagues proposed updating the SELECT query to also return the record count: the count would be repeated with each row, but it would save a round trip. In the end, I decided to keep the code as-is because it's consistent with the code of the MongoDB DAO. We'll measure the impact of that improvement later, when the entire application is ready.
Even if I'm not an expert, I have always had to deal with databases: Oracle 10g, IBM DB2, MySQL, and PostgreSQL for the relational ones, and Google datastore and MongoDB for the NoSQL ones. My current project relies on MongoDB, where the players' session information is stored, and on MySQL, which stores static information. I like working with MongoDB because of its ability to work with documents instead of rows of normalized values. It is very flexible and well aligned with the entities used client-side. And MongoDB is highly scalable.
Once the DAOs have been correctly defined, implemented, and tested, dealing with any database at the service level is transparent. Developers can focus on the business logic while DBAs optimize the database settings and deployments.

The continuous integration

Java is an easy language to deal with. First, there are a lot of very good IDEs, like Eclipse and IntelliJ. Then, there are plenty of test tools to help verify that the code behaves as expected; my favorites are JUnit, Mockito, and Cobertura. And finally, Java applications can be remotely debugged, profiled, and even obfuscated.
In the past, I controlled the quality of my JavaScript code with JSUnit and JSCoverage. Now I recommend Intern to run unit tests efficiently with Node.js and functional tests with Selenium. I really like Intern because it's AMD compliant, it produces coverage reports, and above all it lets me organize my tests the way I want! A run of around 1,000 unit tests under Node.js takes around 5 seconds. The functional test suite, with its 20 green-path scenarios, takes 20 seconds to run in Firefox and Chrome in parallel.
Here is a small but nonetheless important point about Intern's flexibility:
  • I want my tests to work on modules totally isolated from one another. To isolate them, I inject mocks in place of the dependencies of the module under test.
  • Intern's suggested way requires:
    • Removing the module to be tested from the AMD cache, with require.undef([mid]);
    • Replacing the references to the dependent classes with mock ones, with require({ map: { '*': { [normal-mid]: [mock-mid] } } });
    • Reloading the module to be tested, which will now use the mock classes instead of the original ones;
    • Calling the module and verifying its behaviour (a test sketch follows this list).
  • Currently, I prefer instrumenting the modules with the help of dojo/has, to be able to access private methods and replace dependent classes on the fly with mock ones. Each test injects the required mocks, and the afterEach method restores all the original dependent classes.
  • My Intern configuration file contains the definition used by dojo/has to expose test-friendly methods, while my index.html and my app.profile.js (used by the Dojo build system) leave it undefined. So these methods are not accessible from the browser, and not even defined in the built code.
  • With the help of Mockery, I can test everything, up to the classes controlling the access to MySQL, as illustrated above.
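For reference, here is a hedged sketch of the remapping approach suggested by Intern, as described in the list above; the module ids ('app/PlayerService', 'app/PlayerDao', 'tests/mocks/PlayerDaoMock') are hypothetical:

define([
    'intern!object',
    'intern/chai!assert',
    'dojo/Deferred',
    'require'
], function (registerSuite, assert, Deferred, require) {
    var PlayerService;

    registerSuite({
        name: 'PlayerService with a mocked DAO',

        beforeEach: function () {
            // 1) Drop the cached module and 2) remap its DAO dependency to a mock
            require.undef('app/PlayerService');
            require({ map: { '*': { 'app/PlayerDao': 'tests/mocks/PlayerDaoMock' } } });

            // 3) Reload the module, which now picks up the mock
            var dfd = new Deferred();
            require(['app/PlayerService'], function (Service) {
                PlayerService = Service;
                dfd.resolve();
            });
            return dfd.promise;
        },

        'select() forwards the results prepared by the DAO': function () {
            return PlayerService.select({}).then(function (players) {
                assert.isArray(players, 'the mocked DAO returns a list');
            });
        }
    });
});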
In the Java world, Maven has replaced Ant as the build tool of choice. In the JavaScript world, developers have to rely on many tools:
  • Node.js, npm, and bower to manage the libraries required server-side (npm) and client-side (bower);
  • Grunt to run administrative tasks like building CSS files from Stylus ones, running the tests, compiling the code, deploying the built code, etc. (a Gruntfile sketch follows this list);
  • Intern to produce test and coverage reports for CI tools like Jenkins and TeamCity.
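As an illustration of the Grunt and Intern integration, here is a minimal Gruntfile sketch; the options follow my understanding of the Grunt task shipped with Intern, and the 'tests/intern' module id is hypothetical:

module.exports = function (grunt) {
    grunt.initConfig({
        intern: {
            node: {
                options: {
                    config: 'tests/intern',  // the Intern configuration module
                    runType: 'client'        // unit tests under Node.js
                }
            }
        }
    });

    grunt.loadNpmTasks('intern');  // the Grunt task shipped with the intern package
    grunt.registerTask('test', ['intern']);
};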

The development environment

My editor of choice is Brackets, by Adobe. It's a very powerful tool, still actively developed and continuously getting better. It has a lot of extensions, like an interactive linter and the Theseus debugger. And debugging and fixing extensions to fit your needs is very easy.
MongoDB consumes as much memory as possible. To avoid cluttering your development environment while keeping your database at hand, I suggest you use Vagrant to configure a virtual machine where MongoDB and your Node.js server run in isolation. Coupling a Vagrantfile with a provisioning script allows all your collaborators to benefit from the same configuration.
When it's time to push to production environments, check if there isn't a Grunt extension that can help you send the compiled code via FTP or SSH to a remote machine, or to Amazon Web Services or Google Compute Engine.
I hope this helps, Dom
