
Monday, October 5, 2009

Internationalization and my two-tiers-utils library

This is a follow-up to Internationalization of GAE applications, which is itself part of the series Web Application on Resources in the Cloud.

In my initial article, I explained some hurdles of globalizing applications, especially those implemented with several programming languages. In this article, I describe a few use cases and show how my open-source library two-tiers-utils can ease the implementation. Here are the covered topics:
  1. Get the user's preferred locale
  2. Display messages in different locales
  3. Handle localized messages with different programming languages
  4. Generate the localized bundles per programming language
  5. Bonus

Get the user's preferred locale

For this use-case, let's only consider the Java programming language. Another assumption is the availability of the localized resources in the corresponding Java format (i.e. accessible via a PropertyResourceBundle instance).

In a Web application, the user's preferred locale can be retrieved from:
  • The HTTP headers:
    locale = ((HttpServletRequest) request).getLocale();
  • The HTTP session (if saved there previously):
    HttpSession session = ((HttpServletRequest) request).getSession(false);
    if (session != null) {
      locale = new Locale((String) session.getAttribute(SESSION_USER_LOCALE_ID));
    }
  • The record containing the user's information:
    locale = ((UserDTO) user).getPreferredLocale();
To ease this retrieval, the two-tiers-utils library provides the domderrien.i18n.LocaleController class.
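As an illustration of the fallback chain such a helper can implement, here is a Python sketch (the function name, parameters, and their priority order are my assumptions, not the actual LocaleController API):

```python
# Hypothetical sketch of a detectLocale-style fallback chain:
# URL argument, then HTTP session, then the browser's Accept-Language
# list, then a hard-coded default.
DEFAULT_LOCALE = "en"

def detect_locale(query_param=None, session_locale=None, header_locales=None):
    """Return the first locale found, from the most to the least specific source."""
    if query_param:
        return query_param
    if session_locale:
        return session_locale
    if header_locales:
        return header_locales[0]  # browser's preferred language comes first
    return DEFAULT_LOCALE
```

The same priority order applies whatever the tier: an explicit user choice always beats what the browser advertises.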

Excerpt of public methods offered within domderrien.i18n.LocaleController

This class can be used in two situations:
  1. In a login form, for example, where the desired locale can only be guessed from the browser's preferred-language list or from an argument in the URL.
  2. In pages accessible to identified users thanks to the HTTP session.
Usage example of the domderrien.i18n.LocaleController.detectLocale()
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<%@page
    language="java"
    contentType="text/html; charset=UTF-8"
    pageEncoding="UTF-8"
    import="java.util.Locale,domderrien.i18n.LabelExtractor,domderrien.i18n.LocaleController"
%><%
    // Locale detection
    Locale locale = LocaleController.detectLocale(request);
%><html>
<head>
    <title><%= LabelExtractor.get("dd2tu_applicationName", locale) %></title>
    ...
</head>
<body>
    ...
    <img
        class="anchorLogo"
        src="images/iconHelp.png"
        width="16"
        height="16"
        title="<%= LabelExtractor.get("dd2tu_topCommandBox_helpIconLabel", locale) %>"
    />
    ...
</body>
</html>

The same message in different locales

The previous example also introduces a second class: domderrien.i18n.LabelExtractor. Given an identifier, an optional array of Object references, and a locale, the static get method loads the corresponding string from the localized resource bundle.

Excerpt of public methods offered within domderrien.i18n.LabelExtractor
A series of localized entries like en:“Welcome {0} {1}”, fr:“Bonjour {0} {1}”, and ja:“お早う {0}{1}” can then be resolved with a single call: LabelExtractor.get("welcome_message", new Object[] { user.getFirstName(), user.getLastName() }, user.getLocale());.
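The substitution behavior mirrors Java's MessageFormat; a minimal Python equivalent (a hypothetical helper for illustration, not part of the library):

```python
# Look up a numbered-placeholder pattern and substitute positional arguments,
# falling back on the key itself when the bundle has no entry (a common,
# debug-friendly convention).
def get_label(bundle, key, args=None):
    pattern = bundle.get(key, key)
    if args is None:
        return pattern
    return pattern.format(*args)  # str.format handles {0}, {1}, ...

bundle_fr = {"welcome_message": "Bonjour {0} {1}"}
```

Falling back on the key keeps a half-translated build usable: the page still renders, and missing entries are immediately visible.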

The same message used from different programming languages

Java is a pretty neat language with a large set of editors and code inspectors, but it is not the only language used for Web applications. The two-tiers-utils library provides nice Java features, but delivering the same interfaces for JavaScript and Python makes it far more valuable!

Code of the domderrien.i18n.LabelExtractor.get() method for the JavaScript language.
(function() { // To limit the scope of the private variables

    /**
     * @author dom.derrien
     * @maintainer dom.derrien
     */
    var module = dojo.provide("domderrien.i18n.LabelExtractor");

    var _dictionary = null;

    module.init = function(/*String*/ namespace, /*String*/ filename, /*String*/ locale) {
        // Dojo uses dash-separated, lower-case locale names (e.g. en-us, not en_US)
        locale = (locale || dojo.locale).replace('_','-').toLowerCase();

        // Load the bundle
        try {
            // Notes:
            // - Cannot use the notation "dojo.requirelocalization" because dojo parser
            //   will try to load the bundle when this file is interpreted, instead of
            //   waiting for a call with meaningful "namespace" and "filename" values
            dojo["requireLocalization"](namespace, filename, locale); // Blocking call getting the file per XHR or <iframe/>

            _dictionary = dojo.i18n.getLocalization(namespace, filename, locale);
        }
        catch(ex) {
            alert("Deployment issue:" +
                    "\nCannot get localized bundle " + namespace + "." + filename + " for the locale " + locale +
                    "\nMessage: " + ex
                );
        }

        return module;
    };

    module.get = function(/*String*/key, /*Array*/args) {
        if (_dictionary == null) {
            return key;
        }
        var message = _dictionary[key] || key;
        if (args != null) {
            message = dojo.string.substituteParams(message, args);
        }
        return message;
    };

})(); // End of the function limiting the scope of the private variables

The following piece of code illustrates how the JavaScript domderrien.i18n.LabelExtractor module should be initialized (the value of the locale variable can come from dojo.locale or from a value injected server-side into a JSP page) and how it can be invoked to get a localized label.

Usage example of the domderrien.i18n.LabelExtractor.get()
(function() { // To limit the scope of the private variables

    var module = dojo.provide("domderrien.blog.Test");

    dojo.require("domderrien.i18n.LabelExtractor");

    var _labelExtractor;

    module.init = function(/*String*/ locale) {
        // Get the localized resource bundle
        _labelExtractor = domderrien.i18n.LabelExtractor.init(
                "domderrien.blog",
                "TestBundle",
                locale // The library is going to fallback on dojo.locale if this parameter is null
            );

        ...
    };

    module._postData = function(/*String*/ url, /*Object*/ jsonParams) {
        var transaction = dojo.xhrPost({
            content : jsonParams,
            handleAs : "json",
            load : function(/*object*/ response, /*Object*/ioargs) {
                if (response == null) {
                    // Message prepared client-side
                    _reportError(_labelExtractor.get("dd2tu_xhr_unexpectedError"), [ioargs.xhr.status]);
                }
                else if (!response.success) {
                    // Message prepared server-side
                    _reportError(_labelExtractor.get(response.messageKey), response.msgParams);
                }
                ...
            },
            error : function(/*Error*/ error, /*Object*/ ioargs) {
                    // Message prepared client-side
                _reportError(error.message, [ioargs.xhr.status]);
            },
            url : url
        });
    };

    var _reportError = function(/*String*/ message, /*Number ?*/xhrStatus) {
        var console = dijit.byId("errorConsole");
        ...
    };

    ...

})(); // End of the function limiting the scope of the private variables

The following series of code excerpts show the pieces involved in getting the localized resources with the Python programming language.

LabelExtractor methods definitions from domderrien/i18n/LabelExtractor.py
# -*- coding: utf-8 -*-

import en
import fr
 
def init(locale):
    """Initialize the global dictionary for the specified locale"""
    global dict
    if locale == "fr":
        dict = fr._getDictionary()
    else: # "en" is the default language
        dict = en._getDictionary()
    return dict

Sample of a localized dictionary from domderrien/i18n/en.py
# -*- coding: utf-8 -*-
 
dict_en = {}
 
def _getDictionary():
    global dict_en
    if (len(dict_en) == 0):
        _fetchDictionary(dict_en)
    return dict_en
 
def _fetchDictionary(dict):
    dict["_language"] = "English"
    dict["dd2tu_applicationName"] = "Test Application"
    dict["dd2tu_welcomeMsg"] = "Welcome {0}."
    ...

Definitions of filters used by the Django templates, from domderrien/i18n/filters.py
from google.appengine.ext import webapp
 
def get(dict, key):
    return dict[key]
 
def replace0(pattern, value0):
    return pattern.replace("{0}", str(value0))
 
def replace1(pattern, value1):
    return pattern.replace("{1}", str(value1))
 
...
 
# http://javawonders.blogspot.com/2009/01/google-app-engine-templates-and-custom.html
# http://daily.profeth.de/2008/04/using-custom-django-template-helpers.html
 
register = webapp.template.create_template_register()
register.filter(get)
register.filter(replace0)
register.filter(replace1)
...
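Outside the template engine, these filters compose like plain functions; a standalone sketch (the webapp registration is omitted, and the filter bodies are copied from above):

```python
# The same filters, runnable on their own.
def get(dictionary, key):
    return dictionary[key]

def replace0(pattern, value0):
    return pattern.replace("{0}", str(value0))

def replace1(pattern, value1):
    return pattern.replace("{1}", str(value1))

# Equivalent of the template chain {{ dictionary|get:"dd2tu_welcomeMsg"|replace0:loggedUser }}
label = replace0(get({"dd2tu_welcomeMsg": "Welcome {0}."}, "dd2tu_welcomeMsg"), "Dom")
```

Each pipe in the template is just another function call on the previous result, which is why the filters stay this small.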

Django template from domderrien/blog/Test.html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <title>{{ dictionary|get:"dd2tu_applicationName" }}</title>
    ...
</head>
<body>
    ...
    <div class="...">{{ dictionary|get:"dd2tu_welcomeMsg"|replace0:loggedUser }}</div>
    ...
</body>
</html>

Test handler from domderrien/blog/Test.py
import os

from google.appengine.api import users
from google.appengine.ext import webapp
from google.appengine.ext.webapp import template
from google.appengine.ext.webapp.util import run_wsgi_app

from domderrien.i18n import LabelExtractor

def prepareDictionary(request):
    locale = request.get('lang', 'en')
    return LabelExtractor.init(locale)

class MainPage(webapp.RequestHandler):
    def get(self):
        parameters = {}
        parameters['dictionary'] = prepareDictionary(self.request)
        parameters['loggedUser'] = users.get_current_user()
        path = os.path.join(os.path.dirname(__file__), 'domderrien/blog/Test.html')
        self.response.out.write(template.render(path, parameters))

application = webapp.WSGIApplication(
    [('/', MainPage)],
    debug=True
)
 
def main():
    webapp.template.register_template_library('domderrien.i18n.filters')
    run_wsgi_app(application)
 
if __name__ == "__main__":
    main()

Generate the localized bundles per programming language

In my previous post Internationalization of GAE applications, I suggested using a dictionary format that is programming-language agnostic while being well known by translators: TMX, for Translation Memory eXchange.

Snippet of a translation unit definition for a TMX formatted file
<tu tuid="dd2tu_welcomeMessage" datatype="Text">
 <tuv xml:lang="en">
  <seg>Welcome {0}</seg>
 </tuv>
 <note>{0} is going to be replaced by the logged user's display name</note>
 <prop type="x-tier">dojotk</prop>
 <prop type="x-tier">javarb</prop>
 <prop type="x-tier">python</prop>
</tu>
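A build tool can read such a translation unit with nothing but the standard library; a sketch in Python (the real TMXConverter is a Java program, so this only illustrates the idea):

```python
import xml.etree.ElementTree as ET

# The XML below copies the snippet above.
TU = """
<tu tuid="dd2tu_welcomeMessage" datatype="Text">
 <tuv xml:lang="en"><seg>Welcome {0}</seg></tuv>
 <prop type="x-tier">dojotk</prop>
 <prop type="x-tier">javarb</prop>
 <prop type="x-tier">python</prop>
</tu>
"""

def read_translation_unit(xml_text):
    """Return the label key, its English text, and the target bundles."""
    tu = ET.fromstring(xml_text)
    key = tu.get("tuid")
    seg = tu.find("tuv/seg").text
    tiers = [p.text for p in tu.findall("prop[@type='x-tier']")]
    return key, seg, tiers
```

The x-tier properties are all a generator needs to decide which bundles (Dojo, Java, Python) should receive the entry.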

The two-tiers-utils library provides a Java tool, domderrien.build.TMXConverter, that generates the resource bundles for Java/JavaScript/Python. While a simple series of XSL-Transform runs could do the job, the TMXConverter does a bit more by:
  • Comparing the modification dates of the generated files with the TMX one, to regenerate them only when needed;
  • Checking the uniqueness of the label keys;
  • Generating the list of supported languages.
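The first two checks are easy to picture; a hypothetical Python sketch (the names are mine, not the TMXConverter's API):

```python
import os

def check_unique_keys(tuids):
    """Raise if the same tuid appears twice across the TMX entries."""
    seen = set()
    for tuid in tuids:
        if tuid in seen:
            raise ValueError("duplicate label key: " + tuid)
        seen.add(tuid)

def needs_regeneration(tmx_path, generated_path):
    """Regenerate only when the source TMX is newer than the output."""
    if not os.path.exists(generated_path):
        return True
    return os.path.getmtime(tmx_path) > os.path.getmtime(generated_path)
```

Failing the build on a duplicate key is what makes the TMX file trustworthy as the single dictionary for all tiers.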
Invoking the TMXConverter from an ant build file is simple, if a bit verbose:

Ant target definition invoking the TMXConverter
<target name="step-tmx-convert">
    <mkdir dir="${temp.dir}/resources" />
    <mkdir dir="src/WebContent/js/domderrien/i18n/nls" />
    <java classname="domderrien.build.TMXConverter" fork="true" failonerror="true">
        <classpath refid="tmxconverter.classpath" />
        <classpath location="${temp.dir}/resources" />
        <jvmarg value="-Dfile.encoding=UTF-8" />
        <arg value="-tmxFilenameBase" />
        <arg value="${dd2tu.localizedLabelBaseFilename}" />
        <arg value="-sourcePath" />
        <arg value="${basedir}\src\resources" />
        <arg value="-jsDestPath" />
        <arg value="${basedir}\src\WebContent\js\domderrien\i18n\nls" />
        <arg value="-javaDestPath" />
        <arg value="${temp.dir}/resources" />
        <arg value="-languageFilenameBase" />
        <arg value="${dd2tu.languageListFilename}" />
        <arg value="-buildStamp" />
        <arg value="${dd2tu.stageId}" />
    </java>
    <native2ascii
        src="${temp.dir}/resources"
        dest="${temp.dir}/resources"
        encoding="UTF8"
        includes="*.properties-utf8"
        ext=".properties"
    />
    <copy
        file="${temp.dir}/resources/${dd2tu.localizedLabelBaseFilename}.properties"
        tofile="${temp.dir}/resources/${dd2tu.localizedLabelBaseFilename}_en.properties"
    />
    <mkdir dir="src/WebContent/js/domderrien/i18n/nls/en" />
    <copy
        file="src/WebContent/js/domderrien/i18n/nls/${dd2tu.localizedLabelBaseFilename}.js"
        todir="src/WebContent/js/domderrien/i18n/nls/en"
    />
</target>

With the TMX file as the source of truth for the label definitions, moving a label definition from one programming language to another is just a matter of altering the value of a <prop/> tag and running the build again. No more error-prone copy-and-paste of text between different file formats!

Excerpt of the generated Java resource bundle
bundle_language=English
unit_test_sample=N/A
dd2tu_applicationName=Test Application
dd2tu_welcomeMessage=Welcome {0}
...
x_timeStamp=20091001.1001

Excerpt of the generated JavaScript resource bundle
({bundle_language:"English",
unit_test_sample:"N/A",
dd2tu_applicationName:"Test Application",
dd2tu_welcomeMessage:"Welcome ${0}",
...
x_timeStamp:"20091001.1001"})

Excerpt of the generated Python class definition
# -*- coding: utf-8 -*-
 
dict_en = {}
 
def _getDictionary():
    global dict_en
    if (len(dict_en) == 0):
        _fetchDictionary(dict_en)
    return dict_en
 
def _fetchDictionary(dict):
    dict["_language"] = "English"
    dict["dd2tu_applicationName"] = "Test Application"
    dict["dd2tu_welcomeMsg"] = "Welcome {0}."
    ...
    dict["x_timestamp"] = "20091001.1001"

Bonus

Since the TMXConverter is part of the build process and goes over all the localized TMX files, it can also generate the list of supported languages.
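Deriving that list can be as simple as collecting the locale suffixes from the TMX file names; a hypothetical Python sketch (the real TMXConverter is Java, and the base file name here is illustrative):

```python
import re

def supported_languages(tmx_filenames, base="Labels"):
    """Extract locale codes from names like 'Labels_fr.tmx'."""
    pattern = re.compile(re.escape(base) + r"_([a-zA-Z_]+)\.tmx$")
    locales = ["en"]  # the base file holds the default (English) labels
    for name in tmx_filenames:
        match = pattern.search(name)
        if match:
            locales.append(match.group(1))
    return locales
```

The resulting list is what feeds the language selector shown below, so adding a locale is just dropping one more TMX file into the source tree.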

JSP code populating an HTML <select/> box with the list of supported languages
<span class="topCommand topCommandLabel"><%= LabelExtractor.get("rwa_loginLanguageSelectBoxLabel", locale) %></span>
<select
    class="topCommand"
    dojoType="dijit.form.FilteringSelect"
    id="languageSelector"
    onchange="switchLanguage();"
    title="<%= LabelExtractor.get("rwa_loginLanguageSelectBoxLabel", locale) %>"
><%
    ResourceBundle languageList = LocaleController.getLanguageListRB();
    Enumeration<String> keys = languageList.getKeys();
    while(keys.hasMoreElements()) {
        String key = keys.nextElement();%>
        <option<% if (key.equals(localeId)) { %> selected<% } %> value="<%= key %>"><%= languageList.getString(key) %></option><%
    } %>
</select>

The following figures illustrate the corresponding code in action.

Part of a login screen as defined with the default (English) TMX file.


Part of a login screen as defined with the French TMX file.


Conclusion

The two-tiers-utils library is offered under a BSD-like license. Anyone is free to use it for their own purposes, but I'll appreciate any feedback, contribution, or feature request.

See you on github.com ;)
“May the fork be with you.”

A+, Dom

Friday, June 12, 2009

JavaOne Conference


Duke and myself ;)
It has been a long week in San Francisco while I was attending the JavaOne conference. Sessions started at 8:30 AM and finished after 9:30 PM!

Among all the reviews that have been published, read these: Community day on Monday, Conference day 1 on Tuesday, Day 2 on Wednesday, Day 3 on Thursday, and Day 4 on Friday. Sun's website also contains a complete list of conference articles. Pictures of the event are available on Sun's Photo Center website.

I went there with two objectives:
  • See JavaFX in action;
  • Look at all efforts around JavaME and the MSA initiative.
JavaFX technology stack


Eric Klein, Vice President, Java Marketing, Sun Microsystems
I came without preconceived ideas about JavaFX, just with my background as a JavaScript/Java developer and my knowledge of Flex and AIR.

The first disappointment came from looking at the scripting language: man, they invented yet another language :( Sure, the JavaFX scripting language is nicely handled by NetBeans 6.5+ (except for code formatting), but new paradigms and new conventions are, to me, a big barrier to adoption. Can you figure out what the following code is doing?

public class Button extends CustomNode {
  public var up: Node;
  var content: Node = up;
  public override function create(): Node {
    return Group {
      content: bind content
    }
  }
}
While the visual effects to animate texts, pictures, and video are really great, native support for a sound library is sorely missing! For example, I would expect to be able to synchronize KeyFrame objects with sound tracks, but the KeyFrame.time attribute has to be set manually, which is not very flexible when it's time to change the sound track...

The pros are:
  • A coming visual editor to prepare clips;
  • A large library of image and video effects;
  • The multi-platform support, especially for mobile devices.
Platform-dependent packaging is nicely handled: the NetBeans project properties pane offers a choice among {Standard Execution, Web Start Execution, Run in Browser, Run in Mobile Emulator}. As a Web developer, I am just sorry to see that the application window size cannot be expressed as a percentage, nor the auto-resizing handled transparently, which is no better than usual Flex applications.

Last point: I have not seen how to invoke JavaFX handlers from JavaScript ones, and vice versa, when the application is deployed to run in the browser. If you have a source for this information, please drop the link in a comment.

JavaME technology stack


Christopher David, Rikko Sakaguchi and Patrik Olsson
during Sony Ericsson General Session
This was definitely the most interesting domain to me. During one session, it was announced that 60% of the mobile devices shipped in 2008 were Java-enabled; in 2009, this market share should grow to 70%. Regarding the iPhone, Scott McNealy made this joke: Larry Ellison, whose company may soon own Sun, might succeed in opening the iPhone platform to Java because he is a well-known friend of Steve Jobs.

Compared to the Java Standard Edition (J2SE) and the Java Enterprise Edition (J2EE), the Java Mobile Edition (J2ME) has more external contributors. Under the Mobile Service Architecture (MSA) initiative, a lot of mobile device manufacturers and telecommunication operators participate in the Java Community Process (JCP) to deliver Java Specification Requests (JSRs). Note that MSA itself is defined as a JSR: JSR 248. As of today, most recent phones are MSA 1.1 compliant (a mandatory requirement for the telco Orange, for example). Nokia and Sony Ericsson have shipped a lot of MSA-compliant handsets, while LG, Samsung, and Motorola have shipped very few. The standard MSA 2 (JSR 249) is being finalized and contains promising JSRs.
Additional APIs I imagine many developers are looking forward to: JSR 257 Contactless Communication API and JSR 229 Payment API.


MSA evolution and its JSR set
(from the MSA specification documentation)
(click to enlarge)

All major manufacturers have opened or are opening "App Stores" a la Apple, and they are also opening their development platforms. More companies will be able to adapt their software offering to mobile devices; even Sony allows anyone to write applications that run in Blu-ray Disc players. The main difficulty on the developer side is fragmentation: there is no standard API for discovering the features supported by a device! Developers have to rely on each manufacturer's feature list and on exception handling :(

The Blackberry platform is pretty well controlled and should be easy to develop on. Then comes Sony Ericsson, which provides consistent phone classes (i.e. what works for one phone in the JP 8.4 class works for all phones in that class). The delivery of the Sun JavaME SDK 3.0, which contains many third-party emulators (even one for Windows Mobile devices), together with better on-device deployment and on-device debugging capabilities, should motivate more and more developers.

I do not have enough experience with Android (I just got an Android dev phone two weeks ago) to compare it to the JavaME technology stack. Nor do I know much about the Symbian (Nokia devices) or LiMo (Motorola devices) platforms.

Exhibition hall

Besides the booths of the mobile device manufacturers (RIM and Ericsson), I visited:
  • Sun's project Fuji (open ESB) with a Web console using <canvas/> from HTML5, like Yahoo! Pipes.
  • Convergence, the Web client for Sun's communication suite, built on top of the Dojo toolkit ;)
  • INRIA (French national R&D lab) for its static code analysis Nit.
  • Isomorphic Software for its SmartClient Ajax RIA System.
  • eXo Platform (on OW2 booth) for its eXo Portal offering.
  • Liferay, Inc. for its eponymous portal.
Other discoveries


James Gosling and myself ;)
I attended very good presentations, like the opening keynote, which was fun. Among the good presenters, I can mention (ordered alphabetically):
If you have a Sun Developer Network (SDN) account (by the way, it's free), you can view the slides of the technical sessions at: http://developers.sun.com/learning/javaoneonline/.

Special mention

I also want to mention the call to developers by the Mifos people, who were recognized by James Gosling during the Friday morning general session. This organization develops open-source software for microfinance institutions (MFIs) to help them manage loans and borrowers (see demo). A really nice initiative started by the Grameen Bank!

Excerpt from James Gosling Toy Show report:
Microfinancing Through Java EE Technology

Gosling next introduced a group whose great innovation with Java technology was social and not technical. Sam Birney, engineering manager and Mifos alumnus, and Van Mittal-Hankle, senior software engineer at the Grameen Foundation, took the stage to receive Duke's Choice awards for their work using Java EE to serve 60,000 clients worldwide in microfinancing, a highly successful means of helping poor people get small loans and start businesses.

Mifos is open-source technology for microfinance that is spearheaded by the Grameen Foundation.

"Sometimes excellence comes not from technical innovation but in how technology is used," explained Gosling. "This is an example of using Java technology to really improve people's lives."

With an estimated 1.6 billion people left in the world who could benefit from microfinance, the men put out a call for volunteers to contribute to the Mifos project.

What's next?

What are my resolutions? Get a Mac as my development platform (Eclipse works on MacOS, and I can run a Win7 image within VirtualBox), and start developing on Java-enabled phones (at least MSA 1.1 compliant ones).

A+, Dom

Wednesday, June 10, 2009

Internationalization of GAE applications

Costs of badly planned internationalization

In my experience, internationalizing an application is very expensive if it is not planned upfront.

The first source of costs comes from developers who lazily hard-code labels. Extracting them at the end of the development process is always error prone, because developers may have made assumptions about the exact labels. Good regression test suites can help detect such situations, but they cannot avoid the cost of the required fix and its corresponding test runs. With late label extraction, developers without enough context tend to add extra dictionary entries. Near the release milestone, developers in a rush and without the initial context also tend to produce undocumented dictionaries, which limits their re-usability and makes future defect fixing harder.

Non localized Java code
protected String formulatePageIndex(int index, int total) {
    String out = "Page " + index;
    if(0 < total) {
        out += " of " + total;
    }
    return out;
}

In the example above, I have seen the labels Page and of extracted, instead of Page %0 and Page %0 of %1. The quick extraction makes it impossible to invert the two arguments! Think about people-naming conventions: in many countries the last name is displayed before the first name, in others the first name precedes the last name (%last %first compared to %first %last), and in Japan both names are printed without a separator in between (%last%first).

Localized Java code
protected String formulatePageIndex(int index, int total) {
    // Get the resource bundle already initialized for the correct locale
    ResourceBundle rb = getCurrentResourceBundle();
    // Prepare the values to inject
    Object[] values = new Object[] { Integer.valueOf(index), Integer.valueOf(total) };
    // Get the right localized label
    String label = rb.getString("PageIndexLabel_Page");
    if (0 < total) {
        label = rb.getString("PageIndexLabel_PageOf");
    }
    // Return the localized label with the injected values
    return MessageFormat.format(label, values);
}
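The reordering argument is easy to demonstrate with a short Python stand-in for Java's MessageFormat (the second pattern is a fictional locale, for illustration only):

```python
# With positional placeholders, the translator controls the word order;
# a split "Page" / "of" extraction can never reproduce this.
def formulate_page_index(pattern, index, total):
    return pattern.format(index, total)

en_pattern = "Page {0} of {1}"
# A (fictional) locale that needs the total first:
reordered_pattern = "Of {1} pages, this is page {0}"
```

The code never changes; only the localized pattern does, which is the whole point of extracting full sentences instead of fragments.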

The second source of costs comes from missed opportunities to deploy localized builds early in the development process. In Agile environments, we expect runnable builds on a regular basis (at the end of each sprint, every 4 to 6 weeks, for example), and these fool-proof builds can be demoed to customers to get early feedback. If the development organization can work with translators iteratively, there is a good chance of detecting localization defects while their fixing cost is still low. In the past, I have seen product developments hit hard when bi-directional languages (like Arabic and Hebrew) were introduced...

Different aspects of the internationalization

Internationalization (i18n) [1] has two aspects:
  • The translation of the labels;
  • The localization (l10n) of these labels.
The localization takes into account the language and the country, sometimes with variants within a country. For example, the Spanish language is spoken in 19 identified countries. In Mexico (es_MX), the language is slightly different from the one generally spoken in Spain (es or es_ES). In Spain, there are also regional languages, like Catalan (ca_ES). The different locales are normalized with ISO 639 (language) and ISO 3166 (country) codes, as used by the Unicode consortium. Codes are composed of two letters for the language, plus two letters for the country, plus an optional variant. If letters are missing after the language, most programming languages fall back on common defaults.
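That fallback behavior can be sketched in a few lines (a Python illustration; the function name is mine):

```python
# Build the lookup chain for a locale code, from the most specific
# variant down to the empty string (the ultimate default bundle).
def fallback_chain(locale):
    parts = locale.split("_")
    chain = ["_".join(parts[:i]) for i in range(len(parts), 0, -1)]
    chain.append("")  # ultimate default
    return chain
```

A resource loader then returns the first bundle found along the chain, so an es_MX user still gets Spanish labels even when only a generic es bundle has been translated.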

In order to ease application localization, Unicode maintains a Common Locale Data Repository (CLDR) [2]. This repository is used and updated by many companies, like IBM, Sun, Microsoft, Oracle, etc. The repository describes rules on how to:
  • Localize currencies;
  • Localize metrics (distance, speed, temperature, etc.);
  • Localize dates and calendars.
As of today, I think only the timezone definitions are still not centrally managed... This is especially bad because conversions between Universal Time (UTC) dates and local dates are operating-system dependent (Sun Solaris has small differences from Microsoft Windows, for example). Many tools use the Unicode CLDR information. For example, each release of the Dojo toolkit uses it to provide the Calendar widget for 27 locales [3].

Internationalization with different programming languages

Almost all programming languages have ways to facilitate application globalization: Java provides resource bundles (*.properties files), Microsoft .Net has resource files (.resx files), Python has dictionaries, etc. While JavaScript lacks native support for globalization, some libraries offer partial support; to my knowledge, the Dojo toolkit is the first to provide full support.
While an application on the Google App Engine infrastructure can be developed with only one programming language, Python or Java at this time of writing, it is highly likely that developers will use some JavaScript libraries to speed up their development. This is without counting the delivery of a similar front-end as a native application (made with Adobe AIR, Microsoft .Net, C/C++, Groovy, etc.).

In various situations, I have seen developers manually moving label definitions from one environment to another. Sometimes definitions were left over, cluttering the system. In Agile environments, developers should focus on the requirements of the current sprint, leaving some tuning for later sprints. For example, at one point during development, some labels defined in a JavaScript bundle might be moved to a Java bundle because the localization ends up being done server-side, in a JSP file.

My solution is to put all labels in one central localized repository; the dispatch among the different programming languages is done at build time. When I looked for the repository format, my choice was driven by the following criteria:
  • Easily editable;
  • Has a standard format;
  • Usable by static validation processes;
  • Has excellent re-usability factors;
  • Easily extensible to new programming languages.
I chose the TMX format (TMX for Translation Memory eXchange [4]). This is an XML-based format (good for editing, extensibility, and use by static validation tools) which was defined to allow translation-memory export/import between different translation tools, like DejaVu. The XLIFF format would have been another good candidate.

The following sequence diagram illustrates the flow of interactions between the different actors of a development team. It shows that, once the developers have delivered a first TMX file, testers and translators can work independently to push tested and localized builds to the customer. As explained later, if developers tune the TMX entries without updating the labels themselves, translators and testers (at least from the l10n point of view) can stay out of the loop; only steps [1, 7, 8] are replayed.

Simplified view of the overall interaction flow
(Actors: developers, testers, translators, the build process, and end-users)
  1. Developers write the labels in one language into the TMX file. These labels are extracted from the design documents.
  2. The build process generates the application for one locale.
  3. A generic bundle is produced to identify non-extracted labels.
  4. The build process generates the application for two locales.
  5. Testers use the application in one locale (switching to the test language is hidden).
  6. Translators use the initial TMX to produce n localized TMXs.
  7. The build process generates the application for 2 + n locales.
  8. End-users can use the application in 1 + n locales.


The following code snippet shows how an entry into the base TMX file is defined.

Snippet of a translation unit definition for a TMX formatted file
<tu tuid="entry identifier" datatype="Text">
 <tuv xml:lang="locale identifier">
  <seg>localized content</seg>
 </tuv>
 <note>contextual information on the entry and relations with other entries</note>
 <prop type="x-tier">dojotk</prop>
 <prop type="x-tier">javarb</prop>
</tu>

The key features of the TMX format are:
  • The format can be validated with an external XSD (XML Schema Description);
  • One entry (tu: translation unit) can contain many localized contents (tuv: translation unit value);
  • Developers have a normalized placeholder (<note/>) to register contextual information;
  • Extensions are used by the build process to target the type of resource bundle to receive the localized label.
With such an approach, I have seen a drastic reduction of translation mistakes, especially thanks to the <note/> tag. Sometimes, graphical elements contain inter-related labels that cannot be grouped under a generic entity. The following set of elements illustrates the situation. The TMX approach saves translators headaches because they are simply informed about the relation between the four entities.


The conversion from the TMX file to the various resource bundles is done by an XSL transformation. With the continuous integration handled by Ant, the corresponding task generates the output after appending the XSLT file coordinates to a copy of the TMX file and running the transformation with the corresponding <style> Ant task. Depending on the machine performance and on the TMX file size, I found that the process can be time-consuming. If this is your case too, I suggest you write your own little Java program to handle it. You can also use mine ;)

Stylesheet transforming label definitions for the Dojo toolkit
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text" />
 <xsl:template match="/tmx/body">
  {
  <xsl:for-each select="tu">
    <xsl:for-each select="prop">
      <xsl:if test="@type='x-tier' and .='dojotk'">
        "<xsl:value-of select="../@tuid" />":"<xsl:value-of select="../tuv/seg" />",
      </xsl:if>
    </xsl:for-each>
  </xsl:for-each>
  "build":"@rwa.stageId@"}
 </xsl:template>
</xsl:stylesheet>

Use of the stylesheet above to convert Dojo toolkit related definitions from the TMX files by an Ant task [5]
<target name="convert-tmx">
  <style
    basedir="src/resources"
    destdir="src/resources"
    extension=".js"
    includes="*.tmx"
    style="src/resources/tmx2dojotk.xsl"
  />
</target>
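As suggested above, a small standalone program can replace the Ant/XSLT step. Here is a minimal sketch in Python (not the library's actual converter; function and file names are hypothetical) that extracts the `dojotk`-tagged translation units of a TMX document, following the structure shown earlier, into a Dojo-style bundle:

```python
import xml.etree.ElementTree as ET

def tmx_to_dojo_bundle(tmx_source, stage_id="dev"):
    """Extract the 'dojotk'-tagged translation units of a TMX document
    and return the text of a Dojo NLS bundle (a JS object literal)."""
    root = ET.fromstring(tmx_source)
    entries = []
    for tu in root.iter("tu"):
        # Collect the targeted tiers declared with <prop type="x-tier"/>
        tiers = [p.text for p in tu.findall("prop") if p.get("type") == "x-tier"]
        if "dojotk" not in tiers:
            continue
        seg = tu.find("tuv/seg")
        entries.append('"%s":"%s"' % (tu.get("tuid"), seg.text))
    entries.append('"build":"%s"' % stage_id)
    return "{" + ",".join(entries) + "}"

# Tiny TMX sample mirroring the translation unit definition above.
sample = """<tmx><body>
<tu tuid="greeting" datatype="Text">
 <tuv xml:lang="en"><seg>Hello</seg></tuv>
 <prop type="x-tier">dojotk</prop>
 <prop type="x-tier">javarb</prop>
</tu>
<tu tuid="backend.only" datatype="Text">
 <tuv xml:lang="en"><seg>Server error</seg></tuv>
 <prop type="x-tier">javarb</prop>
</tu>
</body></tmx>"""
bundle = tmx_to_dojo_bundle(sample, "prod")
```

Only the `dojotk` entries end up in the bundle; the `javarb`-only unit would be routed to the Java resource bundle by a sibling function.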

A+, Dom
--
Sources:
  1. Introduction to internationalization and localization on Wikipedia.
  2. Unicode Common Locale Data Repository (CLDR).
  3. Dojo toolkit API: dojo.cldr, dojo.i18n, private dijit._Calendar, dojox.widget.Calendar, and dojox.widget.DailyCalendar.
  4. Definition of the Translation Memory eXchange (TMX) format.
  5. Reference of the XSLT/Style task for Ant scripts.

Friday, June 5, 2009

Android Dev Phone 1 Setup

To start investigations on mobile application development for Compuware, I have just acquired a first Android Dev Phone, also known as G1 [1]. Last week at Google I/O, Vic Gundotra delivered an Oprah-style moment by offering the audience a second generation Android phone, also known as G2 or HTC Magic [2, 3]. The G1 is more limited, but it is still the only Android platform legally available.

With a bit of luck, a few of the 18 new phones Google expects [4] will be made available to developers during the year.

I have been able to activate the phone with the Fido pre-paid plan ($10/month) [5]:
  • I inserted the SIM card as illustrated in the documentation, put the battery in place, and connected the phone to the AC adapter.
  • Before signing in with my Google account, I had to create an Access Point Name (APN) entry:
    • Name: Fido
    • APN: internet.fido.ca
    • Username: fido
    • Password: fido
    • MCC: 302
    • MNC: 37
  • In some forums, it is reported that new Fido SIM cards use 370 as the MNC value.
  • A post of Olivier Fisher's blog [6] gives also the coordinates to connect to Rogers network, another GSM provider in Canada.
  • To limit interference, I deleted all pre-loaded APN entries (related to T-Mobile networks).
  • At one point, a popup asked me to enable data transfers. It is important to enable them and to activate Data roaming, regardless of how expensive the costs are for a prepaid plan.
  • Then I specified my Google account credentials and let the phone contact Google servers via the Fido network.
  • Once the activation was successfully reported, I disabled Data roaming, even before the synchronization of the applications {GMail, Contacts, Calendar} ended. The impact on my plan should be limited ;)
  • Then I added the description of my home Wi-Fi network.
  • I found the MAC address of the phone in the menu Settings > About phone > Status, with the entry near the end of the list. I used it to let my Wi-Fi controller accept connections from the phone.
  • At this step, I was able to use my phone for regular calls over the Fido network, and for data transfers over my Wi-Fi network.
The phone comes with Android 1.0 installed. I will blog later about updating the phone OS to the 1.5 version (also known as Cupcake)...

Update 2009/06/16:
Instructions on how to upgrade the Android Dev Phone to Android 1.5 are published on the HTC website: Flashing your Android Dev Phone with a Factory System Image.

Update 2009/07/15
Because of some restrictions to access the Internet from the office, I have decided to pay for a 1GB/month data plan with Fido ($30/month). The activation was done pretty quickly, but no one mentioned the following limitation:
  • On the HTC website, you can see the network specifications for the G1: HSPA/WCDMA (US) on 1700/2100 MHz.
  • Fido/Rogers GSM only operates on 850/1900 MHz, so there is no way to get 3G speed in Canada!
Since I use this phone mainly for development purposes, it is not a blocking issue. It is just sad not to benefit from better bandwidth...

A+, Dom
--
Sources:
  1. Order the Android dev phone 1 from Android Market.
  2. Techcrunch reports the Oprah moment by Vic Gundotra.
  3. G2 review by MobileCrunch
  4. Google expects 18 to 20 new phones on the market by the end of 2009.
  5. Fido pre-paid plan.
  6. Android G1 Phone in Canada on Rogers by Olivier Fisher. Posted comments are especially useful.

Wednesday, April 29, 2009

Meet you at JavaOne Conference

As the title lets you imagine, I will be in San Francisco the week of June 2-5 for the JavaOne 2009 conference.

It will be my first visit to the event, but not to the Moscone Center. While working for Oracle, I attended one Open World event there. With the recent acquisition (not officially closed yet) of Sun by Oracle, I am going to go to California for Oracle again ;)

Another Compuware colleague and I will mainly focus on mobile related sessions, but cloud computing and rich interactive application related ones will have my attention too. With sessions finishing at 10:20 PM, days will surely be long. I expect a fruitful trip.

Are you going to attend too? Tweet me your schedule so we can meet there ;)


JavaOne June 2-5, 2009


A+, Dom

Monday, April 13, 2009

Agile: SCRUM is Hype, but XP is More Important...

(This post is part of the series Web Application on Resources in the Cloud.)

I have been doing “Agile development” for more than 5 years. I am used to saying that an organization is Agile at the level of its weakest element. So I cannot claim having worked on any fully Agile project. However, I have always tried to apply as many Agile principles as possible to my work. This blog entry goes over different practices and identifies the ones that worked best for me and my teams.

Agile

The Agile methodology is not a pure invention; it is a compilation of the best directives gathered from various practices:
  • Individuals and interactions over processes and tools
  • Working software over comprehensive documentation
  • Customer collaboration over contract negotiation
  • Responding to change over following a plan
Agile principles and sub-principles have been defined by a group of technical leaders: Beck from eXtreme Programming (XP), Schwaber (Scrum), etc. The Agile Manifesto [1] is the result of their collaboration.

Scrum

“Scrum is a lightweight Iterative and Incremental (Agile) Development Method that focuses on delivering rapidly the highest priority features based on business value.” It was defined by Ken Schwaber and Jeff Sutherland in the early 1990s.

Scrum promotes high collaboration and transparency. Different backlogs help deliver the best business value at each iteration. Capturing and integrating feedback (from business users, stakeholders, developers, testers, etc.) is a recurrent task. Deliveries occur often and their progression is continuously monitored.




Scrum in Under 10 Minutes by Hamid Shojaee


The points I really like about Scrum:
  • Task reviews done with all actors, in Poker Planning [2] sessions, for example.
  • Product designed, coded AND tested during the Sprint.
  • Sprint deliveries are workable products, with limited/disabled features, but working without blocking issues.
  • Defined roles: Product Owner, Scrum Master (aka Project Manager), and Project Team (composed of cross-functional skills: dev., QA, DBA, Rel. Eng., etc.).
Pig and Chicken are traditional roles in the Agile teams [3].

eXtreme Programming

While Scrum is mainly oriented toward managers (chickens), eXtreme Programming (XP) focuses more on doers (pigs).

XP is more a matter of having the right tools and having real technical exchanges within the Scrum team. For example, XP strongly suggests the adoption of pair programming: two developers per computer, one coding and the other thinking about the code and correcting it on the fly.

Applying pair programming in teams with actors from various backgrounds is sometimes too constraining. Matching pairs is a difficult exercise. However, enforcing peer code reviews brings almost the same benefits without too much frustration. With code reviews, junior developers can see seniors' work in action, and senior developers can learn new programming paradigms. I found it is also good for team cohesion, because team members really know about each other's work.

Among the practices XP encourages, there are:
  • Continuous Integration: every time a developer or a tester delivers new materials, a complete build process starts to compile, package, and test the entire application. Ideally, the build time is short enough to prevent committers from starting any other task, so they can fix the build right away. A safe strategy is to put “fixing the build” as the top priority whenever a problem occurs.
  • Unit testing and code coverage: when developers write unit tests, they provide the first piece of code consuming their own code, and experience shows that it really helps delivering better code. Unit tests without code coverage measurements do not mean much. And not trying to reach 100% coverage leaves too much space for defective code... Using mock objects [4] is essential to test accurately. The Test-Driven Development (TDD) methodology pushes this practice further by writing the tests before the code.
  • Continuous refactoring: during each sprint, developers should focus on the immediate requirements, because they have very little control over future sprints (the product owner can adjust the development plan anytime). It is sometimes difficult to limit them to their immediate tasks because many do not like the prospect of having to rewrite their code later. Investing in tools like IntelliJ IDEA, which provides extended refactoring features, is really worth it because developers can adapt their code efficiently while being secured by the continuously growing regression test suites.
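The "testing in isolation with mocks" discipline mentioned above applies in any language. Here is an illustrative Python sketch (the `OrderService` and its mailer are hypothetical, not from any project discussed here) where the test replaces an expensive collaborator with a mock and verifies the interaction:

```python
import unittest
from unittest import mock

class OrderService:
    """Toy service (hypothetical) used to illustrate testing in isolation."""
    def __init__(self, mailer):
        self.mailer = mailer

    def place_order(self, customer, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        # The real mailer would hit an SMTP server; the test swaps in a mock.
        self.mailer.send(customer, "Order confirmed: %d" % amount)
        return True

class OrderServiceTest(unittest.TestCase):
    def test_confirmation_is_sent(self):
        mailer = mock.Mock()  # stand-in for the real mailer
        self.assertTrue(OrderService(mailer).place_order("dom", 42))
        mailer.send.assert_called_once_with("dom", "Order confirmed: 42")

    def test_rejects_non_positive_amount(self):
        with self.assertRaises(ValueError):
            OrderService(mock.Mock()).place_order("dom", 0)

suite = unittest.TestLoader().loadTestsFromTestCase(OrderServiceTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

A coverage tool then reports which branches (here, both the success and the error path) the suite actually exercises.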

Best of both approaches

In medium to big companies, there are often many layers of management. In such environments, where managers should be facilitators [5], they often add weight to the processes.

About the issues in shipped products, here is an anecdote about IBM:
An internal team reviewed the quality of the released products and came to the initial conclusion that minor and maintenance releases contain more flaws than major releases. The conclusion was made after studying the number of defects reported by customers: this number was sometimes twice as high for intermediate releases. But the team pushed its investigation further and polled many customers. In the end, it appeared that very few customers were installing major releases immediately; most of them would wait for the first maintenance release to deploy in pre-production environments (one stage before the production one).
In this story, IBM used the results of this study to size and train the support teams according to the product release plans. As you can expect, more support people are trained and made available for releases following major ones. It did not help IBM deliver better products up front; it mostly smoothed the experience of customers reporting problems ;)
Development labs are often known for delivering over budget, over the allocated time, and with too many issues. Many times, I have seen the maintenance being operated by specific teams, with no relation to the development ones. In such environments, development teams focus on delivering features and maintenance teams fix issues: each team has its own budget and life goes on!

The combination of the relatively poor delivered software, the accumulation of managers, and the Scrum burn-down chart (the chart that shows how the work progresses on a daily basis [6]) favors Scrum adoption in IT organizations.




Burn down chart sample
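The numbers behind such a chart are simple to produce. Here is a small sketch (the figures are hypothetical, just for illustration) computing the ideal and actual remaining-effort series plotted on a burn-down chart:

```python
def burn_down(total_points, sprint_days, completed_per_day):
    """Return (ideal, actual) remaining-effort series for a sprint.

    total_points      -- estimated work at sprint start
    sprint_days       -- length of the sprint in days
    completed_per_day -- points actually finished on each day
    """
    # Ideal line: linear descent from the total to zero.
    ideal = [total_points - total_points * d / sprint_days
             for d in range(sprint_days + 1)]
    # Actual line: subtract what the team really completed each day.
    actual, remaining = [total_points], total_points
    for done in completed_per_day:
        remaining -= done
        actual.append(remaining)
    return ideal, actual

# Hypothetical 10-day sprint of 40 story points.
ideal, actual = burn_down(40, 10, [3, 5, 2, 6, 4, 4, 5, 3, 4, 4])
```

Plotting `actual` against `ideal` immediately shows whether the team is ahead of or behind the plan, which is exactly the transparency Scrum is after.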


My problem with Scrum as I see it in action is related to its usage by managers: it is a one-way communication channel for them to put pressure on Scrum teams. And because Scrum is task oriented, if the task set is incomplete (or deliberately trimmed), these managers mostly follow the feature completion rate, and sometimes the defect detection and fixing rates.

In my experience, with organizations transitioning from waterfall methodologies to Scrum, the feature check list always takes precedence over the quality check list! If tasks risk breaking the deadline, the test efforts are cut. And because these organizations have very few ways to measure the delivered quality (because they adopted Scrum but refused to invest in XP), results are not really better for customers...

This is why I think it is important to balance the importance of Scrum with that of XP: while managers get tools to monitor the work progress, Scrum teams should publish quality metrics about all delivered pieces of code. With both sides instrumented, it is easier to identify the impact of decisions, and product owners can make informed ones.

A+, Dom
--
Sources:
  1. Principles of the Agile Manifesto, and definition of Agile methodology on Wikipedia.
  2. Description of Poker Planning on Wikipedia.
  3. The Classic Story of the Chicken and Pig on ImplementingScrum.com, and the role definition by Nick Malik.
  4. Mock object definition on Wikipedia, and Chapter 7: Testing in isolation with mock objects from the book JUnit In Action, by Vincent Massol.
  5. My personal view on the facilitator role managers should have: Manager Attitude.
  6. Burn-down chart described as a Scrum artifact on Wikipedia and Burn Baby Burn on ImplementingScrum.com.

Friday, April 10, 2009

Google App Engine Meets Java

On April 7, three days ago, Google people announced and demonstrated the new support of the Java programming language in Google App Engine ecosystem [1].

Before starting my side project [2], I was mostly a Java [3] adopter on back-end servers. Java is widely supported and has a large and active contributing community (contributing to its core via the Java Community Process (JCP) [3] or to 3rd party libraries).

Once I decided to go with GAE, I invested a bit in improving my knowledge of Python [4], for example by looking at the WSGI specification [4] and at Django [4]. I have been impressed by the integration done by the GAE people, by how easy it is to program complex steps in very few lines! My favorite part is the main function dispatching events:
# -*- coding: utf-8 -*-

# Handlers
...

# Dispatcher
application = webapp.WSGIApplication(
    [
        ('/API/requests', RequestList),
        (r'^/API/requests/(.+)$', Request),
    ],
    debug=True
)

def main():
    run_wsgi_app(application)

if __name__ == "__main__":
    main()

Implementing a REST API, the RequestList class defines get(self) to return selected resources, and post(self) to create the proposed resource. The Request class defines get(self, key) for the identified resource retrieval, post(self, key) to update the identified resource, and delete(self, key) to delete the identified resource.

In the J2EE world, with the web.xml file forwarding /requests URLs to a servlet (as done in the app.yaml file), the servlet code has to get the URI (with HttpServletRequest.getPathInfo()) and parse it in order to detect the possible request identifier and the possible version number. IMO, Python offers a slicker interface.
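To make the contrast concrete, the by-hand parsing a servlet would perform on the path info can be sketched as follows (a Python illustration of the same logic; the URI pattern is hypothetical). This is exactly the boilerplate the WSGIApplication URL table absorbs for free:

```python
import re

# What a servlet would do by hand with the value of getPathInfo():
# match "/requests/{requestId}" with an optional "/{version}" suffix.
PATH_PATTERN = re.compile(r"^/requests/(?P<id>\w+)(?:/(?P<version>\d+))?$")

def parse_path_info(path_info):
    """Extract the optional request identifier and version from a URI path.
    Returns (id, version) where version may be None, or None on mismatch."""
    match = PATH_PATTERN.match(path_info)
    if match is None:
        return None
    return match.group("id"), match.group("version")

parsed = parse_path_info("/requests/abc123/2")
```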

Another example is the support of the JPA and JDO [5] specifications: a few annotations decorating the Data Transfer Object (DTO) class definitions allow GAE/J to deal with the persistence layer (i.e. BigTable). Compared with the Python model definitions, the getters and setters plus the annotations seem overkill, but they are necessary.

With Python allowing more compact code, why would I switch to Java?
  • Even if I gave directions on how to test GAE/P applications [6], I should admit testing GAE/J code is easier: JUnit is a de facto standard copied by many frameworks, and JCoverage is the tool that helps determine the quality of these unit tests. While working on an open source project, with possibly contributors from various horizons, relying on a strong testing infrastructure is a top priority. It is then possible I will accept the Java tradeoff in the near future...
  • The Java Virtual Machine (JVM) ported to GAE opens the door to many other languages, as reported by App Engine Fan [7]. I have a personal interest in the port of the JavaScript language, processed by Rhino, the Mozilla JavaScript engine. It would be nice to be able to run the Dojo build process on GAE itself ;)

A+, Dom
--
Sources:
  1. New features and an early look at Java for App Engine on Google official blog and Seriously this time, the new language on App Engine: Java™ of Google App Engine Blog.
  2. Announcement in preparation this time.
  3. Java history on Wikipedia, on Sun microsystems website, on IBM developerWorks website; site of the Java Community Process.
  4. Key components of GAE/P: Python, Django and its template language, WSGI
  5. Description of standards supported on GAE/J: Standards-based Persistence For Java™ Apps On Google App Engine.
  6. Automatic Testing of GAE Applications from the series Web Application on Resources in the Cloud.
  7. Hand me the Kool-Aid :-) by App Engine Fan.

Tuesday, March 17, 2009

MVC Pattern and REST API Applied to GAE Applications

(This post is part of the series Web Application on Resources in the Cloud.)

I have been developing applications for various categories of end-users since 1990. I coded front-ends in X/Motif [1] for Solaris, then in Borland OWL and Microsoft MFC for Windows [2], then in HTML/JavaScript for Web browsers. Most of the time for good reasons, JavaScript has been considered a hacking language, good for quick and dirty fixes. Anyway, I have always been able to implement the MVC pattern:
  • My first enterprise Web application relied on a back-end written in C++ as a FastCGI run by Apache [3]. I wrote an HTML template language which, in some ways, was similar to the JSP concept [4].
  • The second version of this administrative console relied on JavaScript for the rendering (a la the Dojo parser) and used frames to keep context and to handle asynchronous communication.
  • Then I learned about XMLHttpRequest [5], and with a great team, I started building a JavaScript framework able to handle complex operations and screen organizations within a single Web page. We came up with a large widget library and a custom set of data models.
  • After a job change, I discovered Dojo [6] in its early days (0.2 to 0.4) and I closely followed the path to 1.0. With Dojo, I can now relax client-side because the widget library is really huge, while still easy to extend, and it has advanced data binding capabilities. Now, I can focus on the middle-tier to build efficient REST APIs.
Among the design patterns [7], Model-View-Controller (MVC) is maybe one of the most complex (it is a combination of many basic patterns: Strategy, Composite, Observer) and maybe the one with the most varied interpretations. I did write my own guidelines when I ported the MVC pattern browser-side (proprietary work). Today, Kris Zyp's blog entry on the SitePen site summarizes my approach nicely: Client/Server Model on the Web. The following diagrams illustrate the strategy evolution.

   
Credits: SitePen.
In Google App Engine documentation, Django [8] is the template language proposed to separate the View building from the Model manipulation. Django implements the “Traditional Web Application” approach illustrated above.

My initial concern about the traditional approach is the absence of a clean and public API to control the Model. It is not rare that APIs are considered add-ons, optional features. IMHO, APIs should be among the first elements defined: it helps define the scope of the work, define iterations (in the Agile development spirit), write tests up front, and isolate bottlenecks.

With the move of the MVC pattern browser-side, the need to define a server-side API becomes obvious. The Model is now ubiquitous: interactive objects client-side have no direct interaction with the back-end server, they just interact with the Model proxy. This proxy can fetch data on demand, can pre-fetch data, and forwards most of the update requests immediately but can delay or abort (to be replayed later) the idempotent ones.
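The deferred-replay behavior of such a proxy can be sketched in a few lines. This is a toy illustration only (all names are hypothetical, and the real proxy lives in JavaScript on the client): requests are forwarded immediately, but idempotent ones that fail are queued and replayed later instead of surfacing an error:

```python
class ModelProxy:
    """Toy client-side model proxy: requests are forwarded immediately;
    idempotent requests that fail are queued for a later replay."""
    def __init__(self, transport):
        self.transport = transport  # callable performing the actual HTTP call
        self.pending = []           # idempotent requests to replay later

    def send(self, request, idempotent=False):
        try:
            return self.transport(request)
        except IOError:
            if idempotent:
                self.pending.append(request)  # safe to retry later
                return None
            raise                             # non-idempotent: report now

    def replay_pending(self):
        retry, self.pending = self.pending, []
        for request in retry:
            self.send(request, idempotent=True)

# Simulated flaky transport: fails once, then succeeds.
calls = {"n": 0}
def transport(request):
    calls["n"] += 1
    if calls["n"] == 1:
        raise IOError("connection lost")
    return "ok:" + request

proxy = ModelProxy(transport)
first = proxy.send("GET /API/products", idempotent=True)  # fails, gets queued
proxy.replay_pending()                                    # succeeds this time
```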

My favorite API template for the server-side logic is the RESTful API [9]. It is simple to implement, simple to mock, and simple to test. For my side project [10], the repository contains descriptions of products made available by micro entrepreneurs (see table 1).

Table 1: Samples of RESTful HTTP requests
Verb URL pattern Description
GET /API/products Return HTTP code 200 and the list of all products, or the list of the ones matching the specified criteria.
GET /API/products/{productId} Return HTTP code 301 with the versioned URL for the identified product, or HTTP code 404 if the identified product is not found.
GET /API/products/{productId}/{version} Return HTTP code 200 and the attributes of the identified product for the specified version, or HTTP code 301 "Moved Permanently" with a URL containing the new version information, or HTTP code 404 if the product is not found.
DELETE /API/products/{productId}/{version} Return HTTP code 200 if the deletion is successful, or HTTP code 303 or 404 if needed.
POST /API/products Return HTTP code 201 if the creation is successful with the new product identifier and its version number.
PUT /API/products/{productId}/{version} Return HTTP code 200 if the update is successful with the new product version number, or HTTP codes 303 or 404 if needed.

HTTP Code  Description
200  OK
201  Created
301  Moved Permanently
303  See Other
“The response to the request can be found under another URI using a GET method. When received in response to a PUT or POST, it should be assumed that the server has received the data and the redirect should be issued with a separate GET message.”
304  Not Modified
307  Temporary Redirect
400  Bad Request
401  Unauthorized
404  Resource Not Found
410  Gone
500  Internal Server Error
501  Not Implemented
Table 2: Partial list of HTTP status codes (see details in [11]).

Parsing a RESTful API for a Google App Engine application is not difficult. It is just a matter of using regular expressions in the app.yaml configuration file and in the corresponding Python script file.
application: prod-cons
version: 1
runtime: python
api_version: 1

handlers:
- url: /API/products.*
  script: py/products.py

- url: /html
  static_dir: html

- url: /.*
  static_files: html/redirect.html
  upload: html/redirect.html
Code 1: Excerpt of the app.yaml configuration file.
# -*- coding: utf-8 -*-

import os

from google.appengine.api import users
from google.appengine.ext import db
from google.appengine.ext import webapp
from google.appengine.ext.webapp import template
from google.appengine.ext.webapp.util import run_wsgi_app

from prodcons import common
from prodcons import model

class ProductList(webapp.RequestHandler):
    def get(self):
        # Get all products
        # ...
    def post(self):
        # Create new product
        # ...

class Product(webapp.RequestHandler):
    def get(self, productId, version):
        # Get identified product
        # ...
    def delete(self, productId, version):
        # Delete identified product
        # ...
    def put(self, productId, version):
        # Update identified product
        # ...

application = webapp.WSGIApplication(
    [
        ('/API/products', ProductList),
        (r'^/API/products/(\w+)/(\d+)$', Product),
    ],
    debug=True
)

def main():
    # Global initializations
    # ...
    run_wsgi_app(application)

if __name__ == "__main__":
    main()
Code 2: Excerpt of the Python file processing product-related requests

Note that the second code sample shows an ideal situation. In reality, I had to change the verb PUT when updating product definitions because the method self.request.get() cannot extract information from the request stream: it only works for the GET and POST verbs. The corresponding client-side code relies on dojo.xhrPost() instead of dojo.xhrPut(). If you know the fix or a work-around, do not hesitate to post a comment ;)
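One direction I have seen suggested for a work-around is to read the raw request body (which the webapp framework exposes) and parse the url-encoded parameters manually inside the put() handler. Outside of GAE, the parsing itself only needs the standard library; this sketch is illustrative, not the webapp API:

```python
try:
    from urllib.parse import parse_qs  # Python 3
except ImportError:
    from urlparse import parse_qs      # Python 2

def parse_put_body(raw_body):
    """Parse a url-encoded PUT payload the way self.request.get() would
    for a POST (illustrative work-around, not part of webapp itself)."""
    parsed = parse_qs(raw_body)
    # parse_qs returns lists of values; keep the first value of each parameter.
    return dict((key, values[0]) for key, values in parsed.items())

params = parse_put_body("name=Widget&price=42&version=3")
```

Inside the handler, the raw body would come from the request object instead of the literal string used here.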

While developing application front-ends, developers should always rely on the MVC pattern to separate the data from the user interface, and the data flows from the interaction processing. IMHO, organizing the server-side interface as a RESTful API is very clean and efficient. If you use Dojo to build your JavaScript application, you can even rely on its implementation of various RESTful data sources [12] to simplify your work.


Credits: SitePen

Pushing the MVC pattern browser-side has nasty side-effects when too much information is managed by the Model proxy:
  • Large data sets consume a lot of memory.
  • HTTP connections are a scarce and sometimes unreliable resource; rescheduling requests (important ones first, non-important ones to be replayed later) or replaying requests (because Microsoft Internet Explorer reports WSAETIMEDOUT, for example) complexifies the data flows.
  • A fine-grained API consumes a lot of bandwidth, especially when the data-to-envelope ratio is low.
  • Applications often have too few entry points, which hides the benefit of one URI per resource (an intrinsic REST value).
  • In highly volatile environments, data synchronization rapidly becomes a bottleneck if there is no push channel.
So the application performance (response time and memory consumption) should be carefully monitored during development. Even if applications with the MVC pattern organized browser-side and relying on RESTful APIs cannot do everything, they are definitely worth prototyping before starting the development of any enterprise application or application for high availability environments.

A+, Dom
--
Sources:

  1. Motif definition on wikipedia and online book introducing X/Motif programming.
  2. History of Borland Object Windows Library (OWL) on wikipedia and its positioning against Microsoft Foundation Class (MFC) library.
  3. FastCGI website, and its introduction on wikipedia.
  4. Presentation of the JavaServer Pages (JSP) technology on SUN website.
  5. History of XMLHttpRequest on wikipedia, and its specification on W3C website
  6. Dojo resources: introduction on wikipedia, Dojo Toolkit official website, Dojo Campus for tutorials and live demos, online documentation by Uxebu.
  7. Introduction of the Design Patterns on wikipedia and reference to the “Gang of Four” book (Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides). Specific presentation of the Model View Controller (MVC) pattern on wikipedia.
  8. Django : default template language available in Google App Engine development environment.
  9. Definition of the Representational State Transfer (REST) architecture on wikipedia.
  10. Future post will describe the nature of this project ;)
  11. HTTP status codes on wikipedia, and section 10 of the RFC2616 for the full status code list. Don't forget to look at the illustrations of the HTTP errors codes by the artist Adam "Ape Lad" Koford (license CC-by).
  12. RESTful JSON + Dojo Data by Kris Zyp, with details on the dojox.data.JsonRestStore introduced in Dojo 1.2.

Thursday, March 12, 2009

Telcos vs Internet providers

What a long break, isn't it? I did not give up ;) I was just busy at work helping tune the performance of a product which is going to be “GA” this month. I mostly focused on the browser-side code, trying to mitigate the effects of flawed designs. Anyway, I'll try to use some of the collected materials in a future post.


From jopemoro, with CC-ByND.
Telecommunication operators (telcos) used to have a captive market:
  • Land line subscribers do not switch very often from one provider to a competitor, even for long distance call plans. This inertia helps secure the operators' investments in other technologies.
  • Cellular phone subscribers are more likely to change, especially now that they can keep their number when subscribing to another operator. But 3-year plans are efficient tools used by operators to keep their subscribers under control.
  • Usage of the communication bandwidth is very much predictable. Companies have sized the minutes allocated to each plan so they can get the most from users who do not consume their quotas.
In this industry, the rule of “segment the market to maximize your revenues” is well applied.

Late last year, a few operators [1, 2] started offering month-to-month plans, without term contract. Why? Because subscribers want more than just voice communication; they want more flexibility at a reasonable price.

With services like Twitter [3] or Google Calendar [4], any cellphone user can get notifications by SMS. When incoming SMS messages are billed, the incentive to look for an alternative is big!

Nowadays, a growing number of people have (or want to have) smartphones with multiple communication capabilities: to send pictures, to get live information, to access a map and benefit from their embedded GPS, to use VOIP services, etc. [5]

If local communication plans usually have a fair price (unlike the long distance plans or the roaming fees), prices of the “data” plans are usually crazy. With a smartphone equipped with an 8-megapixel digital camera, sending quite a few images over the telco networks is prohibitive...


From mag3737, with CC-BA-NC-SA
So people tend to use more and more direct Internet access: in offices, at home, in cafes, libraries, etc. The demand is so high that non-profit organizations offer networks of Internet hotspots [6].

Thanks to direct Internet access, which provides better bandwidth than the cellular connections, smartphone users can make long distance calls (with Skype [7]), get their voicemail (GrandCentral [8]), and stream videos (Qik [9]).

If I can see the month-to-month plan offering as an attempt to keep their customers, telcos should also revisit their data plan offerings to avoid more erosion of their customer base! Because carrying data means carrying almost anything, I think they should transform themselves from pure telecommunication operators into Internet providers.

A+, Dom
--
Sources:

  1. No Contract Required — New Month-To-Month Agreement Gives Verizon Wireless Customers Even More Freedom, on Verizon website.
  2. Fido removing system access fees, on the CBCNews website.
  3. Twitter is a free social software where people post updates (limited to 140 characters, same limitation with SMS) and that followers can get automatically.
  4. Google Calendar is a time management platform where people can manage their agendas and invite other people. Event reminders can be sent by e-mail and by SMS for free.
  5. For a better description of possible services, see my post: Hand held devices and sensors.
  6. Free Wi-Fi initiatives: Île Sans Fil in Montréal, Wireless Toronto, etc. More in this directory.
  7. Skype is a service allowing users to make phone calls over the Internet. This popular service belongs to eBay (acquired in September 2005).
  8. GrandCentral: see the recent update posted on TechCrunch about GrandCentral, which is going to be reborn as Google Voice.

Monday, February 9, 2009

Google App Engine: Free Hosting and Powerful SDK

(This post is part of the series Web Application on Resources in the Cloud.)

Google App Engine (GAE) [1] is an open platform made available by Google to host Web applications:
  • It can serve static pages (HTML, CSS, JavaScript, images, etc.).
  • It can serve dynamic pages. The programming language is Python [2] (with limited features). The default template framework is Django [3].
  • It can persist data in Google BigTable (with a query language similar to SQL, but with restricted features).
  • It offers transparent scaling and load-balancing.
  • Its sources are open and freely available. Google allows each account to host up to three applications, as sub-domains of appspot.com (at least during the preview period).
Some people, like Dare Obasanjo [4], consider GAE an implementation of the "Platform as a Service" paradigm. I agree, and I think GAE offers a core element for implementing "Software as a Service" (the hyped SaaS). In general, I think that SaaS can help IT companies deliver value to their customers at a better quality/price ratio. Understanding GAE's strengths should encourage development teams to take a close look at the entire SaaS concept.

Free Service

GAE is offered free of charge during the preview period. In the future, customers will be billed only for what they consume (disk space, bandwidth, CPU time, etc.). This practice has been adopted by many providers of services in the cloud, like Amazon [5] and its Amazon Web Services (AWS) offering.

The Software Development Kit (SDK) [1] is open, and anyone can take a look at it, customize it for their own needs, and even submit patches. For now, only one programming language is supported: Python [2]. The SDK is delivered with a standalone runtime environment.

Python is also an open system, created by Guido van Rossum, who has been working for Google since 2005. In my opinion, this combination is an answer to developers complaining about the need to learn yet another language: Python is a really powerful language and will continue to enjoy full support from Google as one of their favorite languages.

In association with Python, Django [3] provides the template language that helps create applications compliant with the Model-View-Controller (MVC) pattern [6]. Django is also open-source software.

To get the best of the language and of the standalone GAE runtime, I strongly suggest setting up Eclipse (another open-source product) [7, 8]. Eclipse might not be the ideal candidate for GAE application development, but it provides an extensible platform that is easy to leverage. For example, egit [9] is an Eclipse plug-in handling transactions with Git repositories (like GitHub.com [10]).

Servicing static and dynamic pages

GAE can host 1,000 files, each one smaller than 1 MB, for a grand total of 500 MB per application [11]. Usually, the static files are accessories: images, style sheets, etc. But the offered space, combined with Google's scalable infrastructure, can also be leveraged to host almost any file (HTML, FLV (Flash), CSS, JavaScript, etc.). App Engine Fan describes how to set up GAE for this usage [12], as does Matt Riggott [13].

The following handler definition, which belongs in the app.yaml configuration file, indicates that all requests should be served from the corresponding files located in the static directory.
handlers:
- url: /
  static_dir: static/
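For an application that also serves dynamic pages, the static handler is typically combined with a script handler. The sketch below shows how such a complete app.yaml might look (the application identifier my-sample-app and the script name main.py are illustrative assumptions, not from the original post):

```yaml
application: my-sample-app   # hypothetical application identifier
version: 1
runtime: python
api_version: 1

handlers:
# Requests under /static are served directly from the static/ directory.
- url: /static
  static_dir: static
# Every other request is dispatched to the Python script.
- url: /.*
  script: main.py
```

Handlers are matched top to bottom, so the static mapping must precede the catch-all URL pattern.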
Dynamically generated content, like developers are used to producing with PHP, for example, can be implemented with Django templates [3]. The first template below defines the general Web page pattern; the second one extends it by overriding the extension points.


Common.html template with the Web page organization and the extension points.


Producer.html template overriding the extension points with the page specific elements.
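Since the original template screenshots are not reproduced here, the following sketch illustrates the pattern they describe: a base template declaring {% block %} extension points, and a child template overriding them (the block names and page content are invented for illustration):

```django
{# common.html — page skeleton with extension points #}
<html>
  <head><title>{% block title %}Default title{% endblock %}</title></head>
  <body>
    <div id="content">{% block content %}{% endblock %}</div>
  </body>
</html>

{# producer.html — extends the skeleton, overriding the extension points #}
{% extends "common.html" %}
{% block title %}Producer page{% endblock %}
{% block content %}<p>Page-specific elements go here.</p>{% endblock %}
```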

Note: because of internationalization concerns, I strongly recommend NOT coding Web pages like the ones above. Refer to my post on the Internationalization of GAE Applications [14] for a better implementation.

It is thus possible to quickly use GAE to host static and dynamic pages on the appspot.com domain (the pattern is http://[application-name].appspot.com/). Integrating these pages transparently into your own domain allows future updates without forcing your readers to point to a new Web address. You need to set up Google Apps for Your Domain and follow the instructions [15].

App Engine Fan explains how to prevent access to your application from unknown domains [16]. On a private network, you can even open the GAE development server to remote access [17].

Access to Google BigTable

In the Web application world, data persists mainly in databases. Databases scale, maintain indexes (providing quick search results), and support transactions (update, then commit or roll back). Most databases are relational [18]. Among the well-known relational databases are Derby, Oracle, DB2, and MySQL. SQL (Structured Query Language) is the usual query language of relational databases.

GQL (Google Query Language) is very similar to SQL [19]. The discrepancies are due to the GAE architecture. For example, to preserve the scalability of the underlying database, GQL does not offer the possibility to JOIN tables. I am no database expert, but I consider all these limitations workable, and some of them are very sane.
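Because GQL cannot JOIN, related data is typically denormalized or fetched in two round trips. The toy example below uses plain Python dictionaries rather than real GAE entities (the Author/Post model and field names are invented for illustration) to show the two-query pattern that replaces a SQL JOIN:

```python
# Two in-memory "kinds" standing in for BigTable entities.
authors = [
    {"key": "a1", "name": "Dom"},
    {"key": "a2", "name": "Alice"},
]
posts = [
    {"author_key": "a1", "title": "GAE: Free Hosting"},
    {"author_key": "a1", "title": "Internationalization"},
    {"author_key": "a2", "title": "Cloud Computing"},
]

def posts_by_author(name):
    """Emulate a SQL JOIN as two successive lookups, since a single
    JOINed query is not available in GQL."""
    # First query: find the author entity by name.
    author = next(a for a in authors if a["name"] == name)
    # Second query: filter posts by the author's key.
    return [p["title"] for p in posts if p["author_key"] == author["key"]]

print(posts_by_author("Dom"))  # ['GAE: Free Hosting', 'Internationalization']
```

In a real GAE application, each lookup would be a datastore query; the point is that the relationship is resolved in application code, not by the query engine.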

One important issue with databases is related to their central place: if they are corrupted, the system can stop working. Being able to back them up and restore them is critical. In April 2008, Google communicated about possible export file formats [20]. I have not found whether this feature has been published... However, I found Aral Balkan's Gaebar application (GAE Backup And Restore) [21], which covers the basic functionality and even more (like the staging concept).

Update 2009/02/10:
In the SDK release 1.1.9, Google describes ways to upload data from a CSV file into BigTable, and to download data onto a local development server. Refer to the documentation on the GAE Website [1].
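The bulk upload path starts from rows in a CSV file, each row becoming one datastore entity. A minimal standard-library sketch of the parsing step (the author/content column layout is an invented example, not the SDK's format) looks like:

```python
import csv
import io

# Hypothetical CSV export of a "Greeting" kind, with author and content columns.
data = "author,content\ndom,hello\nalice,bonjour\n"

# csv.DictReader maps each row to a dict keyed by the header line.
rows = list(csv.DictReader(io.StringIO(data)))
# During a bulk upload, each of these dicts would become one entity.
print(rows[0]["author"])  # dom
```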

Going further

Google has developed a GAE application that serves as a gallery for other GAE applications [22]. Many applications are described there. Interviews with successful implementers are also available on the GAE Website [1].

On April 10, 2008, Niall Kennedy posted a detailed article describing the GAE architecture [23]. Many other people continue to publish on GAE and on cloud computing issues in general [24]. It is a really hot topic ;)

Update 2009/02/10:
Dare Obasanjo published another post: Google App Engine on the road to becoming useful for building real web applications

A+, Dom
--
Sources:
  1. Google App Engine Website, and GAE Service API documentation.
  2. Official Python Website. Python history on Wikipedia. Guido van Rossum's blog (Python inventor).
  3. Django Website, with the section on its template language.
  4. Cloud Computing Conundrum: Platform as a Service vs. Utility Computing by Dare Obasanjo.
  5. Amazon Web Services Website.
  6. MVC Pattern applied to GAE Applications... (another post to be published soon).
  7. Eclipse Website.
  8. Article Configuring Eclipse on Windows to Use With Google App Engine from GAE documentation site.
  9. Eclipse plug-in for Git repositories: egit. Check egit short installation guide.
  10. My post on Git as my New SCM Solution.
  11. Quota description on GAE Website, on GAE blog, and on Wikipedia.
  12. Free Webhosting, Google App Engine style, by App Engine Fan.
  13. Using Google App Engine as your own Content Delivery Network by Matt Riggott.
  14. Internationalization of GAE Applications... (another post to be published soon).
  15. Access to the Standard Edition of Google Apps for Your Domain (GYAD) service and instructions on how to set up a GAE application for your domain.
  16. The darker side of multiplexing, or how to prevent site hijacking by App Engine Fan.
  17. Access Google App Engine Development Server Remotely, by Josh Cronemeyer.
  18. Definition of relational database by Wikipedia.
  19. GQL reference page on GAE Website.
  20. Getting your data on, and off, of Google App Engine on GAE official blog.
  21. Google App Engine Backup and Restore (Gaebar) by Aral Balkan.
  22. Google GAE Application Gallery, being itself a GAE application.
  23. Google App Engine for Developers, by Niall Kennedy.
  24. Architectural manifesto: An introduction to the possibilities (and risks) of cloud computing on developerWorks.