Why Internationalization is Hopelessly Broken in ASP.NET
Here's why.
First, let me give a quick rundown on the industry standard way of localizing websites: gettext. It's a set of tools from the GNU folks that can be used to translate text in computer programs. The ever-humble GNU crowd have a lot of documentation you can read about these tools explaining why they're so well suited for i18n and how they're a milestone in the history of computer science and incidentally how much smarter the GNU folks are than, say, you. And why you should be using emacs.
But anyway, to demonstrate why the gettext way of doing things makes so much more sense than the Microsoft way, let me run down a short list of the things you need to do to translate a website. For each task, I'll give an indication of how ASP.NET would have you do it, along with how you'd do it using hacky fixes I've put in place for the FairlyLocal library I discussed at length last week. Also, if there's a difference, I'll talk briefly about how "Everybody Else" (meaning gettext, which is in fact used by Everybody Else in the world to localize text) does it.
Identifying strings that should be marked for translation
ASP.NET: Find them by hand
FairlyLocal: Find them by hand
Everybody Else: Find them by hand (unless you're using a language that supports the emacs gettext commands for finding strings and wrapping them automatically)
Marking text for translation in code
ASP.NET: Ensure that they're wrapped in some form of runat="server" control
FairlyLocal: Wrap with _()
Everybody Else: Wrap with _()
ASP.NET actually does offer one advantage here, in that many of the text messages in need of translation will already be surrounded by a runat="server" control of some description. Unfortunately, that advantage is offset by the sheer amount of typing (or copy/pasting or Regex Replacing) involved in surrounding all the static text in your application with "<asp:literal runat="server"></asp:literal>", and by the computational overhead involved in instantiating Control objects for every one of those text fragments. Everybody Else gets to suffer through the steady-state habit of surrounding all their text with _(""), or through a long copy/paste or Regex Replace session similar to the ASP.NET experience. It's still not all that much fun, but at least it's less typing.
Compiling a list of text fragments for use in translation
ASP.NET: Pull up each file in Design View, right click and select Create Local Resources
FairlyLocal: Build the project (thus running xgettext automatically)
Everybody Else: run xgettext
ASP.NET uses a proprietary XML file format called .resx, which is incomprehensible to humans in its raw form, but has an editor in Visual Studio.NET. Everybody Else uses .po files, a text format that's simple enough to be read and edited by non-technical translators, and there are also a variety of good standalone editors available.
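As a rough illustration of what the extraction step produces, here is a toy extractor. It is emphatically not xgettext: the regex, the function names, and the sample page are all assumptions of mine, and real .po files also carry a header entry and source-location comments. It just finds _("...") calls and emits msgid/msgstr pairs.

```python
import re

# Matches _("...") with a double-quoted literal; escaped characters are tolerated.
MARKED = re.compile(r'_\(\s*"((?:[^"\\]|\\.)*)"\s*\)')


def extract_msgids(source):
    """Return marked strings in first-seen order, without duplicates."""
    seen = []
    for match in MARKED.finditer(source):
        text = match.group(1)
        if text not in seen:
            seen.append(text)
    return seen


def to_po_template(msgids):
    """Emit one msgid/msgstr pair per string, with msgstr left empty."""
    entries = ['msgid "%s"\nmsgstr ""' % m for m in msgids]
    return "\n\n".join(entries) + "\n"


# Hypothetical page fragment with one repeated string.
page = '<h1><%= _("Welcome") %></h1> <p><%= _("Sign in") %></p> <%= _("Welcome") %>'
print(to_po_template(extract_msgids(page)))
```

Even in this stripped-down form you can see why translators cope with the format: each entry is the original text on one line and a blank to fill in on the next.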
Updating that list of text fragments as code changes
ASP.NET: Pull up each file in Design View (again), right click and select Create Local Resources (again)
FairlyLocal: Build the project (thus running xgettext automatically (again))
Everybody Else: run xgettext again
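Re-running the extraction gives you a fresh list of strings, and the translations you already have need to be carried over to it. In the gettext toolchain that job belongs to msgmerge. The sketch below only mimics the idea with plain dictionaries; merge_catalog and the sample data are hypothetical, and it skips the fuzzy matching the real tool can do.

```python
def merge_catalog(current_msgids, old_translations):
    """Keep translations for strings still in the code; new strings start empty."""
    merged = {}
    for msgid in current_msgids:
        merged[msgid] = old_translations.get(msgid, "")
    # Strings that vanished from the code are reported rather than silently dropped.
    obsolete = sorted(set(old_translations) - set(current_msgids))
    return merged, obsolete


old = {"Welcome": "Bienvenido", "Log in": "Entrar"}
current = ["Welcome", "Sign in"]  # "Log in" was reworded to "Sign in"

merged, obsolete = merge_catalog(current, old)
print(merged)
print(obsolete)
```

Because the text itself is the key, rewording a string shows up as one new untranslated entry plus one obsolete entry, which is exactly the work the translator needs to see.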
Specifying languages for translation:
ASP.NET: Copy the .resx file for each page on your site to a language-specific version, such as .es-ES.resx.
FairlyLocal and Everybody Else: create a language-specific folder under /locale and copy a single .po file there.
Surely there must be a tool to copy and rename the hundreds of locale-specific .resx files that ASP.NET needs for every single language, but I haven't found it yet. Please, ASP.NET camp, point me in the right direction here so I don't need to go off on a rant about this one…
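The one-folder-per-language convention is simple enough to sketch. The layout below (locale/<language>/messages.po) and the candidate_po_paths helper are assumptions for illustration; stock gettext installations typically nest the file under an extra LC_MESSAGES directory. The sketch also tries the bare language when the regional variant has no folder.

```python
from pathlib import PurePosixPath


def candidate_po_paths(language_tag, root="locale", domain="messages"):
    """List .po paths to try, most specific first: es-ES, then es."""
    parts = language_tag.replace("_", "-").split("-")
    tags = ["-".join(parts[:n]) for n in range(len(parts), 0, -1)]
    return [str(PurePosixPath(root) / tag / (domain + ".po")) for tag in tags]


print(candidate_po_paths("es-ES"))
```

Adding a language is then one new folder and one copied file, no matter how many pages the site has.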
Translating strings from one language to another
ASP.NET: Translator opens the project in Visual Studio.NET (seriously!) so that he can use the .resx editor there to edit the cryptic XML files containing the text.
FairlyLocal & Everybody Else: Give your translator a .po file and have him edit it as text or with a third-party tool such as POedit.
Identifying the language preference of the end user
Everybody: Automatically happens behind the scenes, but you can specify language preference too.
Referencing Translated Text (by using):
ASP.NET: Uniquely named Resource Keys
FairlyLocal: The text itself
Everybody Else: The text itself
When Visual Studio.NET does its magic, every runat="server" control will get a new attribute called meta:resourceKey containing a unique key with a helpful name such as "Literal26" or "HyperLink7" that is used to relate the text in the .resx file back to the control that uses it. This is not actually as unhelpful as it seems, since translators will still see the Original Text in the .resx file alongside that meaningless key, so they will in fact know what text they're translating. Just not its context. Further, as ASP.NET developers we've learned to put up with a certain amount of VS.NET's autogenerated metagarbage, so we can generally gloss over these strange XML attributes that suddenly appear in our source. Everybody Else simply uses the text itself as the lookup key.
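The difference between the two lookup styles is easy to show side by side with plain dictionaries. Everything here is hypothetical sample data: the "Literal26.Text" key merely imitates the flavor of a generated resource key, and by_key and by_text are names I made up.

```python
# Key-based lookup: the key says nothing about the text it stands for.
resources_default = {"Literal26.Text": "Shopping cart"}
resources_es = {"Literal26.Text": "Carrito de compras"}

# Text-based lookup: the original string is itself the key.
catalog_es = {"Shopping cart": "Carrito de compras"}


def by_key(key, resources, fallback):
    # A missing translation needs a second table to recover the original text.
    if key in resources:
        return resources[key]
    return fallback[key]


def by_text(text, catalog):
    # A missing translation falls back to the source text, with no extra table.
    return catalog.get(text, text)


print(by_key("Literal26.Text", resources_es, resources_default))
print(by_text("Shopping cart", catalog_es))
print(by_text("Checkout", catalog_es))
```

With the text as the key, an untranslated string degrades to the original language on its own, and the source code stays readable without opening a resource file.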
Displaying text to the end user in his preferred language
ASP.NET: Automagic. Can also ask for text directly from App_LocalResources
FairlyLocal: Automagic. Can also ask for translated text directly.
Everybody Else: Automagic. Can also ask for translated text directly.
In ASP.NET, you can add keys to your .resx file by hand if there are any messages you need that didn't get sniffed from the source. Other technologies don't need to bother with this step as often, since any text appearing in the source code will be marked for translation, whether it's associated with a control or not.
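The "automagic" part usually boils down to reading the browser's Accept-Language header, picking the best language you actually have, and looking the text up. Here is a rough sketch of that flow. The function names and the CATALOGS sample data are assumptions, and the header parsing is deliberately simplified (it ignores wildcards, for example).

```python
# Hypothetical in-memory catalogs keyed by language, then by source text.
CATALOGS = {"es": {"Welcome": "Bienvenido"}}


def preferred_languages(accept_language):
    """Parse an Accept-Language header into tags ordered by descending q-value."""
    ranked = []
    for position, item in enumerate(accept_language.split(",")):
        tag, sep, params = item.strip().partition(";")
        tag = tag.strip()
        if not tag:
            continue
        quality = 1.0  # a tag without an explicit q-value counts as 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            ranked.append((-quality, position, tag))
    return [tag for neg_q, position, tag in sorted(ranked)]


def pick_language(accept_language, available, default="en"):
    """Return the first preferred language we have, trying es-ES and then es."""
    for tag in preferred_languages(accept_language):
        if tag in available:
            return tag
        base = tag.split("-")[0]
        if base in available:
            return base
    return default


def translate(text, language):
    # Asking for translated text directly; unknown text comes back unchanged.
    return CATALOGS.get(language, {}).get(text, text)


language = pick_language("es-ES,es;q=0.9,en;q=0.8", CATALOGS)
print(language, translate("Welcome", language))
```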
Wrapping Up