Expat Software
A laptop, some ideas, and a one-way ticket.
 
 

Wednesday, March 10, 2010

Why Internationalization is Hopelessly Broken in ASP.NET

I wrote an article last week describing ASP.NET's Internationalization (i18n) scheme in less than favorable terms, and it occurs to me that I should probably offer up a proper justification if I'm going to start throwing terms like 'Hopelessly Broken' around.

As several members of the ASP.NET community so eloquently pointed out in response to that article, ASP.NET does in fact offer a way to translate web sites from one language to another, and it does indeed work perfectly fine, thank you very much. That fact, I omitted to mention last week, is not in dispute and I apologize for implying as much.

To clarify, I don't mean to say that ASP.NET i18n is Hopelessly Broken to the point where it's not possible to do it, but rather that ASP.NET handles i18n in a fashion that is demonstrably worse than the accepted industry standard way of doing things which, incidentally, pre-dates ASP.NET.

Here's why.

First, let me give a quick rundown on the industry standard way of localizing websites: gettext. It's a set of tools from the GNU folks that can be used to translate text in computer programs. The ever-humble GNU crowd have a lot of documentation you can read about these tools explaining why they're so well suited for i18n and how they're a milestone in the history of computer science and incidentally how much smarter the GNU folks are than, say, you. And why you should be using emacs.

But anyway, to demonstrate why the gettext way of doing things makes so much more sense than the Microsoft way, let me run down a short list of the things you need to do to translate a website. For each task, I'll give an indication of how ASP.NET would have you do it, along with how you'd do it using hacky fixes I've put in place for the FairlyLocal library I discussed at length last week. Also, if there's a difference, I'll talk briefly about how "Everybody Else" (meaning gettext, which is in fact used by Everybody Else in the world to localize text) does it.

Identifying strings that should be marked for translation

ASP.NET: Find them by hand
FairlyLocal: Find them by hand
Everybody Else: Find them by hand (unless you're using a language that supports the emacs gettext commands for finding text and wrapping it automatically)

Marking text for translation in code

ASP.NET: Ensure that they're wrapped in some form of runat="server" control
FairlyLocal: Wrap with _()
Everybody Else: Wrap with _()

ASP.NET actually does offer one advantage here, in that many of the text messages in need of translation will already be surrounded by a runat="server" control of some description. Unfortunately, that advantage is offset by the sheer amount of typing (or copy/pasting or Regex Replacing) involved in surrounding all the static text in your application with <asp:Literal runat="server"></asp:Literal>, and by the computational overhead involved in instantiating Control objects for every one of those text fragments.

Everybody Else gets to suffer through the steady-state habit of surrounding all their text with _(""), or with a long copy/paste or Regex Replace session similar to the ASP.NET experience. It's still not all that much fun, but at least it's less typing.

Compiling a list of text fragments for use in translation

ASP.NET: Pull up each file in Design View, right click and select Create Local Resources
FairlyLocal: Build the project (thus running xgettext automatically)
Everybody Else: run xgettext

ASP.NET uses a proprietary XML file format called .resx, which is incomprehensible to humans in its raw form, but has an editor in Visual Studio.NET. Everybody Else uses .po files, a text format that's simple enough to be read and edited by non-technical translators, and there are also a variety of good standalone editors available.
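
To show how little there is to the .po format, here's a rough Python sketch that reads the simplest kind of entry: a one-line msgid followed by a one-line msgstr. The sample text, the file reference in the comment, and the parse_po name are all made up for illustration. Real .po files also allow multi-line strings, plural forms, and escape sequences, which this ignores.

```python
import re

# A made-up sample of what a translator would see in a .po file.
SAMPLE_PO = '''
# Comments start with a hash.
#: MyPage.aspx:12
msgid "Pages"
msgstr "Páginas"

msgid "Sign up"
msgstr ""
'''

# Matches a single-line msgid immediately followed by a single-line msgstr.
ENTRY = re.compile(r'^msgid "(.*)"\nmsgstr "(.*)"$', re.MULTILINE)

def parse_po(text):
    """Return {original: translation}, skipping untranslated entries."""
    return {msgid: msgstr for msgid, msgstr in ENTRY.findall(text) if msgstr}

print(parse_po(SAMPLE_PO))
```

The untranslated "Sign up" entry is skipped, so a lookup for it would fall back to the English text.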

Updating that list of text fragments as code changes

ASP.NET: Pull up each file in Design View (again), right click and select Create Local Resources (again)
FairlyLocal: Build the project (thus running xgettext automatically (again))
Everybody Else: run xgettext again
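
The extraction step is conceptually simple, which is a big part of its appeal. The Python sketch below is a toy stand-in for xgettext, not the real tool: it assumes strings are marked as _("...") with double quotes and no escaped quotes inside, the inline-expression markup in SOURCE is just a guess at what a marked-up page might look like, and the extract_messages and to_pot names are invented for this example.

```python
import re

# Finds _("...") calls; assumes double-quoted literals without escaped quotes.
MARKED = re.compile(r'_\("([^"]*)"\)')

def extract_messages(source):
    """Return marked strings in first-seen order, without duplicates."""
    seen = []
    for text in MARKED.findall(source):
        if text not in seen:
            seen.append(text)
    return seen

def to_pot(messages):
    """Render messages as empty .po entries ready for a translator."""
    return "\n".join(f'msgid "{m}"\nmsgstr ""\n' for m in messages)

# Hypothetical page source with three marked strings, one repeated.
SOURCE = '''
<h1><%= _("Pages") %></h1>
<a href="/signup"><%= _("Sign up") %></a>
<p><%= _("Pages") %></p>
'''

print(to_pot(extract_messages(SOURCE)))
```

Re-running a scan like this after every code change is cheap, which is why it can be wired into the build. The real GNU toolchain also includes msgmerge, which folds a freshly extracted list into the translations you already have.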

Specifying languages for translation:

ASP.NET: Copy the .resx file for each page on your site to a language-specific version, such as .es-ES.resx.
FairlyLocal and Everybody Else: create a language-specific folder under /locale and copy a single .po file there.

Surely there must be a tool to copy and rename the hundreds of locale-specific .resx files that ASP.NET needs for every single language, but I haven't found it yet. Please, ASP.NET camp, point me in the right direction here so I don't need to go off on a rant about this one…

Translating strings from one language to another

ASP.NET: Translator opens the project in Visual Studio.NET (seriously!) so that he can use the .resx editor there to edit the cryptic XML files containing the text.
FairlyLocal & Everybody Else: Give your translator a .po file and have him edit it as text or with a 3rd party tool such as Poedit

Identifying the language preference of the end user

Everybody: Automatically happens behind the scenes, but you can specify language preference too.
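
On the web, "behind the scenes" usually means reading the Accept-Language header that the browser sends with each request. Here's a minimal Python sketch of that negotiation. The parse_accept_language and pick_language helpers are hypothetical names, and the header handling is simplified (no wildcard support, for one).

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _sep, params = piece.partition(";")
        quality = 1.0  # a tag without q= counts as fully preferred
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        weighted.append((-quality, position, tag.strip().lower()))
    # Sort by descending quality, keeping header order for ties.
    return [tag for _quality, _position, tag in sorted(weighted)]

def pick_language(header, supported, default="en"):
    """Choose the first preferred language the site has a translation for."""
    for tag in parse_accept_language(header):
        if tag in supported:
            return tag
        base = tag.split("-")[0]  # fall back from "es-es" to "es"
        if base in supported:
            return base
    return default

print(pick_language("es-ES,es;q=0.8,en;q=0.5", {"en", "es"}))
```

An explicit choice by the user (stored in a cookie or profile, say) would normally be checked before falling back to the header.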

Referencing Translated Text (by using):

ASP.NET: Uniquely named Resource Keys
FairlyLocal: The text itself
Everybody Else: The text itself

When Visual Studio.NET does its magic, every runat="server" control will get a new attribute called meta:resourceKey containing a unique key with a helpful name such as "Literal26" or "HyperLink7" that is used to relate the text in the .resx file back to the control that uses it.

This is not actually as unhelpful as it seems, since translators will still see the Original Text in the .resx file alongside that meaningless key, so they will in fact know what text they're translating. Just not its context. Further, as ASP.NET developers we've learned to put up with a certain amount of VS.NET's autogenerated metagarbage, so we can generally gloss over these strange XML attributes that suddenly appear in our source.

Everybody else simply uses the text itself as the lookup key.
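
Using the text itself as the key has a handy side effect: when a translation is missing, gettext-style lookups hand back the original text, so the page still reads sensibly. A tiny Python sketch of that behavior, with a made-up catalog and a made-up translate helper:

```python
# Hypothetical Spanish catalog, keyed by the original English text.
SPANISH = {"Pages": "Páginas"}

def translate(text, catalog):
    """Look up by source text; a miss falls back to the text itself."""
    return catalog.get(text, text)

print(translate("Pages", SPANISH))    # found in the catalog
print(translate("Sign up", SPANISH))  # not translated yet, shown as-is
```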

Displaying text to the end user in his preferred language

ASP.NET: Automagic. Can also ask for text directly from App_LocalResources
FairlyLocal: Automagic. Can also ask for translated text directly.
Everybody Else: Automagic. Can also ask for translated text directly.

In ASP.NET, you can add keys to your .resx file by hand if there are any messages you need that didn't get sniffed from the source. Other technologies don't need to bother with this step as often, since any text appearing in the source code will be marked for translation, whether it's associated with a control or not.

Wrapping Up

A short interlude...

I'm a believer in Sturgeon's Law, which states that "90% of everything is crap." Even ASP.NET, which I feel is still miles ahead of every other web development framework, is not immune.

We've learned to avoid using pretty much all of the "Rich" controls and Designer Mode garbage that shipped with 1.1 and has plagued .NET ever since, and every new release brings a few things with it (including, alas, System.Globalization) that are best avoided.

In my opinion, that's fine, since the rest of the framework is so ridiculously productive. Don't worry though, any honest Django or Rails veteran will tell you that their frameworks also have bits that are best left alone. And hey, the most popular platform in the world for building web apps is 100% crap, so we're still miles ahead of the game here in the land of MS.

Anybody still following along will notice that while ASP.NET offers workable solutions to every stage of the i18n process, it's generally not quite as straightforward or convenient as the alternative way of doing things. ASP.NET also tends to pollute your codebase with a lot of extraneous noise in the form of meta:resourceKey attributes (why couldn't they have at least shortened that to "key" and made it part of the Control class so you could easily add it to anything?) and .resx file collections for every single page in your site, and it leaves you a little short in the Tools department when it comes time to translate those files.

So while it's certainly possible to localize a website the way that ASP.NET recommends, it is definitely a lot of work, and it tends to be quite confusing. Doing it in another technology, say Django for instance, just doesn't seem like that big a deal. That's the sort of experience that I'm trying to bring to ASP.NET with the FairlyLocal library, and I hope it's at least a good first step.

If you have any suggestions (or better still, code contributions) to make it better, I look forward to hearing from you.


Monday, March 01, 2010

Fixing Internationalization in ASP.NET

I've been building websites with ASP.NET for a little over 10 years now, and I have a dirty little secret to confess: I've never Internationalized a single one of them.

It's not from lack of trying, I can tell you. I've got a good dozen false starts under my belt, and plenty of hours spent studying the code from other people's sites that implement Internationalization (abbreviated as i18n for us lazy typists) the way that Microsoft wants you to do it. And my conclusion is that it's just plain not worth the effort.

I18n is hopelessly broken in ASP.NET. Let's look at this nice snippet of sample code to see why:

<!-- STEP ONE, in MyPage.aspx: Create Runat="Server" Literal Control: -->
<asp:Literal ID="lblPages" runat="server" meta:resourcekey="lblPagesResource1" Text="Pages"/>

<!-- STEP TWO, in MyPage.es-ES.resx: Create Message Key/Value: -->
<data name="lblPagesResource1.Text" xml:space="preserve">
  <value>Browse</value>
</data>

...and that's for EVERY piece of text in your whole site!

Notice that you need to make every single piece of localized text into a runat="server" control. And that you then need to add this crazy long attribute (that Intellisense doesn't know about, so you have to type out in full) to each one of those controls so that ASP.NET can find them in one of the Resource files that you need to generate by hand for every text fragment in your entire website.

If it sounds like a ridiculous amount of work for your developers, you're probably being charitable. In practice, it's so much extra work that nobody actually does it. That, my friends, is the reason you hardly ever see any multi-language websites written with ASP.NET.

Recently, however, my hand was truly forced. We're getting pretty close to launching FairTutor to the public, and since it has target audiences in both the United States and Latin America it pretty much needs to work in Spanish as well as English. This is the part where I start wistfully looking back to a couple Django projects we did not too long ago, and the absolute breeze it was localizing those sites. If only the rest of Django wasn't so crap, we could just port this project across and… Hang on a sec. Port. Yeah, how about we simply port that amazing Django i18n stuff over to ASP.NET instead.

That was a week ago.

Today, I'm releasing some code that I hope will single-handedly fix i18n in ASP.NET. It's based on the way that everybody else does it. Let's pause a minute to let that sink in, since many of my fellow .NET devs might not have been aware of this fact: There's another way of doing i18n, and it's so simple and straightforward that every other web framework uses it in some form or another to do multi-language websites.

In Django, PHP, Java, Rails, and pretty much everything else out there, you simply call a function called gettext() to localize text. Usually, you alias that function to _(), so you're looking at like 5 keystrokes (including quotes) to mark a piece of text for internationalization. That's simple enough that even lazy developers like me can be convinced to do it.

Better still, frameworks that use this gettext() library (it's actually a chunk of open source code from the GNU folks), also tend to come with a program that will sift through your source and automagically generate translation files for you (in .PO format, which is basic enough to be edited in notepad by non-tech-savvy translators, but is popular enough that there are several existing editors built just for it), containing every text fragment that was marked for i18n.

The whole process is so simple and straightforward that you're left to wonder why Microsoft felt compelled to spend so much time and effort reinventing it all to be worse.

Introducing FairlyLocal

I really want ASP.NET to stop forcing people to monkey with XML files and jump through hoops just to show web pages in Spanish, so I'm going to package up all this code and release it as Open Source:

FairlyLocal - Gettext Internationalization for ASP.NET

At the moment, there's not a whole lot to it. It'll find where you're using the FairlyLocal.GetText() (or its _() alias) and generate .PO files for you. And it'll suck in various language versions of those files and translate text on your website. Not much there, eh? But then that's the whole point: i18n is supposed to be simple and straightforward. Hopefully, FairlyLocal will make that an actuality for the ASP.NET community.

I look forward to hearing your feedback.

FairTutor is our latest project here at Expat. It's a website that connects Spanish teachers in South America with students in the US and lets them hold live Spanish classes online.

We'll be starting Beta classes soon, so if you want to score some free Spanish lessons, you might want to go sign up for the waiting list!


Monday, October 13, 2008

How close to Zero Friction is your signup process?

StackOverflow.com just launched this last week, and it looks pretty cool. It seems like it might be our best shot at getting back to the sort of useful discussion that we used to have on Usenet back in the '90s. Lots of signal, hardly any noise, and even the occasional correct answer. Sign me up!

Uh... wait a sec... I can't sign up.

StackOverflow has made the inexplicable blunder of requiring its users to sign in via OpenID. That means you can't simply pick a username and password, but must instead go away and find yourself an OpenID provider, sign up for that, and bring it back to StackOverflow. It's like 14 steps, depending on which provider you choose. Observe:

StackOverflow

  • Click login
  • Read a ton of instructions
  • Locate and click the "get one" link
  • Dismiss the javascript error popup from openid.net
  • Read a bunch more instructions
  • Find and click the "ClaimID" link (it's the first one on the list of providers)
  • Click "Create a new account"
  • Type in your information
  • Open your email, find their email, click the link
  • Go back to StackOverflow, click login again
  • Paste in that giant URL that is now your OpenID
  • Type in your Username & Password
  • Type in a bunch of Personal Info
  • ... and you're in! Easy as that!

Now, for sake of comparison, let's take a look at the steps required to start using Twiddla (the web meeting playground that we've been working on these last several months here at Expat):

Twiddla


Can you spot the difference?

Look, it's not just me saying this. Talk to any Usability expert you like, and they'll tell you that every barrier that you put in front of your users will cause a certain percentage of them to leave and not come back. For most sites, even stopping to ask for a Username & Password is too intrusive. That's why we built Twiddla the way we did.

Our stated goal with Twiddla is to get the hell out of your way so that you can get some work done. We've taken that idea so far that most of our users will never see a login screen of any description. Some might not ever know they've used Twiddla at all, since we keep our Logo hidden away in the corner where it's not in your way.

Can we say the same about StackOverflow's new registration system? Unfortunately not. For me, it was 10 minutes of grumbling "StackOverflow", "F'ng StackOverflow" under my breath while stumbling through the painful OpenID signup process. Complete usability failure. I can only hope they'll come to their senses and put in a reasonable username/password login like everybody else.


Tuesday, May 27, 2008

The One Rule of DHTML Programming

I just don't get it.

How can so many smart people be so collectively bad at something as simple as Javascript on a web page?

It's just not that hard. And yet, not an hour goes by when I'm not stopped in my tracks by at least one javascript error. And it's especially sad because many of these errors are coming from well known sites, with huge development budgets and plenty of good talent that really should know better. Observe:

That was just a ten minute sample of browsing today.

The One Rule of DHTML Programming

Look, it's not that hard to do this stuff right. In fact, here is everything you'll ever need to know about Dynamic HTML Programming with Javascript:

Test EVERYTHING before you reference it.

That's it. Simple. Every little scrap of code you write needs to live inside its own little IF block that tests to make sure that the things it's expecting to interact with really exist. Here's how:

BAD:
gbN2Loaded.style.display='none';
Good:
if (window.gbN2Loaded)
{
  gbN2Loaded.style.display='none';
}
BAD:
document.getElementById('myDiv').innerHTML = 'stuff';
Good:
// Look the element up once, then only touch it if it exists.
var myDiv = document.getElementById('myDiv');
if (myDiv)
{
  myDiv.innerHTML = 'stuff';
}

I don't care that Google's API Reference told you to put <body onunload='GUnload()'> into all your pages. That's just example code, and it's not intended to be used in the real world.

Real World Javascript will need to survive in dozens of browser environments that do things in strange, unexpected ways, and as soon as you get it working right, Junior Dev Jimmy will accidentally include it on every single page on your site and suddenly it won't be able to find the things it needs to live. When that happens, it needs to quietly stop trying to do stuff instead of throwing error messages all over the place.

What you need to do about it

Ok, cool, you've fixed everything, but you're not done yet. There's one more thing you need to do right this second. You need to turn on those annoying Script Error popups in both Internet Explorer and Firefox, and you need to keep them on from here on out. Don't just do it for your own machine, but for every computer owned by every employee of your company.

Yes, I know that you turned them off on purpose because they make the internet basically unsurfable, but most casual users of your site will have them on by default. That means that most casual users will see every single little script error that your site throws at them, and they won't like it. Those errors are pissing off real people right this minute, and you need to know about it. If you arrange it so that they start pissing off you and your co-workers too, you just might find the incentive to get rid of them once and for all.


Monday, September 17, 2007

How to Fail at Freelancing, in 5 Easy Steps

Let's say you're a freelance web designer, and you come across a project description for, say, the redesign of a little site that lets you host web meetings. This could be your ticket to fame and fortune. But how should you proceed? Well, frankly, you could have proceeded in just about any moderately professional way imaginable last week and scored that gig, but instead you did the following. All 200 of you.


Solid Gold!!!
1. Send out a Canned Proposal. And not just any canned proposal, send a giant letter filled with the entire contents of your website, but without the proofreading. Make sure it reads like a long-form sales letter, complete with opening phrases such as "Webmaster, your search is over! We're just the candidate for you!!!", but with more exclamation points. Be sure to include links to at least 500 websites that you may have been remotely involved with.

2. Send that Canned Proposal within the first 20 minutes, because time is of the essence. You want to make the statement that "I Didn't Give This Any Thought At All!" And you want to make that statement fast. Good employers tend to hand out knowledge work on a first-come, first-served basis, so you want to make sure you end up on top of the stack. Extra credit for sending the same bid twice!

3. Don't Read the Project Description, or visit the site you're expected to be writing a proposal for. Bid on pieces that weren't up for bid. Ignore the direct questions asked of you in the project description.

Honorable mention:
To the guy in Florida who tallied up his hours, then tacked on the 5% overhead that this particular freelance site charges (expected to be borne by the freelancer, not the employer), then made a point of demanding payment up front. Good way to break the ice and build confidence.

4. Don't, under any circumstances, write any text specific to the project at hand. If you somehow feel obligated to write a sentence or two to show how much thought you've put into your proposal, be sure to tack it on to the bottom. After your cheesy promo text, after that list of 50 links to your class projects, and after that special offer of a 20% discount if I ACT NOW because it's your Fall Sale!

5. Quote an exact dollar figure, after having completed the steps above. This is a nice final touch, as it shows that even though you didn't bother to skim through the project description before sending off your automated response, you still have a firm grasp on our needs.

How to do it right:

Freelancing on Guru, Elance, Rentacoder, etc. is all about first impressions. Your prospective client will no doubt be swamped with dozens if not hundreds of proposals, and you need to find a way to make yours stand out. This is surprisingly easy to do in the current climate, as even the slightest hint of professionalism will do it. Your competition is all trying to shout over the top of each other, thinking that's how you get heard. But you know what? Shouting "I'm an Idiot" at the top of your lungs may in fact get you heard, but it won't usually get you hired.

So here's what you need to do if you actually want to hear back from a potential client: Write a simple, three-paragraph proposal, from scratch. Spend the whole first paragraph summarizing the project and explaining why it's something you're good at. Go over your planned approach in the second paragraph, and end it with a ballpark estimate if the project description gave you enough information to do so. And finally, wrap up with a little pleasantry, and give the client a way of learning more about your operation and contacting you if they want to proceed.

Sounds simple, eh? Well, I used to do a lot of freelancing, and got a lot of work with that approach. Trust me, nobody else is doing it. Behave like a professional, and you'll clean up!


Friday, September 07, 2007

Navigating The Minefield that is Visual SourceSafe

I have a new candidate for the Most Infuriating Feature Ever. It's an innocuous little part of the source control implementation for Visual Studio.NET.

Let's say you're working on a new and risky set of changes to a project in Visual Studio.NET. You set off and start breaking things in existing files, safe in the knowledge that if you can't make it all work in the end, you'll be able to roll everything back in source control. Cut to half an hour later: things are hopelessly broken, and it's apparent that you're heading in the wrong direction. Best to cut your losses and start again from scratch, so you right click the solution in VS.NET and select "Undo Checkout" to roll everything back. As if to confirm, the following dialog pops up:

Note the default option. It's not really very descriptive, but what it's actually saying is "Roll these changes back in a half-baked way that virtually guarantees I'll accidentally re-implement them all the next time I modify any of these files."

You see, what it's doing by default is leaving a copy of your broken code sitting on your local machine. Forever. Getting latest won't even overwrite it. Neither will checking the file out. So the next time you want to modify that file, it will pull up the changes you thought you had un-done and not even warn you about it. You'll make some innocuous little text modification, check in, and find that the whole application is broken.

This is just one of many hazardous dialogs that developers running VSS have to tiptoe their way past every day. Dialogs with FIVE BUTTONS, only one of which does what Source Control was intended to do, and that one is hidden second from the left. It's enough to make you want to switch over to Subversion.

Oh, and in case you're wondering, the correct response (and the only one that anybody should ever use) to that dialog above is to tick the "Replace your local file…" radio button, check the box, and hit OK. Any other combination and you're screwed.

ps. We're currently rebranding Twiddla as a design collaboration tool for distributed teams. If you're in the industry, we'd love to hear your feedback!


Copyright © 2008 Expat Software