Posted by: sourceoffailure | May 16, 2008


One of the most important qualities of programs written today is support for the many languages used around the world. Every program, regardless of its intended audience, should be saved in a Unicode encoding [preferably UTF-8] and written with localization in mind. Skipping internationalization leads to plenty of Unicode bugs!

I can’t describe the confusion I felt while testing a little Java program of my own that was supposed to display some Kanji. To my horror, it wasn’t working, and I thought that I had left a bug unsquashed. In reality, I was saving the source file in an ISO encoding that guaranteed only ANSI characters (to my knowledge). Saving it as UTF-8 solved the problem.
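A minimal sketch of that failure mode, assuming the ISO format was something like ISO-8859-1 (the post does not say which one). The class name EncodingDemo and the helper roundTrip are made up for illustration, and the sketch converts a string in memory rather than saving a source file, but the loss is of the same kind: an encoding with no Kanji in it cannot carry them.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not the original program from this post.
final class EncodingDemo {

    // Encode the text to bytes with the given charset, then decode it again.
    // Characters the charset cannot represent are replaced during encoding.
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Unicode escapes for the two characters of the word "kanji",
        // so this file compiles the same way under any source encoding.
        String kanji = "\u6F22\u5B57";

        // UTF-8 can represent every Unicode character, so nothing is lost.
        System.out.println(kanji.equals(roundTrip(kanji, StandardCharsets.UTF_8)));      // true

        // ISO-8859-1 has no Kanji; each character is replaced with '?'.
        System.out.println(kanji.equals(roundTrip(kanji, StandardCharsets.ISO_8859_1))); // false
    }
}
```

When the source file itself is the problem, the compiler also has to read it with the encoding it was saved in; javac accepts an -encoding option for that (for example, -encoding UTF-8).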

The second thing I see and loathe in code is the assumption that there will never be any localization. One of the greatest little lessons I learned from my high school teacher was to declare each and every string as a constant and leave it at the top of the file. Then, when somebody wants to localize the program, they do not have to hunt through the code of each and every file to find the text to edit!
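A small sketch of what that advice could look like in Java. The class Greeter, its constants, and its methods are invented for this example, not taken from any real project.

```java
// Hypothetical example class illustrating "strings as constants at the top".
final class Greeter {

    // Every user-visible string lives here, at the top of the file.
    // A translator only needs to edit this block.
    private static final String GREETING = "Hello, %s!";
    private static final String FAREWELL = "Goodbye, %s.";

    // The logic below refers to the constants and never embeds text itself.
    static String greet(String name) {
        return String.format(GREETING, name);
    }

    static String farewell(String name) {
        return String.format(FAREWELL, name);
    }

    public static void main(String[] args) {
        System.out.println(greet("Ada"));    // Hello, Ada!
        System.out.println(farewell("Ada")); // Goodbye, Ada.
    }
}
```

Using a format placeholder instead of gluing pieces together with + also lets a translator move the name to wherever it belongs in the sentence for their language.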

I’ve even seen this failure littered throughout DrProject, and I would certainly be tempted to fix every instance of it. The only case where I could see skipping this is C, which has no overloaded + operator for easily concatenating strings or mixing in other data. Java and Python, therefore, have no excuse!

