Tuesday, January 27, 2009

Converting and Encoding a URL Containing Extended ASCII Characters (Delphi utility with source)

It has come to my attention that some programming languages do not always make it very easy to convert and encode a URL.

While helping someone bug-test a media player and add Last.fm support to it, we ran into a problem opening URLs to the correct page on Last.fm's site when the title, artist, or other tag info contained extended ASCII characters. Take this track, for example:


  • Artist: Bjørn Lynne
  • Title: Methydias Cloudship

This would be the correct URL on their site for this song:


But in AHK (AutoHotkey), which does not support Unicode, attempting to build the URL resulted in this incorrect one:


The problem was that the URL has to be converted from ANSI to UTF-8 before it is percent-encoded.
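To make the difference concrete, here is a minimal Python sketch (not the Delphi code from the download) that percent-encodes the artist name both ways. It assumes the "ANSI" code page is Windows-1252, and it only encodes the name itself rather than reproducing the exact Last.fm URLs.

```python
from urllib.parse import quote

artist = "Bjørn Lynne"

# Encoding the ANSI bytes directly (assumed here to be Windows-1252):
# "ø" is the single byte 0xF8.
ansi_encoded = quote(artist.encode("cp1252"))

# Converting to UTF-8 first: "ø" becomes the two bytes 0xC3 0xB8.
utf8_encoded = quote(artist.encode("utf-8"))

print(ansi_encoded)  # Bj%F8rn%20Lynne
print(utf8_encoded)  # Bj%C3%B8rn%20Lynne
```

A server that expects UTF-8 will not read `%F8` as "ø", which is why the first form leads to the wrong page.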

At first he wasn't sure if he could fix the problem, as that was what he thought he was doing in the first place.

So, while he took a break from bug fixing to go play with his kids (all coding and no play can make a coder and his family miserable), I went ahead and made a little helper utility for him, just in case he couldn't resolve the issue.

I am not sure of the details of how he fixed it, but he did, and he didn't need my little utility after all.

But I am not one to let code go to waste, and my intention was to help someone resolve a problem, so I decided that this code will still do just that, one way or another. I am pretty sure there is someone out there who could find this useful for something.

So, here it is, a small command line utility for Windows that will accept a parameter of a URL containing extended ASCII characters, convert the URL to UTF-8, then properly encode it, and finally, open that URL in the system's default browser.
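The steps the utility performs can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the Delphi source from the download: the names `encode_url` and `main` are hypothetical, the input is assumed to arrive already decoded as text (so the ANSI-to-Unicode step is implicit), and non-ASCII host names are not handled.

```python
import sys
import webbrowser
from urllib.parse import quote, urlsplit, urlunsplit


def encode_url(url):
    """Percent-encode the UTF-8 bytes of a URL's path, query and fragment."""
    parts = urlsplit(url)
    # quote() encodes str input as UTF-8 by default. "%" is kept safe so
    # sequences that are already percent-encoded are not encoded twice.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&+%")
    fragment = quote(parts.fragment, safe="%")
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))


def main(argv):
    """Encode the URL given as the only argument and open it in the default browser."""
    if len(argv) != 2:
        print("usage: encodeurl.py <url>", file=sys.stderr)
        return 1
    webbrowser.open(encode_url(argv[1]))
    return 0
```

Calling `main(sys.argv)` from a script would mimic the command line behavior; `encode_url("http://example.com/music/Bjørn Lynne")` returns `http://example.com/music/Bj%C3%B8rn%20Lynne` (example.com is a placeholder host).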

I have included the Delphi source in case you may need that, too.

While testing this with various browsers, I noticed something peculiar. Certain browsers display a URL in the address bar that differs from the URL actually in use. Although the URL is properly converted and encoded, the address bar shows the original extended ASCII characters. I suppose this is to make it look pretty.

The following browsers exhibited this behavior:

  • Firefox 3
  • Opera 9
  • Chrome 1

Additionally, while both Firefox and Chrome copied the correct URL to the clipboard from the address bar, Opera did not.

Opera copied the extended ASCII characters instead.

Internet Explorer did not exhibit these behaviors and displayed the actual URL in the address bar.

Rest assured, this application does work properly, even if your browser isn't displaying what you think it should be showing in its address bar.
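One way to convince yourself that the pretty form and the encoded form are the same address is to decode the encoded one. The snippet below is a small Python sketch of that check, using the artist name from earlier as sample input; it says nothing about how any particular browser implements its display.

```python
from urllib.parse import quote, unquote

encoded = quote("Bjørn Lynne")   # what is actually sent: UTF-8, percent-encoded
displayed = unquote(encoded)     # what a "pretty" address bar might show

print(encoded)    # Bj%C3%B8rn%20Lynne
print(displayed)  # Bjørn Lynne
```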

I don't mind whether you use the code or the compiled utility in an open source or closed source project, for non-commercial or commercial purposes, and you don't have to give me any credit or compensation if you use it.

My goal was to help someone and if this can help you in some way, feel free to use it in any way you wish. I have released both the utility and the source into the public domain.


This application and its source code are dedicated to the Public Domain.

Download: Delphi encodeURL utility with source
