Playing around with an AJAX project at home. More of a learning project than anything. And I've run into a particularly troublesome problem with Apple's Safari browser and the XMLHttpRequest object. Ah, the joys of cross-browser development. I cringe whenever I hear that a new browser has come out.
The problem comes about when I'm making an AJAX "method call" and the response contains extended characters. Mac Safari doesn't want to play nice with these characters (yet Win/Mac Opera and Win Firefox are fine). I almost feel sorry for those who have extended characters in their name! I'm playing with this AJAX wrapper from Michael Swartz. A look under the hood reveals how things are working, and I've isolated the problem to the request made through the XMLHttpRequest object. Knocked up a quick spike (?) to see what is actually coming back from the server.
First off I tried loading two things from the XMLHttpRequest object:
- a file on the server containing the extended characters.
- simulating the request the ajax wrapper is making.
Hmmm... Loading the file works fine. The AJAX request comes back with some ugly characters in place of the extended ones. For a further look under the hood I've downloaded the Fiddler proxy to see what is going over the wire. Some observations:
Response headers
From File:
HTTP/1.1 200 OK
Connection: close
Proxy-Connection: close
Content-Length: 649
Date: Sat, 15 Oct 2005 09:09:23 GMT
Content-Type: text/xml
ETag: "b04067251cfc51:ae2"
Server: Microsoft-IIS/5.0
X-Powered-By: ASP.NET
Accept-Ranges: bytes
Last-Modified: Wed, 12 Oct 2005 07:47:07 GMT

From AJAX:
HTTP/1.1 200 OK
Connection: close
Proxy-Connection: close
Content-Length: 680
Expires: Sat, 15 Oct 2005 09:09:24 GMT
Date: Sat, 15 Oct 2005 09:09:24 GMT
Content-Type: text/html; charset=utf-8
Server: Microsoft-IIS/5.0
X-Powered-By: ASP.NET
X-AspNet-Version: 1.1.4322
Cache-Control: private
Response bytes
From File:
EF BB BF 5B 27 41 6D 65 72 69 63 61 2F 41
From AJAX:
5B 27 41 6D 65 72 69 63 61 2F 41
Different content types, and the file has three extra bytes, EF BB BF, at the front. A quick Google reveals that these bytes are the UTF-8 byte order mark. Presumably one of these differences is the reason why Safari doesn't want to treat the AJAX response as UTF-8?? Time to read up on Unicode, this is a learning exercise after all: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
Time to have a look at an AJAX application that isn't having problems with extended characters on Safari: Gmail seems to work fine. Time to fire up Fiddler again and see how the experts are doing it! I've sent my Gmail account an email containing extended characters, and I'm interested to see how the content type and BOM will look. The content type response header from Gmail looks like this:
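To make the byte order mark observation concrete, here is a small Python sketch (not part of the original spike) that checks the two captured byte sequences for the UTF-8 BOM. The helper names are made up for illustration; `codecs.BOM_UTF8` is the standard library constant for the bytes EF BB BF.

```python
import codecs

# Leading bytes of each response, copied from the Fiddler captures above.
from_file = bytes.fromhex("EF BB BF 5B 27 41 6D 65 72 69 63 61 2F 41")
from_ajax = bytes.fromhex("5B 27 41 6D 65 72 69 63 61 2F 41")


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Drop a leading UTF-8 byte order mark, if there is one."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data


print(has_utf8_bom(from_file))  # the file response carries the BOM
print(has_utf8_bom(from_ajax))  # the AJAX response does not
# Once the BOM is removed, the file bytes start with the AJAX bytes.
print(strip_utf8_bom(from_file).startswith(from_ajax))
```

So apart from the three BOM bytes, the two payloads begin identically; the difference Safari sees is in how the encoding is signalled, not in the characters themselves.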
Content-Type: text/html; charset=utf-8
The byte response starts off like this:
61 | 34 | 34 | 0D | 0A | 3C | 68 | 74 | 6D | 6C |
a | 4 | 4 | . | . | < | h | t | m | l |
And the HTML contains this meta tag:
61 34 34 isn't a byte order mark; maybe it's just part of Google's own 'wire format'. This data is being read in via JavaScript, so maybe they are just stashing something handy here? The meta tag is worth experimenting with. So I create two HTML files, one with the meta tag and one without, and load both via the XMLHttpRequest object. Interesting: the one containing the meta tag loads okay! Here's a stab-in-the-dark theory: maybe Safari doesn't pass the content type from the response headers on to the XMLHttpRequest object, but it does interpret the meta tag? It previously loaded the file okay - but that contained a byte order mark, which would've flagged it as UTF-8 regardless of the content type.
Now to experiment with this in my AJAX world. This time I try loading an ASPX page from the XMLHttpRequest object. A plain vanilla ASPX page with extended characters doesn't display the characters correctly. Now to try it with the meta tag. Visual Studio even provides an easy way of adding the tag: edit the ASPX page in design mode -> Properties -> Charset -> Unicode (UTF-8). And how does this look in Safari? Even uglier!
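On the 61 34 34 0D 0A prefix: one other reading, offered as a guess rather than something confirmed from the capture, is that these bytes are HTTP framing and not part of Gmail's payload at all. They have the shape of a chunk-size line in HTTP/1.1 chunked transfer encoding (hex digits followed by CRLF). That would only hold if the response carried a Transfer-Encoding: chunked header, which isn't shown above. A minimal Python sketch of that interpretation, with a hypothetical helper name:

```python
from typing import Tuple

# Leading bytes of the Gmail response, copied from the capture above.
prefix = bytes.fromhex("61 34 34 0D 0A 3C 68 74 6D 6C")


def parse_chunk_size_line(data: bytes) -> Tuple[int, bytes]:
    """Hypothetical helper: read one chunk-size line (hex digits + CRLF).

    Returns the chunk size and whatever bytes follow the CRLF.
    Chunk extensions (";name=value") are dropped in this sketch.
    """
    line, sep, rest = data.partition(b"\r\n")
    if not sep:
        raise ValueError("no CRLF found, so this is not a chunk-size line")
    size_text = line.split(b";", 1)[0].strip().decode("ascii")
    return int(size_text, 16), rest


size, rest = parse_chunk_size_line(prefix)
print(size)  # "a44" read as hexadecimal
print(rest)  # the bytes after the CRLF, i.e. the start of the HTML
```

Read this way, "a44" would simply announce that 2628 bytes of body follow, and the actual content would start at the `<html` that comes after the CRLF.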
A closer look at the way Belém, Pará is being encoded:
From File:
42 65 6C C3 A9 6D 2C 20 50 61 72 C3 A1
From AJAX:
42 65 6C C3 A9 6D 2C 20 50 61 72 C3 A1
From ASPX (without META):
42 65 6C C3 A9 6D 2C 20 50 61 72 C3 A1
From ASPX (with META):
42 65 6C C3 83 C2 A9 6D 2C 20 50 61 72 C3 83 C2 A1
Huh?? The first three are identical and look pretty consistent with my (newfound) understanding of UTF-8. Then adding the meta tag to the ASPX page is causing ASP.NET to change the response?? Is it getting encoded twice? Time to hit Google and find an explanation for this one!
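The "encoded twice" hunch can be checked against the bytes above. The Python sketch below encodes the string as UTF-8, misreads those bytes as a single-byte code page, and encodes the result as UTF-8 again. Which code page was actually involved is an assumption: Latin-1 is used here, and Windows-1252 maps these particular bytes the same way.

```python
# "Belém, Pará", written with escapes so the source file's own encoding can't interfere.
text = "Bel\u00e9m, Par\u00e1"


def hex_bytes(data: bytes) -> str:
    """Format bytes the same way as the hex dumps in this post."""
    return " ".join(f"{b:02X}" for b in data)


# Correct: the text encoded as UTF-8 once.
utf8_once = text.encode("utf-8")

# The suspected mistake: treat those UTF-8 bytes as Latin-1 characters,
# then encode those characters as UTF-8 a second time.
utf8_twice = utf8_once.decode("latin-1").encode("utf-8")

print(hex_bytes(utf8_once))   # 42 65 6C C3 A9 6D 2C 20 50 61 72 C3 A1
print(hex_bytes(utf8_twice))  # 42 65 6C C3 83 C2 A9 6D 2C 20 50 61 72 C3 83 C2 A1

# Undoing the second step recovers the original text.
print(utf8_twice.decode("utf-8").encode("latin-1").decode("utf-8") == text)
```

The second line of output matches the "ASPX with META" capture byte for byte, which is consistent with the content being UTF-8 encoded twice somewhere along the way.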
The ASP.NET Resources site has an interesting article: Unicode in Visual Studio.NET 2003 - yet this mentions:
Some people—including myself—use the http-equiv="content-type" meta tag. In the course of this research I learned that this header has no bearing on anything because ASP.NET will always set a response encoding—yours or a default one.
Can't find anything about the meta tag on the web. Played around a bit, and had a look at two ASPX files, with and without the meta tag. Viewing them in a hex editor shows that Visual Studio is actually encoding the files differently when the meta tag is present! This only affects the contents of the ASPX page - it doesn't affect a label on the ASPX page that has its Text property set from the code-behind.
[Hex editor screenshots: the ASPX file saved with no meta tag, and with the meta tag]
Discoveries
- The XMLHttpRequest object under Safari seems to require the meta tag present to correctly interpret a UTF-8 encoded response.
- Visual Studio changes the encoding used to save the ASPX file when this meta tag is present.
"When all is said and done - much more is said than done". Some interesting discoveries there. But this doesn't actually solve my initial problem. The response from the AJAX method only contains the result of the method. I could change it to wrap the reponse with html, head, meta and body tags - which seems to be the way Gmail is doing it. Experimented with this and it does actually work. But I'd have to strip out all the wrapper client-side. More client side code means more cross-browser worries. Or I could change the AJAX wrapper to make its response look more like downloading a file - the content type header, and the byte order mark at the start. This should 'play nice' in other browsers. Ideally I'd want something that integrates nicely back into Michael Swartz's AJAX project - shielding future developers from the palava I've just gone though. Lots of work for a browser that makes up for 0.85% - 5.1% of the net (when you include all KHMTL based browsers)! As I said - all a learning exercise!