So, I dug up the standard for SMS text encoding: GSM 03.38. Most of its default alphabet maps onto an ISO character set called ISO 8859-1, which is in turn extremely similar to Microsoft's Windows-1252 character set.
Sending a message through Kannel
I was using Kannel, the open source SMS gateway software, to send out the SMSs. Now, Kannel expects the text of every message posted to it over HTTP to be in the Windows-1252 encoding.
So if you're using ASP.NET, you must URL encode your text using the Windows-1252 encoding before making the HTTP request to Kannel. Otherwise, the message received on the device on the other end will look like gibberish.
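The same idea can be sketched outside ASP.NET. Here is a minimal Python version, assuming Kannel's usual sendsms endpoint with its username, password, to and text parameters; the function name, host, port and credentials are placeholders made up for illustration, so adjust them to your own setup.

```python
from urllib.parse import urlencode


def build_sendsms_url(base_url, username, password, to, text):
    """Build a Kannel sendsms URL with the text percent-encoded as Windows-1252.

    base_url and the credentials are placeholders for this sketch.
    """
    params = {
        "username": username,
        "password": password,
        "to": to,
        "text": text,
    }
    # encoding="cp1252" makes urlencode convert each value to Windows-1252
    # bytes before percent-encoding them. errors="strict" raises
    # UnicodeEncodeError for characters outside that set, instead of
    # silently sending something the gateway would misread.
    query = urlencode(params, encoding="cp1252", errors="strict")
    return f"{base_url}?{query}"


if __name__ == "__main__":
    url = build_sendsms_url(
        "http://localhost:13013/cgi-bin/sendsms",  # assumed host and port
        "tester", "secret", "123456", "café £5",
    )
    print(url)
```

Failing loudly on characters outside Windows-1252 is a deliberate choice here: it is easier to catch the problem before the request is made than to debug gibberish on the handset.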
Receiving a message through Kannel (Kannel Post)
When Kannel receives a message, it first tries to decode it as ISO 8859-1. If decoding the message with that 8-bit character set fails, it tries 16-bit Unicode Big Endian (UTF-16BE).
If it is configured to post the message to a designated URL, it will first URL-encode the received text using the character set it determined, and then supply the name of that character set in the URL as a query string parameter.
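On the receiving side, this means the text has to be percent-decoded to raw bytes first and only then decoded with the character set named in the request. A rough Python sketch follows, assuming the post URL was configured so that the text arrives in a parameter called text and the character set in one called charset; both names are hypothetical and depend on how the URL is set up in Kannel.

```python
from urllib.parse import unquote_to_bytes


def decode_kannel_query(query):
    """Decode the message text from a raw query string.

    Assumes hypothetical parameter names "text" and "charset".
    """
    raw = {}
    for pair in query.split("&"):
        if not pair:
            continue
        name, _, value = pair.partition("=")
        # Keep the values as bytes: the right text decoding is not known
        # until the charset parameter has been read.
        raw[name] = unquote_to_bytes(value.replace("+", " "))
    # Fall back to ISO 8859-1 if no charset was supplied (an assumption).
    charset = raw.get("charset", b"ISO-8859-1").decode("ascii")
    return raw.get("text", b"").decode(charset)


if __name__ == "__main__":
    print(decode_kannel_query("text=caf%E9&charset=ISO-8859-1"))
    print(decode_kannel_query("text=%04%1F%04%40&charset=UTF-16BE"))
```

Many web frameworks decode query strings as UTF-8 before your code sees them, which would mangle both cases above, so the sketch works from the raw query string instead.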
If you want to receive your messages in ISO 8859-1, it is important to stick to the characters defined in that set. Otherwise, Kannel will call your post URL with Unicode-encoded text.
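One quick way to check whether a piece of text stays inside ISO 8859-1 is simply to try encoding it. This is only a sketch with a made-up function name, and it tests against ISO 8859-1 itself, not against the GSM 03.38 alphabet.

```python
def fits_iso_8859_1(text):
    """Return True if every character in text exists in ISO 8859-1."""
    try:
        text.encode("iso-8859-1")
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    print(fits_iso_8859_1("café"))  # plain Latin-1 text
    print(fits_iso_8859_1("€5"))    # the euro sign is in Windows-1252 but not ISO 8859-1
```

The euro sign is a handy reminder that Windows-1252 and ISO 8859-1 are similar but not identical.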
Resources / References:
- Kannel: Open Source WAP and SMS gateway
- GSM 03.38 Character Set: Ref 1, Ref 2, ISO 8859-1 Mapping
- ISO-8859-1 Encoding, Windows 1252 Encoding