Sending special characters (UTF-8 encoded) to a RESTful endpoint was causing the service to throw a 500 Internal Server Error.
- the characters in question: àéïôûç
- Chrome DevTools showed the encoded values as: Ã Ã©Ã¯Ã´Ã»Ã§
- adding to the confusion, when I logged the data on the server side and opened it in Notepad++ (encoding set to UTF-8), the output still looked wrong.
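What DevTools displayed is classic mojibake: UTF-8 bytes being rendered as if they were Latin-1. A small Python sketch (the post's stack is AngularJS/Java, but the byte mechanics are the same in any language) reproduces it:

```python
# The UTF-8 bytes of the original text, re-read as Latin-1,
# produce exactly the garbled string seen in Chrome DevTools.
original = "àéïôûç"
utf8_bytes = original.encode("utf-8")    # b'\xc3\xa0\xc3\xa9\xc3\xaf\xc3\xb4\xc3\xbb\xc3\xa7'
mojibake = utf8_bytes.decode("latin-1")  # 'Ã Ã©Ã¯Ã´Ã»Ã§' (the "space" is U+00A0)
print(mojibake)
```

Each accented character becomes two bytes in UTF-8 (e.g. à is C3 A0), and Latin-1 renders each of those bytes as its own character, which is why every character in the garbled output starts with Ã.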
My initial thought was that I was passing the data incorrectly, so I opened a Stack Overflow question for the issue: angularjs $http POST special characters.
The request encoding/charset (see the W3C HTML URL Encoding Reference) is determined by the instructions passed in the headers.
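As a sketch of why the charset matters for what actually goes on the wire, Python's standard `urllib.parse.quote` shows the same characters percent-encoding differently under UTF-8 versus Latin-1:

```python
from urllib.parse import quote

# Percent-encoding operates on bytes, so the chosen charset
# determines which escape sequences the request carries.
print(quote("àéïôûç", encoding="utf-8"))    # %C3%A0%C3%A9%C3%AF%C3%B4%C3%BB%C3%A7
print(quote("àéïôûç", encoding="latin-1"))  # %E0%E9%EF%F4%FB%E7
```

If the client encodes with one charset and the server decodes with another, the mismatch produces exactly the kind of garbling described above.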
Setting up log4j logs on the server side revealed my data was making it to the endpoint correctly.
I was encouraged to read the values using plain Windows Notepad (heaven forbid I should ever use such a backwards tool!), but it showed the truth: the characters were reaching the endpoint correctly.
The issue was coming from another service being consumed. It wasn’t handling the UTF-8 characters correctly, and it was throwing an exception.
So why did I feel the need to blog about this?
I want to leave a trail of what I discovered, because someone else will inevitably have the same questions I did.
- Notepad++, the normally VERY reliable text reader, threw me for a loop. Granted, the hex values it showed were technically accurate, but mixed with the unexpected encoding revealed by Chrome DevTools, they sent me down the wrong path.
- Notepad++'s encoding was set to UTF-8.
- That threw me for a loop.
- I then zeroed in on the request encoding as it left the browser (Google Chrome’s DevTools Network tab).
- I saw the human-unreadable encoding of special characters that normally appear as clear text: Ã Ã©Ã¯Ã´Ã»Ã§
- THAT threw me for a loop
- Finally, someone suggested I look at the values in plain Windows Notepad, which reflected the truth:
"###### First Name:àéïôûç àéïôûç "
- The browser receives its encoding instructions from the header values
- Windows Notepad can still be a useful tool for validating data
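In hindsight, there is also a quick sanity check for output like Ã Ã©Ã¯Ã´Ã»Ã§: reverse the mis-decoding. This Python sketch assumes the garbled text is UTF-8 that was read as Latin-1, which is the most common cause of that Ã pattern:

```python
# If a string is UTF-8 mis-decoded as Latin-1, re-encoding it as
# Latin-1 recovers the original bytes, which then decode cleanly.
garbled = "Ã\xa0Ã©Ã¯Ã´Ã»Ã§"  # note: the apparent space is U+00A0
recovered = garbled.encode("latin-1").decode("utf-8")
print(recovered)  # àéïôûç
```

If the round-trip raises a decode error instead, the garbling came from some other encoding mismatch and needs different detective work.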