Both Oracle and PL/SQL Developer seem to be a real pain to work with when it comes to Unicode encoded as UTF8.
I thought I had nailed down a recipe that worked for PL/SQL Developer 11.0.3. Then we upgraded to 11.0.5 and now it corrupts our Unicode files again.
1) I have configured PL/SQL Developer to enable UTF8
2) I have configured PL/SQL Developer to save as UTF8 without BOM.
3) I have set NLS_LANG to AMERICAN_AMERICA.AL32UTF8
Greek letter
I save a file containing just the Greek "theta" character (U+03B8), in UTF8 format, in Notepad++.
Using a hex viewer I see that it contains the two bytes CE B8, which is correct:
http://www.fileformat.info/info/unicode/char/03b8/index.htm
When I open this as a Program File, it displays correctly, but when I save it I get C3 8E C2 B8 0D 0A.
If we ignore the CR+LF at the end, it seems each of the two original bytes has been treated as a separate single-byte character and then re-encoded as its own UTF8 sequence (CE becomes C3 8E, B8 becomes C2 B8).
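The byte pattern can be reproduced outside PL/SQL Developer. This is only a sketch of what I suspect is happening, not a statement about how the editor is implemented: it assumes the file is read with a single-byte code page (Latin-1 here; Windows-1252 gives the same result for these two bytes) and then written back as UTF8. The function name `double_encode` is my own.

```python
def double_encode(utf8_bytes: bytes) -> bytes:
    """Simulate the suspected bug: read UTF8 bytes as Latin-1, save as UTF8."""
    # Assumption: the editor misreads each byte as one Latin-1 character.
    misread = utf8_bytes.decode("latin-1")
    # Each of those characters is then written out as its own UTF8 sequence.
    return misread.encode("utf-8")


theta = "\u03b8".encode("utf-8")           # the file saved from Notepad++
corrupted = double_encode(theta)

print(theta.hex(" ").upper())              # CE B8
print(corrupted.hex(" ").upper())          # C3 8E C2 B8
```

The output matches the saved file apart from the trailing 0D 0A, which is consistent with the double-encoding theory.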
When I open it as a SQL Script it is displayed as two characters.
And when I save this, I get C3 8E C2 B8 0D 0A, same as above, and equally wrong.
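For files that have already been damaged this way, the transformation looks reversible. The following is a hedged sketch, again assuming the misread code page was Latin-1; the helper `undo_double_encoding` is hypothetical and is only a heuristic, since a file that legitimately contains text such as "Î¸" would be wrongly "repaired" too.

```python
def undo_double_encoding(data: bytes) -> bytes:
    """Try to reverse one round of Latin-1/UTF8 double encoding.

    Returns the input unchanged if it does not look double-encoded.
    """
    try:
        text = data.decode("utf-8")
        # Assumption: every character came from one misread Latin-1 byte.
        raw = text.encode("latin-1")
        # The recovered bytes must themselves be valid UTF8.
        raw.decode("utf-8")
    except (UnicodeDecodeError, UnicodeEncodeError):
        return data
    return raw


saved = bytes.fromhex("C3 8E C2 B8 0D 0A")   # what PL/SQL Developer wrote
fixed = undo_double_encoding(saved)
print(fixed.hex(" ").upper())                # CE B8 0D 0A
```

Pure ASCII content passes through unchanged, and correctly encoded UTF8 normally fails the second decode and is left alone, but I would still keep a backup before running anything like this over real source files.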
Norwegian letter
When I do the same test with a Norwegian character, which is in my local character set, it is loaded and saved correctly by the Program Editor. It still adds CR+LF, but that is OK, and probably desirable.
The SQL editor makes the same kind of hash of it as for the theta character.
This came as a surprise, since the SQL editor seems to handle proper files correctly. I assume it has some other issue triggered by tiny files?
All in all really unhappy with PL/SQL Developer at the moment.
(Btw, the forum also truncated my first posting attempt, where I had pasted in a Norwegian character, even though it looked OK in the preview. Not happy about that either.)