Channel: Encoding – The Wiert Corner – irregular stream of stuff

Delphi – HIGHCHARUNICODE directive (Delphi) – RAD Studio

I had forgotten about it, but this thread reminded me of the difference between these two character values.

Quoting from the first post:

c1 := #128;
c2 := chr(128);
Assert(c1 = c2);

the assertion fails, meaning that c1 <> c2.

In fact c1 = #$20AC and c2 = #$80.

Chr is a pseudo-function that converts an integer to a Unicode character, so c2 ends up as Unicode codepoint U+0080. c1, on the other hand, is converted from the AnsiChar #$80 (the Euro sign in many Ansi codepages, Windows-1252 among them) into Unicode codepoint U+20AC.

Allen Bauer correctly mentioned that in order to define a character constant as a true Unicode codepoint, you have to use 4 hexadecimal digits:

c1 := #$0080;
c2 := chr(128);
Assert(c1 = c2);

This syntax with 4 hexadecimal digits is backwards compatible: with the above code, pre-Delphi-2009 compilers will get Ansi codepoint 128.
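To see the three spellings side by side, here is a minimal console sketch. It assumes Delphi 2009 or later with default compiler settings; the program and variable names are made up for illustration. It prints the codepoint each literal ends up as. The value printed for c1 depends on the Ansi codepage in effect at compile time, so no fixed output is shown; on a Windows-1252 setup the behaviour described above would give U+20AC for c1 and U+0080 for the other two.

```delphi
program CharLiteralDemo; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  c1, c2, c3: Char;
begin
  c1 := #128;     // short literal above 127: by default parsed as an AnsiChar,
                  // then converted to Unicode through the Ansi codepage
  c2 := Chr(128); // integer-to-Char conversion: Unicode codepoint U+0080
  c3 := #$0080;   // 4 hexadecimal digits: parsed as a Unicode codepoint

  // Ord returns the UTF-16 code unit; IntToHex pads it to 4 hex digits.
  Writeln('c1 = U+', IntToHex(Ord(c1), 4));
  Writeln('c2 = U+', IntToHex(Ord(c2), 4));
  Writeln('c3 = U+', IntToHex(Ord(c3), 4));
end.
```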

If you cannot rely on the encoding of your Delphi source files (for instance because your version control system mangles them), that is the only way to go. Hence my SO answer to "Wrong Unicode conversion, how to store accent characters in Delphi 2010 source code and handle character sets?":

Don’t rely on the encoding of your Delphi source code files.

It might be mangled when using any non-Unicode tool to work on your text files (or even buggy Unicode aware tools).

The best way is to specify your characters as a 4-digit Unicode code point.

const
   MyEuroSign = #$20AC;

A few more notes:

Here you can find a few of the Unicode codepoints (thanks Thomas Schild!):

Rudy Velthuis explains that you can automagically force the Delphi compiler to always treat character literals as Unicode codepoints using the $HIGHCHARUNICODE directive (I didn’t know that <g>). That is not always what you want, though, because it also changes the meaning of existing short literals such as #128 or #$80. So it is better to expand your character constants into 4 hexadecimal digits.

See: HIGHCHARUNICODE directive (Delphi) – RAD Studio.
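A small sketch of how the directive changes the same literal. It assumes the directive can be toggled in the middle of a unit, which is my reading of the documentation linked above; the program and variable names are hypothetical. With the directive OFF the two-digit literal should go through the Ansi codepage; with it ON it should be taken as a Unicode codepoint. The first line of output depends on the codepage, so check the result on your own setup.

```delphi
program HighCharUnicodeDemo; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  AnsiStyle, UnicodeStyle: Char;
begin
{$HIGHCHARUNICODE OFF}
  AnsiStyle := #$80;    // parsed as AnsiChar, converted via the Ansi codepage
{$HIGHCHARUNICODE ON}
  UnicodeStyle := #$80; // parsed as WideChar, expected to stay U+0080
{$HIGHCHARUNICODE OFF}  // restore the default for any code that follows

  Writeln('OFF: U+', IntToHex(Ord(AnsiStyle), 4));
  Writeln('ON:  U+', IntToHex(Ord(UnicodeStyle), 4));
end.
```

Writing #$0080 or #$20AC directly avoids depending on the directive's state at all, which is why the 4-digit form is the safer habit.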

Some more people that got bitten by this

–jeroen


Posted in Delphi, Development, Encoding, Software Development, Unicode
