Getting Portuguese to work with al_draw_text

Cassio Renan

Member #14,189

April 2012

Hi,

I'm Brazilian, and as such, I am inclined to make pt-BR versions of my games. But as you guys may (or may not) know, Portuguese words can have accented characters in them, such as "áàéíóúêôã" (I can't tell if you'll be able to see these characters).

Which brings me to my problem. When I try printing "Isto é uma frase" with al_draw_text, the function prints only "Isto ". It stops at the first character it doesn't recognize (even though the TTF font defines it) and prints only the characters before it.

My question is: Is there any way to get around this? Will support for other languages be added in future releases?

Thanks!

Trent Gamblin

Member #261

April 2000

If your text is in UTF8 format, and the font has the characters you need, it will work. I have a Brazilian Portuguese translation in my game and it works. I use DejaVu Sans but any font with those characters should work. Just make sure the text is encoded as UTF8.

Cassio Renan

Member #14,189

April 2012

I feel really stupid. The Allegro UTF-8 routines take up something like 30 pages of the reference manual. Thanks Trent, if you hadn't replied I wouldn't have seen them.
I'm trying to use the Allegro routines, but I can't get them to work. I tried googling for some time (and even found this nice article) with no success. Is there any particular way I should be handling the strings? This code:

al_ustr_new("Isto é uma linha");

al_ustr_new(random_c_string); // any random string with 'accented' chars

doesn't seem to work. I know I can keep the code points in a header file full of defines, like:

#define U_ae         0x00e6   /* æ */
#define U_i_acute    0x00ed   /* í */
#define U_eth        0x00f0   /* ð */
#define U_o_dia      0x00f6   /* ö */
#define U_thorn      0x00fe   /* þ */
#define U_z_bar      0x01b6   /* ƶ */
#define U_schwa      0x0259   /* ə */
#define U_beta       0x03b2   /* β */
#define U_1d08       0x1d08   /* ᴈ */
#define U_1ff7       0x1ff7   /* ῷ */
#define U_2051       0x2051   /* ⁑ */
#define U_euro       0x20ac   /* € */

And write a function to push them into an ALLEGRO_USTR using a regular C string and a switch, but I know there is a better way of doing this.

Any help?
Thanks!
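
For reference, turning a code point from a table like the one above into UTF-8 bytes is mechanical. The sketch below is a minimal, self-contained illustration; the name `utf8_encode` is hypothetical and not part of Allegro. In Allegro 5 itself, `al_ustr_append_chr` should accept a code point directly and do this encoding for you, so a helper like this is mainly useful for understanding what ends up in the string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an Allegro function): encode one Unicode code
 * point as UTF-8 into out, which must have room for at least 4 bytes.
 * Returns the number of bytes written, or 0 if cp is a surrogate or lies
 * above U+10FFFF. */
size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    assert(out != NULL);

    if (cp < 0x80) {                       /* 1 byte: plain ASCII */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 2 bytes: covers é, í, ã, ... */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                    /* 3 bytes: covers € (U+20AC) */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                      /* surrogates are not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* 4 bytes: everything else */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                              /* out of Unicode range */
}
```

For example, passing `0x00ed` (the `U_i_acute` define above) fills the buffer with the two bytes C3 AD.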

Trent Gamblin

Member #261

April 2000

When you save your source file with whatever text editor you're using, save it with UTF8 encoding. If your text editor can't do that, get one that can. On Windows, Notepad++ (free) can convert between encodings. On Mac, TextWrangler (free) can, and on Linux probably something like gedit or kate can, but I'm not sure there.

Cassio Renan

Member #14,189

April 2012

That's the problem. I use Visual Studio 2010 Express on Windows, and all my source code is already encoded in UTF-8 (I just checked, and the character 'é' occupies 2 bytes, as it is supposed to). I think it's a problem with C strings (they only allow 1 byte per char, which means my compiler is probably converting the string into ASCII before building, I guess). Is there any configuration I'm missing? I just went on a crazy "check Unicode" rampage through every configuration field I could find in the project options.

And thanks for the help, I really appreciate it.
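
One way to tell whether the compiler passed the UTF-8 bytes through or converted the literal to a single-byte code page is to check the string at run time. The following is a rough sketch under that assumption; `looks_like_utf8` is a hypothetical helper, not an Allegro or MSVC function, and it only checks lead and continuation bytes rather than every overlong-encoding corner case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical diagnostic: returns 1 if the NUL-terminated string is
 * structurally valid UTF-8, 0 otherwise. Simplified on purpose: it checks
 * lead bytes and continuation bytes, not every overlong/surrogate case. */
int looks_like_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    assert(s != NULL);

    while (*p) {
        size_t extra;                       /* continuation bytes expected */

        if (*p < 0x80)
            extra = 0;                      /* ASCII */
        else if (*p >= 0xC2 && *p <= 0xDF)
            extra = 1;                      /* 2-byte sequence, e.g. é */
        else if (*p >= 0xE0 && *p <= 0xEF)
            extra = 2;                      /* 3-byte sequence */
        else if (*p >= 0xF0 && *p <= 0xF4)
            extra = 3;                      /* 4-byte sequence */
        else
            return 0;                       /* stray continuation or bad lead */
        p++;

        for (; extra > 0; extra--) {
            if ((*p & 0xC0) != 0x80)
                return 0;                   /* also catches an early NUL */
            p++;
        }
    }
    return 1;
}
```

If 'é' was converted to a single byte (0xE9 in Windows-1252), the check fails because the following space is not a continuation byte. If the literal still holds C3 A9, it passes. Note that pure ASCII text passes too, so the test is only informative on strings that contain accented characters.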

Peter Wang

Member #23

April 2000

MSVC is indeed troublesome in this regard. I believe you need to remove the UTF-8 BOM (byte order mark) on your source files (and make sure they never come back!) so that MSVC treats your files as an 8-bit encoding. I couldn't find an option to convince the compiler to leave the strings alone otherwise.

Apparently you can install this hotfix http://support.microsoft.com/kb/980263 then use

#pragma execution_character_set("utf-8")

A more robust but tedious way is to externalise your accented strings and use only ASCII string literals in your source. C11 introduces the u8 prefix on string literals but that doesn't help for now.
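
As a small illustration of the "only ASCII string literals" idea, the accented bytes can be written as hex escapes so that the source file's encoding no longer matters. This is a sketch, not a recommendation over externalising the strings; `FRASE_PT` and `utf8_length` are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* "Isto é uma frase" spelled with ASCII characters only: é (U+00E9) is the
 * two UTF-8 bytes C3 A9. The literal is split after the escape as a habit:
 * a hex escape swallows any hex digits that follow it, so "\xA9" directly
 * followed by a letter such as 'a' or 'f' would be misread. A space is not
 * a hex digit, so the split is not strictly required here. */
static const char FRASE_PT[] = "Isto \xC3\xA9" " uma frase";

/* Hypothetical helper: counts code points in a UTF-8 string by skipping
 * continuation bytes (those of the form 10xxxxxx). Assumes valid UTF-8. */
size_t utf8_length(const char *s)
{
    size_t n = 0;
    assert(s != NULL);

    for (; *s; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

A string built this way is an ordinary `const char *`, so it should be usable anywhere UTF-8 text is expected, for instance as the text argument to `al_draw_text` or `al_ustr_new`. The obvious cost is readability, which is why moving the strings into an external UTF-8 file scales better for a full translation.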

Cassio Renan

Member #14,189

April 2012

This must be the most hilarious solution I've ever seen to a problem.
I changed my source files' encoding to ANSI (ASCII) using Notepad++ (without converting from UTF-8, which made 'é' turn into 'Ã©') and then loaded them again in MSVC.
MSVC probably recognized them as ASCII and did not try to convert them, like you said. It worked.

I'm really amazed.

PS: Unfortunately, the hotfix only works for MSVC 2008, and I use 2010.

Thanks!