|
UTF-8 text and terminal on Windows |
Polybios
Member #12,293
October 2010
|
I'd like to output some UTF-8 text to a terminal on Windows. Of course, it doesn't work with cmd.exe. I've read about setting a 'magic' codepage 65001, but this doesn't work for me either. So I'm looking for a replacement terminal app for Windows which supports that. I've tried MSYS already to no avail. |
torhu
Member #2,727
September 2002
|
Codepage 65001 works for me, but maybe you're doing something I'm not? By the way, you have to make sure it's set to a font that supports the characters you want to see. Mine is set to Consolas. |
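For reference, the code page can also be switched from inside the program instead of typing chcp 65001 first. This is only a sketch: use_utf8_console is a made-up helper name, and the non-Windows branch simply assumes the terminal is already UTF-8. SetConsoleOutputCP and CP_UTF8 come from windows.h.

```c
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: switch the console's output code page to UTF-8
   (65001). Returns 1 on success and 0 on failure. On non-Windows systems
   it does nothing and reports success, assuming a UTF-8 terminal. */
static int use_utf8_console(void)
{
#ifdef _WIN32
    return SetConsoleOutputCP(CP_UTF8) ? 1 : 0;
#else
    return 1;
#endif
}
```

Call it once at the start of main, before any output. The console font still has to contain the glyphs, as noted above.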
Polybios
Member #12,293
October 2010
|
I'm getting two boxes with question marks for one UTF-8 character with either Consolas or Lucida Console. If it were just missing glyphs, there should only be one of those tiny boxes per character, I guess. So it's probably not a font problem. I've checked the fonts, the glyphs are there. |
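Two boxes per character would fit the console treating each byte of a two-byte UTF-8 sequence as a character of its own. A small sketch to compare byte count and character count (utf8_codepoints is an illustrative name, not a library function, and it assumes the input is valid UTF-8):

```c
#include <stddef.h>
#include <string.h>

/* Count code points in a UTF-8 string by skipping continuation bytes,
   which always have the bit pattern 10xxxxxx. Assumes valid UTF-8. */
static size_t utf8_codepoints(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}
```

For "\xC3\xA9" (the letter é), strlen reports 2 bytes while utf8_codepoints reports 1 character, which matches the two-boxes-for-one-character symptom.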
torhu
Member #2,727
September 2002
|
Hm. Well, UTF-8 support in Windows still sucks. |
Elias
Member #358
May 2000
|
Output as UTF-16 instead of UTF-8 maybe, at least worth a try. al_ustr_encode_utf16 might be helpful. |
Polybios
Member #12,293
October 2010
|
I've further tested this crap with cp 65001. |
torhu
Member #2,727
September 2002
|
Could be because printf outputs one byte at a time, while the others don't, since they have no need to inspect the contents of the string. Just guessing. |
Polybios
Member #12,293
October 2010
|
It's ... very interesting behavior. When I put multibyte characters into a %s argument-string, it doesn't work either. But I've finally managed to find something on the matter |
furinkan
Member #10,271
October 2008
|
There's some free OS out there that supports UTF-8 on the terminal. Was it... Line Ucks? |
Polybios
Member #12,293
October 2010
|
I know. But I need to port it to Windows. For wprintf to work at all, you have to call _setmode(_fileno(stdout), _O_U16TEXT) beforehand, plus everything needs to be converted to wstrings, which I don't want to do. I guess I'll just re#define printf to some custom function. snprintf-ing and then fputs-ing UTF-8 works with codepage 65001. |
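A rough sketch of what such a custom function could look like: format into a buffer first, then pass the finished string to fputs in one call. The name utf8_printf and the 512-byte stack buffer are made up for illustration, and the code assumes a C99-conforming vsnprintf that returns the required length on truncation (older Microsoft runtimes may behave differently).

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical printf replacement: build the whole string with vsnprintf,
   then write it with a single fputs. Returns the number of bytes written,
   or a negative value on error. */
static int utf8_printf(const char *fmt, ...)
{
    char stack_buf[512];
    char *buf = stack_buf;
    va_list ap;
    int len;

    va_start(ap, fmt);
    len = vsnprintf(stack_buf, sizeof stack_buf, fmt, ap);
    va_end(ap);
    if (len < 0)
        return len;

    /* Output did not fit: allocate the exact size and format again. */
    if ((size_t)len >= sizeof stack_buf) {
        buf = malloc((size_t)len + 1);
        if (buf == NULL)
            return -1;
        va_start(ap, fmt);
        vsnprintf(buf, (size_t)len + 1, fmt, ap);
        va_end(ap);
    }

    if (fputs(buf, stdout) == EOF)
        len = -1;
    if (buf != stack_buf)
        free(buf);
    return len;
}

/* The re#define from the post would then be, after the function above:
   #define printf utf8_printf */
```

Existing printf calls keep compiling unchanged once the #define is active, so nothing has to be converted to wide strings.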
furinkan
Member #10,271
October 2008
|
Eww... I'm really sorry. You could use Allegro's routines to write the UTF-8 to a file. I believe you could use fputs() and al_fwrite(). Your editor obviously supports UTF-8... Unless you need this log to be real time. |
Polybios
Member #12,293
October 2010
|
Ok, it's solved:
What a crap thing to do. I was surprised that cmd.exe did pass all files found by * in a certain directory to my program via argc/argv, though. Last time I checked (long time ago), you had to do the scanning yourself. |
torhu
Member #2,727
September 2002
|
Polybios said:
"I was surprised that cmd.exe did pass all files found by * in a certain directory to my program via argc/argv, though. Last time I checked (long time ago), you had to do the scanning yourself."

Are you sure? I just tested with VS 9, and that definitely didn't happen... |
Polybios
Member #12,293
October 2010
|
Yes, it works. I'm using g++ / MinGW, though; maybe it's a special feature of their runtime? |
torhu
Member #2,727
September 2002
|
Yes, the MinGW runtime is doing it, because Unix shells usually do it. In other words, cmd.exe had nothing to do with it. |
Edgar Reynaldo
Major Reynaldo
May 2007
|
Why are you guys talking about compilers? What do they have to do with whether cmd.exe globs * into a file list? It's easy to see it does, on Vista at least, with this tiny program:

    #include <cstdio>

    int main(int argc, char** argv) {
        for (int i = 0; i < argc; ++i) {
            printf("Arg %d = '%s'\n", i, argv[i]);
        }
        return 0;
    }

Try passing * or *.* or something similar to the program and you will see cmd.exe turns the *s into batches of command line parameters. |
Arthur Kalliokoski
Second in Command
February 2005
|
For compilers that do it the Microsoft way, you have to link in glob.obj or something, it's been that way lo these many years. DJGPP had a VMS-like way of globbing through all the subdirectories with a "../*" approach. The cmd.exe program only loads up the globbing program and passes on the arguments verbatim. They all watch too much MSNBC... they get ideas. |
torhu
Member #2,727
September 2002
|
Which version of VS does that? |
Arthur Kalliokoski
Second in Command
February 2005
|
This MSDN article says it's Setargv.obj. Maybe I was thinking of the old Borland compilers with glob.obj or something. |
torhu
Member #2,727
September 2002
|
Wow. I guess Microsoft must have had powerful enemies at that time. Maybe God, Satan, and Hitler teamed up with Mighty Mouse or something. It's not every day that M$ do something that doesn't not make sense |
Arthur Kalliokoski
Second in Command
February 2005
|
I did a lot of assembler programs on DOS back in the day, and the Program Segment Prefix only had room for 127 bytes to store parameters. For DOS compilers that needed a long command line, a '@' prefix was used to specify a file that had all the needed info. Windows has improved on that somewhat in the meantime, be grateful. |
torhu
Member #2,727
September 2002
|
That's not the same thing, though |
Arthur Kalliokoski
Second in Command
February 2005
|
They all watch too much MSNBC... they get ideas. |
|