<?xml version="1.0"?>
<rss version="2.0">
	<channel>
		<title>UTF-8 text and terminal on Windows</title>
		<link>http://www.allegro.cc/forums/view/614672</link>
		<description>Allegro.cc Forum Thread</description>
		<webMaster>matthew@allegro.cc (Matthew Leverton)</webMaster>
		<lastBuildDate>Wed, 01 Oct 2014 08:36:35 +0000</lastBuildDate>
	</channel>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>I&#39;d like to output some UTF-8 text to a terminal on Windows. Of course, it doesn&#39;t work with cmd.exe. I&#39;ve read about setting a &#39;magic&#39; codepage 65001, but this doesn&#39;t work for me either.  </p><p>So I&#39;m looking for a replacement terminal app for Windows which supports that. I&#39;ve tried MSYS already to no avail.<br />Do you have any suggestions? <img src="http://www.allegro.cc/forums/smileys/smiley.gif" alt=":)" />
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Polybios)</author>
		<pubDate>Thu, 25 Sep 2014 22:27:50 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>Codepage 65001 works for me, but maybe you&#39;re doing something I&#39;m not?</p><p>By the way, you have to make sure it&#39;s set to a font that supports the characters you want to see. Mine is set to Consolas.
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (torhu)</author>
		<pubDate>Thu, 25 Sep 2014 22:57:44 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>I&#39;m getting <i>two</i> boxes with question marks for one UTF-8 character with either Consolas or Lucida Console. If it were just the glyphs missing, there should only be <i>one</i> of those tiny boxes per character, I guess. So it&#39;s probably not a font problem. I&#39;ve checked the fonts; the glyphs are there. <img src="http://www.allegro.cc/forums/smileys/undecided.gif" alt=":-/" />
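</p><p>Two boxes per character would fit the two bytes of one UTF-8 sequence being displayed as two separate characters. A minimal sketch of that byte/character mismatch, assuming the input is valid UTF-8; <tt>utf8_code_points</tt> is a made-up helper name, not something from this thread:</p>

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts code points in a string that is assumed to hold
// valid UTF-8, by skipping continuation bytes (bit pattern 10xxxxxx).
std::size_t utf8_code_points(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char byte : s) {
        if ((byte & 0xC0) != 0x80) ++count;  // lead byte or plain ASCII
    }
    return count;
}
```

<p>For a string holding only the bytes 0xC3 0xA4, <tt>size()</tt> reports two bytes while the helper reports one code point, which is the gap a byte-oriented console would show as two boxes.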
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Polybios)</author>
		<pubDate>Thu, 25 Sep 2014 23:06:22 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>Hm. Well, UTF-8 support in Windows still sucks.
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (torhu)</author>
		<pubDate>Thu, 25 Sep 2014 23:15:09 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>Maybe output as UTF-16 instead of UTF-8; it&#39;s at least worth a try. <span class="source-code"><a href="http://www.allegro.cc/manual/al_ustr_encode_utf16"><span class="a">al_ustr_encode_utf16</span></a></span> might be helpful.
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Elias)</author>
		<pubDate>Fri, 26 Sep 2014 01:29:19 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>I&#39;ve further tested this crap with cp 65001. <br />Looks like cout and puts do work with UTF-8, just the printf family of functions doesn&#39;t... Now why&#39;s that? <img src="http://www.allegro.cc/forums/smileys/rolleyes.gif" alt="::)" />
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Polybios)</author>
		<pubDate>Mon, 29 Sep 2014 21:23:25 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>Could be because printf outputs one byte at a time, while the others don&#39;t, since they have no need to inspect the contents of the string. Just guessing.
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (torhu)</author>
		<pubDate>Mon, 29 Sep 2014 21:32:00 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>It&#39;s ... very interesting behavior. When I pass multibyte characters through a %s string argument, it doesn&#39;t work either.<br />Reading input doesn&#39;t seem to work at all. As soon as there is a multibyte character, the usual functions just fail and return empty or garbage strings.</p><p><i>But</i> I&#39;ve finally managed to find something on the matter <a href="http://alfps.wordpress.com/2011/12/08/unicode-part-2-utf-8-stream-mode/">here</a>. Input can be fixed by installing a custom stream buffer on the input streams, among some other things that need to be done.
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Polybios)</author>
		<pubDate>Mon, 29 Sep 2014 22:14:26 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>There&#39;s some free OS out there that supports UTF-8 on the terminal. Was it... Line Ucks? <img src="http://www.allegro.cc/forums/smileys/grin.gif" alt=";D" />
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (furinkan)</author>
		<pubDate>Tue, 30 Sep 2014 09:43:48 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>I know. But I need to port it to Windows. <br />Now I was finally able to read wstrings without problems via the ReadConsoleW WinAPI function, yay!</p><p>For wprintf to work at all, you have to call _setmode(_fileno(stdout), _O_U16TEXT) beforehand, and everything needs to be converted to wstrings, which I don&#39;t want to do.</p><p>I guess I&#39;ll just re-#define printf to some custom function. snprintf-ing and then fputs-ing UTF-8 works with codepage 65001 <img src="http://www.allegro.cc/forums/smileys/rolleyes.gif" alt="::)" />
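</p><p>A minimal sketch of the format-then-fputs idea, assuming a C99-conforming <tt>vsnprintf</tt> that reports the required length when given a null buffer. The names <tt>format_utf8</tt>, <tt>utf8_write</tt> and <tt>utf8_printf</tt> are hypothetical, not taken from the actual program:</p>

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: formats like printf, but into a std::string, so the
// result can be written to the console in one piece.
std::string format_utf8(const char* fmt, ...) {
    assert(fmt != nullptr);
    va_list args;
    va_start(args, fmt);
    va_list args_copy;
    va_copy(args_copy, args);
    // First pass: ask vsnprintf how many bytes the formatted text needs.
    const int needed = std::vsnprintf(nullptr, 0, fmt, args);
    va_end(args);
    if (needed < 0) {
        va_end(args_copy);
        return std::string();
    }
    // Second pass: format into a buffer with room for the terminating NUL.
    std::vector<char> buffer(static_cast<std::size_t>(needed) + 1);
    std::vsnprintf(buffer.data(), buffer.size(), fmt, args_copy);
    va_end(args_copy);
    return std::string(buffer.data(), static_cast<std::size_t>(needed));
}

// Hypothetical output step: hand the whole UTF-8 string to fputs at once.
// Returns the number of bytes written, or -1 if fputs reports an error.
int utf8_write(const std::string& text) {
    if (std::fputs(text.c_str(), stdout) < 0) return -1;
    return static_cast<int>(text.size());
}

// Hypothetical printf stand-in built from the two helpers above.
#define utf8_printf(...) utf8_write(format_utf8(__VA_ARGS__))
```

<p>A call such as <tt>utf8_printf(&quot;%s\n&quot;, text)</tt> then takes the place of <tt>printf</tt>. The sketch uses its own macro name rather than redefining <tt>printf</tt> itself, which is a design choice of the example only.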
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Polybios)</author>
		<pubDate>Tue, 30 Sep 2014 14:25:12 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>Eww... I&#39;m really sorry. <img src="http://www.allegro.cc/forums/smileys/undecided.gif" alt=":-/" /></p><p>You could use Allegro&#39;s routines to write the UTF-8 to a file. I believe you could use fputs() and al_fwrite(). Your editor obviously supports UTF-8...</p><p>Unless you need this log to be real time. <img src="http://www.allegro.cc/forums/smileys/huh.gif" alt="???" />
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (furinkan)</author>
		<pubDate>Tue, 30 Sep 2014 19:23:29 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>Ok, it&#39;s solved:
</p><ul><li><p>Output via <tt>chcp 65001</tt>, re-#defining printf to a function that sprintfs into a buffer which is then puts-ed out, since normal printf just won&#39;t work with multibyte chars</p></li><li><p>Input via <tt>ReadConsoleW</tt> and subsequent conversion to UTF-8 with <tt>WideCharToMultiByte</tt></p></li></ul><p>
What a crap thing to do. </p><p>I was surprised that cmd.exe did pass all files found by <tt>*</tt> in a certain directory to my program via argc/argv, though. Last time I checked (long time ago), you had to do the scanning yourself. <img src="http://www.allegro.cc/forums/smileys/shocked.gif" alt=":o" />
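</p><p>The input half of the fix above turns the UTF-16 text from <tt>ReadConsoleW</tt> into UTF-8. As a rough, portable sketch of what that conversion involves (this is not the WinAPI call itself; <tt>utf16_to_utf8</tt> is a hypothetical helper, and replacing unpaired surrogates with U+FFFD is this sketch&#39;s own choice):</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: converts UTF-16 code units to UTF-8 bytes.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Surrogate pair: combine both units into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // unpaired surrogate (assumption: substitute U+FFFD)
        }
        assert(cp <= 0x10FFFF);
        if (cp < 0x80) {             // 1 byte: 0xxxxxxx
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {     // 2 bytes: 110xxxxx 10xxxxxx
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {   // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                     // 4 bytes: 11110xxx followed by three 10xxxxxx
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

<p>On Windows itself, <tt>WideCharToMultiByte</tt> with <tt>CP_UTF8</tt> remains the natural choice; the sketch only shows the byte layout it has to produce.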
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Polybios)</author>
		<pubDate>Wed, 01 Oct 2014 01:07:08 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><div class="quote_container"><div class="title"><a href="http://www.allegro.cc/forums/thread/614672/1005742#target">Polybios</a> said:</div><div class="quote"><p> I was surprised that cmd.exe did pass all files found by * in a certain directory to my program via argc/argv, though. Last time I checked (long time ago), you had to do the scanning yourself. <img src="http://www.allegro.cc/forums/smileys/shocked.gif" alt=":o" /></p></div></div><p>Are you sure? I just tested with VS 9, and that definitely didn&#39;t happen... <img src="http://www.allegro.cc/forums/smileys/undecided.gif" alt=":-/" />
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (torhu)</author>
		<pubDate>Wed, 01 Oct 2014 01:28:37 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>Yes, it works. I&#39;m using g++ / MinGW, though, maybe it&#39;s a special feature of their runtime?
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Polybios)</author>
		<pubDate>Wed, 01 Oct 2014 03:24:44 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>Yes, the MinGW runtime that GCC links in is doing it, because Unix shells usually do it. In other words, cmd.exe had nothing to do with it.
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (torhu)</author>
		<pubDate>Wed, 01 Oct 2014 07:36:31 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>Why are you guys talking about compilers? What do they have to do with whether cmd.exe globs * into a file list? It&#39;s easy to see that it does, on Vista at least, with this tiny program:
</p><div class="source-code snippet"><div class="inner"><pre><span class="p">#include &lt;cstdio&gt;</span>

<span class="k1">int</span> main<span class="k2">(</span><span class="k1">int</span> argc , <span class="k1">char</span><span class="k3">*</span><span class="k3">*</span> argv<span class="k2">)</span> <span class="k2">{</span>
  
  <span class="k1">for</span> <span class="k2">(</span><span class="k1">int</span> i <span class="k3">=</span> <span class="n">0</span> <span class="k2">;</span> i <span class="k3">&lt;</span> argc <span class="k2">;</span> <span class="k3">+</span><span class="k3">+</span>i<span class="k2">)</span> <span class="k2">{</span>
    <a href="http://www.delorie.com/djgpp/doc/libc/libc_624.html" target="_blank">printf</a><span class="k2">(</span><span class="s">"Arg %d = '%s'\n"</span> , i , argv<span class="k2">[</span>i<span class="k2">]</span><span class="k2">)</span><span class="k2">;</span>
  <span class="k2">}</span>

  <span class="k1">return</span> <span class="n">0</span><span class="k2">;</span>
<span class="k2">}</span>
</pre></div></div><p>
Try passing * or *.* or something similar to the program and you will see cmd.exe turns the *s into batches of command line parameters.
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Edgar Reynaldo)</author>
		<pubDate>Wed, 01 Oct 2014 07:49:13 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>For compilers that do it the Microsoft way, you have to link in glob.obj or something, it&#39;s been that way lo these many years.  DJGPP had a VMS-like way of globbing through all the subdirectories with a &quot;../*&quot; approach.  The cmd.exe program only loads up the globbing program and passes on the arguments verbatim.
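</p><p>Roughly what such startup code has to do before <tt>main</tt> runs can be sketched as below. This is an illustration only, not the actual code of any runtime: <tt>wildcard_match</tt> and <tt>expand_argument</tt> are hypothetical names, the file list is passed in rather than read from a directory, matching is case-sensitive, and keeping an unmatched pattern verbatim is an assumption.</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: true if `name` matches `pattern`, where '*' matches any
// run of characters (including none) and '?' matches exactly one character.
bool wildcard_match(const char* pattern, const char* name) {
    assert(pattern != nullptr && name != nullptr);
    if (*pattern == '\0') return *name == '\0';
    if (*pattern == '*') {
        // Either '*' matches nothing, or it swallows one more character.
        return wildcard_match(pattern + 1, name) ||
               (*name != '\0' && wildcard_match(pattern, name + 1));
    }
    if (*name == '\0') return false;
    return (*pattern == '?' || *pattern == *name) &&
           wildcard_match(pattern + 1, name + 1);
}

// Hypothetical expansion step: replace one command-line argument by every
// matching file name, or keep it verbatim if nothing matches.
std::vector<std::string> expand_argument(const std::string& arg,
                                         const std::vector<std::string>& files) {
    std::vector<std::string> result;
    for (const std::string& file : files) {
        if (wildcard_match(arg.c_str(), file.c_str())) result.push_back(file);
    }
    if (result.empty()) result.push_back(arg);
    return result;
}
```

<p>With the files a.txt, b.cpp and c.txt, the argument <tt>*.txt</tt> would become the two arguments a.txt and c.txt, which is the effect the program sees in argv.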
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Arthur Kalliokoski)</author>
		<pubDate>Wed, 01 Oct 2014 07:52:23 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>Which version of VS does that?
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (torhu)</author>
		<pubDate>Wed, 01 Oct 2014 08:01:20 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p> <a href="http://msdn.microsoft.com/en-us/library/8bch7bkk.aspx">This MSDN article</a> says it&#39;s Setargv.obj.  Maybe I was thinking of the old Borland compilers with glob.obj or something.
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Arthur Kalliokoski)</author>
		<pubDate>Wed, 01 Oct 2014 08:10:37 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>Wow. I guess Microsoft must have had powerful enemies at that time. Maybe God, Satan, and Hitler teamed up with Mighty Mouse or something. It&#39;s not every day that M$ do something that doesn&#39;t not make sense <img src="http://www.allegro.cc/forums/smileys/tongue.gif" alt=":P" />
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (torhu)</author>
		<pubDate>Wed, 01 Oct 2014 08:27:11 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>I did a <i>lot</i> of assembler programs on DOS back in the day, and the Program Segment Prefix only had room for 127 bytes to store parameters.  For DOS compilers that needed a long command line, a &#39;@&#39; prefix was used to specify a file that had all the needed info.</p><p>Windows has improved on that somewhat in the meantime, be grateful.
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Arthur Kalliokoski)</author>
		<pubDate>Wed, 01 Oct 2014 08:31:06 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>That&#39;s not the same thing, though <img src="http://www.allegro.cc/forums/smileys/tongue.gif" alt=":P" />
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (torhu)</author>
		<pubDate>Wed, 01 Oct 2014 08:33:21 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p><a href="http://msdn.microsoft.com/en-us/library/windows/desktop/ms683156%28v=vs.85%29.aspx">http://msdn.microsoft.com/en-us/library/windows/desktop/ms683156%28v=vs.85%29.aspx</a>
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Arthur Kalliokoski)</author>
		<pubDate>Wed, 01 Oct 2014 08:36:35 +0000</pubDate>
	</item>
</rss>
