<?xml version="1.0"?>
<rss version="2.0">
	<channel>
		<title>double precision for al_transform_coordinates?</title>
		<link>http://www.allegro.cc/forums/view/616178</link>
		<description>Allegro.cc Forum Thread</description>
		<webMaster>matthew@allegro.cc (Matthew Leverton)</webMaster>
		<lastBuildDate>Tue, 12 Apr 2016 08:04:52 +0000</lastBuildDate>
	</channel>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>I&#39;m guessing this is moot because of implementation details, but I was just wondering if there could be a use for a double precision version of al_transform_coordinates? I&#39;m guessing allegro uses floats in its transformation matrices so there wouldn&#39;t be much point if that is true.</p><p>It might matter if there was a version that took double pointers instead of float pointers, because right now I have to declare two floats and perform assignment to get the data back into my double types. It&#39;s a data intensive operation in this case, so it might matter at least a little bit.
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Edgar Reynaldo)</author>
		<pubDate>Tue, 05 Apr 2016 07:32:44 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>What are you working on that needs <span class="source-code"><span class="k1">double</span></span> instead of <span class="source-code"><span class="k1">float</span></span>?
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Mark Oates)</author>
		<pubDate>Tue, 05 Apr 2016 07:40:00 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>I&#39;m working on my Spiraloid program again, and I need super high precision angles for the spiral&#39;s theta value and theta offset, as well as rotation.</p><p>Something I&#39;m doing now is using integer decimals and exponenents to keep the values the same and prevent precision loss when adding values, then I convert to doubles when I go to actually use the value. But I need high precision for the transformations that I&#39;m applying. I suppose I have a matrix class lying around here somewhere that I could use...., but I really like Allegro&#39;s TRANSFORMs.</p><p>Edit<br />I have a list of spiral coordinates that need to be updated anytime the scale, offset, or rotation changes. The rotation changes fairly often, as the spiraloid may be spinning. There may be as many as (sqrt(1920^2 + 1200^2)/radial_delta)*(360/theta_delta) xy data points (for my laptop&#39;s specific resolution, but could be higher than that even) that need to be updated as often as once per monitor refresh. So it could be a lot of transformations, and I need to save the cpu as much as I can so it doesn&#39;t slow down the animation.</p><p>Ex, with a radial_delta of 1 and a theta_delta of 1 that is 815,000 data points running at 60 Hz gives about 2*2*50 million float to double assignments and 2*50 million transformation calculations per second, which is enough to stress the cpu.</p><p>Edit 2</p><p>Here&#39;s some 11x17 prints on the wall I made of some of my Spiraloid images today using the Color copier print service at Staples. Only about $15 bucks for 10 images, and the lady was nice enough to give me 10 free sheets of glossy photo paper to use. 
<img src="http://www.allegro.cc/forums/smileys/wink.gif" alt=";)" /></p><p><span class="remote-thumbnail"><span class="json">{"name":"610268","src":"\/\/djungxnpq2nug.cloudfront.net\/image\/cache\/9\/6\/960872a39afeb072ee8fc19c09dde637.png","w":800,"h":450,"tn":"\/\/djungxnpq2nug.cloudfront.net\/image\/cache\/9\/6\/960872a39afeb072ee8fc19c09dde637"}</span><img src="http://www.allegro.cc//djungxnpq2nug.cloudfront.net/image/cache/9/6/960872a39afeb072ee8fc19c09dde637-240.jpg" alt="610268" width="240" height="135" /></span>
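</p><p>The point count estimate from the first edit can be checked with a few lines. This is only a sketch of the arithmetic as stated in the post; the function names are invented for illustration:</p>

```cpp
#include <cmath>

// Estimated number of spiral points: screen diagonal divided by the radial
// step, times the number of angular steps per full turn (in degrees).
inline double spiral_point_count(double width, double height,
                                 double radial_delta, double theta_delta) {
    const double diagonal = std::sqrt(width * width + height * height);
    return (diagonal / radial_delta) * (360.0 / theta_delta);
}

// Points that must be transformed per second at a given refresh rate.
inline double transforms_per_second(double point_count, double refresh_hz) {
    return point_count * refresh_hz;
}
```

<p>For 1920x1200 with both deltas set to 1 this gives roughly 815,000 points, or just under 49 million point transforms per second at 60 Hz, which is in line with the 50 million figure used above.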
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Edgar Reynaldo)</author>
		<pubDate>Tue, 05 Apr 2016 15:36:31 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>I actually use my own version which uses double - float just doesn&#39;t really work at all above values of about 20,000, or you lose 1-pixel accuracy. And that&#39;s when not manipulating coordinates - longer chains of transformations basically don&#39;t work with float, period.</p><p>Even with double it&#39;s easy to hit accuracy problems when you&#39;re not careful about the order of operations.</p><p>So basically, I&#39;d be for converting all floats in Allegro to double <img src="http://www.allegro.cc/forums/smileys/smiley.gif" alt=":)" />
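</p><p>To put rough numbers on the precision gap, here is a small sketch (the helper names are made up for illustration) that reports the distance between neighbouring representable values. Near 20,000 a float can only step in increments of 2^-9, about 0.002, while a double steps in increments of about 3.6e-12. This only shows the representable spacing; it does not measure how quickly rounding error builds up across a chain of transformations:</p>

```cpp
#include <cmath>
#include <limits>

// Gap between v and the next representable float above it.
inline float float_spacing(float v) {
    return std::nextafter(v, std::numeric_limits<float>::infinity()) - v;
}

// Same gap, measured for double.
inline double double_spacing(double v) {
    return std::nextafter(v, std::numeric_limits<double>::infinity()) - v;
}
```

<p>Calling float_spacing(20000.0f) and double_spacing(20000.0) side by side gives a feel for how much headroom each type leaves before repeated rounding becomes visible at pixel scale.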
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Elias)</author>
		<pubDate>Tue, 05 Apr 2016 16:53:04 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>Would that impact the FPU performance at all? Are floats significantly faster than doubles?
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Edgar Reynaldo)</author>
		<pubDate>Tue, 05 Apr 2016 17:04:40 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><div class="quote_container"><div class="title"><a href="http://www.allegro.cc/forums/thread/616178/1021548#target">Edgar Reynaldo</a> said:</div><div class="quote"><p>
Are floats significantly faster than doubles?
</p></div></div><p>

Everyone on the web keeps giving out B.S. answers. I think we need to do an actual benchmark to get an answer to that.</p><p>The best I could find is this:</p><p><a href="http://brandon.northcutt.net/article/double+VS+float+Speed+Comparison/20150625.html">http://brandon.northcutt.net/article/double+VS+float+Speed+Comparison/20150625.html</a></p><p>In a synthetic test, double was ever-so-slightly slower. In a &quot;real world&quot; test, it was twice as slow.</p><p>Of course, &quot;twice as slow&quot; is meaningless to a 4.0 GHz server with 802,351 cores.</p><p>[edit] Someone linked this talk on a Reddit post:</p><p>I&#39;m gonna watch it when I get back home. Supposedly it covers float/double performance.</p><p><a href="https://channel9.msdn.com/Events/Build/2014/4-587">https://channel9.msdn.com/Events/Build/2014/4-587</a>
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Chris Katko)</author>
		<pubDate>Tue, 05 Apr 2016 18:24:56 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><div class="quote_container"><div class="title"><a href="http://www.allegro.cc/forums/thread/616178/1021553#target">Chris Katko</a> said:</div><div class="quote"><p>
 Everyone on the web keeps giving out B.S. answers. I think we need to do an actual benchmark to get an answer to that.</p><p>The best I could find is this:</p><p><a href="http://brandon.northcutt.net/article/double+VS+float+Speed+Comparison/20150625.html">http://brandon.northcutt.net/article/double+VS+float+Speed+Comparison/20150625.html</a>
</p></div></div><p>

The first thing I saw was -pg and gprof. That did not inspire confidence in me. gprof is hopelessly broken and no longer in development AFAIK (at least for MinGW).</p><div class="quote_container"><div class="title">Brandon Northcutt said:</div><div class="quote"><p>
I used a second method to evaluate a more &quot;in the wild&quot; performance and it yielded interesting results. For this method I compiled the program without the CPU profiling switch &quot;-pg&quot; and then made two binaries, one which ran only the float benchmark and one that ran only the double benchmark.<br />BASH COMMANDS</p><p>$ time ./float_bench<br />real	0m13.677s<br />user	0m13.665s<br />sys	0m0.012s</p><p>$ time ./double_bench<br />real	0m30.670s<br />user	0m21.427s<br />sys	0m9.243s
</p></div></div><p>
These results carry far more weight with me. But do they mean I should sacrifice the precision of doubles for the speed of floats? I don&#39;t know.</p><p>Something to note is that no optimization flags were passed to the compiler. It might be worth retesting the second method with optimizations enabled. I&#39;m not on Linux so I can&#39;t use &#39;time&#39; to measure it though, and I don&#39;t know how to use high performance counters on Windows yet.</p><p>Edit
</p><div class="quote_container"><div class="title">Chris Katko said:</div><div class="quote"><p>
[edit] Someone linked this talk on a Reddit post:</p><p>I&#39;m gonna watch it when I get back home. Supposedly it covers float/double performance.</p><p><a href="https://channel9.msdn.com/Events/Build/2014/4-587">https://channel9.msdn.com/Events/Build/2014/4-587</a> 
</p></div></div><p>
I watched the slideshow, and it gave some juicy tidbits about new instruction sets like AVX and AVX2 and about how &#39;optimizations&#39; on one architecture can be &#39;stalls&#39; on another.
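</p><p>On the timing question, std::chrono from C++11 is one portable option that needs neither the Linux &#39;time&#39; command nor direct use of the Windows performance counters. A minimal sketch, assuming a C++11 compiler; time_seconds and math_kernel are hypothetical names, and the clock resolution depends on the standard library implementation:</p>

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Run f once and return the elapsed time in seconds (monotonic clock).
template <typename F>
double time_seconds(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Hypothetical arithmetic kernel, templated so the same loop can be timed
// for float and for double. Returns the running sum so the caller can use it.
template <typename Real>
Real math_kernel(std::vector<Real>& a) {
    if (a.empty()) return Real(0);
    a[0] = Real(0);
    for (std::size_t i = 1; i < a.size(); ++i) {
        const Real t = static_cast<Real>(i);
        a[i] = (t * t - t) / (t + t);
        a[0] += a[i];
    }
    return a[0];
}
```

<p>Allocating one float vector and one double vector up front and passing a lambda that calls math_kernel to time_seconds keeps allocation cost out of the measurement. Printing or otherwise using the returned sum helps stop an optimizing compiler from discarding the loop.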
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Edgar Reynaldo)</author>
		<pubDate>Tue, 05 Apr 2016 21:11:36 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>Double precision will be anywhere from slow to glacial on the GPU when compared to single precision; on CPUs (x86 at least), not so much.</p><p><a href="https://github.com/g-truc/glm">GLM</a> is a great math library. It&#39;s pretty much standalone, very portable, and has support for most everything you&#39;d need for rendering. And it supports single and double precision matrices (and vectors, and so on).</p><p>Since Allegro&#39;s transforms are geared towards GPUs, or so I think, single precision is probably best.
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Erin Maus)</author>
		<pubDate>Tue, 05 Apr 2016 22:09:54 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>I modified Brandon&#39;s benchmarking program (the second method) slightly and fixed a minor bug (he was initializing a float array with 0.0, not 0.0f), and then compiled it with different optimization levels and ran the tests with 1000 calls.</p><p>Zip file of code and batch scripts :<br /><a href="https://www.allegro.cc/files/attachment/610270">BenchmarksAndProfiling.zip</a></p><p>Here are the results :
</p><div class="source-code"><div class="toolbar"><span class="button numbers"><b>#</b></span><span class="button select">Select</span><span class="button expand">Expand</span></div><div class="inner"><span class="number">  1</span>
<span class="number">  2</span>c:\ctwoplus\progcode\BenchmarksAndProfiling&gt;CompileFPUbenchmark.bat
<span class="number">  3</span>
<span class="number">  4</span>c:\ctwoplus\progcode\BenchmarksAndProfiling&gt;echo on
<span class="number">  5</span>
<span class="number">  6</span>c:\ctwoplus\progcode\BenchmarksAndProfiling&gt;rem Compiling fpubenchmark.cpp
<span class="number">  7</span>ECHO is on.
<span class="number">  8</span>
<span class="number">  9</span>c:\ctwoplus\progcode\BenchmarksAndProfiling&gt;mingw32-g<span class="k3">+</span><span class="k3">+</span> <span class="k3">-</span>Wall <span class="k3">-</span>m32 <span class="k3">-</span>O0 <span class="k3">-</span>o fpu32-0.exe <span class="k3">-</span>Ic:\mingw\LIBS\A5113distro\include <span class="k3">-</span>Lc:\mingw\LIBS\A5113distro\lib fpubenchmark.cpp <span class="k3">-</span>lallegro_monolith.dll
<span class="number"> 10</span>
<span class="number"> 11</span>c:\ctwoplus\progcode\BenchmarksAndProfiling&gt;mingw32-g<span class="k3">+</span><span class="k3">+</span> <span class="k3">-</span>Wall <span class="k3">-</span>m32 <span class="k3">-</span>O1 <span class="k3">-</span>o fpu32-1.exe <span class="k3">-</span>Ic:\mingw\LIBS\A5113distro\include <span class="k3">-</span>Lc:\mingw\LIBS\A5113distro\lib fpubenchmark.cpp <span class="k3">-</span>lallegro_monolith.dll
<span class="number"> 12</span>
<span class="number"> 13</span>c:\ctwoplus\progcode\BenchmarksAndProfiling&gt;mingw32-g<span class="k3">+</span><span class="k3">+</span> <span class="k3">-</span>Wall <span class="k3">-</span>m32 <span class="k3">-</span>O2 <span class="k3">-</span>o fpu32-2.exe <span class="k3">-</span>Ic:\mingw\LIBS\A5113distro\include <span class="k3">-</span>Lc:\mingw\LIBS\A5113distro\lib fpubenchmark.cpp <span class="k3">-</span>lallegro_monolith.dll
<span class="number"> 14</span>
<span class="number"> 15</span>c:\ctwoplus\progcode\BenchmarksAndProfiling&gt;mingw32-g<span class="k3">+</span><span class="k3">+</span> <span class="k3">-</span>Wall <span class="k3">-</span>m32 <span class="k3">-</span>O3 <span class="k3">-</span>o fpu32-3.exe <span class="k3">-</span>Ic:\mingw\LIBS\A5113distro\include <span class="k3">-</span>Lc:\mingw\LIBS\A5113distro\lib fpubenchmark.cpp <span class="k3">-</span>lallegro_monolith.dll
<span class="number"> 16</span>ECHO is on.
<span class="number"> 17</span>
<span class="number"> 18</span>c:\ctwoplus\progcode\BenchmarksAndProfiling&gt;rem <span class="k3">-</span>m64 <span class="k1">not</span> supported on mingw32
<span class="number"> 19</span>
<span class="number"> 20</span>c:\ctwoplus\progcode\BenchmarksAndProfiling&gt;rem mingw32-g<span class="k3">+</span><span class="k3">+</span> <span class="k3">-</span>Wall <span class="k3">-</span>m64 <span class="k3">-</span>O0 <span class="k3">-</span>o fpu64-0.exe <span class="k3">-</span>Ic:\mingw\LIBS\A5113distro\include <span class="k3">-</span>Lc:\mingw\LIBS\A5113distro\lib fpubenchmark.cpp <span class="k3">-</span>lallegro_monolith.dll
<span class="number"> 21</span>
<span class="number"> 22</span>c:\ctwoplus\progcode\BenchmarksAndProfiling&gt;rem mingw32-g<span class="k3">+</span><span class="k3">+</span> <span class="k3">-</span>Wall <span class="k3">-</span>m64 <span class="k3">-</span>O1 <span class="k3">-</span>o fpu64-1.exe <span class="k3">-</span>Ic:\mingw\LIBS\A5113distro\include <span class="k3">-</span>Lc:\mingw\LIBS\A5113distro\lib fpubenchmark.cpp <span class="k3">-</span>lallegro_monolith.dll
<span class="number"> 23</span>
<span class="number"> 24</span>c:\ctwoplus\progcode\BenchmarksAndProfiling&gt;rem mingw32-g<span class="k3">+</span><span class="k3">+</span> <span class="k3">-</span>Wall <span class="k3">-</span>m64 <span class="k3">-</span>O2 <span class="k3">-</span>o fpu64-2.exe <span class="k3">-</span>Ic:\mingw\LIBS\A5113distro\include <span class="k3">-</span>Lc:\mingw\LIBS\A5113distro\lib fpubenchmark.cpp <span class="k3">-</span>lallegro_monolith.dll
<span class="number"> 25</span>
<span class="number"> 26</span>c:\ctwoplus\progcode\BenchmarksAndProfiling&gt;rem mingw32-g<span class="k3">+</span><span class="k3">+</span> <span class="k3">-</span>Wall <span class="k3">-</span>m64 <span class="k3">-</span>O3 <span class="k3">-</span>o fpu64-3.exe <span class="k3">-</span>Ic:\mingw\LIBS\A5113distro\include <span class="k3">-</span>Lc:\mingw\LIBS\A5113distro\lib fpubenchmark.cpp <span class="k3">-</span>lallegro_monolith.dll
<span class="number"> 27</span>c:\ctwoplus\progcode\BenchmarksAndProfiling&gt;RunFPUbenchmark.bat
<span class="number"> 28</span>
<span class="number"> 29</span>c:\ctwoplus\progcode\BenchmarksAndProfiling&gt;echo on
<span class="number"> 30</span>ECHO is on.
<span class="number"> 31</span>
<span class="number"> 32</span>c:\ctwoplus\progcode\BenchmarksAndProfiling&gt;rem Running <span class="n">32</span> bit fpu benchmarks
<span class="number"> 33</span>
<span class="number"> 34</span>c:\ctwoplus\progcode\BenchmarksAndProfiling&gt;fpu32-0.exe
<span class="number"> 35</span>Testing <span class="n">1000</span> calls <span class="k1">and</span> <span class="n">6220800</span> memory allocations <span class="k2">:</span>
<span class="number"> 36</span><span class="k1">float</span> result <span class="n">9679922003968</span>.<span class="n">000000</span>
<span class="number"> 37</span>Float results <span class="k2">(</span>   <span class="n">43</span>.<span class="n">69694183</span> seconds<span class="k2">)</span> <span class="k2">:</span> Total allocation time    <span class="n">18</span>.<span class="n">57704912</span> seconds , total math time    <span class="n">23</span>.<span class="n">58766864</span> seconds , total dealloc time     <span class="n">1</span>.<span class="n">53222407</span>
<span class="number"> 38</span>Float result averages <span class="k2">:</span> Allocation average     <span class="n">0</span>.<span class="n">01857705</span> , math average     <span class="n">0</span>.<span class="n">02358767</span> , dealloc average     <span class="n">0</span>.<span class="n">00153222</span>
<span class="number"> 39</span><span class="k1">double</span> result <span class="n">9674583494400</span>.<span class="n">500000</span>
<span class="number"> 40</span>Double results <span class="k2">(</span>   <span class="n">54</span>.<span class="n">04342045</span> seconds<span class="k2">)</span> <span class="k2">:</span> Total allocation time    <span class="n">19</span>.<span class="n">55067594</span> seconds , total math time    <span class="n">31</span>.<span class="n">37647679</span> seconds , total dealloc time     <span class="n">3</span>.<span class="n">11626772</span>
<span class="number"> 41</span>Double result averages <span class="k2">:</span> Allocation average     <span class="n">0</span>.<span class="n">01955068</span> , math average     <span class="n">0</span>.<span class="n">03137648</span> , dealloc average     <span class="n">0</span>.<span class="n">00311627</span>
<span class="number"> 42</span>
<span class="number"> 43</span>c:\ctwoplus\progcode\BenchmarksAndProfiling&gt;fpu32-1.exe
<span class="number"> 44</span>Testing <span class="n">1000</span> calls <span class="k1">and</span> <span class="n">6220800</span> memory allocations <span class="k2">:</span>
<span class="number"> 45</span><span class="k1">float</span> result <span class="n">9679922003968</span>.<span class="n">000000</span>
<span class="number"> 46</span>Float results <span class="k2">(</span>   <span class="n">25</span>.<span class="n">86970711</span> seconds<span class="k2">)</span> <span class="k2">:</span> Total allocation time     <span class="n">6</span>.<span class="n">87207194</span> seconds , total math time    <span class="n">17</span>.<span class="n">43963017</span> seconds , total dealloc time     <span class="n">1</span>.<span class="n">55800499</span>
<span class="number"> 47</span>Float result averages <span class="k2">:</span> Allocation average     <span class="n">0</span>.<span class="n">00687207</span> , math average     <span class="n">0</span>.<span class="n">01743963</span> , dealloc average     <span class="n">0</span>.<span class="n">00155800</span>
<span class="number"> 48</span><span class="k1">double</span> result <span class="n">9674583494400</span>.<span class="n">500000</span>
<span class="number"> 49</span>Double results <span class="k2">(</span>   <span class="n">33</span>.<span class="n">06386690</span> seconds<span class="k2">)</span> <span class="k2">:</span> Total allocation time    <span class="n">12</span>.<span class="n">41526336</span> seconds , total math time    <span class="n">17</span>.<span class="n">54913144</span> seconds , total dealloc time     <span class="n">3</span>.<span class="n">09947210</span>
<span class="number"> 50</span>Double result averages <span class="k2">:</span> Allocation average     <span class="n">0</span>.<span class="n">01241526</span> , math average     <span class="n">0</span>.<span class="n">01754913</span> , dealloc average     <span class="n">0</span>.<span class="n">00309947</span>
<span class="number"> 51</span>
<span class="number"> 52</span>c:\ctwoplus\progcode\BenchmarksAndProfiling&gt;fpu32-2.exe
<span class="number"> 53</span>Testing <span class="n">1000</span> calls <span class="k1">and</span> <span class="n">6220800</span> memory allocations <span class="k2">:</span>
<span class="number"> 54</span><span class="k1">float</span> result <span class="n">9679922003968</span>.<span class="n">000000</span>
<span class="number"> 55</span>Float results <span class="k2">(</span>   <span class="n">24</span>.<span class="n">68116194</span> seconds<span class="k2">)</span> <span class="k2">:</span> Total allocation time     <span class="n">6</span>.<span class="n">41169159</span> seconds , total math time    <span class="n">16</span>.<span class="n">69645412</span> seconds , total dealloc time     <span class="n">1</span>.<span class="n">57301624</span>
<span class="number"> 56</span>Float result averages <span class="k2">:</span> Allocation average     <span class="n">0</span>.<span class="n">00641169</span> , math average     <span class="n">0</span>.<span class="n">01669645</span> , dealloc average     <span class="n">0</span>.<span class="n">00157302</span>
<span class="number"> 57</span><span class="k1">double</span> result <span class="n">9674583494400</span>.<span class="n">500000</span>
<span class="number"> 58</span>Double results <span class="k2">(</span>   <span class="n">32</span>.<span class="n">01803220</span> seconds<span class="k2">)</span> <span class="k2">:</span> Total allocation time    <span class="n">12</span>.<span class="n">20064342</span> seconds , total math time    <span class="n">16</span>.<span class="n">72717455</span> seconds , total dealloc time     <span class="n">3</span>.<span class="n">09021423</span>
<span class="number"> 59</span>Double result averages <span class="k2">:</span> Allocation average     <span class="n">0</span>.<span class="n">01220064</span> , math average     <span class="n">0</span>.<span class="n">01672717</span> , dealloc average     <span class="n">0</span>.<span class="n">00309021</span>
<span class="number"> 60</span>
<span class="number"> 61</span>c:\ctwoplus\progcode\BenchmarksAndProfiling&gt;fpu32-3.exe
<span class="number"> 62</span>Testing <span class="n">1000</span> calls <span class="k1">and</span> <span class="n">6220800</span> memory allocations <span class="k2">:</span>
<span class="number"> 63</span><span class="k1">float</span> result <span class="n">9679922003968</span>.<span class="n">000000</span>
<span class="number"> 64</span>Float results <span class="k2">(</span>   <span class="n">24</span>.<span class="n">45988656</span> seconds<span class="k2">)</span> <span class="k2">:</span> Total allocation time     <span class="n">6</span>.<span class="n">31435281</span> seconds , total math time    <span class="n">16</span>.<span class="n">57656673</span> seconds , total dealloc time     <span class="n">1</span>.<span class="n">56896702</span>
<span class="number"> 65</span>Float result averages <span class="k2">:</span> Allocation average     <span class="n">0</span>.<span class="n">00631435</span> , math average     <span class="n">0</span>.<span class="n">01657657</span> , dealloc average     <span class="n">0</span>.<span class="n">00156897</span>
<span class="number"> 66</span><span class="k1">double</span> result <span class="n">9674583494400</span>.<span class="n">500000</span>
<span class="number"> 67</span>Double results <span class="k2">(</span>   <span class="n">32</span>.<span class="n">33114045</span> seconds<span class="k2">)</span> <span class="k2">:</span> Total allocation time    <span class="n">12</span>.<span class="n">42040700</span> seconds , total math time    <span class="n">16</span>.<span class="n">77953203</span> seconds , total dealloc time     <span class="n">3</span>.<span class="n">13120142</span>
<span class="number"> 68</span>Double result averages <span class="k2">:</span> Allocation average     <span class="n">0</span>.<span class="n">01242041</span> , math average     <span class="n">0</span>.<span class="n">01677953</span> , dealloc average     <span class="n">0</span>.<span class="n">00313120</span>
<span class="number"> 69</span>ECHO is on.
<span class="number"> 70</span>
<span class="number"> 71</span>c:\ctwoplus\progcode\BenchmarksAndProfiling&gt;rem Running <span class="n">64</span> bit fpu benchmarks
<span class="number"> 72</span>
<span class="number"> 73</span>c:\ctwoplus\progcode\BenchmarksAndProfiling&gt;rem fpu64-0.exe
<span class="number"> 74</span>
<span class="number"> 75</span>c:\ctwoplus\progcode\BenchmarksAndProfiling&gt;rem fpu64-1.exe
<span class="number"> 76</span>
<span class="number"> 77</span>c:\ctwoplus\progcode\BenchmarksAndProfiling&gt;rem fpu64-2.exe
<span class="number"> 78</span>
<span class="number"> 79</span>c:\ctwoplus\progcode\BenchmarksAndProfiling&gt;rem fpu64-3.exe
<span class="number"> 80</span>
<span class="number"> 81</span>c:\ctwoplus\progcode\BenchmarksAndProfiling&gt;
</div></div><p>

Here&#39;s the code I used :
</p><div class="source-code"><div class="toolbar"><span class="button numbers"><b>#</b></span><span class="button select">Select</span><span class="button expand">Expand</span></div><div class="inner"><span class="number">   1</span><span class="c">//CREATED: 2015-06-25 15:11 -  -BDN</span>
<span class="number">   2</span><span class="c">//UPDATED: 2015-06-25 15:11 -  -BDN</span>
<span class="number">   3</span><span class="c">//AUTHORS: Brandon D. Northcutt (brandon@northcutt.net)</span>
<span class="number">   4</span><span class="c">//</span>
<span class="number">   5</span><span class="c">//This is a program intended to illustrate the relative efficiency of double versus single precision floating point numbers.</span>
<span class="number">   6</span><span class="p">#include "allegro5/allegro.h"</span>
<span class="number">   7</span>
<span class="number">   8</span>
<span class="number">   9</span>
<span class="number">  10</span><span class="p">#include &lt;cstdio&gt;</span>
<span class="number">  11</span><span class="p">#include &lt;cstdlib&gt;</span>
<span class="number">  12</span>
<span class="number">  13</span>
<span class="number">  14</span>
<span class="number">  15</span><span class="k1">volatile</span> <span class="k1">int</span> MEM <span class="k3">=</span> <span class="n">6220800</span><span class="k2">;</span><span class="c">///2015-06-25 14:27 - An RGB 1920x1080 image. -BDN</span>
<span class="number">  16</span><span class="k1">const</span> <span class="k1">int</span> CALLS <span class="k3">=</span> <span class="n">1000</span><span class="k2">;</span>
<span class="number">  17</span>
<span class="number">  18</span><span class="k1">double</span>  <span class="k3">*</span> ad<span class="k2">;</span>
<span class="number">  19</span><span class="k1">float</span>  <span class="k3">*</span> af<span class="k2">;</span>
<span class="number">  20</span>
<span class="number">  21</span><span class="k1">int</span> CALLNUM <span class="k3">=</span> <span class="n">0</span><span class="k2">;</span>
<span class="number">  22</span>
<span class="number">  23</span><span class="k1">double</span> double_alloc_time<span class="k2">[</span>CALLS<span class="k2">]</span><span class="k2">;</span>
<span class="number">  24</span><span class="k1">double</span> double_math_time<span class="k2">[</span>CALLS<span class="k2">]</span><span class="k2">;</span>
<span class="number">  25</span><span class="k1">double</span> double_dealloc_time<span class="k2">[</span>CALLS<span class="k2">]</span><span class="k2">;</span>
<span class="number">  26</span><span class="k1">double</span> total_double_alloc_time <span class="k3">=</span> <span class="n">0</span>.<span class="n">0</span><span class="k2">;</span>
<span class="number">  27</span><span class="k1">double</span> total_double_math_time <span class="k3">=</span> <span class="n">0</span>.<span class="n">0</span><span class="k2">;</span>
<span class="number">  28</span><span class="k1">double</span> total_double_dealloc_time <span class="k3">=</span> <span class="n">0</span>.<span class="n">0</span><span class="k2">;</span>
<span class="number">  29</span>
<span class="number">  30</span><span class="k1">double</span> float_alloc_time<span class="k2">[</span>CALLS<span class="k2">]</span><span class="k2">;</span>
<span class="number">  31</span><span class="k1">double</span> float_math_time<span class="k2">[</span>CALLS<span class="k2">]</span><span class="k2">;</span>
<span class="number">  32</span><span class="k1">double</span> float_dealloc_time<span class="k2">[</span>CALLS<span class="k2">]</span><span class="k2">;</span>
<span class="number">  33</span><span class="k1">double</span> total_float_alloc_time <span class="k3">=</span> <span class="n">0</span>.<span class="n">0</span><span class="k2">;</span>
<span class="number">  34</span><span class="k1">double</span> total_float_math_time <span class="k3">=</span> <span class="n">0</span>.<span class="n">0</span><span class="k2">;</span>
<span class="number">  35</span><span class="k1">double</span> total_float_dealloc_time <span class="k3">=</span> <span class="n">0</span>.<span class="n">0</span><span class="k2">;</span>
<span class="number">  36</span>
<span class="number">  37</span><span class="k1">void</span> double_memory_allocation<span class="k2">(</span><span class="k2">)</span>
<span class="number">  38</span><span class="k2">{</span>
<span class="number">  39</span>  ad<span class="k3">=</span><span class="k1">new</span> <span class="k1">double</span><span class="k2">[</span>MEM<span class="k2">]</span><span class="k2">;</span>
<span class="number">  40</span>  <span class="k1">for</span><span class="k2">(</span><span class="k1">int</span> i<span class="k3">=</span><span class="n">0</span><span class="k2">;</span>i<span class="k3">&lt;</span>MEM<span class="k2">;</span>i<span class="k3">+</span><span class="k3">+</span><span class="k2">)</span> ad<span class="k2">[</span>i<span class="k2">]</span><span class="k3">=</span><span class="n">0</span>.<span class="n">0</span><span class="k2">;</span>
<span class="number">  41</span><span class="k2">}</span>
<span class="number">  42</span>
<span class="number">  43</span><span class="k1">double</span> double_math<span class="k2">(</span><span class="k1">void</span><span class="k2">)</span>
<span class="number">  44</span><span class="k2">{</span>
<span class="number">  45</span>  <span class="k1">double</span> t<span class="k2">;</span>
<span class="number">  46</span>  <span class="k1">for</span><span class="k2">(</span><span class="k1">int</span> i<span class="k3">=</span><span class="n">1</span><span class="k2">;</span>i<span class="k3">&lt;</span>MEM<span class="k2">;</span>i<span class="k3">+</span><span class="k3">+</span><span class="k2">)</span>
<span class="number">  47</span>  <span class="k2">{</span>
<span class="number">  48</span>    t<span class="k3">=</span><span class="k2">(</span><span class="k1">double</span><span class="k2">)</span>i<span class="k2">;</span>
<span class="number">  49</span>    ad<span class="k2">[</span>i<span class="k2">]</span><span class="k3">=</span><span class="k2">(</span>t<span class="k3">*</span>t <span class="k3">-</span> t<span class="k2">)</span><span class="k3">/</span><span class="k2">(</span>t <span class="k3">+</span> t<span class="k2">)</span><span class="k2">;</span>
<span class="number">  50</span>    ad<span class="k2">[</span><span class="n">0</span><span class="k2">]</span><span class="k3">+</span><span class="k3">=</span>ad<span class="k2">[</span>i<span class="k2">]</span><span class="k2">;</span>
<span class="number">  51</span>  <span class="k2">}</span>
<span class="number">  52</span>  <span class="k1">return</span> ad<span class="k2">[</span><span class="n">0</span><span class="k2">]</span><span class="k2">;</span>
<span class="number">  53</span><span class="k2">}</span>
<span class="number">  54</span>
<span class="number">  55</span><span class="k1">void</span> double_memory_deallocation<span class="k2">(</span><span class="k2">)</span>
<span class="number">  56</span><span class="k2">{</span>
<span class="number">  57</span>  <span class="k1">delete</span><span class="k2">[</span><span class="k2">]</span> ad<span class="k2">;</span>
<span class="number">  58</span><span class="k2">}</span>
<span class="number">  59</span>
<span class="number">  60</span><span class="k1">double</span> double_benchmark<span class="k2">(</span><span class="k1">void</span><span class="k2">)</span>
<span class="number">  61</span><span class="k2">{</span>
<span class="number">  62</span>  <span class="k1">double</span> r<span class="k2">;</span>
<span class="number">  63</span>  <span class="k1">double</span> t1 <span class="k3">=</span> <a href="http://www.allegro.cc/manual/al_get_time"><span class="a">al_get_time</span></a><span class="k2">(</span><span class="k2">)</span><span class="k2">;</span>
<span class="number">  64</span>  double_memory_allocation<span class="k2">(</span><span class="k2">)</span><span class="k2">;</span>
<span class="number">  65</span>  <span class="k1">double</span> t2 <span class="k3">=</span> <a href="http://www.allegro.cc/manual/al_get_time"><span class="a">al_get_time</span></a><span class="k2">(</span><span class="k2">)</span><span class="k2">;</span>
<span class="number">  66</span>  r<span class="k3">=</span>double_math<span class="k2">(</span><span class="k2">)</span><span class="k2">;</span>
<span class="number">  67</span>  <span class="k1">double</span> t3 <span class="k3">=</span> <a href="http://www.allegro.cc/manual/al_get_time"><span class="a">al_get_time</span></a><span class="k2">(</span><span class="k2">)</span><span class="k2">;</span>
<span class="number">  68</span>  double_memory_deallocation<span class="k2">(</span><span class="k2">)</span><span class="k2">;</span>
<span class="number">  69</span>  <span class="k1">double</span> t4 <span class="k3">=</span> <a href="http://www.allegro.cc/manual/al_get_time"><span class="a">al_get_time</span></a><span class="k2">(</span><span class="k2">)</span><span class="k2">;</span>
<span class="number">  70</span>  
<span class="number">  71</span>  total_double_alloc_time <span class="k3">+</span><span class="k3">=</span> double_alloc_time<span class="k2">[</span>CALLNUM<span class="k2">]</span> <span class="k3">=</span> t2 <span class="k3">-</span> t1<span class="k2">;</span>
<span class="number">  72</span>  total_double_math_time <span class="k3">+</span><span class="k3">=</span> double_math_time<span class="k2">[</span>CALLNUM<span class="k2">]</span> <span class="k3">=</span> t3 <span class="k3">-</span> t2<span class="k2">;</span>
<span class="number">  73</span>  total_double_dealloc_time <span class="k3">+</span><span class="k3">=</span> double_dealloc_time<span class="k2">[</span>CALLNUM<span class="k2">]</span> <span class="k3">=</span> t4 <span class="k3">-</span> t3<span class="k2">;</span>
<span class="number">  74</span>  
<span class="number">  75</span>  <span class="k1">return</span> r<span class="k2">;</span>
<span class="number">  76</span><span class="k2">}</span>
<span class="number">  77</span>
<span class="number">  78</span><span class="k1">void</span> float_memory_allocation<span class="k2">(</span><span class="k2">)</span>
<span class="number">  79</span><span class="k2">{</span>
<span class="number">  80</span>  af<span class="k3">=</span><span class="k1">new</span> <span class="k1">float</span><span class="k2">[</span>MEM<span class="k2">]</span><span class="k2">;</span>
<span class="number">  81</span>  <span class="k1">for</span><span class="k2">(</span><span class="k1">int</span> i<span class="k3">=</span><span class="n">0</span><span class="k2">;</span>i<span class="k3">&lt;</span>MEM<span class="k2">;</span>i<span class="k3">+</span><span class="k3">+</span><span class="k2">)</span> af<span class="k2">[</span>i<span class="k2">]</span><span class="k3">=</span><span class="n">0</span>.<span class="n">0f</span><span class="k2">;</span>
<span class="number">  82</span><span class="k2">}</span>
<span class="number">  83</span>
<span class="number">  84</span><span class="k1">float</span> float_math<span class="k2">(</span><span class="k1">void</span><span class="k2">)</span>
<span class="number">  85</span><span class="k2">{</span>
<span class="number">  86</span>  <span class="k1">float</span> t<span class="k2">;</span>
<span class="number">  87</span>  <span class="k1">for</span><span class="k2">(</span><span class="k1">int</span> i<span class="k3">=</span><span class="n">1</span><span class="k2">;</span>i<span class="k3">&lt;</span>MEM<span class="k2">;</span>i<span class="k3">+</span><span class="k3">+</span><span class="k2">)</span>
<span class="number">  88</span>  <span class="k2">{</span>
<span class="number">  89</span>    t<span class="k3">=</span><span class="k2">(</span><span class="k1">float</span><span class="k2">)</span>i<span class="k2">;</span>
<span class="number">  90</span>    af<span class="k2">[</span>i<span class="k2">]</span><span class="k3">=</span><span class="k2">(</span>t<span class="k3">*</span>t <span class="k3">-</span> t<span class="k2">)</span><span class="k3">/</span><span class="k2">(</span>t <span class="k3">+</span> t<span class="k2">)</span><span class="k2">;</span>
<span class="number">  91</span>    af<span class="k2">[</span><span class="n">0</span><span class="k2">]</span><span class="k3">+</span><span class="k3">=</span>af<span class="k2">[</span>i<span class="k2">]</span><span class="k2">;</span>
<span class="number">  92</span>  <span class="k2">}</span>
<span class="number">  93</span>  <span class="k1">return</span> af<span class="k2">[</span><span class="n">0</span><span class="k2">]</span><span class="k2">;</span>
<span class="number">  94</span><span class="k2">}</span>
<span class="number">  95</span>
<span class="number">  96</span><span class="k1">void</span> float_memory_deallocation<span class="k2">(</span><span class="k2">)</span>
<span class="number">  97</span><span class="k2">{</span>
<span class="number">  98</span>  <span class="k1">delete</span><span class="k2">[</span><span class="k2">]</span> af<span class="k2">;</span>
<span class="number">  99</span><span class="k2">}</span>
<span class="number"> 100</span>
<span class="number"> 101</span><span class="k1">float</span> float_benchmark<span class="k2">(</span><span class="k1">void</span><span class="k2">)</span>
<span class="number"> 102</span><span class="k2">{</span>
<span class="number"> 103</span>  <span class="k1">float</span> r<span class="k2">;</span>
<span class="number"> 104</span>  <span class="k1">double</span> t1 <span class="k3">=</span> <a href="http://www.allegro.cc/manual/al_get_time"><span class="a">al_get_time</span></a><span class="k2">(</span><span class="k2">)</span><span class="k2">;</span>
<span class="number"> 105</span>  float_memory_allocation<span class="k2">(</span><span class="k2">)</span><span class="k2">;</span>
<span class="number"> 106</span>  <span class="k1">double</span> t2 <span class="k3">=</span> <a href="http://www.allegro.cc/manual/al_get_time"><span class="a">al_get_time</span></a><span class="k2">(</span><span class="k2">)</span><span class="k2">;</span>
<span class="number"> 107</span>  r<span class="k3">=</span>float_math<span class="k2">(</span><span class="k2">)</span><span class="k2">;</span>
<span class="number"> 108</span>  <span class="k1">double</span> t3 <span class="k3">=</span> <a href="http://www.allegro.cc/manual/al_get_time"><span class="a">al_get_time</span></a><span class="k2">(</span><span class="k2">)</span><span class="k2">;</span>
<span class="number"> 109</span>  float_memory_deallocation<span class="k2">(</span><span class="k2">)</span><span class="k2">;</span>
<span class="number"> 110</span>  <span class="k1">double</span> t4 <span class="k3">=</span> <a href="http://www.allegro.cc/manual/al_get_time"><span class="a">al_get_time</span></a><span class="k2">(</span><span class="k2">)</span><span class="k2">;</span>
<span class="number"> 111</span>
<span class="number"> 112</span>  total_float_alloc_time <span class="k3">+</span><span class="k3">=</span> float_alloc_time<span class="k2">[</span>CALLNUM<span class="k2">]</span> <span class="k3">=</span> t2 <span class="k3">-</span> t1<span class="k2">;</span>
<span class="number"> 113</span>  total_float_math_time <span class="k3">+</span><span class="k3">=</span> float_math_time<span class="k2">[</span>CALLNUM<span class="k2">]</span> <span class="k3">=</span> t3 <span class="k3">-</span> t2<span class="k2">;</span>
<span class="number"> 114</span>  total_float_dealloc_time <span class="k3">+</span><span class="k3">=</span> float_dealloc_time<span class="k2">[</span>CALLNUM<span class="k2">]</span> <span class="k3">=</span> t4 <span class="k3">-</span> t3<span class="k2">;</span>
<span class="number"> 115</span>
<span class="number"> 116</span>  <span class="k1">return</span> r<span class="k2">;</span>
<span class="number"> 117</span><span class="k2">}</span>
<span class="number"> 118</span>
<span class="number"> 119</span><span class="k1">int</span> main <span class="k2">(</span><span class="k1">void</span><span class="k2">)</span>
<span class="number"> 120</span><span class="k2">{</span>
<span class="number"> 121</span>   <a href="http://www.allegro.cc/manual/al_init"><span class="a">al_init</span></a><span class="k2">(</span><span class="k2">)</span><span class="k2">;</span>
<span class="number"> 122</span>
<span class="number"> 123</span>  <span class="k1">float</span> tmpf<span class="k2">;</span>    
<span class="number"> 124</span>  <span class="k1">double</span> tmpd<span class="k2">;</span>  
<span class="number"> 125</span>
<span class="number"> 126</span>  <a href="http://www.delorie.com/djgpp/doc/libc/libc_624.html" target="_blank">printf</a><span class="k2">(</span><span class="s">"Testing %d calls and %d memory allocations :\n"</span> , CALLS , MEM<span class="k2">)</span><span class="k2">;</span>
<span class="number"> 127</span>
<span class="number"> 128</span>  <span class="k1">for</span><span class="k2">(</span>CALLNUM<span class="k3">=</span><span class="n">0</span><span class="k2">;</span>CALLNUM<span class="k3">&lt;</span>CALLS<span class="k2">;</span>CALLNUM<span class="k3">+</span><span class="k3">+</span><span class="k2">)</span> tmpf<span class="k3">=</span>float_benchmark<span class="k2">(</span><span class="k2">)</span><span class="k2">;</span>  
<span class="number"> 129</span>  <a href="http://www.delorie.com/djgpp/doc/libc/libc_624.html" target="_blank">printf</a><span class="k2">(</span><span class="s">"float result %f\n"</span>,tmpf<span class="k2">)</span><span class="k2">;</span>
<span class="number"> 130</span>
<span class="number"> 131</span>  <span class="k1">double</span> total_float_time <span class="k3">=</span> total_float_alloc_time <span class="k3">+</span> total_float_math_time <span class="k3">+</span> total_float_dealloc_time<span class="k2">;</span>
<span class="number"> 132</span>  <a href="http://www.delorie.com/djgpp/doc/libc/libc_624.html" target="_blank">printf</a><span class="k2">(</span><span class="s">"Float results (%14.8lf seconds) : Total allocation time %14.8lf seconds , total math time %14.8lf seconds , total dealloc time %14.8lf\n"</span>,
<span class="number"> 133</span>            total_float_time , total_float_alloc_time , total_float_math_time , total_float_dealloc_time<span class="k2">)</span><span class="k2">;</span>
<span class="number"> 134</span>   <a href="http://www.delorie.com/djgpp/doc/libc/libc_624.html" target="_blank">printf</a><span class="k2">(</span><span class="s">"Float result averages : Allocation average %14.8lf , math average %14.8lf , dealloc average %14.8lf\n"</span>,
<span class="number"> 135</span>            total_float_alloc_time<span class="k3">/</span>CALLS , total_float_math_time<span class="k3">/</span>CALLS , total_float_dealloc_time<span class="k3">/</span>CALLS<span class="k2">)</span><span class="k2">;</span>
<span class="number"> 136</span>          
<span class="number"> 137</span>  
<span class="number"> 138</span>  <span class="k1">for</span><span class="k2">(</span>CALLNUM<span class="k3">=</span><span class="n">0</span><span class="k2">;</span>CALLNUM<span class="k3">&lt;</span>CALLS<span class="k2">;</span>CALLNUM<span class="k3">+</span><span class="k3">+</span><span class="k2">)</span> tmpd<span class="k3">=</span>double_benchmark<span class="k2">(</span><span class="k2">)</span><span class="k2">;</span>  
<span class="number"> 139</span>  <a href="http://www.delorie.com/djgpp/doc/libc/libc_624.html" target="_blank">printf</a><span class="k2">(</span><span class="s">"double result %lf\n"</span>,tmpd<span class="k2">)</span><span class="k2">;</span>
<span class="number"> 140</span>  
<span class="number"> 141</span>  <span class="k1">double</span> total_double_time <span class="k3">=</span> total_double_alloc_time <span class="k3">+</span> total_double_math_time <span class="k3">+</span> total_double_dealloc_time<span class="k2">;</span>
<span class="number"> 142</span>  <a href="http://www.delorie.com/djgpp/doc/libc/libc_624.html" target="_blank">printf</a><span class="k2">(</span><span class="s">"Double results (%14.8lf seconds) : Total allocation time %14.8lf seconds , total math time %14.8lf seconds , total dealloc time %14.8lf\n"</span>,
<span class="number"> 143</span>            total_double_time , total_double_alloc_time , total_double_math_time , total_double_dealloc_time<span class="k2">)</span><span class="k2">;</span>
<span class="number"> 144</span>   <a href="http://www.delorie.com/djgpp/doc/libc/libc_624.html" target="_blank">printf</a><span class="k2">(</span><span class="s">"Double result averages : Allocation average %14.8lf , math average %14.8lf , dealloc average %14.8lf\n"</span>,
<span class="number"> 145</span>            total_double_alloc_time<span class="k3">/</span>CALLS , total_double_math_time<span class="k3">/</span>CALLS , total_double_dealloc_time<span class="k3">/</span>CALLS<span class="k2">)</span><span class="k2">;</span>
<span class="number"> 146</span>
<span class="number"> 147</span>  <span class="k1">return</span> <span class="n">0</span><span class="k2">;</span>
<span class="number"> 148</span><span class="k2">}</span>
</div></div><p>
As expected, -O0 took the longest. -O1, -O2, and -O3 were all comparable. Memory allocation and deallocation generally took about twice as long for doubles as for floats (because doubles are twice as big). Deallocation times were constant across optimization levels. Something important to note is that I used <span class="source-code"><span class="k1">volatile</span></span> for the memory allocation size so it couldn&#39;t be optimized away.</p><p>I used al_get_time for measurements. Allocation and deallocation can be quite costly, and should be avoided if possible. The float and double math times are comparable on my laptop (Intel i7-5700HQ @ 2.70 GHz) at any optimization level other than -O0.</p><p>I&#39;m running 64-bit Windows 10 and I wanted to test with -m64, but mingw32 doesn&#39;t support it. <img src="http://www.allegro.cc/forums/smileys/tongue.gif" alt=":P" /></p><p>Edit</p><p>TL;DR<br />Here&#39;s a table of the results including the allocations:
</p><pre>
-O0 float  : 43.70ms per op = 22.88FPS
-O0 double : 54.04ms per op = 18.50FPS

-O1 float  : 25.87ms per op = 38.65FPS
-O1 double : 33.06ms per op = 30.25FPS

-O2 float  : 24.68ms per op = 40.52FPS
-O2 double : 32.02ms per op = 31.23FPS

-O3 float  : 24.46ms per op = 40.88FPS
-O3 double : 32.33ms per op = 30.93FPS
</pre><p>

And a table of the results for just the computations :
</p><pre>
-O0 float  : 23.59ms per op = 42.39FPS
-O0 double : 31.38ms per op = 31.87FPS

-O1 float  : 17.44ms per op = 57.34FPS
-O1 double : 17.55ms per op = 56.98FPS

-O2 float  : 16.70ms per op = 59.88FPS
-O2 double : 16.73ms per op = 59.77FPS

-O3 float  : 16.58ms per op = 60.31FPS
-O3 double : 16.78ms per op = 59.59FPS
</pre><p>

So you can see that if you wanted to process 6220800 (1920x1080x3) floating point elements per frame on my laptop&#39;s CPU, it would just barely keep up with a 60 Hz refresh rate with optimizations enabled. But the difference between single precision and double precision floating point math is almost negligible.
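</p><p>For reference, the FPS column in both tables is just 1000 divided by the ms-per-op figure. A minimal sketch of that conversion (the helper name <span class="source-code">fps_from_ms</span> is made up for this example and is not part of the benchmark):</p><pre>
#include &lt;cstdio&gt;

// Convert a per-operation time in milliseconds to operations (frames) per second.
double fps_from_ms(double ms_per_op)
{
    return 1000.0 / ms_per_op;
}

int main()
{
    // Math-only timings for -O3, taken from the second table above.
    std::printf("float  : %.2f FPS\n", fps_from_ms(16.58));
    std::printf("double : %.2f FPS\n", fps_from_ms(16.78));
    return 0;
}
</pre><p>At 60 Hz the budget is about 16.67 ms per frame, so the optimized math-only timings sit right at that limit.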
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Edgar Reynaldo)</author>
		<pubDate>Wed, 06 Apr 2016 02:28:33 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><div class="quote_container"><div class="title"><a href="http://www.allegro.cc/forums/thread/616178/1021559#target">Aaron Bolyard</a> said:</div><div class="quote"><p>
Since Allegro&#39;s transforms are geared towards GPUs, or so I think, single precision is probably best.
</p></div></div><p>
OpenGL also supports half-precision floats and integer coordinate systems. I don&#39;t see any clear reason why Allegro shouldn&#39;t support them.</p><p>The GameCube runs with integer math. <img src="http://www.allegro.cc/forums/smileys/shocked.gif" alt=":o" /> <img src="http://www.allegro.cc/forums/smileys/shocked.gif" alt=":o" /> <img src="http://www.allegro.cc/forums/smileys/shocked.gif" alt=":o" /> Now that OpenGL supports it, the Dolphin emulator was ported to integer math and tons of bugs have gone away.</p><p><a href="https://dolphin-emu.org/blog/2014/03/15/pixel-processing-problems/">https://dolphin-emu.org/blog/2014/03/15/pixel-processing-problems/</a></p><p>[edit]</p><p>ALSO, I had no idea there was a difference between 0.0f and 0.0. There&#39;s REALLY such a thing as a float vs double literal, and the compiler will silently convert them if you have the wrong one. ... I think?</p><p>This is insanity!</p><p>Bringing it back to another of my threads: Somehow, a std::string implicitly converting to a C string is terrible, but doubles to floats, and floats to ints, are OKAY being implicit?! COME ON C++. COME ON.
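</p><p>A small sketch of the literal difference mentioned above (not from the thread; the function names are made up for the example). <span class="source-code">0.1f</span> is a float literal and <span class="source-code">0.1</span> is a double literal, and assigning the double literal to a float narrows it with no diagnostic by default:</p><pre>
#include &lt;cstdio&gt;

// Hypothetical helper: true when the float and double literals for one tenth
// hold different values once both are widened to double.
bool literal_widths_differ()
{
    float  f = 0.1f;  // float literal, rounded to single precision
    double d = 0.1;   // double literal, rounded to double precision
    return static_cast&lt;double&gt;(f) != d;
}

// Hypothetical helper: the double literal is rounded to float on assignment,
// and the compiler accepts this implicit conversion silently.
float narrowed_tenth()
{
    float f = 0.1;  // double literal implicitly converted to float
    return f;
}

int main()
{
    std::printf("0.1f as double : %.17g\n", static_cast&lt;double&gt;(0.1f));
    std::printf("0.1  as double : %.17g\n", 0.1);
    std::printf("literals differ: %s\n", literal_widths_differ() ? "yes" : "no");
    return 0;
}
</pre><p>GCC&#39;s -Wconversion option is intended to flag implicit narrowing like the one in narrowed_tenth.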
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Chris Katko)</author>
		<pubDate>Wed, 06 Apr 2016 02:41:01 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>See my last edit for FPS results of ops with and without allocations included.
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Edgar Reynaldo)</author>
		<pubDate>Wed, 06 Apr 2016 02:54:18 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p><span class="source-code"><a href="http://www.allegro.cc/manual/ALLEGRO_TRANSFORM"><span class="a">ALLEGRO_TRANSFORM</span></a></span> indeed has floats inside it, and since its internals are public, we&#39;re kind of stuck with it that way. It is that way primarily because that&#39;s what is supported across platforms (the culprit in this case is Direct3D).
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (SiegeLord)</author>
		<pubDate>Wed, 06 Apr 2016 06:16:33 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><div class="quote_container"><div class="title"><a href="http://www.allegro.cc/forums/thread/616178/1021564#target">Chris Katko</a> said:</div><div class="quote"><p>OpenGL also supports half-precision floats and integer coordinate systems. I don&#39;t see any clear reason why Allegro shouldn&#39;t support them.</p></div></div><p>If I remember correctly, half precision is only useful on mobile platforms. It&#39;s a no-op on most desktop GPUs. Similarly, native integer support is slow, like doubles.</p><p>But most of all, such features are useless for anyone using Allegro for rendering.</p><div class="quote_container"><div class="title">Quote:</div><div class="quote"><p>The Gamecube runs with integer math.</p></div></div><p>The classic Xbox had a bizarre programmable GPU unlike otherwise equivalent Nvidia chips before and after. The SNES had a terribly weak CPU, only a minor step up from the NES. The Nintendo 64 was pretty much an SGI workstation. The Wii has a small ARM processor on the same die as the GPU that controls various security and I/O processes.</p><p>Consoles used to have strange quirks unlike PCs, and that was nice, but that doesn&#39;t have any relevance to modern hardware.
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Erin Maus)</author>
		<pubDate>Wed, 06 Apr 2016 14:53:40 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>It would be possible to create a function called al_transform_coordinates_d that took double pointers though. That would at least save the allocation of two floats. But I guess if they&#39;re on the stack it wouldn&#39;t matter, even in a heavy loop. Don&#39;t mind me. Just thinking out loud.</p><p>My only concern is this part of my code :</p><p>GeneratePlotData only gets called if the theta_delta or the radial_delta change, as that affects the number of data points in the spiral. But the transform and the modified coordinates change every time the rotation changes, which is quite often in my program.</p><div class="source-code"><div class="inner"><span class="k1">void</span> Spiral2D::Refresh<span class="k2">(</span><span class="k2">)</span> <span class="k2">{</span>
   <span class="k1">if</span> <span class="k2">(</span>spiral_needs_refresh<span class="k2">)</span> <span class="k2">{</span>
      GeneratePlotData<span class="k2">(</span><span class="k2">)</span><span class="k2">;</span>
   <span class="k2">}</span>
   <span class="k1">if</span> <span class="k2">(</span>transform_needs_refresh<span class="k2">)</span> <span class="k2">{</span>
      <span class="c">/// Refresh modified data from original using transform</span>
      <a href="http://www.allegro.cc/manual/al_identity_transform"><span class="a">al_identity_transform</span></a><span class="k2">(</span><span class="k3">&amp;</span>transform<span class="k2">)</span><span class="k2">;</span>
      <a href="http://www.allegro.cc/manual/al_rotate_transform"><span class="a">al_rotate_transform</span></a><span class="k2">(</span><span class="k3">&amp;</span>transform , rotation_degrees<span class="k3">*</span><span class="k2">(</span>M_PI<span class="k3">/</span><span class="n">180</span>.<span class="n">0</span><span class="k2">)</span><span class="k2">)</span><span class="k2">;</span>
      <a href="http://www.allegro.cc/manual/al_scale_transform"><span class="a">al_scale_transform</span></a><span class="k2">(</span><span class="k3">&amp;</span>transform , scalex , scaley<span class="k2">)</span><span class="k2">;</span>
      <a href="http://www.allegro.cc/manual/al_translate_transform"><span class="a">al_translate_transform</span></a><span class="k2">(</span><span class="k3">&amp;</span>transform , centerx , centery<span class="k2">)</span><span class="k2">;</span>
      <span class="k1">for</span> <span class="k2">(</span><span class="k1">unsigned</span> <span class="k1">int</span> i <span class="k3">=</span> <span class="n">0</span> <span class="k2">;</span> i <span class="k3">&lt;</span> Size<span class="k2">(</span><span class="k2">)</span> <span class="k2">;</span> <span class="k3">+</span><span class="k3">+</span>i<span class="k2">)</span> <span class="k2">{</span>
         Pos2D mod <span class="k3">=</span> DataOriginal<span class="k2">(</span>i<span class="k2">)</span><span class="k2">;</span>
         <span class="c">/// TODO : This is a hack</span>
         <span class="k1">float</span> x <span class="k3">=</span> mod.x<span class="k2">;</span>
         <span class="k1">float</span> y <span class="k3">=</span> mod.y<span class="k2">;</span>
         <a href="http://www.allegro.cc/manual/al_transform_coordinates"><span class="a">al_transform_coordinates</span></a><span class="k2">(</span><span class="k3">&amp;</span>transform , <span class="k3">&amp;</span>x , <span class="k3">&amp;</span>y<span class="k2">)</span><span class="k2">;</span>
         mod.x <span class="k3">=</span> x<span class="k2">;</span>
         mod.y <span class="k3">=</span> y<span class="k2">;</span>
         DataModified<span class="k2">(</span>i<span class="k2">)</span> <span class="k3">=</span> mod<span class="k2">;</span>
      <span class="k2">}</span>
      transform_needs_refresh <span class="k3">=</span> <span class="k1">false</span><span class="k2">;</span>
   <span class="k2">}</span>
<span class="k2">}</span>
</div></div><p>
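One possible way around the float round trip in the loop above is to apply the matrix to the doubles by hand. This is only a sketch under assumptions: <span class="source-code">Matrix4f</span> and <span class="source-code">transform_coordinates_d</span> are hypothetical names, not part of Allegro. Matrix4f stands in for ALLEGRO_TRANSFORM so the example compiles on its own, and the arithmetic assumes the 2D affine layout where m[3][0] and m[3][1] hold the translation. The coefficients are still floats, so this only avoids rounding the coordinates themselves to single precision.</p><pre>
#include &lt;cstdio&gt;

// Hypothetical stand-in for ALLEGRO_TRANSFORM: a 4x4 matrix of floats.
struct Matrix4f {
    float m[4][4];
};

// Hypothetical double-precision counterpart to al_transform_coordinates.
// Assumes column 0 and column 1 hold the x and y coefficients and that
// row 3 holds the translation.
void transform_coordinates_d(const Matrix4f* t, double* x, double* y)
{
    const double ox = *x;
    const double oy = *y;
    *x = ox * t-&gt;m[0][0] + oy * t-&gt;m[1][0] + t-&gt;m[3][0];
    *y = ox * t-&gt;m[0][1] + oy * t-&gt;m[1][1] + t-&gt;m[3][1];
}

int main()
{
    // Scale by 2 and translate by (10, 20); everything else is identity.
    Matrix4f t = {{{2.0f, 0.0f, 0.0f, 0.0f},
                   {0.0f, 2.0f, 0.0f, 0.0f},
                   {0.0f, 0.0f, 1.0f, 0.0f},
                   {10.0f, 20.0f, 0.0f, 1.0f}}};
    double x = 1.5;
    double y = -0.25;
    transform_coordinates_d(&amp;t, &amp;x, &amp;y);
    std::printf("%f %f\n", x, y);
    return 0;
}
</pre><p>With something like this, mod.x and mod.y could in principle be passed directly, without the temporary floats.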
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Edgar Reynaldo)</author>
		<pubDate>Tue, 12 Apr 2016 08:04:52 +0000</pubDate>
	</item>
</rss>
