<?xml version="1.0"?>
<rss version="2.0">
	<channel>
		<title>Profiling old code</title>
		<link>http://www.allegro.cc/forums/view/588951</link>
		<description>Allegro.cc Forum Thread</description>
		<webMaster>matthew@allegro.cc (Matthew Leverton)</webMaster>
		<lastBuildDate>Fri, 08 Dec 2006 18:05:55 +0000</lastBuildDate>
	</channel>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>I was digging around my old code and found my Mode7 engine. It&#39;s very slow and I&#39;d really like to give it a speed jolt. The majority of slowdown comes with rendering (obviously):</p><div class="source-code"><div class="toolbar"></div><div class="inner"><table width="100%"><tbody><tr><td class="number">1</td><td>&#160;</td></tr><tr><td class="number">2</td><td><span class="k1">inline</span> <span class="k1">void</span> fast_putpixel<span class="k2">(</span><a href="http://www.allegro.cc/manual/BITMAP" target="_blank"><span class="a">BITMAP</span></a> <span class="k3">*</span>bmp, <span class="k1">int</span> x, <span class="k1">int</span> y, <span class="k1">int</span> color<span class="k2">)</span></td></tr><tr><td class="number">3</td><td><span class="k2">{</span></td></tr><tr><td class="number">4</td><td>      <span class="k2">(</span><span class="k2">(</span><span class="k1">long</span> <span class="k3">*</span><span class="k2">)</span>bmp-&gt;line<span class="k2">[</span>y<span class="k2">]</span><span class="k2">)</span><span class="k2">[</span>x<span class="k2">]</span> <span class="k3">=</span> color<span class="k2">;</span></td></tr><tr><td class="number">5</td><td><span class="k2">}</span></td></tr><tr><td class="number">6</td><td>&#160;</td></tr><tr><td class="number">7</td><td><span class="k1">inline</span> <span class="k1">int</span> fast_getpixel<span class="k2">(</span><a href="http://www.allegro.cc/manual/BITMAP" target="_blank"><span class="a">BITMAP</span></a> <span class="k3">*</span>bmp, <span class="k1">int</span> x, <span class="k1">int</span> y<span class="k2">)</span></td></tr><tr><td class="number">8</td><td><span class="k2">{</span></td></tr><tr><td class="number">9</td><td>      <span class="k1">return</span> <span class="k2">(</span><span class="k2">(</span><span class="k1">long</span> <span class="k3">*</span><span class="k2">)</span>bmp-&gt;line<span class="k2">[</span>y<span 
class="k2">]</span><span class="k2">)</span><span class="k2">[</span>x<span class="k2">]</span><span class="k2">;</span></td></tr><tr><td class="number">10</td><td><span class="k2">}</span></td></tr><tr><td class="number">11</td><td>&#160;</td></tr><tr><td class="number">12</td><td><span class="k1">void</span> render_camera<span class="k2">(</span><a href="http://www.allegro.cc/manual/BITMAP" target="_blank"><span class="a">BITMAP</span></a> <span class="k3">*</span>source, <a href="http://www.allegro.cc/manual/BITMAP" target="_blank"><span class="a">BITMAP</span></a> <span class="k3">*</span>dest<span class="k2">)</span> <span class="k2">{</span></td></tr><tr><td class="number">13</td><td>     </td></tr><tr><td class="number">14</td><td>     <a href="http://www.allegro.cc/manual/blit" target="_blank"><span class="a">blit</span></a> <span class="k2">(</span>source, dest, <span class="n">0</span>, <span class="n">0</span>, <span class="n">0</span>, <span class="n">0</span>, dest-&gt;w, dest-&gt;h<span class="k2">)</span><span class="k2">;</span>     </td></tr><tr><td class="number">15</td><td>     clear<span class="k2">(</span>source<span class="k2">)</span><span class="k2">;</span></td></tr><tr><td class="number">16</td><td>          </td></tr><tr><td class="number">17</td><td><span class="k2">}</span></td></tr><tr><td class="number">18</td><td>&#160;</td></tr><tr><td class="number">19</td><td><span class="k1">void</span> render_map <span class="k2">(</span><a href="http://www.allegro.cc/manual/BITMAP" target="_blank"><span class="a">BITMAP</span></a> <span class="k3">*</span>bmp, <a href="http://www.allegro.cc/manual/BITMAP" target="_blank"><span class="a">BITMAP</span></a> <span class="k3">*</span>tile, <a href="http://www.allegro.cc/manual/fixed" target="_blank"><span class="a">fixed</span></a> angle, <a href="http://www.allegro.cc/manual/fixed" target="_blank"><span class="a">fixed</span></a> cx, <a href="http://www.allegro.cc/manual/fixed" 
target="_blank"><span class="a">fixed</span></a> cy, CAMERA_PARAMS camera<span class="k2">)</span></td></tr><tr><td class="number">20</td><td><span class="k2">{</span></td></tr><tr><td class="number">21</td><td>    <span class="k1">int</span> screen_x, screen_y<span class="k2">;</span></td></tr><tr><td class="number">22</td><td>&#160;</td></tr><tr><td class="number">23</td><td>    <a href="http://www.allegro.cc/manual/fixed" target="_blank"><span class="a">fixed</span></a> distance, horizontal_scale, line_dx, line_dy, space_x, space_y<span class="k2">;</span></td></tr><tr><td class="number">24</td><td>&#160;</td></tr><tr><td class="number">25</td><td>    <span class="k1">int</span> mask_x <span class="k3">=</span> <span class="k2">(</span>tile-&gt;w <span class="k3">-</span> <span class="n">1</span><span class="k2">)</span><span class="k2">;</span></td></tr><tr><td class="number">26</td><td>    <span class="k1">int</span> mask_y <span class="k3">=</span> <span class="k2">(</span>tile-&gt;h <span class="k3">-</span> <span class="n">1</span><span class="k2">)</span><span class="k2">;</span></td></tr><tr><td class="number">27</td><td>        </td></tr><tr><td class="number">28</td><td>    <span class="k1">for</span> <span class="k2">(</span>screen_y <span class="k3">=</span> <span class="n">75</span><span class="k2">;</span> screen_y <span class="k3">&lt;</span> bmp-&gt;h<span class="k2">;</span> screen_y<span class="k3">+</span><span class="k3">+</span><span class="k2">)</span></td></tr><tr><td class="number">29</td><td>    <span class="k2">{</span></td></tr><tr><td class="number">30</td><td>        </td></tr><tr><td class="number">31</td><td>        distance <span class="k3">=</span> fdiv <span class="k2">(</span>fmul <span class="k2">(</span>camera.z, camera.s_y<span class="k2">)</span>, <a href="http://www.allegro.cc/manual/itofix" target="_blank"><span class="a">itofix</span></a> <span class="k2">(</span>screen_y <span class="k3">+</span> camera.horizon<span 
class="k2">)</span><span class="k2">)</span><span class="k2">;</span></td></tr><tr><td class="number">32</td><td>        horizontal_scale <span class="k3">=</span> fdiv <span class="k2">(</span>distance, camera.s_x<span class="k2">)</span><span class="k2">;</span></td></tr><tr><td class="number">33</td><td>&#160;</td></tr><tr><td class="number">34</td><td>        line_dx <span class="k3">=</span> fmul <span class="k2">(</span><span class="k3">-</span>fsin<span class="k2">(</span>angle<span class="k2">)</span>, horizontal_scale<span class="k2">)</span><span class="k2">;</span></td></tr><tr><td class="number">35</td><td>        line_dy <span class="k3">=</span> fmul <span class="k2">(</span>fcos<span class="k2">(</span>angle<span class="k2">)</span>, horizontal_scale<span class="k2">)</span><span class="k2">;</span></td></tr><tr><td class="number">36</td><td>&#160;</td></tr><tr><td class="number">37</td><td>        space_x <span class="k3">=</span> cx <span class="k3">+</span> fmul <span class="k2">(</span>distance, fcos<span class="k2">(</span>angle<span class="k2">)</span><span class="k2">)</span> <span class="k3">-</span> bmp-&gt;w<span class="k3">/</span><span class="n">2</span> <span class="k3">*</span> line_dx<span class="k2">;</span></td></tr><tr><td class="number">38</td><td>        space_y <span class="k3">=</span> cy <span class="k3">+</span> fmul <span class="k2">(</span>distance, fsin<span class="k2">(</span>angle<span class="k2">)</span><span class="k2">)</span> <span class="k3">-</span> bmp-&gt;w<span class="k3">/</span><span class="n">2</span> <span class="k3">*</span> line_dy<span class="k2">;</span></td></tr><tr><td class="number">39</td><td>&#160;</td></tr><tr><td class="number">40</td><td>        <span class="k1">for</span> <span class="k2">(</span>screen_x <span class="k3">=</span> <span class="n">0</span><span class="k2">;</span> screen_x <span class="k3">&lt;</span> bmp-&gt;w<span class="k2">;</span> screen_x<span class="k3">+</span><span 
class="k3">+</span><span class="k2">)</span></td></tr><tr><td class="number">41</td><td>        <span class="k2">{</span></td></tr><tr><td class="number">42</td><td>            </td></tr><tr><td class="number">43</td><td>            <span class="c">/*if( screen_x == mouse_x &amp;&amp; screen_y == mouse_y ) { </span></td></tr><tr><td class="number">44</td><td><span class="c">                application.m_x = 160 - fixtoi (space_x); </span></td></tr><tr><td class="number">45</td><td><span class="c">                application.m_y = 100 - fixtoi (space_y);</span></td></tr><tr><td class="number">46</td><td><span class="c">            }*/</span></td></tr><tr><td class="number">47</td><td>            </td></tr><tr><td class="number">48</td><td>            fast_putpixel <span class="k2">(</span>bmp, screen_x, screen_y,</td></tr><tr><td class="number">49</td><td>                     <span class="k2">(</span>       </td></tr><tr><td class="number">50</td><td>                     </td></tr><tr><td class="number">51</td><td>                                fast_getpixel <span class="k2">(</span>tile,</td></tr><tr><td class="number">52</td><td>                                   <a href="http://www.allegro.cc/manual/fixtoi" target="_blank"><span class="a">fixtoi</span></a> <span class="k2">(</span>space_x<span class="k2">)</span> <span class="k3">&amp;</span> mask_x,</td></tr><tr><td class="number">53</td><td>                                   <a href="http://www.allegro.cc/manual/fixtoi" target="_blank"><span class="a">fixtoi</span></a> <span class="k2">(</span>space_y<span class="k2">)</span> <span class="k3">&amp;</span> mask_y </td></tr><tr><td class="number">54</td><td>                                <span class="k2">)</span>  </td></tr><tr><td class="number">55</td><td>&#160;</td></tr><tr><td class="number">56</td><td>                     <span class="k2">)</span></td></tr><tr><td class="number">57</td><td>            <span class="k2">)</span><span 
class="k2">;</span></td></tr><tr><td class="number">58</td><td>&#160;</td></tr><tr><td class="number">59</td><td>            space_x <span class="k3">+</span><span class="k3">=</span> line_dx<span class="k2">;</span> space_y <span class="k3">+</span><span class="k3">=</span> line_dy<span class="k2">;</span></td></tr><tr><td class="number">60</td><td>            </td></tr><tr><td class="number">61</td><td>        <span class="k2">}</span></td></tr><tr><td class="number">62</td><td>&#160;</td></tr><tr><td class="number">63</td><td>    <span class="k2">}</span></td></tr><tr><td class="number">64</td><td>    </td></tr><tr><td class="number">65</td><td>&#160;</td></tr><tr><td class="number">66</td><td><span class="k2">}</span></td></tr></tbody></table></div></div><p>

fast_getpixel and fast_putpixel take up most of the execution time, along with fixtoi. I&#39;d really like to optimize this section of code. I had briefly considered a jump to hardware acceleration. Would this be of much use? Most of my project is already written in C. How much of a headache would it be to introduce something like OpenLayer? Would I see a big speed increase?</p><p>Any help, whether with optimization or otherwise, is appreciated!
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (ngiacomelli)</author>
		<pubDate>Fri, 08 Dec 2006 03:15:43 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>Why, specifically, do you need individual pixel writes?  Are you using them for a particle system?
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (TeamTerradactyl)</author>
		<pubDate>Fri, 08 Dec 2006 03:20:32 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>The code is based on an ancient (and legendary) Allegro-specific tutorial, which featured a Mode7 example that worked by grabbing and plotting individual pixels for the Mode7 projection.
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (ngiacomelli)</author>
		<pubDate>Fri, 08 Dec 2006 03:27:19 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><div class="quote_container"><div class="title">Quote:</div><div class="quote"><p>
fast_getpixel and fast_putpixel take up most of the execution time, along with fixtoi.
</p></div></div><p>
They should take up most of the execution time — they&#39;re doing all the work! Your compiler might be missing one obvious optimisation though. You and I can see that screen_y doesn&#39;t change inside the innermost loop, but because you&#39;re using inline functions the compiler possibly can&#39;t do anything about that. Have you considered something more like the following?
</p><div class="source-code"><div class="toolbar"></div><div class="inner"><table width="100%"><tbody><tr><td class="number">1</td><td>    <span class="k1">int</span> screen_x, screen_y<span class="k2">;</span></td></tr><tr><td class="number">2</td><td>&#160;</td></tr><tr><td class="number">3</td><td>    <a href="http://www.allegro.cc/manual/fixed" target="_blank"><span class="a">fixed</span></a> distance, horizontal_scale, line_dx, line_dy, space_x, space_y<span class="k2">;</span></td></tr><tr><td class="number">4</td><td>&#160;</td></tr><tr><td class="number">5</td><td>    <span class="k1">int</span> mask_x <span class="k3">=</span> <span class="k2">(</span>tile-&gt;w <span class="k3">-</span> <span class="n">1</span><span class="k2">)</span><span class="k2">;</span></td></tr><tr><td class="number">6</td><td>    <span class="k1">int</span> mask_y <span class="k3">=</span> <span class="k2">(</span>tile-&gt;h <span class="k3">-</span> <span class="n">1</span><span class="k2">)</span><span class="k2">;</span></td></tr><tr><td class="number">7</td><td>        </td></tr><tr><td class="number">8</td><td>    <span class="k1">for</span> <span class="k2">(</span>screen_y <span class="k3">=</span> <span class="n">75</span><span class="k2">;</span> screen_y <span class="k3">&lt;</span> bmp-&gt;h<span class="k2">;</span> screen_y<span class="k3">+</span><span class="k3">+</span><span class="k2">)</span></td></tr><tr><td class="number">9</td><td>    <span class="k2">{</span></td></tr><tr><td class="number">10</td><td>        </td></tr><tr><td class="number">11</td><td>        distance <span class="k3">=</span> fdiv <span class="k2">(</span>fmul <span class="k2">(</span>camera.z, camera.s_y<span class="k2">)</span>, <a href="http://www.allegro.cc/manual/itofix" target="_blank"><span class="a">itofix</span></a> <span class="k2">(</span>screen_y <span class="k3">+</span> camera.horizon<span class="k2">)</span><span class="k2">)</span><span class="k2">;</span></td></tr><tr><td 
class="number">12</td><td>        horizontal_scale <span class="k3">=</span> fdiv <span class="k2">(</span>distance, camera.s_x<span class="k2">)</span><span class="k2">;</span></td></tr><tr><td class="number">13</td><td>&#160;</td></tr><tr><td class="number">14</td><td>        line_dx <span class="k3">=</span> fmul <span class="k2">(</span><span class="k3">-</span>fsin<span class="k2">(</span>angle<span class="k2">)</span>, horizontal_scale<span class="k2">)</span><span class="k2">;</span></td></tr><tr><td class="number">15</td><td>        line_dy <span class="k3">=</span> fmul <span class="k2">(</span>fcos<span class="k2">(</span>angle<span class="k2">)</span>, horizontal_scale<span class="k2">)</span><span class="k2">;</span></td></tr><tr><td class="number">16</td><td>&#160;</td></tr><tr><td class="number">17</td><td>        space_x <span class="k3">=</span> cx <span class="k3">+</span> fmul <span class="k2">(</span>distance, fcos<span class="k2">(</span>angle<span class="k2">)</span><span class="k2">)</span> <span class="k3">-</span> bmp-&gt;w<span class="k3">/</span><span class="n">2</span> <span class="k3">*</span> line_dx<span class="k2">;</span></td></tr><tr><td class="number">18</td><td>        space_y <span class="k3">=</span> cy <span class="k3">+</span> fmul <span class="k2">(</span>distance, fsin<span class="k2">(</span>angle<span class="k2">)</span><span class="k2">)</span> <span class="k3">-</span> bmp-&gt;w<span class="k3">/</span><span class="n">2</span> <span class="k3">*</span> line_dy<span class="k2">;</span></td></tr><tr><td class="number">19</td><td>&#160;</td></tr><tr><td class="number">20</td><td>        <span class="k1">long</span> <span class="k3">*</span>targetptr <span class="k3">=</span> <span class="k2">(</span><span class="k1">long</span> <span class="k3">*</span><span class="k2">)</span>bmp-&gt;line<span class="k2">[</span>screen_y<span class="k2">]</span><span class="k2">;</span></td></tr><tr><td 
class="number">21</td><td>&#160;</td></tr><tr><td class="number">22</td><td>        <span class="k1">for</span> <span class="k2">(</span>screen_x <span class="k3">=</span> <span class="n">0</span><span class="k2">;</span> screen_x <span class="k3">&lt;</span> bmp-&gt;w<span class="k2">;</span> screen_x<span class="k3">+</span><span class="k3">+</span><span class="k2">)</span></td></tr><tr><td class="number">23</td><td>        <span class="k2">{</span></td></tr><tr><td class="number">24</td><td>            </td></tr><tr><td class="number">25</td><td>            <span class="k3">*</span>targetptr<span class="k3">+</span><span class="k3">+</span> <span class="k3">=</span></td></tr><tr><td class="number">26</td><td>                     <span class="k2">(</span>       </td></tr><tr><td class="number">27</td><td>                     </td></tr><tr><td class="number">28</td><td>                                fast_getpixel <span class="k2">(</span>tile,</td></tr><tr><td class="number">29</td><td>                                   <a href="http://www.allegro.cc/manual/fixtoi" target="_blank"><span class="a">fixtoi</span></a> <span class="k2">(</span>space_x<span class="k2">)</span> <span class="k3">&amp;</span> mask_x,</td></tr><tr><td class="number">30</td><td>                                   <a href="http://www.allegro.cc/manual/fixtoi" target="_blank"><span class="a">fixtoi</span></a> <span class="k2">(</span>space_y<span class="k2">)</span> <span class="k3">&amp;</span> mask_y </td></tr><tr><td class="number">31</td><td>                                <span class="k2">)</span>  </td></tr><tr><td class="number">32</td><td>&#160;</td></tr><tr><td class="number">33</td><td>                     <span class="k2">)</span></td></tr><tr><td class="number">34</td><td>            <span class="k2">;</span></td></tr><tr><td class="number">35</td><td>&#160;</td></tr><tr><td class="number">36</td><td>            space_x <span class="k3">+</span><span class="k3">=</span> 
line_dx<span class="k2">;</span> space_y <span class="k3">+</span><span class="k3">=</span> line_dy<span class="k2">;</span></td></tr><tr><td class="number">37</td><td>        <span class="k2">}</span></td></tr><tr><td class="number">38</td><td>    <span class="k2">}</span></td></tr></tbody></table></div></div><p>
Also you might shave a few cycles by dumping fixtoi in favour of some simple &gt;&gt; 16s. fixtoi is bound to do something more like (x + 32768) &gt;&gt; 16 or some other rounding method, whereas truncation isn&#39;t going to be noticeably different for your application. I think the speed-ups are likely to be insubstantial though, at best.
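</p><p>A minimal sketch of both suggestions together, assuming nothing about Allegro itself: render_scanline, fixed16 and the plain uint32_t arrays are hypothetical stand-ins for one pass of the inner loop, the destination row and the tile bitmap. The row pointer is passed in once per scanline and fixtoi is replaced by a truncating &gt;&gt; 16. It uses uint32_t rather than long because long is 64 bits wide on many current platforms, and it assumes 32-bit pixels and power-of-two tile dimensions, as the masks in the original code do.</p>

```c
#include <stdint.h>

/* Assumed 16.16 fixed point: 16 bits of whole part, 16 bits of fraction. */
typedef int32_t fixed16;

/*
 * Hypothetical stand-in for one scanline of the Mode 7 inner loop.
 *   dest_row : first pixel of the destination row (fetched once per row)
 *   tile     : tile_w * tile_h pixels, row-major, dimensions powers of two
 *   space_*  : starting texture coordinates, line_d* : per-pixel steps
 * Right-shifting a negative value is implementation-defined in C; this
 * sketch relies on the common arithmetic-shift behaviour for that case.
 */
static void render_scanline(uint32_t *dest_row, int dest_w,
                            const uint32_t *tile, int tile_w, int tile_h,
                            fixed16 space_x, fixed16 space_y,
                            fixed16 line_dx, fixed16 line_dy)
{
    const int mask_x = tile_w - 1;
    const int mask_y = tile_h - 1;
    int screen_x;

    for (screen_x = 0; screen_x < dest_w; screen_x++) {
        /* Truncate to whole texels, then wrap into the tile. */
        int tx = (space_x >> 16) & mask_x;
        int ty = (space_y >> 16) & mask_y;

        *dest_row++ = tile[ty * tile_w + tx];

        space_x += line_dx;
        space_y += line_dy;
    }
}
```

<p>In the real loop, dest_row would be the bmp-&gt;line[screen_y] pointer, cast and fetched once per value of screen_y.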
</p><div class="quote_container"><div class="title">Quote:</div><div class="quote"><p>
How much of a headache would it be to introduce something like OpenLayer? Would I see a big speed increase?
</p></div></div><p>
I can&#39;t answer the headache question, but I can tell you the speed difference would be gigantic. A Mode-7 floor that isn&#39;t using a tilemap like yours can be done in a single quad, which the GPU will do for you.
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Thomas Harte)</author>
		<pubDate>Fri, 08 Dec 2006 03:31:06 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><div class="quote_container"><div class="title">Quote:</div><div class="quote"><p>
Also you might shave a few cycles by dumping fixtoi in favour of some simple &gt;&gt; 16s. fixtoi is bound to do something more like (x + 32768) &gt;&gt; 16 or some other rounding method, whereas truncation isn&#39;t going to be noticeably different for your application.
</p></div></div><p>

I&#39;ve never done any bitshifting (I believe that&#39;s the terminology). Could you explain?</p><p>Also, I&#39;m talking headache in terms of: updating my existing implementation, setting up OpenGL and writing code for it (having never done so).
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (ngiacomelli)</author>
		<pubDate>Fri, 08 Dec 2006 03:36:47 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>Bitshifting:
</p><div class="source-code snippet"><div class="inner"><pre><span class="k1">int</span> a <span class="k3">=</span> <span class="n">128</span><span class="k2">;</span>
<span class="k1">int</span> b<span class="k2">;</span>

b <span class="k3">=</span> a <span class="k3">&gt;</span><span class="k3">&gt;</span> <span class="n">1</span><span class="k2">;</span> <span class="c">// 'b' now equals 128 / 2</span>
b <span class="k3">=</span> a <span class="k3">&gt;</span><span class="k3">&gt;</span> <span class="n">2</span><span class="k2">;</span> <span class="c">// 'b' now equals (128 / 2) / 2</span>
b <span class="k3">=</span> a <span class="k3">&gt;</span><span class="k3">&gt;</span> <span class="n">3</span><span class="k2">;</span> <span class="c">// 'b' now equals ((128 / 2) / 2) / 2</span>
</pre></div></div><p>

Each time you bit-shift to the right, you divide by two.  It&#39;s called bit-shifting since it&#39;s noticeable in binary:</p><div class="source-code snippet"><div class="inner"><pre>a <span class="k3">=</span> <span class="n">10000000</span>

b <span class="k3">=</span> a <span class="k3">&gt;</span><span class="k3">&gt;</span> <span class="n">1</span><span class="k2">:</span>  <span class="n">01000000</span> <span class="c">// Shifted to the right by 1, so it now equals 64</span>
b <span class="k3">=</span> a <span class="k3">&gt;</span><span class="k3">&gt;</span> <span class="n">2</span><span class="k2">:</span>  <span class="n">00100000</span> <span class="c">// Shifted to the right by 2, so it now equals 32</span>
b <span class="k3">=</span> a <span class="k3">&gt;</span><span class="k3">&gt;</span> <span class="n">3</span><span class="k2">:</span>  <span class="n">00010000</span> <span class="c">// Shifted to the right by 3, so it now equals 16</span>
b <span class="k3">=</span> a <span class="k3">&gt;</span><span class="k3">&gt;</span> <span class="n">7</span><span class="k2">:</span>  <span class="n">00000001</span> <span class="c">// Shifted to the right by 7, so it now equals 1</span>
b <span class="k3">=</span> a <span class="k3">&gt;</span><span class="k3">&gt;</span> <span class="n">8</span><span class="k2">:</span>  <span class="n">00000000</span> <span class="c">// Now it equals 0</span>
</pre></div></div><p>
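As a small sketch of the same idea in compilable C (shift_right and halve_repeatedly are hypothetical helper names, and the values are unsigned so that the comparison with division is exact):</p>

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: shift an unsigned value right by `places` bits. */
static unsigned shift_right(unsigned value, unsigned places)
{
    /* Shifting by the full width of the type or more is undefined in C. */
    assert(places < sizeof(unsigned) * CHAR_BIT);
    return value >> places;
}

/* Hypothetical helper: the same result via repeated division by 2. */
static unsigned halve_repeatedly(unsigned value, unsigned places)
{
    while (places-- > 0)
        value /= 2;
    return value;
}
```

<p>With these, shift_right(128, 3) and halve_repeatedly(128, 3) both give 16.</p>
<p>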

Now that I&#39;ve written this, I hope you don&#39;t already know what this is and were looking for <i>implementation</i> instead of a &quot;what&#39;s this?&quot;  <img src="http://www.allegro.cc/forums/smileys/grin.gif" alt=";D" />
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (TeamTerradactyl)</author>
		<pubDate>Fri, 08 Dec 2006 03:44:35 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><div class="quote_container"><div class="title">Quote:</div><div class="quote"><p>
I&#39;ve never done any bitshifting (I believe that&#39;s the terminology). Could you explain?
</p></div></div><p>
Bitshifting is the same as moving the decimal point with normal numbers. If you do &lt;&lt; 1 then you move all the binary digits in your integer number one place to the left. If you do &gt;&gt; 1 then you move all the binary digits in your integer number one place to the right.</p><p>Because of the way binary works, if you do &lt;&lt; 1 then you do the same thing as multiplying the number by 2. If you do &lt;&lt; 2 then you do the same thing as multiplying the number by 4. That&#39;s exactly the same as the way that if you took an ordinary decimal number and moved all the digits one place to the left then you&#39;d do the same thing as multiplying the number by 10. If you move all the digits two places to the left then you do the same thing as multiplying by 100.</p><p>Similarly &gt;&gt; 1 is like a divide by 2, &gt;&gt; 2 is like a divide by 4, &gt;&gt; 3 is like a divide by 8, etc. They&#39;re not identical this time because the rounding differs. For example, in decimal, if you had the number 9 and you moved it one place to the right you&#39;d have 0.9. If someone told you to round it to an integer you&#39;d say the answer was 1. With shifting you get truncation, not rounding; i.e. any digits that fall off the end of the memory spot reserved for the integer are just forgotten.</p><p>Fixed numbers are really just integers that pretend to have a decimal point in the middle. Think of it like the difference in decimal between millimetres and metres. If you can only store integers and stick with one scale then you can only store distances like 1 metre, 2 metres, etc. If however you keep some measurements on a millimetre scale and convert them to metres when you need them then you can store with much greater precision.</p><p>Fixed numbers are exactly the same, except that in the realm of binary it&#39;s easier to divide a whole unit into units of 1/65536 rather than the 1/1000 in the millimetre example. 
So a fixed number is just an integer that uses a different scale.</p><p>So the correct way to convert a fixed to an integer is to divide by 65536, just as the correct way to convert from a millimetre to a metre is to divide by 1000. A time-saving quick fix is to just shift the binary digits right by 16 places. You don&#39;t get exactly the right answer, but you save a tiny little bit of work.</p><p>In the millimetres example, if you just shifted then you&#39;d see that 0 to 999 millimetres are considered to be 0 metres, 1000 to 1999 millimetres are considered to be 1 metre, etc. So a way to do the conversion correctly with shifting would be &quot;add 500, then shift&quot;. That&#39;s no more than the rounding you learnt in school: if the digit after the one you are interested in is 5 or more then the one you are interested in goes up by 1. If it isn&#39;t then it doesn&#39;t.</p><p>So another way to look at it is that all you do by just shifting rather than shift+adding is move the pixels on your floor texture map half a pixel to the right and half a pixel down. Which is barely any different.
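</p><p>To make the difference concrete, here is a sketch assuming a 16.16 layout stored in a 32-bit integer. The names fixed16, int_to_fixed, fixed_to_int_trunc and fixed_to_int_round are made up for illustration and are not Allegro&#39;s API.</p>

```c
#include <stdint.h>

/* Assumed 16.16 layout: upper 16 bits whole part, lower 16 bits fraction. */
typedef int32_t fixed16;

#define FIXED_ONE 65536 /* the value 1.0 in 16.16 */

/* Hypothetical helper: convert a small whole number to 16.16. */
static fixed16 int_to_fixed(int whole)
{
    return (fixed16)(whole * FIXED_ONE);
}

/* Truncating conversion: just drop the 16 fraction bits. */
static int fixed_to_int_trunc(fixed16 value)
{
    return (int)(value >> 16);
}

/* Rounding conversion: add half a unit (32768) first, then shift. */
static int fixed_to_int_round(fixed16 value)
{
    return (int)((value + 32768) >> 16);
}
```

<p>For 1.75 (stored as 114688) the truncating version gives 1 and the rounding version gives 2; for 1.25 (stored as 81920) both give 1.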
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Thomas Harte)</author>
		<pubDate>Fri, 08 Dec 2006 03:48:23 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>Fantastic explanations. Thank you both. Thomas Harte, you seem to be something of an authority when it comes to this sort of stuff. There&#39;s been a slight speed increase with your suggestions, but I&#39;m really looking to get as much out of this as I possibly can. Is there any way of improving my rendering method?
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (ngiacomelli)</author>
		<pubDate>Fri, 08 Dec 2006 04:17:47 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>Avoiding function calls can help. I think that if you replace each fast_putpixel/fast_getpixel call with the single line inside the function, it should help, especially if you have a lot of calls.</p><p>Also, you should profile your whole program with something like gprof to see exactly where you lose time.
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (GullRaDriel)</author>
		<pubDate>Fri, 08 Dec 2006 04:33:32 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><div class="quote_container"><div class="title">Quote:</div><div class="quote"><p>
Each time you bit-shift to the right, you divide by two. It&#39;s called bit-shifting since it&#39;s noticeable in binary:
</p></div></div><p>
I just thought I&#39;d point out a very minor detail here: when you divide by 2 using bitshifting, the rounding is towards negative infinity (i.e. 1 &gt;&gt; 1 is 0, but -1 &gt;&gt; 1 is -1, at least on the usual compilers that shift negative numbers arithmetically), whereas when you divide using regular integer math the result is rounded towards 0 (1/2 = 0, -1/2 = 0).  </p><p>Thomas Harte earlier posted this:
</p><div class="source-code"><div class="toolbar"></div><div class="inner"><table width="100%"><tbody>
<tr><td class="number">1</td><td>    <span class="k1">int</span> screen_x, screen_y<span class="k2">;</span></td></tr>
<tr><td class="number">2</td><td>&#160;</td></tr>
<tr><td class="number">3</td><td>    <a href="http://www.allegro.cc/manual/fixed" target="_blank"><span class="a">fixed</span></a> distance, horizontal_scale, line_dx, line_dy, space_x, space_y<span class="k2">;</span></td></tr>
<tr><td class="number">4</td><td>&#160;</td></tr>
<tr><td class="number">5</td><td>    <span class="k1">int</span> mask_x <span class="k3">=</span> <span class="k2">(</span>tile-&gt;w <span class="k3">-</span> <span class="n">1</span><span class="k2">)</span><span class="k2">;</span></td></tr>
<tr><td class="number">6</td><td>    <span class="k1">int</span> mask_y <span class="k3">=</span> <span class="k2">(</span>tile-&gt;h <span class="k3">-</span> <span class="n">1</span><span class="k2">)</span><span class="k2">;</span></td></tr>
<tr><td class="number">7</td><td>&#160;</td></tr>
<tr><td class="number">8</td><td>    <span class="k1">for</span> <span class="k2">(</span>screen_y <span class="k3">=</span> <span class="n">75</span><span class="k2">;</span> screen_y <span class="k3">&lt;</span> bmp-&gt;h<span class="k2">;</span> screen_y<span class="k3">+</span><span class="k3">+</span><span class="k2">)</span></td></tr>
<tr><td class="number">9</td><td>    <span class="k2">{</span></td></tr>
<tr><td class="number">10</td><td>        distance <span class="k3">=</span> fdiv <span class="k2">(</span>fmul <span class="k2">(</span>camera.z, camera.s_y<span class="k2">)</span>, <a href="http://www.allegro.cc/manual/itofix" target="_blank"><span class="a">itofix</span></a> <span class="k2">(</span>screen_y <span class="k3">+</span> camera.horizon<span class="k2">)</span><span class="k2">)</span><span class="k2">;</span></td></tr>
<tr><td class="number">11</td><td>        horizontal_scale <span class="k3">=</span> fdiv <span class="k2">(</span>distance, camera.s_x<span class="k2">)</span><span class="k2">;</span></td></tr>
<tr><td class="number">12</td><td>&#160;</td></tr>
<tr><td class="number">13</td><td>        line_dx <span class="k3">=</span> fmul <span class="k2">(</span><span class="k3">-</span>fsin<span class="k2">(</span>angle<span class="k2">)</span>, horizontal_scale<span class="k2">)</span><span class="k2">;</span></td></tr>
<tr><td class="number">14</td><td>        line_dy <span class="k3">=</span> fmul <span class="k2">(</span>fcos<span class="k2">(</span>angle<span class="k2">)</span>, horizontal_scale<span class="k2">)</span><span class="k2">;</span></td></tr>
<tr><td class="number">15</td><td>&#160;</td></tr>
<tr><td class="number">16</td><td>        space_x <span class="k3">=</span> cx <span class="k3">+</span> fmul <span class="k2">(</span>distance, fcos<span class="k2">(</span>angle<span class="k2">)</span><span class="k2">)</span> <span class="k3">-</span> bmp-&gt;w<span class="k3">/</span><span class="n">2</span> <span class="k3">*</span> line_dx<span class="k2">;</span></td></tr>
<tr><td class="number">17</td><td>        space_y <span class="k3">=</span> cy <span class="k3">+</span> fmul <span class="k2">(</span>distance, fsin<span class="k2">(</span>angle<span class="k2">)</span><span class="k2">)</span> <span class="k3">-</span> bmp-&gt;w<span class="k3">/</span><span class="n">2</span> <span class="k3">*</span> line_dy<span class="k2">;</span></td></tr>
<tr><td class="number">18</td><td>&#160;</td></tr>
<tr><td class="number">19</td><td>        <span class="k1">long</span> <span class="k3">*</span>targetptr <span class="k3">=</span> <span class="k2">(</span><span class="k1">long</span> <span class="k3">*</span><span class="k2">)</span>bmp-&gt;line<span class="k2">[</span>screen_y<span class="k2">]</span><span class="k2">;</span></td></tr>
<tr><td class="number">20</td><td>&#160;</td></tr>
<tr><td class="number">21</td><td>        <span class="k1">for</span> <span class="k2">(</span>screen_x <span class="k3">=</span> <span class="n">0</span><span class="k2">;</span> screen_x <span class="k3">&lt;</span> bmp-&gt;w<span class="k2">;</span> screen_x<span class="k3">+</span><span class="k3">+</span><span class="k2">)</span></td></tr>
<tr><td class="number">22</td><td>        <span class="k2">{</span></td></tr>
<tr><td class="number">23</td><td>            <span class="k3">*</span>targetptr<span class="k3">+</span><span class="k3">+</span> <span class="k3">=</span> fast_getpixel <span class="k2">(</span>tile,</td></tr>
<tr><td class="number">24</td><td>                                          <a href="http://www.allegro.cc/manual/fixtoi" target="_blank"><span class="a">fixtoi</span></a> <span class="k2">(</span>space_x<span class="k2">)</span> <span class="k3">&amp;</span> mask_x,</td></tr>
<tr><td class="number">25</td><td>                                          <a href="http://www.allegro.cc/manual/fixtoi" target="_blank"><span class="a">fixtoi</span></a> <span class="k2">(</span>space_y<span class="k2">)</span> <span class="k3">&amp;</span> mask_y<span class="k2">)</span><span class="k2">;</span></td></tr>
<tr><td class="number">26</td><td>&#160;</td></tr>
<tr><td class="number">27</td><td>            space_x <span class="k3">+</span><span class="k3">=</span> line_dx<span class="k2">;</span> space_y <span class="k3">+</span><span class="k3">=</span> line_dy<span class="k2">;</span></td></tr>
<tr><td class="number">28</td><td>        <span class="k2">}</span></td></tr>
<tr><td class="number">29</td><td>    <span class="k2">}</span></td></tr>
</tbody></table></div></div><p>
That&#39;s probably the biggest speed boost possible without doing something drastic (like switching color depths or using hardware acceleration).  </p><p>He also mentioned replacing the fixtoi calls with simple right-shifts (by 16 bits) for further minute speedups.  I&#39;d just like to add that if you want identical rounding results you can add 32768 (one half in 16.16 fixed point) to space_x and space_y prior to the loop.  Furthermore, it&#39;s possible to optimize away the bitwise-and operators by left-shifting space_x, line_dx, space_y, and line_dy prior to the loop so that integer overflow performs the same wrap-around.  However, the resulting speed improvement would probably be negligible, as bitwise-and is extremely fast; in fact, I don&#39;t actually recommend implementing this optimization, as the improvement should be almost too small to measure.  The optimized version would look something like this:
</p><div class="source-code"><div class="toolbar"></div><div class="inner"><table width="100%"><tbody>
<tr><td class="number">1</td><td><span class="k1">int</span> fast_ilog2<span class="k2">(</span><span class="k1">unsigned</span> <span class="k1">int</span> x<span class="k2">)</span> <span class="k2">{</span></td></tr>
<tr><td class="number">2</td><td><span class="c">//returns 0 if x is 0 (there is no correct answer if x is zero)</span></td></tr>
<tr><td class="number">3</td><td>  <span class="k1">int</span> r <span class="k3">=</span> <span class="n">0</span><span class="k2">;</span></td></tr>
<tr><td class="number">4</td><td>  <span class="k1">if</span> <span class="k2">(</span>x <span class="k3">&amp;</span> <span class="n">0xFFFF0000</span><span class="k2">)</span> <span class="k2">{</span>r<span class="k3">+</span><span class="k3">=</span><span class="n">16</span><span class="k2">;</span> x <span class="k3">&gt;</span><span class="k3">&gt;</span><span class="k3">=</span> <span class="n">16</span><span class="k2">;</span><span class="k2">}</span></td></tr>
<tr><td class="number">5</td><td>  <span class="k1">if</span> <span class="k2">(</span>x <span class="k3">&amp;</span> <span class="n">0xFF00</span><span class="k2">)</span> <span class="k2">{</span>r<span class="k3">+</span><span class="k3">=</span><span class="n">8</span><span class="k2">;</span> x <span class="k3">&gt;</span><span class="k3">&gt;</span><span class="k3">=</span> <span class="n">8</span><span class="k2">;</span><span class="k2">}</span></td></tr>
<tr><td class="number">6</td><td>  <span class="k1">if</span> <span class="k2">(</span>x <span class="k3">&amp;</span> <span class="n">0xF0</span><span class="k2">)</span> <span class="k2">{</span>r<span class="k3">+</span><span class="k3">=</span><span class="n">4</span><span class="k2">;</span> x <span class="k3">&gt;</span><span class="k3">&gt;</span><span class="k3">=</span> <span class="n">4</span><span class="k2">;</span><span class="k2">}</span></td></tr>
<tr><td class="number">7</td><td>  <span class="k1">if</span> <span class="k2">(</span>x <span class="k3">&amp;</span> <span class="n">0xC</span><span class="k2">)</span> <span class="k2">{</span>r<span class="k3">+</span><span class="k3">=</span><span class="n">2</span><span class="k2">;</span> x <span class="k3">&gt;</span><span class="k3">&gt;</span><span class="k3">=</span> <span class="n">2</span><span class="k2">;</span><span class="k2">}</span></td></tr>
<tr><td class="number">8</td><td>  <span class="k1">if</span> <span class="k2">(</span>x <span class="k3">&amp;</span> <span class="n">0x2</span><span class="k2">)</span> <span class="k2">{</span>r<span class="k3">+</span><span class="k3">=</span><span class="n">1</span><span class="k2">;</span> x <span class="k3">&gt;</span><span class="k3">&gt;</span><span class="k3">=</span> <span class="n">1</span><span class="k2">;</span><span class="k2">}</span></td></tr>
<tr><td class="number">9</td><td>  <span class="k1">return</span> r<span class="k2">;</span></td></tr>
<tr><td class="number">10</td><td><span class="k2">}</span></td></tr>
<tr><td class="number">11</td><td>&#160;</td></tr>
<tr><td class="number">12</td><td><span class="k1">void</span> render_map <span class="k2">(</span><a href="http://www.allegro.cc/manual/BITMAP" target="_blank"><span class="a">BITMAP</span></a> <span class="k3">*</span>bmp, <a href="http://www.allegro.cc/manual/BITMAP" target="_blank"><span class="a">BITMAP</span></a> <span class="k3">*</span>tile, <a href="http://www.allegro.cc/manual/fixed" target="_blank"><span class="a">fixed</span></a> angle, <a href="http://www.allegro.cc/manual/fixed" target="_blank"><span class="a">fixed</span></a> cx, <a href="http://www.allegro.cc/manual/fixed" target="_blank"><span class="a">fixed</span></a> cy, CAMERA_PARAMS camera<span class="k2">)</span></td></tr>
<tr><td class="number">13</td><td><span class="k2">{</span></td></tr>
<tr><td class="number">14</td><td>    <a href="http://www.allegro.cc/manual/fixed" target="_blank"><span class="a">fixed</span></a> distance, horizontal_scale<span class="k2">;</span></td></tr>
<tr><td class="number">15</td><td>    <span class="k1">unsigned</span> <span class="k1">int</span> line_dx, line_dy, space_x, space_y<span class="k2">;</span> <span class="c">//unsigned, so overflow wraps around and right shifts bring in zeros</span></td></tr>
<tr><td class="number">16</td><td>    <span class="k1">int</span> screen_x, screen_y<span class="k2">;</span></td></tr>
<tr><td class="number">17</td><td>&#160;</td></tr>
<tr><td class="number">18</td><td>    <span class="k1">int</span> shift_X <span class="k3">=</span> <span class="n">32</span><span class="k3">-</span>fast_ilog2<span class="k2">(</span>tile-&gt;w<span class="k2">)</span><span class="k2">;</span><span class="c">//replace with a #defined or enumed constant</span></td></tr>
<tr><td class="number">19</td><td>    <span class="k1">int</span> shift_Y <span class="k3">=</span> <span class="n">32</span><span class="k3">-</span>fast_ilog2<span class="k2">(</span>tile-&gt;h<span class="k2">)</span><span class="k2">;</span><span class="c">//if you know what size tile is at compile-time</span></td></tr>
<tr><td class="number">20</td><td>&#160;</td></tr>
<tr><td class="number">21</td><td>    <span class="k1">for</span> <span class="k2">(</span>screen_y <span class="k3">=</span> <span class="n">75</span><span class="k2">;</span> screen_y <span class="k3">&lt;</span> bmp-&gt;h<span class="k2">;</span> screen_y<span class="k3">+</span><span class="k3">+</span><span class="k2">)</span></td></tr>
<tr><td class="number">22</td><td>    <span class="k2">{</span></td></tr>
<tr><td class="number">23</td><td>        distance <span class="k3">=</span> fdiv <span class="k2">(</span>fmul <span class="k2">(</span>camera.z, camera.s_y<span class="k2">)</span>, <a href="http://www.allegro.cc/manual/itofix" target="_blank"><span class="a">itofix</span></a> <span class="k2">(</span>screen_y <span class="k3">+</span> camera.horizon<span class="k2">)</span><span class="k2">)</span><span class="k2">;</span></td></tr>
<tr><td class="number">24</td><td>        horizontal_scale <span class="k3">=</span> fdiv <span class="k2">(</span>distance, camera.s_x<span class="k2">)</span><span class="k2">;</span></td></tr>
<tr><td class="number">25</td><td>&#160;</td></tr>
<tr><td class="number">26</td><td>        line_dx <span class="k3">=</span> fmul <span class="k2">(</span><span class="k3">-</span>fsin<span class="k2">(</span>angle<span class="k2">)</span>, horizontal_scale<span class="k2">)</span><span class="k2">;</span></td></tr>
<tr><td class="number">27</td><td>        line_dy <span class="k3">=</span> fmul <span class="k2">(</span>fcos<span class="k2">(</span>angle<span class="k2">)</span>, horizontal_scale<span class="k2">)</span><span class="k2">;</span></td></tr>
<tr><td class="number">28</td><td>&#160;</td></tr>
<tr><td class="number">29</td><td>        space_x <span class="k3">=</span> cx <span class="k3">+</span> fmul <span class="k2">(</span>distance, fcos<span class="k2">(</span>angle<span class="k2">)</span><span class="k2">)</span> <span class="k3">-</span> bmp-&gt;w<span class="k3">/</span><span class="n">2</span> <span class="k3">*</span> line_dx<span class="k2">;</span></td></tr>
<tr><td class="number">30</td><td>        space_y <span class="k3">=</span> cy <span class="k3">+</span> fmul <span class="k2">(</span>distance, fsin<span class="k2">(</span>angle<span class="k2">)</span><span class="k2">)</span> <span class="k3">-</span> bmp-&gt;w<span class="k3">/</span><span class="n">2</span> <span class="k3">*</span> line_dy<span class="k2">;</span></td></tr>
<tr><td class="number">31</td><td>&#160;</td></tr>
<tr><td class="number">32</td><td>        line_dx <span class="k3">&lt;</span><span class="k3">&lt;</span><span class="k3">=</span> shift_X-16<span class="k2">;</span></td></tr>
<tr><td class="number">33</td><td>        line_dy <span class="k3">&lt;</span><span class="k3">&lt;</span><span class="k3">=</span> shift_Y-16<span class="k2">;</span></td></tr>
<tr><td class="number">34</td><td>        space_x <span class="k3">+</span><span class="k3">=</span> <span class="n">32768</span><span class="k2">;</span> space_x <span class="k3">&lt;</span><span class="k3">&lt;</span><span class="k3">=</span> shift_X-16<span class="k2">;</span></td></tr>
<tr><td class="number">35</td><td>        space_y <span class="k3">+</span><span class="k3">=</span> <span class="n">32768</span><span class="k2">;</span> space_y <span class="k3">&lt;</span><span class="k3">&lt;</span><span class="k3">=</span> shift_Y-16<span class="k2">;</span></td></tr>
<tr><td class="number">36</td><td>        <span class="k1">long</span> <span class="k3">*</span>targetptr <span class="k3">=</span> <span class="k2">(</span><span class="k1">long</span> <span class="k3">*</span><span class="k2">)</span>bmp-&gt;line<span class="k2">[</span>screen_y<span class="k2">]</span><span class="k2">;</span></td></tr>
<tr><td class="number">37</td><td>&#160;</td></tr>
<tr><td class="number">38</td><td>        <span class="k1">for</span> <span class="k2">(</span>screen_x <span class="k3">=</span> <span class="n">0</span><span class="k2">;</span> screen_x <span class="k3">&lt;</span> bmp-&gt;w<span class="k2">;</span> screen_x<span class="k3">+</span><span class="k3">+</span><span class="k2">)</span></td></tr>
<tr><td class="number">39</td><td>        <span class="k2">{</span></td></tr>
<tr><td class="number">40</td><td>            <span class="k3">*</span>targetptr<span class="k3">+</span><span class="k3">+</span> <span class="k3">=</span> fast_getpixel <span class="k2">(</span>tile,</td></tr>
<tr><td class="number">41</td><td>                                          space_x <span class="k3">&gt;</span><span class="k3">&gt;</span> shift_X,</td></tr>
<tr><td class="number">42</td><td>                                          space_y <span class="k3">&gt;</span><span class="k3">&gt;</span> shift_Y<span class="k2">)</span><span class="k2">;</span></td></tr>
<tr><td class="number">43</td><td>&#160;</td></tr>
<tr><td class="number">44</td><td>            space_x <span class="k3">+</span><span class="k3">=</span> line_dx<span class="k2">;</span> space_y <span class="k3">+</span><span class="k3">=</span> line_dy<span class="k2">;</span></td></tr>
<tr><td class="number">45</td><td>        <span class="k2">}</span></td></tr>
<tr><td class="number">46</td><td>    <span class="k2">}</span></td></tr>
<tr><td class="number">47</td><td><span class="k2">}</span></td></tr>
</tbody></table></div></div><p>
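Note: that approach will fail if the tile has a width or height of 1, because the shift count then becomes 32, which is undefined for a 32-bit integer; use it only with tiles of 2x2 or larger. As with the mask version, the tile dimensions must be powers of two.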
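</p><p>To convince yourself that the pre-shift trick really picks the same texel as rounding and masking, here is a small stand-alone sketch in plain C (no Allegro needed). The function names are made up for illustration, and it assumes that unsigned int is 32 bits wide and that the tile size is a power of two between 2 and 65536 (log2_size from 1 to 16).</p>

```c
/* Hypothetical helpers for illustration only; they are not part of Allegro.
   Assumes a 32-bit unsigned int and 16.16 fixed-point coordinates. */

/* Texel index the conventional way: round, drop the fraction, then mask. */
unsigned int index_by_mask(unsigned int v, int log2_size)
{
    unsigned int mask = (1u << log2_size) - 1u;
    return ((v + 32768u) >> 16) & mask;
}

/* Texel index via the pre-shift trick: move the wanted integer bits to the
   top of the word so that overflow discards everything above them. */
unsigned int index_by_shift(unsigned int v, int log2_size)
{
    unsigned int pre = (v + 32768u) << (16 - log2_size);
    return pre >> (32 - log2_size);
}

/* Step along one scanline both ways; returns 1 if every index agrees. */
int scanline_indices_match(unsigned int start, unsigned int step,
                           int log2_size, int count)
{
    unsigned int plain = start;
    unsigned int pre = (start + 32768u) << (16 - log2_size);
    unsigned int pre_step = step << (16 - log2_size);
    int i;

    for (i = 0; i < count; i++) {
        if (index_by_mask(plain, log2_size) != (pre >> (32 - log2_size)))
            return 0;
        plain += step;    /* ordinary 16.16 stepping */
        pre += pre_step;  /* pre-shifted stepping, wraps on overflow */
    }
    return 1;
}
```

<p>A negative step is passed as its two&#39;s-complement bit pattern (for example 0xFFFF3EDD), which is why everything is kept unsigned: the wrap-around is then well defined and the right shift always brings in zeros.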
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (orz)</author>
		<pubDate>Fri, 08 Dec 2006 05:01:30 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>EDIT: Moved.
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (ngiacomelli)</author>
		<pubDate>Fri, 08 Dec 2006 18:05:55 +0000</pubDate>
	</item>
</rss>
