<?xml version="1.0"?>
<rss version="2.0">
	<channel>
		<title>Improvement for allegro RLE sprites</title>
		<link>http://www.allegro.cc/forums/view/571021</link>
		<description>Allegro.cc Forum Thread</description>
		<webMaster>matthew@allegro.cc (Matthew Leverton)</webMaster>
		<lastBuildDate>Fri, 17 Mar 2006 21:58:09 +0000</lastBuildDate>
	</channel>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>RLE sprites are good for isometric tile-based games (such as ufo2000). We don&#39;t need rotation or any other advanced features, and very fast blitting and reduced memory usage have their advantages here. </p><p>Unfortunately, allegro RLE sprites become less usable when we need alpha blitting and lighting effects (nice features like fire and smoke, or night mission simulation).</p><p>Here is a list of problems with RLE sprites:<br />1. When blenders are used for alpha channel or transparency, a callback is called for each pixel, which is very slow<br />2. There is no standard blender function that can both apply the alpha channel and tint the sprite to some color at the same time (alpha + lighting)<br />3. Function get_rle_sprite() is extremely slow as it uses getpixel() internally<br />4. Created RLE sprites are not optimal: each line can contain an unnecessary trailing skip run of pixels<br />5. The use of the &#39;magic pink&#39; code as the end of line marker is not a very good choice: it adds an artificial limit on the maximum length of each chunk of pixels (127). Using 0 as the marker would also create better optimization possibilities for most architectures (on x86 we need only three instructions &#39;cmp ..., 0&#39; -&gt; &#39;js&#39; -&gt; &#39;jz&#39; to select one of the three needed branches)<br />6. There seems to be a design problem in allegro: it is very hard to tell RGB and RGBA bitmaps apart in 32bpp mode (after loading a PNG or TGA picture). Maybe each bitmap should also have a flag indicating the presence or absence of an alpha channel, set by the image loaders?</p><p>As a solution to some of these problems, we have created optimized RLE sprite functions built on top of allegro RLE. They allow fast alpha blending and tinting to black. Sprites are versatile and contain an alpha channel only when needed. 
That means you can mix ordinary RLE sprites (fast blitting and low memory requirements) with sprites that really contain an alpha channel. These functions are also very portable and run well on ARM CPUs, as they were mostly developed as part of porting UFO2000 to the Nokia 770; see <a href="http://www.allegro.cc/forums/thread/561300">http://www.allegro.cc/forums/thread/561300</a> for more details. This code doesn&#39;t use any MMX extensions (for portability) and is just composed of unrolled and optimized allegro RLE functions (and shamelessly relicensed to GPL <img src="http://www.allegro.cc/forums/smileys/smiley.gif" alt=":)" /> ).</p><p>Here is the download link:<br /><a href="http://ufo2000.sourceforge.net/files/spritelib-20060305.tar.gz">http://ufo2000.sourceforge.net/files/spritelib-20060305.tar.gz</a></p><p>Here is the interface part of these sprite functions:
</p><div class="source-code snippet"><div class="inner"><pre><span class="c">/**</span>
<span class="c"> * Create a versatile sprite, which can support alpha transparency</span>
<span class="c"> * and different brightness levels.</span>
<span class="c"> */</span>
ALPHA_SPRITE <span class="k3">*</span>get_alpha_sprite<span class="k2">(</span><a href="http://www.allegro.cc/manual/BITMAP" target="_blank"><span class="a">BITMAP</span></a> <span class="k3">*</span>bmp<span class="k2">)</span><span class="k2">;</span>

<span class="c">/**</span>
<span class="c"> * Destroy alpha sprite.</span>
<span class="c"> */</span>
<span class="k1">void</span> destroy_alpha_sprite<span class="k2">(</span>ALPHA_SPRITE <span class="k3">*</span>spr<span class="k2">)</span><span class="k2">;</span>

<span class="c">/**</span>
<span class="c"> * Draws a darkened sprite with alpha transparency support. It is optimized</span>
<span class="c"> * for 16bpp mode and is faster than allegro functions.</span>
<span class="c"> *</span>
<span class="c"> * @param dst         destination bitmap</span>
<span class="c"> * @param src         source bitmap</span>
<span class="c"> * @param dx          target x coordinate</span>
<span class="c"> * @param dy          target y coordinate</span>
<span class="c"> * @param brightness  brightness (0 - black image, 255 - original unmodified image)</span>
<span class="c"> */</span>
<span class="k1">void</span> draw_alpha_sprite<span class="k2">(</span><a href="http://www.allegro.cc/manual/BITMAP" target="_blank"><span class="a">BITMAP</span></a> <span class="k3">*</span>dst, ALPHA_SPRITE <span class="k3">*</span>src, <span class="k1">int</span> dx, <span class="k1">int</span> dy, <span class="k1">unsigned</span> <span class="k1">int</span> brightness <span class="k3">=</span> <span class="n">255</span><span class="k2">)</span><span class="k2">;</span>
</pre></div></div><p>
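As a rough illustration of what combining alpha with tinting to black means for a single 16bpp pixel, here is a minimal sketch. It assumes RGB565 pixels and an 8-bit alpha value; darken565(), blend565() and lit_alpha565() are hypothetical names made up for this example and are not taken from spritelib.</p>

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only (not spritelib code).
 * Assumption: pixels are RGB565, alpha and brightness are in 0..255. */

/* Darken one pixel: brightness 0 gives black, 255 leaves it unchanged. */
uint16_t darken565(uint16_t c, unsigned brightness)
{
    unsigned r = (c >> 11) & 0x1F;
    unsigned g = (c >> 5) & 0x3F;
    unsigned b = c & 0x1F;
    r = r * brightness / 255;
    g = g * brightness / 255;
    b = b * brightness / 255;
    return (uint16_t)((r << 11) | (g << 5) | b);
}

/* Blend src over dst: alpha 0 keeps dst, alpha 255 gives src. */
uint16_t blend565(uint16_t src, uint16_t dst, unsigned alpha)
{
    unsigned sr = (src >> 11) & 0x1F, dr = (dst >> 11) & 0x1F;
    unsigned sg = (src >> 5) & 0x3F,  dg = (dst >> 5) & 0x3F;
    unsigned sb = src & 0x1F,         db = dst & 0x1F;
    unsigned r = (sr * alpha + dr * (255 - alpha)) / 255;
    unsigned g = (sg * alpha + dg * (255 - alpha)) / 255;
    unsigned b = (sb * alpha + db * (255 - alpha)) / 255;
    return (uint16_t)((r << 11) | (g << 5) | b);
}

/* Tint to black and alpha blend in one step (alpha + lighting). */
uint16_t lit_alpha565(uint16_t src, uint16_t dst,
                      unsigned alpha, unsigned brightness)
{
    return blend565(darken565(src, brightness), dst, alpha);
}
```

<p>Doing both steps in one helper that can be inlined into the drawing loop is what avoids a per-pixel blender callback; the real code may well use a different formula or cheaper fixed-point shortcuts.</p><p>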

Here are some benchmarks:
</p><div class="source-code snippet"><div class="inner"><pre>Athlon XP 2400+ (2.0 GHz, NForce2, DDR266)
gcc version 3.4.4
optimization flags: -O2 -fomit-frame-pointer
note: with '-march=athlon-xp' option added results are almost the same

"./test 16"
---
explosion sprites per second (spritelib)      = 10309.3
explosion sprites per second (allegro rle)    = 5181.3
explosion sprites per second (allegro bitmap) = 3367.0
---
normal fire sprites per second (spritelib)    = 181818.2
normal fire sprites per second (allegro rle)  = 173611.1
---
lit fire sprites per second (spritelib)       = 147492.6
lit fire sprites per second (allegro rle)     = 92592.6
---
alpha fire sprites per second (spritelib)     = 102459.0
alpha fire sprites per second (allegro rle)   = 73099.4

--------------------------------------------------------------------------------

Nokia 770 Internet Tablet (250MHz OMAP1710)
gcc version 3.3.4
optimization flags: -O2 -fomit-frame-pointer

"./test 16"
---
explosion sprites per second (spritelib)      = 823.7
explosion sprites per second (allegro rle)    = 386.8
explosion sprites per second (allegro bitmap) = 273.2
---
normal fire sprites per second (spritelib)    = 21376.7
normal fire sprites per second (allegro rle)  = 16863.4
---
lit fire sprites per second (spritelib)       = 16244.3
lit fire sprites per second (allegro rle)     = 7961.8
---
alpha fire sprites per second (spritelib)     = 13358.3
alpha fire sprites per second (allegro rle)   = 6463.3

--------------------------------------------------------------------------------

Nokia 770 Internet Tablet (250MHz OMAP1710)
gcc version 3.3.4
optimization flags: -O2 -fomit-frame-pointer -march=armv5te

"./test 16"
---
explosion sprites per second (spritelib)      = 801.9
explosion sprites per second (allegro rle)    = 388.8
explosion sprites per second (allegro bitmap) = 270.6
---
normal fire sprites per second (spritelib)    = 30413.6
normal fire sprites per second (allegro rle)  = 16683.4
---
lit fire sprites per second (spritelib)       = 20145.0
lit fire sprites per second (allegro rle)     = 7921.4
---
alpha fire sprites per second (spritelib)     = 13161.4
alpha fire sprites per second (allegro rle)   = 6379.2
</pre></div></div><p>

Todo: Improve the get_alpha_sprite() function, as it relies on allegro get_rle_sprite() and is therefore slow</p><p>There are several options for what to do next:<br />1. Convert this code into an addon library<br />2. Add some improvements to the allegro library itself, so that fast alpha blending becomes available to more <br />3. Do nothing and keep using this code only as a part of the UFO2000 project <img src="http://www.allegro.cc/forums/smileys/smiley.gif" alt=":)" /> </p><p>That&#39;s why I need your feedback and, if possible, benchmark results on different CPU architectures.
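</p><p>On the todo item and on problems 3 to 5 from the list at the top, here is a minimal sketch of a line encoder that reads pixels straight from memory, drops the trailing skip run and uses 0 as the end of line marker, together with the matching three-way decode loop. The stream layout and the names rle_encode_line() and rle_draw_line() are assumptions for this example; this is not allegro&#39;s actual RLE format and not the spritelib code.</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16bpp stream, one command word at a time:
 *   n > 0  : n solid pixels follow
 *   n < 0  : skip -n transparent pixels
 *   n == 0 : end of line
 * Command words are stored as uint16_t and read back as int16_t.
 * Assumption: width < 32768 so run lengths fit in 16 bits. */

/* Encode one scanline directly from memory (no getpixel()).
 * out needs room for 2 * width + 1 words. Returns words written. */
size_t rle_encode_line(const uint16_t *line, int width,
                       uint16_t mask, uint16_t *out)
{
    size_t n = 0;
    int x = 0;
    while (x < width) {
        int start = x;
        if (line[x] == mask) {
            while (x < width && line[x] == mask)
                x++;
            if (x == width)
                break;                      /* trailing skip run: drop it */
            out[n++] = (uint16_t)-(x - start);
        } else {
            while (x < width && line[x] != mask)
                x++;
            out[n++] = (uint16_t)(x - start);
            for (int i = start; i < x; i++)
                out[n++] = line[i];
        }
    }
    out[n++] = 0;                           /* end of line marker */
    return n;
}

/* Draw one encoded line onto dst; returns the start of the next line. */
const uint16_t *rle_draw_line(const uint16_t *p, uint16_t *dst)
{
    for (;;) {
        int cmd = (int16_t)*p++;
        if (cmd > 0) {                      /* solid run: copy pixels */
            while (cmd--)
                *dst++ = *p++;
        } else if (cmd < 0) {               /* skip run: leave dst as is */
            dst -= cmd;
        } else {                            /* 0: end of line */
            return p;
        }
    }
}
```

<p>The decode loop only ever compares the command word with zero, which is the sign/zero test mentioned in problem 5. For example, a line of 8 pixels with two transparent pixels at each end encodes without any command for the last two.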
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (serge)</author>
		<pubDate>Sun, 05 Mar 2006 18:34:36 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>Without even looking at your code I can tell you that the choice of GPL instead of the Allegro license will severely limit the acceptance of it... <img src="http://www.allegro.cc/forums/smileys/tongue.gif" alt=":P" />
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (gnolam)</author>
		<pubDate>Sun, 05 Mar 2006 20:27:36 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>Don&#39;t you think that the MMX version of color mixing in 32bpp I posted in the other thread could be used to further improve the performance? <img src="http://www.allegro.cc/forums/smileys/wink.gif" alt=";)" />
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Fladimir da Gorf)</author>
		<pubDate>Sun, 05 Mar 2006 21:35:58 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>This code is currently GPL licensed because it is part of the UFO2000 project (which is itself GPL licensed). If it becomes an addon library or gets (partially?) included into the allegro library, it will of course have the allegro license <img src="http://www.allegro.cc/forums/smileys/smiley.gif" alt=":)" /></p><p>That&#39;s why I posted it here and am waiting for feedback. If no feedback is received, it will remain a part of UFO2000 (see option 3), as I don&#39;t feel like doing useless work. The scope of this code is currently fast and portable 16bpp alpha blending for mem-&gt;mem blitting of RLE sprites (needed on the Nokia 770), and it seems to serve that purpose well. Depending on community interest, it might grow into something more useful <img src="http://www.allegro.cc/forums/smileys/smiley.gif" alt=":)" /></p><p>My intention was also to eventually prepare some patches for inclusion into allegro that improve RLE sprite support. Alpha blending is slow in allegro mostly not for lack of MMX or similar optimizations, but because it uses callbacks that slow everything down (see problem 1). Some standard blenders could probably be inlined into special versions of optimized functions, with the alpha blender as the first candidate, but that would not serve UFO2000 well, as there is no standard blender for what it needs (see problem 2). So improving the performance of the current allegro API would solve only some of the problems (but if these changes get accepted, that would be good too).</p><p>PS. A test program that verifies both performance and correctness (by comparing the results of the optimized and the standard allegro blenders) is included, so it should not be too hard to try.</p><div class="quote_container"><div class="title">Fladimir said:</div><div class="quote"><p>
Don&#39;t you think that the MMX version of color mixing in 32bpp I posted in the other thread could be used to further improve the performance? <img src="http://www.allegro.cc/forums/smileys/wink.gif" alt=";)" />
</p></div></div><p>
I&#39;ll try it, thanks <img src="http://www.allegro.cc/forums/smileys/smiley.gif" alt=":)" /> <b>edit:</b> Tried it; it really does speed up the program, but only by about 5% (maybe I just have a fast CPU with slow memory, and memory is the bottleneck). In addition, it requires changing &#39;inline&#39; to &#39;INLINE&#39; for BlendColorsNoEmms, otherwise it actually gets even slower (&#39;inline&#39; is unfortunately only a hint for gcc, and gcc seems to ignore it):
</p><div class="source-code snippet"><div class="inner"><pre><span class="p">#ifdef __GNUC__</span>
<span class="p">#define INLINE inline __attribute__((always_inline))</span>
<span class="p">#else</span>
<span class="p">#define INLINE inline</span>
<span class="p">#endif</span>
</pre></div></div><p>
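A small sketch of how such a macro might be applied to a per-pixel helper that is called from a tight loop. mix_half565() and mix_span() are hypothetical stand-ins made up for this example (a plain 50% mix of RGB565 pixels), not BlendColorsNoEmms() or anything from sprite.cpp.</p>

```c
#include <stddef.h>
#include <stdint.h>

#ifdef __GNUC__
#define INLINE inline __attribute__((always_inline))
#else
#define INLINE inline
#endif

/* Hypothetical per-pixel helper: 50% mix of two RGB565 pixels.
 * 0xF7DE clears the lowest bit of each channel, so the halved
 * channels cannot spill into their neighbours when added. */
static INLINE uint16_t mix_half565(uint16_t a, uint16_t b)
{
    return (uint16_t)(((a & 0xF7DE) >> 1) + ((b & 0xF7DE) >> 1));
}

/* With forced inlining the helper is expanded inside the loop,
 * so there is no function call per pixel. */
void mix_span(uint16_t *dst, const uint16_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = mix_half565(dst[i], src[i]);
}
```

<p>The helper is declared static so that the plain &#39;inline&#39; fallback also links cleanly when compiled as C.</p><p>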

By the way, one more improvement (not related to alpha blending, and also not very portable) is to add support for compiled sprites. When checking for clipping, there is a place in the code where we are sure that the sprite is not clipped at all. That means we can add a COMPILED_SPRITE member to the ALPHA_SPRITE struct, initialize it for sprites which do not have an alpha channel, and use it when no clipping is needed. This improves performance when no blending or tinting is required, but also increases memory requirements.
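</p><p>The selection logic could look roughly like the sketch below. The real ALPHA_SPRITE layout is not shown here, so sprite_info, clip_rect, draw_path and choose_path() are hypothetical stand-ins for illustration only.</p>

```c
#include <stdbool.h>

/* Hypothetical stand-ins, not the real ALPHA_SPRITE struct. */
typedef struct {
    int w, h;
    bool has_alpha;     /* sprite carries an alpha channel */
    bool has_compiled;  /* a compiled variant was built for it */
} sprite_info;

/* Clip rectangle; right and bottom edges are exclusive. */
typedef struct { int cl, ct, cr, cb; } clip_rect;

typedef enum { DRAW_COMPILED, DRAW_RLE, DRAW_BLENDED } draw_path;

/* True when the whole sprite lies inside the clip rectangle. */
bool fully_inside(const sprite_info *s, int dx, int dy, const clip_rect *c)
{
    return dx >= c->cl && dy >= c->ct &&
           dx + s->w <= c->cr && dy + s->h <= c->cb;
}

/* Pick the cheapest drawing path that still gives a correct result. */
draw_path choose_path(const sprite_info *s, int dx, int dy,
                      const clip_rect *c, unsigned brightness)
{
    if (s->has_alpha || brightness != 255)
        return DRAW_BLENDED;            /* needs blending or tinting */
    if (s->has_compiled && fully_inside(s, dx, dy, c))
        return DRAW_COMPILED;           /* no clipping, plain copy */
    return DRAW_RLE;                    /* clipped: RLE fallback */
}
```

<p>The compiled variant is only chosen when the sprite is known to be unclipped and needs neither blending nor tinting; everything else falls back to the RLE paths.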
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (serge)</author>
		<pubDate>Sun, 05 Mar 2006 21:41:26 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>Thanks for the INLINE hint. I would&#39;ve expected the compiler to know better, though...
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Fladimir da Gorf)</author>
		<pubDate>Sun, 05 Mar 2006 22:38:09 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><div class="quote_container"><div class="title">Fladimir said:</div><div class="quote"><p>
Thanks for the INLINE hint. I would&#39;ve expected the compiler to know better, though...
</p></div></div><p>
After a second look, it appears that I misled you somewhat. The compiler seems to inline BlendColorsNoEmms() fine in normal code.</p><p>But in my &#39;sprite.cpp&#39; I make heavy use of the &#39;__attribute__((always_inline))&#39; option to force inlining; otherwise gcc refuses to inline some of the functions containing loops, and that hurts performance a lot (maybe it would not with profile-guided optimization, but that is not convenient). Anyway, in this particular source file with lots of functions with forced inlining, gcc seems to take revenge on the BlendColorsNoEmms() function and DOES NOT inline it unless it has that &#39;__attribute__((always_inline))&#39; too <img src="http://www.allegro.cc/forums/smileys/smiley.gif" alt=":)" /> That&#39;s weird.
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (serge)</author>
		<pubDate>Mon, 06 Mar 2006 11:42:48 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>There&#39;s a limit GCC has for the size of code that it will inline. You can change this limit with some command line switch, but I don&#39;t remember what it is.
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Kitty Cat)</author>
		<pubDate>Mon, 06 Mar 2006 11:55:07 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>By default it also doesn&#39;t inline most things with loops, generally because that would make the size of the code explode to insane proportions (since loops are unrolled).
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Thomas Fjellstrom)</author>
		<pubDate>Mon, 06 Mar 2006 12:24:43 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>Well, but what about RLE sprites and allegro blenders? Is anybody using them? Or has everyone switched to OpenLayer already? <img src="http://www.allegro.cc/forums/smileys/wink.gif" alt=";)" /></p><p>It would be interesting to see benchmarks of my little test program on a Pentium 4 and a Mac.
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (serge)</author>
		<pubDate>Tue, 07 Mar 2006 02:00:25 +0000</pubDate>
	</item>
	<item>
		<description><![CDATA[<div class="mockup v2"><p>I&#39;m interested and I&#39;m going to take a look at it.<br />...<br />uhm... well, the explosion sprites I have don&#39;t have an alpha layer, so I can&#39;t use the alpha thingy, until I get a replacement from &quot;somewhere&quot;  - sorry, that can take a while.
</p></div>]]>
		</description>
		<author>no-reply@allegro.cc (Geoman)</author>
		<pubDate>Fri, 17 Mar 2006 21:58:09 +0000</pubDate>
	</item>
</rss>
