Friday, July 24, 2009

Silverlight 3 WriteableBitmap Performance

A fast API for generating bitmaps dynamically is essential for procedural image generation and for many computer games. Many Silverlight developers were therefore disappointed that WPF's WriteableBitmap wasn't available before Silverlight 3. Fortunately there was the BitmapImage.SetSource method, which takes a Stream as the bitmap source and could be used to fill an Image. I think Joe Stegman was the first to write a custom PNG generator Stream that used this mechanism and made it possible to generate procedural images with Silverlight 2. Other implementations followed.

Now that we have Silverlight 3 and the WriteableBitmap class, all these custom PNG Stream implementations have become obsolete. Still, some developers complain about the performance of the WriteableBitmap. I was curious how the custom PNG Stream implementations compare with the WriteableBitmap and how big the speed difference really is. That's why I wrote a small Silverlight 3 application that measures the frames per second of the custom PNG Stream implementations and the Silverlight 3 WriteableBitmap.

The Competitors
  1. Silverlight 3 WriteableBitmap.
  2. RawPngBufferStream from the open source game engine Balder, which I used for my Perlin Noise sample.
  3. Nikola's PngEncoder, which is an improved version of Joe Stegman's work.
  4. Ian Griffiths' SlDynamicBitmap library.

Live

The application measures the time each implementation needs to draw the "Maximum Frames" and calculates the mean frames per second (fps). The third text column shows the performance relative to the WriteableBitmap.
If the tests complete very quickly, you should increase the "Maximum Frames" to get accurate results.
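
As a rough illustration of the two numbers described above, the following sketch computes a mean fps from a frame count and an elapsed time, and a relative value against a baseline. The class and method names (FpsMeter, MeanFps, Relative, Measure) are hypothetical, and Stopwatch is used here as plain .NET; this is not the timing code of the Speedtest application, which may measure differently.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not taken from the Speedtest source.
public static class FpsMeter
{
    // Mean frames per second for a run of `frames` frames that took `elapsed`.
    public static double MeanFps(int frames, TimeSpan elapsed)
    {
        if (frames <= 0) throw new ArgumentOutOfRangeException("frames");
        if (elapsed <= TimeSpan.Zero) throw new ArgumentOutOfRangeException("elapsed");
        return frames / elapsed.TotalSeconds;
    }

    // Performance relative to a baseline: 1.0 means "as fast as the baseline",
    // 0.5 means "half as fast".
    public static double Relative(double fps, double baselineFps)
    {
        if (baselineFps <= 0.0) throw new ArgumentOutOfRangeException("baselineFps");
        return fps / baselineFps;
    }

    // Calls drawFrame maxFrames times in a tight loop and returns the mean fps.
    // Assumption: a synchronous loop; a real Silverlight test would more likely
    // be driven by a rendering callback.
    public static double Measure(Action drawFrame, int maxFrames)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < maxFrames; i++)
        {
            drawFrame();
        }
        watch.Stop();
        return MeanFps(maxFrames, watch.Elapsed);
    }
}
```

With this sketch, a short run gives a small elapsed time and therefore a noisy quotient, which is one reason why a higher "Maximum Frames" value tends to give steadier numbers.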

How it works
The image is 512 x 512 pixels. The drawing methods are executed one after another and the time of each is measured. The method CalculateColor(int x, int y) computes the color of every pixel. For this I implemented a nice old-school demoscene effect that produces an interference pattern. I was inspired by the brilliant Amiga demo State of the Art from 1992.

private void CalculateColor(int x, int y)
{
   // Nice sine circle movement. 
   int x1 = (int)(sin[FramesCount * 1] * TexSizeHalf) + TexSizeQuarter;
   int y1 = (int)(sin[FramesCount * 4] * TexSizeHalf) + TexSizeQuarter;
   int x2 = (int)(sin[FramesCount * 5] * TexSizeHalf) + TexSizeQuarter;
   int y2 = (int)(sin[FramesCount * 2] * TexSizeHalf) + TexSizeQuarter;

   // Clamped Euclidean distance as color
   // Change the multiplication Factor to get more circles 
   // Change the clamping Threshold for the space between
   int d = (x - x1) * (x - x1) + (y - y1) * (y - y1);
   r = (byte)((byte)Math.Sqrt(d << Factor) > Threshold ? 255 : 0);
   d = (x - x2) * (x - x2) + (y - y2) * (y - y2);
   b = (byte)((byte)Math.Sqrt(d << Factor) > Threshold ? 255 : 0);
   // Fill the gaps with green
   g = (byte)~(r | b);
}

The circles' center positions are animated with a sine function. For better performance I use a precalculated lookup table (LUT) here. The rings are built using the clamped Euclidean distance to the center. Each Math.Sqrt() actually produces one colored circle, which is cut into rings by the shifting and clamping. Of course this could also be done with some loops and sine / cosine or other techniques. The square roots are calculated on the fly and not stored in a LUT; otherwise the calculation would be too fast and would not represent a real use case.
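
A sine lookup table of the kind mentioned here could be built roughly as follows. This is only a sketch: the table size, the float element type and the names SineTable, Build and Lookup are assumptions, and the wraparound in Lookup is added for safety rather than taken from the original code.

```csharp
using System;

// Hypothetical sketch of a precalculated sine LUT.
public static class SineTable
{
    // Samples one full period of sin() at `size` evenly spaced points.
    public static float[] Build(int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException("size");
        float[] table = new float[size];
        for (int i = 0; i < size; i++)
        {
            table[i] = (float)Math.Sin(2.0 * Math.PI * i / size);
        }
        return table;
    }

    // Reads the table with wraparound, so an ever-growing frame counter
    // (or a negative index) never runs past the array bounds.
    public static float Lookup(float[] table, int index)
    {
        int i = index % table.Length;
        if (i < 0) i += table.Length;
        return table[i];
    }
}
```

The table is filled once at startup; after that, each per-frame lookup costs only an array read and a modulo instead of a call to Math.Sin().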
The rest of the implementation is quite simple and there's not much to explain. If you are interested in the details, please look at the source code or write a comment.
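
For readers who don't want to open the solution, here is a minimal sketch of what such a fill loop might look like when the target is an int[] of 32-bit ARGB pixels, like the buffer exposed by WriteableBitmap.Pixels. The names FrameFiller, PackArgb, Ring and Fill are hypothetical. The explicit "& 0xFF" mask is my assumption about what the direct (byte) cast in CalculateColor amounts to; it makes the wraparound that cuts the circle into rings visible.

```csharp
using System;

// Hypothetical sketch, not the Speedtest source.
public static class FrameFiller
{
    // Packs fully opaque 8-bit channels into one 32-bit ARGB value.
    public static int PackArgb(byte r, byte g, byte b)
    {
        return unchecked((int)0xFF000000) | (r << 16) | (g << 8) | b;
    }

    // Returns 255 or 0 depending on the scaled distance from a center.
    // The low 8 bits of the square root wrap around every 256 units,
    // which turns one filled circle into concentric rings.
    public static byte Ring(int dx, int dy, int factor, int threshold)
    {
        int d = dx * dx + dy * dy;
        int wrapped = (int)Math.Sqrt(d << factor) & 0xFF;
        return (byte)(wrapped > threshold ? 255 : 0);
    }

    // Fills a width x height buffer row by row with two overlapping ring sets.
    public static void Fill(int[] pixels, int width, int height,
                            int cx1, int cy1, int cx2, int cy2,
                            int factor, int threshold)
    {
        if (pixels.Length < width * height) throw new ArgumentException("pixels");
        int index = 0;
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                byte r = Ring(x - cx1, y - cy1, factor, threshold);
                byte b = Ring(x - cx2, y - cy2, factor, threshold);
                // Green fills the gaps where neither red nor blue is set.
                byte g = (byte)(~(r | b) & 0xFF);
                pixels[index++] = PackArgb(r, g, b);
            }
        }
    }
}
```

Because the alpha channel is always 255 in this sketch, it makes no difference whether the pixel format is premultiplied or not.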

Results

The WriteableBitmap is clearly the fastest implementation. I didn't expect anything else, but I had hoped it would be a bit faster. Nevertheless, the Silverlight 3 WriteableBitmap is almost twice as fast as the SlDynamicBitmap library and Balder's RawPngBufferStream.
Please keep in mind that, although I use relative values, you might see slightly different test results. Depending on the hardware, each implementation could perform better or worse.

Source code
The Visual Studio 2008 solution of the Speedtest application is available for download from here.

Update 08-06-2009
I've written a follow-up to this article and included the Quakelight PNG implementation and a custom pixel shader.

16 comments:

  1. Excellent post. Nice to see some facts and figures and a runnable demo. The figures on my machine were slightly different to yours (0.58, 0.84, 0.56, 1 in the order of your bar chart) but the WriteableBitmap still came out on top.

    I'd be interested in seeing some scenarios, if any, where the alternative methods are more performant.

  2. Thanks.

    Although I'm using relative values for comparison, it might well be that you encounter slightly different results. Depending on your hardware each implementation could perform better or worse.

  3. Thank you for another excellent article.
    I see a small problem with your results and testing, though.
    Most of the time the program is inside the CalculateColor function, so you don't see a big difference between the methods. If you remove this function, you will get a very dramatic difference.
    On my machine I got 120 fps for WriteableBitmap and only 55 fps for the second closest (Nokola).

  4. Thanks John.

    You’re right, most of the time is spent in the CalculateColor() function, but all 4 dynamic bitmap tests use this function and the same iterations to produce the image, so it adds the same cost to each of them.
    If the same value is just copied to the buffer over and over, the test is too fast and doesn't represent a real use case. Removing the whole CalculateColor() function is even worse: the compiler or the WriteableBitmap implementation could apply optimizations that distort the results. In fact, if you remove the CalculateColor() function, the values of r, g, b, a never change and writeableBmp.Pixels[index++] = 0xFF000000 is always executed.

    I added some more explanations about the CalculateColor() function in the "How it works" section.

  5. Excellent article!

  6. Great to see some hard facts on this.

    Kinda bummed out that my RawPngBufferStream didn't quite measure up, though. :) Seeing I spent quite a bit of work trying to optimize it.

  7. great post!

    On my machine Nokola's PngEncoder is consistently faster though: 1.12 relative to WriteableBitmap.
    I have a DELL Latitude D830.

  8. Thanks Einar.

    You did a great job by providing a PNG Stream implementation that made it possible to generate dynamic bitmaps before Silverlight 3! As you know, it’s nearly impossible to compete with a good framework-internal implementation like the WriteableBitmap.
    I used the latest RawPngBufferStream from the Codeplex repository, but maybe I made a mistake in my Speedtest application and the results are wrong. It would be nice if you could take a quick look at the Speedtest code just to make sure the results are right.

  9. Thanks Paul.

    Now that’s interesting: Nikola’s PngEncoder performs better on your computer than the WriteableBitmap. I haven’t looked deeply into Nikola’s PngEncoder implementation or the WriteableBitmap. Maybe the tests are running too fast and therefore the results are wrong.
    I updated the Speedtest application. It’s now possible to set the duration of the tests ("Maximum Frames"). Please try a high value for that (> 1000) and run the tests again. Make sure no other application interrupts the tests too much and that you don’t change the website.

  10. Rene: I've abandoned the PNG thingy for Balder anyways and gone with WriteableBitmap - I am doing some optimizations using it and doing some triple-buffering magic utilizing multiple threads to get real performant. My tests now show a steady FPS of 60, compared to 30-40 before.

  11. Ah, I get very different results with this setup, even with the same number of frames. Nokola is 0.51 relative to WriteableBitmap. Fortunately there's a clear conclusion now: WriteableBitmap is the way to go.

  12. Paul, I haven't changed the test code itself, only the "Maximum Frames" TextBox was added. I guess the tests were interrupted during your first try.

  13. I have yet to use Silverlight 3, I used Silverlight 2 a few months ago, and really liked how it worked. I would like to learn more about how this works in the future.

    http://www.onlinewebmarket.net/google/

  14. Hello Rene,

    Have you considered testing the PNG wrapper technique used in Quake in Silverlight?

    I recently published the whole C# source code for Quakelight at http://www.innoveware.com.
    The PNG wrapper is available in the source tree.
    Just ask if you want an example on how to use it.

    To my knowledge, this is the fastest implementation possible. If you want an implementation without the need for a 256 colors palette, just ask.

  15. Hello Julien

    I thought I had read in the Silverlight forums that SL3 Quakelight uses the WriteableBitmap?

    It would be good if you could make a 32-bit ARGB version of your PNG wrapper and send the source code to me. Otherwise the comparison with the other implementations would be distorted.
    You can find my Email address at my website: http://rene-schulte.info

    BTW, I tried the Quakelight tech demo at SilverArcade.com. It's really cool and fast. I'm looking forward to the playable version. Keep up the good work!
