Wednesday, May 25, 2011

Why is a Y in the Windows Phone Mango Camera API

The release of the new Mango tools brings Windows Phone development on par with Silverlight 4 and adds many great features to the Windows Phone platform. This includes the webcam CaptureSource and VideoSink API from Silverlight 4. It also introduces the new FileSink class, which can be used to record the video stream as MP4 to Isolated Storage. Most importantly, Mango brings a new PhotoCamera class with a lot of functionality.
This class is used in the latest SLARToolkit Windows Phone sample and in some other new projects I'm working on.

Don't Reinvent the Wheel
The Silverlight 4 webcam API was explained in this detailed blog post almost a year ago. The techniques and concepts I described there can now also be used with Windows Phone Mango.
The updated MSDN documentation has quite a few articles and samples about the new camera API. My MVP buddy Alex Golesh also has a nice write-up about the new Camera API.
This blog post tries to fill the gaps, particularly around the PhotoCamera's YCrCb capture methods.

YCbCr vs ARGB
In the well-known RGB color space the red, green and blue information is stored in separate components, each of which also carries redundant luminance data. In the YCbCr color space (or YCrCb) the luminance information is stored in the Y component, and the chroma (color) information is stored in the Cb component as blue-difference and in the Cr component as red-difference. The RGB-YCbCr conversion can be done with simple addition and multiplication operations. The Y component usually ranges from 0 to 1, and Cb and Cr from -0.5 to 0.5.

[Figure: color plane at Y = 0.5, with Cr and Cb each in the range [-0.5, 0.5]]
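
To make the "simple addition and multiplication" concrete, here is a minimal Python sketch of the conversion in both directions. It assumes the common ITU-R BT.601 luma weights and normalized RGB values between 0 and 1; the post does not say which coefficients the phone uses, so treat the constants and the function names as illustrative assumptions.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert normalized RGB (0..1) to Y (0..1) and Cb/Cr (-0.5..0.5)."""
    # Assumed BT.601 luma weights; they sum to 1, so white maps to Y = 1.
    y = 0.299 * r + 0.587 * g + 0.114 * b
    # Chroma is the scaled difference between a color channel and the luma.
    cb = (b - y) / 1.772  # blue-difference, reaches 0.5 for pure blue
    cr = (r - y) / 1.402  # red-difference, reaches 0.5 for pure red
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Invert rgb_to_ycbcr using the same assumed coefficients."""
    r = y + 1.402 * cr
    b = y + 1.772 * cb
    # Green is whatever is left of the luma once red and blue are accounted for.
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b
```

With these constants a gray pixel has zero chroma, and pure red and pure blue sit at the edges of the Cr and Cb ranges described above.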

Humans are more sensitive to luminance than to chroma, so the color information can be stored at reduced resolution while only the luminance is kept at full resolution. Many digital camera sensors use the YCrCb color space and make use of this reduced chroma information.
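
As a rough illustration of what reduced chroma resolution buys, the following Python sketch averages each 2x2 block of a chroma plane and compares frame sizes. The 2x2 scheme (often called 4:2:0) is an assumption for the example, not something the post states about a particular device, and both helper names are made up. Even frame dimensions are assumed.

```python
def subsample_2x2(plane):
    """Average each 2x2 block of a chroma plane given as a list of rows."""
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def frame_bytes(width, height):
    """Return (ARGB32 bytes, YCbCr bytes) for one frame with 2x2 chroma."""
    argb32 = width * height * 4  # four bytes per pixel
    # One Y byte per pixel plus one Cb and one Cr byte per 2x2 block.
    ycbcr = width * height + 2 * (width // 2) * (height // 2)
    return argb32, ycbcr
```

Under these assumptions a 640x480 frame needs 1,228,800 bytes as ARGB32 but only 460,800 bytes as subsampled YCbCr, while every pixel still keeps its own luminance value.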

PhotoCamera
The PhotoCamera class has a lot of useful methods to either capture a full resolution image from the camera or to get a smaller (and faster) preview buffer snapshot. The GetPreviewBufferY and GetPreviewBufferYCrCb methods provide the direct data from the camera without a transformation to 32 bit ARGB. Not only is the alpha channel left out of the YCrCb buffer, but the Cr and Cb color components are also stored at reduced resolution. This keeps the buffer smaller and makes the call much faster, but it also makes it a bit trickier to extract the color components (and brightness) from the byte buffer. Fortunately, there's the YCbCrPixelLayout property, which contains all the offsets, strides and other needed information.
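
The idea of addressing a packed byte buffer through offsets and strides can be sketched in Python as follows. This is not the Windows Phone API: the PixelLayout class, its field names, the plane order and the 2x2 chroma sharing are all assumptions chosen for illustration. On a real device the values would come from the layout information the camera reports rather than being computed by hand.

```python
from dataclasses import dataclass


@dataclass
class PixelLayout:
    """Hypothetical stand-in for the offsets and strides of a YCbCr buffer."""
    y_offset: int    # index of the first Y byte
    y_pitch: int     # bytes per row of Y samples
    y_x_pitch: int   # bytes between horizontally adjacent Y samples
    cb_offset: int
    cb_pitch: int
    cb_x_pitch: int
    cr_offset: int
    cr_pitch: int
    cr_x_pitch: int


def planar_layout(width, height):
    """Assumed layout: full Y plane, then quarter-size Cr and Cb planes."""
    y_size = width * height
    c_size = (width // 2) * (height // 2)
    return PixelLayout(
        y_offset=0, y_pitch=width, y_x_pitch=1,
        cb_offset=y_size + c_size, cb_pitch=width // 2, cb_x_pitch=1,
        cr_offset=y_size, cr_pitch=width // 2, cr_x_pitch=1,
    )


def sample_at(buffer, layout, x, y):
    """Return the raw (Y, Cb, Cr) bytes that apply to pixel (x, y)."""
    y_index = layout.y_offset + y * layout.y_pitch + x * layout.y_x_pitch
    # Each chroma sample is assumed to be shared by a 2x2 block of pixels.
    cx, cy = x // 2, y // 2
    cb_index = layout.cb_offset + cy * layout.cb_pitch + cx * layout.cb_x_pitch
    cr_index = layout.cr_offset + cy * layout.cr_pitch + cx * layout.cr_x_pitch
    return buffer[y_index], buffer[cb_index], buffer[cr_index]
```

For a 4x2 test frame, `sample_at(buffer, planar_layout(4, 2), 3, 1)` reads the last Y byte and the second sample of each chroma plane. Reading only the luminance amounts to using the first index computation and skipping the other two.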

Conclusion
The GetPreviewBufferYCrCb method is approximately 4 times faster than the GetPreviewBufferArgb32 method and also takes a smaller buffer. The YCrCb methods are therefore the way to go when only the luminance data is needed or when the YCbCr color space can be used for the given scenario. For example, many computer vision techniques only need the luminance information for processing.
I like that both color spaces are supported by the API. On mobile devices you need all the performance you can get. I actually helped the Windows Phone camera team with quite a bit of feedback when they were deciding on this API design. Very smart people, by the way.

2 comments:

  1. Is there any way to save the PreviewBuffer back as a stream, or to apply filters directly to a video feed and save it as an MP4? I have a little filter I apply to the preview buffer and it works very nicely, but I want to be able to save the result as a video.

  2. Nope, there's no way to accomplish this with built-in functionality. You can only save the original stream with the FileSink as MP4 to IsolatedStorage. There's no hook for custom filters.
    You would have to implement the whole pipeline yourself.
