
Thursday, January 5, 2012

SLARToolkit Samples Updated

The samples of my open source Windows Phone and Silverlight Augmented Reality Toolkit were updated to the latest version of the WP 7.1 and Silverlight 5 SDKs.
Please note the changed security model in Silverlight 5, which is a big bummer. My Silverlight MVP friend Morten wrote a few true words about it here.
As usual you can find a list of the samples on the project site and also get the code there.

Friday, May 27, 2011

Nodo vs Mango - Windows Phone ListBox Performance Improvements in Action

There were quite a few complaints about the Windows Phone ListBox's scrolling performance in the past. The Windows Phone team obviously heard them and worked hard on the performance and responsiveness of the whole platform and the ListBox in particular. The touch input processing was offloaded from the UI thread to a new separate thread. Additionally, the BitmapImage doesn't load its data on the UI thread anymore. I'm sure lots of other tweaks were implemented to increase the performance and responsiveness of the platform. I think the Windows Phone team did a very good job!

Showdown
The video below shows a side-by-side comparison of a Nodo device with build 7392 and a prototype device with a Mango prebuild. The Nodo device on the left side is a Samsung Omnia 7, the Mango device on the right side is the ASUS prototype.
For comparison I use the official Twitter app built against 7.0, and in the second half of the video I show the effects preview list of my Pictures Lab app. This effects pivot item only uses a ListBox with a DataTemplate that contains an Image control with a fixed size and a TextBox for each item. So there's no background image loading or any other heavy computing going on, just a static list with a lot of redrawing.




Background music is Taiga by SMILETRON

As you can see, it's quite a huge difference, and even this Mango prebuild runs very smoothly on the rather old ASUS hardware. There's still some room for more improvement, like the rasterizer, but imagine the boost on production devices with the final Mango version.


I don't have a deal with Microsoft, nor am I paid to blog or tweet this. I'm just excited about all the goodies that are coming with Mango. Good times.

Wednesday, May 25, 2011

Why is a Y in the Windows Phone Mango Camera API

The release of the new Mango tools brings Windows Phone development on par with Silverlight 4 and will therefore add many great features to the Windows Phone platform. This means it will also contain the webcam CaptureSource and VideoSink APIs from Silverlight 4. Additionally, it introduces the new FileSink class, which can be used to record the video stream as MP4 to Isolated Storage. Most importantly, Mango includes a new PhotoCamera class with a lot of functionality.
This class is used in the latest SLARToolkit Windows Phone sample and in some other new projects I'm working on.
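As a rough sketch of the FileSink recording scenario mentioned above, the wiring could look like the following. This is a hedged sketch, not production code: the file name is just an example, and the exact FileSink member names should be verified against the Mango SDK documentation.

```csharp
using System.Windows.Media;

public class CameraRecorder
{
    private CaptureSource captureSource;
    private FileSink fileSink;

    public void StartRecording()
    {
        // Use the default webcam as the video source
        captureSource = new CaptureSource
        {
            VideoCaptureDevice = CaptureDeviceConfiguration.GetDefaultVideoCaptureDevice()
        };

        // Route the stream into an MP4 file in Isolated Storage.
        // "recording.mp4" is an example name; the property name is
        // taken from the beta docs and may differ in the final SDK.
        fileSink = new FileSink
        {
            CaptureSource = captureSource,
            IsolatedFileName = "recording.mp4"
        };

        // Device access has to be granted by the user
        if (CaptureDeviceConfiguration.RequestDeviceAccess())
        {
            captureSource.Start();
        }
    }

    public void StopRecording()
    {
        // Stopping the capture finalizes the MP4 in Isolated Storage
        captureSource.Stop();
    }
}
```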

Don't Reinvent the Wheel
The Silverlight 4 webcam API was explained in this detailed blog post almost a year ago. The techniques and concepts I described there can now also be used with Windows Phone Mango.
The updated MSDN documentation has quite a few articles and samples about the new camera API. My MVP buddy Alex Golesh also has a nice write-up about it.
This blog post tries to fill the gaps and provides some information especially about the PhotoCamera's YCrCb capture methods.

YCbCr vs ARGB
In the well-known RGB color space, the red, green and blue information is stored in separate components, each of which also carries redundant luminance data. In the YCbCr color space (also written YCrCb), the luminance information is stored in the Y component, and the chroma (color) information in the Cb component as the blue-difference and in the Cr component as the red-difference. The RGB-YCbCr conversion can be done with simple addition and multiplication operations. The Y component usually ranges from 0 to 1, while Cb and Cr range from -0.5 to 0.5.

The chroma plane at Y = 0.5, with Cr and Cb each ranging over [-0.5, 0.5]

Humans are more sensitive to luminance information than to chroma, therefore the resolution of the color information can be reduced and only the luminance needs to be stored in full resolution. Many digital camera sensors use the YCrCb color space and make use of this reduced chroma information.
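To illustrate how cheap the conversion mentioned above is, here's a minimal sketch of the common BT.601 variant of the RGB-to-YCbCr formula, with the channels normalized to [0, 1]. The exact coefficients a given camera sensor uses may differ; this is just one standard set.

```csharp
public static class ColorConversion
{
    // BT.601 RGB -> YCbCr with r, g, b in [0, 1].
    // Y ends up in [0, 1], Cb and Cr in [-0.5, 0.5].
    public static void RgbToYCbCr(double r, double g, double b,
                                  out double y, out double cb, out double cr)
    {
        // Luminance: weighted sum of the three channels
        y = 0.299 * r + 0.587 * g + 0.114 * b;

        // Chroma: scaled blue-difference and red-difference
        cb = 0.5 * (b - y) / (1.0 - 0.114);
        cr = 0.5 * (r - y) / (1.0 - 0.299);
    }
}
```

For pure white (1, 1, 1) this yields Y = 1 with zero chroma, and for pure blue (0, 0, 1) it yields Cb = 0.5, matching the ranges described above.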

PhotoCamera
The PhotoCamera class has a lot of useful methods to either capture a full-resolution image from the camera or to get a smaller (and faster) preview buffer snapshot. The GetPreviewBufferY and GetPreviewBufferYCrCb methods provide the direct data from the camera without a transformation to 32-bit ARGB. Not only is the alpha channel left out in the YCrCb buffer, the Cr and Cb color components are also stored at reduced resolution. This keeps the buffer smaller and is way faster, but it also makes it a bit trickier to extract the color components (and brightness) from the byte buffer. Fortunately there's the YCbCrPixelLayout property, which contains all the offsets, strides and other needed information.
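A minimal luminance-only use of these methods might look like the following sketch, built from the APIs used later in this post. The average-brightness metric is just an illustration of the kind of processing that needs no ARGB conversion at all.

```csharp
byte[] luminance;

// Computes the average scene brightness (0 = black, 255 = white)
// from the full-resolution Y plane of the preview buffer.
double GetAverageBrightness(PhotoCamera camera)
{
    var w = (int)camera.PreviewBufferResolution.Width;
    var h = (int)camera.PreviewBufferResolution.Height;

    // One byte per pixel: the Y plane is stored at full resolution
    if (luminance == null || luminance.Length != w * h)
    {
        luminance = new byte[w * h];
    }
    camera.GetPreviewBufferY(luminance);

    long sum = 0;
    foreach (var y in luminance)
    {
        sum += y;
    }
    return sum / (double)(w * h);
}
```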

Conclusion
The GetPreviewBufferYCrCb method is approximately 4 times faster than the GetPreviewBufferArgb32 method and also needs a smaller buffer, so the YCrCb methods are the way to go when only the luminance data is needed or when the YCbCr color space can be used for the given scenario. For example, many computer vision techniques only need the luminance information for processing.
I like that both color spaces are supported by the API. On mobile devices you need all the performance you can get. I actually helped the Windows Phone camera team with quite a bit of feedback to decide on this API design. Very smart people, by the way.

Tuesday, May 24, 2011

Augmented Mango - SLARToolkit for Windows Phone

The beta of the new Windows Phone Developer Tools was just publicly released. The update with the codename "Mango" comes with many new APIs and will finally contain an API for real-time camera access, which a lot of developers have been asking for. The new runtime gives us the needed functionality to implement many cool scenarios. One of these scenarios is Augmented Reality, which leads to my open source Silverlight Augmented Reality Toolkit (SLARToolkit).
This post announces the new Windows Phone version of SLARToolkit and also provides a sample. If you're one of those lucky people with a Mango-enabled device you can download the XAP here or just watch a video instead.

The SLARToolkit project description from the CodePlex site:
SLARToolkit is a flexible marker-based Augmented Reality library for Silverlight and Windows Phone with the aim to make real time Augmented Reality applications with Silverlight as easy and fast as possible. It can be used with Silverlight's Webcam API or with any other CaptureSource, WriteableBitmap or with the Windows Phone's PhotoCamera. SLARToolkit is based on the established NyARToolkit and ARToolkit.
Demo
The sample XAP can be deployed to a Mango-enabled device (tested with build 7629). Alternatively there's also a video of the new sample embedded below.
If you want to try it yourself you need to download the SLAR and / or L marker, print them and point the camera toward them. The marker(s) should be printed non-scaled at their original size (80 x 80 mm) and centered so a small white border remains. As an alternative, it's also possible to open a marker file on a different device and use that device's screen as the marker.
See the SLARToolkit Markers documentation for more details.




Video
I've recorded a short video of the new sample with my Samsung Omnia 7. It's a bit blurry, but it demonstrates how well the sample works even on this quite old ASUS prototype, whose camera pipeline seems a bit slow.
The video is also available at YouTube.



Background music is Melo by Mosaik

This demo shows how the new Windows Phone Mango real-time camera API can be used to augment the reality with the help of the SLARToolkit. This can be nice for educational projects, and it's actually no problem to add correctly transformed videos or other content to the demo.
The demo uses just some basic UIElements like a TextBox and an Image control. Mango will also enable the combination of Silverlight and XNA, which means that nice 3D AR games can be developed with the help of the SLARToolkit.

How it works
This sample uses the new PhotoCamera and a timer to constantly get a snapshot of the real-time camera stream. This snapshot is then passed to the SLARToolkit algorithms to get the 3D spatial information of the marker. The computed detection results are used to apply a perspective-correct transformation to the elements.

protected override void OnNavigatedTo(System.Windows.Navigation.NavigationEventArgs e)
{
   base.OnNavigatedTo(e);

   // Initialize the webcam
   photoCamera = new PhotoCamera();
   photoCamera.Initialized += PhotoCameraInitialized;
   isInitialized = false;

   // Fill the Viewport Rectangle with the VideoBrush
   var vidBrush = new VideoBrush();
   vidBrush.SetSource(photoCamera);
   Viewport.Fill = vidBrush;

   // Start timer
   dispatcherTimer = new DispatcherTimer { Interval = TimeSpan.FromMilliseconds(50) };
   dispatcherTimer.Tick += (sender, e1) => Detect();
   dispatcherTimer.Start();
}

The PhotoCamera instance is set up in the OnNavigatedTo event handler of the page and the DispatcherTimer is started. The timer calls the Detect method every 50 milliseconds. Additionally, a viewfinder Rectangle is filled with a VideoBrush, which in turn has the photoCamera video stream set as its source.

void Detect()
{
   if (!isInitialized)
   {
      return;
   }

   // Update buffer size
   var pixelWidth = (int)photoCamera.PreviewBufferResolution.Width;
   var pixelHeight = (int)photoCamera.PreviewBufferResolution.Height;
   if (buffer == null || buffer.Length != pixelWidth * pixelHeight)
   {
      buffer = new byte[pixelWidth * pixelHeight];
   }

   // Grab snapshot
   photoCamera.GetPreviewBufferY(buffer);

   // Detect
   var dr = arDetector.DetectAllMarkers(buffer, pixelWidth, pixelHeight);

   // Calculate the projection matrix
   if (dr.HasResults)
   {
      // Center at origin of the 256x256 controls
      var centerAtOrigin = Matrix3DFactory.CreateTranslation(-128, -128, 0);
            
      // Swap the y-axis and scale down by half
      var scale = Matrix3DFactory.CreateScale(0.5, -0.5, 0.5);

      // Calculate the complete transformation matrix based on the first detection result
      var world = centerAtOrigin * scale * dr[0].Transformation;

      // Viewport transformation
      var viewport = Matrix3DFactory.CreateViewportTransformation(pixelWidth, pixelHeight);

      // Calculate the final transformation matrix by using the camera projection matrix 
      var m = Matrix3DFactory.CreateViewportProjection(world, Matrix3D.Identity, arDetector.Projection, viewport);

      // Apply the final transformation matrix to the controls
      var matrix3DProjection = new Matrix3DProjection { ProjectionMatrix = m };
      Txt.Projection = matrix3DProjection;
      Img.Projection = matrix3DProjection;
   }
}

A snapshot of the current preview buffer is taken in the Detect method using the GetPreviewBufferY method. This method fills a byte buffer with the luminance data of the current viewfinder frame. This buffer is then passed to the SLARToolkit's marker detector, whose DetectAllMarkers method returns the detected marker information. This transformation data is then used to transform the UIElements with correct perspective in 3D.
Read more about the PhotoCamera's YCbCr methods in this blog post.

void PhotoCameraInitialized(object sender, CameraOperationCompletedEventArgs e)
{
   //  Initialize the Detector
   arDetector = new GrayBufferMarkerDetector();

   // Load the marker pattern. It has 16x16 segments and a width of 80 millimeters
   var marker = Marker.LoadFromResource("data/Marker_SLAR_16x16segments_80width.pat", 16, 16, 80);

   // The perspective projection has the near plane at 1 and the far plane at 4000
   arDetector.Initialize(photoCamera.PreviewBufferResolution.Width, photoCamera.PreviewBufferResolution.Height, 1, 4000, marker);

   isInitialized = true;
}

The SLARToolkit's GrayBufferMarkerDetector is created and set up in the PhotoCamera's Initialized event handler. The brand new GrayBufferMarkerDetector uses the byte buffer with luminance data directly, without the need for a 32-bit ARGB pixel conversion.

Check out the source code at CodePlex if you want to see all the details of the sample that were left out here for clarity.

Download it, build your app and augment your reality
The open source SLARToolkit library and all samples are hosted at CodePlex. If you have any comments, questions or suggestions don't hesitate and write a comment, use the Issue Tracker on the CodePlex site or contact me via any other media.
Have fun with the library and please keep me updated if you use it anywhere so I can put a link on the project site.

Thursday, June 3, 2010

Push and Pull - Silverlight Webcam Capturing Details

Photo (CC) by MShades
It's not a secret that one of my favorite Silverlight 4 features is the webcam support and I already played endless hours with it. There are many blog posts out there demonstrating how to use the webcam and how to take a screenshot with the CaptureImageAsync method. Only a few cover the VideoSink.
This blog post will show how to use the webcam, the CaptureImageAsync method and also how to implement and use the VideoSink. Most importantly, I'll cover the differences between CaptureImageAsync and VideoSink and when to use which.



Silverlight Webcam 101
The Silverlight 4 webcam API is pretty easy to use and just a few lines of code are needed to show a webcam video stream on screen.
Silverlight's CaptureSource class provides the webcam stream that is used as the source of a VideoBrush, which in turn fills a Rectangle with the video feed from the webcam. It's also possible to use any other Shape with a Fill property.
The CaptureDeviceConfiguration class can be used to retrieve a list of all installed video and audio devices on the system. Most of the time it's sufficient to use the GetDefaultVideoCaptureDevice method to get the default device. The user can specify the default video and audio devices in the Silverlight configuration: right-click the Silverlight application, click "Silverlight" in the context menu and select the "Webcam / Mic" tab to set them.

Silverlight Webcam / Mic configuration dialog.
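Besides grabbing the default device, the installed devices can also be enumerated. A small sketch (the Debug output is just for illustration):

```csharp
using System.Windows.Media;

// List all installed video capture devices and mark the default one
var devices = CaptureDeviceConfiguration.GetAvailableVideoCaptureDevices();
foreach (var device in devices)
{
    // IsDefaultDevice reflects the choice made in the
    // Silverlight "Webcam / Mic" configuration tab
    System.Diagnostics.Debug.WriteLine(
        "{0} (default: {1})", device.FriendlyName, device.IsDefaultDevice);
}

// Shortcut for the common case
var defaultDevice = CaptureDeviceConfiguration.GetDefaultVideoCaptureDevice();
```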

The following C# code initializes the webcam (captureSource) in the Loaded event of the page and fills a rectangle (Viewport) with a VideoBrush:
// Member variable (webcam reference)
CaptureSource captureSource;

private void UserControl_Loaded(object sender, RoutedEventArgs e)
{
   // Initialize the webcam
   captureSource = new CaptureSource();
   captureSource.VideoCaptureDevice = CaptureDeviceConfiguration.GetDefaultVideoCaptureDevice();

   // Fill the Viewport Rectangle with the VideoBrush
   var vidBrush = new VideoBrush();
   vidBrush.SetSource(captureSource);
   Viewport.Fill = vidBrush;
}

Now that the webcam is initialized, the streaming can be started. This is done in an event handler of a Button because RequestDeviceAccess has to be called from a user-initiated event. Otherwise it would be possible to start the webcam without the user's permission. Of course nobody wants to experience something like what happened to the students with the MacBooks provided by their high school.

Here's the C# code:
private void StartButton_Click(object sender, RoutedEventArgs e)
{
   // Request webcam access and start the capturing
   if (CaptureDeviceConfiguration.RequestDeviceAccess())
   {
      captureSource.Start();
   }
} 

This will open a webcam permission dialog asking the user for device access. This consent setting can be remembered, so Silverlight knows whether the user has previously allowed that particular application.

Webcam permission dialog

Capturing The Webcam 
There are two different approaches to capture the webcam in Silverlight. The CaptureSource's CaptureImageAsync method and CaptureImageCompleted event provide a snapshot on demand and can be considered as a pull-based technology. A custom VideoSink implementation on the other hand constantly gets the raw stream from the webcam and can be considered as a push-based approach.

Pull: CaptureImageAsync Webcam Capture
When the CaptureSource's CaptureImageAsync method is called, an asynchronous capturing task is started. After the snapshot is completed, the CaptureImageCompleted event is fired. The event provides a WriteableBitmap in its event args.

The following C# code should be added after the captureSource initialization code above:
// Wiring the CaptureImageCompleted event handler
captureSource.CaptureImageCompleted += (s, e) =>
{
   // Do something with the camera snapshot
   // e.Result is a WriteableBitmap
   Process(e.Result);
};

Another button is used to start the asynchronous capturing:
private void SnapshotButton_Click(object sender, RoutedEventArgs e)
{
   // CaptureImageAsync fires the CaptureImageCompleted event
   captureSource.CaptureImageAsync();
} 

Push: VideoSink Webcam Capture
The other capturing approach constantly pushes every frame from the webcam into a VideoSink. The abstract VideoSink class has four methods that have to be implemented in a subclass in order to use it.

The basic set-up of a custom VideoSink looks like this:
// MyVideoSink is derived from Silverlight's VideoSink
public class MyVideoSink : VideoSink
{
   VideoFormat vidFormat;
   
   // Could be used to initialize a container for the webcam stream data
   protected override void OnCaptureStarted() { }
   
   // Could be used to dispose a container for the webcam stream data
   // or to write a header of a video file format
   protected override void OnCaptureStopped() { }

   // Is called when the VideoFormat was changed
   protected override void OnFormatChange(VideoFormat videoFormat)
   {
      this.vidFormat = videoFormat;
   }

   // Is called every time the webcam provides a complete frame (Push)
   protected override void OnSample(long sampleTime, long frameDuration, 
                                    byte[] sampleData)
   {
      // Process the webcam snapshot 
      // sampleData contains the raw byte stream
      // according to the videoFormat from OnFormatChange
      Process(sampleData, this.vidFormat);
   }
}

The following C# code initializes MyVideoSink with the webcam. It should be added after the captureSource initialization code above:
// Wire the VideoSink and the webcam together
var sink = new MyVideoSink { CaptureSource = captureSource };

The VideoSink's OnCaptureStarted and OnFormatChange methods are raised after the captureSource.Start() method was called. The OnSample method is called constantly as long as the webcam is active; the rate at which it's called is defined by VideoFormat.FramesPerSecond, which is provided through the OnFormatChange method. OnCaptureStopped is raised after the captureSource.Stop() method was called.

Push vs. Pull
Obviously the two approaches have different characteristics.

Pull: CaptureImageAsync

Push: Custom VideoSink
  • Direct usage of the raw byte stream
  • Less overhead
  • OnSample is called on a background thread
  • Automatically called for every frame
  • Might support more than one PixelFormat in the future
  • More information like frame number and duration (accurate sample times)
  • Slightly faster than CaptureImageAsync if every frame is needed
  • Push: Constant sampling

Please note that the CaptureImageAsync method can also be called periodically. That way it's possible to get a snapshot at a defined interval, which might be cheaper than using a VideoSink that fires 30 or even 60 times per second.

The following C# code calls CaptureImageAsync every 100 milliseconds, which means snapshots are taken at 10 fps:
var dispatcherTimer = new DispatcherTimer();
dispatcherTimer.Interval = new TimeSpan(0, 0, 0, 0, 100); // 10 fps
dispatcherTimer.Tick += (s, e) =>
{
   // Process camera snapshot if started
   if (captureSource.State == CaptureState.Started)
   {
      // CaptureImageAsync fires the CaptureImageCompleted event
      captureSource.CaptureImageAsync();
   }
};
// Start the timer
dispatcherTimer.Start();

Conclusion
Both approaches are helpful for different scenarios. The pull-based CaptureImageAsync method is useful for taking single snapshots, whereas a push-based custom VideoSink can be used for capturing complete sequences and encoding the webcam stream.