Animative Blogging, Part 3: Advanced Sprite Extraction and Analysis

My previous entry on sprite extraction covered the basics of extracting objects from images of gameplay, but only covered situations where the object is cast against a uniform background.  Here, I'll detail an alternate approach using the GFX viewer in MAME, as well as more advanced techniques for extracting objects set against non-uniform backgrounds.

The MAME GFX Viewer

MAME has a brilliant little feature that most users don't know about.  If you press F4 during the emulation of a game, it will bring up the GFX viewer, which allows you to view any of the game's stored graphics; that is, assuming MAME recognizes the format in which they're stored.  It won't work on vector games or games with encrypted graphics.

The viewer is laid out in three sections.  The first section is the color palette, which shows all of the colors used in the game.  You won't need this section for sprite extraction, so push Enter to skip it.  The next section shows the bitmaps that the game uses to build its graphics.  These can be tilesets, sprites, or any other bitmap stored in ROM that MAME recognizes as graphics.  The first screen usually shows tiles.  For example, here's the tileset used in Galaga:

Image of the tileset from the arcade version of Galaga (1981).

This screen won't be much use to you because tiles are usually smaller than the objects you're trying to capture, and even if those objects are built from tiles, they won't be shown here fully constructed.  However, if you look at the top of the above image, you'll see a "0/1" after the ':gfxdecode' string.  That means that there's another page of bitmaps, which you can access by pressing the "}" key.

Tiled image of the Galaga sprites, as seen in the MAME graphics viewer

Now that's more like it.  Usually, when dedicated hardware is being used for sprites, they're stored on the second page of graphics.  To grab a sprite from this sheet, just crop the image around the sprite you want and use the color replacement method to make the background transparent.  Note that you can also cycle through the color schemes (if there's more than one) using the left/right arrow keys.  The color scheme above looks good for the player's ship, but the aliens use different ones, so you'll have to cycle through until you find the right one.

Press Enter again and you'll see the tilemap viewer.  What's shown on this screen will change depending on where you are in the game because it's essentially a snapshot of the video RAM.  If the object you want to capture was constructed from tiles you saw in the previous section, you should be able to find it fully constructed here, so long as you press F4 at the moment that it's being displayed in the game.  Note that some games, like Phoenix, use multiple tilemaps, so it's possible for this section to have multiple pages as well.  As before, use the "}" key to cycle to the next page.  Here's a sample set of tilemaps from the first round of Phoenix:

Side-by-side images of the two tilemaps drawn in the arcade version of Phoenix (1980), as seen in the MAME GFX viewer.

The tilemap viewer shows the two tilemaps at their internal aspect ratios, so you can extract the birds without having to deal with the background.  The garbage data in the above image is not part of the game; it's only visible because the video RAM (256x256) is larger than the screen resolution of the game (256x208).

Note that hardware sprites are not stored in the video RAM, so you won't see them displayed here.  For games like Pac-Man or Galaga, you'll have to get the sprites from the bitmaps in the previous section.

Advanced Techniques

Since I find myself extracting a lot of graphics for my blog entries, I've written my own programs to do some of the image manipulation.  If you're a C or Python programmer, the OpenCV library is a fantastic resource for a wide range of image manipulation routines.  If you just want to perform simple tasks, like zooming, cropping, and color replacements, a Google search should give you plenty of resources, so I won't go into that kind of detail here.  Instead, here's some high-level information on lesser-known techniques that I use.  I have a GitHub repository that I plan to make public eventually, but feel free to contact me or comment if you want more detail on something.

Fuzzy Color Replacement

If you're extracting from image data that are less than pristine, like a lossy video, a simple one-color replacement isn't going to be sufficient to select the background.  Instead you're going to need to select an "approximate" or "fuzzy" color from the image.  This concept isn't too hard to understand if you imagine an RGB color as a point in a three-dimensional color space.  To find how close one color is to another, you can calculate a three-dimensional color distance, like so:

$$D = \sqrt{(R_1 - R_2)^2 + (G_1 - G_2)^2 + (B_1 - B_2)^2}$$

To perform a "fuzzy" color selection, you could just select all pixels where the color is within some distance threshold from your reference color.  For example, two pixels with colors of (3,4,0) and (0,0,7) have color distances of 5 and 7 with respect to black, which is (0,0,0), so both would be selected if your distance threshold was greater than seven.  Once all such pixels are selected, you can then make them fully transparent to create a sprite image.
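Here's a minimal sketch of that selection in plain Python, with no image library involved.  It assumes the image is already loaded as rows of (R, G, B) tuples; the function names and the threshold of 7.5 are made up for illustration, and this is not the code from my own tools.

```python
import math

def color_distance(c1, c2):
    """Euclidean distance between two (R, G, B) tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))

def fuzzy_transparent(pixels, reference, threshold):
    """Return rows of (R, G, B, A) tuples.

    Pixels within `threshold` of `reference` get alpha 0 (transparent);
    all other pixels get alpha 255 (opaque).
    """
    out = []
    for row in pixels:
        new_row = []
        for (r, g, b) in row:
            near = color_distance((r, g, b), reference) <= threshold
            new_row.append((r, g, b, 0 if near else 255))
        out.append(new_row)
    return out

# One-row test image: the two near-black colors from the text, plus a red.
image = [[(3, 4, 0), (0, 0, 7), (200, 30, 30)]]
result = fuzzy_transparent(image, reference=(0, 0, 0), threshold=7.5)
```

With a threshold just above seven, the first two pixels become transparent and the red one stays opaque.  For lossy video you would probably need a larger threshold, found by trial and error.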

Image Segmentation

If you need to extract something that's up against a multi-colored background, a color replacement isn't going to work for extraction.  You always have the option of pixel-by-pixel selection in a pixel editor, but I find image segmentation to be much faster.  At least one image segmentation technique, the magic wand, is a common feature in high-end image editing software, such as Adobe Photoshop and GIMP.  The magic wand is color-based, however, and will tend to select only subsections of multi-colored sprites.  For example, a single application of the magic wand to the Pac-Man ghost will exclude its eyes from the selection.

I find that sprites are more reliably selected by algorithms that start from a clearly defined region of an image.  In the OpenCV distribution, for example, there is an algorithm called "grabcut", which is based on a 2004 research paper by Rother, Kolmogorov, and Blake.  See this page for some example Python code that makes use of it.

I was impressed enough with the grabcut algorithm that I thought I would see what happened if I used it on an animation.  The GIF below shows the results when I run the grabcut algorithm sequentially on each frame of a Pac-Man clip, using the box shown in green as the selection region.  The position of the box is initially selected manually in the first frame, and then recentered automatically in x and y based on the central position of the pixels selected by the algorithm.  The original animation is on top and a version with only the sprite selected by grabcut is shown on the bottom.  

Animation of arcade version of Pac-Man with Inky being followed by an image segmentation algorithm.

It manages to capture Inky pretty well in the first third of the animation, but gets confused when he passes by other objects, like the orange ghost, and then completely loses Inky when he passes by Pac-Man.
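The recentering step between frames can be sketched roughly as follows.  This is only an illustration of the idea, not my actual code: it assumes the segmentation result is available as a full-frame mask of booleans (True where a pixel was selected as part of the sprite), and the function name and the (x, y, width, height) box layout are hypothetical.

```python
def recenter_box(mask, box):
    """Move `box` so it is centered on the selected pixels in `mask`.

    mask: rows of booleans covering the whole frame.
    box:  (x, y, width, height) of the current selection region.
    Returns a box of the same size, or the unchanged box if the
    mask selected nothing.
    """
    xs = [x for row in mask for x, selected in enumerate(row) if selected]
    ys = [y for y, row in enumerate(mask) for selected in row if selected]
    if not xs:
        return box
    # Mean position of the selected pixels, in pixel-index units.
    cx = sum(xs) / len(xs)
    cy = sum(ys) / len(ys)
    _, _, w, h = box
    # A box starting at x spans x .. x+w-1, so its center is x + (w-1)/2.
    return (round(cx - (w - 1) / 2), round(cy - (h - 1) / 2), w, h)

# 6x8 frame with a 2x2 blob of selected pixels at x=3..4, y=2..3.
mask = [[False] * 8 for _ in range(6)]
for y in (2, 3):
    for x in (3, 4):
        mask[y][x] = True

new_box = recenter_box(mask, (0, 0, 4, 4))
```

The new box would then be used as the selection region for the next frame.  Because the box simply follows whatever was selected, a neighboring object that gets pulled into the selection can drag the box away from the intended sprite.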

Fortunately, there's one more trick that I've had some success with in extended animations.

Mode Background Subtraction

When analyzing sprites, the nice thing about having an animation is that it contains information about what's part of the background and what's part of the sprite.  For example, in the clip below, most of the background is static, while Mario, the barrels, and the fireball are moving on top of it.

Animation from the original arcade version of Donkey Kong, barrel level.

Distinguishing the background from the rest of the animation is straightforward in this case -- for a given pixel on the image, the background color is the most common RGB value across the frames of the animation.  The statistical term for the most common value is the "mode", and most programming languages have standard functions for computing it.  With Python 3, I used stats.mode (from the SciPy distribution) on the above animation, and got the following image:

Background from the original arcade version of Donkey Kong, barrel level.
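If you'd rather not pull in SciPy, the per-pixel mode can be sketched in plain Python.  This version assumes the frames are already loaded as equally sized lists of rows of (R, G, B) tuples, and it treats each RGB triple as a single value rather than taking the mode of each channel separately; the function name is made up for illustration.

```python
from collections import Counter

def mode_background(frames):
    """Estimate the static background of an animation.

    frames: list of frames, each a list of rows of (R, G, B) tuples,
            all with the same dimensions.
    Returns one frame in which every pixel is the most common color
    seen at that position across all of the input frames.
    """
    height = len(frames[0])
    width = len(frames[0][0])
    background = []
    for y in range(height):
        row = []
        for x in range(width):
            counts = Counter(frame[y][x] for frame in frames)
            row.append(counts.most_common(1)[0][0])
        background.append(row)
    return background

# Three 1x3 frames: a red "sprite" pixel moves across a black background.
BLACK, RED = (0, 0, 0), (255, 0, 0)
frames = [
    [[RED, BLACK, BLACK]],
    [[BLACK, RED, BLACK]],
    [[BLACK, BLACK, RED]],
]
background = mode_background(frames)
```

Each position is black in two of the three frames, so the estimated background is all black.  The estimate is only as good as the clip: a position that is covered by a sprite in most frames will come out wrong.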

This is a good representation of the game's background, at least in the region of the animation we care about (near Mario and the barrels).  Now that I know what my background pixels are, I can go back to the animation and set them to a constant color.

Animation from the original arcade version of Donkey Kong showing only the sprites, background removed.
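The subtraction step itself is just a pixel-by-pixel comparison against the background image.  Here's a minimal sketch under the same assumptions as before (frames as rows of (R, G, B) tuples, exact color matches); the function name and the magenta fill color are arbitrary choices for illustration.

```python
def remove_background(frame, background, fill=(255, 0, 255)):
    """Replace every pixel that matches the background with `fill`.

    frame, background: equally sized lists of rows of (R, G, B) tuples.
    Returns a new frame; the inputs are left unmodified.
    """
    return [
        [fill if pixel == bg_pixel else pixel
         for pixel, bg_pixel in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]

BLACK, RED, BLUE = (0, 0, 0), (255, 0, 0), (0, 0, 255)
background = [[BLACK, BLUE, BLACK]]
frame = [[BLACK, RED, BLACK]]   # a red sprite pixel covers the blue one
sprites_only = remove_background(frame, background)
```

Note that any sprite pixel that happens to have the same color as the background beneath it will be removed as well.  For lossy sources, the exact comparison could be swapped for a color distance threshold.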

What remains (mostly) is the sprites, and I now have a whole slew of frames from which I can extract Mario, the barrels, and the fireball.  This is also useful for studying the motion of the game objects and finding quirks in the emulation/hardware.  

Of course, this technique becomes more difficult in games that have background scrolling, or worse yet, parallax scrolling.  I have some tricks for those situations as well, but will come back to them in a later entry.


  1. Your previous/next post buttons appear to be gone for good again. Is that intentional? I found them extremely useful!

  2. Thanks again for pointing that out. I think I accidentally killed them in trying to fix an issue with the front page. I'll look into it.

  3. They’re back now - thanks for looking into it!

