March 24, 2008

Aspect Ratio Conversion by "Seam Carving"
Last week's TV TechCheck featured a paper from the upcoming NAB Broadcast Engineering Conference in Las Vegas describing how one broadcast network is addressing the challenge of handling multiple aspect ratios through the production and distribution process. Another paper in the same session, entitled "Seam Carving for Video" by Mike Knee, Algorithm Team Leader at Snell & Wilcox Ltd., presents a different technology that offers a way of transforming images from one aspect ratio to another without cropping or adding black bars. While this raises issues of altering the original producer's composition, it is an interesting concept and excerpts are presented below.

Seam carving is a technique for content-aware image resizing in which the size and shape of regions of visual importance are preserved without resorting to cropping. With user interaction, unwanted objects can also be removed from an image with minimal effect on the remainder of the picture. The results on still pictures have been spectacular, but extending the idea to moving video presents significant challenges. This paper describes how seam carving can successfully be modified and extended to work with moving video. The resulting algorithm is robust, computationally efficient and gives pleasing results. It can be combined seamlessly with dynamic reframing and conventional resizing to provide a rich toolkit for content-aware repurposing of video. Applications include post-production, conversion between HD and SD TV standards, aspect ratio conversion, repurposing for mobile devices and internet video.

INTRODUCTION - The main application considered here is aspect ratio conversion, for example from 16:9 to 4:3, but the techniques presented can readily be applied to other video sequence resizing and re-purposing tasks.

Conventional approaches to aspect ratio conversion include cropping (removal of information at the sides or top and bottom of the picture), letterboxing (adding black bars) or stretching or squeezing the picture to the required shape. These approaches all have drawbacks. Cropping removes information that might be important, letterboxing wastes precious screen space and loses resolution, and stretching or squeezing changes the shape of objects. The drawback of cropping can be overcome to some extent by a pan-scan process in which the preserved region of the picture is moved around smoothly to follow the most important content. This can be done manually or by an automated process which tracks regions of interest in the scene. However, pan-scan can fail when there are objects of interest near both ends of the picture, for example in a motion picture dialog scene.

Seam carving is a novel technique used to resize or reshape a picture by removing less important pixels. The integrity of the information that is retained is preserved by ensuring that the pixels that are removed form connected "seams" which "carve" through the picture from top to bottom or from left to right. Spectacular results have been presented in which pictures can be gracefully and progressively resized by the repeated removal of seams. The technique can also be used for the expansion of pictures and for the removal of specific objects. When used for aspect ratio conversion, the disadvantages of conventional methods can be overcome. The full extent of the original picture can be preserved without changing the shape of the objects in the scene and without wasting screen area or losing resolution.

Seam carving as originally presented was developed for still pictures and produces unacceptable artifacts when applied to moving sequences, because the seam carving decisions are made independently for each frame of the sequence. The process can be adapted for moving sequences, using two new techniques referred to as recursive energy weighting and map processing. Recursive energy weighting addresses the fundamental problem of coping with moving sequences. Map processing improves the quality and flexibility of the process and brings two additional benefits. It can dramatically increase processing speed, which is particularly important for seam carving of HDTV sequences, and it can provide elegant solutions to the problem of combining horizontal and vertical seam carving for still pictures as well as for sequences.

THE ORIGINAL SEAM CARVING ALGORITHM - This section gives a brief, informal description of the seam carving algorithm for still pictures. Suppose we wish to shrink a picture horizontally. Seam carving is applied repeatedly, shrinking the picture by one pixel width at a time. Each pass of the algorithm operates as follows. We calculate an energy or activity function for each pixel in the picture. Typically, this is the sum of absolute differences between the current pixel's luminance value and that of each of its four neighbors. We then find a seam of minimum energy extending from the top to the bottom of the picture. A seam is a set of connected pixels, one pixel per line, the connection criterion typically being vertical or diagonal adjacency. The figure to the right gives an example of a seam on a very small picture.
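As an illustration of the energy function described above, here is a minimal Python sketch. It is not taken from the paper. The function name `energy_map`, the list-of-rows picture layout and the treatment of border pixels (neighbors outside the picture are simply skipped) are all assumptions made for the example.

```python
def energy_map(luma):
    """Per-pixel energy: sum of absolute luminance differences
    to the pixels above, below, left and right."""
    height, width = len(luma), len(luma[0])
    energy = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                # Assumption: neighbors outside the picture are skipped.
                if 0 <= ny < height and 0 <= nx < width:
                    total += abs(luma[y][x] - luma[ny][nx])
            energy[y][x] = total
    return energy


# A flat picture with one bright pixel in the middle.
sample = [
    [10, 10, 10],
    [10, 50, 10],
    [10, 10, 10],
]
print(energy_map(sample)[1][1])  # prints 160
```

The bright pixel differs by 40 from each of its four neighbors, so its energy is 160, while the flat corners score 0 and would be the first candidates for removal.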

The energy of a seam is the sum of the energy values of the pixels in the seam. The minimum-energy seam can be found using a recursive technique in which we calculate best partial seams leading to each pixel on successive rows of the picture until we have a minimum-energy seam leading to each pixel on the bottom row. We then take the minimum of all the bottom-row results and back-track along the seam to the top of the picture.
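The search for the minimum-energy seam can be sketched in Python as below. This is an illustrative reading of the description above, not the paper's implementation. The name `find_vertical_seam` and the tie-breaking rule (the leftmost candidate wins) are assumptions, and the input is assumed to be a list of rows of precomputed energy values.

```python
def find_vertical_seam(energy):
    """Return one column index per row for a minimum-energy
    top-to-bottom seam (vertical or diagonal adjacency)."""
    height, width = len(energy), len(energy[0])
    cost = [list(energy[0])]   # best partial-seam cost ending at each pixel
    back = [[0] * width]       # column of the predecessor on the row above
    for y in range(1, height):
        row_cost, row_back = [], []
        for x in range(width):
            # Candidate predecessors: up-left, up, up-right (clipped to the picture).
            candidates = range(max(0, x - 1), min(width, x + 2))
            best = min(candidates, key=lambda px: cost[y - 1][px])
            row_cost.append(energy[y][x] + cost[y - 1][best])
            row_back.append(best)
        cost.append(row_cost)
        back.append(row_back)
    # Take the cheapest bottom-row result and back-track to the top.
    x = min(range(width), key=lambda px: cost[-1][px])
    seam = [x]
    for y in range(height - 1, 0, -1):
        x = back[y][x]
        seam.append(x)
    seam.reverse()
    return seam


energy = [
    [5, 1, 5],
    [5, 5, 1],
    [5, 1, 5],
]
print(find_vertical_seam(energy))  # prints [1, 2, 1]
```

In the small example the seam follows the three low-energy pixels, stepping diagonally on the middle row, for a total seam energy of 3.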

Having found the minimum-energy seam, we simply remove all its pixels from the picture, shifting the rest of the picture into the gap to make a new picture one pixel narrower than before. This is the process of "carving" a seam from the picture.
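The carving step itself is simple, as the following Python sketch suggests. The function name `remove_vertical_seam` is hypothetical, and the seam is assumed to be given as one column index per row.

```python
def remove_vertical_seam(picture, seam):
    """Drop one pixel from every row and close the gap,
    giving a picture one pixel narrower."""
    return [row[:x] + row[x + 1:] for row, x in zip(picture, seam)]


picture = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
print(remove_vertical_seam(picture, [1, 2, 1]))
# prints [[1, 3], [4, 5], [7, 9]]
```

Repeating the energy calculation, seam search and removal once per pixel of width to be lost gives the progressive resizing described above.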

Seam carving may be performed horizontally, removing top-to-bottom seams to reduce the width of the picture, or vertically, removing left-to-right seams to reduce the height of the picture. If the picture is being shrunk in both directions, individual horizontal and vertical seam carving operations may be carried out according to a pattern or rule, or in a picture-dependent order based on minimizing the energy cost of the seam removal operations.

Seam carving may also be used to expand pictures. In this case, seam carving is first used to reduce the size of a picture by the same number of pixels as the desired increase. Then, for every pixel that is removed in the seam carving process, a new pixel is instead added to the original picture by interpolation. Thus, low-energy areas that would have been removed are in fact doubled in size.
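The expansion step can be sketched in Python as follows. This is only an illustration under stated assumptions. The positions of the pixels that seam carving would have removed are assumed to be already known in original-picture coordinates (`removed[y]` lists the columns for row `y`), and the interpolation rule, averaging each marked pixel with its right-hand neighbor, is chosen here for simplicity and is not specified in the excerpt. The name `expand_with_seams` is hypothetical.

```python
def expand_with_seams(picture, removed):
    """Insert one interpolated pixel after every pixel that seam
    carving would have removed, widening each row accordingly."""
    expanded = []
    for row, xs in zip(picture, removed):
        marked = set(xs)
        new_row = []
        for x, value in enumerate(row):
            new_row.append(value)
            if x in marked:
                # Assumption: average with the right-hand neighbor,
                # repeating the pixel itself at the right border.
                right = row[x + 1] if x + 1 < len(row) else value
                new_row.append((value + right) / 2)
        expanded.append(new_row)
    return expanded


picture = [
    [10, 20, 30],
    [10, 20, 30],
]
print(expand_with_seams(picture, [[0], [2]]))
# prints [[10, 15.0, 20, 30], [10, 20, 30, 30.0]]
```

Each row grows by exactly the number of marked pixels, so the low-energy areas occupy twice their original width.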

SEAM CARVING ON MOVING SEQUENCES - The paper describes in detail, with illustrations, the sophisticated mathematical techniques that are used for extending seam carving to moving sequences. These are Recursive Energy Weighting (with motion compensation) and Map Processing (including map transformations with mixing, scaling, and smoothing), with arrangements for combining horizontal and vertical seam carving operations.

RESULTS - The techniques described in the paper were combined into a single algorithm and tested on many sequences, including some 6,000 frames of the most interesting and critical material from a 24Hz progressive HDTV version of the motion picture "Mission Antarctique". Our main goal was aspect ratio conversion of this material from 16:9 to 4:3. Several insights were gained from these tests. The original seam carving algorithm, applied picture by picture, produces unwatchable results, whereas recursive energy weighting produces smoothly moving results whose remaining temporal inconsistencies are largely removed by map smoothing. The addition of motion compensation is beneficial, particularly in sensitive regions such as faces, where changes of shape due to erroneous propagation of seams are particularly disturbing. The figure above shows a 16:9 source picture, the horizontal and vertical seams and the resulting rescaled picture in 4:3.

DISCUSSION - In general, the algorithm works remarkably well on natural scenes with detailed foreground objects or people. The foreground is preserved, and the inevitable distortions in the shapes of landscape elements are much less apparent than they would be on foreground objects.

The remaining problems with the algorithm fall into two categories. First, there are problems to do with the nature of the source material, which could be solved using known techniques. For example, in many scenes from "Mission Antarctique" the foreground is relatively dark and can sometimes lose detail, leading to low energy values and the possibility of attracting seams and therefore being distorted. Smooth foreground objects, for example penguins and diving suits, can also end up narrower than they should be. In both cases, the problems could be overcome by algorithms for detecting regions of interest based on color and/or motion.

The second category of problem is more fundamental and is concerned with what we actually desire when we remove areas from moving scenes. To give one example, what should happen to a steadily panning background while the camera is tracking a foreground object? If the algorithm is working correctly, seams in the background will be tracked until they leave the edge of the picture, but what should happen then? Those seams can either "bunch up" near the edge of the picture, "reappear" at the other end of the picture, or be "set free" for reallocation to another part of the background. The first two outcomes seem to be the most desirable, as they tend to result in cropping, and there is an argument to say that cropping is perfectly acceptable in a background pan because it only serves to bring forward the disappearance of information that is about to disappear anyway, or to delay the appearance of information at the other end of the picture.

FURTHER WORK - Following on from the discussion above, further tests are needed to incorporate algorithms for detecting regions of interest, modifying the energy function accordingly. Beyond that, it would be desirable to concentrate on the more fundamental problems of removing material from moving sequences. Ultimately, the goal may be to move away from the concept of seams altogether and toward an approach where the map function is considered as a two- or even a three-dimensional "membrane" which is distorted smoothly by a measure of importance of the picture information.

This paper will be presented on Tuesday, April 15, 2008 starting at 11:30 a.m. in room S226/227 of the Las Vegas Convention Center. It will also be included in its entirety in the 2008 NAB Broadcast Engineering Conference Proceedings, on sale at the 2008 NAB Show. For additional conference information visit the NAB Show Web page at www.nabshow.com.

2008 NAB Broadcast Engineering Conference Summary of Presentations
Check out the papers that will be presented at the 2008 NAB Broadcast Engineering Conference in Las Vegas, April 12-17, 2008.

Mobile TV: Opportunity at 100 MPH!
Monday, April 14
7:30 a.m. - 8:30 a.m.
Las Vegas Hilton Ballroom A

The Open Mobile Video Coalition (OMVC) invites engineers from television, telcos, cable and OEMs to learn more about breakthroughs and milestones in engineering, consumer interest and testing, as well as new revenue opportunities in the fast-approaching world of locally broadcast Mobile TV. Join them for breakfast on Monday, April 14 in Ballroom A.

The March 24, 2008 TV TechCheck is also available in an Adobe Acrobat file.