Aspect Ratio Conversion by "Seam Carving"
Last week's TV TechCheck featured a
paper from the upcoming NAB Broadcast Engineering Conference in
Las Vegas describing how one broadcast network is addressing the
challenge of handling multiple aspect ratios through the production
and distribution process. Another paper in the same session, entitled
"Seam Carving for Video" by Mike Knee, Algorithm Team
Leader at Snell & Wilcox Ltd., presents a different technology
that offers a way of transforming images from one aspect ratio
to another without cropping or adding black bars. While this raises
issues of altering the original producer's composition, it is
an interesting concept and excerpts are presented below.
Seam carving
is a technique for content-aware image resizing in which the size
and shape of regions of visual importance are preserved without
resorting to cropping. With user interaction, unwanted objects
can also be removed from an image with minimal effect on the remainder
of the picture. The results on still pictures have been spectacular,
but extending the idea to moving video presents significant challenges.
This paper describes how seam carving can successfully be modified
and extended to work with moving video. The resulting algorithm
is robust, computationally efficient and gives pleasing results.
It can be combined seamlessly with dynamic reframing and conventional
resizing to provide a rich toolkit for content-aware repurposing
of video. Applications include post-production, conversion between
HD and SD TV standards, aspect ratio conversion, repurposing for
mobile devices and internet video.
INTRODUCTION - The main application considered here is aspect ratio conversion,
for example from 16:9 to 4:3, but the techniques presented can
readily be applied to other video sequence resizing and re-purposing
tasks.
Conventional
approaches to aspect ratio conversion include cropping (removal
of information at the sides or top and bottom of the picture),
letterboxing (adding black bars) or stretching or squeezing the
picture to the required shape. These approaches all have drawbacks.
Cropping removes information that might be important, letterboxing
wastes precious screen space and loses resolution, and stretching
or squeezing changes the shape of objects. The drawback of cropping
can be overcome to some extent by a pan-scan process in which
the preserved region of the picture is moved around smoothly to
follow the most important content. This can be done manually or
by an automated process which tracks regions of interest in the
scene. However, pan-scan can fail when there are objects of interest
near both ends of the picture, for example in a motion picture
dialog scene.
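As a rough worked example of these trade-offs, the sketch below computes what each conventional approach would do to a single frame. The 1920x1080 frame size and the function name `conventional_options` are illustrative assumptions, not figures from the paper.

```python
from fractions import Fraction

def conventional_options(src_w, src_h, target_ratio):
    """Compare crop, letterbox and squeeze for one frame (illustrative only)."""
    src_ratio = Fraction(src_w, src_h)
    if target_ratio >= src_ratio:
        raise ValueError("this sketch only covers conversion to a narrower ratio")
    # Cropping: keep the full height and discard columns at the sides.
    crop_w = int(src_h * target_ratio)
    # Letterboxing: keep the full width and pad rows to reach the target shape.
    canvas_h = int(src_w / target_ratio)
    return {
        "crop_columns_lost": src_w - crop_w,
        "letterbox_bar_rows": canvas_h - src_h,
        "letterbox_screen_used": Fraction(src_h, canvas_h),
        # Squeezing: every object is scaled horizontally by this factor.
        "squeeze_factor": target_ratio / src_ratio,
    }

# Assumed example: a 1920x1080 (16:9) frame converted to 4:3.
opts = conventional_options(1920, 1080, Fraction(4, 3))
```

Under these assumptions, cropping discards 480 columns, letterboxing adds 360 rows of bars and uses three quarters of the screen, and squeezing narrows every object to three quarters of its width.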
Seam carving
is a novel technique used to resize or reshape a picture by removing
less important pixels. The integrity of the information that is
retained is preserved by ensuring that the pixels that are removed
form connected "seams" which "carve" through
the picture from top to bottom or from left to right. Spectacular
results have been presented in which pictures can be gracefully
and progressively resized by the repeated removal of seams. The
technique can also be used for the expansion of pictures and for
the removal of specific objects. When used for aspect ratio conversion,
the disadvantages of conventional methods can be overcome. The
full extent of the original picture can be preserved without changing
the shape of the objects in the scene and without wasting screen
area or losing resolution.
Seam carving
as originally presented was developed for still pictures and produces
unacceptable artifacts when applied to moving sequences, because
the seam carving decisions are made independently for each frame
of the sequence. The process can be adapted for moving sequences,
using two new techniques referred to as recursive energy weighting
and map processing. Recursive energy weighting addresses the
fundamental problem of coping with moving sequences. Map processing
improves the quality and flexibility of the process and brings
two additional benefits. It can dramatically increase processing
speed, which is particularly important for seam carving of HDTV
sequences, and it can provide elegant solutions to the problem
of combining horizontal and vertical seam carving for still pictures
as well as for sequences.
THE ORIGINAL SEAM CARVING ALGORITHM - This section gives a brief, informal
description of the seam carving algorithm for still pictures.
Suppose we wish to shrink a picture horizontally. Seam carving
is applied repeatedly, shrinking the
picture by one pixel width at a time. Each pass of the algorithm
operates as follows. We calculate an energy or activity
function for each pixel in the picture. Typically, this is the
sum of absolute differences between the current pixel's luminance
value and each of its four neighbors. We then find a seam
of minimum energy extending from the top to the bottom of the
picture. A seam is a set of connected pixels, one pixel per line,
the connection criterion typically being vertical or diagonal
adjacency. The figure to the right gives an example of a seam
on a very small picture.
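The energy function described above can be sketched as follows. This is a minimal illustration, assuming the picture is a list of rows of luminance values; the name `pixel_energy` is hypothetical, and skipping missing neighbors at the picture border is an assumption, since the text does not say how edges are handled.

```python
def pixel_energy(lum):
    """Return per-pixel energy for a 2-D list of luminance values.

    Energy is the sum of absolute differences between a pixel and each
    of its four neighbors (up, down, left, right).
    """
    height, width = len(lum), len(lum[0])
    energy = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                # Assumption: neighbors outside the picture are ignored.
                if 0 <= ny < height and 0 <= nx < width:
                    total += abs(lum[y][x] - lum[ny][nx])
            energy[y][x] = total
    return energy

# A flat picture with one bright pixel in the middle.
example = pixel_energy([[10, 10, 10],
                        [10, 50, 10],
                        [10, 10, 10]])
```

In this example the bright center pixel gets the highest energy, so a minimum-energy seam would tend to avoid it.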
The energy
of a seam is the sum of the energy values of the pixels in the
seam. The minimum-energy seam can be found using a recursive technique
in which we calculate best partial seams leading to each pixel
on successive rows of the picture until we have a minimum-energy
seam leading to each pixel on the bottom row. We then take the
minimum of all the bottom-row results and back-track along the
seam to the top of the picture.
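The recursive search just described is a dynamic-programming pass over the rows. The sketch below is one way to write it, assuming an energy map stored as a list of rows; the name `find_min_seam` and the leftmost tie-breaking rule are assumptions made for illustration.

```python
def find_min_seam(energy):
    """Return the column index on each row of a minimum-energy seam."""
    height, width = len(energy), len(energy[0])
    cost = [list(energy[0])]        # best partial-seam cost ending at each pixel
    back = [[0] * width]            # predecessor column (first row is unused)
    for y in range(1, height):
        row_cost, row_back = [], []
        for x in range(width):
            # Connected pixels on the previous row: vertical or diagonal.
            candidates = [c for c in (x - 1, x, x + 1) if 0 <= c < width]
            best = min(candidates, key=lambda c: cost[y - 1][c])
            row_cost.append(energy[y][x] + cost[y - 1][best])
            row_back.append(best)
        cost.append(row_cost)
        back.append(row_back)
    # Take the cheapest bottom-row result and back-track to the top.
    x = min(range(width), key=lambda c: cost[-1][c])
    seam = [x]
    for y in range(height - 1, 0, -1):
        x = back[y][x]
        seam.append(x)
    seam.reverse()
    return seam

seam = find_min_seam([[5, 1, 5],
                      [5, 5, 1],
                      [5, 1, 5]])
```

Here the seam follows the three low-energy pixels, moving diagonally right and then back left.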
Having found
the minimum-energy seam, we simply remove all its pixels from
the picture, shifting the rest of the picture into the gap to
make a new picture one pixel narrower than before. This is the
process of "carving" a seam from the picture.
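The carving step itself is a per-row deletion. A minimal sketch, assuming the seam is given as one column index per row (the name `remove_seam` is hypothetical):

```python
def remove_seam(picture, seam):
    """Return a copy of the picture with one pixel removed from each row.

    picture: list of rows of pixel values.
    seam: list with one column index per row, top to bottom.
    """
    if len(seam) != len(picture):
        raise ValueError("seam must have exactly one entry per row")
    # Pixels to the right of the seam shift left to fill the gap.
    return [row[:x] + row[x + 1:] for row, x in zip(picture, seam)]

narrower = remove_seam([[1, 2, 3],
                        [4, 5, 6],
                        [7, 8, 9]], [1, 2, 1])
```

Repeating the energy calculation, seam search and removal once per column yields the progressive shrinking described above.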
Seam
carving may be performed horizontally, removing top-to-bottom
seams to reduce the width of the picture, or vertically, removing
left-to-right seams to reduce the height of the picture. If the
picture is being shrunk in both directions, individual horizontal
and vertical seam carving operations may be carried out according
to a pattern or rule, or in a picture-dependent order based on
minimizing the energy cost of the seam removal operations.
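One simple way to reuse a top-to-bottom carving routine for left-to-right seams is to transpose the picture first. The sketch below shows this together with a fixed alternating pattern for shrinking in both directions. All names are hypothetical, the alternating rule is just one possible "pattern or rule", and `drop_last_column` is a stand-in for a real seam remover so that the example stays short.

```python
def transpose(picture):
    """Swap rows and columns of a picture stored as a list of rows."""
    return [list(col) for col in zip(*picture)]

def carve_row_via_transpose(picture, carve_column):
    """Remove one left-to-right seam using a top-to-bottom carving routine."""
    return transpose(carve_column(transpose(picture)))

def shrink(picture, new_width, new_height, carve_column):
    """Alternate width and height reductions until the target size is reached."""
    while len(picture[0]) > new_width or len(picture) > new_height:
        if len(picture[0]) > new_width:
            picture = carve_column(picture)
        if len(picture) > new_height:
            picture = carve_row_via_transpose(picture, carve_column)
    return picture

def drop_last_column(picture):
    """Stand-in for a real seam remover: makes the picture one pixel narrower."""
    return [row[:-1] for row in picture]

small = shrink([[1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12]], 2, 2, drop_last_column)
```

A picture-dependent order would replace the fixed alternation with a comparison of the energy cost of the next horizontal and vertical seam.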
Seam carving
may also be used to expand pictures. In this case, seam carving
is first used to reduce the size of a picture by the same number
of pixels as the desired increase. Then, for every pixel that
is removed in the seam carving process, a new pixel is instead
added to the original picture by interpolation. Thus, low-energy
areas that would have been removed are in fact doubled in size.
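The expansion step can be sketched as below, assuming the seams have already been found and mapped back to column positions in the original picture. The name `expand_rows` is hypothetical, and averaging each marked pixel with its right-hand neighbor is an assumed interpolation rule; the text does not specify one.

```python
def expand_rows(picture, removed_columns):
    """Insert one interpolated pixel next to every pixel a seam would remove.

    picture: list of rows of pixel values.
    removed_columns[y]: original column indices on row y that seam carving
    would have removed (one index per seam).
    """
    expanded = []
    for row, removed in zip(picture, removed_columns):
        marked = set(removed)
        new_row = []
        for x, value in enumerate(row):
            new_row.append(value)
            if x in marked:
                # Assumption: interpolate with the right-hand neighbor,
                # duplicating the pixel at the right edge of the picture.
                right = row[x + 1] if x + 1 < len(row) else value
                new_row.append((value + right) / 2)
        expanded.append(new_row)
    return expanded

wider = expand_rows([[10, 20, 40],
                     [10, 20, 40]], [[1], [2]])
```

Each row grows by one pixel per seam, so the low-energy areas that carving would have removed are widened instead.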
SEAM CARVING ON MOVING SEQUENCES - The paper describes in detail, with
illustrations, the sophisticated mathematical techniques that
are used for extending seam carving to moving sequences. These
are Recursive Energy Weighting (with motion compensation) and
Map Processing (including map transformations with mixing, scaling,
and smoothing), with arrangements for combining horizontal and
vertical seam carving operations.
RESULTS - The techniques described in the paper were combined
into a single algorithm and tested on many sequences, including
some 6,000 frames of the most interesting and critical material
from a 24Hz progressive HDTV version of the motion picture "Mission
Antarctique". Our main goal was aspect ratio conversion of
this material from 16:9 to 4:3. Several insights were gained from
these tests. The original seam carving algorithm, applied picture
by picture, produces unwatchable results, whereas recursive energy
weighting produces smoothly moving results whose remaining temporal
inconsistencies are largely removed by map smoothing. The addition
of motion compensation is beneficial, particularly in sensitive
regions such as faces, where changes of shape due to erroneous
propagation of seams are particularly disturbing. The figure above
shows a 16:9 source picture, the horizontal and vertical seams
and the resulting rescaled picture in 4:3.
DISCUSSION - In general, the algorithm works remarkably well on natural scenes
with detailed foreground objects or people. The foreground is
preserved, and the inevitable distortions in the shapes of landscape
elements are much less apparent than they would be on foreground
objects.
The remaining
problems with the algorithm fall into two categories. First, there
are problems to do with the nature of the source material, which
could be solved using known techniques. For example, in many scenes
from "Mission Antarctique" the foreground is relatively
dark and can sometimes lose detail, leading to low energy values
and the possibility of attracting seams and therefore being distorted.
Smooth foreground objects, for example penguins and diving suits,
can also end up narrower than they should be. In both cases, the
problems could be overcome by algorithms for detecting regions
of interest based on color and/or motion.
The second
category of problem is more fundamental and is concerned with
what we actually desire when we remove areas from moving scenes.
To give one example, what should happen to a steadily panning
background while the camera is tracking a foreground object? If
the algorithm is working correctly, seams in the background will
be tracked until they leave the edge of the picture, but what
should happen then? Those seams can either "bunch up"
near the edge of the picture, "reappear" at the other
end of the picture, or be "set free" for reallocation
to another part of the background. The first two outcomes seem
to be the most desirable, as they tend to result in cropping,
and there is an argument to say that cropping is perfectly acceptable
in a background pan because it only serves to bring forward the
disappearance of information that is about to disappear anyway,
or to delay the appearance of information at the other end of
the picture.
FURTHER WORK - Following on from the discussion above, further tests
are needed to incorporate algorithms for detecting regions of
interest, modifying the energy function accordingly. Beyond that,
it would be desirable to concentrate on the more fundamental problems
of removing material from moving sequences. Ultimately, the goal
may be to move away from the concept of seams altogether and toward
an approach where the map function is considered as a two- or
even a three-dimensional "membrane" which is distorted
smoothly by a measure of importance of the picture information.
This paper
will be presented on Tuesday, April 15, 2008 starting at 11:30
a.m. in room S226/227 of the Las Vegas Convention Center. It will
also be included in its entirety in the 2008 NAB Broadcast
Engineering Conference Proceedings, on sale at the 2008 NAB
Show. For additional conference information visit the NAB Show
Web page at www.nabshow.com.
2008 NAB Broadcast Engineering Conference Summary of Presentations
Check out the papers that will be presented at the 2008 NAB Broadcast
Engineering Conference in Las Vegas, April 12-17, 2008.
Mobile TV: Opportunity at 100 MPH!
Monday, April 14, 7:30 a.m. - 8:30 a.m.
Las Vegas Hilton Ballroom A
The Open Mobile Video Coalition (OMVC) invites engineers from television,
telcos, cable and OEMs to learn more about breakthroughs and milestones
in engineering, consumer interest and testing, as well as new
revenue opportunities in the fast-approaching locally broadcast
Mobile TV world. Join them for breakfast
on Monday, April 14 in Ballroom A.

The
March 24, 2008 TV TechCheck is also available
in an Adobe Acrobat file.