Sign In

Communications of the ACM

Research highlights

Extracting 3D Objects from Photographs Using 3-Sweep

Extracting 3D Objects from Photographs Using 3-Sweep, illustration


We introduce an interactive technique to extract and manipulate simple 3D shapes in a single photograph. Such extraction requires an understanding of the shape's components, their projections, and their relationships. These cognitive tasks are simple for humans, but particularly difficult for automatic algorithms. Thus, our approach combines the cognitive abilities of humans with the computational accuracy of the machine to create a simple modeling tool. In our interface, the human draws three strokes over the photograph to generate a 3D component that snaps to the outline of the shape. Each stroke defines one dimension of the component. Such human assistance implicitly segments a complex object into its components, and positions them in space. The computer reshapes the component to fit the image of the object in the photograph as well as to satisfy various inferred geometric constraints between components imposed by a global 3D structure. We show that this intelligent interactive modeling tool provides the means to create editable 3D parts quickly. Once the 3D object has been extracted, it can be quickly edited and placed back into photos or 3D scenes, permitting object-driven photo editing tasks which are impossible to perform in image-space.

Back to Top

1. Introduction

Extracting three dimensional objects from a single photo is still a long way from reality given the current state of technology, since it involves numerous complex tasks: the target object must be separated from its background, and its 3D pose, shape, and structure should be recognized from its projection. These tasks are difficult, even ill-posed, since they require some degree of semantic understanding of the object. To alleviate this difficulty, complex 3D models can be partitioned into simpler parts that can be extracted from the photo. However, assembling parts into an object also requires further semantic understanding and is difficult to perform automatically. Moreover, having decomposed a 3D shape into parts, the relationships between these parts should also be understood and maintained in the final composition.

In this paper, we present an interactive technique to extract 3D man-made objects from a single photograph, leveraging the strengths of both humans and computers. Human perceptual abilities are used to partition, recognize, and position shape parts, using a very simple interface based on triplets of strokes, while the computer performs tasks which are computationally intensive or require accuracy. The final object model produced by our method includes its geometry and structure, as well as some of its semantics. This allows the extracted model to be readily available for intelligent editing, which maintains the shape's semantics (see Figure 1).


No entries found

Log in to Read the Full Article

Sign In

Sign in using your ACM Web Account username and password to access premium content if you are an ACM member, Communications subscriber or Digital Library subscriber.

Need Access?

Please select one of the options below for access to premium content and features.

Create a Web Account

If you are already an ACM member, Communications subscriber, or Digital Library subscriber, please set up a web account to access premium content on this site.

Join the ACM

Become a member to take full advantage of ACM's outstanding computing information resources, networking opportunities, and other benefits.

Subscribe to Communications of the ACM Magazine

Get full access to 50+ years of CACM content and receive the print version of the magazine monthly.

Purchase the Article

Non-members can purchase this article or a copy of the magazine in which it appears.