Credit: iStockPhoto.com
We introduce an interactive technique for extracting and manipulating simple 3D shapes in a single photograph. Such extraction requires an understanding of the shape's components, their projections, and their relationships. These cognitive tasks are simple for humans but particularly difficult for automatic algorithms; our approach therefore combines the cognitive abilities of humans with the computational accuracy of the machine in a simple modeling tool. In our interface, the human draws three strokes over the photograph to generate a 3D component that snaps to the outline of the shape; each stroke defines one dimension of the component. This human assistance implicitly segments a complex object into its components and positions them in space. The computer reshapes each component to fit the image of the object in the photograph and to satisfy geometric constraints between components inferred from the global 3D structure. We show that this intelligent interactive modeling tool provides the means to create editable 3D parts quickly. Once the 3D object has been extracted, it can be quickly edited and placed back into photographs or 3D scenes, enabling object-driven photo-editing tasks that are impossible to perform in image space.
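To make the three-stroke interaction concrete, the sketch below shows one possible (and deliberately simplified) mapping from strokes to a swept component. The function name, the circular cross-section, and the stroke conventions are all assumptions for illustration: here the first two strokes span the diameter of the base profile, and the third stroke sweeps that profile along the component's axis. The paper's actual system additionally snaps the result to the object's outline in the photograph, which this sketch omits.

```python
import math

def component_from_strokes(s1, s2, axis, n_sides=16):
    """Build a crude generalized cylinder from three strokes.

    Hypothetical simplification of the three-stroke interface:
      s1, s2 -- strokes whose endpoints span the base cross-section,
                here modeled as a circle (lists of (x, y) points)
      axis   -- third stroke sweeping the profile (list of (x, y) points)
    Returns the base center, the profile radius, and one polygonal
    cross-section ("ring") per sample of the axis stroke.
    """
    # Base circle: center and radius from the first two stroke endpoints.
    (x1, y1), (x2, y2) = s1[0], s2[-1]
    center = ((x1 + x2) / 2.0, (y1 + y2) / 2.0)
    radius = math.hypot(x2 - x1, y2 - y1) / 2.0

    # Sweep: place one circular cross-section at every axis sample.
    sections = []
    for ax, ay in axis:
        ring = [(ax + radius * math.cos(2 * math.pi * k / n_sides),
                 ay + radius * math.sin(2 * math.pi * k / n_sides))
                for k in range(n_sides)]
        sections.append(ring)
    return {"center": center, "radius": radius, "sections": sections}
```

A real implementation would let the cross-section radius vary along the sweep so the component can track the object's silhouette, but even this fixed-radius version shows how three strokes suffice to pin down the three dimensions of a simple part.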
Extracting three-dimensional objects from a single photograph is still a long way from reality given the current state of technology, since it involves numerous complex tasks: the target object must be separated from its background, and its 3D pose, shape, and structure must be recognized from its projection. These tasks are difficult, even ill-posed, since they require some degree of semantic understanding of the object. To alleviate this difficulty, complex 3D models can be partitioned into simpler parts that can be extracted from the photograph. However, assembling parts into an object also requires semantic understanding and is difficult to perform automatically. Moreover, once a 3D shape has been decomposed into parts, the relationships between these parts must also be understood and maintained in the final composition.
In this paper, we present an interactive technique for extracting 3D man-made objects from a single photograph, leveraging the strengths of both humans and computers. Human perceptual abilities are used to partition, recognize, and position shape parts through a very simple interface based on triplets of strokes, while the computer performs tasks that are computationally intensive or require accuracy. The final object model produced by our method includes its geometry and structure, as well as some of its semantics. This makes the extracted model readily available for intelligent editing that maintains the shape's semantics (see Figure 1).