6533b81ffe1ef96bd1278ba6
RESEARCH PRODUCT
3D Object Modeling by Sharing Visual Attributes across Poses and Scales
Liliana Lo PrestiMarco La Casciasubject
Scene ParsingVisual attributeObject Modelingdescription
Scene parsing aims at understanding a scene and the arrangements of the objects in it. While this is a task human beings are pretty good at [7], a machine needs to: recognize the kind of scene (indoor vs outdoor, bedroom vs. living room etc.)[4], detect and recognize 3D objects across multiple poses and scales [8, 5], infer the geometrical arrangement of the objects in the scene [2, 1], etc.. In the proposed framework, a 3D object is modeled as a graph. Each node in the graph represents a visual attribute automatically discovered by considering features that are consistently and repeatedly present across different poses and scales. Such visual attributes are different from “parts” [5], which are referred in literature as regions the object may be segmented in and related together by geometrical proper- ties or transformations. Differently than other methods [5], which learn the appearance of the object given the pose, we aim at building a model that jointly represents local appearance, pose and scale. To discover visual attributes, we track key-points across multiple views and cluster the set of trajectories to get the nodes in the graph. In practice, each node is a visual attribute and is described by the clustered key-point expected appearance, the pose probability distribution representing how likely is that the visual attribute may be detected in a discretized set of poses, the scale probability distribution representing the probability that the visual attribute is visible in a discretized set of scales. Edges in the graph represent connectivity among the visual attributes across poses: two visual attributes are connected if they can be detected together at least in a pose. Figure 1.a and 1.b show the set of trajectories detected across a subset of views for two objects belonging to the same category. The set of trajectories across multiple objects is used to learn the visual attribute graph for the category (1.c).
| year | journal | country | edition | language |
|---|---|---|---|---|
| 2014-01-01 |