6533b81ffe1ef96bd1278ba6
RESEARCH PRODUCT
3D Object Modeling by Sharing Visual Attributes across Poses and Scales
Liliana Lo PrestiMarco La Casciasubject
Scene ParsingVisual attributeObject Modelingdescription
Scene parsing aims at understanding a scene and the arrangements of the objects in it. While this is a task human beings are pretty good at [7], a machine needs to: recognize the kind of scene (indoor vs outdoor, bedroom vs. living room etc.)[4], detect and recognize 3D objects across multiple poses and scales [8, 5], infer the geometrical arrangement of the objects in the scene [2, 1], etc.. In the proposed framework, a 3D object is modeled as a graph. Each node in the graph represents a visual attribute automatically discovered by considering features that are consistently and repeatedly present across different poses and scales. Such visual attributes are different from “parts” [5], which are referred in literature as regions the object may be segmented in and related together by geometrical proper- ties or transformations. Differently than other methods [5], which learn the appearance of the object given the pose, we aim at building a model that jointly represents local appearance, pose and scale. To discover visual attributes, we track key-points across multiple views and cluster the set of trajectories to get the nodes in the graph. In practice, each node is a visual attribute and is described by the clustered key-point expected appearance, the pose probability distribution representing how likely is that the visual attribute may be detected in a discretized set of poses, the scale probability distribution representing the probability that the visual attribute is visible in a discretized set of scales. Edges in the graph represent connectivity among the visual attributes across poses: two visual attributes are connected if they can be detected together at least in a pose. Figure 1.a and 1.b show the set of trajectories detected across a subset of views for two objects belonging to the same category. The set of trajectories across multiple objects is used to learn the visual attribute graph for the category (1.c).
year | journal | country | edition | language |
---|---|---|---|---|
2014-01-01 |