
RESEARCH PRODUCT

Toward Approximate GML Retrieval Based on Structural and Semantic Characteristics

Richard Chbeir, Patrizia Grifoni, Fernando Ferri, Joe Tekli

subject

[INFO.INFO-IR] Computer Science [cs]/Information Retrieval [cs.IR]; [INFO.INFO-WB] Computer Science [cs]/Web; [INFO.INFO-DB] Computer Science [cs]/Databases [cs.DB]; [INFO.INFO-MM] Computer Science [cs]/Multimedia [cs.MM]; [SCCO.COMP] Cognitive science/Computer science; GML Search; Structural & Semantic Similarity; GIS; XML; Tree edit distance; Ranked retrieval; Information retrieval; Data mining; Similarity (geometry); Encoding (memory); Process (computing); Constraint (information theory); Decision tree model; Computer science; computer; computer.internet_protocol; computer.software_genre; 02 engineering and technology; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; 020204 information systems

description

GML is emerging as the new standard for representing geographic information in GISs on the Web, allowing structurally and semantically rich geographic data to be encoded as self-describing XML-based geographic entities. In this study, we address the problem of approximate querying and ranked results for GML data and provide a method for GML query evaluation. Our method makes two main contributions. First, we propose a tree model for representing GML queries and data collections. Second, we introduce a GML retrieval method based on tree edit distance, an efficient means of comparing semi-structured data. Our approach evaluates both structural and semantic similarities in GML data, enabling the user to tune the querying process according to her needs. The user can also choose between two modes in the similarity evaluation process: template querying, which takes into account all elements in the query and data trees, and minimal constraint querying, which considers only those elements required by the query and disregards additional data elements. An experimental prototype was implemented to test and validate our method. Results are promising.
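
The two querying modes and the tunable semantic component can be illustrated with a small sketch. This is not the authors' algorithm: it is a textbook recursive edit distance on ordered labeled trees, and it rests on assumptions that do not come from the record. The `node` and `tree_edit_distance` names, the unit costs, the element labels (`City`, `Town`, `name`, `population`, `location`), and the 0.2 relabeling cost are all hypothetical. Minimal constraint querying is approximated here by making data-only elements free to insert, and semantic similarity by a user-supplied relabeling cost.

```python
from functools import lru_cache

def node(label, *children):
    """Build an ordered labeled tree as a hashable (label, children) pair."""
    return (label, tuple(children))

def tree_edit_distance(query, data, rename_cost=None,
                       insert_cost=1.0, delete_cost=1.0):
    """Cost of turning the query tree into the data tree.

    Assumption: unit costs by default; rename_cost(a, b) may return a
    smaller value for semantically close labels.
    """
    if rename_cost is None:
        rename_cost = lambda a, b: 0.0 if a == b else 1.0

    @lru_cache(maxsize=None)
    def forest(f, g):
        # f and g are ordered forests (tuples of trees).
        if not f and not g:
            return 0.0
        if not f:
            # Only data nodes remain: insert the rightmost root.
            _, kids = g[-1]
            return forest(f, g[:-1] + kids) + insert_cost
        if not g:
            # Only query nodes remain: delete the rightmost root.
            _, kids = f[-1]
            return forest(f[:-1] + kids, g) + delete_cost
        (fl, fk), (gl, gk) = f[-1], g[-1]
        return min(
            forest(f[:-1] + fk, g) + delete_cost,      # delete query root
            forest(f, g[:-1] + gk) + insert_cost,      # insert data root
            forest(fk, gk) + forest(f[:-1], g[:-1])    # match the two roots
            + rename_cost(fl, gl),
        )

    return forest((query,), (data,))

# Hypothetical GML-like trees: the data has one element the query lacks.
query = node("City", node("name"), node("location"))
data = node("City", node("name"), node("population"), node("location"))

# Template querying: every element counts, so the extra element is penalized.
template = tree_edit_distance(query, data)

# Minimal constraint querying (sketch): data-only elements cost nothing.
minimal = tree_edit_distance(query, data, insert_cost=0.0)

# Semantic tuning (sketch): a made-up table of close label pairs.
close_labels = {frozenset(("City", "Town")): 0.2}

def semantic_rename(a, b):
    if a == b:
        return 0.0
    return close_labels.get(frozenset((a, b)), 1.0)

town_query = node("Town", node("name"), node("location"))
semantic = tree_edit_distance(town_query, data,
                              rename_cost=semantic_rename, insert_cost=0.0)
```

Under these assumptions, a lower distance means a better match, so ranking a collection amounts to sorting its data trees by distance to the query tree. The plain recursion is only meant for small trees.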

https://hal.archives-ouvertes.fr/hal-01093369