
RESEARCH PRODUCT

Extensible User-Based XML Grammar Matching

Joe Tekli, Kokou Yetongnon, Richard Chbeir

subject

Document Structure Description; XML Encryption; Efficient XML Interchange; XML Signature; Schema matching; Simple API for XML; XML Schema Editor; Streaming XML; RELAX NG; XML schema; Binary XML; SGML; Information retrieval; XML validation; XML framework; XML database; XML Schema (W3C); Vector space model; XML; XML Catalog; Computer science; computer; computer.internet_protocol; computer.software_genre; computer.programming_language; computer.file_format; 02 engineering and technology; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; 020204 information systems; [INFO.INFO-DB] Computer Science [cs]/Databases [cs.DB]; [INFO.INFO-IR] Computer Science [cs]/Information Retrieval [cs.IR]; [INFO.INFO-MM] Computer Science [cs]/Multimedia [cs.MM]; [INFO.INFO-WB] Computer Science [cs]/Web; [SCCO.COMP] Cognitive science/Computer science

description

International audience. XML grammar matching has attracted considerable interest recently, owing to the growing number of heterogeneous XML documents on the web and the increasing need to integrate, and consequently search and retrieve, XML data originating from different data sources. In this paper, we provide an approach for automatic XML grammar matching and comparison that aims to minimize the user effort required to perform the match task. We propose an open framework based on the concept of tree edit distance, integrating different matching criteria so as to capture XML grammar element semantic and syntactic similarities, cardinality and alternativeness constraints, as well as data-type correspondences and relative ordering. The framework is flexible: it lets the user choose the mapping cardinality (1:1, 1:n, n:1, n:n), in contrast with existing static methods (constrained to 1:1), and it considers user feedback to adjust matching results to the user's perception of correct matches. The experiments conducted demonstrate the efficiency of our approach in comparison with alternative methods.
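To make the notion of tree edit distance concrete, the following is a minimal Python sketch of a simplified top-down (Selkow-style) variant, in which only whole subtrees are inserted or deleted and node labels may be substituted. It is not the framework described in the paper: the names `Node`, `size`, `label_cost`, and `tree_distance` are hypothetical, costs are unit costs chosen for illustration, and cardinality constraints, data types, semantic similarity, and user feedback are all left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A grammar element with an ordered tuple of child elements (hypothetical type)."""
    label: str
    children: tuple = ()


def size(tree):
    """Number of nodes in the tree; used as the cost of inserting or deleting it."""
    return 1 + sum(size(child) for child in tree.children)


def label_cost(a, b):
    """Assumed unit substitution cost; a semantic or syntactic measure could replace it."""
    return 0.0 if a == b else 1.0


def tree_distance(a, b, cost=label_cost):
    """Simplified top-down edit distance between two ordered labeled trees."""
    m, n = len(a.children), len(b.children)
    # d[i][j] = cheapest way to turn the first i children of a into the first j of b.
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + size(a.children[i - 1])  # delete subtree
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + size(b.children[j - 1])  # insert subtree
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + size(a.children[i - 1]),   # delete subtree
                d[i][j - 1] + size(b.children[j - 1]),   # insert subtree
                d[i - 1][j - 1]                          # match the two subtrees
                + tree_distance(a.children[i - 1], b.children[j - 1], cost),
            )
    return cost(a.label, b.label) + d[m][n]


if __name__ == "__main__":
    g1 = Node("book", (Node("title"), Node("author"), Node("year")))
    g2 = Node("book", (Node("title"), Node("writer")))
    print(tree_distance(g1, g2))  # relabel author -> writer, delete year
```

Because `label_cost` is passed in as a parameter, a similarity measure of the user's choosing can be substituted without touching the recursion, which loosely mirrors the idea of an open framework that integrates several matching criteria.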

https://hal.archives-ouvertes.fr/hal-01094109