Search results for "Extraction"

showing 10 items of 2072 documents

The Anatomy of an Optical Biopsy Semantic Retrieval System

2012

A case-based computer-aided diagnosis system assists physicians and other medical personnel in the interpretation of optical biopsies obtained through confocal laser endomicroscopy. Extraction in CLE images shows promising results on inferring semantic metadata from low-level features. In order to effectively ensure the interoperability with potential third-party applications, the system provides an interface compliant with the recent standards ISO/IEC 15938-12:2008 (MPEG Query Format) and ISO/IEC 24800 (JPEG Search).

Information retrieval; Computer science; Interface (computing); Interoperability; Feature extraction; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Feature recognition; computer.file_format; Optical Biopsy; JPEG; Computer Science Applications; Metadata; Hardware and Architecture; Signal Processing; Media Technology; computer; Image retrieval; Software

IEEE Multimedia
researchProduct

Content Code Blurring: A New Approach to Content Extraction

2008

Most HTML documents on the World Wide Web contain far more than the article or text which forms their main content. Navigation menus, functional and design elements or commercial banners are typical examples of additional contents. Content extraction is the process of identifying the main content and/or removing the additional contents. We introduce content code blurring, a novel content extraction algorithm. As the main text content is typically a long, homogeneously formatted region in a web document, the aim is to identify exactly these regions in an iterative process. Comparing its performance with existing content extraction solutions, we show that for most documents content code blurrin…

Information retrieval; Computer science; business.industry; Content (measure theory); Content extraction; Process (computing); Code (cryptography); business; Knowledge acquisition; Content management

2008 19th International Conference on Database and Expert Systems Applications
researchProduct
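
The idea in the abstract above (finding long, homogeneously formatted text regions through an iterative process) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the per-character content/code labelling, the simple moving-average blur, and the names `content_code_vector`, `blur`, `extract_main_content`, `radius`, `iterations` and `threshold` are all hypothetical.

```python
def content_code_vector(html):
    """Label each character 1.0 if it is text content, 0.0 if it is part of a tag."""
    chars, vector, in_tag = [], [], False
    for ch in html:
        if ch == "<":
            in_tag = True
        chars.append(ch)
        vector.append(0.0 if in_tag else 1.0)
        if ch == ">":
            in_tag = False
    return chars, vector


def blur(vector, radius=2, iterations=3):
    """Repeatedly replace each value by the mean of its neighbourhood (a box blur)."""
    for _ in range(iterations):
        blurred = []
        for i in range(len(vector)):
            lo, hi = max(0, i - radius), min(len(vector), i + radius + 1)
            window = vector[lo:hi]
            blurred.append(sum(window) / len(window))
        vector = blurred
    return vector


def extract_main_content(html, radius=2, iterations=3, threshold=0.75):
    """Keep content characters whose blurred score stays at or above the threshold."""
    chars, vector = content_code_vector(html)
    scores = blur(vector, radius, iterations)
    return "".join(
        ch for ch, is_content, score in zip(chars, vector, scores)
        if is_content == 1.0 and score >= threshold
    )


page = (
    "<a>Home</a><a>News</a>"
    "<p>This is a long stretch of main article text that should survive.</p>"
)
result = extract_main_content(page)
print(result)
```

Short text fragments surrounded by markup (the two link labels) are pulled below the threshold by their tag neighbours, while the interior of the long paragraph keeps a score of 1.0. A few characters at the paragraph edges are also lost, which is a known cost of this simplified blur.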

Semantic web service discovery system for road traffic information services

2015

Create a multi-agent platform for a traveller information system (FIPA standards). Extend Paulucci algorithm with the use of seven similarity measures. Weight the similarity measure according to semantic relation and parameter nature. Improved running-time with a filtering pre-process for non-functional parameters. Improved the recall by measuring the sibling relationship concepts.

We describe a multi-agent platform for a traveller information system, allowing travellers to find the road traffic information web services (WSs) that best fit their requirements. After studying existing proposals for discovery of semantic WS, we implemented a hybrid matching algorithm, which is described in detail …

Information retrieval; Computer science; business.industry; General Engineering; Semantic web services; Similarity measure; computer.software_genre; Road traffic information systems; Social Semantic Web; Computer Science Applications; Knowledge discovery; Semantic similarity; Knowledge extraction; Artificial Intelligence; Information system; Information retrieval; Semantic integration; Relevance (information retrieval); Semantic Web Stack; Data mining; Web service; business; Matchmaking; computer; Semantic matching
researchProduct

Combining content extraction heuristics

2008

The main text content of an HTML document on the WWW is typically surrounded by additional contents, such as navigation menus, advertisements, link lists or design elements. Content Extraction (CE) is the task of identifying and extracting the main content. Ongoing research has spawned several CE heuristics of different quality. However, so far only the Crunch framework combines several heuristics to improve its overall CE performance. Since Crunch, though, many new algorithms have been formulated. The CombinE system is designed to test, evaluate and optimise combinations of CE heuristics. Its aim is to develop CE systems which yield better and more reliable extracts of the main content of a web …

Information retrieval; Computer science; media_common.quotation_subject; Design elements and principles; computer.software_genre; Crunch; Task (project management); Content extraction; Quality (business); Data mining; Heuristics; Web document; computer; media_common

Proceedings of the 10th International Conference on Information Integration and Web-based Applications & Services
researchProduct
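
One simple way to combine the outputs of several CE heuristics, as the abstract above discusses, is a vote over text blocks. The sketch below is only an illustration under stated assumptions: the abstract does not say how CombinE combines heuristics, and the function `combine_by_vote`, its parameters and the sample block names are hypothetical.

```python
from collections import Counter


def combine_by_vote(all_blocks, extracts, min_votes):
    """Keep the blocks that at least `min_votes` heuristics selected.

    all_blocks: every text block of the page, in document order.
    extracts:   one list of selected blocks per heuristic.
    """
    votes = Counter()
    for extract in extracts:
        votes.update(set(extract))  # each heuristic votes once per block
    return [block for block in all_blocks if votes[block] >= min_votes]


all_blocks = ["menu", "headline", "body", "ad", "footer"]
extracts = [
    ["headline", "body", "ad"],    # hypothetical heuristic A
    ["menu", "headline", "body"],  # hypothetical heuristic B
    ["body", "footer"],            # hypothetical heuristic C
]

majority = combine_by_vote(all_blocks, extracts, min_votes=2)
union = combine_by_vote(all_blocks, extracts, min_votes=1)
intersection = combine_by_vote(all_blocks, extracts, min_votes=3)
print(majority, union, intersection)
```

Setting `min_votes` to 1 gives the union of all extracts and setting it to the number of heuristics gives their intersection, so a single parameter trades recall against precision.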

Extracting Semantic Knowledge from Unstructured Text Using Embedded Controlled Language

2016

Nowadays, most of the data on the Web is still in the form of unstructured text. Knowledge extraction from unstructured text is highly desirable but extremely challenging due to the inherent ambiguity of natural language. In this article, we present an architecture of an information extraction system based on the concept of Embedded Controlled Language that allows for extracting formal semantic knowledge from an unstructured text corpus. Moreover, the presented approach has the potential to support multilingual input and output.

Information retrieval; Concept search; Noisy text analytics; business.industry; Computer science; Text simplification; 010401 analytical chemistry; Text graph; 02 engineering and technology; computer.software_genre; 01 natural sciences; language.human_language; 0104 chemical sciences; Information extraction; Controlled natural language; Knowledge extraction; Explicit semantic analysis; 0202 electrical engineering electronic engineering information engineering; language; 020201 artificial intelligence & image processing; Artificial intelligence; business; computer; Natural language processing

2016 IEEE Tenth International Conference on Semantic Computing (ICSC)
researchProduct

Machine Learning and Knowledge Discovery in Databases. Research Track

2021

Information retrieval; Knowledge extraction; Computer science; Track (disk drive)
researchProduct

FrameNet CNL: A Knowledge Representation and Information Extraction Language

2014

The paper presents a FrameNet-based information extraction and knowledge representation framework, called FrameNet-CNL. The framework is used on natural language documents and represents the extracted knowledge in a tailor-made Frame-ontology from which unambiguous FrameNet-CNL paraphrase text can be generated automatically in multiple languages. This approach brings together the fields of information extraction and CNL, because a source text can be considered as belonging to FrameNet-CNL if an information extraction parser produces the correct knowledge representation as a result. We describe a state-of-the-art information extraction parser used by a national news agency and speculate that Fram…

Information retrieval; Parsing; Knowledge representation and reasoning; business.industry; Computer science; Agency (philosophy); computer.software_genre; Paraphrase; Information extraction; Artificial intelligence; Source text; FrameNet; business; computer; Natural language processing; Natural language
researchProduct

An interactive evolutionary approach for content based image retrieval

2009

Content Based Image Retrieval (CBIR) systems aim to provide a means to find pictures in large repositories without using any information other than their contents, usually represented as low-level descriptors. Since these descriptors do not exactly match the high level semantics of the image, assessing perceptual similarity between two pictures using only their feature vectors is not a trivial task. In fact, the ability of a system to induce high level semantic concepts from the feature vector of an image is one of the aspects which most influences its performance. This paper describes a CBIR algorithm which combines relevance feedback, evolutionary computation concepts and ad-hoc strategies in an attem…

Information retrieval; business.industry; Computer science; Feature vector; Feature extraction; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Relevance feedback; Pattern recognition; Content-based image retrieval; Semantics; Evolutionary computation; Histogram; Visual Word; Artificial intelligence; business; Image retrieval

2009 IEEE International Conference on Systems, Man and Cybernetics
researchProduct

Estimating web site readability using content extraction

2009

Nowadays, information is primarily searched for on the WWW. From a user perspective, readability is an important criterion for measuring the accessibility and thereby the quality of information. We show that modern content extraction algorithms help to estimate the readability of a web document quite accurately.

Information retrieval; business.industry; Computer science; media_common.quotation_subject; Content extraction; Quality (business); Usability; business; Readability; media_common; Web site

Proceedings of the 18th international conference on World wide web
researchProduct

On the Ion-Pair Recognition and Indication Features of a Fluorescent Heteroditopic Host Based on a BODIPY Core

2014

A fluorescent heteroditopic host for ion pairs and zwitterionic species has been synthesized. Its affinity towards a series of anions, cations and ion pairs in acetonitrile has been assessed, and the spectroscopic response has been evaluated. Solid–liquid extraction experiments of inorganic salts, α-amino acids and γ-aminobutyric acid (GABA) into acetonitrile solutions were performed, and the resulting complexes were analyzed by UV/Vis absorption, fluorescence and 1H NMR spectroscopy. The discrimination patterns observed have been rationalized in terms of the molecular topologies of the host and guests.

Inorganic salts; chemistry.chemical_compound; 1h nmr spectroscopy; chemistry; Organic Chemistry; Extraction (chemistry); Physical and Theoretical Chemistry; Ion pairs; BODIPY; Absorption (chemistry); Photochemistry; Acetonitrile; Fluorescence

European Journal of Organic Chemistry
researchProduct