Search results for "Data mining"

showing 10 items of 907 documents

Growing Hierarchical Self-organizing Maps and Statistical Distribution Models for Online Detection of Web Attacks

2013

In modern networks, HTTP clients communicate with web servers using request messages. By manipulating these messages, attackers can collect confidential information from servers or even corrupt them. In this study, an approach based on anomaly detection is considered for finding such attacks. For HTTP queries, feature matrices are obtained by applying an n-gram model, and growing hierarchical self-organizing maps are constructed by learning from these matrices. For HTTP headers, we employ statistical distribution models based on the lengths of header values and the relative frequency of symbols. New requests received by the web server are classified by using the maps and models obtaine…

Self-organizing map; Web server; Computer science; Server; Header; Single-linkage clustering; Anomaly detection; Intrusion detection system; Data mining; Web service; computer.software_genre; computer
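The n-gram step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes character-level n-grams with n = 2 and raw counts as features; the abstract specifies neither, and the function names `ngram_counts` and `feature_matrix` are hypothetical. The growing hierarchical self-organizing map itself is not shown.

```python
from collections import Counter


def ngram_counts(query, n=2):
    """Count overlapping character n-grams in one query string (assumed feature)."""
    return Counter(query[i:i + n] for i in range(len(query) - n + 1))


def feature_matrix(queries, n=2):
    """Build one row of n-gram counts per query over a shared, sorted vocabulary."""
    counts = [ngram_counts(q, n) for q in queries]
    # The vocabulary is every n-gram seen in any query, in sorted order.
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[gram] for gram in vocabulary] for c in counts]
    return vocabulary, matrix


if __name__ == "__main__":
    vocab, rows = feature_matrix(["id=1", "id=1'--"])
    print(vocab)
    print(rows)
```

Each row could then serve as an input vector for whatever map or clustering model is trained on normal traffic.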

Semantic annotation and big data techniques for patent information processing

2017

This thesis analyzes approaches to generating semantic annotations on patent records, as well as on other structured data, by relying on the structure and semantic representation of documents. Information in patent records reflects how real-world technologies evolve, and the approximately 3 million new patent applications filed annually capture the global inventive frontier. The volume of this information is too large to be analyzed effectively by human effort alone, necessitating big data approaches that analyze it with computer-aided tools and techniques. Big data is a term that describes a massive volume of structured, semi-structured and unstructured data that is so large that it is d…

Semantic annotation; Patent information; big data; semanttinen annotointi (semantic annotation); annotointi (annotation); Data Mining; patentit (patents); tiedonlouhinta (data mining)

Healthcare trajectory mining by combining multidimensional component and itemsets

2012

Sequential pattern mining aims to extract correlations among temporal data. Many different methods have been proposed to enumerate either sequences of set-valued data (i.e., itemsets) or sequences containing multidimensional items. However, in real-world scenarios, data sequences are described as events comprising both multidimensional items and set-valued information. These rich heterogeneous descriptions cannot be exploited by traditional approaches. For example, in the healthcare domain, hospitalizations are defined as sequences of multidimensional attributes (e.g. Hospital or Diagnosis) associated with two sets: a set of medical procedures (e.g. {Radiography, Appendectomy}) and…

Sequential Patterns; Computer science; DONNEE MEDICALE (medical data); 02 engineering and technology; Reuse; computer.software_genre; Synthetic data; Domain (software engineering); DATA MINING; Set (abstract data type); Multi-dimensional Sequential Patterns; 020204 information systems; Component (UML); SANTE (health); 0202 electrical engineering electronic engineering information engineering; Point (geometry); SEQUENTIAL PATTERN; MULTI DIMENSIONAL SEQUENTIAL PATTERN; ANALYSE DE DONNEES (data analysis); [INFO.INFO-DB]Computer Science [cs]/Databases [cs.DB]; BASE DE DONNEES (database); Temporal database; INFORMATIQUE (computer science); Scalability; TRAJECTOIRE (trajectory); [SDE]Environmental Sciences; 020201 artificial intelligence & image processing; Data mining; FOUILLE (mining); computer

Occlusion-based estimation of independent multinomial random variables using occurrence and sequential information

2017

This paper deals with the relatively new field of sequence-based estimation in which the goal is to estimate the parameters of a distribution by utilizing both the information in the observations and in their sequence of appearance. Traditionally, the Maximum Likelihood (ML) and Bayesian estimation paradigms work within the model that the data, from which the parameters are to be estimated, is known, and that it is treated as a set rather than as a sequence. The position that we take is that these methods ignore, and thus discard, valuable sequence-based information, and our intention is to obtain ML estimates by “extracting” the information contained in the observations when perc…

Sequential estimation; Bayes estimator; Sequence; Computer science; Maximum likelihood; 02 engineering and technology; computer.software_genre; 01 natural sciences; Binomial distribution; Cardinality; Artificial Intelligence; Control and Systems Engineering; 0103 physical sciences; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Multinomial distribution; Data mining; Electrical and Electronic Engineering; 010306 general physics; Algorithm; Random variable; computer; Engineering Applications of Artificial Intelligence

A weighted logistic regression for conjoint analysis and Kansei engineering

2007

Customer needs for emotional satisfaction are increasingly being considered by product and service designers. While several existing methods such as conjoint analysis (CA), Kano model and quality function deployment support the translation of customer requirements into technical specifications, researchers are now working to develop methods aimed at integrating affective aspects into product design. Kansei engineering (KE) is a design philosophy that considers customer perceptions and emotions by adopting a multi-disciplinary approach. CA is a useful tool within a KE project. This article presents a methodology for conducting a KE project in early development phases. This methodology is bas…

Service (business); Engineering; ATTRIBUTE IMPORTANCE; Philosophy of design; Product design; Operations research; Settore SECS-S/02 - Statistica Per La Ricerca Sperimentale E Tecnologica; business.industry; MODELS; Management Science and Operations Research; OF-FIT TESTS; computer.software_genre; INTERIOR; Conjoint analysis; DESIGN; Kano model; HALO; Data mining; Ordered logit; Kansei engineering; Safety Risk Reliability and Quality; business; computer; ERROR; Quality function deployment; Quality and Reliability Engineering International

An Approach to Cadastre Map Quality Evaluation

2008

An approach to data quality evaluation is proposed, which has been elaborated and implemented by the State Land Service of the Republic of Latvia. The approach is based on the opinion of Land Service experts about Cadastre map quality, which depends on its usage points. Quality parameters of Cadastre map objects identified by experts and their limit values are used for evaluation. An assessment matrix is used, which allows Cadastre map quality to be defined depending on its usage purpose. The matrix is used to find out what quality a Cadastre map should have in order to be used for the chosen purpose. The given approach is flexible: it makes it possible to change sets of quality parameters and their limit…

Service (systems architecture); Computer science; Cadastre; media_common.quotation_subject; Data quality; Quality (business); Limit (mathematics); Data mining; Map quality; computer.software_genre; computer; media_common

GPCALMA: A Grid-based tool for mammographic screening

2005

The next generation of High Energy Physics (HEP) experiments requires a GRID approach to a distributed computing system and the associated data management: the key concept is the Virtual Organisation (VO), a group of distributed users with a common goal and the will to share their resources. A similar approach is being applied to a group of hospitals that joined the GPCALMA project (Grid Platform for Computer Assisted Library for MAmmography), which will allow common screening programs for early diagnosis of breast and, in the future, lung cancer. HEP techniques come into play in writing the application code, which makes use of neural networks for the image analysis and proved to be useful…

Service (systems architecture); Internationality; Databases Factual; Medical Records Systems Computerized; Teleradiology; Virtual organization; Computer science; Grid; Mammogram; Screening; Virtual organization; FOS: Physical sciences; Breast Neoplasms; Health Informatics; Teleradiology; computer.software_genre; grid; Set (abstract data type); User-Computer Interface; Health Information Management; mammogram; Humans; Diagnosis Computer-Assisted; Program Development; Advanced and Specialized Nursing; Internet; business.industry; screening; Grid; Physics - Medical Physics; Europe; Systems Integration; Radiology Information Systems; Italy; Key (cryptography); Database Management Systems; System integration; Female; The Internet; Medical Physics (physics.med-ph); Data mining; virtual organization; business; computer; Algorithms; Mammography

An Approach to Cadastral Map Quality Evaluation in the Republic of Latvia

2009

An approach to cadastral map quality evaluation is proposed, which has been elaborated and implemented by the State Land Service of the Republic of Latvia. The approach is based on the opinion of Land Service experts about cadastral map quality, which depends on its usage points. Quality parameters of cadastral map objects identified by experts and their limit values are used for evaluation. An assessment matrix is used, which allows cadastral map quality to be defined depending on its usage purpose. The matrix is used to find out what quality a cadastral map should have in order to be used for the chosen purpose. The given approach is flexible: it makes it possible to change sets of quality parameters an…

Service (systems architecture); business.industry; Cadastre; media_common.quotation_subject; Environmental resource management; computer.software_genre; The Republic; Geography; Data quality; Quality (business); Limit (mathematics); Data mining; business; computer; media_common; Data Quality and High-Dimensional Data Analysis

Datamining: Pemanfaatan Algoritma Apriori dalam Menganalisa Pola-Pola Transaksi yang Terjadi [Data mining: Using the Apriori Algorithm to Analyze the Transaction Patterns That Occur]

2012

This paper describes the implementation and analysis of the well-known Apriori algorithm, which is applied to Market Basket Analysis (MBA) in data mining. This algorithm is widely used to find relations among market-basket items in very large databases. This algorithm is based on the concept of a prefix tree. There are several ways to organize the nodes of such a tree, to encode the items, and to organize the transactions, which may be used to minimize the time needed to find the frequent itemsets as well as to reduce the amount of memory needed to store the counters. The rules produced will be used by supermarket management to organize the item sets to increase the…

Set (abstract data type); Apriori algorithm; Tree (data structure); Relation (database); Order (exchange); Computer science; Market basket; Trie; InformationSystems_DATABASEMANAGEMENT; Affinity analysis; Data mining; computer.software_genre; computer; Jurnal Natural
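As a rough illustration of the frequent-itemset search named in the abstract above, here is a minimal Apriori sketch. It uses plain Python sets rather than the prefix tree the paper describes, so it shows only the level-wise join and prune logic, not the paper's data structure; the function name `apriori`, the absolute `min_support` threshold, and the sample baskets are assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for itemsets found in at least min_support transactions."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a, b in combinations(list(frequent), 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent
                             for s in combinations(c, k - 1))}
        # Count the surviving candidates against the transactions.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: n for s, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
    return result


if __name__ == "__main__":
    baskets = [{"bread", "milk"}, {"bread", "butter"},
               {"bread", "milk", "butter"}, {"milk"}]
    for itemset, count in apriori(baskets, min_support=2).items():
        print(sorted(itemset), count)
```

Association rules would then be derived from the returned itemsets by comparing the support of an itemset with that of its subsets.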

Condition Assessment of Norwegian Bridge Elements Using Existing Damage Records

2020

The Norwegian Public Roads Administration (NPRA) has recorded bridge element damage in a database for all the bridges it manages since the 1990s. This paper presents a comparison of three methods to establish element condition based on damage records. The methods are a non-parametric procedure based on the worst damage registered in the element, linear regression that also considers bridge and road characteristics data, and classification through an artificial neural network. The methods are assessed using a set of 159 bridges inspected in 2016. The results show that diagnostics of bridge element condition can reach high accuracy by using an artificial neural network classifier and taki…

Set (abstract data type); Artificial neural network; Computer science; language; Norwegian; Data mining; computer.software_genre; Condition assessment; Bridge (interpersonal); computer; language.human_language; Artificial neural network classifier