Search results for "mining"

showing 10 items of 1730 documents

Effectiveness of local feature selection in ensemble learning for prediction of antimicrobial resistance

2008

In the real world concepts are often not stable but change over time. A typical example of this in the biomedical context is antibiotic resistance, where pathogen sensitivity may change over time as pathogen strains develop resistance to antibiotics that were previously effective. This problem, known as concept drift (CD), complicates the task of learning a robust model. Different ensemble learning (EL) approaches (that instead of learning a single classifier try to learn and maintain a set of classifiers over time) have been shown to perform reasonably well in the presence of concept drift. In this paper we study how much local feature selection (FS) can improve ensemble performance for da…

Change over time; Concept drift; business.industry; Computer science; media_common.quotation_subject; System testing; Feature selection; Machine learning; computer.software_genre; Ensemble learning; Statistical classification; Voting; Artificial intelligence; Data mining; business; computer; Classifier (UML); media_common
researchProduct

Modeling Multi-label Recurrence in Data Streams

2019

Most of the existing data stream algorithms assume a single label as the target variable. However, in many applications, each observation is assigned several labels with latent dependencies among them, whose target function may change over time. Classification of such non-stationary multi-label streaming data with the consideration of dependencies among labels and potential drifts is a challenging task. The few existing studies mostly cope with drifts implicitly, and all learn models on the original label space, which requires a lot of time and memory. None of them consider recurrent drifts in multi-label streams and particularly drifts and recurrences visible in a latent label spa…

Change over time; Multi-label classification; Data stream; business.industry; Computer science; Data stream mining; Space dimension; Pattern recognition; ComputingMethodologies_PATTERNRECOGNITION; Streaming data; Artificial intelligence; business; Classifier (UML); Decoding methods; 2019 IEEE International Conference on Big Knowledge (ICBK)
researchProduct

Anomaly detection in dynamic systems using weak estimators

2011

Accepted version of an article from the journal ACM Transactions on Internet Technology. Published version available from the ACM: http://dx.doi.org/10.1145/1993083.1993086 Anomaly detection involves identifying observations that deviate from the normal behavior of a system. One of the ways to achieve this is by identifying the phenomena that characterize “normal” observations. Subsequently, based on the characteristics of data learned from the “normal” observations, new observations are classified as being either “normal” or not. Most state-of-the-art approaches, especially those which belong to the family of parameterized statistical schemes, work under the assumption that the underlying…

Change over time; VDP::Mathematics and natural science: 400::Mathematics: 410::Applied mathematics: 413; education.field_of_study; Computer Networks and Communications; Computer science; Population; Estimator; Parameterized complexity; VDP::Technology: 500::Information and communication technology: 550; Network monitoring; computer.software_genre; Outlier; Anomaly detection; Data mining; education; computer
researchProduct

World Influence of Infectious Diseases from Wikipedia Network Analysis

2019

We consider the network of 5 416 537 articles of English Wikipedia extracted in 2017. Using the recent reduced Google matrix (REGOMAX) method we construct the reduced network of 230 articles (nodes) of infectious diseases and 195 articles of world countries. This method generates the reduced directed network between all 425 nodes taking into account all direct and indirect links with pathways via the huge global network. PageRank and CheiRank algorithms are used to determine the most influential diseases with the top PageRank diseases being Tuberculosis, HIV/AIDS and Malaria. From the reduced Google matrix we determine the sensitivity of world countries to specific diseases integrat…

CheiRank; Computer science; Human immunodeficiency virus (HIV); medicine.disease_cause; 01 natural sciences; [INFO.INFO-SI]Computer Science [cs]/Social and Information Networks [cs.SI]; law.invention; 03 medical and health sciences; PageRank; law; 0103 physical sciences; Global network; medicine; 010306 general physics; 030304 developmental biology; 0303 health sciences; Information retrieval; Google matrix; Markov processes; [PHYS.PHYS.PHYS-SOC-PH]Physics [physics]/Physics [physics]/Physics and Society [physics.soc-ph]; complex networks; data mining; [SDV.BIBS]Life Sciences [q-bio]/Quantitative Methods [q-bio.QM]; ranking (statistics); 3. Good health; Infectious diseases; lcsh:Electrical engineering. Electronics. Nuclear engineering; lcsh:TK1-9971; Network analysis; Wikipedia
researchProduct

Improving the QM/MM Description of Chemical Processes: A Dual Level Strategy To Explore the Potential Energy Surface in Very Large Systems.

2005

Potential energy surfaces are fundamental tools for the analysis of reaction mechanisms. The accuracy of these surfaces for reactions in very large systems is often limited by the size of the system, even if hybrid quantum mechanics/molecular mechanics (QM/MM) strategies are employed. The large number of degrees of freedom of the system requires hundreds or even thousands of optimization steps to reach convergence. Reactions in condensed media (such as enzymes or solutions) are therefore usually analyzed using low-level quantum mechanical methods, which introduces a source of error in the description of the QM region. In this paper, an alternative method is proposed, coupled to t…

Chemical process; Computer science; Degrees of freedom (physics and chemistry); computer.software_genre; Topology; Potential energy; Computer Science Applications; QM/MM; Convergence (routing); Potential energy surface; Data mining; Physical and Theoretical Chemistry; computer; Quantum; Energy (signal processing); Journal of chemical theory and computation
researchProduct

Type B Aortic Dissection Diagnosed by Left-Sided Transthoracic Ultrasonography in a Woman With Preeclampsia

2017

Chest Pain; medicine.medical_specialty; 030204 cardiovascular system & hematology; Left sided; Preeclampsia; Young Adult; 03 medical and health sciences; Text mining; 0302 clinical medicine; Pre-Eclampsia; 030202 anesthesiology; Pregnancy; medicine; Humans; Ultrasonography; Aortic Aneurysm Thoracic; business.industry; Type B aortic dissection; General Medicine; medicine.disease; Abdominal Pain; Surgery; Aortic Dissection; Treatment Outcome; Anesthesiology and Pain Medicine; 030228 respiratory system; Female; Radiology; Ultrasonography; business; 030217 neurology & neurosurgery; Anesthesia & Analgesia
researchProduct

Catastrophic effects of sand mining on macroinvertebrates in a large shallow lake with implications for management

2019

Sand mining is a human activity that is increasing in inland waters and has profound effects on entire aquatic ecosystems. However, current knowledge of the effects of sand mining on freshwater lake ecosystems remains limited, especially for biotic communities. Here, we investigated the responses of macroinvertebrates to indiscriminate sand mining in a large shallow lake of China. Our results indicated that sand mining significantly increased the content of suspended particulate matter, total nitrogen, total phosphorus and chlorophyll a in the water column both in the sand mining area and the area adjacent to the dredging activities. While there was significantly lower total nitrogen and th…

China; Environmental Engineering; 010504 meteorology & atmospheric sciences; sand dredging; 010501 environmental sciences; macroinvertebrate; 01 natural sciences; Mining; Sphaerium; Dredging; Water column; parasitic diseases; Environmental Chemistry; Animals; Waste Management and Disposal; Ecosystem; 0105 earth and related environmental sciences; Sand mining; Biomass (ecology); biology; kaivostoiminta; Ecology; Aquatic ecosystem; Lake ecosystem; vesiekosysteemit; ympäristönsuojelu; Biodiversity; selkärangattomat; biology.organism_classification; Pollution; Invertebrates; biological traits; biodiversiteetti; ekosysteemit (ekologia); Lakes; Benthic zone; biomonitoring; Environmental science; Environmental Monitoring
researchProduct

A computer program suitable for analysis of choice of categories in biomedical data recognition problems.

1980

The optimum choice of categories in problems of medical data recognition is governed by the choice of categories, the selection of appropriate features, and by the choice of a loss function. Under these circumstances it is often difficult to find a suitable classification scheme. The computer program described here serves for the design of the optimum recognition procedure. Bayes' rule is used as the decision rule. A criterion for comparing different choices of categories is given. The program can be run after the underlying prior probabilities and the conditional densities have been estimated from a training set, and before the decision rule is tested with real data.

Choice set; Computer program; Computer science; business.industry; Computers; Decision theory; Medicine (miscellaneous); Decision rule; Function (mathematics); Machine learning; computer.software_genre; Classification; Bayes' theorem; Decision Theory; Biomedical data; Research Design; Data mining; Artificial intelligence; business; computer; Selection (genetic algorithm); Computer programs in biomedicine
researchProduct

Incremental linear model trees on massive datasets

2013

The existence of massive datasets raises the need for algorithms that make efficient use of resources like memory and computation time. Besides well-known approaches such as sampling, online algorithms are being recognized as good alternatives, as they often process datasets faster using much less memory. The important class of algorithms learning linear model trees online (incremental linear model trees or ILMTs in the following) offers interesting options for regression tasks in this sense. However, surprisingly little is known about their performance, as there exists no large-scale evaluation on massive stationary datasets under equal conditions. Therefore, this paper shows their applica…

Class (computer programming); Computer science; Process (engineering); business.industry; Computation; Linear model; Sampling (statistics); computer.software_genre; Machine learning; KISS principle; Data mining; Artificial intelligence; Online algorithm; business; computer; Proceedings of the 28th Annual ACM Symposium on Applied Computing
researchProduct

A Methodology to Detect Temporal Regularities in User Behavior for Anomaly Detection

2001

Network security, and intrusion detection in particular, represents an area of increased interest in the security community over the last several years. However, the majority of work in this area has concentrated on the implementation of misuse detection systems for monitoring intrusion patterns in network traffic. In anomaly detection, classification has mainly been based on statistical or sequential analysis of data, often neglecting temporal event information as well as the existing relations between events. In this paper we consider an anomaly detection problem as one of classification of user behavior in terms of incoming multiple discrete sequences. We present an approach that allows creating and mai…

Class (computer programming); User profile; Network security; business.industry; Anomaly-based intrusion detection system; Computer science; Intrusion detection system; computer.software_genre; Misuse detection; Data analysis; Anomaly detection; Data mining; business; computer
researchProduct