Search results for "Data mining"

showing 10 items of 907 documents

Modeling a non-stationary bots’ arrival process at an e-commerce Web site

2017

The paper concerns the issue of modeling and generating a representative Web workload for Web server performance evaluation through simulation experiments. Web traffic analysis has been carried out for two decades, usually based on Web server log data. However, while the character of the overall Web traffic has been extensively studied and modeled, relatively few studies have been devoted to the analysis of Web traffic generated by Internet robots (Web bots). Moreover, the overwhelming majority of studies concern the traffic on non-e-commerce websites. In this paper we address the problem of modeling a realistic arrival process of bots’ requests on an e-commerce Web server. Based on real…

Web server; General Computer Science; Computer science; Internet robot; Real-time computing; 02 engineering and technology; E-commerce; computer.software_genre; Session (web analytics); Theoretical Computer Science; Web traffic characterization; Web traffic; 0202 electrical engineering electronic engineering information engineering; Traffic generation model; Web traffic analysis and modeling; business.industry; ComputerSystemsOrganization_COMPUTER-COMMUNICATIONNETWORKS; 020206 networking & telecommunications; Web bot; Heavy-tailed distribution; Modeling and Simulation; 020201 artificial intelligence & image processing; The Internet; Web log analysis software; Log file analysis; Data mining; business; Regression analysis; computer
Journal of Computational Science

Feature selection: A multi-objective stochastic optimization approach

2020

The feature subset task can be cast as a multiobjective discrete optimization problem. In this work, we study the search algorithm component of a feature subset selection method. We propose an algorithm based on the threshold accepting method, extended to the multi-objective framework by an appropriate definition of the acceptance rule. The method is used in the task of identifying relevant subsets of features in a Web bot recognition problem, where automated software agents on the Web are identified by analyzing the stream of HTTP requests to a Web server.

Web server; Linear programming; threshold accepting; Computer science; Feature extraction; Feature selection; stochastic optimization; computer.software_genre; Multi-objective optimization; multiobjective optimization; subset selection; Feature (computer vision); Search algorithm; Data mining; computer

Using association rules to assess purchase probability in online stores

2016

The paper addresses the problem of e-customer behavior characterization based on Web server log data. We describe user sessions with a number of session features and aim to identify the features indicating a high probability of making a purchase for two customer groups: traditional customers and innovative customers. We discuss our approach aimed at assessing a purchase probability in a user session depending on categories of viewed products and session features. We apply association rule mining to real online bookstore data. The results show differences in factors indicating a high purchase probability in a session for both customer types. The discovered association rules allow us to formu…

Web usage mining; Web server; click-stream analysis; e-Commerce; Association rule learning; Computer science; business.industry; log file analysis; data mining; 02 engineering and technology; computer.software_genre; Session (web analytics); association rules; World Wide Web; Web mining; 020204 information systems; Log data; Click stream analysis; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; business; computer; Information Systems
Information Systems and e-Business Management

High precision mass measurements for wine metabolomics

2014

An overview of the critical steps for the non-targeted Ultra-High Performance Liquid Chromatography coupled with Quadrupole Time-of-Flight Mass Spectrometry (UPLC-Q-ToF-MS) analysis of wine chemistry is given, ranging from the study design, data preprocessing and statistical analyses, to marker identification. UPLC-Q-ToF-MS data were enhanced by the alignment of exact mass data from FTICR-MS, and marker peaks were identified using UPLC-Q-ToF-MS(2). In combination with multivariate statistical tools and the annotation of peaks with metabolites from relevant databases, this analytical process provides a fine description of the chemical complexity of wines, as exemplified in the case of red (P…

Wine; multivariate data analysis; FTICR-MS; UPLC-Q-ToF-MS; General Chemistry; computer.software_genre; Mass spectrometry; Mass; lcsh:Chemistry; Metabolomics; lcsh:QD1-999; non-targeted metabolomics; Non targeted metabolomics; Statistical analyses; MS/MS; Wine chemistry; Data mining; Original Research Article; Multivariate statistical; Biological system; computer; Mathematics; Nutrition
Frontiers in Chemistry

Keynote Paper: Data Mining Researcher, Who is Your Customer? Some Issues Inspired by the Information Systems Field

2006

Data mining as an applied research field still raises great expectations among organizations that want to increase the utility they get from their huge databases and data warehouses. There are too few success stories about organizations that have managed to satisfy even some of those expectations. This situation is very similar to the one inside the information systems (IS) field, especially earlier but even currently. The recent lively debate about the identity of the IS discipline also included an analysis of the customers of IS research. Inspired by IS researchers' insights related to the topic, we ask the question "who is our customer?" as data mining researchers. With…

Work (electrical); Computer science; Information system; Identity (social science); Applied research; Data mining; computer.software_genre; Data science; computer; Data warehouse; Field (computer science)
17th International Conference on Database and Expert Systems Applications (DEXA'06)

Connections Between Topology and Macroscopic Mechanical Properties of Three-Dimensional Open-Pore Materials

2018

This work addresses a number of fundamental questions regarding the topological description of materials characterized by a highly porous three-dimensional structure with bending as the major deformation mechanism. Highly efficient finite-element beam models were used for generating data on the mechanical behavior of structures with different topologies, ranging from highly coordinated bcc to Gibson–Ashby structures. Random cutting enabled a continuous modification of average coordination numbers ranging from the maximum connectivity to the percolation-cluster transition of the 3D network. The computed macroscopic mechanical properties–Young's modulus, yield strength, and Poisson's ratio–co…

Work (thermodynamics); Materials science; topology; Materials Science (miscellaneous); Coordination number; Modulus; 02 engineering and technology; Bending; Poisson distribution; structure–property relationship; 01 natural sciences; lcsh:Technology; symbols.namesake; Approximation error; 0103 physical sciences; Technik [600]; Topology (chemistry); ddc:620.11; 010302 applied physics; lcsh:T; 600; data mining; 021001 nanoscience & nanotechnology; elastic-plastic deformation behavior; machine learning; open-pore materials; symbols; 0210 nano-technology; Reduction (mathematics); ddc:600
Frontiers in Materials

Tracing Potential School Shooters in the Digital Sphere

2010

There are over 300 known school shooting cases in the world and over ten known cases where the perpetrator(s) were prevented from carrying out the attack at the last moment or earlier. Interesting from our point of view is that in many cases the perpetrators expressed their views in social media or on their web page well in advance, and often also left suicide messages in blogs and other forums before their attack, along with the planned date and place. This has become more common towards the end of this decade. In some cases this has made it possible to prevent the attack. In this paper we will look at the possibilities to find commonalities of the perpetrators, beyond the fact that they…

World Wide Web; Social network; Point (typography); business.industry; Computer science; Order (business); Internet privacy; Web page; Social media; Tracing; Multimedia data mining; business

Overlapping community detection versus ground-truth in AMAZON co-purchasing network

2015

Objective evaluation of community detection algorithms is a strategic issue. Indeed, we need to verify that the communities identified are actually the correct ones. Moreover, it is necessary to compare results between two distinct algorithms to determine which is most effective. Classically, validations rely on clustering comparison measures or on quality metrics. Although various traditional performance measures are used extensively, it appears very clearly that they cannot distinguish community structures with different topological properties. It is therefore necessary to propose an alternative methodology more sensitive to the community structure variations in orde…

[INFO] Computer Science [cs]; Computer science; media_common.quotation_subject; 02 engineering and technology; computer.software_genre; Machine learning; 01 natural sciences; Clique percolation method; 010104 statistics & probability; [SPI] Engineering Sciences [physics]; 0202 electrical engineering electronic engineering information engineering; Quality (business); 0101 mathematics; Cluster analysis; network analysis; media_common; Ground truth; overlapping community networks; business.industry; Community structure; Purchasing; [SPI.TRON] Engineering Sciences [physics]/Electronics; detection algorithms; 020201 artificial intelligence & image processing; Algorithm design; Artificial intelligence; Data mining; business; computer

CLEARMiner: a new algorithm for mining association patterns on heterogeneous time series from climate data

2010

Recently, improvements in sensor technology have contributed to an increase in spatial data acquisition. The use of remote sensing in many countries and states, where agricultural business is a large part of their gross income, can provide a valuable source to improve their economy. The combination of climate and remote sensing data can reveal useful information, which can help researchers to monitor and estimate the production of agricultural crops. Data mining techniques are the main tools to analyze and extract relationships and patterns. In this context, this paper presents a new algorithm for mining association patterns in geo-referenced databases of climate and satel…

[INFO.INFO-IR] Computer Science [cs]/Information Retrieval [cs.IR]; Association rule learning; [INFO.INFO-WB] Computer Science [cs]/Web; Computer science; Association (object-oriented programming); [SCCO.COMP] Cognitive science/Computer science; Context (language use); computer.software_genre; NOAA-AVHRR images; Image-based Information Systems; association rules; [INFO.INFO-DB] Computer Science [cs]/Databases [cs.DB]; Spatial analysis; Agricultural crops; [INFO.INFO-MM] Computer Science [cs]/Multimedia [cs.MM]; Series (mathematics); Remote sensing (archaeology); Data mining; Vegetation Index; Algorithm; computer

User profile matching in social networks

2010

Inter-social-network operations and functionalities are required in several scenarios (data integration, data enrichment, information retrieval, etc.). To achieve this, matching user profiles is required. Current methods are too restrictive and do not consider all the related problems. In particular, they assume that two profiles describe the same physical person only if the values of their Inverse Functional Property or IFP (e.g. the email address, homepage, etc.) are the same. However, the observed trend in social networks is not fully compatible with this assumption since users tend to create more than one social network account (for personal use, for work, etc.) w…

[INFO.INFO-IR] Computer Science [cs]/Information Retrieval [cs.IR]; Matching (statistics); Computer science; [SCCO.COMP] Cognitive science/Computer science; 02 engineering and technology; Similarity measure; computer.software_genre; Electronic mail; 020204 information systems; FOAF; 0202 electrical engineering electronic engineering information engineering; Pattern matching; User profile; Social network; business.industry; computer.file_format; Profile Matching; Social Networks; 020201 artificial intelligence & image processing; Data mining; business; computer; Data integration