Search results for "Web"
Showing 10 of 2018 documents
Extended Poster Abstract: Open Source Solution for Massive Map Sheet Georeferencing Tasks for Digital Archiving
2010
Scanned maps need to be georeferenced to be useful in a GIS environment for data extraction (vectorization), web publishing, or spatially-aware archiving. Widely used software solutions with georeferencing functionality are designed to suit a universal scenario covering many different kinds of data sources. This general-purpose design, however, makes them very time-consuming when georeferencing a large number of map sheets to a known grid. This work presents an alternative scenario for georeferencing large numbers of map sheets in a time-efficient manner and implements the approach as the MapSheetAutoGeoRef plug-in for the freely available open-source Quantum GIS [1].
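The time saving described above comes from the fact that, for sheets in a known regular grid, the georeferencing parameters follow directly from the sheet's grid index. A minimal sketch of that idea, with all function and parameter names invented for illustration (this is not the plug-in's actual API):

```python
# Sketch: computing a GDAL-style geotransform for a map sheet whose
# position in a regular grid is known. Names and values are illustrative
# assumptions, not MapSheetAutoGeoRef's actual interface.

def sheet_geotransform(grid_origin_x, grid_origin_y,
                       sheet_width, sheet_height,
                       col, row, px_cols, px_rows):
    """Return (origin_x, pixel_w, 0, origin_y, 0, -pixel_h) for the
    sheet at grid cell (col, row); rows count downward from the
    grid's top-left corner."""
    origin_x = grid_origin_x + col * sheet_width
    origin_y = grid_origin_y - row * sheet_height
    return (origin_x, sheet_width / px_cols, 0.0,
            origin_y, 0.0, -sheet_height / px_rows)

# Example: 10 km x 10 km sheets scanned at 5000 x 5000 pixels.
gt = sheet_geotransform(500000, 5400000, 10000, 10000, 2, 1, 5000, 5000)
# x origin 520000, y origin 5390000, 2 m pixels
```

No control-point picking is needed per sheet; only the grid layout and scan resolution must be known once.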
Towards Graphical Query Notation for Semantic Databases
2015
We describe a notation and a tool for schema-enabled visual/diagrammatic creation of SPARQL queries over RDF databases. The notation and the tool support both the standard basic query pattern, comprising a main query class and possibly linked condition classes, and means for defining aggregate queries, including placing conditions over aggregates and aggregating aggregate results. We discuss the applicability of the tool to ad-hoc query formulation in practical use cases.
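The "aggregation of aggregate results" pattern mentioned above can be illustrated with a plain SPARQL 1.1 subquery; the prefix and property names below are invented for illustration and do not come from the paper:

```python
# Illustrative SPARQL with a nested aggregate: count papers per author,
# then average those counts per affiliation. The ontology is invented.
query = """
PREFIX : <http://example.org/>
SELECT ?affiliation (AVG(?paperCount) AS ?avgPapers)
WHERE {
  ?author :affiliation ?affiliation .
  {
    SELECT ?author (COUNT(?paper) AS ?paperCount)
    WHERE { ?paper :hasAuthor ?author }
    GROUP BY ?author
  }
}
GROUP BY ?affiliation
"""
```

The inner SELECT produces one aggregate per author; the outer SELECT then aggregates those aggregates per affiliation — the two-level structure a diagrammatic notation has to express.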
Post-search query modeling in federated web scenario
2014
In contrast to query reformulation, in which a user revises a query to specify the information need more precisely, post-search query modeling is a technique that exploits syntactic variations of a gradually extended query which, depending on other factors such as the resource, the database, or keyword alignment, facilitate the searching process. The study of modeling queries submitted to search engines that utilize different translation-semantics paradigms is motivated by real-world challenges in retrieving heterogeneous textual documents from the web. For a couple of language pairs, we develop a user-centered framework for Hidden Web traffic optimization. In li…
Mobile Search - Social Network Search Using Mobile Devices
2008
In recent years, Web search engines have progressed to the point that relevant information can be reached easily most of the time. However, very little empirical research has been carried out on Web search in highly dynamic social-network environments composed of mobile devices. The aim of this work was therefore to investigate novel approaches that take advantage of the social-network environment inherent in the mobile peer-to-peer paradigm. The work focused mainly on the development of a prototype for the mobile search concept. The prototype was built on top of the Drupal content management system. This study suggests that the methods presented can be a complement to traditional …
FluDetWeb: an interactive web-based system for the early detection of the onset of influenza epidemics
2009
Background: The early identification of influenza outbreaks has become a priority in public health practice. A large variety of statistical algorithms for the automated monitoring of influenza surveillance have been proposed, but most of them require not only considerable computational effort but also the operation of sometimes unfriendly software. Results: In this paper, we introduce FluDetWeb, an implementation of a prospective influenza surveillance methodology based on a client-server architecture with a thin (web-based) client application design. Users can enter and edit their own data, consisting of a series of weekly influenza incidence rates. The system returns the probabilit…
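Prospective monitoring of weekly incidence rates, as described above, can be sketched with a generic one-sided CUSUM chart; this is only an illustration of the monitoring setting, not FluDetWeb's actual statistical model, and the rates and thresholds are invented:

```python
# Minimal sketch of prospective outbreak detection on weekly incidence
# rates using a one-sided CUSUM chart (a generic illustration, not the
# system's algorithm).
from statistics import mean, stdev

def cusum_alarms(rates, baseline_weeks=8, k=0.5, h=4.0):
    """Flag weeks where the standardized CUSUM statistic exceeds h.
    The first `baseline_weeks` observations estimate the in-control
    mean and standard deviation."""
    base = rates[:baseline_weeks]
    mu, sigma = mean(base), stdev(base)
    s, alarms = 0.0, []
    for week, rate in enumerate(rates[baseline_weeks:], start=baseline_weeks):
        z = (rate - mu) / sigma
        s = max(0.0, s + z - k)           # accumulate upward deviations
        alarms.append((week, s > h))
    return alarms

weekly = [10, 12, 11, 9, 10, 13, 11, 10,  # invented baseline weeks
          12, 15, 22, 35, 50, 48]         # invented epidemic onset
flags = cusum_alarms(weekly)
```

Each new weekly rate updates the statistic incrementally, which matches the prospective (week-by-week) use of such a web system.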
Detection of Internet robots using a Bayesian approach
2015
A large part of the Web traffic on e-commerce sites is generated not by human users but by Internet robots: search-engine crawlers, shopping bots, hacking bots, etc. In practice, not all robots, especially the malicious ones, disclose their identity to a Web server, so there is a need to develop methods for their detection and identification. This paper proposes the application of a Bayesian approach to robot detection based on the characteristics of user sessions. The method is applied to Web traffic from a real e-commerce site. Results show that the classification model based on cluster analysis with Ward's method and the weighted Euclidean metric is very effective in robot det…
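The core Bayesian step — turning session characteristics into a posterior probability that the session belongs to a robot — can be sketched as follows. The feature names, priors, and likelihoods are invented for illustration; the paper's actual model combines cluster analysis with the Bayesian rule and is not reproduced here:

```python
# Illustrative Bernoulli naive Bayes for bot detection from binary
# session features (e.g. "requested robots.txt", "empty referrer",
# "no images loaded"). All probabilities below are assumed values.

def posterior_bot(features, p_bot=0.3, likelihoods=None):
    """P(bot | features) for a binary feature vector via Bayes' theorem."""
    if likelihoods is None:
        # (P(feature=1 | bot), P(feature=1 | human)) -- invented numbers
        likelihoods = [(0.9, 0.01), (0.8, 0.3), (0.7, 0.1)]
    num = p_bot            # unnormalized P(bot, features)
    den = 1.0 - p_bot      # unnormalized P(human, features)
    for x, (p1_bot, p1_hum) in zip(features, likelihoods):
        num *= p1_bot if x else 1.0 - p1_bot
        den *= p1_hum if x else 1.0 - p1_hum
    return num / (num + den)

# A session that requested robots.txt with an empty referrer:
p = posterior_bot([1, 1, 0])
```

Sessions whose posterior exceeds a chosen threshold would be labelled as robots.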
Simulation-Based Performance Study of e-Commerce Web Server System – Results for FIFO Scheduling
2013
This chapter concerns the evaluation of overloaded Web server performance using a simulation-based approach. We focus on a Business-to-Consumer (B2C) environment and consider server performance from the perspectives of both computer-system efficiency and e-business profitability. Results of simulation experiments for a Web server system under First-In-First-Out (FIFO) scheduling are discussed. Much attention is paid to analyzing the impact of a limited server-system capacity on business-related performance metrics.
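The interaction between FIFO scheduling and limited capacity that the chapter studies can be sketched as a single-server queue that drops arrivals once the system is full; the arrival/service rates and capacity below are arbitrary illustration values, not the chapter's experimental settings:

```python
# Sketch of an overloaded FIFO Web-server queue with limited capacity,
# simulated as an M/M/1/K system. Parameters are invented.
import random

def simulate_fifo(n_requests, arrival_rate, service_rate, capacity, seed=1):
    """Return (completed, rejected) counts for a single-server FIFO
    queue that drops arrivals when `capacity` requests are already
    waiting or in service."""
    rng = random.Random(seed)
    t = 0.0
    done_times = []            # departure times of admitted requests
    completed = rejected = 0
    for _ in range(n_requests):
        t += rng.expovariate(arrival_rate)
        # forget requests that have already left the system
        done_times = [d for d in done_times if d > t]
        if len(done_times) >= capacity:
            rejected += 1      # system full: request is dropped
            continue
        start = done_times[-1] if done_times else t   # FIFO: serve in order
        done_times.append(max(start, t) + rng.expovariate(service_rate))
        completed += 1
    return completed, rejected

completed, rejected = simulate_fifo(1000, 2.0, 1.0, 5)
```

With the arrival rate at twice the service rate, a substantial share of requests is rejected — the business-relevant loss the chapter's metrics capture.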
Analysis of Aggregated Bot and Human Traffic on E-Commerce Site
2014
A significant volume of Web traffic nowadays can be attributed to robots. Although some of them, e.g., search-engine crawlers, perform useful tasks on a website, others may be malicious and should be banned. Consequently, there is a growing need to identify bots and to characterize their behavior. This paper investigates the share of bot-generated traffic on an e-commerce site and studies the differences between bots' and humans' session-based traffic by analyzing data recorded in Web server log files. Results show that the two kinds of sessions exhibit different characteristics, including the session duration, the number of pages visited per session, the number of requests, and the volume of data transferre…
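Session-based analysis of this kind starts by grouping log records into sessions and computing per-session metrics. A minimal sketch, using an invented simplified record format (client id, timestamp, bytes) rather than the paper's actual log schema:

```python
# Sketch: grouping Web-server log records into sessions (same client,
# inter-request gaps under 30 minutes) and computing session metrics of
# the kind compared in the paper: requests, duration, bytes transferred.

SESSION_GAP = 30 * 60  # seconds; a common but assumed timeout

def sessionize(records):
    """records: iterable of (client_id, timestamp, bytes), sorted by time.
    Returns {client_id: [sessions]}, each session a dict of metrics."""
    sessions = {}
    for client, ts, nbytes in records:
        slist = sessions.setdefault(client, [])
        if slist and ts - slist[-1]["end"] <= SESSION_GAP:
            s = slist[-1]                      # continue current session
        else:
            s = {"start": ts, "end": ts, "requests": 0, "bytes": 0}
            slist.append(s)                    # gap too large: new session
        s["end"] = ts
        s["requests"] += 1
        s["bytes"] += nbytes
    return sessions

log = [("1.2.3.4", 0, 500), ("1.2.3.4", 60, 800),
       ("1.2.3.4", 4000, 300)]                 # 4000 s gap -> new session
by_client = sessionize(log)
```

Comparing the distributions of these metrics between bot-labelled and human-labelled sessions is then a straightforward aggregation step.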
Anomaly Detection from Network Logs Using Diffusion Maps
2011
The goal of this study is to detect anomalous queries from network logs using a dimensionality-reduction framework. The frequencies of 2-grams in queries are extracted into a feature matrix, and dimensionality reduction is performed by applying diffusion maps. The method is adaptive and thus needs no training before analysis. We tested the method on data that includes normal and intrusive traffic to a web server. The approach finds all intrusions in the dataset.
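The 2-gram feature-extraction step can be sketched directly; the query strings below are invented, and the diffusion-map eigendecomposition that would follow is omitted:

```python
# Sketch: mapping each query string to a vector of character-2-gram
# counts over a shared vocabulary, yielding the feature matrix on which
# dimensionality reduction (diffusion maps, omitted here) operates.
from collections import Counter

def two_grams(s):
    return [s[i:i + 2] for i in range(len(s) - 1)]

def feature_matrix(queries):
    vocab = sorted({g for q in queries for g in two_grams(q)})
    rows = []
    for q in queries:
        counts = Counter(two_grams(q))
        rows.append([counts.get(g, 0) for g in vocab])
    return vocab, rows

# Invented example queries; the last one carries injection-like syntax.
vocab, X = feature_matrix(["id=1", "id=2", "id=1;drop"])
```

Anomalous queries introduce rare 2-grams (here `;d`, `dr`, …), so their rows stand apart in the reduced space.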
HTTP-level e-commerce data based on server access logs for an online store
2020
Web server logs have been extensively used as a source of data on the characteristics of Web traffic and users’ navigational patterns. In particular, Web bot detection and online purchase prediction using methods from artificial intelligence (AI) are currently key areas of research. However, in practice it is hard to obtain logs from actual online stores, and there is no common dataset that can be used across different studies. Moreover, there is a lack of studies exploring Web traffic over longer periods of time, due to the unavailability of long-term server-log data. The need to develop reliable models of Web traffic, Web user navigation, and e-customer behaviour calls for …
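HTTP-level server access logs of the kind this dataset provides are typically distributed in Common Log Format, which is straightforward to parse; the sample line below is fabricated for illustration and is not taken from the dataset:

```python
# Sketch: parsing one Common Log Format line as found in server
# access logs. The sample line is invented.
import re

CLF = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

line = ('192.168.0.7 - - [12/Mar/2020:10:15:32 +0000] '
        '"GET /product/42 HTTP/1.1" 200 5120')
entry = CLF.match(line).groupdict()
```

Fields such as the request path and status code are the raw material for the bot-detection and purchase-prediction studies the abstract mentions.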