0000000000004035
AUTHOR
Panagiotis Bouros
LocalRec 2019 workshop report: The Third ACM SIGSPATIAL Workshop on Location-Based Recommendations, Geosocial Networks and Geoadvertising
The amount of publicly available geo-referenced data has seen a dramatic explosion over the past few years. Many user activities generate data that are annotated with location and contextual information. Furthermore, it has become easier to collect and combine rich and diverse location information. In the context of geoadvertising, the use of geosocial data for targeted marketing is receiving significant attention from a wide spectrum of companies and organizations. With the advent of smartphones and online social networks, a multi-billion dollar industry that utilizes geosocial data for advertising and marketing has emerged. Geotagged social-media posts, GPS traces, data from cellular ante…
Spatial joins
The spatial join is a popular operation in spatial database systems and its evaluation is a well-studied problem. This paper reviews research and recent trends on spatial join evaluation. The complexity of different data types, the consideration of different join predicates, the use of modern commodity hardware, and support for parallel processing open the road to a number of interesting directions for future research, some of which we outline in the paper.
Set similarity joins on mapreduce
Set similarity joins, which compute pairs of similar sets, constitute an important operator primitive in a variety of applications, including applications that must process large amounts of data. To handle these data volumes, several distributed set similarity join algorithms have been proposed. Unfortunately, little is known about the relative performance, strengths and weaknesses of these techniques. Previous comparisons are limited to a small subset of relevant algorithms, and the large differences in the various test setups make it hard to draw overall conclusions. In this paper we survey ten recent, distributed set similarity join algorithms, all based on the MapReduce paradigm. We emp…
Relative Reachability Analysis as a Tool for Urban Mobility Planning
There is a plethora of user-oriented route planning applications and systems that enable the computation of the fastest journey between two locations using different transportation modes, e.g., car, public transport, walking, bicycle. While useful for individuals, they are of limited interest to a class of users that may be interested in a more global and comparative view of transportation systems in general. In this context, we adopt the view of an urban planner. Urban planners may be interested in queries such as "if a new transit stop was to be introduced in a given location, would that bring the travel time to a given point-of-interest (POI) or area-of-interest (AOI) by bus closer to th…
Proceedings of the 16th International Symposium on Spatial and Temporal Databases
A Two-level Spatial In-Memory Index
Very large volumes of spatial data increasingly become available and demand effective management. While there has been decades of research on spatial data management, few works consider the current state of commodity hardware, having relatively large memory and the ability of parallel multi-core processing. In this work, we re-consider the design of spatial indexing under this new reality. Specifically, we propose a main-memory indexing approach for objects with spatial extent, which is based on a classic regular space partitioning into disjoint tiles. The novelty of our index is that the contents of each tile are further partitioned into four classes. This second-level partitioning not onl…
Most Diverse Near-Shortest Paths
Computing the shortest path in a road network is a fundamental problem that has attracted lots of attention. However, in many real-world scenarios, determining solely the shortest path is not enough as users want to have additional, alternative ways of reaching their destination. In this paper, we investigate a novel variant of alternative routing, termed the k-Most Diverse Near-Shortest Paths (kMDNSP). In contrast to previous work, kMDNSP aims at maximizing the diversity of the recommended paths, while bounding their length based on a user-defined constraint. Our theoretical analysis proves the NP-hardness of the problem at hand. To compute an exact solution to kMDNSP, we present an algori…
Parallel In-Memory Evaluation of Spatial Joins
The spatial join is a popular operation in spatial database systems and its evaluation is a well-studied problem. As main memories become bigger and faster and commodity hardware supports parallel processing, there is a need to revamp classic join algorithms which have been designed for I/O-bound processing. In view of this, we study the in-memory and parallel evaluation of spatial joins, by re-designing a classic partitioning-based algorithm to consider alternative approaches for space partitioning. Our study shows that, compared to a straightforward implementation of the algorithm, our tuning can improve performance significantly. We also show how to select appropriate partitioning parame…
The Fourth ACM SIGSPATIAL Workshop on Location-Based Recommendations, Geosocial Networks and Geoadvertising
The amount of publicly available geo-referenced data has seen a dramatic increase over the last years. Many user activities generate data that are annotated with location and contextual information. Moreover, it has become easier to collect and combine rich and diverse location information. In the context of geoadvertising, the use of geosocial data for targeted marketing is receiving significant attention from a wide spectrum of companies and organizations. With the advent of smartphones and online social networks, a multi-billion dollar industry that utilizes geosocial data for advertising and marketing has emerged. Geotagged social-media posts, GPS traces, data from cellular antennas and…
Finding k -dissimilar paths with minimum collective length
Shortest path computation is a fundamental problem in road networks. However, in many real-world scenarios, determining solely the shortest path is not enough. In this paper, we study the problem of finding k-Dissimilar Paths with Minimum Collective Length (kDPwML), which aims at computing a set of paths from a source s to a target t such that all paths are pairwise dissimilar by at least \theta and the sum of the path lengths is minimal. We introduce an exact algorithm for the kDPwML problem, which iterates over all possible s-t paths while employing two pruning techniques to reduce the prohibitively expensive computational cost. To achieve scalability, we also define the much smaller set …
LocalRec 2018 workshop report the second ACM SIGSPATIAL workshop on recommendations for location-based services and social networks * Seattle, Washington, USA - November 6, 2018
Driven by technological advances in hardware (positioning systems, environmental sensors), software (standards, tools, network services), and aided by various open movements (open, linked, government data) and the ever-growing mentality of sharing for the greater good (crowdsourcing, crowdfunding, collaborative and volunteered geographic information), the amount of available geo-referenced data has seen dramatic explosion over the past few years. Human activities generate data and traces that are now often transparently annotated with location and contextual information. At the same time, it has become easier than ever to collect and combine rich and diverse information about locations. Exp…
Top-k String Similarity Joins
Top-k joins have been extensively studied in relational databases as ranking operations when every object has, among others, at least one ranking attribute. However, the focus has mostly been the case when the join attributes are of primitive data types (e.g., numerical values) and the join predicate is equality. In this work, we consider string objects assigned such ranking attributes or simply scores. Given two collection of string objects and a string similarity measure (e.g., the Edit distance), we introduce the top-k string similarity join () which returns k sufficiently similar pairs of objects with respect to a similarity threshold ϵ, which have the highest combined score computed by…
A Two-layer Partitioning for Non-point Spatial Data
Non-point spatial objects (e.g., polygons, linestrings, etc.) are ubiquitous and their effective management is always timely. We study the problem of indexing non-point objects in memory. We propose a secondary partitioning technique for space-oriented partitioning indices (e.g., grids), which improves their performance significantly, by avoiding the generation and elimination of duplicate results. Our approach is novel and of a high impact, as (i) it is extremely easy to implement and (ii) it can be used by any space-partitioning index. We show how our approach can be used to boost the performance of spatial range queries. We also show how we can avoid performing the expensive refinement s…
Simulation-based Evacuation Planning for Urban Areas
Evacuation planning is a critical task in disaster management. Especially in situations such as natural disasters or terrorist attacks, large crowds need to move away from danger and reach designated safe zones. For this purpose, various approaches that efficiently compute evacuation plans in urban areas have been proposed. To evaluate the computed plans, previous works employ heuristics that can only roughly estimate the egress time of each plan. Intuitively, a much better approach is to estimate the egress time via simulation. However, designing a simulation model is usually a time-consuming task and, what is more, this model can only be used to evaluate evacuation plans for a specific ar…