6533b832fe1ef96bd129a2f8
RESEARCH PRODUCT
On utilizing an enhanced object partitioning scheme to optimize self-organizing lists-on-lists
B. John OommenB. John OommenO. Ekaba Bisongsubject
Control and OptimizationTheoretical computer scienceLearning automataComputer sciencebusiness.industryBig data02 engineering and technologyObject (computer science)Data structureHierarchical database modelField (computer science)030218 nuclear medicine & medical imagingComputer Science Applications03 medical and health sciences0302 clinical medicineControl and Systems EngineeringModeling and Simulation0202 electrical engineering electronic engineering information engineeringLocality of reference020201 artificial intelligence & image processingCluster analysisbusinessVDP::Teknologi: 500::Informasjons- og kommunikasjonsteknologi: 550description
With the advent of “Big Data” as a field, in and of itself, there are at least three fundamentally new questions that have emerged, namely the Artificially Intelligence (AI)-based algorithms required, the hardware to process the data, and the methods to store and access the data efficiently. This paper (The work of the second author was partially supported by NSERC, the Natural Sciences and Engineering Council of Canada. We are very grateful for the feedback from the anonymous Referees of the original submission. Their input significantly improved the quality of this final version.) presents some novel schemes for the last of the three areas. There have been thousands of papers written regarding the algorithms themselves, and the hardware vendors are scrambling for the market share. However, the question of how to store, manage and access data, which has been central to the field of Computer Science, is even more pertinent in these days when megabytes of data are being generated every second. This paper considers the problem of minimizing the cost of retrieval using the most fundamental data structure, i.e., a Singly-Linked List (SLL). We consider a SLL in which the elements are accessed by a Non-stationary Environment (NSE) exhibiting the so-called “Locality of Reference”. We propose a solution to the problem by designing an “Adaptive” Data Structure (ADS), which is created by utilizing a composite of hierarchical data “sub”-structures to constitute the overall data structure (In this paper, the primitive data structure is the SLL. However, these concepts can be extended to more complicated data structures.). In this paper, we design hierarchical Lists-on-Lists (LOLs) by assembling a SLL into a hierarchical scheme that results in SLLs on SLLs (SLLs-on-SLLs) comprising of an outer-list and sublist contexts. The goal is that elements that are more likely to be accessed together are grouped within the same sub-context, while the sublists themselves are moved “en masse” towards the head of the list-context to minimize the overall access cost. This move is carried out by employing the “de-facto” list re-organization schemes, i.e., the Move-To-Front (MTF) and Transposition (TR) rules. To achieve the clustering of elements within the sublists, we invoke the Object Migration Automaton (OMA) family of reinforcement schemes from the theory of Learning Automata (LA). They capture the probabilistic dependence of the elements in the data structure, receiving query accesses from the Environment. We show that SLLs-on-SLLs augmented with the Enhanced OMA (EOMA) minimizes the retrieval cost for elements in NSEs, and are superior to the stand-alone MTF and TR schemes, and also superior to the OMA-augmented SLLs-on-SLLs operating in such Environments.
year | journal | country | edition | language |
---|---|---|---|---|
2020-02-12 |