Search results for "data structures"

showing 10 items of 258 documents

Stability-Based Model Selection for High Throughput Genomic Data: An Algorithmic Paradigm

2012

Clustering is one of the most well known activities in scientific investigation and the object of research in many disciplines, ranging from Statistics to Computer Science. In this beautiful area, one of the most difficult challenges is the model selection problem, i.e., the identification of the correct number of clusters in a dataset. In the last decade, a few novel techniques for model selection, representing a sharp departure from previous ones in statistics, have been proposed and gained prominence for microarray data analysis. Among those, the stability-based methods are the most robust and best performing in terms of prediction, but the slowest in terms of time. Unfortunately…

Class (computer programming); Settore INF/01 - Informatica; business.industry; Computer science; Heuristic (computer science); Model selection; Stability (learning theory); Machine learning; computer.software_genre; Identification (information); Algorithm design; Artificial intelligence; Cluster analysis; business; Algorithms and Data Structures; Throughput (business); computer
researchProduct

The expressive power of the shuffle product

2010

There is an increasing interest in the shuffle product on formal languages, mainly because it is a standard tool for modeling process algebras. It still remains a mysterious operation on regular languages. Antonio Restivo proposed as a challenge to characterize the smallest class of languages containing the singletons and closed under Boolean operations, product and shuffle. This problem is still widely open, but we present some partial results on it. We also study some other smaller classes, including the smallest class containing the languages composed of a single word of length 2 which is closed under Boolean operations and shuffle by a letter (resp. shuffle by a l…

Class (set theory); Computer science; [INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS]; 0102 computer and information sciences; 02 engineering and technology; Star (graph theory); 01 natural sciences; Expressive power; Theoretical Computer Science; Regular language; Formal language; 0202 electrical engineering electronic engineering information engineering; Arithmetic; Algebraic number; ComputingMilieux_MISCELLANEOUS; Discrete mathematics; Computer Science Applications; shuffle operator; Computational Theory and Mathematics; 010201 computation theory & mathematics; Product (mathematics); Formal language; 020201 artificial intelligence & image processing; Boolean operations in computer-aided design; Word (computer architecture); Information Systems
researchProduct

Linear-size suffix tries

2016

Suffix trees are highly regarded data structures for text indexing and string algorithms [McCreight 76, Weiner 73]. For any given string w of length n = |w|, a suffix tree for w takes O(n) nodes and links. It is often presented as a compacted version of a suffix trie for w, where the latter is the trie (or digital search tree) built on the suffixes of w. Here the compaction process replaces each maximal chain of unary nodes with a single arc. For this, the suffix tree requires that the labels of its arcs are substrings encoded as pointers to w (or equivalent information). On the contrary, the arcs of the suffix trie are labeled by single symbols but there can be Θ(n²) nodes and lin…
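As a rough illustration of the size gap this abstract describes, the sketch below builds an uncompacted suffix trie with nested dictionaries and counts how many nodes would remain if every unary node were merged into an arc. It is not the construction from the paper: the function name `trie_and_compacted_sizes` is hypothetical, and the input is assumed to end with a unique terminator such as `$` so that every suffix ends at a leaf.

```python
def trie_and_compacted_sizes(w):
    """Return (suffix trie node count, node count after merging unary chains).

    Assumes w ends with a unique terminator (e.g. "$"), so each suffix
    ends at its own leaf. Both counts include the root.
    """
    # Build the uncompacted suffix trie: one dict per node, keyed by symbol.
    root = {}
    for i in range(len(w)):
        node = root
        for ch in w[i:]:
            node = node.setdefault(ch, {})

    # Walk the trie with an explicit stack to avoid deep recursion.
    trie_nodes = 0
    compacted_nodes = 0
    stack = [(root, True)]
    while stack:
        node, is_root = stack.pop()
        trie_nodes += 1
        # Compaction keeps the root, leaves, and branching nodes;
        # nodes with exactly one child disappear into an arc.
        if is_root or len(node) != 1:
            compacted_nodes += 1
        stack.extend((child, False) for child in node.values())
    return trie_nodes, compacted_nodes


if __name__ == "__main__":
    for k in (5, 10, 20):
        w = "a" * k + "b" * k + "$"
        print(len(w), trie_and_compacted_sizes(w))
```

On strings of the form a^k b^k followed by the terminator, the first count grows quadratically in the string length while the second stays linear, which matches the O(n) versus Θ(n²) contrast in the abstract.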

Compressed suffix array; General Computer Science; Suffix tree; [INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS]; Generalized suffix tree; 0102 computer and information sciences; 02 engineering and technology; Data_CODINGANDINFORMATIONTHEORY; Text indexing; 01 natural sciences; Y-fast trie; law.invention; Longest common substring problem; Theoretical Computer Science; Combinatorics; Suffix tree; law; Factor and suffix automata; 0202 electrical engineering electronic engineering information engineering; Data_FILES; Arithmetic; Factor and suffix automata; Pattern matching; Suffix tree; Text indexing; Theoretical Computer Science; Computer Science (all); Pattern matching; Mathematics; Settore INF/01 - Informatica; X-fast trie; Computer Science (all); LCP array; 010201 computation theory & mathematics; 020201 artificial intelligence & image processing; FM-index
researchProduct

Table Compression

2016

Data compression techniques for massive tables are described. Related methodological results are also presented.

Compression and transmission of table; Settore INF/01 - Informatica; Big Data Management; Storage; Compressive estimates of entropy; Data Compression. Algorithms. Data structures; Compression of multidimensional data
researchProduct

Fully Automatic Trunk Packing with Free Placements

2010

We present a new algorithm to compute the volume of a trunk according to the SAE J1100 standard. Our new algorithm uses state-of-the-art methods from computational geometry and from combinatorial optimization. It finds better solutions than previous approaches for small trunks.

Computational Geometry (cs.CG); FOS: Computer and information sciences; Discrete Mathematics (cs.DM); ComputerSystemsOrganization_COMPUTER-COMMUNICATIONNETWORKS; Computer Science - Data Structures and Algorithms; Computer Science - Computational Geometry; Data Structures and Algorithms (cs.DS); Computer Science - Discrete Mathematics
researchProduct

Continuous reformulations and heuristics for the Euclidean travelling salesperson problem

2008

We consider continuous reformulations of the Euclidean travelling salesperson problem (TSP), based on certain clustering problem formulations. These reformulations allow us to apply a generalisation with perturbations of the Weiszfeld algorithm in an attempt to find local approximate solutions to the Euclidean TSP.

Computational Mathematics; Mathematical optimization; Control and Optimization; Control and Systems Engineering; Problem Formulations; Euclidean geometry; Applied mathematics; Computer Science::Data Structures and Algorithms; Heuristics; Cluster analysis; Mathematics; ESAIM: Control, Optimisation and Calculus of Variations
researchProduct

Indexing a sequence for mapping reads with a single mismatch

2014

Mapping reads against a genome sequence is an interesting and useful problem in computational molecular biology and bioinformatics. In this paper, we focus on the problem of indexing a sequence for mapping reads with a single mismatch. We first focus on a simpler problem where the length of the pattern is given beforehand during the data structure construction. This version of the problem is interesting in its own right in the context of next-generation sequencing. In the sequel, we show how to solve the more general problem. In both cases, our algorithm can construct an efficient data structure in time and space and can answer subsequent queries in time. Here, n is the length of the s…

Computer science; genome sequence; General Mathematics; [INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS]; General Physics and Astronomy; Context (language use); algorithms; computer.software_genre; Pattern matching; Sequence; Search engine indexing; General Engineering; Wildcard character; Articles; computer.file_format; Construct (python library); Data structure; mapping reads; pattern matching; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; Data mining; [INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM]; Focus (optics); mismatch; computer; Algorithm; indexing; Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
researchProduct

FRIPON: a worldwide network to track incoming meteoroids

2020

Context. Until recently, camera networks designed for monitoring fireballs worldwide were not fully automated, implying that in case of a meteorite fall, the recovery campaign was rarely immediate. This was an important limiting factor as the most fragile - hence precious - meteorites must be recovered rapidly to avoid their alteration. Aims. The Fireball Recovery and InterPlanetary Observation Network (FRIPON) scientific project was designed to overcome this limitation. This network comprises a fully automated camera and radio network deployed over a significant fraction of western Europe and a small fraction of Canada. As of today, it consists of 150 cameras and 25 European radio receiver…

DYNAMICS; [INFO.INFO-AR]Computer Science [cs]/Hardware Architecture [cs.AR]; Meteors; Computer science; Radio receiver; [INFO.INFO-DM]Computer Science [cs]/Discrete Mathematics [cs.DM]; Surveys; 010502 geochemistry & geophysics; Track (rail transport); 01 natural sciences; Meteorites meteors meteoroids; law.invention; Planets and planetary system; [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing; Methods: observational; law; [INFO.INFO-RB]Computer Science [cs]/Robotics [cs.RO]; meteoroids; 010303 astronomy & astrophysics; ComputingMilieux_MISCELLANEOUS; Observational methods; Earth and Planetary Astrophysics (astro-ph.EP); meteoroids -surveys -methods: observational -interplanetary medium; [SDU.ASTR]Sciences of the Universe [physics]/Astrophysics [astro-ph]; ORIGIN; [INFO.INFO-AO]Computer Science [cs]/Computer Arithmetic; meteorites meteors meteoroids – surveys – methods: observational – interplanetary medium; Meteoroids; RECOVERY; ORBIT; Meteorite; Fully automated; Interplanetary medium; Meteorites meteors meteoroids; Methods: observational; Surveys; [INFO.INFO-TI]Computer Science [cs]/Image Processing [eess.IV]; [INFO.INFO-DC]Computer Science [cs]/Distributed Parallel and Cluster Computing [cs.DC]; Astrophysics - Instrumentation and Methods for Astrophysics; [SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing; FLUX; Real-time computing; fripon; [INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS]; FOS: Physical sciences; Context (language use); CAMERA; [INFO.INFO-SE]Computer Science [cs]/Software Engineering [cs.SE]; [SPI.AUTO]Engineering Sciences [physics]/Automatic; [SDU.STU.PL]Sciences of the Universe [physics]/Earth Sciences/Planetology; 0103 physical sciences; FIREBALL NETWORK; observational [Methods]; meteors; Instrumentation and Methods for Astrophysics (astro-ph.IM); 0105 earth and related environmental sciences; Meteoroid; INNISFREE METEORITE; [INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV]; Astronomy and Astrophysics; METEORITE FALL; Meteorites meteors meteoroid; Camera network; Space and Planetary Science; [SDU]Sciences of the Universe [physics]; Interplanetary spaceflight; meteroids tracking; meteoroids - surveys - methods: observational; SYSTEM; Interplanetary medium; Astrophysics - Earth and Planetary Astrophysics; Meteorites
researchProduct

Reverse-Safe Text Indexing

2021

We introduce the notion of reverse-safe data structures. These are data structures that prevent the reconstruction of the data they encode (i.e., they cannot be easily reversed). A data structure D is called z-reverse-safe when there exist at least z datasets with the same set of answers as the ones stored by D. The main challenge is to ensure that D stores as many answers to useful queries as possible, is constructed efficiently, and has size close to the size of the original dataset it encodes. Given a text of length n and an integer z, we propose an algorithm that constructs a z-reverse-safe data structure (z-RSDS) that has size O(n) and answers decision and counting pattern matc…

Data structures; Computer science; Suffix tree; suffix tree; 0102 computer and information sciences; 02 engineering and technology; text indexing; 01 natural sciences; Theoretical Computer Science; law.invention; Set (abstract data type); law; 020204 information systems; 0202 electrical engineering electronic engineering information engineering; Pattern matching; data privacy; Settore INF/01 - Informatica; Search engine indexing; data privacy; Data structures; pattern matching; suffix tree; text indexing; Data structure; Matrix multiplication; pattern matching; 010201 computation theory & mathematics; Data structure; Algorithm; Adversary model; Integer (computer science); ACM Journal of Experimental Algorithmics
researchProduct

Longest Motifs with a Functionally Equivalent Central Block

2004

This paper presents a generalization of the notion of longest repeats with a block of k don't care symbols introduced by [Crochemore et al., LATIN 2004] (for k fixed) to longest motifs composed of three parts: a first and last that parameterize match (that is, match via some symbol renaming, initially unknown), and a functionally equivalent central block. Such three-part motifs are called longest block motifs. Different types of functional equivalence, and thus of matching criteria for the central block are considered, which include as a subcase the one treated in [Crochemore et al., LATIN 2004] and extend to the case of regular expressions with no Kleene closure or …

Discrete mathematics; 0303 health sciences; [INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS]; Block (permutation group theory); 0102 computer and information sciences; 01 natural sciences; Combinatorics; Kleene algebra; 03 medical and health sciences; Closure (mathematics); 010201 computation theory & mathematics; Algorithmics; Kleene star; Regular expression; Time complexity; 030304 developmental biology; Mathematics; Complement (set theory)
researchProduct