Search results for "processing"

showing 10 items of 8572 documents

FASTA/Q data compressors for MapReduce-Hadoop genomics: space and time savings made easy

2021

Background: Storage of genomic data is a major cost for the Life Sciences, effectively addressed via specialized data compression methods. Because genomic data is produced in such abundance, Big Data technologies are seen as the future of genomic data storage and processing, with MapReduce-Hadoop as leaders. Somewhat surprisingly, none of the specialized FASTA/Q compressors is available within Hadoop, and deploying them there is not straightforward. This state of the art is problematic. Results: We provide major advances in two different directions. Methodologically, we propose two general methods, with the corresponding software, that make it very easy to deploy …

Big Data | FASTQ format | Computer science | Big data | 02 engineering and technology | computer.software_genre | lcsh:Computer applications to medicine. Medical informatics | Biochemistry | 03 medical and health sciences | Software | Structural Biology | Spark (mathematics) | 0202 electrical engineering electronic engineering information engineering | Data_FILES | MapReduce | MapReduce; hadoop; sequence analysis; data compression | Molecular Biology | lcsh:QH301-705.5 | 030304 developmental biology | File system | 0303 health sciences | Settore INF/01 - Informatica | Database | business.industry | Methodology Article | Applied Mathematics | Sequence analysis | Genomics | Data compression; Hadoop; MapReduce; Sequence analysis; Algorithms; Big Data; Data Compression; Genomics; Software | Computer Science Applications | lcsh:Biology (General) | Software deployment | Hadoop | Data compression | lcsh:R858-859.7 | 020201 artificial intelligence & image processing | State (computer science) | business | computer | Algorithms | Software | Data compression | BMC Bioinformatics
researchProduct

SOCIAL NETWORKS, BIG DATA AND TRANSPORT PLANNING

2016

[EN] The characteristics of the people who are related or tied to an individual affect her activity-travel behavior. That influence is especially associated with social and recreational activities, which are increasingly important. Collecting high-quality data from those social networks is very difficult using traditional travel surveys, because respondents are asked about their general social life, which is more demanding to remember than specific facts. On the other hand, there are currently different potential sources of transport data, characterized by the huge amount of information available, the velocity with which it is obtained and the variety of formats in which it is presented. This s…

Big Data | Operations research | Transport Planning | Computer science | Big data | 02 engineering and technology | INGENIERÍA DEL TRANSPORTE | INGENIERIA E INFRAESTRUCTURA DE LOS TRANSPORTES | Social life | 0502 economics and business | 0202 electrical engineering electronic engineering information engineering | Transporte y movilidad 34807 / C - Máster universitario en sistemas inteligentes de transporte 2283 | sort | Recreation | 050210 logistics & transportation | Transportation planning | Social network | MINERVA project | business.industry | 05 social sciences | Data science | Variety (cybernetics) | Social Networks | Data quality | 020201 artificial intelligence & image processing | business | Libro de Actas CIT2016. XII Congreso de Ingeniería del Transporte
researchProduct

Modelling and development of a generic observatory to harvest and analyze big data

2021

Big Data fascinate, both because of the value they hold, which can provide a significant advantage in decision-making, and because of the challenges that their exploitation represents. These challenges are present at several levels of analytics workflows. At the level of software architecture design, the volume and the velocity require at least enough performance to handle the ingestion and storage of data. Data variety also has an impact, as several new storage systems have emerged, each one corresponding to a specific need. Polystores are systems that integrate this diversity to gain flexibility compared with data warehouses, which are now too rigid. However, this diversification…

Big Data | Stream processing | [INFO.INFO-OH] Computer Science [cs]/Other [cs.OH] | Tenseurs | Data models | Category Theory | Architectures logicielles | Tensors | Théorie des catégories | Données massives | Modèles de données | Software Architectures
researchProduct

Mining customer requirements from online reviews: A product improvement perspective

2016

We propose a filtering model to predict the helpfulness of reviews for product design. We provide a way to use the KANO model based on online reviews. We explore how to obtain insights from Big Data through a knowledge-based view. Big data commerce has become an e-commerce trend. Learning how to extract valuable, real-time insights from big data to drive smarter and more profitable business decisions is a main task of big data commerce. Using online reviews as an example, manufacturers have come to value how to select helpful online reviews and what can be learned from online reviews for new product development. In this research, we first proposed an automatic filtering model to predict the help…

Big data commerce | Engineering | INF/01 - Computer Science | Information Systems and Management | Big data | 02 engineering and technology | Online review | Management Information Systems | KANO | 0502 economics and business | 0202 electrical engineering electronic engineering information engineering | Product (category theory) | Robustness (economics) | Product design | business.industry | 05 social sciences | Settore IUS/10 - Diritto Amministrativo | Data science | Conjoint analysis | Product design | Conjoint analysis | Kano model | Helpfulness | New product development | 050211 marketing | 020201 artificial intelligence & image processing | business | Information Systems
researchProduct

Memristors in Nonlinear Network : Application to Information (Signal and Image) Processing

2021

A memristor is a two-terminal nonlinear dynamic electronic device. Typically, it is a passive nano-device whose conductivity is controlled by the flux (the time integral of the voltage across its terminals) or by the charge (the time integral of the current flowing through it), and it presents interesting features for versatile applications. This thesis considers the use of memristors as neighborhood connections in 2D cellular nonlinear or neural networks (CNNs), essentially for information (image and signal) processing and electronic prostheses. We develop a model of memristor-based 2D cellular nonlinear networks (CNNs) compatible with image applications by incorporating memristor in the adjacent neighborhoo…

Bilaterality | Memristor and models | Signal and image processing | Réseau 2 dimensions | [INFO.INFO-OH]Computer Science [cs]/Other [cs.OH] | Bilatéralité | 2 dimensional networks | [INFO.INFO-OH] Computer Science [cs]/Other [cs.OH] | Propagation (réseau 1D) | Fitzhugh-Nagumo cells | Traitement du signal et de l'image | Fitzhugh-Nagumo cellules | Propagation (1D network) | Memristor et models
researchProduct

Computation of the area in the discrete plane: Green’s theorem revisited

2017

The detection of the contour of a binary object is a common problem; however, the area of a region, and its moments, can be a significant parameter. In several metrology applications, the area of planar objects must be measured. The area is obtained by counting the pixels inside the contour or using a discrete version of Green's formula. Unfortunately, we obtain the area enclosed by the polygonal line passing through the centers of the pixels along the contour. We present a modified version of Green's theorem in the discrete plane, which allows for the computation of the exact area of a two-dimensional region in the class of polyominoes. Penalties are introduced and …

Binary Object | contour detection | Polyomino | Computation | Geometry | 0102 computer and information sciences | 02 engineering and technology | 01 natural sciences | connectedness | Pick's theorem | symbols.namesake | 0202 electrical engineering electronic engineering information engineering | Pick's theorem | Electrical and Electronic Engineering | Green's theorem | Mathematics | Digital pictures | Pixel | Mathematical analysis | Image segmentation | Atomic and Molecular Physics and Optics | Computer Science Applications | [SPI.TRON]Engineering Sciences [physics]/Electronics | 010201 computation theory & mathematics | [INFO.INFO-TI]Computer Science [cs]/Image Processing [eess.IV] | Binary data | symbols | [SPI.OPTI]Engineering Sciences [physics]/Optics / Photonic | 020201 artificial intelligence & image processing | polyominoes | Green's theorem
researchProduct
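
Note on the entry above: the gap its abstract describes, between counting pixels and applying a discrete Green's formula to the polygon through the contour's pixel centers, can be seen on a small example. The following is a minimal sketch under stated assumptions, not the paper's modified theorem or its penalties: pixels are assumed to be unit squares centered on integer coordinates, and `shoelace_area` and `block_contour` are hypothetical helper names introduced only for illustration.

```python
def shoelace_area(vertices):
    """Discrete Green's (shoelace) formula for a simple polygon.

    A = 1/2 * |sum_i (x_i * y_{i+1} - x_{i+1} * y_i)|
    """
    n = len(vertices)
    twice_area = 0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2


def block_contour(width, height):
    """Corner pixel centers of a width x height block of unit pixels.

    Assumption: pixel centers lie on integer coordinates, so the contour
    polygon runs from (0, 0) to (width - 1, height - 1).
    """
    return [(0, 0), (width - 1, 0), (width - 1, height - 1), (0, height - 1)]


if __name__ == "__main__":
    width, height = 3, 3
    pixel_count = width * height                      # area by counting pixels
    contour_area = shoelace_area(block_contour(width, height))
    print(pixel_count, contour_area)
```

For a 3 x 3 block of pixels, counting gives 9, while the polygon through the corner pixel centers encloses only 4; the difference is the half-pixel border that the center-to-center polygon leaves out.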

Fast Algorithms for Pseudoarboricity

2015

The densest subgraph problem, which asks for a subgraph with the maximum edges-to-vertices ratio d∗, is solvable in polynomial time. We discuss algorithms for this problem and the computation of a graph orientation with the lowest maximum indegree, which is equal to ⌈d∗⌉. This value also equals the pseudoarboricity of the graph. We show that it can be computed in O(|E| √ log log d∗) time, and that better estimates can be given for graph classes where d∗ satisfies certain asymptotic bounds. These runtimes are achieved by accelerating a binary search with an approximation scheme, and a runtime analysis of Dinitz’s algorithm on flow networks where all arcs, except the source and sink arcs, hav…

Binary search algorithm | Computation | 0102 computer and information sciences | 02 engineering and technology | Orientation (graph theory) | 01 natural sciences | Flow (mathematics) | 010201 computation theory & mathematics | Log-log plot | TheoryofComputation_ANALYSISOFALGORITHMSANDPROBLEMCOMPLEXITY | 0202 electrical engineering electronic engineering information engineering | Graph (abstract data type) | 020201 artificial intelligence & image processing | Unit (ring theory) | Algorithm | Time complexity | MathematicsofComputing_DISCRETEMATHEMATICS | Mathematics | 2016 Proceedings of the Eighteenth Workshop on Algorithm Engineering and Experiments (ALENEX)
researchProduct
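
Note on the entry above: its abstract relates the maximum edges-to-vertices ratio d∗ of a subgraph to the pseudoarboricity ⌈d∗⌉. The sketch below only illustrates that definition by exhaustive search on a tiny graph. It is exponential in the number of vertices and is not the paper's flow-based algorithm; `max_density` and `pseudoarboricity` are hypothetical names.

```python
from fractions import Fraction
from itertools import combinations
from math import ceil


def max_density(vertices, edges):
    """Brute-force d*: the maximum of |E(S)| / |S| over non-empty vertex sets S."""
    best = Fraction(0)
    for k in range(1, len(vertices) + 1):
        for subset in combinations(vertices, k):
            chosen = set(subset)
            # Count edges with both endpoints inside the chosen vertex set.
            inside = sum(1 for u, v in edges if u in chosen and v in chosen)
            best = max(best, Fraction(inside, k))
    return best


def pseudoarboricity(vertices, edges):
    """Ceiling of d*, as stated in the abstract."""
    return ceil(max_density(vertices, edges))


if __name__ == "__main__":
    k4_vertices = list(range(4))
    k4_edges = list(combinations(k4_vertices, 2))  # complete graph on 4 vertices
    print(max_density(k4_vertices, k4_edges), pseudoarboricity(k4_vertices, k4_edges))
```

On the complete graph with 4 vertices the densest subgraph is the whole graph, with 6 edges on 4 vertices, so d∗ = 3/2 and the pseudoarboricity is 2.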

On the Non-uniform Redundancy in Grammatical Evolution

2016

This paper investigates the redundancy of representation in grammatical evolution (GE) for binary trees. We analyze the entire GE solution space by creating all binary genotypes of predefined length and mapping them to phenotype trees, which are then characterized by their size, depth and shape. We find that the GE representation is strongly non-uniformly redundant. There are huge differences in the number of genotypes that encode one particular phenotype. Thus, it is difficult for GE to solve problems where the optimal tree solutions are underrepresented. In general, the GE mapping process is biased towards short tree structures, which implies high GE performance if the optimal solution requir…

Binary tree | Computer science | Binary number | 0102 computer and information sciences | 02 engineering and technology | ENCODE | 01 natural sciences | Tree (graph theory) | Tree structure | 010201 computation theory & mathematics | Grammatical evolution | 0202 electrical engineering electronic engineering information engineering | Redundancy (engineering) | 020201 artificial intelligence & image processing | Representation (mathematics) | Algorithm
researchProduct
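
Note on the entry above: the enumeration its abstract describes (create every binary genotype of a fixed length, map each to a tree, count genotypes per tree) can be sketched on a toy setup. Everything specific here is an assumption for illustration and not taken from the paper: codons are single bits, the grammar is `<tree> ::= L | (<tree><tree>)`, bit 0 selects the leaf and bit 1 the inner node, there is no wrapping, and genotypes that run out of codons are discarded. `map_genotype` and `phenotype_counts` are hypothetical names.

```python
from collections import Counter
from itertools import product


def map_genotype(bits):
    """Leftmost derivation of a tree string, or None if the codons run out."""
    stack = ["T"]  # top of the stack is the leftmost pending symbol
    out = []
    i = 0
    while stack:
        sym = stack.pop()
        if sym != "T":           # terminal symbol: copy to the output
            out.append(sym)
            continue
        if i == len(bits):       # nonterminal left but no codon to expand it
            return None
        codon = bits[i]
        i += 1
        if codon % 2 == 0:       # rule 0: <tree> -> L
            out.append("L")
        else:                    # rule 1: <tree> -> ( <tree> <tree> )
            stack.extend([")", "T", "T", "("])  # pushed reversed, "(" pops first
    return "".join(out)          # unused trailing codons are ignored


def phenotype_counts(length):
    """Number of valid genotypes of the given length per phenotype tree."""
    counts = Counter()
    for bits in product((0, 1), repeat=length):
        tree = map_genotype(bits)
        if tree is not None:
            counts[tree] += 1
    return counts


if __name__ == "__main__":
    for tree, count in phenotype_counts(5).most_common():
        print(tree, count)
```

Even in this toy setting the counts are far from uniform: with 5-bit genotypes, the single-leaf tree is encoded by 16 genotypes, while each three-leaf tree is encoded by exactly one, which mirrors the bias towards short trees that the abstract reports.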

A distance metric on binary trees using lattice-theoretic measures

1990

A so-called height function, which is a strictly antitone supervaluation, is defined on binary trees. Via lattice-theoretic results and using the height function, we can define a distance metric on binary trees of size n which can be computed in expected time O(n^(3/2)).

Binary tree | Data structure | Random binary tree | Computer Science Applications | Theoretical Computer Science | Height function | Combinatorics | Tree structure | Lattice (order) | Signal Processing | Metric (mathematics) | Metric tree | Computer Science::Databases | Information Systems | Mathematics | Information Processing Letters
researchProduct

Efficient lower and upper bounds of the diagonal-flip distance between triangulations

2006

It remains an open problem whether the rotation distance between binary trees or, equivalently, the diagonal-flip distance between triangulations can be computed in polynomial time. We present an efficient algorithm for computing lower and upper bounds of this distance between a pair of triangulations.

Binary tree | Open problem | 010102 general mathematics | Diagonal | Approximation algorithm | Triangulation (social science) | 0102 computer and information sciences | 01 natural sciences | Upper and lower bounds | Computer Science Applications | Theoretical Computer Science | Combinatorics | 010201 computation theory & mathematics | TheoryofComputation_ANALYSISOFALGORITHMSANDPROBLEMCOMPLEXITY | Signal Processing | [MATH.MATH-CO]Mathematics [math]/Combinatorics [math.CO] | 0101 mathematics | Rotation (mathematics) | Time complexity | ComputingMilieux_MISCELLANEOUS | Information Systems | Mathematics
researchProduct