Search results for "Base"

showing 10 items of 8362 documents

Big Data Processing in the ATLAS Experiment: Use Cases and Experience

2015

Abstract: The physics goals of the next Large Hadron Collider run include high-precision tests of the Standard Model and searches for new physics. These goals require detailed comparison of data with computational models simulating the expected data behavior. To highlight the role that modeling and simulation play in future scientific discovery, we report on use cases and experience with a unified system built to process both real and simulated data of growing volume and variety.

Big Data | Computational model | Large Hadron Collider | Computer science | business.industry | Physics beyond the Standard Model | Data management | Big data | ATLAS experiment | computer.software_genre | Data science | Standard Model | Modeling and simulation | Parallel and Distributed Computing | Grid-based Simulation and Computing | Grid computing | Large Scale Scientific Instruments | General Earth and Planetary Sciences | Use case | business | computer | General Environmental Science | Procedia Computer Science
researchProduct

Big Data in metagenomics: Apache Spark vs MPI.

2020

The progress of next-generation sequencing has led to the availability of massive data sets used by a wide range of applications in biology and medicine. This has sparked significant interest in using modern Big Data technologies to process this large amount of information in distributed memory clusters of commodity hardware. Several approaches based on solutions such as Apache Hadoop or Apache Spark have been proposed. These solutions allow developers to focus on the problem while ignoring low-level details such as data distribution schemes or communication patterns among processing nodes. However, performance and scalability are also of high importance when…

Big Data | Computer and Information Sciences | Science | Big data | Message Passing Interface | Parallel computing | Research and Analysis Methods | Computing Methodologies | Computing Methodologies | Computer Architecture | Computer Software | Database and Informatics Methods | Software | Spark (mathematics) | Genetics | Mammalian Genomics | Multidisciplinary | business.industry | Applied Mathematics | Simulation and Modeling | Q | R | Biology and Life Sciences | Computational Biology | Software Engineering | Genomics | DNA | Genomic Databases | Genome Analysis | Computer Hardware | Supercomputer | Biological Databases | Animal Genomics | Physical Sciences | Scalability | Engineering and Technology | Metagenome | Medicine | Distributed memory | Metagenomics | business | Mathematics | Algorithms | Genome Bacterial | Software | Research Article | PLoS ONE
researchProduct

A comparison of HDFS compact data formats: Avro versus Parquet

2017

In this paper, file formats such as Avro and Parquet are compared with text formats to evaluate the performance of data queries. Different data query patterns have been evaluated. Cloudera’s open-source Apache Hadoop distribution CDH 5.4 was chosen for the experiments presented in this article. The results show that compact data formats (Avro and Parquet) take up less storage space than plain text data formats because of their binary data format and compression. Furthermore, data queries from the column-based data format Parquet are faster than queries from text data formats and Avro. Article in English. HDFS glaustųjų duomenų formatų palyginimas: Avro prieš Parquet…

Big Data | Computer science | Big data | Energy Engineering and Power Technology | 02 engineering and technology | Management Science and Operations Research | computer.software_genre | Column (database) | 020204 information systems | Data query | 0202 electrical engineering electronic engineering information engineering | HDFS | Database | business.industry | Plain text | Mechanical Engineering | computer.file_format | Avro | File format | Hive | Parquet | Data format | Hadoop | Binary data | 020201 artificial intelligence & image processing | business | computer | Mokslas – Lietuvos ateitis / Science – Future of Lithuania
researchProduct

FASTA/Q data compressors for MapReduce-Hadoop genomics: space and time savings made easy

2021

Abstract. Background: Storage of genomic data is a major cost for the Life Sciences, effectively addressed via specialized data compression methods. For the same reasons of abundance in data production, the use of Big Data technologies is seen as the future for genomic data storage and processing, with MapReduce-Hadoop as leaders. Somewhat surprisingly, none of the specialized FASTA/Q compressors is available within Hadoop. Indeed, their deployment there is not exactly immediate. Such a State of the Art is problematic. Results: We provide major advances in two different directions. Methodologically, we propose two general methods, with the corresponding software, that make it very easy to deploy …

Big Data | FASTQ format | Computer science | Big data | 02 engineering and technology | computer.software_genre | lcsh:Computer applications to medicine. Medical informatics | Biochemistry | 03 medical and health sciences | Software | Structural Biology | Spark (mathematics) | 0202 electrical engineering electronic engineering information engineering | Data_FILES | MapReduce | MapReduce; hadoop; sequence analysis; data compression | Molecular Biology | lcsh:QH301-705.5 | 030304 developmental biology | File system | 0303 health sciences | Settore INF/01 - Informatica | Database | business.industry | Methodology Article | Applied Mathematics | Sequence analysis | Genomics | Data compression; Hadoop; MapReduce; Sequence analysis; Algorithms; Big Data; Data Compression; Genomics; Software | Computer Science Applications | lcsh:Biology (General) | Software deployment | Hadoop | Data compression | lcsh:R858-859.7 | 020201 artificial intelligence & image processing | State (computer science) | business | computer | Algorithms | Software | Data compression | BMC Bioinformatics
researchProduct

Digital epidemiology: assessment of measles infection through Google Trends mechanism in Italy.

2019

Introduction. The primary aim of this study is to evaluate the temporal correlation between Google Trends and the data on measles infection arising from the conventional surveillance system, reported by the Istituto Superiore di Sanità's (ISS) bulletin. This study also aims to forecast the trends of reported infectious disease cases over time. Materials and Methods. The reported cases of measles were selected from January 2013 until October 2018. The data on Internet searches were obtained from Google Trends; the research data referring to the first 48 weeks of 2017 were aggregated on a weekly basis. The search volume provided by Google Trends has a relat…

Big Data | Internet | Time Factors | Databases Factual | Medical Informatics Computing | Measles Vaccine | Medical Informatic | Search Engine | Epidemiologic Studies | Italy | Measle | Vaccine-preventable diseases | Population Surveillance | Humans | Public Health | Epidemiologic Methods | Measles | Annali di igiene : medicina preventiva e di comunita
researchProduct

A systematic review of SQL-on-Hadoop by using compact data formats

2016

Article also submitted for publication in Baltic J. Modern Computing (BJMC) on October 5, 2016.

Big Data | SQL | HDFS | General Computer Science | Database | Computer science | business.industry | Big data | Avro | computer.software_genre | Parquet | World Wide Web | Hadoop | Systematic review | business | computer | computer.programming_language
researchProduct

A distance metric on binary trees using lattice-theoretic measures

1990

A so-called height function, which is a strictly antitone supervaluation, is defined on binary trees. Via lattice-theoretic results and using the height function, we can define a distance metric on binary trees of size n which can be computed in expected time $O(n^{3/2})$.

Binary tree | Data structure | Random binary tree | Computer Science Applications | Theoretical Computer Science | Height function | Combinatorics | Tree structure | Lattice (order) | Signal Processing | Metric (mathematics) | Metric tree | Computer Science::Databases | Information Systems | Mathematics | Information Processing Letters
researchProduct

A Practical Perspective: The Effect of Ligand Conformers on the Negative Image-Based Screening.

2019

Negative image-based (NIB) screening is a rigid molecular docking methodology that can also be employed in docking rescoring. During the NIB screening, a negative image is generated based on the target protein’s ligand-binding cavity by inverting its shape and electrostatics. The resulting NIB model is a drug-like entity or pseudo-ligand that is compared directly against ligand 3D conformers, as is done with a template compound in the ligand-based screening. This cavity-based rigid docking has been demonstrated to work with genuine drug targets in both benchmark testing and drug candidate/lead discovery. Firstly, the study explores in-depth the applicability of different ligand 3D conformer…

Binding Sites | Cyclooxygenase 2 Inhibitors | structure-based drug discovery | rigid docking | molecular docking | negative image-based (NIB) screening | virtual screening | Article | negative image-based rescoring (R-NiB) | cyclooxygenase-2 (COX-2) | Molecular Docking Simulation | Cyclooxygenase 2 | Drug Discovery | Humans | docking rescoring | Protein Binding | International journal of molecular sciences
researchProduct

Promoting Deoxygenation of Bio-Oil by Metal-Loaded Hierarchical ZSM-5 Zeolites

2016

3 figures, 5 tables, 1 scheme. This document is the Accepted Manuscript version of a Published Work that appeared in final form in ACS Sustainable Chemistry & Engineering, copyright © American Chemical Society after peer review and technical editing by the publisher. To access the final edited and published work see https://doi.org/10.1021/acssuschemeng.5b01606

Bio-oil upgrading | General Chemical Engineering | Inorganic chemistry | Lignocellulosic biomass | 02 engineering and technology | 01 natural sciences | Catalysis | Environmental Chemistry | Organic chemistry | Lewis acids and bases | Zeolite | Metal loading | Deoxygenation | Ion exchange | 010405 organic chemistry | Renewable Energy Sustainability and the Environment | Chemistry | Decarbonylation | Deoxygenation | General Chemistry | 021001 nanoscience & nanotechnology | 0104 chemical sciences | ZSM-5 | 0210 nano-technology | Hierarchical ZSM-5 zeolite | ACS Sustainable Chemistry & Engineering
researchProduct

A class-selective immunoassay for simultaneous analysis of anilinopyrimidine fungicides using a rationally designed hapten

2017

The development of multianalyte immunoassays constitutes a main research issue in the field of bioanalytical techniques. In the present study, class-specific antibodies against the three members of the anilinopyrimidine family of fungicides (pyrimethanil, cyprodinil and mepanipyrim) were raised by using a bioconjugate of a rationally designed hapten [5-(6-methyl-2-(phenylamino)pyrimidin-4-yl)pentanoic acid]. Highly sensitive immunoassays were developed for the generic determination of these compounds, using the competitive enzyme-linked immunosorbent assay (ELISA). Particularly, a direct antibody-coated competitive ELISA afforded identical sensitivity for the three anilinopyrimidines, with I…

Bioanalysis | Grapes | Chromatography-mass spectrometry | Linked-Immunosorbent-Assay | Pyrimethanil | Enzyme-Linked Immunosorbent Assay | Wine | 02 engineering and technology | 01 natural sciences | Biochemistry | Analytical Chemistry | Liquid-chromatography | chemistry.chemical_compound | Antibody-based immunoassays | Electrochemistry | medicine | Environmental Chemistry | Solid phase extraction | Residue analysis | Spectroscopy | Detection limit | Wine | Organophosphorus pesticides | Solid-phase extraction | Chromatography | medicine.diagnostic_test | Chemistry | 010401 analytical chemistry | Red wine | 021001 nanoscience & nanotechnology | 0104 chemical sciences | Fungicides Industrial | Fungicide | Immunoassay | Pyrimethanil | 0210 nano-technology | Hapten | Haptens
researchProduct