Search results for "oftware"

showing 10 items of 7396 documents

A comparison of HDFS compact data formats: Avro versus Parquet

2017

This paper compares the Avro and Parquet file formats with text formats to evaluate data query performance across different query patterns. The experiments use Cloudera's open-source Apache Hadoop distribution, CDH 5.4. The results show that the compact data formats (Avro and Parquet) take up less storage space than plain text formats because of their binary encoding and compression. Furthermore, queries against the column-based Parquet format are faster than queries against text formats and Avro. Article in English. HDFS glaustųjų duomenų formatų palyginimas: Avro prieš Parquet…

Big Data; Computer science; Big data; Energy Engineering and Power Technology; 02 engineering and technology; Management Science and Operations Research; computer.software_genre; Column (database); 020204 information systems; Data query; 0202 electrical engineering electronic engineering information engineering; HDFS; Database; business.industry; Plain text; Mechanical Engineering; computer.file_format; Avro; File format; Hive; Parquet; Data format; Hadoop; Binary data; 020201 artificial intelligence & image processing; business; computer; Mokslas – Lietuvos ateitis / Science – Future of Lithuania
researchProduct
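
The storage claim in the abstract above (binary encodings are smaller than plain text) can be illustrated with a minimal sketch. This is not Avro or Parquet: it uses Python's standard `struct` module as a stand-in for a fixed-width binary row format, and the record layout (`id`, `value`, `flag`) and the helper names are hypothetical, chosen only for illustration.

```python
import struct

# Hypothetical row layout: 64-bit int id, 64-bit float value, 1-byte bool flag.
ROW = struct.Struct("<qd?")  # little-endian, no padding: 17 bytes per row


def make_records(n):
    """Build n synthetic (id, value, flag) records."""
    return [(1_000_000 + i, i / 7, i % 2 == 0) for i in range(n)]


def encode_text(records):
    """Encode records as CSV-like plain text, one record per line."""
    lines = [f"{rid},{value!r},{int(flag)}" for rid, value, flag in records]
    return ("\n".join(lines) + "\n").encode("utf-8")


def encode_binary(records):
    """Encode records as fixed-width binary rows."""
    return b"".join(ROW.pack(rid, value, flag) for rid, value, flag in records)


def decode_binary(blob):
    """Decode fixed-width binary rows back into tuples."""
    return [ROW.unpack_from(blob, off) for off in range(0, len(blob), ROW.size)]


if __name__ == "__main__":
    records = make_records(1000)
    print("text bytes:  ", len(encode_text(records)))
    print("binary bytes:", len(encode_binary(records)))
```

Real compact formats add schemas, block compression and (for Parquet) a columnar layout on top of this basic idea, so the sketch shows only the binary-versus-text size gap, not query performance.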

FASTA/Q data compressors for MapReduce-Hadoop genomics: space and time savings made easy

2021

Background: Storage of genomic data is a major cost for the Life Sciences, effectively addressed via specialized data compression methods. Because genomic data are produced in such abundance, Big Data technologies are also seen as the future for genomic data storage and processing, with MapReduce-Hadoop as leaders. Somewhat surprisingly, none of the specialized FASTA/Q compressors is available within Hadoop, and deploying them there is not straightforward. This state of the art is problematic. Results: We provide major advances in two different directions. Methodologically, we propose two general methods, with the corresponding software, that make it very easy to deploy …

Big Data; FASTQ format; Computer science; Big data; 02 engineering and technology; computer.software_genre; lcsh:Computer applications to medicine. Medical informatics; Biochemistry; 03 medical and health sciences; Software; Structural Biology; Spark (mathematics); 0202 electrical engineering electronic engineering information engineering; Data_FILES; MapReduce; MapReduce; hadoop; sequence analysis; data compression; Molecular Biology; lcsh:QH301-705.5; 030304 developmental biology; File system; 0303 health sciences; Settore INF/01 - Informatica; Database; business.industry; Methodology Article; Applied Mathematics; Sequence analysis; Genomics; Data compression; Hadoop; MapReduce; Sequence analysis; Algorithms; Big Data; Data Compression; Genomics; Software; Computer Science Applications; lcsh:Biology (General); Software deployment; Hadoop; Data compression; lcsh:R858-859.7; 020201 artificial intelligence & image processing; State (computer science); business; computer; Algorithms; Software; Data compression; BMC Bioinformatics
researchProduct

Proposed use of a conversational agent for patient empowerment

2021

Empowerment is a process through which people acquire the necessary knowledge and self-awareness to understand their conditions and treatment options, make informed choices and self-manage their health conditions in daily life, in collaboration with medical professionals. Conversational Agents in healthcare could play an important role in the process of empowering a person but, so far, they have seldom been used for this purpose. This paper presents the basic principles and preliminary implementation of a conversational health agent for patient empowerment. It dialogues with the user in a "natural" way, collects health data from heterogeneous sources and provides the user wit…

Big Data; Patient Empowerment; Settore INF/01 - Informatica; Patient Empowerment; Artificial Intelligence; Applied psychology; Conversational Agent; Digital Health; Dialog system; Psychology; computer.software_genre; computer; Tailored Health Communication
researchProduct

A systematic review of SQL-on-Hadoop by using compact data formats

2016

Article also submitted for publication in Baltic J. Modern Computing (BJMC) on October 5, 2016.

Big Data; SQL; HDFS; General Computer Science; Database; Computer science; business.industry; Big data; Avro; computer.software_genre; Parquet; World Wide Web; Hadoop; Systematic review; business; computer; computer.programming_language
researchProduct

Modelling and development of a generic observatory to harvest and analyze big data

2021

Big Data fascinate, both because of the value they hold, which can provide a significant advantage in decision-making, and because of the challenges that their exploitation represents. These challenges are present at several levels of analytics workflows. At the level of software architecture design, volume and velocity require at least enough performance to handle the ingestion and storage of data. Data variety also has an impact, as several new storage systems have emerged, each one corresponding to a specific need. Polystores are systems that integrate this diversity to gain flexibility compared with data warehouses, which are now too rigid. However, this diversification…

Big Data; Stream processing; [INFO.INFO-OH] Computer Science [cs]/Other [cs.OH]; Tenseurs; Data models; Category Theory; Architectures logicielles; Tensors; Théorie des catégories; Données massives; Modèles de données; Software Architectures
researchProduct

Deep learning and process understanding for data-driven Earth system science

2017

Machine learning approaches are increasingly used to extract patterns and insights from the ever-increasing stream of geospatial data, but current approaches may not be optimal when system behaviour is dominated by spatial or temporal context. Here, rather than amending classical machine learning, we argue that these contextual cues should be used as part of deep learning (an approach that is able to extract spatio-temporal features automatically) to gain further process understanding of Earth system science problems, improving the predictive ability of seasonal forecasting and modelling of long-range spatial connections across multiple timescales, for example. The next step will be a hybri…

Big Data; Time Factors; Process modeling; Geospatial analysis; 010504 meteorology & atmospheric sciences; Process (engineering); 0208 environmental biotechnology; Big data; Geographic Mapping; 02 engineering and technology; computer.software_genre; Machine learning; 01 natural sciences; Pattern Recognition Automated; Data-driven; Deep Learning; Spatio-Temporal Analysis; Humans; Computer Simulation; Weather; 0105 earth and related environmental sciences; Multidisciplinary; business.industry; Deep learning; Uncertainty; Reproducibility of Results; Translating; Regression Psychology; 020801 environmental engineering; Earth system science; Knowledge; Pattern recognition (psychology); Earth Sciences; Female; Seasons; Artificial intelligence; business; Psychology; Facial Recognition; computer; Forecasting; Nature
researchProduct

Towards Service-oriented 5G: Virtualizing the Networks for Everything-as-a-Service

2018

It is widely acknowledged that the forthcoming 5G architecture will be highly heterogeneous and densely deployed. These changes relative to the current 4G bring many challenges in achieving efficient operation from the network management perspective. In this article, we introduce a revolutionary vision of the future 5G wireless networks, in which the network is no longer limited by hardware or even software. Specifically, building on the idea of virtualizing the wireless networks, which has recently gained increasing attention, we introduce the Everything-as-a-Service (XaaS) taxonomy to light the way towards designing service-oriented wireless networks. The concepts, chall…

Big Data; wireless networks; Networking and Internet Architecture (cs.NI); FOS: Computer and information sciences; virtualisointi; software; 5G-tekniikka; wireless network virtualization; Everything-as-a-service; virtualization; Computer Science - Networking and Internet Architecture; 5G technology; virtualisation; 5G mobile communication; hardware; lcsh:Electrical engineering. Electronics. Nuclear engineering; everything-as-a-service; computer architecture; lcsh:TK1-9971; 5G; langattomat verkot
researchProduct

Cluster-based active learning for compact image classification

2010

In this paper, we consider active sampling to label pixels grouped with hierarchical clustering. The objective of the method is to match the data relationships discovered by the clustering algorithm with the user's desired class semantics. The former are represented as a complete tree to be pruned, and the latter are iteratively provided by the user. The proposed active learning algorithm searches for the pruning of the tree that best matches the labels of the sampled points. By choosing the part of the tree to sample from according to the current pruning's uncertainty, sampling is focused on the most uncertain clusters. This way, large clusters for which the class membership is already fixed are no longer…

Binary tree; Contextual image classification; business.industry; Active learning (machine learning); Sampling (statistics); Pattern recognition; computer.software_genre; Hierarchical clustering; Multiclass classification; Tree (data structure); ComputingMethodologies_PATTERNRECOGNITION; Life Science; Artificial intelligence; Data mining; business; Cluster analysis; computer; Mathematics
researchProduct

On the Locality of Standard Search Operators in Grammatical Evolution

2014

Offspring should be similar to their parents and inherit their relevant properties. This general design principle of search operators in evolutionary algorithms is known as either the locality or the geometry of search operators. It takes a geometric perspective on search operators and suggests that the distance between an offspring and each of its parents should be less than or equal to the distance between both parents. This paper examines the locality of standard search operators used in grammatical evolution (GE) and genetic programming (GP) for binary tree problems. Both standard GE and GP search operators suffer from low locality since a substantial number of search steps result in an o…

Binary tree; Theoretical computer science; business.industry; Perspective (graphical); Locality; Evolutionary algorithm; Genetic programming; computer.software_genre; Random walk; Grammatical evolution; Artificial intelligence; business; computer; Natural language processing; Mathematics
researchProduct
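
The distance condition stated in the abstract above can be checked mechanically. The following is a minimal sketch on fixed-length bitstrings under Hamming distance, which is an assumption made for brevity: the paper itself studies GE and GP operators on binary tree problems, and the function names here are hypothetical.

```python
import random


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def uniform_crossover(p1: str, p2: str, rng: random.Random) -> str:
    """Pick each offspring position from one of the two parents at random."""
    return "".join(rng.choice((x, y)) for x, y in zip(p1, p2))


def is_geometric(offspring: str, p1: str, p2: str) -> bool:
    """True if the offspring is no farther from either parent
    than the parents are from each other."""
    d = hamming(p1, p2)
    return hamming(offspring, p1) <= d and hamming(offspring, p2) <= d


if __name__ == "__main__":
    rng = random.Random(0)
    p1 = "".join(rng.choice("01") for _ in range(16))
    p2 = "".join(rng.choice("01") for _ in range(16))
    child = uniform_crossover(p1, p2, rng)
    print(p1, p2, child, is_geometric(child, p1, p2))
```

Uniform crossover on bitstrings always satisfies the condition, because the offspring can differ from a parent only where the two parents differ. An offspring that changes a position on which both parents agree violates it when the parents are identical, for example `is_geometric("0001", "0000", "0000")`.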

Bio-inspired security analysis for IoT scenarios

2020

Computer security has recently become more and more important as the world economy's dependence on data has kept growing. The complexity of the systems that need to be kept secure calls for new models capable of abstracting the interdependencies among heterogeneous components that cooperate to provide the desired service. A promising approach is attack graph analysis; however, the manual analysis of attack graphs is tedious and error-prone. In this paper we propose to apply the metabolic network model to attack graph analysis, using three interacting bio-inspired algorithms: topological analysis, flux balance analysis, and extreme pathway analysis. A developed framework for graph building…

Bio-inspired technique; Service (systems architecture); Security analysis; IoT; Dependency (UML); Computer science; Network security; Distributed computing; media_common.quotation_subject; 0211 other engineering and technologies; 02 engineering and technology; Metabolic networks; Attack graphs; Bio-inspired algorithms; Bio-inspired techniques; IoT; Metabolic networks; Network security; Security analysis; System security; Attack graph; 03 medical and health sciences; 0302 clinical medicine; Use case; media_common; 021110 strategic defence & security studies; Security analysis; business.industry; Metabolic network; 030208 emergency & critical care medicine; Bio-inspired techniques; Network security; System security; Flux balance analysis; Interdependence; Hardware and Architecture; Bio-inspired algorithm; Graph (abstract data type); business; Software; Attack graphs; Bio-inspired algorithms
researchProduct