0000000001113899

AUTHOR

Daiga Plase

showing 4 related works from this author

A comparison of HDFS compact data formats: Avro versus Parquet

2017

This paper compares the file formats Avro and Parquet with text formats to evaluate data query performance. Different data query patterns were evaluated. Cloudera’s open-source Apache Hadoop distribution CDH 5.4 was used for the experiments presented in this article. The results show that the compact data formats (Avro and Parquet) take up less storage space than plain text data formats because of their binary encoding and compression. Furthermore, data queries on the column-based format Parquet are faster than on text data formats and Avro. Article in English. Comparison of HDFS compact data formats: Avro versus Parquet…

Big Data; Computer science; Big data; Energy Engineering and Power Technology; 02 engineering and technology; Management Science and Operations Research; computer.software_genre; Column (database); 020204 information systems; Data query; 0202 electrical engineering electronic engineering information engineering; HDFS; Database; business.industry; Plain text; Mechanical Engineering; computer.file_format; Avro; File format; Hive; Parquet; Data format; Hadoop; Binary data; 020201 artificial intelligence & image processing; business; computer
Mokslas – Lietuvos ateitis / Science – Future of Lithuania
researchProduct

Accelerating data queries on Hadoop framework by using compact data formats

2016

Massive amounts of data are generated by IoT devices, online transactions, click streams, emails, logs, posts, social networking interactions, sensors, mobile phones and their applications, and so on. The question is where and how to store these data in order to provide faster data access. Understanding and handling Big Data is a big challenge. Research on Big Data projects that use Hadoop technology, MapReduce-style frameworks, and compact data formats such as RCFile, SequenceFile, ORC, Avro, and Parquet shows that only two data formats (Avro and Parquet) support schema evolution and compression in order to use less storage space. In this paper, file formats like Avro and Parquet are c…

Distributed database; Database; Plain text; Computer science; business.industry; Big data; computer.file_format; computer.software_genre; File format; Column (database); Schema evolution; Data access; Binary data; business; computer
2016 IEEE 4th Workshop on Advances in Information, Electronic and Electrical Engineering (AIEEE)
researchProduct

A systematic review of SQL-on-Hadoop by using compact data formats

2016

Article also submitted for publication in Baltic J. Modern Computing (BJMC) on October 5, 2016.

Big Data; SQL; HDFS; General Computer Science; Database; Computer science; business.industry; Big data; Avro; computer.software_genre; Parquet; World Wide Web; Hadoop; Systematic review; business; computer; computer.programming_language
researchProduct

Decision support system for small and medium-sized enterprises (Lēmumu atbalsta sistēma maziem un vidējiem uzņēmumiem)

2015

Decision support systems are now used by enterprises of all kinds. The manager of a small or medium-sized enterprise also needs to make important decisions based on the results of analysing large volumes of information. Well-considered decisions help to reach the company's goals faster and more successfully, increase turnover, earn greater profit, reduce costs, and open the way to becoming a large company sooner. Existing solutions from large vendors (for example, SAP, IBM Cognos, SAS, Oracle, and others) are not always suitable for small enterprises, because they are characterised by high price, high technical resource and infrastructure requirements, complexity, ill-fitting functionality, and low flexibility for solving tasks in rapidly…

decision support system; Computer science; data analysis; business intelligence; decision support; computerised support
researchProduct