Search results for " Computer"

showing 10 items of 6910 documents

Partitioning net carbon dioxide fluxes into photosynthesis and respiration using neural networks

2020

The eddy covariance (EC) technique is used to measure the net ecosystem exchange (NEE) of CO2 between ecosystems and the atmosphere, offering a unique opportunity to study ecosystem responses to climate change. NEE is the difference between the total CO2 release due to all respiration processes (RECO) and the gross carbon uptake by photosynthesis (GPP). These two gross CO2 fluxes are derived from EC measurements by applying partitioning methods that rely on physiologically based functional relationships with a limited number of environmental drivers. However, the partitioning methods applied in the global FLUXNET network of EC observations do not account for the multiple co‐acting…

0106 biological sciences | ecosystem respiration | 010504 meteorology & atmospheric sciences | net ecosystem exchange | neural network | Eddy covariance | Climate change | Atmospheric sciences | Photosynthesis | 01 natural sciences | 7. Clean energy | Carbon Cycle | Atmosphere | Flux (metallurgy) | FluxNet | Respiration | eddy covariance | Environmental Chemistry | Ecosystem | Primary Research Article | Photosynthesis | Ecosystem | 0105 earth and related environmental sciences | General Environmental Science | Global and Planetary Change | Ecology | carbon dioxide fluxes partitioning | Respiration | gross primary production (GPP) | Carbon Dioxide | Biological Sciences | 15. Life on land | gross primary production | machine learning | 13. Climate action | [SDE]Environmental Sciences | Environmental science | Neural Networks Computer | Seasons | ecosystem respiration (RECO) | Environmental Sciences | 010606 plant biology & botany | Global Change Biology

researchProduct

Efficient Parallel Sort on AVX-512-Based Multi-Core and Many-Core Architectures

2019

Sorting kernels are a fundamental part of numerous applications. The performance of sorting implementations is usually limited by a variety of factors such as computing power, memory bandwidth, and branch mispredictions. In this paper we propose an efficient hybrid sorting method which takes advantage of wide vector registers and the high bandwidth memory of modern AVX-512-based multi-core and many-core processors. Our approach employs a combination of vectorized bitonic sorting and load-balanced multi-threaded merging. Thread-level and data-level parallelism are used to exploit both compute power and memory bandwidth. Our single-threaded implementation is ~30x faster than qsort in the C st…

020203 distributed computing | Bitonic sorter | Speedup | Computer science | Radix sort | Sorting | Memory bandwidth | 02 engineering and technology | Parallel computing | Bitonic sorting | 020202 computer hardware & architecture | 0202 electrical engineering electronic engineering information engineering | sort | qsort | Merge sort | Branch misprediction | Xeon Phi | 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS)

researchProduct

Online Scheduling of Task Graphs on Hybrid Platforms

2018

Modern computing platforms commonly include accelerators. We target the problem of scheduling applications modeled as task graphs on hybrid platforms made of two types of resources, such as CPUs and GPUs. We consider that task graphs are uncovered dynamically, and that the scheduler has information only on the available tasks, i.e., tasks whose predecessors have all been completed. Each task can be processed by either a CPU or a GPU, and the corresponding processing times are known. Our study extends a previous \(4\sqrt{m/k}\)-competitive online algorithm [2], where m is the number of CPUs and k the number of GPUs (\(m\ge k\)). We prove that no online algorithm can have a competitive ratio …

020203 distributed computing | Competitive analysis | online algorithms | Computer science | Heuristic | Scheduling | Symmetric multiprocessor system | 02 engineering and technology | Parallel computing | Upper and lower bounds | heterogeneous computing | Graph | 020202 computer hardware & architecture | Scheduling (computing) | task graphs | 0202 electrical engineering electronic engineering information engineering | Online algorithm | [INFO.INFO-DC]Computer Science [cs]/Distributed Parallel and Cluster Computing [cs.DC]

researchProduct

Multi-application Based Network-on-Chip Design for Mesh-of-Tree Topology Using Global Mapping and Reconfigurable Architecture

2019

This paper outlines a multi-application mapping for Mesh-of-Tree (MoT) topology based Network-on-Chip (NoC) design using a reconfigurable architecture. A two-phase Particle Swarm Optimization (PSO) is proposed for the reconfigurable architecture to minimize the communication cost. In the first phase, global mapping is done by combining multiple applications; in the second phase, reconfiguration is achieved by switching the cores to nearby routers using multiplexers. Experiments have been carried out for several application benchmarks and for synthetic applications generated using the TGFF tool. The results show significant improvement in communication cost after reconfiguration.

020203 distributed computing | Computer science | Control reconfiguration | Particle swarm optimization | Topology (electrical circuits) | 02 engineering and technology | Network topology | Multiplexing | Multiplexer | 020202 computer hardware & architecture | Network on a chip | Computer architecture | 0202 electrical engineering electronic engineering information engineering | Architecture | 2019 32nd International Conference on VLSI Design and 2019 18th International Conference on Embedded Systems (VLSID)

researchProduct

Nvidia CUDA parallel processing of large FDTD meshes in a desktop computer

2020

The Finite-Difference Time-Domain (FDTD) numerical method is a well-known and mature technique in computational electrodynamics. FDTD is usually used in the analysis of electromagnetic structures and antennas. However, it still carries a high computational burden, which limits its use in combination with optimization algorithms. FDTD can be parallelized to run on a GPU using MATLAB and CUDA tools. For instance, the simulation of a planar array with a three-dimensional FDTD mesh of 790x276x588 cells, for 6200 time steps, takes one day of elapsed time on the CPU of a 2.4 GHz Intel Core i3 in a personal computer with 8 GB of RAM. This time is reduced by a factor of 120 when the calcula…

020203 distributed computing | Computer science | Finite-difference time-domain method | Graphics processing unit | 02 engineering and technology | Computational science | CUDA | Personal computer | 0202 electrical engineering electronic engineering information engineering | Computational electromagnetics | 020201 artificial intelligence & image processing | Central processing unit | Time domain | MATLAB | computer | computer.programming_language | Proceedings of the 10th Euro-American Conference on Telematics and Information Systems

researchProduct

WarpDrive: Massively Parallel Hashing on Multi-GPU Nodes

2018

Hash maps are among the most versatile data structures in computer science because of their compact data layout and expected constant time complexity for insertion and querying. However, the associated memory access patterns during the probing phase are highly irregular, resulting in strongly memory-bound implementations. Massively parallel accelerators such as CUDA-enabled GPUs may overcome this limitation by virtue of their fast video memory, which offers almost one TB/s of bandwidth compared with less than 100 GB/s for the main memory modules of state-of-the-art CPUs. Unfortunately, the size of hash maps supported by existing single-GPU hashing implementations is restricted by the limited amount of …

020203 distributed computing | Computer science | Hash function | 0102 computer and information sciences | 02 engineering and technology | Parallel computing | Data structure | 01 natural sciences | Hash table | Electronic mail | Memory management | 010201 computation theory & mathematics | Scalability | 0202 electrical engineering electronic engineering information engineering | Massively parallel | Time complexity | 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS)

researchProduct

Torus Topology based Fault-Tolerant Network-on-Chip Design with Flexible Spare Core Placement

2018

The increase in the density of the IP cores being fabricated on a chip poses challenges for on-chip communication and heat dissipation. To overcome these issues, Network-on-Chip (NoC) based communication architecture is introduced. In the nanoscale era, NoCs are prone to faults, which result in performance degradation and unreliability. Hence, efficient fault-tolerant methods are required to keep the system reliable in the face of diverse component failures. This paper presents a flexible spare core placement in torus topology based fault-tolerant NoC design. Communication involving a failed core is handled by selecting the best position for a spare core in the torus network. By conside…

020203 distributed computing | Computer science | Particle swarm optimization | Fault tolerance | Topology (electrical circuits) | Hardware_PERFORMANCEANDRELIABILITY | 02 engineering and technology | Chip | Topology | 020202 computer hardware & architecture | Reduction (complexity) | Network on a chip | Spare part | 0202 electrical engineering electronic engineering information engineering | Metaheuristic

researchProduct

Wireless NoC for Inter-FPGA Communication: Theoretical Case for Future Datacenters

2020

Integration of FPGAs in datacenters may have different motivations, from acceleration to energy efficiency, but the goal of better performance tops all. FPGAs are being utilized in a variety of ways today, tightly coupled with heterogeneous computing resources and as a standalone network of homogeneous resources. Open source software stacks, proprietary toolchains, and programming languages with advanced methodologies are hitting hard on the programmability wall of the FPGAs. The deployment of FPGAs in datacenters will be neither sustainable nor economical without realizing multi-tenancy across multiple FPGAs. Inter-FPGA communication among multiple FPGAs remained relatively less addressed p…

020203 distributed computing | Computer science | business.industry | Wireless network | Distributed computing | Cloud computing | 02 engineering and technology | Virtualization | computer.software_genre | Bottleneck | 020202 computer hardware & architecture | Software deployment | 0202 electrical engineering electronic engineering information engineering | Wireless | [INFO]Computer Science [cs] | business | Field-programmable gate array | computer | ComputingMilieux_MISCELLANEOUS | Efficient energy use | 2020 IEEE 23rd International Multitopic Conference (INMIC)

researchProduct

A segmentation algorithm for noisy images

2005

This paper presents a segmentation algorithm for gray-level images and addresses issues related to its performance on noisy images. It formulates the image segmentation problem as a partition of a weighted image neighborhood hypergraph. To overcome the computational difficulty of solving this problem directly, multilevel hypergraph partitioning is used. To evaluate the algorithm, we study how noise affects its performance; alpha-stable noise is considered and its effects on the algorithm are studied. Keywords: graph, hypergraph, neighborhood hypergraph, multilevel hypergraph partitioning, image segmentation and noise removal.

020203 distributed computing | Hypergraph | Mathematics::Combinatorics | [ INFO ] Computer Science [cs] | Computer science | business.industry | Segmentation-based object categorization | ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION | Scale-space segmentation | Image processing | 02 engineering and technology | Image segmentation | [INFO] Computer Science [cs] | 020202 computer hardware & architecture | Computer Science::Computer Vision and Pattern Recognition | 0202 electrical engineering electronic engineering information engineering | Graph (abstract data type) | Segmentation | Computer vision | [INFO]Computer Science [cs] | Artificial intelligence | business | Algorithm | MathematicsofComputing_DISCRETEMATHEMATICS

researchProduct

Experimental trade-offs between different strategies for multihop communications evaluated over real deployments of wireless sensor network for envir…

2018

Although much work has been done since wireless sensor networks appeared, there is not a great deal of information available on real deployments that incorporate the basic features associated with these networks, in particular multihop routing and long lifetimes. In this article, an Internet of Things oriented environmental monitoring application is described, in which temperature and relative humidity samples are taken by each mote at a rate of 2 samples/min and sent to a sink using multihop routing. Our goal is to analyse the different strategies for gathering the information from the different motes in this context. The trade-offs between ‘sending always’ and ‘buffering locally’ approac…

020203 distributed computing | Internet | Computer Networks and Communications | business.industry | Computer science | Trade offs | General Engineering | 020206 networking & telecommunications | 02 engineering and technology | Xarxes locals sense fil Wi-Fi | lcsh:QA75.5-76.95 | Environmental monitoring | 0202 electrical engineering electronic engineering information engineering | lcsh:Electronic computers. Computer science | Internet of Things | business | Wireless sensor network | Computer network

researchProduct