Search results for "Parallel"
showing 10 items of 667 documents
A note on Serrin's overdetermined problem
2014
We consider the solution of the torsion problem $$-\Delta u = N \quad\mathrm{in}\quad \Omega,\quad u = 0\quad\mathrm{on}\quad \partial\Omega,$$ where $\Omega$ is a bounded domain in $\mathbb{R}^N$. Serrin's celebrated symmetry theorem states that, if the normal derivative $u_\nu$ is constant on $\partial\Omega$, then $\Omega$ must be a ball. In [6], it has been conjectured that Serrin's theorem may be obtained by stability in the following way: first, for the solution $u$ of the torsion problem prove the estimate $$r_e - r_i \le C_t\Bigl(\max_{\Gamma_t} u-\min_{\Gamma_t} u\Bigr)$$ for some constant $C_t$ depending on $t$, where $r_e$ and $r_i$ are the radii of an annulus containing $\partial\Omega$ and $\Gamma_t$ is a surface parallel to $\partial\Omega$ at distance $t$ and sufficiently close to $\partial\Omega$; secondly…
MetaCache-GPU: Ultra-Fast Metagenomic Classification
2021
The cost of DNA sequencing has dropped exponentially over the past decade, making genomic data accessible to a growing number of scientists. In bioinformatics, localization of short DNA sequences (reads) within large genomic sequences is commonly facilitated by constructing index data structures which allow for efficient querying of substrings. Recent metagenomic classification pipelines annotate reads with taxonomic labels by analyzing their $k$-mer histograms with respect to a reference genome database. CPU-based index construction is often performed in a preprocessing phase due to the relatively high cost of building irregular data structures such as hash maps. However, the rapidly growi…
Dendrochemical assessment of mercury releases from a pond and dredged-sediment landfill impacted by a chlor-alkali plant.
2016
Although current Hg emissions from industrial activities may be accurately monitored, evidence of past releases to the atmosphere must rely on one or more environmental proxies. We used Hg concentrations in tree cores collected from poplars and willows to investigate the historical changes of Hg emissions from a dredged sediment landfill and compared them to a nearby control location. Our results demonstrated the potential value of using dendrochemistry to record historical Hg emissions from past industrial activities.
Optimization of GPU Computing (GPU-laskennan optimointi)
2013
Graphics cards, or graphics processing units (GPUs), provide a parallel computing platform on which program code can be executed by hundreds of cores. This platform makes it possible to solve mathematically laborious problems efficiently. However, the parallel execution environment of a GPU differs greatly from the sequential execution environment of a computer's CPU. To solve problems efficiently in a parallel environment, one must follow programming methods that are specifically suited to it. This work examines the fundamentals of parallel computing, how different programming methods affect a program's performance on a GPU, and how one can…
Parallel Algorithms for Listing Well-Formed Parentheses Strings
1998
We present two cost-optimal parallel algorithms generating the set of all well-formed parentheses strings of length 2n with constant delay for each generated string. In our first algorithm we generate in lexicographic order well-formed parentheses strings represented by bitstrings, and in the second one we use the representation by weight sequences. In both cases the computational model is based on a CREW PRAM architecture, where each processor performs the same algorithm simultaneously on a different set of data. Different processors can access the shared memory at the same time to read different data in the same or different memory locations, but no two processors are allowed to write i…
A novel hardware accelerator for the HEVC intra prediction
2015
A novel hardware accelerator for the High Efficiency Video Coding (HEVC) intra prediction is presented in this paper in order to reduce the computation complexity within this standard and to accelerate the concerned calculations. We propose a new pipelined structure, which we call the Processing Element (PE), to execute all angular modes, and we replicate it across the five paths that make up our architecture. We also present another structure to carry out the Planar mode. This architecture supports all intra prediction modes for all prediction unit sizes. The synthesis results show that our design can run at 213 MHz on a Xilinx Virtex 6 and is capable of processing in real time 120 10…
A HARDWARE SOLUTION FOR HEVC INTRA PREDICTION LOSSLESS CODING
2015
The lossless coding mode of the High Efficiency Video Coding (HEVC) main profile that bypasses transform, quantization, and in-loop filters is described. Compared to the HEVC non-lossless coding mode, the HEVC lossless coding mode provides perfect fidelity and an average bit-rate reduction of 3.2%–13.2%. It also significantly outperforms the existing lossless compression solutions, such as JPEG2000 and JPEG-LS for images as well as WinRAR for data archiving. A fully parallel-based solution is presented in this paper in order to reduce processing time and computation complexity resulting from intra prediction. Two higher performance structures are designed to perform …
Multi-objective optimisations for a superscalar architecture with selective value prediction
2012
This work extends an earlier manual design space exploration of our Selective Load Value Prediction based superscalar architecture to the L2 unified cache. We then perform an automatic design space exploration using a specially developed software tool by varying several architectural parameters. Our goal is to find optimal configurations in terms of CPI (Cycles per Instruction) and energy consumption. By varying 19 architectural parameters, as we proposed, the design space exceeds 2.5 million billion configurations, which means that only heuristic search can be considered. Therefore, we propose different methods of automatic design space exploration based…
SYSTOLIC GENERATION OF k-ARY TREES
1999
The only parallel generating algorithms for k-ary trees are those of Akl and Stojmenović in 1996 and of Vajnovszki and Phillips in 1997. In the first of them, trees are represented by an inversion table and the processor model is a linear array multicomputer. In the second, trees are represented by bitstrings and the algorithm executes on a shared memory multiprocessor. In this paper we give a parallel generating algorithm for k-ary trees represented by generalized P–sequences for execution on a linear array multicomputer.
Optimization of Application-Specific L1 Cache Translation Functions of the LEON3 Processor
2020
Reconfigurable caches offer an intriguing opportunity to tailor cache behavior to applications for better run-times and energy consumptions. While one may adapt structural cache parameters such as cache and block sizes, we adapt the memory-address-to-cache-index mapping function to the needs of an application. Using a LEON3 embedded multi-core processor with reconfigurable cache mappings, a metaheuristic search procedure, and Mibench applications, we show in this work how to accurately compare non-deterministic performances of applications and how to use this information to implement an optimization procedure that evolves application-specific cache mappings.