AUTHOR

Lucian Vintan

Boosting Design Space Explorations with Existing or Automatically Learned Knowledge

During development, processor architectures can be tuned and configured by many different parameters. For benchmarking, automatic design space explorations (DSEs) with heuristic algorithms are a helpful approach to find the best settings for these parameters according to multiple objectives, e.g. performance, energy consumption, or real-time constraints. But if the setup is slightly changed and a new DSE has to be performed, it will start from scratch, resulting in very long evaluation times. To reduce the evaluation times we extend the NSGA-II algorithm in this article, such that automatic DSEs can be supported with a set of transformation rules defined in a highly readable format, the fuz…

research product

Multi-objective optimisations for a superscalar architecture with selective value prediction

This work extends an earlier manual design space exploration of our Selective Load Value Prediction based superscalar architecture to the L2 unified cache. After that, we perform an automatic design space exploration using a specially developed software tool by varying several architectural parameters. Our goal is to find optimal configurations in terms of CPI (Cycles per Instruction) and energy consumption. By varying 19 architectural parameters, as we proposed, the design space exceeds 2.5 million billion configurations, which means that only heuristic search can be considered. Therefore, we propose different methods of automatic design space exploration based…

research product

Finding near-perfect parameters for hardware and code optimizations with automatic multi-objective design space explorations

In the design process of computer systems or processor architectures, typically many different parameters are exposed to configure, tune, and optimize every component of a system. For evaluations and before production, it is desirable to know the best setting for all parameters. Processing speed is no longer the only objective that needs to be optimized; power consumption, area, and so on have become very important. Thus, the best configurations have to be found with respect to multiple objectives. In this article, we use a multi-objective design space exploration tool called Framework for Automatic Design Space Exploration (FADSE) to automatically find near-optimal configurations in …

research product

A Visual Simulation Framework For Simultaneous Multithreading Architectures

Computing systems, and microarchitectures in particular, are continuously expanding and reaching a complexity that the human mind can no longer manage. In order to understand and control this expansion, researchers need to design and implement larger and more complex system simulators. In the current paradigm, simulators play the key role in going further by translating complex processing mechanisms into relevant and easy-to-understand information. This paper aims to give an illustrative description of the concepts and principles implemented in a Simultaneous Multithreading Architecture. We introduce the SMTAHSim framework, an educational tool that simulates in an interactive manner the imp…

research product

A Comparison of Multi-objective Algorithms for the Automatic Design Space Exploration of a Superscalar System

In today’s computer architectures the design spaces are huge, which makes it very difficult to find optimal configurations. One way to cope with this problem is to use Automatic Design Space Exploration (ADSE) techniques. We developed the Framework for Automatic Design Space Exploration (FADSE), which is focused on microarchitectural optimizations. This framework includes several state-of-the-art heuristic algorithms.
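To illustrate what the multi-objective algorithms mentioned in the abstract above ultimately search for, here is a minimal sketch of Pareto dominance and non-dominated filtering. It is not taken from FADSE; the two objectives (CPI and energy, both minimised) and the function names are assumptions chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical objective vector: (CPI, energy), both to be minimised.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the configurations that no other configuration dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Made-up evaluation results for five configurations.
    evaluated = [(1.0, 5.0), (2.0, 3.0), (2.5, 3.5), (3.0, 1.0), (1.0, 6.0)]
    print(pareto_front(evaluated))
```

This brute-force filter compares every pair of points, which is acceptable for small result sets; heuristic algorithms such as NSGA-II use more elaborate ranking and diversity mechanisms on top of the same dominance relation.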

research product

Understanding Prediction Limits Through Unbiased Branches

The majority of currently available branch predictors base their prediction accuracy on the previous k branch outcomes. Such predictors sustain high prediction accuracy but they do not consider the impact of unbiased branches, which are difficult to predict. In this paper, we quantify and evaluate the impact of unbiased branches and show that any gain in prediction accuracy is proportional to the frequency of unbiased branches. By using the SPECcpu2000 integer benchmarks we show that there is a significant proportion of unbiased branches which severely impact prediction accuracy (averaging between 6% and 24% depending on the prediction context used).
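The idea of measuring unbiased branches per prediction context can be sketched as follows. This is only an illustration under stated assumptions: the abstract above does not define the metric, so the notion of "unbiased" used here (neither outcome reaches a given share within a context), the 0.95 threshold, the use of per-branch local history as the context, and the name `unbiased_fraction` are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def unbiased_fraction(trace: Iterable[Tuple[int, bool]],
                      history_bits: int = 2,
                      threshold: float = 0.95) -> float:
    """Fraction of dynamic branches that fall into unbiased contexts.

    A context is (branch address, last `history_bits` outcomes of that branch).
    ASSUMPTION: a context counts as unbiased when its dominant outcome
    accounts for less than `threshold` of its occurrences.
    """
    counts: Dict[Tuple[int, int], List[int]] = defaultdict(lambda: [0, 0])
    history: Dict[int, int] = defaultdict(int)
    mask = (1 << history_bits) - 1

    for pc, taken in trace:
        counts[(pc, history[pc])][int(taken)] += 1
        # Shift the new outcome into this branch's local history.
        history[pc] = ((history[pc] << 1) | int(taken)) & mask

    total = sum(n + t for n, t in counts.values())
    unbiased = sum(n + t for n, t in counts.values()
                   if max(n, t) / (n + t) < threshold)
    return unbiased / total if total else 0.0


if __name__ == "__main__":
    alternating = [(0x40, i % 2 == 0) for i in range(100)]
    print(unbiased_fraction(alternating, history_bits=0))  # no context
    print(unbiased_fraction(alternating, history_bits=1))  # one bit of context
```

With no history, the alternating branch is split evenly between taken and not taken, so every instance is unbiased. With a single history bit each context becomes fully biased, which shows why the measured proportion depends on the prediction context used.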

research product

Two-level branch prediction using neural networks

Dynamic branch prediction in high-performance processors is a specific instance of a general time series prediction problem that occurs in many areas of science. Most branch prediction research focuses on two-level adaptive branch prediction techniques, a very specific solution to the branch prediction problem. An alternative approach is to look to other application areas and fields for novel solutions to the problem. In this paper, we examine the application of neural networks to dynamic branch prediction. We retain the first level history register of conventional two-level predictors and replace the second level PHT with a neural network. Two neural networks are considered: a learning vec…
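For readers unfamiliar with the baseline, the conventional two-level scheme that the abstract above starts from can be sketched as a history register indexing a pattern history table (PHT) of 2-bit saturating counters. This is a generic textbook-style sketch, not the authors' simulator: it uses a single global history register, and the class name, the default history length, and the initial counter value are assumptions. The neural network that replaces the PHT in the paper is not modelled here.

```python
from typing import Iterable


class TwoLevelPredictor:
    """Global history register (level 1) indexing a PHT of 2-bit counters (level 2)."""

    def __init__(self, history_bits: int = 4) -> None:
        self.history_bits = history_bits
        self.history = 0                        # last k outcomes, newest in bit 0
        self.pht = [1] * (1 << history_bits)    # 1 = weakly not taken (assumed start)

    def predict(self) -> bool:
        # Counter values 2 and 3 predict taken; 0 and 1 predict not taken.
        return self.pht[self.history] >= 2

    def update(self, taken: bool) -> None:
        counter = self.pht[self.history]
        self.pht[self.history] = min(3, counter + 1) if taken else max(0, counter - 1)
        mask = (1 << self.history_bits) - 1
        self.history = ((self.history << 1) | int(taken)) & mask


def accuracy(outcomes: Iterable[bool], history_bits: int = 4) -> float:
    """Run the predictor over a sequence of outcomes and return the hit rate."""
    predictor = TwoLevelPredictor(history_bits)
    hits = total = 0
    for taken in outcomes:
        hits += predictor.predict() == taken
        predictor.update(taken)
        total += 1
    return hits / total if total else 0.0


if __name__ == "__main__":
    pattern = [i % 2 == 0 for i in range(200)]  # taken, not taken, taken, ...
    print(accuracy(pattern, history_bits=2))
```

After a short warm-up the counters selected by each history pattern saturate, so a strictly alternating branch is predicted almost perfectly. The second level is the part that the paper swaps for a neural network while keeping the history register.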

research product

Automatic multi-objective optimization of parameters for hardware and code optimizations

Recent computer architectures can be configured in many different ways. To explore this huge design space, system simulators are typically used. As performance is no longer the only decisive factor, and e.g. power usage or the resource usage of the system also matter, it has become very hard for designers to select optimal configurations. In this article we use a multi-objective design space exploration tool called FADSE to explore the vast design space of the Grid Alu Processor (GAP) and its post-link optimizer called GAPtimize. We improved FADSE with techniques to make it more robust against failures and to speed up evaluations through parallel processing. For the GAP, we present an approximation o…

research product

Exploiting selective instruction reuse and value prediction in a superscalar architecture

In our previously published research we discovered some very difficult-to-predict branches, called unbiased branches. Since the overall performance of modern processors is seriously affected by misprediction recovery, these difficult branches in particular are a source of significant performance penalties. Our statistics show that about 28% of branches are dependent on critical Load instructions. Moreover, 5.61% of branches are unbiased and depend on critical Loads, too. In the same way, about 21% of branches depend on MUL/DIV instructions whereas 3.76% are unbiased and depend on MUL/DIV instructions. These dependences involve high-penalty mispredictions becoming serious performance obstac…

research product

A task scheduling algorithm for HPC applications using colored stochastic Petri Net models

The increase in demand for High Performance Computing (HPC) scientific applications motivates efforts to reduce the costs of running these applications. The problem to solve is that of dynamic multi-criteria optimal scheduling of an application on an HPC platform with a high number of heterogeneous nodes. The solution proposed by the authors is an HPC hardware-software architecture that includes the infrastructure for two-level (node and inter-node level) adaptive load balancing. The article presents the development of a Coloured Petri Net (CPN) for such an architecture. The model was used for the development of a dynamic distributed algorithm for the scheduling problem. The CPN allowed a …

research product

Multi-objective DSE algorithms' evaluations on processor optimization

Very complex micro-architectures, like complex superscalar/SMT or multicore systems, have a large number of configurations. Exploring this huge design space and trying to optimize multiple objectives, like performance, power consumption and hardware complexity, is a real challenge. In this paper, using the multi-objective design space exploration tool FADSE, we tried to optimize the hardware parameters of the complex superscalar Grid ALU Processor. We compared how different heuristic algorithms handle the DSE optimization. Three of these algorithms are taken from the jMetal library (NSGAII, SPEA2 and SMPSO) while the other two, CNSGAII and MOHC, were implemented by us. We show that in this huge design …

research product

Enhancing the Sniper Simulator with Thermal Measurement

This paper presents the enhancement of the Sniper multicore/manycore simulator with thermal measurement possibilities using the HotSpot simulator. We present a plugin that interacts with Sniper to retrieve simulation data (integration areas and power consumptions) and calls HotSpot to compute the corresponding thermal results. The plugin also builds a two-dimensional floorplan for the simulated microarchitecture. Furthermore, we plan to integrate the simulation methodology presented here into an automatic design space exploration process using the multi-objective optimization tool called FADSE.

Keywords: multicore; simulator; power consumption; thermal; HotSpot; Sniper

research product

Performance and energy optimisation in CPUs through fuzzy knowledge representation

This paper presents an automatic design space exploration using processor design knowledge for the multi-objective optimisation of a superscalar microarchitecture enhanced with selective load value prediction (SLVP). We introduced new important SLVP parameters and determined their influence on performance, energy consumption, and thermal dissipation. We significantly enlarged the initial processor design knowledge expressed through fuzzy rules and we analysed its role in the process of automatic design space exploration. The proposed fuzzy rules improve the diversity and quality of solutions, and the convergence speed of the design space exploration process. Experiments show tha…

research product

Unbiased Branches: An Open Problem

The majority of currently available dynamic branch predictors base their prediction accuracy on the previous k branch outcomes. Such predictors sustain high prediction accuracy but they do not consider the impact of unbiased branches, which are difficult to predict. In this paper, we evaluate the impact of unbiased branches in terms of prediction accuracy on a range of branch difference predictors using prediction by partial matching, multiple Markov prediction and neural-based prediction. Since our focus is on the impact that unbiased branches have on processor performance, timing issues and hardware costs are outside the scope of this investigation. Our simulation results, with the SPEC2000 in…

research product

Improving Computing Systems Automatic Multiobjective Optimization Through Meta-Optimization

This paper presents an extension of the Framework for Automatic Design Space Exploration (FADSE) tool using a meta-optimization approach, which improves the performance of design space exploration algorithms by driving two different multiobjective meta-heuristics concurrently. More precisely, we selected two genetic multiobjective algorithms: 1) non-dominated sorting genetic algorithm-II and 2) strength Pareto evolutionary algorithm 2, that work together in order to improve both the solutions’ quality and the convergence speed. With the proposed improvements, we ran FADSE in order to optimize the hardware parameters’ values of the Grid ALU Processor (GAP) micro-architecture from a b…

research product

Part-of-speech labeling for Reuters database

Even though the Vector Space Model used for document representation in information retrieval systems integrates only a small quantity of knowledge, it continues to be used due to its low computational cost, execution speed and simplicity. We try to improve this document representation by adding some syntactic information such as the parts of speech. In this paper, we have evaluated three different tagging algorithms in order to select the most suitable tagger for tagging the Reuters dataset. In this work, we have evaluated the taggers using only five different parts of speech: noun, verb, adverb, adjective and others. We considered these particular tags to be the most representative for describin…

research product