0000000000297048
AUTHOR
Francisco J. Alfaro
Evaluation of an Alternative for Increasing Switch Radix
In large switch-based interconnection networks, increasing the switch radix results in a decrease in the total number of network components. In this paper we evaluate an interesting strategy for building high-radix switches going beyond the integration scale bounds. This approach is independent of the evolution of single-chip switches and will remain valid as integration scale keeps evolving. Simulation results show that with a correct internal switch design, this kind of switches achieves almost the same performance as single-chip switches with the same radix, which would be unfeasible with current integration scale.
NoC Reconfiguration for CMP Virtualization
At NoC level, the traffic interferences can be drastically reduced by using virtualization mechanisms. An effective strategy to virtualize a NoC consists in dividing the network in different partitions, each one serving different applications and traffic flows. In this paper, we propose a NoC reconfiguration mechanism to support NoC virtualization under real scenarios. Dynamic reassignment of network resources to different partitions is allowed in order to NoC dynamically adapts to application needs. Evaluation results show a good behavior of CMP virtualization.
Optimal Configuration for N-Dimensional Twin Torus Networks
Torus topology is one of the most common topologies used in the current largest supercomputers. Although 3D torus is widely used, recently some supercomputers in the Top500 list have been built using networks with topologies of five or six dimensions. To obtain an nD torus, 2n ports per node are needed. These ports can be offered by a single or several cards per node. In the second case, there are multiple ways of assigning the dimension and direction of the card ports. In a previous work we proposed the 3D Twin (3DT) torus which uses two 4-port cards per node, and obtained the optimal port configuration. This paper extends and generalizes that work in order to obtain the optimal port confi…
VEF Traces: A Framework for Modelling MPI Traffic in Interconnection Network Simulators
Simulation is often used to evaluate the behaviour and measure the performance of computing systems. Specifically, in high-performance interconnection networks, the simulation has been extensively considered to verify the behaviour of the network itself and to evaluate its performance. In this context, network simulation must be fed with network traffic, also referred to as network workload, whose nature has been traditionally synthetic. These workloads can be used for the purpose of driving studies on network performance, but often such workloads are not accurate enough if a realistic evaluation is pursued. For this reason, other non-synthetic workloads have gained popularity over last dec…
Deadline-based QoS Algorithms for High-performance Networks
Quality of service (QoS) is becoming an attractive feature for high-performance networks and parallel machines because it could allow a more efficient use of resources. Deadline-based algorithms can provide powerful QoS provision. However, the cost associated with keeping ordered lists of packets makes them impractical for high-performance networks. In this paper, we explore how to adapt efficiently the earliest deadline first family of algorithms to the high-speed networks environments. The results show excellent performance using just two virtual channels, FIFO queues, and a cost feasible with today's technology.
C-switches: Increasing switch radix with current integration scale
In large switch-based interconnection networks, increasing the switch radix results in a decrease in the total number of network components, and consequently the overall cost of the network can be significantly reduced. Moreover, high-radix switches are an attractive option to improve the network performance in terms of latency, since hop count is also reduced. However, there are some problems related to the integration scale to design such single-chip switches. In this paper we discuss key issues and evaluate an interesting alternative for building high-radix switches going beyond the integration scale bounds. The idea basically consists in combining several current smaller single-chip swi…
Efficient Switches with QoS Support for Clusters
Current interconnect standards providing hardware support for quality of service (QoS) consider up to 16 virtual channels (VCs) for this purpose. However, most implementations do not offer so many VCs because they increase the complexity of the switch and the scheduling delays. We have shown that this number of VCs can be significantly reduced, because it is enough to use two VCs for QoS purposes at each switch port. In this paper, we cover the weaknesses of that proposal and, not only we reduce VCs, but we also improve performance due to the flexibility assigning buffer memory.
Exploring NoC Virtualization Alternatives in CMPs
Chip Multiprocessor systems (CMPs) contain more and more cores in every new generation. However, applications for these systems do not scale at the same pace. Thus, in order to obtain a good utilization several applications will need to coexist in the system and in those cases virtualization of the CMP system will become mandatory. In this paper we analyze two virtualization strategies at NoC-level aiming to isolate the traffic generated by each application to reduce or even eliminate interferences among messages belonging to different applications. The first model handles most interferences among messages with a virtual-channels (VCs) implementation minimizing both execution time and netwo…