0000000000115576
AUTHOR
Jose Flich
Addressing Manufacturing Challenges with Cost-Efficient Fault Tolerant Routing
The high-performance computing domain is enriching with the inclusion of Networks-on-chip (NoCs) as a key component of many-core (CMPs or MPSoCs) architectures. NoCs face the communication scalability challenge while meeting tight power, area and latency constraints. Designers must address new challenges that were not present before. Defective components, the enhancement of application-level parallelism or power-aware techniques may break topology regularity, thus, efficient routing becomes a challenge.In this paper, uLBDR (Universal Logic-Based Distributed Routing) is proposed as an efficient logic-based mechanism that adapts to any irregular topology derived from 2D meshes, being an alter…
Cost-Effective Congestion Management for Interconnection Networks Using Distributed Deterministic Routing
The Interconnection networks are essential elements in current computing systems. For this reason, achieving the best network performance, even in congestion situations, has been a primary goal in recent years. In that sense, there exist several techniques focused on eliminating the main negative effect of congestion: the Head of Line (HOL) blocking. One of the most successful HOL blocking elimination techniques is RECN, which can be applied in source routing networks. FBICM follows the same approach as RECN, but it has been developed for distributed deterministic routing networks. Although FBICM effectively eliminates HOL blocking, it requires too much resources to be implemented. In this …
On the impact of within-die process variation in GALS-Based NoC Performance
[EN] Current integration scales allow designing chip multiprocessors (CMP), where cores are interconnected by means of a network-on-chip (NoC). Unfortunately, the small feature size of current integration scales causes some unpredictability in manufactured devices because of process variation. In NoCs, variability may affect links and routers causing them not to match the parameters established at design time. In this paper, we first analyze the way that manufacturing deviations affect the components of a NoC by applying a new comprehensive and detailed within-die variability model to 200 instances of an 8¿8 mesh NoC synthesized using 45 nm technology. Later, we show that GALS-based NoCs pr…
NoC Reconfiguration for CMP Virtualization
At NoC level, the traffic interferences can be drastically reduced by using virtualization mechanisms. An effective strategy to virtualize a NoC consists in dividing the network in different partitions, each one serving different applications and traffic flows. In this paper, we propose a NoC reconfiguration mechanism to support NoC virtualization under real scenarios. Dynamic reassignment of network resources to different partitions is allowed in order to NoC dynamically adapts to application needs. Evaluation results show a good behavior of CMP virtualization.
Combining congested-flow isolation and injection throttling in HPC interconnection networks
Existing congestion control mechanisms in interconnects can be divided into two general approaches. One is to throttle traffic injection at the sources that contribute to congestion, and the other is to isolate the congested traffic in specially designated resources. These two approaches have different, but non-overlapping weaknesses. In this paper we present in detail a method that combines injection throttling and congested-flow isolation. Through simulation studies we first demonstrate the respective flaws of the injection throttling and of flow isolation. Thereafter we show that our combined method extracts the best of both approaches in the sense that it gives fast reaction to congesti…
Cost-Efficient On-Chip Routing Implementations for CMP and MPSoC Systems
[EN] The high-performance computing domain is enriching with the inclusion of networks-on-chip (NoCs) as a key component of many-core (CMPs or MPSoCs) architectures. NoCs face the communication scalability challenge while meeting tight power, area, and latency constraints. Designers must address new challenges that were not present before. Defective components, the enhancement of application-level parallelism, or power-aware techniques may break topology regularity, thus, efficient routing becomes a challenge. This paper presents universal logic-based distributed routing (uLBDR), an efficient logic-based mechanism that adapts to any irregular topology derived from 2-D meshes, instead of usi…
A New Scalable and Cost-Effective Congestion Management Strategy for Lossless Multistage Interconnection Networks
In this paper, we propose a new congestion management strategy for lossless multistage interconnection networks that scales as network size and/or link bandwidth increase. Instead of eliminating congestion, our strategy avoids performance degradation beyond the saturation point by eliminating the HOL blocking produced by congestion trees. This is achieved in a scalable manner by using separate queues for congested flows. These are dynamically allocated only when congestion arises, and deallocated when congestion subsides. Performance evaluation results show that our strategy responds to congestion immediately and completely eliminates the performance degradation produced by HOL blocking whi…
On the potential of NoC virtualization for multicore chips
As the end of Moores-law is on the horizon, power becomes a limiting factor to continuous increases in performance gains for single-core processors. Processor engineers have shifted to the multicore paradigm and many-core processors are a reality. Within the context of these multi-core chips, three key metrics point themselves out as being of major importance, performance, fault-tolerance (including yield), and power consumption. A solution that optimizes all three of these metrics is challenging. As the number of cores increases the importance of the interconnection network-on-chip (NoC) grows as well, and chip designers should aim to optimize these three key metrics in the NoC context as …
Logic-Based Distributed Routing for NoCs
The design of scalable and reliable interconnection networks for multicore chips (NoCs) introduces new design constraints like power consumption, area, and ultra low latencies. Although 2D meshes are usually proposed for NoCs, heterogeneous cores, manufacturing defects, hard failures, and chip virtualization may lead to irregular topologies. In this context, efficient routing becomes a challenge. Although switches can be easily configured to support most routing algorithms and topologies by using routing tables, this solution does not scale in terms of latency and area. We propose a new circuit that removes the need for using routing tables. The new mechanism, referred to as logic-based dis…
Exploring NoC Virtualization Alternatives in CMPs
Chip Multiprocessor systems (CMPs) contain more and more cores in every new generation. However, applications for these systems do not scale at the same pace. Thus, in order to obtain a good utilization several applications will need to coexist in the system and in those cases virtualization of the CMP system will become mandatory. In this paper we analyze two virtualization strategies at NoC-level aiming to isolate the traffic generated by each application to reduce or even eliminate interferences among messages belonging to different applications. The first model handles most interferences among messages with a virtual-channels (VCs) implementation minimizing both execution time and netwo…
An Efficient Implementation of Distributed Routing Algorithms for NoCs
The design of NoCs for multi-core chips introduces new design constraints like power consumption, area, and ultra low latencies. Although 2D meshes are preferred, heterogeneous blocks, fabrication faults, reliability issues, and chip virtualization may lead to the need of irregular topologies or regions. In this situation, efficient routing becomes a challenge. Although the use of routing tables at switches is flexible, it does not scale in terms of latency and area due to its memory requirements. LBDR (logic-based distributed routing) is proposed as a new routing method that removes the need of using routing tables at all. LBDR enables the implementation of many routing algorithms on most …