Flit (computer networking)


In computer networking, a flit (flow control unit or flow control digit) is a link-level atomic piece that forms a network packet or stream.[1] The first flit, called the header flit, holds information about the packet's route (namely the destination address) and sets up the routing behavior for all subsequent flits associated with the packet. The header flit is followed by zero or more body flits, containing the actual payload of data. The final flit, called the tail flit, performs some bookkeeping to close the connection between the two nodes.
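The head/body/tail structure described above can be sketched in code. This is a minimal illustration only: the `packetize` function, the field names, and the 4-byte flit payload are assumptions for the example, not any real router's format.

```python
# Illustrative sketch of splitting a packet into header, body, and tail flits.
# The flit payload size and field layout are assumptions for this example.

def packetize(dest, data, flit_payload_bytes=4):
    """Split `data` into a header flit, body flits, and a tail flit."""
    flits = [("HEAD", dest)]  # header flit carries the route (destination address)
    for i in range(0, len(data), flit_payload_bytes):
        flits.append(("BODY", data[i:i + flit_payload_bytes]))  # payload chunks
    flits.append(("TAIL", None))  # tail flit closes the connection
    return flits

flits = packetize(dest=7, data=b"hello world!")
# 12 bytes of payload -> 1 head flit, 3 body flits, 1 tail flit
```

All flits after the head follow the route the head flit set up, so only the head needs to carry the destination.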

A virtual connection holds the state needed to coordinate the handling of the flits of a packet. At a minimum, this state identifies the output port of the current node for the next hop of the route and the state of the virtual connection (idle, waiting for resources, or active). The virtual connection may also include pointers to the flits of the packet that are buffered on the current node and the number of flit buffers available on the next node.[2]: 237 

Interconnect network: basics


The growing need for performance from computing systems drove the industry into the multi-core and many-core arena. In this setup, the execution of a kernel (a program) is split across multiple processors and the computation happens in parallel, reducing execution time. This, however, implies that the processors must be able to communicate with each other and exchange data and control signals seamlessly. One straightforward approach is the bus-based interconnect, a group of wires connecting all the processors. This approach, however, does not scale as the number of processors in the system increases.[citation needed] Hence, a scalable high-performance interconnection network lies at the core of parallel computer architecture.

Basic network terminologies and background


Definitions of an interconnection network


An interconnection network can be formally defined as follows:

"An interconnection network I is represented by a strongly connected directed multigraph, I = G(N,C). The set of vertices of the multigraph N includes the set of processing element nodes P and the set of router nodes RT. The set of arcs C represents the set of unidirectional channels (possibly virtual) that connect either the processing elements to the routers or the routers to each other".[3]

The primary expectation of an interconnection network is to have as low a latency as possible: the time taken to transfer a message from one node to another should be minimal, while allowing a large number of such transactions to take place concurrently.[4] As with any engineering design, these traits must be balanced against the cost of implementation, which should be kept as low as possible. Having discussed what is expected of a network, let us look at a few design points that can be tweaked to obtain the necessary performance.

The basic building blocks of an interconnection network are its topology, routing algorithm, switching strategy, and flow control mechanism.

Topology: This refers to the general infrastructure of the interconnection network; the pattern in which multiple processors are connected. This pattern could either be regular or irregular, though many multi-core architectures today use highly regular interconnection networks.

Routing algorithm: This determines which path the message must take in order to ensure delivery to the destination node. The choice of path is based on multiple metrics, such as latency, security, and the number of nodes involved. There are many different routing algorithms, providing different guarantees and offering different performance trade-offs.

Switching strategy: The routing algorithm only determines the path that a message must take to reach its destination node. The actual traversal of the message within the network is the responsibility of the switching strategy. There are two basic types of switching strategies. In a circuit-switched network, a path is reserved and blocked off from other messages until the message is delivered to its destination node. A well-known example is the telephone network, which establishes a circuit through many switches for a call. The alternative is the packet-switched network, in which messages are broken down into smaller compact entities called packets. Each packet contains a part of the data in addition to a sequence number, so each packet can be transferred individually and the message reassembled at the destination based on the sequence numbers.

Flow control: As established above, multiple messages can flow through the interconnection network at any given time. It is the responsibility of the flow control mechanism, implemented at the router level, to decide which message gets to proceed and which message is held back.

Characteristics and metrics of a network

edit

Every network has a width w and a transmission rate f, which together determine the bandwidth of the network: b = w × f. The amount of data transferred in a single cycle is called a physical unit, or phit. The width of a network is thus equal to the phit size, and the bandwidth can also be expressed in phits/sec. Each message to be transferred can be broken down into smaller fixed-length entities called packets. Packets may in turn be broken down into message flow control units, or flits.
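The bandwidth relation b = w × f can be worked through with concrete numbers. The width and rate below are illustrative assumptions, not figures from any particular machine.

```python
# Bandwidth of a link: b = w * f, where w is the width in bits (the phit
# size, i.e. bits transferred per cycle) and f is the rate in cycles/sec.
# The specific numbers are assumptions for illustration.

w = 16         # link width in bits (phit size)
f = 150e6      # transmission rate: 150 million cycles per second
b = w * f      # bandwidth in bits/sec

print(b)       # bits/sec
print(b / w)   # the same bandwidth expressed in phits/sec (equals f)
```

Since one phit moves per cycle, the phit rate equals the cycle rate f, which is why bandwidth can be quoted either in bits/sec or phits/sec.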

The need for flits


It is important to note that flits represent logical units of information, while phits belong to the physical domain: a phit is the number of bits that can be transferred in parallel in a single cycle. Consider the Cray T3D.[5] Its interconnection network uses flit-level message flow control in which each flit is composed of eight 16-bit phits: the flit size is 128 bits and the phit size is 16 bits. The IBM SP2 switch[6] also uses flit-level message flow control, but its flit size is equal to its phit size, which is set to 8 bits.
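The Cray T3D figures above fix the flit/phit relationship, and imply how many cycles it takes to move one flit across the link:

```python
# Cray T3D figures from the text: each flit is composed of eight 16-bit phits.
phit_bits = 16
phits_per_flit = 8
flit_bits = phit_bits * phits_per_flit   # logical flit size in bits

cycles_per_flit = phits_per_flit         # one phit crosses the link per cycle

# IBM SP2 switch (also from the text): flit size equals phit size (8 bits),
# so one flit crosses the link per cycle.
```

The point of the comparison is that the logical unit (flit) and the physical unit (phit) need not coincide: on the T3D a flit takes eight cycles to cross a link, while on the SP2 a flit and a phit are the same thing.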

Flit width determination


Note that message size is the dominant factor (among many others) in determining flit width. Based on the message size, there are two conflicting design choices:

  • Keeping the size of each packet small, in which case the number of packets will increase, thus increasing the total number of transactions, while decreasing the size of each individual transaction.
  • Keeping the size of each packet large, in which case the number of packets will decrease, thus decreasing the total number of transactions, while increasing the size of each individual transaction.
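The trade-off in the two bullets above can be made concrete with numbers; the message and packet sizes below are assumptions for illustration.

```python
# A fixed-size message split under the two conflicting design choices.
# The 4096-byte message and the two candidate packet sizes are assumptions.

message_bytes = 4096

small_packet = 64      # small packets: many transactions, each small
large_packet = 1024    # large packets: few transactions, each large

n_small = message_bytes // small_packet   # number of small-packet transactions
n_large = message_bytes // large_packet   # number of large-packet transactions

print(n_small, n_large)
```

The same message requires sixteen times as many transactions with 64-byte packets as with 1024-byte ones, which is exactly the tension the design choice must resolve.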

Based on the size of the packets, the width of the physical link between two routers has to be decided. That is, if the packet size is large, the link width also has to be large; however, a larger link width implies more area and higher power dissipation. In general, link widths are kept to a minimum. The link width (which also determines the phit width) then factors into deciding the flit width.[7]

At this point, it is important to note that though inter-router transfers are necessarily carried out in terms of phits, the switching techniques operate in terms of flits.[7] For more details on the various switching techniques, refer to wormhole switching and cut-through switching. Since most switching techniques work on flits, they also have a major impact on the choice of flit width. Other determining factors include reliability, performance, and implementation complexity.

Example

An example of how flits work in a network

Consider an example of how packets are transmitted in terms of flits. Suppose a packet is transmitted from A to B, as in the figure. The transmission proceeds in the following steps.

  • The packet is split into flits W, X, Y and Z.
  • The transmit buffer in A loads the first flit, Z, and sends it to B.
  • After B receives Z, it moves the flit out of its buffer.
  • The transmit buffer in A then loads the next flit, Y, and sends it to B.
  • These steps repeat until all flits have been transmitted to B.
  • B then puts the flits back together to recover the whole packet.
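The steps above can be sketched as a small simulation. The single-flit transmit buffer and the send order (Z first, matching the figure) are taken from the example; everything else is illustrative.

```python
# Minimal simulation of flit-by-flit transmission from A to B.
# A holds the packet's flits and sends one at a time; B collects them
# and reassembles the packet. Names and structure are illustrative.

def transmit(packet_flits):
    """Send flits one by one from A's transmit buffer; B collects them."""
    tx_buffer = list(packet_flits)   # flits waiting at A: W, X, Y, Z
    received = []                    # flits collected at B
    while tx_buffer:
        flit = tx_buffer.pop()       # A loads the next flit (Z first, per the figure)
        received.append(flit)        # B moves the flit out of its receive buffer
    return received

arrived = transmit(["W", "X", "Y", "Z"])
packet = "".join(reversed(arrived))  # B puts the flits back together
```

Because the flits of one packet all follow the route the header set up, they arrive in order and B can reassemble the packet without per-flit sequence numbers.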

Summary


A flit (flow control unit/digit) is the unit of data transferred at the link level. A flit can be accepted or rejected at the receiver side based on the flow control protocol and the size of the receive buffer. Link-level flow control allows the receiver to send a continuous stream of signals to the sender, indicating whether it should keep sending flits or stop. When a packet is transmitted over a link, it must be split into multiple flits before the transmission begins.[citation needed]
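The on/off signaling described in the summary can be sketched as follows. The `Receiver` class, the two-slot buffer, and the "go"/"stop" signal names are illustrative assumptions, not any specific router's protocol.

```python
# Sketch of simple on/off link-level flow control: the receiver continuously
# signals "go" while it has buffer space and "stop" when its buffer is full.
# The class, buffer size, and signal names are assumptions for illustration.

class Receiver:
    def __init__(self, buffer_slots):
        self.buffer = []
        self.slots = buffer_slots

    def signal(self):
        # The continuous control signal sent back toward the sender.
        return "go" if len(self.buffer) < self.slots else "stop"

    def accept(self, flit):
        # A flit is accepted only while there is buffer space.
        if self.signal() == "go":
            self.buffer.append(flit)
            return True
        return False   # flit rejected: the sender must hold it and retry

rx = Receiver(buffer_slots=2)
rx.accept("f1")
rx.accept("f2")
print(rx.signal())   # buffer now full: sender must pause
```

Draining a flit from the buffer (e.g. forwarding it to the next hop) would flip the signal back to "go", letting the sender resume.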


References

  1. ^ "Archived copy" (PDF). Archived from the original (PDF) on 2015-03-20. Retrieved 2018-10-25.
  2. ^ William James Dally; Brian Towles (2004). "13.2.1". Principles and Practices of Interconnection Networks. Morgan Kaufmann Publishers, Inc. ISBN 978-0-12-200751-4.
  3. ^ Duato, J.; Lysne, O.; Pang, R.; Pinkston, T. M. (2005-05-01). "A theory for deadlock-free dynamic network reconfiguration. Part I". IEEE Transactions on Parallel and Distributed Systems. 16 (5): 412–427. doi:10.1109/TPDS.2005.58. ISSN 1045-9219. S2CID 15354425.
  4. ^ Elsevier. "Parallel Computer Architecture - 1st Edition". www.elsevier.com. Retrieved 2016-12-03.
  5. ^ Scott, Steven L.; Thorson, Greg (1994-01-01). "Optimized Routing in the Cray T3D". Proceedings of the First International Workshop on Parallel Computer Routing and Communication. PCRCW '94. 853. London, UK, UK: Springer-Verlag: 281–294. doi:10.1007/3-540-58429-3_44. ISBN 978-3540584292.
  6. ^ "The communication software and parallel environment of the IBM SP2". domino.research.ibm.com. 2001-02-23. Retrieved 2016-11-29.
  7. ^ a b Duato, Jose (2011-08-06). Interconnection Networks. Morgan Kaufmann. ISBN 9780123991805.