# Bufferless Routing in Optical Gaussian Macrochip Interconnect

Zhemin Zhang, Zhiyang Guo and Yuanyuan Yang

Department of Electrical and Computer Engineering, Stony Brook University, Stony Brook, NY 11794, USA

IEEE Hot Interconnects, August 22-23, 2012

## Outline

- Macrochip and bufferless routing in macrochip
- Gaussian network and Gaussian routing algorithm
- All-optical router design
- Performance evaluation
- Conclusion

## Macrochip

- The macrochip is a large silicon substrate in which multiple chips are embedded and interconnected by a network.

Macrochip features:

- Each site contains a processor, a DRAM chip and an optical bridge chip
- Etched optical waveguides
- Fully connected network



## Networks for Macrochip

- Problems with a fully connected network:
  1. Unacceptable wiring density: N(N-1) separate channels.
  2. Low port bandwidth.
  3. Long transmission delay.
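
To make the wiring-density problem concrete, the sketch below counts the N(N-1) point-to-point channels of a fully connected network and compares them with a network in which every site drives a fixed number of outgoing links. The function names, the sample site counts and the radix of 4 are illustrative assumptions, not figures from the slides.

```python
def fully_connected_channels(n_sites: int) -> int:
    """Directed point-to-point channels in a fully connected network: N(N-1)."""
    return n_sites * (n_sites - 1)


def fixed_radix_channels(n_sites: int, radix: int = 4) -> int:
    """Directed channels when every site drives `radix` outgoing links.

    The default radix of 4 is an assumption chosen for illustration.
    """
    return n_sites * radix


if __name__ == "__main__":
    for n in (25, 61, 64):  # sample site counts, chosen for illustration
        print(n, fully_connected_channels(n), fixed_radix_channels(n))
```

The fully connected count grows quadratically with the number of sites, while the fixed-radix count grows only linearly.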

- Challenges of adopting a low-radix network:
  1. No optical random access memory (RAM) is available for buffering.
  2. O/E/O conversion is extremely power-hungry.
  3. Circuit-switched networks suffer from low network utilization.

## Bufferless Routing in Macrochip

- Bufferless routing features:
  1. Each incoming packet is assigned an output port.
  2. Packets are allowed to be misrouted.
  3. Livelock must be avoided.

- Requirements in macrochip:
  1. Fast routing decision: hundreds of *ps*.
  2. Small macrochip size.
- Execution time of the Oldest First algorithm: 3 ns, well above the hundreds-of-*ps* target.

## Gaussian Network

- A Gaussian network is a low-radix network defined in terms of Gaussian integers.
  - Related terminology:
    1. Gaussian integer: a complex number with integral real and imaginary parts, for example, $4+3i$.
    2. A Gaussian network generated by $a+bi$ is denoted as $G_{a+bi}$.
  - $G_{a+bi}$ interconnection:
    - A link exists between $n_1$ and $n_2$ if $n_1-n_2=i^j+(x+yi)(a+bi)$ for some $j \in \{0, 1, 2, 3\}$ and some Gaussian integer $x+yi$.

## An Example of $G_{4+3i}$

- Nodes $6$ and $i$ are neighbors because $i - 6 = i^0 + (-1+i)(4+3i)$.
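
The link rule can be checked mechanically. The sketch below represents Gaussian integers as `(re, im)` tuples and tests whether the difference of two nodes, minus one of the four powers of $i$, is a multiple of the generator. The names `divides`, `are_neighbors` and `UNITS` are hypothetical helpers introduced for this illustration.

```python
UNITS = [(1, 0), (0, 1), (-1, 0), (0, -1)]  # i^0, i^1, i^2, i^3 as (re, im)


def divides(g, z):
    """Return True if Gaussian integer g divides z (both given as (re, im))."""
    a, b = g
    x, y = z
    norm = a * a + b * b
    # z / g = z * conj(g) / norm, so g divides z exactly when both parts of
    # z * conj(g) = (x*a + y*b) + (y*a - x*b)i are multiples of norm.
    return (x * a + y * b) % norm == 0 and (y * a - x * b) % norm == 0


def are_neighbors(n1, n2, g):
    """Link test: n1 - n2 = i^j + (x + yi)(a + bi) for some j in 0..3."""
    dx, dy = n1[0] - n2[0], n1[1] - n2[1]
    return any(divides(g, (dx - u, dy - v)) for u, v in UNITS)
```

For instance, `are_neighbors((0, 1), (6, 0), (4, 3))` checks the pair of nodes $i$ and $6$ in $G_{4+3i}$.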



## Hamiltonian Cycles in $G_{a+bi}$

- When a and b are coprime,  $G_{a+bi}$  can be decomposed into two edge-disjoint Hamiltonian cycles.
- Two Hamiltonian cycles in  $G_{4+3i}$ .
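
One simple decomposition can be sketched as follows, assuming $a$ and $b$ are coprime with $b \ge 1$. In that case the $N = a^2 + b^2$ nodes can be labelled $0, \dots, N-1$ by mapping $x+yi$ to $(x + yt) \bmod N$, where $t$ is the integer with $a + bt \equiv 0 \pmod N$. Repeated $+1$ steps then trace one cycle and repeated $+i$ steps trace the other. The helper names are hypothetical, and the pair of cycles drawn on the slide may differ from this one.

```python
from math import gcd


def integer_labels(a, b):
    """Return (N, t): node count of G_{a+bi} and the integer label of i.

    Assumes gcd(a, b) == 1 and b >= 1, so that x + yi can be labelled
    (x + y*t) mod N with a + b*t = 0 (mod N).
    """
    if gcd(a, b) != 1 or b < 1:
        raise ValueError("this sketch needs coprime a, b with b >= 1")
    n = a * a + b * b
    t = (-a * pow(b, -1, n)) % n  # modular inverse needs Python 3.8+
    return n, t


def hamiltonian_cycles(a, b):
    """Two node orderings: repeated +1 steps and repeated +i steps from 0."""
    n, t = integer_labels(a, b)
    real_cycle = [k % n for k in range(n)]
    imag_cycle = [(k * t) % n for k in range(n)]
    return real_cycle, imag_cycle


def cycle_edges(cycle):
    """Undirected edges of a closed cycle, as frozensets of endpoints."""
    size = len(cycle)
    return {frozenset((u, cycle[(k + 1) % size])) for k, u in enumerate(cycle)}
```

For $G_{4+3i}$ the labelling gives $N = 25$ and $t = 7$, so the two cycles use the $\pm 1$ links and the $\pm 7$ links respectively and share no edge.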



## Gaussian Routing Algorithm

- No output port contention arises if packets are routed along Hamiltonian cycles.
- Gaussian routing algorithm:
  1. Packets follow a shortest routing path in the absence of output port contention.
  2. Packets that lose output port contention are routed along one Hamiltonian cycle.
- Example of Gaussian routing from $0$ to $2+i$ in $G_{4+3i}$.
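
The two rules can be illustrated with a simplified software model. It is a sketch under several assumptions, not the router's actual logic: nodes of $G_{a+bi}$ (coprime $a$, $b$, $b \ge 1$) are labelled $0, \dots, N-1$ by mapping $x+yi$ to $(x + yt) \bmod N$ with $a + bt \equiv 0 \pmod N$; shortest paths are found by breadth-first search; and a packet that loses contention simply continues in the direction it was already travelling, which keeps it on a Hamiltonian cycle. The slides do not state which cycle a deflected packet follows, so that rule and all function names are hypothetical.

```python
from collections import deque


def build_network(a, b):
    """Integer labelling of G_{a+bi} (coprime a, b; b >= 1).

    Returns (N, steps), where steps[p] is the label offset of output port p,
    in the order +1, +i, -1, -i.
    """
    n = a * a + b * b
    t = (-a * pow(b, -1, n)) % n  # label of i; needs Python 3.8+
    return n, [1, t, n - 1, n - t]


def hops_to(dst, n, steps):
    """Breadth-first search: shortest hop count from every node to dst."""
    dist = {dst: 0}
    queue = deque([dst])
    while queue:
        u = queue.popleft()
        for s in steps:
            v = (u + s) % n
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def choose_port(cur, dst, heading, busy, n, steps):
    """Pick an output port for a packet at `cur` bound for `dst`.

    heading: port index of the direction the packet was already travelling.
    busy:    set of output ports already granted to other packets.
    """
    dist = hops_to(dst, n, steps)
    for port, s in enumerate(steps):
        if port not in busy and dist[(cur + s) % n] < dist[cur]:
            return port  # a free port on a shortest path
    return heading  # lost contention: keep going along the cycle
```

In $G_{4+3i}$ the destination $2+i$ has label $(2 + 7) \bmod 25 = 9$, and `hops_to(9, 25, steps)[0]` gives the contention-free hop count from node $0$.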



## Gaussian Macrochip

- In a Gaussian macrochip, each site is attached to a router, and all routers are interconnected by a Gaussian network.
  - An example of a Gaussian macrochip adopting $G_{4+3i}$.



## All-Optical Router Design

### Two states of microring resonator



### WDM switch with multiple microring resonators



- Features:
  1. Compact size (a few μm)
  2. Fast switching time (30 *ps*)
  3. Low power consumption (0.5 *mW*)

## All-Optical Router Architecture

- RC: routing computation
- OA: output allocator
- SU: switching unit
- FDL: fiber delay line



## Hardware Implementation

Logic circuit controlling the switching unit.

|      |       |       |       |       | (0,L)       |
|------|-------|-------|-------|-------|-------------|
| Ring | 0 ROP | 1 ROP | 2 ROP | 3 ROP | (0,1)       |
| 1    | L     | X     | X     | X     | (3,N)       |
| 2    | 1     | X     | N/0/3 | N     |             |
|      | 1     | N/2   | 1     | N     |             |
|      | 1     | X     | N     | 0     | (2,1) (2,N) |
|      | 1     | N/2   | 3     | 0     | (2,0)       |
|      | 1     | X     | X     | 2     | (2,3)       |
| 3    | 3     | N     | N/0/1 | X     | (0,1)       |
|      | 3     | N     | 3     | N/2   | (3,0)       |
|      | 3     | 0     | N     | X     | (2,3)       |
|      | 3     | 0     | 1     | N/2   | (2,N)       |
|      | 3     | 2     | X     | X     | (0,1)       |
|      |       |       |       |       | (3,2)       |


## Power Consumption Analysis

- Switching unit power consumption: 0.5 W.
- Semiconductor optical amplifier (SOA) power consumption: 65 mW.

OPTICAL LOSS PARAMETERS FOR MACROCHIP

| Photonic Device     | Optical Loss (dB) |
|---------------------|-------------------|
| Modulator Insertion | 4                 |
| Waveguide (per cm)  | 0.05              |
| Pass by ring        | 0.005             |
| Drop into ring      | 1.5               |
| Coupler             | 1                 |
| Splitter            | 0.2               |
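
Because losses in dB add along a path, the table can be turned into a simple loss budget. The sketch below sums the per-device losses for a path and converts the total into the fraction of optical power that survives. The path composition passed in (waveguide length, number of rings and couplers) is a hypothetical input chosen by the caller; the slides do not specify the actual paths.

```python
# Optical loss values in dB, copied from the table above.
LOSS_DB = {
    "modulator": 4.0,
    "waveguide_per_cm": 0.05,
    "ring_pass": 0.005,
    "ring_drop": 1.5,
    "coupler": 1.0,
    "splitter": 0.2,
}


def path_loss_db(waveguide_cm, rings_passed, rings_dropped=1,
                 modulators=1, couplers=0, splitters=0):
    """Total loss in dB of a hypothetical path built from the listed devices."""
    return (modulators * LOSS_DB["modulator"]
            + waveguide_cm * LOSS_DB["waveguide_per_cm"]
            + rings_passed * LOSS_DB["ring_pass"]
            + rings_dropped * LOSS_DB["ring_drop"]
            + couplers * LOSS_DB["coupler"]
            + splitters * LOSS_DB["splitter"])


def surviving_fraction(loss_db):
    """Fraction of optical power left after `loss_db` of attenuation."""
    return 10 ** (-loss_db / 10)
```

For example, `path_loss_db(10, 63, couplers=2)` models one modulator, 10 cm of waveguide, 63 rings passed, one ring drop and two couplers.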

## Performance Evaluation

- Four types of macrochips:
  1. Fully connected macrochip
  2. H-Clos macrochip
  3. B-Clos macrochip
  4. Gaussian macrochip
- Simulation configuration:
  1. 64 parallel optical waveguides
  2. 64 wavelengths per optical waveguide
  3. Off-chip laser power: 33 W
  4. Thermal tuning power per microring resonator: 0.01 *mW*

## Average Packet Delay Performance

- Comparison among the four types of macrochips:
  1. The Gaussian macrochip has much higher port bandwidth under equal wiring density.
  2. The transmission delay is much lower for the Gaussian macrochip.



## Reserve Routing Algorithm

1. The reserve routing algorithm performs poorly when the packet size is small or the burst length is short.
2. A large packet or a long burst is needed to compensate for the path establishment overhead.
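
The trade-off can be sketched with a simple utilization model in which a reserved path spends a fixed setup time before it carries any data. The model, the function names and any numbers passed to them are assumptions for illustration; they are not taken from the simulation.

```python
def path_utilization(payload_bits, link_rate_bps, setup_time_s):
    """Fraction of the path holding time spent actually sending payload."""
    transfer_time_s = payload_bits / link_rate_bps
    return transfer_time_s / (setup_time_s + transfer_time_s)


def min_payload_bits(target_utilization, link_rate_bps, setup_time_s):
    """Smallest payload reaching `target_utilization` (0 < target < 1).

    Solves u = T / (S + T) for the transfer time T, then converts to bits.
    """
    transfer_time_s = target_utilization / (1 - target_utilization) * setup_time_s
    return transfer_time_s * link_rate_bps
```

Under this model, utilization approaches 1 only when the transfer time is much longer than the setup time, which is why short packets or short bursts fare poorly.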



## Reserve Routing Algorithm

- Simulation results are similar in a Gaussian macrochip with 61 sites.



## Power Consumption Performance

- The Gaussian macrochip is more power-efficient than the B-Clos and H-Clos macrochips under heavy traffic load.



## Conclusion

- We propose the Gaussian macrochip, which adopts a low-radix Gaussian network.
- We propose the Gaussian routing algorithm to overcome the lack of optical RAM.
- An all-optical router is designed to implement the Gaussian routing algorithm in hardware.
- The Gaussian macrochip supports higher communication bandwidth and achieves lower communication delay.
- The Gaussian macrochip is more power-efficient than the H-Clos and B-Clos macrochips under heavy traffic load.

## Thank You!

Acknowledgement

This research work was supported in part by the U.S. National Science Foundation under grant numbers CCF-0915823 and CCF-0915495.

Website: http://mcl.cewit.stonybrook.edu/