<!-- saved from url=(0054)http://arch.ece.gatech.edu/research/3dmaps/3dmaps.html -->
<html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>3D-MAPS Many-Core Processor</title>
<meta name="keywords" content="3D IC, Georgia Tech, 3D MAPS, 3D architecture, 3-D IC, 3D-MAPS">
<meta name="description" content="3D IC architecture research using die stacking and TSV at Georgia Tech">
<meta name="author" content="Hsien-Hsin Lee">

<style type="text/css">
A:link{text-decoration:none}
A:visited{text-decoration:none}
TD{font-family: verdana; font-size: 10pt}
</style></head>

<body bgcolor="white" text="black" link="red" vlink="red" alink="red">

<font face="verdana" size="2">

<h3>
<font color="blue">
The 3D-MAPS Many-Core Processor</font></h3>

<h5>
<font color="green">
Announcements
</font>
</h5>

<ul>
<li> 10/24/11: We taped out 3D-MAPS V2 to the MOSIS/Tezzaron MPW run.

</li><li> 02/21/12: 3D-MAPS V1 was presented at ISSCC 2012. Here are our <a href="http://arch.ece.gatech.edu/pub/isscc12.pdf" target="new">
paper</a> and <a href="http://arch.ece.gatech.edu/present/isscc12.pdf" target="new">presentation</a>.

</li></ul>

<h5>
<font color="green">
Media Coverage
</font>
</h5>

<ul>
<li> <a href="http://www.i-micronews.com/news/3D-MAPS-multicore-processor-closer-look,8706.html" target="new">3D-MAPS multicore processor: A closer look</a> (I-Micronews)

</li><li> <a href="http://eda360insider.wordpress.com/2012/03/01/3d-thursday-three-on-3d-papers-from-isscc/" target="new">3D Thursday: Three on 3D --- papers from ISSCC</a> (EDA360 Insider)

</li><li> <a href="http://www.electroiq.com/blogs/insights_from_leading_edge/2012/04/iftle-97-date-in-dresden-synopsys-3d-eda-solution.html" target="new">3D EDA Solution</a> (Solid State Technology)

</li><li> <a href="http://www.electroiq.com/blogs/insights_from_leading_edge/2012/03/iftle-93-2-5-3d-at-the-2012-ieee-isscc.html" target="new">IFTLE 93 2.5 / 3D at the 2012 IEEE ISSCC</a> (Solid State Technology)

</li><li> <a href="http://www.eetimes.com/electronics-news/4236570/ISSCC--Picture-from-a-silicon-exhibition?pageNumber=1" target="new">ISSCC: Pictures from a silicon exhibition</a> (EE Times)

</li><li> <a href="http://www.theregister.co.uk/2011/12/22/isscc_2012_chip_preview/" target="new">
New chippery on parade at ISSCC: CPU and memory makers strut their stuff</a>
(The Register)

</li><li> <a href="http://www.theregister.co.uk/2012/02/24/3d_chips/page3.html" target="new">
Real apps, real benchmarks. Georgia Institute of Technology's "3D-MAPS: 3D
massively parallel processor with stacked memory"</a> (The Register)

</li></ul>

<h5>
<font color="green">
Overview
</font>
</h5>

3D-MAPS (3D <u>MA</u>ssively <u>P</u>arallel processor with <u>S</u>tacked
memory) V1 is a two-tier logic+memory 3D IC: the logic die contains 64
general-purpose processor cores running at 277MHz, and the memory die contains
256KB of SRAM. <font color="red"> This 3D IC is arguably the FIRST many-core general
purpose 3D processor developed in academia.</font> The processor achieves up
to 64GB/s of memory bandwidth while consuming 5W of power. The project is led by
<a href="http://users.ece.gatech.edu/limsk/">Prof. Sung Kyu Lim</a> (PI) and <a href="http://users.ece.gatech.edu/~leehs/">Prof. Hsien-Hsin Lee</a> (co-PI) from the Georgia
Institute of Technology and Dr. Gabriel Loh (co-PI) from AMD, with funding from
the US Department of Defense. More than 20 students have been involved in this
project, working on architecture, programming, CAD tools, circuit and physical
design, packaging, board design, and testing. Our collaborators include KAIST,
Tezzaron, Amkor Inc, and Board Lab.

<p><img src="./3D-MAPS Many-Core Processor_files/team.JPG"></p><p>

</p><p>Fabrication of this chip was completed in July 2011 using 130nm
GlobalFoundries device technology and Tezzaron TSV technology with a 1.2um TSV
diameter. Packaging was completed in August 2011 by Amkor. Eight parallel
applications were developed to demonstrate the bandwidth and power benefits of
the 3D-MAPS processor. The processor contains 33M transistors, 50K TSVs, and 50K
face-to-face connections in a 5mm x 5mm footprint with 0.8mm thickness.

</p><p>The core architecture was developed from scratch by our architecture team to
benefit from single-cycle access to SRAM. One of the two instructions issued
per cycle can be a memory read or write, so memory can be accessed on
every clock cycle. Our RTL-to-GDSII tool chain is based on commercial tools from
Synopsys, Cadence, and Mentor Graphics. Since these tools can only handle 2D
ICs, we developed plug-ins to handle TSVs and 3D stacking.

</p><p> Here are our <a href="http://arch.ece.gatech.edu/pub/cicc10.pdf" target="new">
CICC 2010</a> and <a href="http://arch.ece.gatech.edu/pub/3dtest10.pdf" target="new">
3D-TEST 2010</a> papers on 3D-MAPS V1.

</p><p>We are currently working on 3D-MAPS V2, which features 128 cores and 2GB of DRAM
stacked in 5 dies. The differences between the two versions are:</p><p>

<table border="1">
<tbody><tr>
<td></td>
<td align="center"><font color="blue">3D-MAPS V1</font></td>
<td align="center"><font color="blue">3D-MAPS V2</font></td>
</tr>
<tr>
<td># of tiers</td>
<td align="center">2, one logic and one SRAM</td>
<td align="center">5, two logic and three DRAM</td>
</tr>
<tr>
<td># of cores</td>
<td align="center">64</td>
<td align="center">128</td>
</tr>
<tr>
<td>logic footprint</td>
<td align="center">5mm x 5mm</td>
<td align="center">10mm x 10mm</td>
</tr>
<tr>
<td>DRAM footprint</td>
<td align="center">-</td>
<td align="center">20mm x 12mm</td>
</tr>
<tr>
<td>device technology</td>
<td align="center">130nm, GlobalFoundries</td>
<td align="center">130nm, GlobalFoundries</td>
</tr>
<tr>
<td>bonding style</td>
<td align="center">face-to-face</td>
<td align="center">face-to-face &amp; face-to-back</td>
</tr>
<tr>
<td>TSV technology</td>
<td align="center">Tezzaron, 1.2um diam</td>
<td align="center">Tezzaron, 1.2um diam</td>
</tr>
</tbody></table>

</p><h5>
<font color="green">
3D-MAPS V1 Specifications
</font>
</h5>

<p><img src="./3D-MAPS Many-Core Processor_files/spec.jpg"></p><p>

</p><h5>
<font color="green">
3D-MAPS V1 Measurement Results
</font>
</h5>

3D-MAPS V1 supports 42 instructions, and we wrote 8 parallel applications and
ran them on our chip. Here are the memory bandwidth and power measurement
results.<p>

<table border="1">
<tbody><tr>
<td>application</td>
<td align="center">memory BW (GB/s)</td>
<td align="center">power consumption (W)</td>
</tr>
<tr>
<td>AES encryption</td>
<td align="center">49.5</td>
<td align="center">4.032</td>
</tr>
<tr>
<td>edge detection</td>
<td align="center">15.6</td>
<td align="center">3.768</td>
</tr>
<tr>
<td>histogram</td>
<td align="center">30.3</td>
<td align="center">3.588</td>
</tr>
<tr>
<td>k-means clustering</td>
<td align="center">40.6</td>
<td align="center">4.014</td>
</tr>
<tr>
<td>matrix multiply</td>
<td align="center">13.8</td>
<td align="center">3.789</td>
</tr>
<tr>
<td>median filter</td>
<td align="center"><font color="red">63.8</font></td>
<td align="center"><font color="red">4.007</font></td>
</tr>
<tr>
<td>motion estimation</td>
<td align="center">24.1</td>
<td align="center">3.830</td>
</tr>
<tr>
<td>string search</td>
<td align="center">8.9</td>
<td align="center">3.876</td>
</tr>
</tbody></table>
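<p>As a rough way to compare the rows of the measurement table, the following sketch computes bandwidth per watt for each application. The numbers are copied from the table; the names <code>MEASURED</code>, <code>bandwidth_per_watt</code>, and <code>summarize</code> are illustrative and not part of the 3D-MAPS tool chain.</p>

```python
# Measured values copied from the table: (memory BW in GB/s, power in W).
MEASURED = {
    "AES encryption": (49.5, 4.032),
    "edge detection": (15.6, 3.768),
    "histogram": (30.3, 3.588),
    "k-means clustering": (40.6, 4.014),
    "matrix multiply": (13.8, 3.789),
    "median filter": (63.8, 4.007),
    "motion estimation": (24.1, 3.830),
    "string search": (8.9, 3.876),
}


def bandwidth_per_watt(bw_gbs: float, power_w: float) -> float:
    """Return memory bandwidth delivered per watt, in GB/s per W."""
    return bw_gbs / power_w


def summarize(measured):
    """Return (highest-bandwidth app, lowest-power app, per-app GB/s per W)."""
    top_bw = max(measured, key=lambda app: measured[app][0])
    low_power = min(measured, key=lambda app: measured[app][1])
    efficiency = {app: bandwidth_per_watt(*vals) for app, vals in measured.items()}
    return top_bw, low_power, efficiency


if __name__ == "__main__":
    top_bw, low_power, efficiency = summarize(MEASURED)
    print(f"highest bandwidth: {top_bw}")
    print(f"lowest absolute power: {low_power}")
    # List applications from most to least bandwidth per watt.
    for app, eff in sorted(efficiency.items(), key=lambda kv: -kv[1]):
        print(f"{app:20s} {eff:5.1f} GB/s per W")
```

<p>With these figures, median filter delivers the most bandwidth per watt, while histogram draws the least absolute power.</p>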

</p><p>The theoretical maximum memory bandwidth of 3D-MAPS V1 is 70.9GB/s,
computed as 277MHz x 64 (cores) x 4 Bytes (1 word). One of our
applications, median filter, reached 63.8GB/s, about 90% of this theoretical
value, and did so at the lowest power per unit of bandwidth (4.007W). For
comparison, here are the maximum achievable bandwidth values of
state-of-the-art processor and memory technology (as of Sep 2011):
</p><ul>
<li> Intel i7 Extreme Edition + Samsung DDR3 1600 MHz = 1600 MHz x 2 ch x 8
Bytes = 25.6 GB/s
</li><li> Intel Xeon E7 + Samsung DDR3 1066 MHz = 1066 MHz x 4 ch x 8 Bytes = 34.1
GB/s
</li></ul>
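<p>The peak-bandwidth arithmetic above can be reproduced with a short sketch. It assumes one transfer per channel (or core) per clock and 1 GB = 10^9 bytes, which is how the figures on this page work out; the function name <code>peak_bandwidth_gbs</code> and the <code>CONFIGS</code> table are illustrative.</p>

```python
def peak_bandwidth_gbs(freq_mhz: float, channels: int, bytes_per_transfer: int) -> float:
    """Peak bandwidth in GB/s, assuming one transfer per channel per clock.

    Uses 1 GB = 1e9 bytes, matching the figures quoted in the text.
    """
    return freq_mhz * 1e6 * channels * bytes_per_transfer / 1e9


# (clock in MHz, number of channels or cores, bytes per transfer), from the text.
CONFIGS = {
    "3D-MAPS V1": (277, 64, 4),
    "Intel i7 Extreme Edition + DDR3 1600": (1600, 2, 8),
    "Intel Xeon E7 + DDR3 1066": (1066, 4, 8),
}

if __name__ == "__main__":
    for name, (mhz, channels, width) in CONFIGS.items():
        print(f"{name:40s} {peak_bandwidth_gbs(mhz, channels, width):6.1f} GB/s")
```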

3D-MAPS V1 is fabricated in 130nm technology in a 5mm x 5mm footprint. If 3D-MAPS
V1 were fabricated in 45nm in a 15mm x 15mm footprint (as in Intel i7), the maximum
memory bandwidth would scale as follows:

<ul>
<li> 277 MHz x 5 (speedup from 45nm) x 64 (cores) x 9 (more area) x 4 (smaller cores)
x 4 Bytes = <font color="red">12,764 GB/s</font>
</li></ul>
This projection illustrates the enormous memory bandwidth potential of a core+memory 3D
IC.
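<p>The projection can be checked with a small sketch. The scaling factors (5x clock speedup at 45nm, 4x more cores per unit area) are the assumptions stated in the text; only the area factor is derived here, as (15mm / 5mm) squared = 9. The function names are illustrative.</p>

```python
def area_factor(new_side_mm: float, old_side_mm: float) -> float:
    """Ratio of two square footprints, e.g. 15mm x 15mm versus 5mm x 5mm."""
    return (new_side_mm / old_side_mm) ** 2


def projected_bandwidth_gbs(
    base_mhz: float = 277,      # measured V1 clock
    freq_speedup: float = 5,    # assumed in the text for a 45nm process
    cores: int = 64,            # V1 core count
    area: float = 9,            # (15mm / 5mm) ** 2
    density: float = 4,         # assumed in the text: smaller cores at 45nm
    bytes_per_word: int = 4,
) -> float:
    """Projected peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    words_per_second = base_mhz * 1e6 * freq_speedup * cores * area * density
    return words_per_second * bytes_per_word / 1e9


if __name__ == "__main__":
    print(f"area factor: {area_factor(15, 5):.0f}")
    print(f"projected peak bandwidth: {projected_bandwidth_gbs():,.0f} GB/s")
```

<p>This is a back-of-the-envelope estimate: it scales peak bandwidth only and says nothing about power, yield, or whether applications could sustain that rate.</p>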

<h5>
<font color="green">
3D-MAPS V1 Photos
</font>
</h5>

<font color="blue">
<img src="./3D-MAPS Many-Core Processor_files/3D-MAPS-V1-fig8.jpg"><br>FIGURE 1: Stacking structure of
3D-MAPS V1. We use bonding wires and TSVs for package-to-chip signal and P/G
delivery. Chip-to-chip communication uses F2F pads. The bonding wires
did not break the TSVs underneath during manufacturing. Each IO cell contains
204 redundant TSVs.<p>

<img src="./3D-MAPS Many-Core Processor_files/3D-MAPS-V1-fig1.JPG"><br><font color="blue">FIGURE 2: The top side of
3D-MAPS V1 is actually the back side of the core die, which is thinned down to
12um. With the naked eye, we can only see dummy TSVs and IO cells.</font></p><p><font color="blue">

<img src="./3D-MAPS Many-Core Processor_files/3D-MAPS-V1-fig3.JPG"><br>FIGURE 3: SEM image of Tezzaron TSVs and
face-to-face bond pads.</font></p><p><font color="blue">

<img src="./3D-MAPS Many-Core Processor_files/3D-MAPS-V1-fig10.jpg"><br>FIGURE 4: More SEM images of TSVs and F2F
pads.</font></p><p><font color="blue">

<img src="./3D-MAPS Many-Core Processor_files/3D-MAPS-V1-fig4.png"><br>FIGURE 5: This image was obtained using an
infrared microscope with 6um depth. Since the top surface of 3D-MAPS V1 is the
thinned substrate of the top die, we had to use an IR microscope to reveal the
circuitry buried under this substrate. The white dots are dummy TSVs we
had to add to satisfy the TSV density rule set by Tezzaron.</font></p><p><font color="blue">

<img src="./3D-MAPS Many-Core Processor_files/3D-MAPS-V1-fig7.jpg"><br>FIGURE 6: Some details of a single core and
a single IO cell.</font></p><p><font color="blue">

<img src="./3D-MAPS Many-Core Processor_files/3D-MAPS-V1-fig2.JPG"><br>FIGURE 7: Bare die and its package
side by side.</font></p><p><font color="blue">

<img src="./3D-MAPS Many-Core Processor_files/3D-MAPS-V1-fig5.JPG"><br>FIGURE 8: The open TSV above, fortunately, does
not cause any problem because all of the TSVs shown are redundant. The top die
(= core die) is thinned down to 12um, and the bottom die (= memory die) height
is 765um, making the total thickness roughly 0.8mm.</font></p><p><font color="blue">

<img src="./3D-MAPS Many-Core Processor_files/3D-MAPS-V1-fig6.JPG"><br>FIGURE 9: A dummy TSV</font></p><p><font color="blue">

<img src="./3D-MAPS Many-Core Processor_files/3D-MAPS-V1-fig9.jpg"><br>FIGURE 10: Layouts of the full die (core and
memory) and a single core/memory tile.</font></p><p><font color="blue">

</font>

</p><p>

</p></font><p><font color="blue">
</font>

</p></font></body></html>