Slides from Shaaban
Systolic Architectures
• Replace a single processor with an array of regular processing elements (PEs)
• Orchestrate data flow for high throughput with less memory access
• Different from pipelining: nonlinear array structure, multidirectional data flow, each PE may have (small) local instruction and data memory
• Different from SIMD: each PE may do something different
• Initial motivation: VLSI enables inexpensive special-purpose chips
• Represent algorithms directly by chips connected in a regular pattern
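Systolic Array Example: 3x3 Systolic Array Matrix Multiplication
• Processors arranged in a 2-D grid
• Each processor accumulates one element of the product
• Rows of A and columns of B are fed into the array, aligned (skewed) in time
T = 0: no operands have entered the array yet.
Alignments in time (first element to enter listed first):
Rows of A:
• row 0: a0,0 a0,1 a0,2
• row 1 (one step later): a1,0 a1,1 a1,2
• row 2 (two steps later): a2,0 a2,1 a2,2
Columns of B:
• column 0: b0,0 b1,0 b2,0
• column 1 (one step later): b0,1 b1,1 b2,1
• column 2 (two steps later): b0,2 b1,2 b2,2
Example source: http://www.cs.hmc.edu/courses/2001/spring/cs156/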
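The time alignment of the rows of A and columns of B can be written down as a small schedule. The sketch below is illustrative only: the function name `input_schedule` and the convention that the first element enters at T = 1 are assumptions chosen to match the T = 1 and T = 2 slides, not something the slides spell out.

```python
def input_schedule(n, steps):
    """For each time step T = 1..steps, list the index pairs entering the array.

    Returns a list of (T, a_in, b_in). a_in[i] is the (row, col) index of the
    element of A entering grid row i at step T, or None if nothing enters.
    b_in[j] is the same for the element of B entering grid column j.
    """
    schedule = []
    for t in range(1, steps + 1):
        # Row i of A is delayed by i steps; column j of B is delayed by j steps.
        a_in = [(i, t - 1 - i) if 0 <= t - 1 - i < n else None for i in range(n)]
        b_in = [(t - 1 - j, j) if 0 <= t - 1 - j < n else None for j in range(n)]
        schedule.append((t, a_in, b_in))
    return schedule


# For a 3x3 problem the last inputs (a2,2 and b2,2) enter at step 2n - 1 = 5.
for t, a_in, b_in in input_schedule(3, 5):
    print(f"T = {t}: A enters {a_in}, B enters {b_in}")
```

Under this convention a0,0 and b0,0 enter together at T = 1, and the second row of A and the second column of B start one step later, which is the skew shown in the alignment above.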
T = 1
Accumulated sums by PE (row, column):
• PE(0,0): a0,0*b0,0
T = 2
Accumulated sums by PE (row, column):
• PE(0,0): a0,0*b0,0 + a0,1*b1,0
• PE(0,1): a0,0*b0,1
• PE(1,0): a1,0*b0,0
T = 3
Accumulated sums by PE (row, column):
• PE(0,0): a0,0*b0,0 + a0,1*b1,0 + a0,2*b2,0
• PE(0,1): a0,0*b0,1 + a0,1*b1,1
• PE(0,2): a0,0*b0,2
• PE(1,0): a1,0*b0,0 + a1,1*b1,0
• PE(1,1): a1,0*b0,1
• PE(2,0): a2,0*b0,0
T = 4
Accumulated sums by PE (row, column):
• PE(0,0): a0,0*b0,0 + a0,1*b1,0 + a0,2*b2,0
• PE(0,1): a0,0*b0,1 + a0,1*b1,1 + a0,2*b2,1
• PE(0,2): a0,0*b0,2 + a0,1*b1,2
• PE(1,0): a1,0*b0,0 + a1,1*b1,0 + a1,2*b2,0
• PE(1,1): a1,0*b0,1 + a1,1*b1,1
• PE(1,2): a1,0*b0,2
• PE(2,0): a2,0*b0,0 + a2,1*b1,0
• PE(2,1): a2,0*b0,1
T = 5
Accumulated sums by PE (row, column):
• PE(0,0): a0,0*b0,0 + a0,1*b1,0 + a0,2*b2,0
• PE(0,1): a0,0*b0,1 + a0,1*b1,1 + a0,2*b2,1
• PE(0,2): a0,0*b0,2 + a0,1*b1,2 + a0,2*b2,2
• PE(1,0): a1,0*b0,0 + a1,1*b1,0 + a1,2*b2,0
• PE(1,1): a1,0*b0,1 + a1,1*b1,1 + a1,2*b2,1
• PE(1,2): a1,0*b0,2 + a1,1*b1,2
• PE(2,0): a2,0*b0,0 + a2,1*b1,0 + a2,2*b2,0
• PE(2,1): a2,0*b0,1 + a2,1*b1,1
• PE(2,2): a2,0*b0,2
T = 6
Accumulated sums by PE (row, column):
• PE(0,0): a0,0*b0,0 + a0,1*b1,0 + a0,2*b2,0
• PE(0,1): a0,0*b0,1 + a0,1*b1,1 + a0,2*b2,1
• PE(0,2): a0,0*b0,2 + a0,1*b1,2 + a0,2*b2,2
• PE(1,0): a1,0*b0,0 + a1,1*b1,0 + a1,2*b2,0
• PE(1,1): a1,0*b0,1 + a1,1*b1,1 + a1,2*b2,1
• PE(1,2): a1,0*b0,2 + a1,1*b1,2 + a1,2*b2,2
• PE(2,0): a2,0*b0,0 + a2,1*b1,0 + a2,2*b2,0
• PE(2,1): a2,0*b0,1 + a2,1*b1,1 + a2,2*b2,1
• PE(2,2): a2,0*b0,2 + a2,1*b1,2
T = 7 (Done)
Accumulated sums by PE (row, column):
• PE(0,0): a0,0*b0,0 + a0,1*b1,0 + a0,2*b2,0
• PE(0,1): a0,0*b0,1 + a0,1*b1,1 + a0,2*b2,1
• PE(0,2): a0,0*b0,2 + a0,1*b1,2 + a0,2*b2,2
• PE(1,0): a1,0*b0,0 + a1,1*b1,0 + a1,2*b2,0
• PE(1,1): a1,0*b0,1 + a1,1*b1,1 + a1,2*b2,1
• PE(1,2): a1,0*b0,2 + a1,1*b1,2 + a1,2*b2,2
• PE(2,0): a2,0*b0,0 + a2,1*b1,0 + a2,2*b2,0
• PE(2,1): a2,0*b0,1 + a2,1*b1,1 + a2,2*b2,1
• PE(2,2): a2,0*b0,2 + a2,1*b1,2 + a2,2*b2,2
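The step-by-step walkthrough from T = 0 to T = 7 can be reproduced with a short simulation. This is a minimal sketch rather than the slides' own code: the function name `systolic_matmul`, the register arrays, and the assumption that A moves left to right while B moves top to bottom are all illustrative choices. Each PE keeps one accumulator and passes its operands on to its neighbors every step.

```python
def systolic_matmul(A, B):
    """Simulate an n x n systolic array computing the product of A and B.

    Returns the product matrix and the number of time steps used.
    """
    n = len(A)
    acc = [[0] * n for _ in range(n)]       # one accumulator per PE
    a_reg = [[None] * n for _ in range(n)]  # A operand currently held by each PE
    b_reg = [[None] * n for _ in range(n)]  # B operand currently held by each PE
    steps = 3 * n - 2                       # PE(n-1, n-1) finishes at this step

    for t in range(1, steps + 1):
        # Pass operands to the neighboring PE, updating the far edge first.
        for i in range(n):
            for j in range(n - 1, 0, -1):
                a_reg[i][j] = a_reg[i][j - 1]   # assumed: A moves left to right
        for j in range(n):
            for i in range(n - 1, 0, -1):
                b_reg[i][j] = b_reg[i - 1][j]   # assumed: B moves top to bottom
        # Feed the skewed inputs: row i of A and column j of B are delayed
        # by i and j steps respectively.
        for i in range(n):
            k = t - 1 - i
            a_reg[i][0] = A[i][k] if 0 <= k < n else None
        for j in range(n):
            k = t - 1 - j
            b_reg[0][j] = B[k][j] if 0 <= k < n else None
        # Every PE holding both operands does one multiply-accumulate.
        for i in range(n):
            for j in range(n):
                if a_reg[i][j] is not None and b_reg[i][j] is not None:
                    acc[i][j] += a_reg[i][j] * b_reg[i][j]
    return acc, steps


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
C, steps = systolic_matmul(A, B)
print(C, steps)
```

For n = 3 the simulation runs for 3n - 2 = 7 steps, matching the final slide. Each element of A and B is read from outside the array only once and is then reused by the three PEs it passes through.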