| FPT'14   | <b>ICFPT 2014</b>                                                            |
|----------|------------------------------------------------------------------------------|
| Shanghai | The 2014 International Conference on Field-Programmable Technology           |
|          | Parkyard Hotel, Zhangjiang High-Tech Park, Shanghai, China, Dec. 10-12, 2014 |

# **Conference Program**





|             |                                        | Tuesday,<br>9 December                          |                                                                        | Wednesday,<br>10 December                                              | Thursday,<br>11 December        | Friday,<br>12 December                      |                                     | Saturday,<br>13 December           |                        | Sunday,<br>14 December             |
|-------------|----------------------------------------|-------------------------------------------------|------------------------------------------------------------------------|------------------------------------------------------------------------|---------------------------------|---------------------------------------------|-------------------------------------|------------------------------------|------------------------|------------------------------------|
|             |                                        | Day 0<br>Design Competition<br>& Workshop I, II |                                                                        | Day 1<br>FPT Conference                                                | Day 2<br>FPT Conference         | Day 3<br>FPT Conference                     |                                     | Day 4<br>Workshop III, IV, V       |                        | Day 5<br>Workshop IV               |
| 8:00-8:40   |                                        |                                                 |                                                                        | Registration                                                           | Registration                    | Registration                                |                                     |                                    |                        |                                    |
| 8:40-8:50   |                                        |                                                 |                                                                        | Opening                                                                | Registration                    | Registration                                |                                     |                                    |                        |                                    |
| 8:50-9:00   |                                        |                                                 |                                                                        | Opening                                                                | Announcement                    | Announcement                                |                                     |                                    |                        |                                    |
| 9:00-10:00  |                                        |                                                 |                                                                        | Keynote Lecture<br>I                                                   | Keynote Lecture<br>II           | Keynote Lecture<br>III                      |                                     |                                    |                        |                                    |
| 10:00-11:00 |                                        |                                                 |                                                                        | Poster Session I:<br>Application                                       | Poster Session II:<br>PhD Forum | Poster Session III:<br>Architecture & Tools |                                     |                                    |                        |                                    |
| 11:00-12:40 |                                        |                                                 |                                                                        | 1.1<br>Tools &<br>Design Productivity                                  | 2.1<br>Mathematical Circuits    | 3.1<br>Applications &<br>Devices            |                                     |                                    |                        |                                    |
| 12:40-13:00 |                                        |                                                 |                                                                        |                                                                        |                                 |                                             |                                     |                                    |                        |                                    |
| 13:00-14:00 | Xilinx-ARM<br>Workshop<br>(Workshop 1) | Altera Workshop<br>(Workshop 2)                 | Design<br>Competition<br>Registration<br>&<br>Pre-Competition<br>Check | Lunch                                                                  | Lunch                           | Lunch                                       | Cypress<br>Workshop<br>(Workshop 3) | Xilinx<br>Workshop<br>(Workshop 4) | IWETSO<br>(Workshop 5) | Xilinx<br>Workshop<br>(Workshop 4) |
| 14:00-15:20 |                                        |                                                 |                                                                        | 1.2<br>Financial Applications                                          | 2.2                             | 3.2                                         |                                     |                                    |                        |                                    |
| 15:20-15:40 |                                        |                                                 |                                                                        | Break                                                                  | Special Session: MOOC           | Industrial Session                          |                                     |                                    |                        |                                    |
| 15:40-16:00 |                                        |                                                 | Design Competition                                                     |                                                                        | Hardware Security               |                                             |                                     |                                    |                        |                                    |
| 16:00-16:45 |                                        |                                                 | Pre-Finals                                                             | 1.3<br>Architecture &<br>Runtime Systems                               |                                 | Closing                                     |                                     |                                    |                        |                                    |
| 16:45-17:00 |                                        |                                                 |                                                                        |                                                                        | Banquet at                      |                                             |                                     |                                    |                        |                                    |
| 17:00-18:00 |                                        |                                                 |                                                                        | Design Competition and Demo<br>Posters &<br>Design Competition (Final) | Huangpu River Cruise            |                                             |                                     |                                    |                        |                                    |
| 18:00-20:00 |                                        |                                                 |                                                                        | Welcome Reception                                                      |                                 |                                             |                                     |                                    |                        |                                    |

# **Keynote Speeches**

Day 1 (Dec. 10)

#### **Logic Emulation in the MegaLUT Era – Moore's Law Beats Rent's Rule**

by Mike Butts, Synopsys

#### Abstract

Throughout its twenty-five year history, logic emulation architectures have been governed by Rent's Rule. This empirical observation, first used to build 1960s mainframes, predicts the average number of cut nets that result when a digital module is arbitrarily partitioned into multiple parts, such as the FPGAs of a logic emulator.

A fundamental advantage of emulation is that, unlike most devices, FPGAs always grow in capacity according to Moore's Law, just as the designs to be emulated have grown. Unfortunately packaging technology advances at a far slower pace, leaving emulators short on the pins demanded by Rent's Rule. Many cut nets are now sent through each package pin, which costs speed, power and area.

At today's system-on-chip level of design, the number of system-level modules is growing, while their sizes are remaining constant. In the meantime, FPGAs have grown from a handful of logic lookup tables (LUTs) at the beginning to over a million LUTs today. At this scale, an entire system-level module such as an advanced 64-bit CPU can fit inside a single FPGA. Fewer module-internal nets need be cut, so Rent's Rule constraints are relaxing. Fewer and higher-level cut nets mean that logic emulation with megaLUT FPGAs is becoming faster, cooler, smaller, cheaper, and more reliable. The Moore's Law scaling of FPGAs is escaping from Rent's Rule.

#### **Biography**



MIKE BUTTS is Senior Member Technical Staff of the Verification Group at Synopsys. He has a rich history of innovation in reconfigurable hardware and hardware-based verification.

Mike co-invented hardware logic emulation, which has developed into an essential tool for validating and modeling large silicon projects. Mike architected and designed a number of reconfigurable FPGA and crossbar chips and system products in over twenty years in the electronic design automation industry, at Mentor Graphics, Quickturn, Cadence, where he was a Cadence Fellow, and now at Synopsys.

Day 2 (Dec. 11)

#### **Automating Customized Computing**

by Jason Cong, UCLA

#### Abstract

Customized computing has been of interest to the research community for over three decades. The interest has intensified in recent years as power and energy have become a significant limiting factor for the computing industry. For example, the energy consumed by the datacenters of some large internet service providers is well over 10<sup>9</sup> kilowatt-hours. FPGA-based acceleration has shown 10-1000X performance/energy efficiency over general-purpose processors in many applications. However, programming FPGAs as computing devices is still a significant challenge. Most accelerators are designed using manual RTL coding. Recent progress in high-level synthesis (HLS) has improved programming productivity considerably, as one can quickly implement functional blocks written in high-level programming languages such as C or C++ instead of RTL. But in using an HLS tool for accelerated computing, the programmer still faces many design decisions, such as the implementation choices for each module and the communication schemes between modules, and has to implement additional logic for data management, such as memory partitioning, data prefetching and reuse. Extensive source code rewriting is often required to achieve high-performance acceleration using existing HLS tools.

In this talk, I shall present the ongoing work at UCLA to enable further automation for customized computing. One effort is on automated compilation that combines source-code-level transformation for HLS with efficient parameterized architecture template generation. I shall highlight our progress on loop restructuring and code generation, memory partitioning, data prefetching and reuse, combined module selection, duplication, and scheduling with communication optimization. These techniques allow the programmer to easily compile computation kernels to FPGAs for acceleration. Another direction is to develop efficient runtime support for scheduling and transparent resource management for integration of FPGAs for datacenter-scale acceleration, which is becoming a reality (for example, Microsoft recently used over 1,600 servers with FPGAs for accelerating their search engine and reported very encouraging results). Our runtime system provides scheduling and resource management support at multiple levels, including server node-level, job-level, and datacenter-level, so that programmers can make use of existing programming interfaces, such as MapReduce or Hadoop, for large-scale distributed computation.

#### **Biography**



JASON CONG received his B.S. degree from Peking University in 1985, and his M.S. and Ph. D. degrees from the University of Illinois at Urbana-Champaign in 1987 and 1990, respectively. Currently, he is a Chancellor's Professor at the Computer Science Department of the University of California, Los Angeles, the Director of the Center for Domain-Specific Computing, and co-director of the VLSI CAD Laboratory. He also served as the department chair from 2005 to 2008. Dr. Cong's research interests include CAD of VLSI circuits and systems, design and synthesis of SoC, programmable systems, novel computer architectures, nano-systems, and highly scalable algorithms. He has published over 400 research papers and led over 50 research projects in these areas. Dr. Cong received many awards and recognitions, including 10 Best Paper Awards and the 2011 ACM/IEEE A. Richard Newton Technical Impact Award in Electronic Design Automation.

Day 3 (Dec. 12)

#### **Doing FPGA in a Former Software Company**

by Feng-hsiung Hsu, Microsoft Research Asia

#### Abstract

Microsoft has gone through massive changes in the last few years. First, it was the dominant software company. Then, it became a "Devices and Services" company, and now it is "Mobile First, Cloud First". Of course, deep down in the bones, it is still a software company. In this talk, I will give a personal account of how FPGA acceleration gradually gained traction inside Microsoft, the difficulties and lessons learned in getting acceptance, FPGA's apparently imminent deployment inside Microsoft data centers, and finally what may be needed in FPGA programming software tool developments for wider acceptance inside a company like Microsoft.

#### **Biography**



FENG-HSIUNG HSU is the research manager for Hardware Computing Group at Microsoft Research Asia. Prior to Microsoft, he had worked at IBM's T. J. Watson Research Center, Compaq's Western Research Lab, and HP's Research Lab. He received his Ph. D. in Computer Science from Carnegie Mellon University in 1989 and B. S. in Electrical Engineering from National Taiwan University in 1980. He is sometimes known by his nickname "CB", which stands for "Crazy Bird". CB's research interests include VLSI design, special purpose algorithms, machine learning, device physics, optics, FPGA systems, computer architecture, mobile systems, 3D imaging systems, human-computer interface, and "whatever makes sense". Recently, he has been known to dabble in keyboard design, among other things.

CB received ACM's Grace Murray Hopper Award for his work at Carnegie Mellon on Deep Thought, the first chess machine to play chess at Grandmaster level. To the best of his knowledge, Deep Thought was also the first chess machine to use FPGAs (as part of the evaluation function). In 1997, CB won the Fredkin Prize, along with Murray Scott Campbell and Arthur Joseph Hoane, for Deep Blue's defeat of the World Chess Champion (Garry Kasparov) in a set match. CB served as the chip designer and system architect for Deep Blue. CB is the author of the book, "Behind Deep Blue: Building the Computer that Defeated the World Chess Champion".

# **Conference Program**

# [Design Competition & Workshop I, II] Day 0 (Tuesday, 9 December)

| Workshops         |                                                         |
|-------------------|---------------------------------------------------------|
| All day           | Workshop 1: Xilinx-ARM Workshop "Zynq Lab in a box"     |
|                   | Location: Room 101, Computer Building,                  |
|                   | Zhangjiang Campus, Fudan University                     |
| All day           | Workshop 2: Altera Workshop "Introduction of SoC FPGA   |
|                   | using DE1-SOC kit"                                      |
|                   | Location: Room 313, Administration Building,            |
|                   | Zhangjiang Campus, Fudan University                     |
| <b>Design Competition</b> |                                                 |
| Session Chair     | Qiang Liu, Tianjin University                           |
| 13:00-14:00       | Design Competition Registration & Pre-Competition Check |
|                   | Location: Parkyard Hotel                                |
| 14:00-17:00       | Design Competition Pre-Finals                           |
|                   | Location: Ballroom, Parkyard Hotel                      |
|                   |                                                         |

# [FPT Conference] Day 1 (Wednesday, 10 December)

| Registration  |                                                        |
|---------------|--------------------------------------------------------|
| 8:00-8:40     | Location: Parkyard Hotel                               |
| Opening       |                                                        |
| 8:40-9:00     | Location: Ballroom, Parkyard Hotel                     |
| Keynote Lecture I |                                                    |
| Session Chair | Lingli Wang, Fudan University                          |
| 9:00-10:00    | Logic Emulation in the MegaLUT Era – Moore's Law Beats |
|               | Rent's Rule                                            |
|               | Mike Butts, Synopsys                                   |
|               | Location: Ballroom, Parkyard Hotel                     |
| Poster Session I: Application |                                        |
| Session Chair | Tomonori Izumi, Ritsumeikan University                 |
| 10:00-11:00   | Location: Poster Area, Parkyard Hotel                  |
|               | A Flexible Interface Architecture for Reconfigurable   |
|               | Coprocessors in Embedded Multicore Systems using PCIe  |
|               | Single-Root I/O Virtualization                         |
|               | Oliver Sander, Steffen Baehr, Enno Luebbers,           |
|               | Timo Sandmann, Viet Vu Duy and Juergen Becker          |
|               | Gigabyte-Scale Alignment Acceleration of Biological    |
|               | Sequences via Ethernet Streaming                       |
|               | Theepan Moorthy and Sathish Gopalakrishnan             |
|               | Power Modelling and Capping for Heterogeneous ARM/FPGA SoCs |
|               | Yun Wu, Jose Nunez-Yanez, Roger Woods and Dimitrios S. Nikolopoulos |
|               | Analysis and Optimization of a Deeply Pipelined FPGA Soft Processor |
|               | Hui Yan Cheah, Suhaib A. Fahmy and Nachiket Kapre |
|               | A Circuit to Synchronize High Speed Serial Communication Channel |
|               | Mrinal J Sarmah |
|               | Novel Reconfigurable Hardware Implementation of Polynomial Matrix/Vector Multiplications |
|               | Server Kasap and Soydan Redif |
|               | A Complementary Architecture for High-Speed True Random Number Generator |
|               | Xian Yang and Ray C.C. Cheung |
|               | Fanout Decomposition Dataflow Optimizations for FPGA-based Sparse LU Factorization |
|               | Siddhartha and Nachiket Kapre |
|               | Zero Latency Encryption with FPGAs for Secure Time-Triggered Automotive Networks |
|               | Shanker Shreejith and Suhaib A. Fahmy |
|               | Using C to Implement High-efficient Computation of Dense Optical Flow on FPGA-accelerated Heterogeneous Platforms |
|               | Zhilei Chai, Haojie Zhou, Zhibin Wang and Dong Wu |
|               | Hardware Architecture of Bi-Cubic Convolution Interpolation for Real-time Image Scaling |
|               | Gopinath Mahale, Hamsika Mahale, Rajesh Babu Parimi, S.K. Nandy and S. Bhattacharya |
|               | FPGA-based High Throughput XTS-AES Encryption/Decryption for Storage Area Network |
|               | Yi Wang, Akash Kumar and Yajun Ha |
|               | Zyndroid: An Android Platform for Software/Hardware Coprocessing |
|               | Susumu Mashimo, Motoki Amagasaki, Masahiro Iida, Morihiro Kuga and Toshinori Sueyoshi |
|               | A Dataflow System for Anomaly Detection and Analysis |
|               | Andrei Bara, Xinyu Niu and Wayne Luk |
|               | AMMC: Advanced Multi-core Memory Controller |
|               | Tassadaq Hussain, Oscar Palomar, Osman Unsal, Adrian Cristal, Eduard Ayguadé, Mateo Valero and S. A. Gursal |

# 1.1 Tools & Design Productivity

| Session Chair | Brad Hutchings, Brigham Young University |
|---------------|------------------------------------------|
|               | Location: Ballroom, Parkyard Hotel |
| 11:00-11:20   | Design Re-Use for Compile Time Reduction in FPGA High-Level Synthesis Flows<br>Marcel Gort and Jason Anderson |
| 11:20-11:40   | Is High Level Synthesis Ready for Business? A Computational Finance Case Study<br>Gordon Inggs, Shane Fleming, David Thomas and Wayne Luk |
| 11:40-12:00   | Comparing Performance, Productivity and Scalability of the TILT Overlay Processor to OpenCL HLS<br>Rafat Rashid, J. Gregory Steffan and Vaughn Betz |
| 12:00-12:20   | Size Aware Placement for Island Style FPGAs<br>Junying Huang, Colin Yu Lin, Yang Liu, Zhihua Li and Haigang Yang |
| 12:20-12:40   | Analyzing the Impact of Heterogeneous Blocks on FPGA Placement Quality<br>Chang Xu, Wentai Zhang and Guojie Luo |
| Lunch         |  |
|               | Location: Parkyard Hotel |
| 1.2 Financial Applications |  |
| Session Chair | Nachiket Kapre, Nanyang Technological University |
|               | Location: Ballroom, Parkyard Hotel |
| 14:00-14:20   | Low-latency Option Pricing using Systolic Binomial Trees<br>Aryan Tavakkoli and David B. Thomas |
| 14:20-14:40   | Collaborative Processing of Least-Square Monte Carlo for American Options<br>Jinzhe Yang, Ce Guo, Wayne Luk and Terence Nahar |
| 14:40-15:00   | Accelerating Transfer Entropy Computation<br>Shengjia Shao, Ce Guo, Wayne Luk and Stephen Weston |
| 15:00-15:20   | FPGA-Accelerated Monte-Carlo Integration using Stratified Sampling and Brownian Bridges<br>Mark de Jong, Vlad-Mihai Sima, Koen Bertels and David Thomas |
| Break         |  |
| 15:20-15:40   | Location: Parkyard Hotel |
| 1.3 Architecture & Runtime Systems |  |
| Session Chair | Paul Chow, University of Toronto |
|               | Location: Ballroom, Parkyard Hotel |
| 15:40-16:00   | Time Sharing of Runtime Coarse-Grain Reconfigurable Architectures Processing Elements in Multi-Process Systems<br>Benjamin Carrion Schafer |
| 16:00-16:20   | Architectural Synthesis of Computational Pipelines with Decoupled Memory Access<br>Shaoyi Cheng and John Wawrzynek |
| 16:20-16:40   | Improve Memory Access for Achieving Both Performance and Energy Efficiencies on Heterogeneous Systems<br>Hongyuan Ding and Miaoqing Huang |
| 16:40-17:00   | Approaching Overhead-Free Execution on FPGA Soft-Processors<br>Charles Eric LaForest, Jason Anderson and J. Gregory Steffan |
| Design Competition |  |
| Session Chair | Qiang Liu, Tianjin University |
| 17:00-18:00   | Location: Poster Area, Parkyard Hotel |
|               | Hardware/Software Co-Design Architecture for Blokus Duo Solver<br>Naru Sugimoto and Hideharu Amano |
|               | Optimize MinMax Algorithm to Solve Blokus Duo Game by HDL<br>Hossein Borhanifar and Seyed Peyman Zolnouri |
|               | An Improved FPGA-based Specific Processor for Blokus Duo<br>Javier Olivito, Alberto Delmás and Javier Resano |
|               | Highly Scalable, Shared Memory, Monte-Carlo Tree Search based Blokus Duo Solver on FPGA<br>Ehsan Qasemi, Amir Samadi, Mohammad H. Shadmehr, Bardia Azizian, Sajjad Mozaffari, Amir Shirian and Bijan Alizadeh |
|               | Blokus Duo Engine on a Zynq<br>Susumu Mashimo, Kansuke Fukuda, Motoki Amagasaki, Masahiro Iida, Morihiro Kuga and Toshinori Sueyoshi |
|               | FPGA Implementation of Blokus Duo Player using Hardware/Software Co-Design<br>Akira Kojima |
|               | An FPGA Blokus Duo Solver with a High Activity<br>Takumi Fujimori, Kouta Akagi, Retsu Moriwaki, Hiroyuki Ito, Takayuki Kubota, Masato Seo and Minoru Watanabe |
|               | Stratics FPGA Blokus Duo Solver<br>Takumi Fujimori, Kouta Akagi, Retsu Moriwaki, Hiroyuki Ito, Takayuki Kubota, Masato Seo and Minoru Watanabe |
|               | An Improved FPGA Blokus Player via Alpha-Betha Pruning and Monte-Carlo Algorithm<br>Nariman Eskandari, Ali Jahanshahi and Mohammad Kazem Taram |
|               | The Stochastic Blokus Duo Solver<br>Rie Soejima, Kota Aoki, Kaoru Hamasaki, Masahito Oishi, Koji Okina, Jimpei Hamamura, Shun Kashiwagi, Yoshiki Hayashida, Ryo Fujita, Yudai Shirakura, Fumihiko Iwasaki, Tai Noguchi, Aiko Iwasaki, Kota Fukumoto and Yuichiro Shibata |
|               | Developing an FPGA Blokus Duo Solver By System-Level Design<br>Masataka Ogawa, Yuki Ando, Shinya Honda, Go Sato and Yusuke Kato |
|               | Blokus Duo Player Based on ZYBO<br>Song Xu and Lin Wang |
|               | BLUE STORM-Blokus Unified Engine of Search and Test Operation by RitsuMei<br>Masashi Ohno, Yuu Nakahara, Kazuya Ohtsu, Tatsuya Suzuki, Tomonori Izumi and Meng Lin |
|               | An Implementation of Multi Game AI System of Blokus Duo on FPGA with NSL<br>Ryo Tamaki |

#### **Demo Session**

#### Session Chair: Hao Zhou, Fudan University

Location: Ballroom B, Parkyard Hotel

17:00-18:00

Network Recorder and Player: FPGA-based Network Traffic Capture and Replay<br>Siyi Qiao, Chen Xu, Lei Xie, Ji Yang, Chengchen Hu, Xiaohong Guan and Jianhua Zhou

Implementation of LS-SVM with HLS on Zynq<br>Ma Ning, Wang Shaojun, Pang Yeyong and Peng Yu

A High-Performance and High-Programmability Reconfigurable Wireless Development Platform<br>Jiahua Chen, Tao Wang, Haoyang Wu, Jian Gong, Xiaoguang Li, Yang Hu, Gaohan Zhang, Zhiwei Li, Junrui Yang and Songwu Lu

Image Processing by a 0.3V 2MW Coarse-Grained Reconfigurable Accelerator CMA-SOTB with a Solar Battery<br>Yu Fujita, Koichiro Masuyama and Hideharu Amano

# Design Competition (Final)

Session Chair: Qiang Liu, Tianjin University

17:00-18:00<br>Location: Ballroom A, Parkyard Hotel

Welcome Reception

18:00-20:00<br>Location: Phoenix

# [FPT Conference] Day 2 (Thursday, 11 December)

| Registration |                                                           |
|--------------|-----------------------------------------------------------|
| 8:00-8:50    | Location: Parkyard Hotel                                  |
| Announcement |                                                           |
| 8:50-9:00    | Location: Ballroom, Parkyard Hotel                        |
| Keynote Lecture II |                                                     |
| Session Chair: Hayden Kwok-Hay So, University of Hong Kong |             |
| 9:00-10:00   | Automating Customized Computing                           |
|              | Prof. Jason Cong, UCLA                                    |
|              | Location: Ballroom, Parkyard Hotel                        |
| Poster Session II: PhD Forum |                                           |
| Session Chair: Yu Hu, Chinese Academy of Sciences |                      |
| 10:00-11:00  | Location: Poster Area, Parkyard Hotel                     |
|              | Design Space Exploration for FPGA-based Hybrid Multicore  |
|              | Architecture                                              |
|              | Jian Yan, Junqi Yuan, Ying Wang, Philip Leong and         |
|              | Lingli Wang                                               |
|              | Reducing the Overhead of Dynamic Partial Reconfiguration  |
|              | for Multi-Mode Circuits                                   |
|              | Brahim Al Farisi, Karel Heyse and Dirk Stroobandt         |
|              | HW Acceleration of Multiple Applications on a Single FPGA |
|              | Yidi Liu and Benjamin Carrion Schafer                     |
|              | Towards Automatic Partial Reconfiguration in FPGAs        |
|              | Fubing Mao, Wei Zhang and Bingsheng He                    |
|              | Achieving Higher Performance of Memcached by Caching at   |
|              | Network Interface                                         |
|              | Eric S. Fukuda, Hiroaki Inoue, Takashi Takenaka,          |
|              | Dahoo Kim, Tsunaki Sadahisa, Tetsuya Asai and             |
|              | Masato Motomura                                           |
|              | No Zero Padded Sparse Matrix-Vector Multiplication on     |
|              | FPGAs                                                     |
|              | Jiasen Huang, Junyan Ren, Wenbo Yin and Lingli Wang       |

# 2.1 Mathematical Circuits

| Session Chair: Donald Bailey, Massey University |                                                                                                     |
|---------------|---------------------------------------------------------------------------------------------------------------------------------------------|
|               | Location: Ballroom, Parkyard Hotel |
| 11:00-11:20   | Low-Latency Double-Precision Floating-Point Division for FPGAs<br>Björn Liebig and Andreas Koch |
| 11:20-11:40   | Efficient FPGA Implementation of Digit Parallel Online Arithmetic Operators<br>Kan Shi, David Boland and George A. Constantinides |
| 11:40-12:00   | An Efficient FPGA Implementation of QR Decomposition using a Novel Systolic Array Architecture based on Enhanced Vectoring CORDIC<br>Jianfeng Zhang, Paul Chow and Hengzhu Liu |
| 12:00-12:20   | Area Efficient Floating Point Adder and Multiplier with IEEE-754 Compatible Semantics<br>Andreas Ehliar |
| 12:20-12:40   | A Universal FPGA-based Floating-Point Matrix Processor for Mobile Systems<br>Wenqiang Wang, Kaiyuan Guo, Mengyuan Gu, Yuchun Ma and Yu Wang |
| Lunch         |  |
|               | Location: Parkyard Hotel |
| Session: Hardware Security |  |
| Session Chair: Yongqiang Lyu, Tsinghua University |  |
|               | Location: Ballroom, Parkyard Hotel |
| 14:00-14:25   | A Survey on Security and Trust of FPGA-based Systems<br>Jiliang Zhang and Gang Qu |
| 14:25-14:50   | Hardware Trojan Detection Acceleration Based on Word-Level Statistical Properties Management<br>He Li and Qiang Liu |
| 14:50-15:15   | Power Supply Noise Aware Evaluation Framework for Side Channel Attacks and Countermeasures<br>Jianlei Yang, Chenguang Wang, Yici Cai and Qiang Zhou |
| 15:15-15:40   | Memory Security in Reconfigurable Computers: Combining Formal Verification with Monitoring<br>Tobias Wiersema, Stephanie Drzevitzky and Marco Platzner |
| 15:40-16:05   | An FPGA-based Spectral Anomaly Detection System<br>Duncan J.M. Moss, Zhe Zhang, Nicholas J Fraser and Philip H.W. Leong |
| MOOC          |  |
| Session Chair: Manfred Glesner, Technische Universität Darmstadt |  |
| 14:00-16:05   | Location: Meeting Room 1, Parkyard Hotel |
| Huangpu River Cruise |  |
| 16:05-20:00   |  |

# [FPT Conference] Day 3 (Friday, 12 December)

| Registration         |                                                           |
|----------------------|-----------------------------------------------------------|
| 8:00-8:50            | Location: Parkyard Hotel                                  |
| Announcement         |                                                           |
| 8:50-9:00            | Location: Ballroom, Parkyard Hotel                        |
| Keynote Lecture III  |                                                           |
| Session Chair: Yu Wang, Tsinghua University |                                    |
| 9:00-10:00           | Doing FPGA in a Former Software Company                   |
|                      | Feng-hsiung Hsu, Microsoft Research Asia                  |
|                      | Location: Ballroom, Parkyard Hotel                        |
| Poster Session III: Architecture & Tools |                                       |
| Session Chair: Kentaro Sato, Tohoku University |                                 |
| 10:00-11:00          | Location: Poster Area, Parkyard Hotel                     |
|                      | Assessing Scrubbing Techniques for Xilinx SRAM-based      |
|                      | FPGAs in Space Applications                               |
|                      | Fredrik Brosser, Emil Milh, Vilhelm Geijer and            |
|                      | Per Larsson-Edefors                                       |
|                      | A Fast, Energy Efficient, Field Programmable              |
|                      | Threshold-Logic Array                                     |
|                      | Niranjan Kulkarni, Jinghua Yang and Sarma Vrudhula        |
|                      | A Novel Three-Dimensional FPGA Architecture with          |
|                      | High-Speed Serial Communication Links                     |
|                      | Takuya Kajiwara, Qian Zhao, Motoki Amagasaki,             |
|                      | Masahiro Iida, Morihiro Kuga and Toshinori Sueyoshi       |
|                      | Scalable Radio Processor Architecture for Modern Wireless |
|                      | Communications                                            |
|                      | Young-Hwan Park, Keshava Prasad, Yeonbok Lee,             |
|                      | Kitaek Bae and Ho Yang                                    |
|                      | Integrating FPGA-based Processing Elements into a Runtime |
|                      | for Parallel Heterogeneous Computing                      |
|                      | David de la Chevallerie, Jens Korinth and Andreas Koch    |
|                      | Deep and Narrow Binary Content-Addressable Memories       |
|                      | using FPGA-based BRAMs                                    |
|                      | Ameer M.S. Abdelhadi and Guy G.F. Lemieux                 |
|                      | Development Productivity in Implementing a Complex        |
|                      | Heterogeneous Computing Application                       |
|                      | Anthony Milton, David Kearney, Sebastien Wong and         |
|                      | Simon Lemmo                                               |
|                      | Real-Time 3D Reconstruction for FPGAs: A Case Study for   |
|                      | Evaluating the Performance, Area, and Programmability     |
|                      | Trade-offs of the Altera OpenCL SDK                       |
|                      | Quentin Gautier, Alexandria Shearer, Janarbek Matai,      |
|                      | Dustin Richmond, Pingfan Meng and Ryan Kastner            |
|               | Online Scheduling for FPGA Computation in the Cloud                                      |
|               | Guohao Dai, Yi Shan, Fei Chen, Yu Wang, Kun Wang and                                     |
|               | Huazhong Yang                                                                            |
|               | High Performance Relevance Vector Machine on HMPSoC                                      |
|               | Yongfu He, Shaojun Wang, Yu Peng, Yeyong Pang, Ning Ma                                   |
|               | and Jingyue Pang                                                                         |
|               | Improving the Reliability of RO PUF using Frequency Offset                               |
|               | Bin Tang, Yaping Lin and Jiliang Zhang                                                   |
| 3.1 Applications & Devices |  |
| Session Chair: André DeHon, University of Pennsylvania |  |
|               | Location: Ballroom, Parkyard Hotel |
| 11:00-11:20   | ROTORouter: Router Support for Endpoint-Authorized Decentralized Traffic Filtering to Prevent DoS Attacks<br>Albert Kwon, Kaiyu Zhang, Perk Lun Lim, Yuchen Pan, Jonathan M. Smith and André DeHon |
| 11:20-11:40   | Parallel Resampling for Particle Filters on FPGAs<br>Shuanglong Liu, Grigorios Mingas and Christos-Savvas Bouganis |
| 11:40-12:00   | Evaluation of SNMP-Like Protocol to Manage a NoC Emulation Platform<br>Otávio Alcântara de Lima Junior, Virginie Fresse and Frédéric Rousseau |
| 12:00-12:20   | A High-Performance Low-Power Near-Vt RRAM-based FPGA<br>Xifan Tang, Pierre-Emmanuel Gaillardon and Giovanni De Micheli |
| 12:20-12:40   | A Pure-CMOS Nonvolatile Multi-Context Configuration Memory for Dynamically Reconfigurable FPGAs<br>Kosuke Tatsumura, Masato Oda and Shinichi Yasuda |
| Lunch         |                                                                                          |
|               | Location: Parkyard Hotel                                                                 |
| 3.2 Industrial Session |                                                                                 |
| Session Chair: Haile Yu, Cluster Technology Limited |                                                    |
|               | Location: Ballroom, Parkyard Hotel                                                       |
| 14:00-14:20   | Programmable Multimedia Platform based on Samsung                                        |
|               | Reconfigurable Processor                                                                 |
|               | Sukjin Kim, Samsung                                                                      |
| 14:20-14:40   | Staying a Generation Ahead                                                               |
|               | Jason Wong, Xilinx                                                                       |
| 14:40-15:00   | IoT and Wearable Applications Enabled by Bluetooth Low                                   |
|               | Energy (BLE) solutions                                                                   |
|               | Patrick Kane, Cypress                                                                    |
| 15:00-15:20   | The Implementation Methodology of FPGA in Future                                         |
|               | Dylan Wang, Altera                                                                       |
| 15:20-15:40   | On-Chip Loop Timing Design on a Virtex-7 FPGA<br><i>Xiaolong Xie, ZTE</i> |
| 15:40-16:00   | China Programmable IC: Innovation with CAP Technology<br>Dr. Ming Liu, Capital Microelectronics |
| Closing       |  |
| 16:00-16:45   | Location: Ballroom, Parkyard Hotel |

# [Workshop III, IV, V] Day 4 (Saturday, 13 December)

| Workshops |                                                        |
|-----------|--------------------------------------------------------|
| All day   | Workshop 3: Cypress Workshop "ARM PSoC 4               |
|           | Lab-in-a-box" and "Bluetooth Low Energy Workshop"      |
|           | Location: Ballroom A, Parkyard Hotel                   |
| All day   | Workshop 4: Xilinx Workshop "OpenCV Application        |
|           | Acceleration with Vivado High Level Synthesis"         |
|           | Location: Room 101, Computer Building,                 |
|           | Zhangjiang Campus, Fudan University                    |
| All day   | Workshop 5: IWETSO (International Workshop on Emerging |
|           | Technologies of Synthesis and Optimization)            |
|           | Location: Room 103, Administration Building,           |
|           | Zhangjiang Campus, Fudan University                    |

# [Workshop IV] Day 5 (Sunday, 14 December)

| Workshops |                                                                                                                                                                                    |
|-----------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| All day   | Workshop 4: Xilinx Workshop "OpenCV Application<br>Acceleration with Vivado High Level Synthesis"<br>Location: Room 101, Computer Building,<br>Zhangjiang Campus, Fudan University |

# Shanghai Metro Network Map





The conference will be held at **Parkyard Hotel**. **Address:** No. 699 Bibo Road, Pudong New Area, Shanghai, China.



# Zhangjiang Campus, Fudan University





# First Floor Plan of Shanghai Parkyard Hotel

## **FPT'14 Conference**

Fudan University

E-mail: fpt2014@fudan.edu.cn

Tel: +8621 65643761 / Fax: +8621 65643449

Website: http://www.icfpt2014.org/

Parkyard WiFi: Parkyard (no password required)

