# Dan Zhang

## **EDUCATION**

**University of Texas** – Austin (Austin, TX)
PhD Computer Engineering, August 2017
MS Computer Engineering, Dec 2012
Overall MS GPA: 4.000/4.000

**University of Michigan** – Ann Arbor (Ann Arbor, MI)
BSE Computer Engineering, Dec 2008
Overall GPA: 3.870/4.000

## **PUBLICATIONS**

Dan Zhang, Xiaoyu Ma, Derek Chiou, "Worklist-directed Prefetching," in *Computer Architecture Letters*, 2016.

Xiaoyu Ma, Dan Zhang, Derek Chiou, "FPGA-Accelerated Transactional Execution of Graph Workloads," in *FPGA*, 2017.

Andrea Pellegrini, Kypros Constantinides, Dan Zhang, Shobana Sudhakar, Valeria Bertacco, and Todd Austin, "CrashTest: A Fast High-Fidelity FPGA-Based Resiliency Analysis Framework," in *ICCD*, 2008.

## **WORK EXPERIENCE**

#### Microsoft Research (Contractor): Redmond, WA

FPGA Developer, October 2015-Current

Main developer for the Catapult academic program. Developed and tested Catapult academic shell and sample projects for the Mt Granite Catapult FPGA board. Wrote Catapult documentation and supported academic users.

#### Apple Corporation: Cupertino, CA

CPU Front-End Design Intern, May 2015-August 2015

Proposed, invented, simulated and evaluated new techniques for next-line prediction and instruction prefetching.

#### Microsoft Research: Redmond, WA

Catapult Research Intern, June 2014-August 2014

Evaluated and helped design a custom FPGA manycore architecture for accelerating Bing ranking in datacenters. Characterized next-generation cores compared to the current core design and baseline Altera NIOS cores.

#### Centaur Technology: Austin, TX

Intern, May 2013-August 2013

Invented, proposed, simulated and evaluated new branch predictor for upcoming x86 processor.

#### Intel Corporation: Hillsboro, OR and Santa Clara, CA

Interned 4 times with ORCA, Larrabee, Jaketown, Intel Labs, May-Aug 2007, May-Aug 2009-2011
- Intel Labs: analyzed and simulated x86 extensions to improve vectorization of general purpose code.
- Jaketown: simulation and verification of their global power management microcontroller.
- Larrabee: created Python-based graphing infrastructure for perf correlation and validation.
- ORCA: SPECINT06 perf analysis for compilers, owned large cache study.

#### NVIDIA Corporation: Santa Clara, CA

Intern GPU HW, May 2008-August 2008

Engineering Change Orders, designed web-based fuse utility and scripts for performance validation.

## **HACKATHONS**

SXSW Music 2015 - Best Use of Rdio, MusicGraph: CrowdJockey, judges crowd intensity using Kinect.
TreeHacks 2015 - Best Use of Meta: MetaWorld, augmented reality data visualization tool using Meta AR.
PennApps 2015 - Best Wearable Health Hack: Dr. Cloud, suggests diagnostics using vitals + NLP.
HackTX 2014 - Top 10 finalist: FaceIt, Chrome plugin that enables upvotes/downvotes for Facebook posts.
HackTX 2013 - 2<sup>nd</sup> place: Relevant XKCD, web app shows relevant comic given search phrase using tf-idf.
eBay 2012 - 3<sup>rd</sup> place: NoGo, Chrome plugin that donates \$1 via PayPal when you visit blocked websites.

## **CLASS PROJECT EXPERIENCE**

#### A 1 GHz Custom-Logic 16-bit DSP for Multimedia Applications

AMD Design Competition-Winning Semester Project for VLSI I @ UMich, Spring 2008

4-way SIMD, Sklansky adder, full custom datapath, clock gating, stream buffer, 7-stage pipeline, GShare.

#### Duchess: an Out-of-Order Alpha CPU

Semester Project for Computer Architecture @ UMich, Fall 2006

Written with synthesizable Verilog, contains ROB, RRAT, RAS, GShare branch predictor, OOO LSQ.

## **KEY SKILLS**