Differences

This shows you the differences between two versions of the page.

readings [2015/09/16 03:50]
nandita [Review Set 1 due (3 PM)]
readings [2015/09/23 15:57]
nandita [Review Set 3 (due 3 PM)]
Line 24: Line 24:
 ===== Recitation 2 =====
 
-==== Review Set 2 due (3 PM)====
+==== Review Set 2 (due 3 PM)====
   -  Ahn et al., [[ http://users.ece.cmu.edu/~omutlu/pub/tesseract-pim-architecture-for-graph-processing_isca15.pdf | A Scalable Processing-in-Memory Accelerator for Parallel Graph Processing]], //ISCA 2015.// **[Review Required]**
   -  Stephen W. Keckler, William J. Dally, Brucek Khailany, Michael Garland, David Glasco, [[ http://www.cs.nyu.edu/courses/spring12/CSCI-GA.3033-012/ieee-micro-echelon.pdf | GPUs and the Future of Parallel Computing]], IEEE Micro 2011. **[Review Required]**
Line 52: Line 52:
 ===== Recitation 3 =====
 ==== Review Set 3 (due 3 PM) ====
 +  -  Cai et al., [[ http://​users.ece.cmu.edu/​~omutlu/​pub/​flash-read-disturb-errors_dsn15.pdf | Read Disturb Errors in MLC NAND Flash Memory: Characterization and Mitigation]], ​ //DSN 2015.// **[Review Required]**
 +  - Lee et al., [[ http://​users.ece.cmu.edu/​~omutlu/​pub/​pcm_isca09.pdf | Architecting Phase Change Memory as a Scalable DRAM Alternative]],​ //ISCA 2009// **[Review Required]**
 +  - Joao et al., [[http://​users.ece.cmu.edu/​~omutlu/​pub/​bottleneck-identification-and-scheduling_asplos12.pdf | Bottleneck Identification and Scheduling in Multithreaded Applications]],​ //ASPLOS 2012.// **[Review Required]**
 +  - Ebrahimi et al., [[ http://​users.ece.cmu.edu/​~omutlu/​pub/​fst_asplos10.pdf| Fairness via Source Throttling: A Configurable and High-Performance Fairness Substrate for Multi-Core Memory Systems]], //ASPLOS 2010.// **[Optional]**
 ==== Optional Readings Mentioned in Lecture ====
 +  * Cai et al., [[ https://​users.ece.cmu.edu/​~omutlu/​pub/​flash-memory-data-retention_hpca15.pdf | Data Retention in MLC NAND Flash Memory: Characterization,​ Optimization and Recovery]], //HPCA 2015// ​
 +  * Lee et al., [[http://​www.cs.rochester.edu/​~ipek/​ieeemicro10.pdf | Phase Change Technology and the Future of Main Memory]], //IEEE Micro 2010//
 +  * Suleman et al., [[http://​users.ece.cmu.edu/​~omutlu/​pub/​acs_asplos09.pdf | Accelerating Critical Section Execution with Asymmetric Multi-Core Architectures]],​ //ASPLOS 2009//
 +  * Kang et al., [[http://users.ece.cmu.edu/~yoonguk/papers/kang-memoryforum14.pdf | Co-Architecting Controllers and DRAM to Enhance DRAM Process Scaling]], //Memory Forum 2014//
 +  * Seshadri et al., [[http://​users.ece.cmu.edu/​~omutlu/​pub/​in-DRAM-bulk-AND-OR-ieee_cal15.pdf | Fast Bulk Bitwise AND and OR in DRAM]], //IEEE Computer Architecture Letters (CAL), April 2015.//
 +   * Seshadri et al., [[http://​users.ece.cmu.edu/​~omutlu/​pub/​in-DRAM-bulk-AND-OR-ieee_cal15.pdf | RowClone: Fast and Energy-Efficient In-DRAM Bulk Data Copy and Initialization ]], //MICRO 2013// ​
 +   * Ahn et al., [[ http://​users.ece.cmu.edu/​~omutlu/​pub/​tesseract-pim-architecture-for-graph-processing_isca15.pdf | A Scalable Processing-in-Memory Accelerator for Parallel Graph Processing]], ​ //ISCA 2015.//
 +   * Loh et al., [[http://​ag-rs-www.informatik.uni-kl.de/​publications/​data/​Loh08.pdf| 3D-Stacked Memory Architectures for Multi-Core Processors]],​ //ISCA 2008.//
 +   * Dennis et al., [[https://courses.cs.washington.edu/courses/cse548/11au/Dennis-Dataflow.pdf | A Preliminary Architecture for a Basic Data Flow Processor]], //ISCA 1974.//
  
 +===== Lecture 5 =====
 +==== Optional Readings Mentioned in Lecture ====
 +
 +  * T. Yeh and Y. Patt, [[http://web.cecs.pdx.edu/~herb/ece587s15/Papers/08_yeh_patt_br_predict_1991.pdf | Two-Level Adaptive Training Branch Prediction]], //Intl. Symposium on Microarchitecture, November 1991. MICRO Test of Time Award Winner (after 24 years).//
 +  * Kessler, R. E., [[http://cseweb.ucsd.edu/classes/sp00/cse241/alpha.pdf | The Alpha 21264 Microprocessor]], //IEEE Micro, March/April 1999, pp. 24-36//
 +  * McFarling, S., [[http://​www.hpl.hp.com/​techreports/​Compaq-DEC/​WRL-TN-36.pdf | Combining Branch Predictors]],​ //DEC WRL Technical Report, TN-36, June 1993//
 +  * Smith and Sohi, [[ftp://​ftp.cs.wisc.edu/​sohi/​papers/​1995/​ieee-proc.superscalar.pdf | The Microarchitecture of Superscalar Processors]],​ //​Proceedings of the IEEE, 1995.//
 +  * Evers et al., [[http://​www.ece.cmu.edu/​~ece740/​f10/​lib/​exe/​fetch.php?​media=analysisofcorrelationandpredictability.pdf | An Analysis of Correlation and Predictability:​ What Makes Two-Level Branch Predictors Work]], //ISCA 1998//
 +  * Chang et al., [[http://​ieeexplore.ieee.org/​xpls/​abs_all.jsp?​arnumber=717404 | Branch classification:​ a new mechanism for improving branch predictor performance]],​ //MICRO 1994//
 +  * Sprangle et al., [[http://​ieeexplore.ieee.org/​xpls/​abs_all.jsp?​arnumber=604711 | The Agree Predictor: A Mechanism for Reducing Negative Branch History Interference]],​ //ISCA 1997.//
 +  * Seznec, [[http://​www.irisa.fr/​caps/​oldcaps/​people/​seznec/​Optim2bcgskew.pdf | An optimized 2bcgskew branch predictor]],​ //IRISA Tech Report 1993.//
 +  * Michaud, [[http://​citeseerx.ist.psu.edu/​viewdoc/​download?​doi=10.1.1.94.63&​rep=rep1&​type=pdf | Trading conflict and capacity aliasing in conditional branch predictors]],​ //ISCA 1997//
 +  * Lee et al., [[http://​www-inst.eecs.berkeley.edu/​~cs152/​sp05/​handouts/​p4-lee.pdf | The bi-mode branch predictor]],​ //MICRO 1997.//
 +  * Eden and Mudge, [[http://​web.eecs.umich.edu/​~tnm/​papers/​yags.pdf | The YAGS branch prediction scheme]], //MICRO 1998.//
 +  * Seznec et al., [[http://​www.cs.utah.edu/​~rajeev/​cs7810/​papers/​seznec02.pdf | Design tradeoffs for the Alpha EV8 conditional branch predictor]],​ //ISCA 2002.//
 +  * Chappell et al., [[http://​www.ece.cmu.edu/​~ece740/​f13/​lib/​exe/​fetch.php?​media=chappell_ssmt99.pdf | Simultaneous Subordinate Microthreading (SSMT)]], //ISCA 1999.//
 +  * Seznec, [[https://​classes.soe.ucsc.edu/​cmpe221/​Spring06/​papers/​03trace.pdf | Analysis of the O-Geometric History Length branch predictor]],​ //ISCA 2005//
 +  * Gochman et al., [[http://​www.weblearn.hs-bremen.de/​risse/​RST/​WS04/​Centrino/​vol7iss2_art03.pdf | The Intel Pentium M Processor: Microarchitecture and Performance]],​ //Intel Technology Journal, May 2003//
 +  * Jimenez and Lin, [[https://​www.cs.utexas.edu/​~lin/​papers/​hpca01.pdf | Dynamic Branch Prediction with Perceptrons]],​ //HPCA 2001//
 +  * Rosenblatt, [[http://​catalog.hathitrust.org/​Record/​000203591 | Principles of Neurodynamics:​ Perceptrons and the Theory of Brain Mechanisms]],​ //1962//
 +  * Seznec and Michaud, ​ [[http://​www.jilp.org/​vol8/​v8paper1.pdf | A case for (partially) tagged Geometric History Length Branch Prediction]],​ //JILP 2006.//
 +  * Andre Seznec, [[http://​www.jilp.org/​cbp2014/​paper/​AndreSeznec.pdf | TAGE-SC-L branch predictors]],​ //CBP 2014//
 +  * Chappell et al., [[http://​hps.ece.utexas.edu/​pub/​ssmt_isca_29.pdf | Difficult-Path Branch Prediction Using Subordinate Microthreads]],​ //ISCA 2002.//
 +  * Jacobsen et al., [[http://​people.engr.ncsu.edu/​ericro/​publications/​conference_MICRO-29_jrs.pdf | Assigning Confidence to Conditional Branch Predictions]],​ //MICRO 1996.//
 +  * Manne et al., [[http://​www.cs.utah.edu/​~rajeev/​cs7810/​papers/​manne98.pdf | Pipeline Gating: Speculation Control for Energy Reduction]],​ //ISCA 1998//
 +  * Pettis and Hansen, [[http://​perso.ensta-paristech.fr/​~bmonsuez/​Cours/​B6-4/​Articles/​papers15.pdf | Profile Guided Code Positioning]],​ //PLDI 1990.//
 +  * Hwu et al., [[http://impact.crhc.illinois.edu/shared/papers/hwu_jsuper93.pdf | The Superblock: An Effective Technique for VLIW and Superscalar Compilation]], //Journal of Supercomputing, 1993.//
 +  * Rotenberg et al., [[http://​people.engr.ncsu.edu/​ericro/​publications/​conference_MICRO-29_rbs.pdf | Trace Cache: a Low Latency Approach to High Bandwidth Instruction Fetching]], //MICRO 1996.//
 +  * Patel et al., [[https://​www.eecs.umich.edu/​techreports/​cse/​97/​CSE-TR-335-97.pdf | Critical Issues Regarding the Trace Cache Fetch Mechanism]],​ //Umich TR, 1997.//
 +  * A. Peleg, U. Weiser, [[http://​patft1.uspto.gov/​netacgi/​nph-Parser?​Sect1=PTO1&​Sect2=HITOFF&​d=PALL&​p=1&​u=%2Fnetahtml%2FPTO%2Fsrchnum.htm&​r=1&​f=G&​l=50&​s1=5381533.PN.&​OS=PN/​5381533&​RS=PN/​5381533 | Dynamic Flow Instruction Cache Memory Organized Around Trace Segments Independent of Virtual Address Line]], //United States Patent No. 5,381,533, Jan 10, 1995.// ​
 +
 +
 +
 +===== Recitation 4 =====
 +
 +==== Review Set 4 ====
 +  * Eiman Ebrahimi et al., [[ https://users.ece.cmu.edu/~omutlu/pub/fst_asplos10.pdf | Fairness via Source Throttling: A Configurable and High-Performance Fairness Substrate for Multi-Core Memory Systems]], //ASPLOS 2010//
 +
 +