readings

Differences

This shows you the differences between two versions of the page.

readings [2014/09/09 16:13]
yixinluo
readings [2014/11/12 15:46]
yixinluo
Line 5: Line 5:
 Note that the reviews are due at 11:59 PM on the due date. Note that the reviews are due at 11:59 PM on the due date.
  
-==== Reading List ==== +==== Reading List (in reverse order) ​==== 
-=== 8/26/2014 === +=== 11/13/2014 === 
-^ Due 9/2/2014 | {{motmoo-springer-chapter-7-30-2014.pdf|Onur Mutlu, "Main Memory Scaling: Challenges and Solution Directions", preprint book Chapter 6, 2014.}} | +Reviews required for both papers, due on Wednesday, Nov 12.
-^ Due 9/6/2014 | Pick 3 papers referenced by the above paper that pique your interest | +  * **[[http://dl.acm.org/citation.cfm?id=2503257|Sridharan et al., "Feng shui of supercomputer memory: positional effects in DRAM and SRAM faults", SC 2013.]]**
-^ Due 9/6/2014 | [[http://www.cs.virginia.edu/~robins/YouAndYourResearch.html|Hamming, "You and Your Research," Bell Communications Research Colloquium Seminar, 7 March 1986.]] | +  * **[[https://www.cs.princeton.edu/~appel/papers/memerr.pdf|Sudhakar Govindavajhala, Andrew W. Appel, "Using Memory Errors to Attack a Virtual Machine", SP 2003.]]**
-| | [[http://web.stanford.edu/class/cs240/readings/lampson-hints.pdf|Butler W. Lampson, "Hints for computer system design", SOSP 1983]] | +=== 11/12/2014 ===
-| | [[http://books.google.com/​books/about/Inside_the_AS_400.html?​id=hJtyAAAACAAJ|Frank Soltis, "Inside ​the AS/400", ​1996]] | +Reviews required for both papers, due on Tuesday, Nov 11. 
-| | [[http://www.cs.utexas.edu/users/mckinley/notes/reviewing.html|Hill and McKinley, "Notes on Constructive and Positive Reviewing".]] | +  * **[[http://users.ece.utexas.edu/~merez/vecc_asplos_2010.pdf|Doe Hyun Yoon, Mattan Erez, "Virtualized and flexible ECC for main memory", ASPLOS 2010.]]**
-| | [[https://www.usenix.org/legacy/publications/library/proceedings/dsl97/good_paper.html|Levin and Redell, "How (and how not) to write a good systems paper", OSR 1983.]] | +    * [[http://ece.umd.edu/courses/enee759h.S2003/references/ibm_chipkill.pdf|T. J. Dell, “A White Paper on the Benefits of Chipkill-Correct ECC for PC Server Main Memory,” IBM Microelectronics Division, 1997.]]
-| | [[http://www.ifs.tuwien.ac.at/~silvia/research-tips/smith-advice.pdf|Alan Jay Smith, “The Task of the Referee”, IEEE Computer 1990.]] | +    * [[http://static.googleusercontent.com/media/research.google.com/en/us/pubs/archive/35162.pdf|Bianca Schroeder et al., "DRAM Errors in the Wild: A Large-Scale Field Study", SIGMETRICS 2009.]]
-| | [[http://research.microsoft.com/en-us/um/people/simonpj/papers/giving-a-talk/writing-a-paper-slides.pdf|Jones, "How to Write a Great Research Paper".]] | +    * [[http://research.microsoft.com/pubs/144888/eurosys84-nightingale.pdf|Edmund B. Nightingale et al., "Cycles, cells and platters: an empirical analysis of hardware failures on a million consumer PCs", EuroSys 2011.]]
-| | [[http://www2.cs.uregina.ca/~pwlfong/CS499/writing-paper.pdf|Philip W. L. Fong, “How to Write a CS Research Paper: A Bibliography”, 2004.]] | +  * **[[http://passat.crhc.illinois.edu/rakeshk/hpca13.pdf|Xun Jian, Rakesh Kumar, "Adaptive Reliability Chipkill Correct (ARCC)", HPCA 2013.]]**
-=== 9/3/2014 === +
-| DRAM arch. | [[http://users.ece.cmu.edu/~omutlu/pub/salp-dram_isca12.pdf|Kim et al., "A Case for Exploiting Subarray-Level Parallelism (SALP) in DRAM", ISCA 2012.]] | +=== 11/5/2014 ===
-| DRAM arch. | [[http://users.ece.cmu.edu/~omutlu/pub/raidr-dram-refresh_isca12.pdf|Liu et al., "RAIDR: Retention-Aware Intelligent DRAM Refresh", ISCA 2012.]] | +=== 10/31/2014 ===
-| Flash | [[http://users.ece.cmu.edu/~omutlu/pub/flash-error-analysis-and-management_itj13.pdf|Cai et al., "Error Analysis and Retention-Aware Error Management for NAND Flash Memory", ITJ Vol. 17-1 2013.]] | +Review required for the following two papers, due on Friday, Oct 31.
-| DRAM reliab. | [[http://users.ece.cmu.edu/~omutlu/pub/error-mitigation-for-intermittent-dram-failures_sigmetrics14.pdf|Khan et al., "The Efficacy of Error Mitigation Techniques for DRAM Retention Failures: A Comparative Experimental Study", SIGMETRICS 2014.]] | +  * **[[http://www.cs.utexas.edu/~pingali/CS395T/2009fa/lectures/herlihy93transactional.pdf|Maurice Herlihy and J. Eliot B. Moss, "Transactional Memory: Architectural Support for Lock-Free Data Structures", ISCA 1993.]]**
-| Reliability | [[http://www.crhc.illinois.edu/ACS/pub/branchflip.pdf|Wang et al., "Y-Branches: When You Come to a Fork in the Road, Take It", PACT 2003.]] | +    * [[http://web.mit.edu/~mmt/Public/Knight86.pdf|Tom Knight, "An architecture for mostly functional languages", LFP 1986.]]
-| Reliability | [[http://users.ece.cmu.edu/~omutlu/pub/heterogeneous-reliability-memory-for-data-centers_dsn14.pdf|Luo et al., "Characterizing Application Memory Error Vulnerability to Optimize Data Center Cost", DSN 2014.]] | +    * [[http://cs.brown.edu/~mph/AspnesH90/p340-aspnes.pdf|James Aspnes, Maurice Herlihy, "Wait-Free Data Structures in the Asynchronous PRAM Model", SPAA 1990.]]
-| Security | [[https://www.cs.princeton.edu/~appel/papers/memerr.pdf|Govindavajhala ​et al., "Using Memory Errors ​to Attack a Virtual Machine", ​SP 2003.]] | +    ​* ​[[http://cs.brown.edu/~mph/​Herlihy91/​p124-herlihy.pdf|Maurice Herlihy, "​Wait-free synchronization",​ TOPLAS 1991.]] 
-| 3d stacking | [[http://www.cs.cmu.edu/~chensm/LBA_reading_group/papers/3Ddram-isca08.pdf|Loh et al., "3D-Stacked Memory Architectures for Multi-core Processors", ISCA 2008.]] | +    * [[http://web.stanford.edu/class/cs343/resources/crusoe.pdf|Alexander Klaiber, "The Technology Behind Crusoe™ Processors", 2000.]]
-| 3d stacking | [[http://pdf.aminer.org/000/499/580/die_stacking_d_microarchitecture.pdf|Black et al., "Die Stacking (3D) Microarchitecture", MICRO 2006.]] | +    * [[http://courses.cs.vt.edu/cs5204/fall11-kafura/Papers/TransactionalMemory/TM-Book-V2.pdf.pdf|J. Larus and R. Rajwar. Transactional Memory. Synthesis Lectures on Computer Architecture (Ch. 1 & 2).]]
-| In mem comp. | [[http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=4115697|Kogge et al., "EXECUBE - A New Architecture for Scalable MPPs", ICPP 1994.]] | +    * [[http://www.cs.binghamton.edu/~dima/cs580a/spec_wake_micro00.pdf|Jared Stark, Mary D. Brown, Yale N. Patt, "On pipelining dynamic instruction scheduling logic", MICRO 2000.]]
-| In mem comp. | [[http://www.ece.umd.edu/courses/enee759m.S2002/papers/fromm1997-isca24.pdf|Fromm et al., "The Energy Efficiency of IRAM Architectures", ISCA 1997.]] | +    * [[http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=931895|Sanjay J. Patel et al., “rePLay: a hardware framework for dynamic optimization,” IEEE TC, June 2001.]]
-| In mem comp. | [[http://www.eecs.berkeley.edu/~yelick/​yelick/iram-micro97.pdf|Patterson ​et al., "A Case for Intelligent DRAM: IRAM", ​IEEE Micro 1997.]] | +  * **[[http://www.cs.cmu.edu/~tcm/tcm_papers/isca00.pdf|J. Gregory Steffan ​et al., "​A ​Scalable Approach to Thread-Level ​Speculation", ISCA 2000.]]** 
-=== 9/9/2014 === +    * [[ftp://ftp.cs.wisc.edu/sohi/papers/1992/isca.esw.pdf|Manoj Franklin, Gurindar S. Sohi, "The expandable split window paradigm for exploiting fine-grain parallelism", ISCA 1992.]]
-Papers discussed in class: +    * [[ftp://​ftp.cs.wisc.edu/​sohi/​papers/​1995/​isca.multiscalar.pdf|Sohi et al., “Multiscalar Processors,​” ISCA 1995.]] 
-  * [[http://users.ece.cmu.edu/~omutlu/pub/dirty-block-index_isca14.pdf|Seshadri ​et al., "The Dirty-Block Index", ​ISCA 2014.]] +    * [[http://www.cs.cmu.edu/~colohan/papers/tls_isca00.pdf|Steffan et al., “A Scalable Approach to Thread-Level Speculation,​” ISCA 2000.]] 
-  * [[http://users.ece.cmu.edu/~omutlu/pub/staged-memory-scheduling_isca12.pdf|Ausavarungnirun ​et al., "Staged Memory Scheduling: Achieving High Performance and Scalability ​in Heterogeneous Systems", ISCA 2012.]] +    * [[https://​homes.cs.washington.edu/​~luisceze/​publications/​isca06_bulk.pdf|Luis Ceze et al., "Bulk Disambiguation of Speculative Threads in Multiprocessors", ISCA 2006.]] 
-  * [[http://​users.ece.cmu.edu/​~omutlu/​pub/​dram-access-refresh-parallelization_hpca14.pdf|Chang et al., "Improving ​DRAM Performance by Parallelizing Refreshes with Accesses", ​HPCA 2014.]] +    * [[http://​www.princeton.edu/​~rblee/​ELE572Papers/​DynamicMultithreadingProc_akkary.pdf?​q=tilde/​rblee/​ELE572Papers/​DynamicMultithreadingProc_akkary.pdf|Akkary and Driscoll, “A dynamic multithreading processor,​” MICRO 1998.]] 
-Related ​papers: +Required videos for module 2.5.* in [[http://​www.ece.cmu.edu/​~ece740/​f13/​doku.php?​id=schedule#​schedule|18-740]]: 
-  * [[https://www.cs.sfu.ca/~ashriram/publications/2012_MICRO_AmoebaCache.pdf|Kumar et al., "Amoeba-Cache: Adaptive Blocks ​for Eliminating Waste in the Memory Hierarchy", MICRO 2012.]] +| 9/25 Wed. | 2.5.1 Speculation ​| [[http://www.ece.cmu.edu/​~ece742/f14/files/onur-740-fall13-module2.5-speculation.pdf|pdf]][[http://​www.ece.cmu.edu/​~ece742/​f14/​files/​onur-740-fall13-module2.5-speculation.pptx|pptx]], [[https://​www.youtube.com/​watch?​v=g3IF8DTtr8c|YouTube]][[http://​cmu.vid.acatar.com/​Panopto/​Pages/​Viewer/​Default.aspx?​id=dbb3baf9-c85e-4007-8c71-1e3204fe9907|Panopto]] ​| [[http://www.ece.cmu.edu/​~ece740/f13/doku.php?​id=readings#​module_2-5|readings]] | 
-  * [[https://www.usenix.org/legacy/publications/library/proceedings/osdi/full_papers/waldspurger.pdf|Waldspurger et al., "Lottery Scheduling: Flexible Proportional-Share Resource Management", OSDI 1994.]] +| 9/27 Fri. | 2.5.2 Speculation | [[https://www.youtube.com/watch?v=FqHk4bxrI8Y|video]] [[http://cmu.vid.acatar.com/Panopto/Pages/Viewer/Default.aspx?id=bf92ac2b-5fb5-4896-9bbd-6cb216a16cdd|Panopto]] | ::: |
-=== 9/11/2014 === +| 9/30 Mon. | 2.5.3 Speculation | [[https://www.youtube.com/watch?v=uhyNTy8hvDs|video]] [[http://cmu.vid.acatar.com/Panopto/Pages/Viewer/Default.aspx?id=7dbc66b0-b245-45ba-b047-09a84427676a|Panopto]] | ::: |
-  * [[http://users.ece.cmu.edu/~omutlu/pub/mise-predictable_memory_performance-hpca13.pdf|Subramanian et al., "MISE: Providing Performance Predictability and Improving Fairness in Shared Main Memory Systems", HPCA 2013.]] +| ::: | 2.5.4 Speculation | [[https://www.youtube.com/watch?v=McMyefc8CCE|video]] [[http://cmu.vid.acatar.com/Panopto/Pages/Viewer/Default.aspx?id=e9eddf7c-c5c0-4ab4-a731-30852d601506|Panopto]] | ::: |
-  * [[http://users.elis.ugent.be/~seyerman/ISCA10.pdf|Eyerman et al., "Modeling critical sections in Amdahl’s law and its implications for multicore design", ISCA 2010.]] +Related readings:
-  ​* ​[[http://​users.ece.cmu.edu/​~omutlu/​pub/​bottleneck-identification-and-scheduling_asplos12.pdf|Joao et al., "Bottleneck Identification and Scheduling in Multithreaded Applications",​ ASPLOS 2012.]]+  * [[ftp://​ftp.cs.wisc.edu/sohi/papers/1995/isca.multiscalar.pdf|Sohi et al., “Multiscalar Processors,​” ISCA 1995.]] 
 +  * [[http://classes.soe.ucsc.edu/cmpe202/Spring13/papers/12a.pdf|Zhou, “Dual-Core Execution: Building a Highly Scalable Single-Thread Instruction Window,” PACT 2005.]]
 +  ​* ​[[http://pages.cs.wisc.edu/~rajwar/papers/micro01.pdf|Rajwar and Goodman, “Speculative Lock Elision: Enabling Highly Concurrent Multithreaded Execution,​” MICRO 2001.]] 
 + 
 +=== 10/16/2014 === 
 +Consistency II -- Review required (one out of two) due on Wednesday night 
 +  * **[[http://​www.eecg.toronto.edu/​~moshovos/​research/​store-wait-free.pdf|Wenisch et al., "​Mechanisms ​for Store-wait–free Multiprocessors",​ ISCA 2007]]** 
 +    * [[http://​www.cs.utexas.edu/​~pingali/​CS395T/​2009fa/​lectures/​herlihy93transactional.pdf|Herlihy ​et al., "Transactional ​Memory: Architectural Support for Lock-Free ​Data Structures", ​ISCA 1993.]] 
 +    ​* ​[[http://www.cs.cmu.edu/~tcm/tcm_papers/isca00.pdf|Steffan ​et al., "A Scalable Approach ​to Thread-Level Speculation", ​ISCA 2000.]] 
 +  * **[[https://​homes.cs.washington.edu/​~luisceze/​publications/​isca07_bulksc.pdf|Ceze et al., "​BulkSC:​ Bulk Enforcement of Sequential Consistency",​ ISCA 2007.]]** 
 +    * [[https://​homes.cs.washington.edu/​~luisceze/​publications/​isca06_bulk.pdf|Ceze et al., "Bulk Disambiguation of Speculative Threads in Multiprocessors",​ ISCA 2006.]] 
 +    * [[http://​www.ece.cmu.edu/~ece740/f13/lib/exe/​fetch.php?​media=rajwar01.pdf|Rajwar and Goodman, “Speculative Lock Elision: Enabling Highly Concurrent Multithreaded Execution”,​ MICRO 2001.]] 
 +    * [[http://www.ece.cmu.edu/~ece740/f13/lib/exe/fetch.php?media=herlihy93.pdf|Herlihy and Moss, “Transactional Memory: Architectural Support for Lock-Free Data Structures”, ISCA 1993.]]
 + 
 +=== 10/14/2014 === 
 +Background lecture and paper (required reading, no need to review): 
 +  * [[https://​www.youtube.com/​watch?​v=Mq24MXW4g3U|Consistency & Coherence Lecture]] 
 +  * [[http://courses.cs.washington.edu/courses/cse548/10wi/Lamport.pdf|Leslie Lamport, "How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs", IEEE Trans. Computers 28(9): 690-691 (1979)]]
 +Consistency I -- Required review (one out of two) due on Monday night 
 +  * **[[http://​hpc.cs.tsinghua.edu.cn/​research/zwm/reading/prof/2a.pdf|Gharachorloo ​et al., "Memory Consistency and Event Ordering in Scalable Shared-Memory Multiprocessors", ​ISCA 1990.]]** 
 +  * **[[https://​courses.engr.illinois.edu/​cs533/​sp2012/​reading_list/​gharachorloo91two.pdf|Gharachorloo et al., "Two Techniques to Enhance the Performance of Memory Consistency Models",​ ICPP 1991.]]** 
 +    * [[http://www.csd.uoc.gr/~hy460/pdf/kung.pdf|Kung et al., "On Optimistic Methods ​for Concurrency Control", ​TODS 1981.]] 
 +    * [[https://​engineering.purdue.edu/​~vpai/​Publications/​ranganathan-spaa97.pdf|Ranganathan et al., "Using Speculative Retirement and Larger Instruction Windows to Narrow the Performance Gap Between Memory Consistency Models",​ SPAA 1997.]] 
 +    * [[http://​www.cs.arizona.edu/~gniady/papers/isca99_scrc.pdf|Gniady ​et al., "Is SC + ILP = RC?", ISCA 1999.]] 
 +Recommended book 
 +  * [[http://​www.morganclaypool.com/​doi/​pdfplus/​10.2200/​S00346ED1V01Y201104CAC016|A Primer on Memory Consistency and Cache Coherence, Chapters 1, 3, 4, 5]] 
 +Further reading in a different area (system level consistency) 
 +  * [[http://sns.cs.princeton.edu/docs/eiger-nsdi13.pdf|Lloyd et al., "Stronger Semantics ​for Low-Latency Geo-Replicated Storage", ​NSDI 2013.]] 
 + 
 +=== 10/9/2014 === 
 +Amirali'​s Literature Survey 
 +  * [[http://www.cs.utah.edu/​events/​thememoryforum/​mike.pdf|Connor et al., "​Highlights of the high-bandwidth memory (HBM) standard",​ Memory Forum 2014.]] 
 +  * [[https://​cs.uwaterloo.ca/~brecht/courses/702/Possible-Readings/​multiprocessor/​tlb-consistency-computer-1990.pdf|Teller ​et al., "Translation-Lookaside Buffer Consistency", ​Computer 1990.]] 
 +  * [[http://research.cs.wisc.edu/multifacet/papers/isca13_direct_segment.pdf|Basu et al., "Efficient virtual memory support in big memory servers", ISCA 2013.]]
 + 
 +Jiyuan'​s Paper Discussion: (required 1 out of 3 reviews) 
 +  ​* **[[http://​mercury.pr.erau.edu/​~davisb22/​papers/​burst_scheduling_hpca13.pdf|Shao et al., "A Burst Scheduling Access Reordering Mechanism",​ HPCA 2007.]]** 
 +    ​* [[http://​users.ece.cmu.edu/​~omutlu/​pub/​dram-aware-caches-TR-HPS-2010-002.pdf|Lee et al., "DRAM-Aware Last-Level Cache Writeback: Reducing Write-Caused Interference in Memory Systems", ​HPS 2010.]] 
 +    * [[http://​lca.ece.utexas.edu/​people/​kaseridis/​papers/​ISCA_2010.pdf]] 
 +    * [["FIRM: Fair and High-Performance Memory Control for Persistent Memory Systems"]]
 +    * [[Adaptive History-Based Memory Schedulers]] 
 +    * [["​Self Optimizing Memory Controllers:​ A Reinforcement Learning Approach"​]] 
 +  ​* **[[http://www.cs.cmu.edu/~chensm/LBA_reading_group/papers/​3Ddram-isca08.pdf|Loh et al., "3D-Stacked Memory Architectures ​for Multi-Core Processors",​ ISCA 2008.]]** 
 +    * [[http://​users.ece.gatech.edu/​~moin/​papers/​micro12.pdf|Qureshi et al., "​Fundamental Latency Trade-off ​in Architecting DRAM Caches: Outperforming Impractical SRAM-Tags with a Simple and Practical Design", MICRO 2012.]] 
 +    * [[http://sampa.cs.washington.edu/papers/micro06_mshr.pdf|Tuck et al., "​Scalable Cache Miss Handling for High Memory Level Parallelism",​ MICRO 2006.]] 
 +    * [[http://comparch.gatech.edu/hparch/papers/sim_isca13.pdf|Sim et al., "Resilient die-stacked DRAM caches",​ ISCA 2013.]] 
 +  * **[[http://​www.dongpingzhang.com/​wordpress/​wp-content/​uploads/​2013/​06/​MSPC6-Zhang.pdf|Zhang et al., "A New Perspective on Processing-in-memory Architecture Design", ​MSPC 2013.]]** 
 +    * [[https://​www.google.com/​url?​sa=t&rct=j&q=&​esrc=s&​source=web&​cd=1&​ved=0CB4QFjAA&​url=http%3A%2F%2Fisca2010.inria.fr%2Fmedia%2Fslides%2FISCA_Needle_A_0610.pptx&​ei=9kE9VMOuMtLCsATIuIGQCQ&​usg=AFQjCNGfI_qA9tHBnR8pJo50uRNYvgVEBw&​sig2=OTCWpdAXmMmekm9jun8uBg&​bvm=bv.77161500,​d.cWc&​cad=rja|Dally et al., "​Moving the needle Computer Architecture Research in Academe and Industry",​ ISCA keynote 2010.]] 
 + 
 +=== 10/7/2014 === 
 +Hui: (required 1 out of 3 reviews) 
 +  ​* **[[http://tinker.cc.gatech.edu/pdfs/MICRO44_Jesse_Beu.pdf|Beu et al., "​Manager-Client Pairing: A Framework for Implementing Coherence Hierarchies",​ MICRO 2011.]]** 
 +  * **[[http://research.cs.wisc.edu/multifacet/papers/hpca14_quick_release.pdf|Hechtman et al., "Quick Release: A Throughput-oriented Approach to Release Consistency on GPUs", HPCA 2014.]]**
 +  ​* **[[http://dl.acm.org/​citation.cfm?​id=2541982|Voskuilen et al., "​High-Performance Fractal Coherence",​ ASPLOS 2014.]]** 
 + 
 + 
 +=== 10/2/2014 === 
 +Required review for the Memory Forum paper: 
 +^ Due 9/28/2014 | [[http://​www.cs.utah.edu/​events/​thememoryforum/kang.pdf|Kang et al., "Co-Architecting Controllers ​and DRAM to Enhance DRAM Process Scaling", ​Memory Forum 2014.]] | 
 +^ | [[http://​users.ece.cmu.edu/​~omutlu/​pub/​salp-dram_isca12.pdf|Kim et al., "
readings.txt · Last modified: 2014/12/03 21:12 by yixinluo