Differences

This shows you the differences between two versions of the page.

--- readings [2014/09/24 15:37] yixinluo
+++ readings [2014/12/03 21:12] yixinluo
@@ Line 5: / Line 5: @@
 Note that the reviews are due at 11:59 PM on the due date.
-==== Reading List (now in reverse order) ====
+==== Reading List (in reverse order) ====
+=== 12/3/2014 ===
+  * Christian Jacobi, Timothy J. Slegel, Dan F. Greiner: Transactional Memory Architecture and Implementation for IBM System Z. MICRO 2012.
+  * Jae-Woong Chung, Luke Yen, Stephan Diestelhorst, Martin Pohlack, Michael Hohmuth, David Christie, Dan Grossman: ASF: AMD64 Extension for Lock-Free Data Structures and Transactional Memory. MICRO 2010.
+  * Jack B. Dennis, David Misunas: A Preliminary Architecture for a Basic Data Flow Processor. ISCA 1974.
+  * James E. Smith, G. E. Dermer, B. D. Vanderwarn, S. D. Klinger, C. M. Rozewski, D. L. Fowler, K. R. Scidmore, James Laudon: The ZS-1 Central Processor. ASPLOS 1987.
+  * James E. Smith: Decoupled access/execute computer architectures. ISCA 1982.
-=== 9/25/2014 ===
+=== 11/18/2014 ===
-Doru's literature survey:
+Reviews required for the following two papers, due on Monday, Nov 17.
-  * [[http://hps.ece.utexas.edu/pub/morphcore_micro2012.pdf|Khubaib et al., "MorphCore: An Energy-Efficient Microarchitecture for High Performance ILP and High Throughput TLP", MICRO 2012.]]
+  * **Justin Meza et al., "Memory Errors at Scale: What the Trends Across a Billion-User Web Services Company Foretell", under submission. (Sent through email, please do not distribute)**
-  * [[http://users.eecs.northwestern.edu/~rjoseph/eecs453/papers/quereshi-micro2006.pdf|Qureshi et al., "Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches", MICRO 2006.]]
+  * **[[https://www.usenix.org/legacy/event/sec08/tech/full_papers/halderman/halderman.pdf|J. Alex Halderman et al., "Lest We Remember: Cold Boot Attacks on Encryption Keys", USENIX Security Symposium 2008.]]**
-  * [[http://users.elis.ugent.be/~leeckhou/papers/isca13.pdf|Bois et al., "Criticality Stacks: Identifying Critical Threads in Parallel Programs using Synchronization Behavior", ISCA 2013.]]
-Yang's literature survey:
-=== 9/24/2014 ===
+=== 11/13/2014 ===
-Kevin's literature survey:
+Reviews required for both papers, due on Wednesday, Nov 12.
-  * [[http://users.ece.cmu.edu/~omutlu/pub/raidr-dram-refresh_isca12.pdf|Liu et al., "RAIDR: Retention-Aware Intelligent DRAM Refresh", ISCA 2012.]]
+  * **[[http://dl.acm.org/citation.cfm?id=2503257|Sridharan et al., "Feng shui of supercomputer memory: positional effects in DRAM and SRAM faults", SC 2013.]]**
-  * [[http://users.ece.cmu.edu/~omutlu/pub/dram-retention-time-characterization_isca13.pdf|Liu et al., "An Experimental Study of Data Retention Behavior in Modern DRAM Devices: Implications for Retention Time Profiling Mechanisms", ISCA 2013.]]
+    * [[https://www.cs.cmu.edu/~bianca/fast07.pdf|Bianca Schroeder, Garth A. Gibson, "Disk Failures in the Real World: What Does an MTTF of 1,000,000 Hours Mean to You?", FAST 2007.]]
-  * [[http://www.ece.cmu.edu/~safari/pubs/error-mitigation-for-intermittent-dram-failures_sigmetrics14.pdf|Khan et al., "The Efficacy of Error Mitigation Techniques for DRAM Retention Failures: A Comparative Experimental Study", SIGMETRICS 2014.]]
+    * [[http://www.pdl.cmu.edu/PDL-FTP/associated/dsn06.pdf|Bianca Schroeder, Garth A. Gibson, "A large-scale study of failures in high-performance computing systems", DSN 2006.]]
+    * [[http://www.cs.toronto.edu/~hwang/papers/asplos2012.pdf|Andy A. Hwang, et al., "Cosmic rays don't strike twice: understanding the nature of DRAM errors and the implications for system design", ASPLOS 2012.]]
+    * [[http://static.googleusercontent.com/media/research.google.com/en/us/pubs/archive/35162.pdf|Bianca Schroeder et al., "DRAM Errors in the Wild: A Large-Scale Field Study", SIGMETRICS 2009.]]
+  * **[[https://www.cs.princeton.edu/~appel/papers/memerr.pdf|Sudhakar Govindavajhala, Andrew W. Appel, "Using Memory Errors to Attack a Virtual Machine", SP 2003.]]**
+    * [[https://www.usenix.org/legacy/event/sec08/tech/full_papers/halderman/halderman.pdf|J. Alex Halderman et al., "Lest We Remember: Cold Boot Attacks on Encryption Keys", USENIX Security Symposium 2008.]]
-Amirali's literature survey:
+=== 11/12/2014 ===
+Reviews required for both papers, due on Tuesday, Nov 11.
+  * **[[http://users.ece.utexas.edu/~merez/vecc_asplos_2010.pdf|Doe Hyun Yoon, Mattan Erez, "Virtualized and flexible ECC for main memory", ASPLOS 2010.]]**
+    * [[http://ece.umd.edu/courses/enee759h.S2003/references/ibm_chipkill.pdf|T. J. Dell, “A White Paper on the Benefits of Chipkill-Correct ECC for PC Server Main Memory,” IBM Microelectronics Division, 1997.]]
+    * [[http://static.googleusercontent.com/media/research.google.com/en/us/pubs/archive/35162.pdf|Bianca Schroeder et al., "DRAM Errors in the Wild: A Large-Scale Field Study", SIGMETRICS 2009.]]
+    * [[http://research.microsoft.com/pubs/144888/eurosys84-nightingale.pdf|Edmund B. Nightingale et al., "Cycles, cells and platters: an empirical analysis of hardware failures on a million consumer PCs", EuroSys 2011.]]
+  * **[[http://passat.crhc.illinois.edu/rakeshk/hpca13.pdf|Xun Jian, Rakesh Kumar, "Adaptive Reliability Chipkill Correct (ARCC)", HPCA 2013.]]**
-=== 9/23/2014 ===
+=== 11/5/2014 ===
-Hui's literature survey:
+=== 10/31/2014 ===
-  * [[https://www.ece.ubc.ca/~aamodt/papers/isingh.hpca2013.pdf|Singh et al., "Cache Coherence for GPU Architectures", HPCA 2013.]]
+Review required for the following two papers, due on Friday, Oct 31.
-  * [[http://research.cs.wisc.edu/multifacet/papers/micro13_hsc.pdf|Power et al., "Heterogeneous System Coherence for Integrated CPU-GPU Systems", MICRO 2013.]]
+  * **[[http://www.cs.utexas.edu/~pingali/CS395T/2009fa/lectures/herlihy93transactional.pdf|Maurice Herlihy and J. Eliot B. Moss, "Transactional Memory: Architectural Support for Lock-Free Data Structures", ISCA 1993.]]**
-  * [[http://users.crhc.illinois.edu/djohns53/pub/cohesion-isca2010.pdf|Kelm et al., "Cohesion: A Hybrid Memory Model for Accelerators", ISCA 2010.]]
+    * [[http://web.mit.edu/~mmt/Public/Knight86.pdf|Tom Knight, "An architecture for mostly functional languages", LFP 1986.]]
+    * [[http://www.cs.rice.edu/~alc/old/comp520/papers/SW91.pdf|Frank Schmuck, Jim Wyllie, "Experience with transactions in QuickSilver", SOSP 1991.]]
+    * [[http://cs.brown.edu/~mph/AspnesH90/p340-aspnes.pdf|James Aspnes, Maurice Herlihy, "Wait-Free Data Structures in the Asynchronous PRAM Model", SPAA 1990.]]
+    * [[http://cs.brown.edu/~mph/Herlihy91/p124-herlihy.pdf|Maurice Herlihy, "Wait-free synchronization", TOPLAS 1991.]]
+    * [[http://www.cs.utexas.edu/~pingali/CS395T/2009fa/lectures/herlihy93transactional.pdf|Maurice Herlihy, J. Eliot B. Moss, "Transactional Memory: Architectural Support for Lock-Free Data Structures", ISCA 1993.]]
+    * [[http://csl.stanford.edu/~christos/publications/2006.bliss.taco.pdf|Ahmad Zmily and Christos Kozyrakis, "Block-Aware Instruction Set Architecture", TACO 2006.]]
+    * [[http://web.stanford.edu/class/cs343/resources/crusoe.pdf|Alexander Klaiber, "The Technology Behind Crusoe™ Processors", 2000.]]
+    * [[http://courses.cs.vt.edu/cs5204/fall11-kafura/Papers/TransactionalMemory/TM-Book-V2.pdf.pdf|J. Larus and R. Rajwar. Transactional Memory. Synthesis Lectures on Computer Architecture (Ch. 1 & 2).]]
+    * [[http://www.cs.binghamton.edu/~dima/cs580a/spec_wake_micro00.pdf|Jared Stark, Mary D. Brown, Yale N. Patt, "On pipelining dynamic instruction scheduling logic", MICRO 2000.]]
+    * [[http://www.christianjacobi.de/publications/jsg12_tx.pdf|Christian Jacobi, et al., "Transactional Memory Architecture and Implementation for IBM System Z", MICRO 2012.]]
+    * [[http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=931895|Sanjay J. Patel et al., “rePLay: a hardware framework for dynamic optimization,” IEEE TC, June 2001.]]
+  * **[[http://www.cs.cmu.edu/~tcm/tcm_papers/isca00.pdf|J. Gregory Steffan et al., "A Scalable Approach to Thread-Level Speculation", ISCA 2000.]]**
+    * [[ftp://ftp.cs.wisc.edu/sohi/papers/1992/isca.esw.pdf|Manoj Franklin, Gurindar S. Sohi, "The expandable split window paradigm for exploiting fine-grain parallelism", ISCA 1992.]]
+    * [[ftp://ftp.cs.wisc.edu/sohi/papers/1995/isca.multiscalar.pdf|Sohi et al., “Multiscalar Processors,” ISCA 1995.]]
+    * [[http://www.cs.cmu.edu/~colohan/papers/tls_isca00.pdf|Steffan et al., “A Scalable Approach to Thread-Level Speculation,” ISCA 2000.]]
+    * [[https://homes.cs.washington.edu/~luisceze/publications/isca06_bulk.pdf|Luis Ceze et al., "Bulk Disambiguation of Speculative Threads in Multiprocessors", ISCA 2006.]]
+    * [[http://www.princeton.edu/~rblee/ELE572Papers/DynamicMultithreadingProc_akkary.pdf?q=tilde/rblee/ELE572Papers/DynamicMultithreadingProc_akkary.pdf|Akkary and Driscoll, “A dynamic multithreading processor,” MICRO 1998.]]
+Required videos for module 2.5.* in [[http://www.ece.cmu.edu/~ece740/f13/doku.php?id=schedule#schedule|18-740]]:
+| 9/25 Wed. | 2.5.1 Speculation | [[http://www.ece.cmu.edu/~ece742/f14/files/onur-740-fall13-module2.5-speculation.pdf|pdf]], [[http://www.ece.cmu.edu/~ece742/f14/files/onur-740-fall13-module2.5-speculation.pptx|pptx]], [[https://www.youtube.com/watch?v=g3IF8DTtr8c|YouTube]] [[http://cmu.vid.acatar.com/Panopto/Pages/Viewer/Default.aspx?id=dbb3baf9-c85e-4007-8c71-1e3204fe9907|Panopto]] | [[http://www.ece.cmu.edu/~ece740/f13/doku.php?id=readings#module_2-5|readings]] |
+| 9/27 Fri. | 2.5.2 Speculation | [[https://www.youtube.com/watch?v=FqHk4bxrI8Y|video]] [[http://cmu.vid.acatar.com/Panopto/Pages/Viewer/Default.aspx?id=bf92ac2b-5fb5-4896-9bbd-6cb216a16cdd|Panopto]] | ::: |
+| 9/30 Mon. | 2.5.3 Speculation | [[https://www.youtube.com/watch?v=uhyNTy8hvDs|video]] [[http://cmu.vid.acatar.com/Panopto/Pages/Viewer/Default.aspx?id=7dbc66b0-b245-45ba-b047-09a84427676a|Panopto]] | ::: |
+| ::: | 2.5.4 Speculation | [[https://www.youtube.com/watch?v=McMyefc8CCE|video]] [[http://cmu.vid.acatar.com/Panopto/Pages/Viewer/Default.aspx?id=e9eddf7c-c5c0-4ab4-a731-30852d601506|Panopto]] | ::: |
+Related readings:
+  * [[ftp://ftp.cs.wisc.edu/sohi/papers/1995/isca.multiscalar.pdf|Sohi et al., “Multiscalar Processors,” ISCA 1995.]]
+  * [[http://classes.soe.ucsc.edu/cmpe202/Spring13/papers/12a.pdf|Zhou, “Dual-Core Execution: Building a Highly Scalable Single-Thread Instruction Window,” PACT 2005.]]
+  * [[http://pages.cs.wisc.edu/~rajwar/papers/micro01.pdf|Rajwar and Goodman, “Speculative Lock Elision: Enabling Highly Concurrent Multithreaded Execution,” MICRO 2001.]]
-Jiyuan's literature survey:
+=== 10/16/2014 ===
-  * **[[http://cseweb.ucsd.edu/~swanson/papers/ASPLOS2011Prefetching.pdf|Kamruzzaman et al., "Inter-core Prefetching for Multicore Processors Using Migrating Helper Threads", ASPLOS 2011.]]**
+Consistency II -- Review required (one out of two) due on Wednesday night
-    * [[http://www.cs.ucf.edu/~zhou/dce_pact_2005_ieee.pdf|Zhou et al., "Dual-Core Execution: Building a Highly Scalable Single-Thread Instruction Window", PACT 2005.]]
+  * **[[http://www.eecg.toronto.edu/~moshovos/research/store-wait-free.pdf|Wenisch et al., "Mechanisms for Store-wait-free Multiprocessors", ISCA 2007.]]**
-    * [[http://people.engr.ncsu.edu/ericro/publications/conference_ASPLOS-9.pdf|Sundaramoorthy et al., "Slipstream Processors: Improving both Performance and Fault Tolerance", ASPLOS 2000.]]
+    * [[http://www.cs.utexas.edu/~pingali/CS395T/2009fa/lectures/herlihy93transactional.pdf|Herlihy et al., "Transactional Memory: Architectural Support for Lock-Free Data Structures", ISCA 1993.]]
-  * **[[http://www.cs.utah.edu/wondp/sqrl.pdf|Kumar et al., "SQRL: Hardware Accelerator for Collecting Software Data Structures", PACT 2014.]]**
+    * [[http://www.cs.cmu.edu/~tcm/tcm_papers/isca00.pdf|Steffan et al., "A Scalable Approach to Thread-Level Speculation", ISCA 2000.]]
-  * **[[http://www.cse.ust.hk/catalac/papers/scatter_sc07.pdf|He et al., "Efficient Gather and Scatter Operations on Graphics Processors", SC 2007.]]**
+  * **[[https://homes.cs.washington.edu/~luisceze/publications/isca07_bulksc.pdf|Ceze et al., "BulkSC: Bulk Enforcement of Sequential Consistency", ISCA 2007.]]**
-    * [[http://www.cs.utah.edu/~ald/pubs/hpca99.pdf|Carter et al., "Impulse: Building a Smarter Memory Controller", HPCA 1999.]]
+    * [[https://homes.cs.washington.edu/~luisceze/publications/isca06_bulk.pdf|Ceze et al., "Bulk Disambiguation of Speculative Threads in Multiprocessors", ISCA 2006.]]
-    * [[http://www.cs.utah.edu/~rajeev/pubs/asplos10.pdf|Sudan et al., "Micro-Pages: Increasing DRAM Efficiency with Locality-Aware Data Placement", ASPLOS 2010.]]
+    * [[http://www.ece.cmu.edu/~ece740/f13/lib/exe/fetch.php?media=rajwar01.pdf|Rajwar and Goodman, “Speculative Lock Elision: Enabling Highly Concurrent Multithreaded Execution”, MICRO 2001.]]
+    * [[http://www.ece.cmu.edu/~ece740/f13/lib/exe/fetch.php?media=herlihy93.pdf|Herlihy and Moss, “Transactional Memory: Architectural Support for Lock-Free Data Structures”, ISCA 1993.]]
-=== 9/18/2014 ===
+=== 10/14/2014 ===
-  * **[[http://users.ece.cmu.edu/~omutlu/pub/stfm_micro07.pdf|Mutlu et al., "Stall-Time Fair Memory Access Scheduling for Chip Multiprocessors", MICRO 2007.]]**
+Background lecture and paper (required reading, no need to review):
-    * [[http://users.elis.ugent.be/~leeckhou/papers/isca12-2.pdf|Craeynest et al., "Scheduling Heterogeneous Multi-Cores through Performance Impact Estimation (PIE)", ISCA 2012.]]
+  * [[https://www.youtube.com/watch?v=Mq24MXW4g3U|Consistency & Coherence Lecture]]
-    * [[http://mprc.pku.edu.cn/~liuxianhua/chn/corpus/Notes/articles/isca/ISCA2002/p47.pdf|Fields et al., "Slack: Maximizing Performance Under Technological Constraints", ISCA 2002.]]
+  * [[http://courses.cs.washington.edu/courses/cse548/10wi/Lamport.pdf|Leslie Lamport, "How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs", IEEE Trans. Computers 28(9): 690-691 (1979)]]
-    * [[http://cadal3.cse.nsysu.edu.tw/seminar/seminar_file/2002/10/Focusing%20processor%20policies%20via%20critical-path%20prediction.pdf|Fields et al., "Focusing Processor Policies via Critical-Path Prediction", ISCA 2001.]]
+Consistency I -- Required review (one out of two) due on Monday night
-    * [[http://www.ece.ncsu.edu/arpers/Papers/faircaching.pdf|Kim et al., "Fair Cache Sharing and Partitioning in a Chip Multiprocessor Architecture", PACT 2004.]]
+  * **[[http://hpc.cs.tsinghua.edu.cn/research/zwm/reading/prof/2a.pdf|Gharachorloo et al., "Memory Consistency and Event Ordering in Scalable Shared-Memory Multiprocessors", ISCA 1990.]]**
-  * **[[ftp://ftp-sop.inria.fr/maestro/Sigmetrics-Performance-2012-papers-and-posters/p295.pdf|Xu et al., "Providing Fairness on Shared-Memory Multiprocessors via Process Scheduling", SIGMETRICS 2012.]]**
+  * **[[https://courses.engr.illinois.edu/cs533/sp2012/reading_list/gharachorloo91two.pdf|Gharachorloo et al., "Two Techniques to Enhance the Performance of Memory Consistency Models", ICPP 1991.]]**
-    * [[http://cseweb.ucsd.edu/~calder/papers/ASPLOS-02-SimPoint.pdf|Sherwood et al., "Automatically characterizing large scale program behavior", ASPLOS 2002.]]
+    * [[http://www.csd.uoc.gr/~hy460/pdf/kung.pdf|Kung et al., "On Optimistic Methods for Concurrency Control", TODS 1981.]]
-    * [[http://cseweb.ucsd.edu/~calder/papers/ISCA-03-Phase.pdf|Sherwood et al., "Phase Tracking and Prediction", ISCA 2003.]]
+    * [[https://engineering.purdue.edu/~vpai/Publications/ranganathan-spaa97.pdf|Ranganathan et al., "Using Speculative Retirement and Larger Instruction Windows to Narrow the Performance Gap Between Memory Consistency Models", SPAA 1997.]]
-    * [[http://www.csl.cornell.edu/~martinez/doc/isca13-ghose.pdf|Ghose et al., "Improving memory scheduling via processor-side load criticality information", ISCA 2013.]]
+    * [[http://www.cs.arizona.edu/~gniady/papers/isca99_scrc.pdf|Gniady et al., "Is SC + ILP = RC?", ISCA 1999.]]
-    * [[http://users.eecs.northwestern.edu/~rjoseph/eecs453/papers/quereshi-micro2006.pdf|Qureshi et al., "Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches", MICRO 2006.]]
+Recommended book
-    * [[http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=165388|Stone et al., "Optimal partitioning of cache memory", IEEE Trans. 1992.]]
+  * [[http://www.morganclaypool.com/doi/pdfplus/10.2200/S00346ED1V01Y201104CAC016|A Primer on Memory Consistency and Cache Coherence, Chapters 1, 3, 4, 5]]
-  * **[[http://users.ece.cmu.edu/~omutlu/pub/fairness-via-throttling_acm_tocs12.pdf|Ebrahimi et al., "Fairness via Source Throttling: A Configurable and High-Performance Fairness Substrate for Multi-Core Memory Systems", TOCS 2012.]]**
+Further reading in a different area (system level consistency)
+  * [[http://sns.cs.princeton.edu/docs/eiger-nsdi13.pdf|Lloyd et al., "Stronger Semantics for Low-Latency Geo-Replicated Storage", NSDI 2013.]]
-=== 9/16/2014 ===
+=== 10/9/2014 ===
-  * **[[http://users.ece.cmu.edu/~omutlu/pub/rowclone_micro13.pdf|Seshadri et al., "RowClone: Fast and Efficient In-DRAM Copy and Initialization of Bulk Data", MICRO 2013.]]**
+Amirali's Literature Survey
-    * [[http://scale.eecs.berkeley.edu/papers/mmp-asplos2002.pdf|Witchel et al., "Mondrian Memory Protection", ASPLOS 2002.]]
+  * [[http://www.cs.utah.edu/events/thememoryforum/mike.pdf|Connor et al., "Highlights of the high-bandwidth memory (HBM) standard", Memory Forum 2014.]]
-    * [[http://www.info.uni-karlsruhe.de/lehre/2002SS/uebau2/papers/ChilimbiHillLarus-1999.pdf|Chilimbi et al., "Cache-Conscious Structure Layout", PLDI 1999.]]
+  * [[https://cs.uwaterloo.ca/~brecht/courses/702/Possible-Readings/multiprocessor/tlb-consistency-computer-1990.pdf|Teller et al., "Translation-Lookaside Buffer Consistency", Computer 1990.]]
-    * [[http://dl.acm.org/citation.cfm?id=301635|Chilimbi et al., "Cache-conscious structure definition", PLDI 1999.]]
+  * [[http://research.cs.wisc.edu/multifacet/papers/isca13_direct_segment.pdf|Basu et al., "Efficient virtual memory support in big memory servers", ISCA 2013.]]
-    * [[http://www.cs.tufts.edu/comp/150CMP/papers/chilimbi02prefetching.pdf|Chilimbi et al., "Dynamic Hot Data Stream Prefetching for General-Purpose Programs", PLDI 2002.]]
-    * [[http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=4115697|Kogge et al., "EXECUBE - A New Architecture for Scalable MPPs", ICPP 1994.]]
-    * [[http://www.ai.mit.edu/projects/aries/course/notes/terasys.pdf|Gokhale et al., "Processing in memory: The Terasys massively parallel PIM array." Computer 28.4 1995.]]
-    * [[http://www.eecg.toronto.edu/~dunc/cram/#Bibliography|The Computational RAM (C-RAM) project.]]
-    * [[http://pages.cs.wisc.edu/~isca2005/papers/04B-03.PDF|Cantin et al., "Improving Multiprocessor Performance with Coarse-Grain Coherence Tracking", ISCA 2005.]]
-  * **[[http://users.ece.cmu.edu/~omutlu/pub/bdi-compression_pact12.pdf|Pekhimenko et al., "Base-Delta-Immediate Compression: Practical Data Compression for On-Chip Caches", PACT 2012.]]**
-    * [[http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6657054|Chen et al., "Free ECC: An efficient error protection for compressed last-level caches", ICCD 2013.]]
-    * [[http://taco.cse.tamu.edu/pdfs/p53-tian.pdf|Tian et al., "Last-Level Cache Deduplication", ICS 2014.]]
-    * [[http://dl.acm.org/citation.cfm?id=2370864|Sathish et al., "Lossless and lossy memory I/O link compression for improving performance of GPGPU workloads", PACT 2012.]]
-  * **[[http://users.ece.cmu.edu/~omutlu/pub/linearly-compressed-pages_micro13.pdf|Pekhimenko et al., "Linearly Compressed Pages: A Low-Complexity, Low-Latency Main Memory Compression Framework", MICRO 2013.]]**
-=== 9/11/2014 ===
+Jiyuan's Paper Discussion: (required 1 out of 3 reviews)
-  * **[[http://users.elis.ugent.be/~seyerman/ISCA10.pdf|Eyerman et al., "Modeling critical sections in Amdahl’s law and its implications for multicore design", ISCA 2010.]]**
+  * **[[http://mercury.pr.erau.edu/~davisb22/papers/burst_scheduling_hpca13.pdf|Shao et al., "A Burst Scheduling Access Reordering Mechanism", HPCA 2007.]]**
-    * [[http://research.cs.wisc.edu/multifacet/papers/tr1593_amdahl_multicore.pdf|Hill et al., "Amdahl’s Law in the Multicore Era", HPCA 2008.]]
+    * [[http://users.ece.cmu.edu/~omutlu/pub/dram-aware-caches-TR-HPS-2010-002.pdf|Lee et al., "DRAM-Aware Last-Level Cache Writeback: Reducing Write-Caused Interference in Memory Systems", HPS 2010.]]
-  * **[[http://users.ece.cmu.edu/~omutlu/pub/bottleneck-identification-and-scheduling_asplos12.pdf|Joao et al., "Bottleneck Identification and Scheduling in Multithreaded Applications", ASPLOS 2012.]]**
+    * [[http://lca.ece.utexas.edu/people/kaseridis/papers/ISCA_2010.pdf]]
-    * [[http://users.ece.cmu.edu/~omutlu/pub/acs_asplos09.pdf|Suleman et al., "Accelerating Critical Section Execution with Asymmetric Multi-Core Architectures", ASPLOS 2009.]]
+    * [["FIRM: Fair and High-Performance Memory Control for Persistent Memory Systems"]]
-    * [[http://www.ann.ece.ufl.edu/courses/eel6686_14spr/papers/MeetingPointsUsingThreadCriticalitytToAdaptToMulticoreHardwareToParallelRegions.pdf|Cai et al., "Meeting Points: Using Thread Criticality to Adapt Multicore Hardware to Parallel Regions", PACT 2008.]]
+    * [[Adaptive History-Based Memory Schedulers]]
-    * [[http://users.ece.cmu.edu/~omutlu/pub/srinath_hpca07.pdf|Srinath et al., "Feedback Directed Prefetching: Improving the Performance and Bandwidth-Efficiency of Hardware Prefetchers", HPCA 2007.]]
+    * [["Self Optimizing Memory Controllers: A Reinforcement Learning Approach"]]
-    * [[http://mrmgroup.cs.princeton.edu/papers/abhattac-isca2009.pdf|Bhattacharjee et al., "Thread Criticality Predictors for Dynamic Performance, Power, and Resource Management in Chip Multiprocessors", ISCA 2009.]]
+  * **[[http://www.cs.cmu.edu/~chensm/LBA_reading_group/papers/3Ddram-isca08.pdf|Loh et al., "3D-Stacked Memory Architectures for Multi-Core Processors", ISCA 2008.]]**
-    * [[http://users.ece.cmu.edu/~omutlu/pub/dm_isca10.pdf|Suleman et al., "Data Marshaling for Multi-core Systems", ISCA 2010.]]
+    * [[http://users.ece.gatech.edu/~moin/papers/micro12.pdf|Qureshi et al., "Fundamental Latency Trade-off in Architecting DRAM Caches: Outperforming Impractical SRAM-Tags with a Simple and Practical Design", MICRO 2012.]]
-  * **[[http://users.ece.cmu.edu/~omutlu/pub/mise-predictable_memory_performance-hpca13.pdf|Subramanian et al., "MISE: Providing Performance Predictability and Improving Fairness in Shared Main Memory Systems", HPCA 2013.]]**
+    * [[http://sampa.cs.washington.edu/papers/micro06_mshr.pdf|Tuck et al., "Scalable Cache Miss Handling for High Memory Level Parallelism", MICRO 2006.]]
-    * [[http://hps.ece.utexas.edu/pub/morphcore_micro2012.pdf|Khubaib et al., "MorphCore: An Energy-Efficient Microarchitecture for High Performance ILP and High Throughput TLP", MICRO 2012.]]
+    * [[http://comparch.gatech.edu/hparch/papers/sim_isca13.pdf|Sim et al., "Resilient die-stacked DRAM caches", ISCA 2013.]]
-    * [[http://m3.csl.cornell.edu/papers/isca07.pdf|Ipek et al., "Core Fusion: Accommodating Software Diversity in Chip Multiprocessors", ISCA 2007.]]
+  * **[[http://www.dongpingzhang.com/wordpress/wp-content/uploads/2013/06/MSPC6-Zhang.pdf|Zhang et al., "A New Perspective on Processing-in-memory Architecture Design", MSPC 2013.]]**
-    * [[http://www.istc-cc.cmu.edu/publications/papers/2013/joao_isca13_preprint.pdf|Joao et al., "Utility-Based Acceleration of Multithreaded Applications on Asymmetric CMPs", ISCA 2013.]]
+    * [[https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&ved=0CB4QFjAA&url=http%3A%2F%2Fisca2010.inria.fr%2Fmedia%2Fslides%2FISCA_Needle_A_0610.pptx&ei=9kE9VMOuMtLCsATIuIGQCQ&usg=AFQjCNGfI_qA9tHBnR8pJo50uRNYvgVEBw&sig2=OTCWpdAXmMmekm9jun8uBg&bvm=bv.77161500,d.cWc&cad=rja|Dally et al., "Moving the needle Computer Architecture Research in Academe and Industry", ISCA keynote 2010.]]
-    * [[http://cseweb.ucsd.edu/~calder/papers/ISCA-03-Phase.pdf|Sherwood et al., "Phase Tracking and Prediction", ISCA 2003.]]
-    * [[http://www.cs.rochester.edu/~ipek/micro08.pdf|Bitirgen et al., "Coordinated Management of Multiple Interacting Resources in Chip Multiprocessors: A Machine Learning Approach", MICRO 2008.]]
-=== 9/9/2014 ===
+=== 10/7/2014 ===
-Papers discussed in class (and their related papers):
+Hui: (required 1 out of 3 reviews)
-  * **[[http://users.ece.cmu.edu/~omutlu/pub/dirty-block-index_isca14.pdf|Seshadri et al., "The Dirty-Block Index", ISCA 2014.]]**
+  * **[[http://tinker.cc.gatech.edu/pdfs/MICRO44_Jesse_Beu.pdf|Beu et al., "Manager-Client Pairing: A Framework for Implementing Coherence Hierarchies", MICRO 2011.]]**
-    * [[http://users.ece.utexas.edu/~merez/vecc_asplos_2010.pdf|Yoon et al., "Virtualized and Flexible ECC for Main Memory", ASPLOS 2010.]]
+  * **[[http://research.cs.wisc.edu/multifacet/papers/hpca14_quick_release.pdf|Hechtman et al., "Quick Release: A Throughput-oriented Approach to Release Consistency on GPUs", HPCA 2014.]]**
-    * [[https://www.cs.sfu.ca/~ashriram/publications/2012_MICRO_AmoebaCache.pdf|Kumar et al., "Amoeba-Cache: Adaptive Blocks for Eliminating Waste in the Memory Hierarchy", MICRO 2012.]]
+  * **[[http://dl.acm.org/citation.cfm?id=2541982|Voskuilen et al., "High-Performance Fractal Coherence", ASPLOS 2014.]]**
-    * [[http://pages.cs.wisc.edu/~isca2005/papers/04B-03.PDF|Cantin et al., "Improving Multiprocessor Performance with Coarse-Grain Coherence Tracking", ISCA 2005.]]
-  * **[[http://users.ece.cmu.edu/~omutlu/pub/staged-memory-scheduling_isca12.pdf|Ausavarungnirun et al., "Staged Memory Scheduling: Achieving High Performance and Scalability in Heterogeneous Systems", ISCA 2012.]]**
-    * [[https://www.usenix.org/legacy/publications/library/proceedings/osdi/full_papers/waldspurger.pdf|Waldspurger et al., "Lottery Scheduling: Flexible Proportional-Share Resource Management", OSDI 1994.]]
-    * [[http://www.eecg.toronto.edu/~moshovos/ACA05/read/complexity.pdf|Palacharla et al., "Complexity-Effective Superscalar Processors", ISCA 1997.]]
-    * [[http://users.ece.cmu.edu/~omutlu/pub/parbs_isca08.pdf|Mutlu et al., "Parallelism-Aware Batch Scheduling: Enabling High-Performance and Fair Memory Controllers", ISCA 2008.]]
-    * [[http://cfall.in/pubs/micro2011_pams.pdf|Fallin et al., "Parallel Application Memory Scheduling", MICRO 2011.]]
-  * **[[http://users.ece.cmu.edu/~omutlu/pub/dram-access-refresh-parallelization_hpca14.pdf|Chang et al., "Improving DRAM Performance by Parallelizing Refreshes with Accesses", HPCA 2014.]]**
-=== 9/3/2014 ===
-| DRAM arch. | [[http://users.ece.cmu.edu/~omutlu/pub/salp-dram_isca12.pdf|Kim et al., "A Case for Exploiting Subarray-Level Parallelism (SALP) in DRAM", ISCA 2012.]] |
-| DRAM arch. | [[http://users.ece.cmu.edu/~omutlu/pub/raidr-dram-refresh_isca12.pdf|Liu et al., "RAIDR: Retention-Aware Intelligent DRAM Refresh", ISCA 2012.]] |
-| Flash | [[http://users.ece.cmu.edu/~omutlu/pub/flash-error-analysis-and-management_itj13.pdf|Cai et al., "Error Analysis and Retention-Aware Error Management for NAND Flash Memory", ITJ Vol. 17-1 2013.]] |
-| DRAM reliab. | [[http://users.ece.cmu.edu/~omutlu/pub/error-mitigation-for-intermittent-dram-failures_sigmetrics14.pdf|Khan et al., "The Efficacy of Error Mitigation Techniques for DRAM Retention Failures: A Comparative Experimental Study", SIGMETRICS 2014.]] |
-| Reliability | [[http://www.crhc.illinois.edu/ACS/pub/branchflip.pdf|Wang et al., "Y-Branches: When You Come to a Fork in the Road, Take It", PACT 2003.]] |
-| Reliability | [[http://users.ece.cmu.edu/~omutlu/pub/heterogeneous-reliability-memory-for-data-centers_dsn14.pdf|Luo et al., "Characterizing Application Memory Error Vulnerability to Optimize Data Center Cost", DSN 2014.]] |
-| Security | [[https://www.cs.princeton.edu/~appel/papers/memerr.pdf|Govindavajhala et al., "Using Memory Errors to Attack a Virtual Machine", SP 2003.]] |
-| 3d stacking | [[http://www.cs.cmu.edu/~chensm/LBA_reading_group/papers/3Ddram-isca08.pdf|Loh et al., "3D-Stacked Memory Architectures for Multi-core Processors", ISCA 2008.]] |
-| 3d stacking | [[http://pdf.aminer.org/000/499/580/die_stacking_d_microarchitecture.pdf|Black et al., "Die Stacking (3D) Microarchitecture", MICRO 2006.]] |
-| In mem comp. | [[http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=4115697|Kogge et al., "EXECUBE - A New Architecture for Scalable MPPs", ICPP 1994.]] |
-| In mem comp. | [[http://www.ece.umd.edu/courses/enee759m.S2002/papers/fromm1997-isca24.pdf|Fromm et al., "The Energy Efficiency of IRAM Architectures", ISCA 1997.]] |
-| In mem comp. | [[http://www.eecs.berkeley.edu/~yelick/yelick/iram-micro97.pdf|Patterson et al., "A Case for Intelligent DRAM: IRAM", IEEE Micro 1997.]] |
-=== 8/26/2014 ===
+=== 10/2/2014 ===
-^ Due 9/2/2014 | {{motmoo-springer-chapter-7-30-2014.pdf|Onur Mutlu, "Main Memory Scaling: Challenges and Solution Directions", preprint book Chapter 6, 2014.}} |
+Required review for the Memory Forum paper:
-^ Due 9/6/2014 | Pick 3 papers referenced by the above paper that pique your interest |
+^ Due 9/28/2014 | [[http://www.cs.utah.edu/events/thememoryforum/kang.pdf|Kang et al., "Co-Architecting Controllers and DRAM to Enhance DRAM Process Scaling", Memory Forum 2014.]] |
-^ Due 9/6/2014 | [[http://www.cs.virginia.edu/~robins/YouAndYourResearch.html|Hamming, "You and Your Research," Bell Communications Research Colloquium Seminar, 7 March 1986.]] |
+^ | [[http://users.ece.cmu.edu/~omutlu/pub/salp-dram_isca12.pdf|Kim et al., "A
-| | [[http://web.stanford.edu/class/cs240/readings/lampson-hints.pdf|Butler W. Lampson, "Hints for computer system design", SOSP 1983]] |
-| | [[http://books.google.com/books/about/Inside_the_AS_400.html?id=hJtyAAAACAAJ|Frank Soltis, "Inside the AS/400", 1996]] |
-| | [[http://www.cs.utexas.edu/users/mckinley/notes/reviewing.html|Hill and McKinley, "Notes on Constructive and Positive Reviewing".]] |
-| | [[https://www.usenix.org/legacy/publications/library/proceedings/dsl97/good_paper.html|Levin and Redell, "How (and how not) to write a good systems paper", OSR 1983.]] |
-| | [[http://www.ifs.tuwien.ac.at/~silvia/research-tips/smith-advice.pdf|Alan Jay Smith, “The Task of the Referee”, IEEE Computer 1990.]] |
-| | [[http://research.microsoft.com/en-us/um/people/simonpj/papers/giving-a-talk/writing-a-paper-slides.pdf|Jones, "How to Write a Great Research Paper".]] |
-| | [[http://www2.cs.uregina.ca/~pwlfong/CS499/writing-paper.pdf|Philip W. L. Fong, “How to Write a CS Research Paper: A Bibliography”, 2004.]] |

18-742 Research in Parallel Computer Architecture, Fall 2014
