Differences

This shows you the differences between two versions of the page.

--- readings [2014/02/12 00:18]
rachata
+++ readings [2015/02/12 19:20]
albert
@@ Line 8: / Line 8: @@
   * **P&H** stands for Patterson & Hennessy's //Computer Organization and Design: The Hardware/Software Interface//
-===== Lecture 1 (1/13 Mon.) =====
+====== Guides on how to review papers critically ======
+  * Lecture slides: {{onur-447-s15-how-to-do-the-paper-reviews.pdf | pdf}} {{onur-447-s15-how-to-do-the-paper-reviews.ppt | Slides ppt}}
+  * Example reviews on "Main Memory Scaling: Challenges and Solution Directions" (link to the paper)
+      * {{review-chapter.pdf | Review 1}}
+      * {{review-chapter-2.pdf | Review 2}}
+  * Example review on "Staged memory scheduling: Achieving high performance and scalability in heterogeneous systems" (link to the paper)
+      * {{review-sms.pdf | Review 1}}
+===== Lecture 1 (1/12 Mon.) =====
 **Required:**
-  * None
+  * For HW1: {{00964437.pdf|Patt, Y. (2001). Requirements, bottlenecks, and good fortune: agents for microprocessor evolution. Proceedings of the IEEE.}}
 **Mentioned during lecture:**
   * {{bstj29-2-147.pdf|Hamming, R. W. (1950). Error Detecting and Error Correcting Codes. Bell System Technical Journal, 29(2).}}
   * {{youandyourresearch.pdf|Hamming, R. W. (1986). You and Your Research. Transcription of the Bell Communications Research Colloquium Seminar.}}
     * [[http://www.youtube.com/watch?v=a1zDuOPkMSw|youtube]]
-  * {{05392210.pdf|Amdahl, G. M., Blaauw, G. A., & Brooks, F. P. (1964). Architecture of the IBM system/360. IBM J. Res. Dev., 8(2).}}
   * {{p128-rixner.pdf|Rixner, S., Dally, W. J., Kapasi, U. J., Mattson, P., & Owens, J. D. (2000). Memory access scheduling. Proceedings of the 27th annual international symposium on Computer architecture.}}
-  * {{us5630096.pdf|William K. Zuravleff, & Robinson, T. (1997). Controller for a synchronous DRAM that maximizes throughput by allowing memory requests and commands to be issued out of order.}}
-  * {{00964437.pdf|Patt, Y. (2001). Requirements, bottlenecks, and good fortune: agents for microprocessor evolution. Proceedings of the IEEE.}}
   * {{http://users.ece.cmu.edu/~omutlu/pub/mph_usenix_security07.pdf|Moscibroda, T., & Mutlu, O. (2007). Memory performance attacks: denial of memory service in multi-core systems. Proceedings of 16th USENIX Security Symposium.}}
    * {{http://research.microsoft.com/pubs/79625/MICRO2007.pdf|Onur Mutlu and Thomas Moscibroda, "Stall-Time Fair Memory Access Scheduling for Chip Multiprocessors", MICRO 2007. }}
-   * {{http://users.ece.cmu.edu/~omutlu/pub/memory-channel-partitioning-micro11.pdf|Sai Prashanth Muralidhara, Lavanya Subramanian, Onur Mutlu, Mahmut Kandemir, and Thomas Moscibroda, "Reducing Memory Interference in Multicore Systems via Application-Aware Memory Channel Partitioning", MICRO 2011.}}
+   * {{http://users.ece.cmu.edu/~omutlu/pub/memory-channel-partitioning-micro11.pdf|Sai Prashanth Muralidhara, Lavanya Subramanian, Onur Mutlu, Mahmut Kandemir, and Thomas Moscibroda, "Reducing Memory Interference in Multicore Systems via Application-Aware
+   * Memory Channel Partitioning", MICRO 2011.}}
    * {{http://users.ece.cmu.edu/~omutlu/pub/raidr-dram-refresh_isca12.pdf|Liu et al., “RAIDR: Retention-Aware Intelligent DRAM Refresh,” ISCA 2012.}}
    * {{http://users.ece.cmu.edu/~omutlu/pub/memory-scaling_memcon13.pdf|Onur Mutlu, "Memory Scaling: A Systems Architecture Perspective" Technical talk at MemCon 2013 (MEMCON), Santa Clara, CA, August 2013.}}
+   * {{http://users.ece.cmu.edu/~kevincha/papers/chang_hpca2014.pdf|Kevin Chang, Donghyuk Lee, Zeshan Chishti, Alaa Alameldeen, Chris Wilkerson, Yoongu Kim, Onur Mutlu, "Improving DRAM Performance by Parallelizing Refreshes with Accesses", In HPCA 2014, Orlando, Feb. 2014.}}
+   * {{http://users.ece.cmu.edu/~yoonguk/papers/kim-isca14.pdf | Yoongu Kim, Ross Daly, Jeremie Kim, Chris Fallin, Ji Hye Lee, Donghyuk Lee, Chris Wilkerson, Konrad Lai, Onur Mutlu, "Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors", In ISCA-41, 2014.}}
-===== Lecture 2 (1/15 Wed.) =====
+===== Lecture 2 (1/14 Wed.) =====
 **Required:**
   * {{00964437.pdf|Patt, Y. (2001). Requirements, bottlenecks, and good fortune: agents for microprocessor evolution. Proceedings of the IEEE.}}
@@ Line 33: / Line 41: @@
   * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/PP_Chap1.pdf|P&P Chapter 1 (Fundamentals)]]
   * P&H Chapters 1 and 2 (Intro, Abstractions, ISA, MIPS)
 **Mentioned during lecture:**
-  * {{gordon_moore_1965_article.pdf|Moore, G. E. (1965). Cramming More Components onto Integrated Circuits. Electronics, 38(8).}}
-  * {{bab6286.0001.001.pdf|Burks, A. W., Goldstine, H. H., & Neumann, J. von. (1946). Preliminary discussion of the logical design of an electronic computing instrument.}}
+===== Lecture 3 (1/16 Fri.) =====
-  * {{p126-dennis.pdf|Dennis, J. B., & Misunas, D. P. (1975). A preliminary architecture for a basic data-flow processor. Proceedings of the 2nd annual symposium on Computer architecture.}}
-  * {{p34-gurd.pdf|Gurd, J. R., Kirkham, C. C., & Watson, I. (1985). The Manchester prototype dataflow computer. Commun. ACM, 28(1).}}
-  * Kuhn, T. S. (1962). The Structure of Scientific Revolutions.
-  * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/PP_Chap4.pdf|P&P Chapter 4 (The von Neumann Model)]]
-===== Lecture 3 (1/17 Fri.) =====
 **Required:**
   * Note that you should familiarize yourself with these manuals. Please briefly skim through these manuals as you will probably need to refer to them while working on labs and homework
-  * ARM Architecture Reference Manual
+  * MIPS Architecture Reference Manual
-    * [[https://www.scss.tcd.ie/~waldroj/3d1/arm_arm.pdf|Manual (5MB)]]
+    * {{mips_r4000_users_manual.pdf |Manual (the instruction set reference starts on pg.469)}}
-  * ARM Architecture Instruction Quick Reference
-    * {{arm-instructionset.pdf|Quick Ref (.5MB)}}
   * Intel® 64 and IA-32 Architectures Software Developer Manual (2013)
     * [[http://download.intel.com/products/processor/manual/325462.pdf|(15MB) Combined Volumes 1-3]]3
 **Mentioned during lecture:**
-  * P&H Chapter 4, Sections 4.1-4.4.
+  * {{paper_aklaiber_19jan00.pdf | Klaiber, "The Technology Behind CrusoeTM Processors", Transmeta White Paper, 2000}}
-  * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/pp-appendixc.pdf|P&P Appendix C (The Microarchitecture of the LC-3b, Basic Machine)]]
-  * P&P Chapter 5 (The LC3)
-  * {{p25-patterson.pdf|Patterson, D. A., & Ditzel, D. R. (1980). The case for the reduced instruction set computer. SIGARCH Comput. Archit. News, 8(6).}}
-  * [[http://www.ece.cmu.edu/~koopman/stack_computers/sec3_2.html | Koopman, P. (1989) Stack Computers: The New Wave.]]
-  * {{chapter9.pdf|Levy, H. (1984). Capability-Based Computer Systems. Chapter 9. The Intel iAPX 432.}}
-  * {{p489-wilner.pdf|Wilner, W. T. (1972). Design of the Burroughs B1700. Proceedings of the December 5-7, 1972, fall joint computer conference, part I. }}
+===== Lecture 4 (1/21 Wed.) =====
-===== Lecture 4 (1/22 Wed.) =====
+**Required:**
-**Required**
   * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/PP_Chap4.pdf|P&P Chapter 4 (The von Neumann Model)]]
   * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/pp-appendixa.pdf|P&P Appendix A (The LC-3b ISA)]]
   * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/pp-appendixc.pdf|P&P Appendix C (The Microarchitecture of the LC-3b, Basic Machine)]]
-===== Lecture 5 (1/24 Fri.) =====
+**Mentioned during lecture:**
+===== Lecture 5 (1/23 Fri.) =====
 **Required**
   * None
-===== Lecture 6 (1/27 Mon.) =====
+===== Lecture 6 (1/26 Mon.) =====
 **Required:**
   * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/pp-appendixc.pdf|P&P Appendix C (The Microarchitecture of the LC-3b, Basic Machine)]]
@@ Line 79: / Line 75: @@
   * {{bestway.pdf|Wilkes, M. V. (1951). The best way to design an automatic calculating machine. Manchester University Computer Inaugural Conference.}}
 **Mentioned during lecture:**
-  * {{bestway.pdf|Wilkes, M. V. (1951). The best way to design an automatic calculating machine. Manchester University Computer Inaugural Conference.}}
-===== Lecture 7 (1/29 Wed.) =====
+===== Lecture 7 (1/28 Wed.) =====
 **Required:**
   * None
 **Mentioned during lecture:**
-  * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/pp-appendixc.pdf|P&P Appendix C (The Microarchitecture of the LC-3b, Basic Machine)]]
+  * {{bestway.pdf|Wilkes, M. V. (1951). The best way to design an automatic calculating machine. Manchester University Computer Inaugural Conference.}}
+  * [[http://research.microsoft.com/pubs/68221/acrobat.pdf |Butler W. Lampson, “Hints for Computer System Design,” ACM Operating Systems Review, 1983.]]
-===== Lecture 8 (1/31 Fri.) =====
+===== Lecture 8 (2/2 Mon.) =====
 **Required:**
-  * None
+  * P&H Sections 4.9-4.11
+  * {{00476078.pdf|Smith, J. E., & Sohi, G. S. (1995). The microarchitecture of superscalar processors. Proceedings of the IEEE.}}
+  * {{00004607.pdf|Smith, J. E., & Pleszkun, A. R. (1988). Implementing precise interrupts in pipelined processors. Computers, IEEE Transactions on.}}
+  * {{mcfarling_-_1993_-_combining_branch_predictors.pdf|Mcfarling, S. (1993). Combining branch predictors. WRL Technical Note TN-36.}}
+  * {{kessler_-_1999_-_the_alpha_21264_microprocessor.pdf|Kessler, R. E. (1999). The Alpha 21264 Microprocessor. IEEE Micro.}}
-===== Lecture 9 (2/3 Mon.) =====
+**Mentioned during lecture:**
+  * {{p16-pettis.pdf|Pettis, K., & Hansen, R. C. (1990). Profile guided code positioning. Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation.}}
+===== Lecture 9 (2/4 Wed.) =====
 **Required:**
   * P&H Sections 4.9-4.11
   * {{00476078.pdf|Smith, J. E., & Sohi, G. S. (1995). The microarchitecture of superscalar processors. Proceedings of the IEEE.}}
+  * {{00004607.pdf|Smith, J. E., & Pleszkun, A. R. (1988). Implementing precise interrupts in pipelined processors. Computers, IEEE Transactions on.}}
+  * {{mcfarling_-_1993_-_combining_branch_predictors.pdf|Mcfarling, S. (1993). Combining branch predictors. WRL Technical Note TN-36.}}
+  * {{kessler_-_1999_-_the_alpha_21264_microprocessor.pdf|Kessler, R. E. (1999). The Alpha 21264 Microprocessor. IEEE Micro.}}
 **Mentioned during lecture:**
-  * {{p177-allen.pdf|Allen, J. R., Kennedy, K., Porterfield, C., & Warren, J. (1983). Conversion of control dependence to data dependence. Proceedings of the 10th ACM SIGACT-SIGPLAN symposium on Principles of programming languages.}}
+  * {{flash-memory-data-retention_hpca15.pdf|Yu Cai, Yixin Luo, Erich F. Haratsch, Ken Mai, and Onur Mutlu, Data Retention in MLC NAND Flash Memory: Characterization, Optimization and Recovery, HPCA 2015.}}
-  * {{24400043.pdf|Kim, H., Mutlu, O., Stark, J., & Patt, Y. N. (2005). Wish Branches: Combining Conditional Branching and Predication for Adaptive Predicated Execution. Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture.}}
+  * {{adaptive-latency-dram_hpca15.pdf|Donghyuk Lee, Yoongu Kim, Gennady Pekhimenko, Samira Khan, Vivek Seshadri, Kevin Chang, and Onur Mutlu, Adaptive-Latency DRAM: Optimizing DRAM Timing for the Common-Case, HPCA 2015.}}
-  * {{thornton_-_1964_-_parallel_operation_in_the_control_data_6600.pdf|Thornton, J. E. (1964). Parallel Operation in the Control Data 6600. Proceedings of the Fall Joint Computer Conference.}}
+  * {{compression-aware-cache-management_hpca15.pdf|Gennady Pekhimenko, Tyler Huberty, Rui Cai, Onur Mutlu, Phillip P. Gibbons, Michael A. Kozuch, and Todd C. Mowry, Exploiting Compressed Block Size as an Indicator of Future Reuse, HPCA 2015.}}
-  * {{smith78_hep.pdf|Smith, B. J. (1978). A pipelined, shared resource MIMD computer. International Conference on Parallel Processing.}}
-  * {{p16-pettis.pdf|Pettis, K., & Hansen, R. C. (1990). Profile guided code positioning. Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation.}}
-===== Lecture 10 (2/5 Wed.) =====
+===== Lecture 10 (2/6 Fri.) =====
 **Required:**
+  * P&H Sections 4.9-4.11
+  * {{00476078.pdf|Smith, J. E., & Sohi, G. S. (1995). The microarchitecture of superscalar processors. Proceedings of the IEEE.}}
+  * {{00004607.pdf|Smith, J. E., & Pleszkun, A. R. (1988). Implementing precise interrupts in pipelined processors. Computers, IEEE Transactions on.}}
   * {{mcfarling_-_1993_-_combining_branch_predictors.pdf|Mcfarling, S. (1993). Combining branch predictors. WRL Technical Note TN-36.}}
   * {{kessler_-_1999_-_the_alpha_21264_microprocessor.pdf|Kessler, R. E. (1999). The Alpha 21264 Microprocessor. IEEE Micro.}}
-**Mentioned during lecture:**
+**Mentioned in the Lecture:**
   * {{p300-ball.pdf|Ball, T., & Larus, J. R. (1993). Branch prediction for free. Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation.}}
   * {{p135-smith.pdf|Smith, J. E. (1981). A study of branch prediction strategies. Proceedings of the 8th annual symposium on Computer Architecture.}}
@@ Line 116: / Line 124: @@
   * {{hpca01.pdf|Daniel A. Jimenez and Calvin Lin. 2001. Dynamic Branch Prediction with Perceptrons. In Proceedings of the 7th International Symposium on High-Performance Computer Architecture (HPCA '01)}}
   * {{Riseman.1972.TC.pdf|E. M. Riseman and C. C. Foster. 1972. The Inhibition of Potential Parallelism by Conditional Jumps. IEEE Trans. Comput. 21, 12 (December 1972)}}
+  * {{p274-chang.pdf|Po-Yung Chang, Eric Hao, and Yale N. Patt. 1997. Target prediction for indirect jumps. ISCA'97.}}
+  * {{kim_isca07.pdf|Hyesoon Kim, José A. Joao, Onur Mutlu, Chang Joo Lee, Yale N. Patt, and Robert Cohn. 2007. VPC prediction: reducing the cost of indirect branches via hardware-based dynamic devirtualization. ISCA'07}}
-===== Lecture 11 (2/12 Wed.) =====
+===== Lecture 11 (2/11 Wed.) =====
-** Required **
+**Required:**
-  * None
-===== Lecture 12 (2/14 Fri.) =====
+**Mentioned in the Lecture:**
-** Required **
+  * {{p18-hwu.pdf|Hwu and Patt (1987). Checkpoint Repair for Out-of-order Execution Machines.}}
-  * P&H Sections 4.9-4.11
-  * {{00476078.pdf|Smith, J. E., & Sohi, G. S. (1995). The microarchitecture of superscalar processors. Proceedings of the IEEE.}}
   * {{00004607.pdf|Smith, J. E., & Pleszkun, A. R. (1988). Implementing precise interrupts in pipelined processors. Computers, IEEE Transactions on.}}
+  * {{ogehl.pdf | Seznec (2005). Analysis of the O-GEometric History Length Branch Predictor. ISCA}}
+  * {{hpca01.pdf|Daniel A. Jimenez and Calvin Lin. 2001. Dynamic Branch Prediction with Perceptrons. In Proceedings of the 7th International Symposium on High-Performance Computer Architecture (HPCA '01)}}

18-447 Introduction to Computer Architecture – Spring 2015

User Tools

Site Tools

Differences

Page Tools