This shows you the differences between two versions of the page.
readings [2014/01/14 05:25] rachata created |
readings [2014/03/19 03:15] rachata |
||
---|---|---|---|
Line 2: | Line 2: | ||
* **P&P** stands for Patt & Patel's //Introduction to Computing Systems: From Bits and Gates to C and Beyond// | * **P&P** stands for Patt & Patel's //Introduction to Computing Systems: From Bits and Gates to C and Beyond// | ||
+ | * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/PP_Chap1.pdf|P&P Chapter 1 (Fundamentals)]] | ||
+ | * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/PP_Chap4.pdf|P&P Chapter 4 (The von Neumann Model)]] | ||
+ | * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/pp-appendixa.pdf|P&P Appendix A (The LC-3b ISA)]] | ||
+ | * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/pp-appendixc.pdf|P&P Appendix C (The Microarchitecture of the LC-3b, Basic Machine)]] | ||
* **P&H** stands for Patterson & Hennessy's //Computer Organization and Design: The Hardware/Software Interface// | * **P&H** stands for Patterson & Hennessy's //Computer Organization and Design: The Hardware/Software Interface// | ||
- | ===== Lecture 1 (1/14 Mon.) ===== | + | ===== Lecture 1 (1/13 Mon.) ===== |
**Required:** | **Required:** | ||
* None | * None | ||
Line 23: | Line 27: | ||
* {{http://users.ece.cmu.edu/~omutlu/pub/memory-scaling_memcon13.pdf|Onur Mutlu, "Memory Scaling: A Systems Architecture Perspective" Technical talk at MemCon 2013 (MEMCON), Santa Clara, CA, August 2013.}} | * {{http://users.ece.cmu.edu/~omutlu/pub/memory-scaling_memcon13.pdf|Onur Mutlu, "Memory Scaling: A Systems Architecture Perspective" Technical talk at MemCon 2013 (MEMCON), Santa Clara, CA, August 2013.}} | ||
- | ===== Lecture 2 (1/16 Wed.) ===== | + | ===== Lecture 2 (1/15 Wed.) ===== |
**Required:** | **Required:** | ||
* {{00964437.pdf|Patt, Y. (2001). Requirements, bottlenecks, and good fortune: agents for microprocessor evolution. Proceedings of the IEEE.}} | * {{00964437.pdf|Patt, Y. (2001). Requirements, bottlenecks, and good fortune: agents for microprocessor evolution. Proceedings of the IEEE.}} | ||
Line 29: | Line 33: | ||
* (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/PP_Chap1.pdf|P&P Chapter 1 (Fundamentals)]] | * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/PP_Chap1.pdf|P&P Chapter 1 (Fundamentals)]] | ||
* P&H Chapters 1 and 2 (Intro, Abstractions, ISA, MIPS) | * P&H Chapters 1 and 2 (Intro, Abstractions, ISA, MIPS) | ||
+ | |||
+ | **Mentioned during lecture:** | ||
+ | * {{gordon_moore_1965_article.pdf|Moore, G. E. (1965). Cramming More Components onto Integrated Circuits. Electronics, 38(8).}} | ||
+ | * {{bab6286.0001.001.pdf|Burks, A. W., Goldstine, H. H., & Neumann, J. von. (1946). Preliminary discussion of the logical design of an electronic computing instrument.}} | ||
+ | * {{p126-dennis.pdf|Dennis, J. B., & Misunas, D. P. (1975). A preliminary architecture for a basic data-flow processor. Proceedings of the 2nd annual symposium on Computer architecture.}} | ||
+ | * {{p34-gurd.pdf|Gurd, J. R., Kirkham, C. C., & Watson, I. (1985). The Manchester prototype dataflow computer. Commun. ACM, 28(1).}} | ||
+ | * Kuhn, T. S. (1962). The Structure of Scientific Revolutions. | ||
+ | * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/PP_Chap4.pdf|P&P Chapter 4 (The von Neumann Model)]] | ||
+ | |||
+ | ===== Lecture 3 (1/17 Fri.) ===== | ||
+ | **Required:** | ||
+ | * Familiarize yourself with these manuals by skimming them briefly; you will probably need to refer to them while working on labs and homework. | ||
+ | * ARM Architecture Reference Manual | ||
+ | * [[https://www.scss.tcd.ie/~waldroj/3d1/arm_arm.pdf|Manual (5MB)]] | ||
+ | * ARM Architecture Instruction Quick Reference | ||
+ | * {{arm-instructionset.pdf|Quick Ref (.5MB)}} | ||
+ | * Intel® 64 and IA-32 Architectures Software Developer Manual (2013) | ||
+ | * [[http://download.intel.com/products/processor/manual/325462.pdf|(15MB) Combined Volumes 1-3]] | ||
+ | |||
+ | **Mentioned during lecture:** | ||
+ | * P&H Chapter 4, Sections 4.1-4.4. | ||
+ | * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/pp-appendixc.pdf|P&P Appendix C (The Microarchitecture of the LC-3b, Basic Machine)]] | ||
+ | * P&P Chapter 5 (The LC-3) | ||
+ | * {{p25-patterson.pdf|Patterson, D. A., & Ditzel, D. R. (1980). The case for the reduced instruction set computer. SIGARCH Comput. Archit. News, 8(6).}} | ||
+ | * [[http://www.ece.cmu.edu/~koopman/stack_computers/sec3_2.html | Koopman, P. (1989) Stack Computers: The New Wave.]] | ||
+ | * {{chapter9.pdf|Levy, H. (1984). Capability-Based Computer Systems. Chapter 9. The Intel iAPX 432.}} | ||
+ | * {{p489-wilner.pdf|Wilner, W. T. (1972). Design of the Burroughs B1700. Proceedings of the December 5-7, 1972, fall joint computer conference, part I.}} | ||
+ | |||
+ | |||
+ | ===== Lecture 4 (1/22 Wed.) ===== | ||
+ | **Required:** | ||
+ | * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/PP_Chap4.pdf|P&P Chapter 4 (The von Neumann Model)]] | ||
+ | * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/pp-appendixa.pdf|P&P Appendix A (The LC-3b ISA)]] | ||
+ | * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/pp-appendixc.pdf|P&P Appendix C (The Microarchitecture of the LC-3b, Basic Machine)]] | ||
+ | |||
+ | ===== Lecture 5 (1/24 Fri.) ===== | ||
+ | **Required:** | ||
+ | * None | ||
+ | |||
+ | ===== Lecture 6 (1/27 Mon.) ===== | ||
+ | **Required:** | ||
+ | * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/pp-appendixc.pdf|P&P Appendix C (The Microarchitecture of the LC-3b, Basic Machine)]] | ||
+ | * P&H Appendix D (Mapping Control to Hardware) | ||
+ | **Optional:** | ||
+ | * {{bestway.pdf|Wilkes, M. V. (1951). The best way to design an automatic calculating machine. Manchester University Computer Inaugural Conference.}} | ||
+ | **Mentioned during lecture:** | ||
+ | * {{bestway.pdf|Wilkes, M. V. (1951). The best way to design an automatic calculating machine. Manchester University Computer Inaugural Conference.}} | ||
+ | |||
+ | ===== Lecture 7 (1/29 Wed.) ===== | ||
+ | **Required:** | ||
+ | * None | ||
+ | |||
+ | **Mentioned during lecture:** | ||
+ | * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/pp-appendixc.pdf|P&P Appendix C (The Microarchitecture of the LC-3b, Basic Machine)]] | ||
+ | |||
+ | ===== Lecture 8 (1/31 Fri.) ===== | ||
+ | **Required:** | ||
+ | * None | ||
+ | |||
+ | ===== Lecture 9 (2/3 Mon.) ===== | ||
+ | **Required:** | ||
+ | * P&H Sections 4.9-4.11 | ||
+ | * {{00476078.pdf|Smith, J. E., & Sohi, G. S. (1995). The microarchitecture of superscalar processors. Proceedings of the IEEE.}} | ||
+ | |||
+ | **Mentioned during lecture:** | ||
+ | * {{p177-allen.pdf|Allen, J. R., Kennedy, K., Porterfield, C., & Warren, J. (1983). Conversion of control dependence to data dependence. Proceedings of the 10th ACM SIGACT-SIGPLAN symposium on Principles of programming languages.}} | ||
+ | * {{24400043.pdf|Kim, H., Mutlu, O., Stark, J., & Patt, Y. N. (2005). Wish Branches: Combining Conditional Branching and Predication for Adaptive Predicated Execution. Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture.}} | ||
+ | * {{thornton_-_1964_-_parallel_operation_in_the_control_data_6600.pdf|Thornton, J. E. (1964). Parallel Operation in the Control Data 6600. Proceedings of the Fall Joint Computer Conference.}} | ||
+ | * {{smith78_hep.pdf|Smith, B. J. (1978). A pipelined, shared resource MIMD computer. International Conference on Parallel Processing.}} | ||
+ | * {{p16-pettis.pdf|Pettis, K., & Hansen, R. C. (1990). Profile guided code positioning. Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation.}} | ||
+ | |||
+ | ===== Lecture 10 (2/5 Wed.) ===== | ||
+ | |||
+ | **Required:** | ||
+ | * {{mcfarling_-_1993_-_combining_branch_predictors.pdf|McFarling, S. (1993). Combining branch predictors. WRL Technical Note TN-36.}} | ||
+ | * {{kessler_-_1999_-_the_alpha_21264_microprocessor.pdf|Kessler, R. E. (1999). The Alpha 21264 Microprocessor. IEEE Micro.}} | ||
+ | **Mentioned during lecture:** | ||
+ | * {{p300-ball.pdf|Ball, T., & Larus, J. R. (1993). Branch prediction for free. Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation.}} | ||
+ | * {{p135-smith.pdf|Smith, J. E. (1981). A study of branch prediction strategies. Proceedings of the 8th annual symposium on Computer Architecture.}} | ||
+ | * {{yeh_patt_-_1991_-_two-level_adaptive_training_branch_prediction.pdf|Yeh, T.-Y., & Patt, Y. N. (1991). Two-level adaptive training branch prediction. Proceedings of the 24th annual international symposium on Microarchitecture.}} | ||
+ | * {{p22-chang.pdf|Chang, P.-Y., Hao, E., Yeh, T.-Y., & Patt, Y. (1994). Branch classification: a new mechanism for improving branch predictor performance. Proceedings of the 27th annual international symposium on Microarchitecture.}} | ||
+ | * {{hpca01.pdf|Daniel A. Jimenez and Calvin Lin. 2001. Dynamic Branch Prediction with Perceptrons. In Proceedings of the 7th International Symposium on High-Performance Computer Architecture (HPCA '01)}} | ||
+ | * {{Riseman.1972.TC.pdf|E. M. Riseman and C. C. Foster. 1972. The Inhibition of Potential Parallelism by Conditional Jumps. IEEE Trans. Comput. 21, 12 (December 1972)}} | ||
+ | |||
+ | ===== Lecture 11 (2/12 Wed.) ===== | ||
+ | **Required:** | ||
+ | * None | ||
+ | |||
+ | **Mentioned during lecture:** | ||
+ | * {{p274-chang.pdf|Po-Yung Chang, Eric Hao, and Yale N. Patt. 1997. Target prediction for indirect jumps. ISCA'97.}} | ||
+ | * {{kim_isca07.pdf|Hyesoon Kim, José A. Joao, Onur Mutlu, Chang Joo Lee, Yale N. Patt, and Robert Cohn. 2007. VPC prediction: reducing the cost of indirect branches via hardware-based dynamic devirtualization. ISCA'07}} | ||
+ | |||
+ | ===== Lecture 12 (2/14 Fri.) ===== | ||
+ | **Required:** | ||
+ | * P&H Sections 4.9-4.11 | ||
+ | * {{00476078.pdf|Smith, J. E., & Sohi, G. S. (1995). The microarchitecture of superscalar processors. Proceedings of the IEEE.}} | ||
+ | * {{00004607.pdf|Smith, J. E., & Pleszkun, A. R. (1988). Implementing precise interrupts in pipelined processors. Computers, IEEE Transactions on.}} | ||
+ | |||
+ | ===== Lecture 13 (2/17 Mon.) ===== | ||
+ | **Required:** | ||
+ | * None | ||
+ | |||
+ | ===== Lecture 14 (2/19 Wed.) ===== | ||
+ | **Required:** | ||
+ | * {{p18-hwu.pdf|Hwu, W. W., & Patt, Y. N. (1987). Checkpoint repair for out-of-order execution machines. Proceedings of the 14th annual international symposium on Computer architecture.}} | ||
+ | * {{00476078.pdf|Smith, J. E., & Sohi, G. S. (1995). The microarchitecture of superscalar processors. Proceedings of the IEEE.}} | ||
+ | * {{00004607.pdf|Smith, J. E., & Pleszkun, A. R. (1988). Implementing precise interrupts in pipelined processors. Computers, IEEE Transactions on.}} | ||
+ | |||
+ | |||
+ | ===== Lecture 15 (2/21 Fri.) ===== | ||
+ | **Required:** | ||
+ | * {{04523358.pdf|Lindholm, E., Nickolls, J., Oberman, S., & Montrym, J. (2008). NVIDIA Tesla: A Unified Graphics and Computing Architecture. Micro, IEEE.}} | ||
+ | * {{p50-fatahalian.pdf|Fatahalian, K., & Houston, M. (2008). A closer look at GPUs. Commun. ACM.}} | ||
+ | |||
+ | ===== Lecture 19 (3/19 Wed.) ===== | ||
+ | **Required:** | ||
* P&H Sections 5.1-5.3 (cache sections)
* Hamacher et al. Sections 8.1-8.7 (cache/memory sections)
+ | * {{wilkes_-_1965_-_slave_memories_and_dynamic_storage_allocation.pdf|Wilkes, M. V. (1965). Slave Memories and Dynamic Storage Allocation. IEEE Transactions on Electronic Computers.}} |