This shows you the differences between two versions of the page.
readings [2015/04/03 17:37] albert |
readings [2015/04/13 15:31] (current) kevincha |
||
---|---|---|---|
Line 346: | Line 346: | ||
* {{zilles-2000.pdf|Zilles and Sohi, “Understanding the backward slices of performance degrading instructions,” ISCA 2000.}} | * {{zilles-2000.pdf|Zilles and Sohi, “Understanding the backward slices of performance degrading instructions,” ISCA 2000.}} | ||
* {{http://www.amazon.com/Inside-AS-400-Second-Edition/dp/1882419669|Frank Soltis, "Inside the AS/400"}} | * {{http://www.amazon.com/Inside-AS-400-Second-Edition/dp/1882419669|Frank Soltis, "Inside the AS/400"}} | ||
+ | |||
+ | ===== Lecture 27 (4/6 Mon.) ===== | ||
+ | ** Required: ** | ||
+ | * {{amdahl_-_1967_-_validity_of_the_single_processor_approach_to_achieving_large_scale_computing_capabilities.pdf|Amdahl, G. M. (1967). Validity of the single processor approach to achieving large scale computing capabilities. Proceedings of the April 18-20, 1967, spring joint computer conference.}} | ||
+ | * {{lamport_-_1979_-_how_to_make_a_multiprocessor_computer_that_correctly_executes_multiprocess_programs.pdf|Lamport, L. (1979). How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs.}} | ||
+ | * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/culler-mesi.pdf|C&S, Chapters 5.1 & 5.3]] | ||
+ | * P&H, Chapter 5.8 | ||
+ | ** Recommended: ** | ||
+ | * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/hill_551_560.pdf|Hill, Jouppi, Sohi. "Multiprocessors and Multicomputers," pp. 551-560 in Readings in Computer Architecture.]] | ||
+ | * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/hill_309_314.pdf|Hill, Jouppi, Sohi. "Dataflow and Multithreading," pp. 309-314 in Readings in Computer Architecture.]] | ||
+ | * {{01447203.pdf|Flynn, M. J. (1966). Very high-speed computing systems. Proceedings of the IEEE.}} | ||
+ | * {{papamarcos_patel_-_1984_-_a_low-overhead_coherence_solution_for_multiprocessors_with_private_cache_memories.pdf|Papamarcos, M. S., & Patel, J. H. (1984). A low-overhead coherence solution for multiprocessors with private cache memories. Proceedings of the 11th annual international symposium on Computer architecture.}} | ||
+ | ** Mentioned during lecture: ** | ||
+ | * {{horner-1819.pdf|Horner (1819). A new method of solving numerical equations of all orders, by continuous approximation. Philosophical Transactions of the Royal Society}} | ||
+ | |||
+ | ===== Lecture 28 (4/8 Wed.) ===== | ||
+ | ** Required: ** | ||
+ | * {{lamport_-_1979_-_how_to_make_a_multiprocessor_computer_that_correctly_executes_multiprocess_programs.pdf|Lamport, L. (1979). How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs.}} | ||
+ | * {{papamarcos_patel_-_1984_-_a_low-overhead_coherence_solution_for_multiprocessors_with_private_cache_memories.pdf|Papamarcos, M. S., & Patel, J. H. (1984). A low-overhead coherence solution for multiprocessors with private cache memories. Proceedings of the 11th annual international symposium on Computer architecture.}} | ||
+ | * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/culler-mesi.pdf|C&S, Chapters 5.1 & 5.3]] | ||
+ | * P&H, Chapter 5.8 | ||
+ | ** Recommended: ** | ||
+ | * {{10.1.1.17.8112.pdf|Gharachorloo et al. (1990). Memory Consistency and Event Ordering in Scalable Shared-Memory Multiprocessors.}} | ||
+ | * {{10.1.1.89.3693.pdf|Gharachorloo et al. (1991). Two Techniques to Enhance the Performance of Memory Consistency Models.}} | ||
+ | * {{isca07_bulksc.pdf|Ceze et al. (2007). BulkSC: Bulk Enforcement of Sequential Consistency.}} | ||
+ | * {{censier.pdf|Censier et al. (1978). A new solution to coherence problems in multicache systems.}} | ||
+ | * {{goodman-snoopyprotocol.pdf|Goodman (1983). Using cache memory to reduce processor-memory traffic.}} | ||
+ | * {{isca123.pdf|Laudon et al. (1997). The SGI Origin: a ccNUMA highly scalable server.}} | ||
+ | * {{isca03_token_coherence.pdf|Martin et al. (2003). Token coherence: decoupling performance and correctness.}} | ||
+ | * {{p73-baer.pdf|Baer et al. (1988). On the inclusion properties for multi-level cache hierarchies.}} | ||
+ | ** Mentioned during lecture: ** | ||
+ | * (HTML) [[http://www.cs.utexas.edu/users/EWD/transcriptions/EWD01xx/EWD123.html|Dijkstra (1965). Cooperating Sequential Processes.]] | ||
+ | |||
+ | ===== Lecture 29 (4/10 Fri.) ===== | ||
+ | ** Required: ** | ||
+ | * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/culler-mesi.pdf|C&S, Chapters 5.1 & 5.3]] | ||
+ | * P&H, Chapter 5.8 | ||
+ | * {{papamarcos_patel_-_1984_-_a_low-overhead_coherence_solution_for_multiprocessors_with_private_cache_memories.pdf|Papamarcos, M. S., & Patel, J. H. (1984). A low-overhead coherence solution for multiprocessors with private cache memories. Proceedings of the 11th annual international symposium on Computer architecture.}} | ||
+ | ** Recommended: ** | ||
+ | * {{censier.pdf|Censier et al. (1978). A new solution to coherence problems in multicache systems.}} | ||
+ | * {{goodman-snoopyprotocol.pdf|Goodman (1983). Using cache memory to reduce processor-memory traffic.}} | ||
+ | * {{isca123.pdf|Laudon et al. (1997). The SGI Origin: a ccNUMA highly scalable server.}} | ||
+ | * {{isca03_token_coherence.pdf|Martin et al. (2003). Token coherence: decoupling performance and correctness.}} | ||
+ | * {{p73-baer.pdf|Baer et al. (1988). On the inclusion properties for multi-level cache hierarchies.}} | ||
+ | |||
+ | ===== Lecture 30 (4/13 Mon.) ===== | ||
+ | ** Required: ** | ||
+ | * {{rowclone_micro13.pdf|Seshadri et al., “RowClone: Fast and Efficient In-DRAM Copy and Initialization of Bulk Data,” MICRO 2013.}} | ||
+ |