User Tools

Site Tools


readings

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
readings [2015/04/03 17:12]
albert
readings [2015/04/13 15:31] (current)
kevincha
Line 345: Line 345:
   * {{zilles-2001.pdf|Zilles and Sohi, “Execution-based Prediction Using Speculative Slices”, ISCA 2001.}}   * {{zilles-2001.pdf|Zilles and Sohi, “Execution-based Prediction Using Speculative Slices”, ISCA 2001.}}
   * {{zilles-2000.pdf|Zilles and Sohi, ”Understanding the backward slices of performance degrading instructions,​” ISCA 2000.}}   * {{zilles-2000.pdf|Zilles and Sohi, ”Understanding the backward slices of performance degrading instructions,​” ISCA 2000.}}
 +  * {{http://​www.amazon.com/​Inside-AS-400-Second-Edition/​dp/​1882419669|Frank Soltis,"​Inside the AS/​400"​}}
 +
 +===== Lecture 27 (4/6 Mon.) =====
 +** Required: **
 +  * {{amdahl_-_1967_-_validity_of_the_single_processor_approach_to_achieving_large_scale_computing_capabilities.pdf|Amdahl,​ G. M. (1967). Validity of the single processor approach to achieving large scale computing capabilities. Proceedings of the April 18-20, 1967, spring joint computer conference.}}
 +  * {{lamport_-_1979_-_how_to_make_a_multiprocessor_computer_that_correctly_executes_multiprocess_programs.pdf|Lamport,​ L. (1979). How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs.}}
 +  * (CMU WebISO) [[http://​www.ece.cmu.edu/​~ece447/​cmu_only/​culler-mesi.pdf|C&​S,​ Chapters 5.1 & 5.3]]
 +  * P&H, Chapter 5.8
 +** Recommended:​ **
 +  * (CMU WebISO) [[http://​www.ece.cmu.edu/​~ece447/​cmu_only/​hill_309_314.pdf|Hill,​ Jouppi, Sohi. "​Multiprocessors and Multicomputers,"​ pp. 551-560 in Readings in Computer Architecture.]]
 +  * (CMU WebISO) [[http://​www.ece.cmu.edu/​~ece447/​cmu_only/​hill_551_560.pdf|Hill,​ Jouppi, Sohi. "​Dataflow and Multithreading,"​ pp. 309-314 in Readings in Computer Architecture.]]
 +  * {{01447203.pdf|Flynn,​ M. J. (1966). Very high-speed computing systems. Proceedings of the IEEE.}}
 +  * {{papamarcos_patel_-_1984_-_a_low-overhead_coherence_solution_for_multiprocessors_with_private_cache_memories.pdf|Papamarcos,​ M. S., & Patel, J. H. (1984). A low-overhead coherence solution for multiprocessors with private cache memories. Proceedings of the 11th annual international symposium on Computer architecture.}}
 +** Mentioned during lecture: **
 +  * {{horner-1819.pdf|Horner (1819). A new method of solving numerical equations of all orders, by continuous approximation. Philosophical Transactions of the Royal Society}}
 +
 +===== Lecture 28 (4/8 Wed.) =====
 +** Required: **
 +  * {{lamport_-_1979_-_how_to_make_a_multiprocessor_computer_that_correctly_executes_multiprocess_programs.pdf|Lamport,​ L. (1979). How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs.}}
 +  * {{papamarcos_patel_-_1984_-_a_low-overhead_coherence_solution_for_multiprocessors_with_private_cache_memories.pdf|Papamarcos,​ M. S., & Patel, J. H. (1984). A low-overhead coherence solution for multiprocessors with private cache memories. Proceedings of the 11th annual international symposium on Computer architecture.}}
 +  * (CMU WebISO) [[http://​www.ece.cmu.edu/​~ece447/​cmu_only/​culler-mesi.pdf|C&​S,​ Chapters 5.1 & 5.3]]
 +  * P&H, Chapter 5.8
 +** Recommended:​ **
 +  * {{10.1.1.17.8112.pdf|Gharachorloo et al. (1990). Memory Consistency and Event Ordering in Scalable Shared-Memory Multiprocessors.}}
 +  * {{10.1.1.89.3693.pdf|Gharachorloo et al. (1991). Two Techniques to Enhance the Performance of Memory Consistency Models.}}
 +  * {{isca07_bulksc.pdf|Ceze et al. (2007). BulkSC: Bulk Enforcement of Sequential Consistency.}}
 +  * {{censier.pdf|Censier et al. (1978). A new solution to coherence problems in multicache systems.}}
 +  * {{goodman-snoopyprotocol.pdf|Goodman (1983). Using cache memory to reduce processor-memory traffic.}}
 +  * {{isca123.pdf|Laudon et al. (1997). The SGI Origin: a ccNUMA highly scalable server.}}
 +  * {{isca03_token_coherence.pdf|Martin et al. (2003). Token coherence: decoupling performance and correctness.}}
 +  * {{p73-baer.pdf|Baer et al. (1988). On the inclusion properties for multi-level cache hierarchies.}}
 +** Mentioned during lecture: **
 +  * (HTML) [[http://​www.cs.utexas.edu/​users/​EWD/​transcriptions/​EWD01xx/​EWD123.html|Dijkstra (1965) Cooperating Sequential Processes.]]
 +
 +===== Lecture 29 (4/10 Fri.) =====
 +** Required: **
 +  * (CMU WebISO) [[http://​www.ece.cmu.edu/​~ece447/​cmu_only/​culler-mesi.pdf|C&​S,​ Chapters 5.1 & 5.3]]
 +  * P&H, Chapter 5.8
 +  * {{papamarcos_patel_-_1984_-_a_low-overhead_coherence_solution_for_multiprocessors_with_private_cache_memories.pdf|Papamarcos,​ M. S., & Patel, J. H. (1984). A low-overhead coherence solution for multiprocessors with private cache memories. Proceedings of the 11th annual international symposium on Computer architecture.}}
 +** Recommended:​ **
 +  * {{censier.pdf|Censier et al. (1978). A new solution to coherence problems in multicache systems.}}
 +  * {{goodman-snoopyprotocol.pdf|Goodman (1983). Using cache memory to reduce processor-memory traffic.}}
 +  * {{isca123.pdf|Laudon et al. (1997). The SGI Origin: a ccNUMA highly scalable server.}}
 +  * {{isca03_token_coherence.pdf|Martin et al. (2003). Token coherence: decoupling performance and correctness.}}
 +  * {{p73-baer.pdf|Baer et al. (1988). On the inclusion properties for multi-level cache hierarchies.}}
 +
 +===== Lecture 30 (4/13 Mon.) =====
 +** Required: **
 +  * {{rowclone_micro13.pdf|Seshadri et al., “RowClone:​ Fast and Efficient In-DRAM Copy and Initialization of Bulk Data,” MICRO 2013.}}
 +
readings.1428095578.txt.gz · Last modified: 2015/04/03 17:12 by albert