User Tools

Site Tools


readings

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
readings [2015/04/06 19:05]
clement
readings [2015/04/13 15:31] (current)
kevincha
Line 348: Line 348:
  
 ===== Lecture 27 (4/6 Mon.) ===== ===== Lecture 27 (4/6 Mon.) =====
-**Required:​** +** Required: ** 
-  * {{main-memory-system_kiise15.pdf| Onur MutluJustin Meza, and Lavanya Subramanian, "The Main Memory SystemChallenges ​and Opportunities," ​Invited Article ​in Communications ​of the Korean Institute ​of Information Scientists ​and Engineers ​(KIISE), 2015.}}+  * {{amdahl_-_1967_-_validity_of_the_single_processor_approach_to_achieving_large_scale_computing_capabilities.pdf|AmdahlG. M. (1967). Validity of the single processor approach to achieving large scale computing capabilities. Proceedings of the April 18-201967, spring joint computer conference.}} 
 +  * {{lamport_-_1979_-_how_to_make_a_multiprocessor_computer_that_correctly_executes_multiprocess_programs.pdf|Lamport,​ L. (1979). How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs.}} 
 +  * (CMU WebISO) [[http://​www.ece.cmu.edu/​~ece447/​cmu_only/​culler-mesi.pdf|C&​S,​ Chapters 5.1 & 5.3]] 
 +  * P&H, Chapter 5.8 
 +** Recommended:​ ** 
 +  * (CMU WebISO) [[http://​www.ece.cmu.edu/​~ece447/​cmu_only/​hill_309_314.pdf|Hill,​ Jouppi, Sohi. "​Multiprocessors ​and Multicomputers," ​pp. 551-560 in Readings in Computer Architecture.]] 
 +  * (CMU WebISO) [[http://​www.ece.cmu.edu/​~ece447/​cmu_only/​hill_551_560.pdf|Hill,​ Jouppi, Sohi. "​Dataflow ​and Multithreading," ​pp. 309-314 ​in Readings in Computer Architecture.]] 
 +  * {{01447203.pdf|Flynn,​ M. J. (1966). Very high-speed computing systems. Proceedings ​of the IEEE.}} 
 +  * {{papamarcos_patel_-_1984_-_a_low-overhead_coherence_solution_for_multiprocessors_with_private_cache_memories.pdf|Papamarcos,​ M. S., & Patel, J. H. (1984). A low-overhead coherence solution for multiprocessors with private cache memories. Proceedings ​of the 11th annual international symposium on Computer architecture.}} 
 +** Mentioned during lecture: ** 
 +  * {{horner-1819.pdf|Horner (1819). A new method of solving numerical equations of all orders, by continuous approximation. Philosophical Transactions of the Royal Society}} 
 + 
 +===== Lecture 28 (4/8 Wed.) ===== 
 +** Required: ** 
 +  * {{lamport_-_1979_-_how_to_make_a_multiprocessor_computer_that_correctly_executes_multiprocess_programs.pdf|Lamport,​ L. (1979). How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs.}} 
 +  * {{papamarcos_patel_-_1984_-_a_low-overhead_coherence_solution_for_multiprocessors_with_private_cache_memories.pdf|Papamarcos,​ M. S., & Patel, J. H. (1984). A low-overhead coherence solution for multiprocessors with private cache memories. Proceedings of the 11th annual international symposium on Computer architecture.}} 
 +  * (CMU WebISO) [[http://​www.ece.cmu.edu/​~ece447/​cmu_only/​culler-mesi.pdf|C&​S,​ Chapters 5.1 & 5.3]] 
 +  * P&H, Chapter 5.8 
 +** Recommended:​ ** 
 +  * {{10.1.1.17.8112.pdf|Gharachorloo et al. (1990). Memory Consistency ​and Event Ordering in Scalable Shared-Memory Multiprocessors.}} 
 +  * {{10.1.1.89.3693.pdf|Gharachorloo et al. (1991). Two Techniques to Enhance the Performance of Memory Consistency Models.}} 
 +  * {{isca07_bulksc.pdf|Ceze et al. (2007). BulkSC: Bulk Enforcement of Sequential Consistency.}} 
 +  * {{censier.pdf|Censier et al. (1978). A new solution to coherence problems in multicache systems.}} 
 +  * {{goodman-snoopyprotocol.pdf|Goodman (1983). Using cache memory to reduce processor-memory traffic.}} 
 +  * {{isca123.pdf|Laudon et al. (1997). The SGI Origin: a ccNUMA highly scalable server.}} 
 +  * {{isca03_token_coherence.pdf|Martin et al. (2003). Token coherence: decoupling performance and correctness.}} 
 +  * {{p73-baer.pdf|Baer et al. (1988). On the inclusion properties for multi-level cache hierarchies.}} 
 +** Mentioned during lecture: ** 
 +  * (HTML) [[http://​www.cs.utexas.edu/​users/​EWD/​transcriptions/​EWD01xx/​EWD123.html|Dijkstra (1965) Cooperating Sequential Processes.]] 
 + 
 +===== Lecture 29 (4/10 Fri.) ===== 
 +** Required: ** 
 +  * (CMU WebISO) [[http://​www.ece.cmu.edu/​~ece447/​cmu_only/​culler-mesi.pdf|C&​S,​ Chapters 5.1 & 5.3]] 
 +  * P&H, Chapter 5.8 
 +  * {{papamarcos_patel_-_1984_-_a_low-overhead_coherence_solution_for_multiprocessors_with_private_cache_memories.pdf|Papamarcos,​ M. S., & Patel, J. H. (1984). A low-overhead coherence solution for multiprocessors with private cache memories. Proceedings of the 11th annual international symposium on Computer architecture.}} 
 +** Recommended:​ ** 
 +  * {{censier.pdf|Censier et al. (1978). A new solution to coherence problems in multicache systems.}} 
 +  * {{goodman-snoopyprotocol.pdf|Goodman (1983). Using cache memory to reduce processor-memory traffic.}} 
 +  * {{isca123.pdf|Laudon et al. (1997). The SGI Origin: a ccNUMA highly scalable server.}} 
 +  * {{isca03_token_coherence.pdf|Martin et al. (2003). Token coherence: decoupling performance and correctness.}} 
 +  * {{p73-baer.pdf|Baer et al. (1988). On the inclusion properties for multi-level cache hierarchies.}} 
 + 
 +===== Lecture 30 (4/13 Mon.) ===== 
 +** Required: ** 
 +  * {{rowclone_micro13.pdf|Seshadri et al., “RowClone:​ Fast and Efficient In-DRAM Copy and Initialization of Bulk Data,” MICRO 2013.}}
  
-**Mentioned During Lecture:** 
-  * {{jouppi1990.pdf|Jouppi,​ “Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Associative Cache and Prefetch Buffers,” ISCA 1990.}} 
-  * {{http://​users.ece.cmu.edu/​~omutlu/​pub/​TR-HPS-2006-006.pdf|Srinathet al., “Feedback directed prefetching”,​ HPCA 2007.}} 
readings.1428361533.txt.gz · Last modified: 2015/04/06 19:05 by clement