Differences

This shows you the differences between two versions of the page.

--- readings [2014/04/16 14:12]
rachata
+++ readings [2014/05/20 01:46]
rachata
@@ Line 287: / Line 287: @@
   * P&H, Chapter 5.8
+===== Lecture 30 (4/18 Fri.) =====
+** Required: **
+  * {{LCP.pdf|Pekhimenko et al., “Linearly Compressed Pages: A Main Memory Compression Framework with Low Complexity and Low Latency,” MICRO 2013.}}
+  * {{bdi-compression_pact12.pdf|Pekhimenko et al., "Base-Delta-Immediate Compression: Practical Data Compression for On-Chip Caches," PACT 2012.}}
+  * {{mise-predictable_memory_performance-hpca13.pdf|Subramanian et al., “MISE: Providing Performance Predictability and Improving Fairness in Shared Main Memory Systems,” HPCA 2013.}}
+===== Lecture 31 (4/28 Mon.) =====
+** Required: **
+  * {{amdahl_-_1967_-_validity_of_the_single_processor_approach_to_achieving_large_scale_computing_capabilities.pdf|Amdahl, G. M. (1967). Validity of the single processor approach to achieving large scale computing capabilities. Proceedings of the April 18-20, 1967, spring joint computer conference.}}
+  * {{lamport_-_1979_-_how_to_make_a_multiprocessor_computer_that_correctly_executes_multiprocess_programs.pdf|Lamport, L. (1979). How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs.}}
+  * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/culler-mesi.pdf|C&S, Chapters 5.1 & 5.3]]
+  * P&H, Chapter 5.8
+** Recommended: **
+  * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/hill_309_314.pdf|Hill, Jouppi, Sohi. "Multiprocessors and Multicomputers," pp. 551-560 in Readings in Computer Architecture.]]
+  * (CMU WebISO) [[http://www.ece.cmu.edu/~ece447/cmu_only/hill_551_560.pdf|Hill, Jouppi, Sohi. "Dataflow and Multithreading," pp. 309-314 in Readings in Computer Architecture.]]
+  * {{01447203.pdf|Flynn, M. J. (1966). Very high-speed computing systems. Proceedings of the IEEE.}}
+  * {{papamarcos_patel_-_1984_-_a_low-overhead_coherence_solution_for_multiprocessors_with_private_cache_memories.pdf|Papamarcos, M. S., & Patel, J. H. (1984). A low-overhead coherence solution for multiprocessors with private cache memories. Proceedings of the 11th annual international symposium on Computer architecture.}}
+** Mentioned during lecture: **
+  * {{p168-patel.pdf|Patel, J. H. (1979). Processor-memory interconnections for multiprocessors. Proceedings of the 6th annual symposium on Computer architecture.}}
+  * {{p196-moscibroda.pdf|Moscibroda, T., & Mutlu, O. (2009). A case for bufferless routing in on-chip networks. Proceedings of the 36th annual international symposium on Computer architecture.}}
+  * {{p27-gottlieb.pdf|Gottlieb, A., Grishman, R., Kruskal, C. P., McAuliffe, K. P., Rudolph, L., & Snir, M. (1982). The NYU Ultracomputer -- designing a MIMD, shared-memory parallel machine (Extended Abstract). Proceedings of the 9th annual symposium on Computer Architecture.}}
+  * {{p22-seitz.pdf|Seitz, C. L. (1985). The cosmic cube. Commun. ACM.}}
+  * {{p278-glass.pdf|Glass, C. J., & Ni, L. M. (1992). The turn model for adaptive routing. Proceedings of the 19th annual international symposium on Computer architecture.}}
+===== Lecture 32 (4/30 Wed.) =====
+** Required: **
+  * None
+** Mentioned during lecture: **
+  * {{amdahl_-_1967_-_validity_of_the_single_processor_approach_to_achieving_large_scale_computing_capabilities.pdf|Amdahl, G. M. (1967). Validity of the single processor approach to achieving large scale computing capabilities. Proceedings of the April 18-20, 1967, spring joint computer conference.}}
+  * {{grochowski_et_al._-_2004_-_best_of_both_latency_and_throughput.pdf|Grochowski, E., Ronen, R., Shen, J., & Wang, H. (2004). Best of Both Latency and Throughput. Proceedings of the IEEE International Conference on Computer Design (pp. 236–243).}}
+  * {{tendler_et_al._-_2002_-_power4_system_microarchitecture.pdf|Tendler, J. M., Dodson, J. S., Fields, J. S., Le, H., & Sinharoy, B. (2002). POWER4 system microarchitecture. IBM J. Res. Dev.}}
+  * {{01289290.pdf|Kalla, R., Sinharoy, B., & Tendler, J. M. (2004). IBM Power5 Chip: A Dual-Core Multithreaded Processor. IEEE Micro.}}
+  * {{kongetira_aingaran_olukotun_-_2005_-_niagara_a_32-way_multithreaded_sparc_processor.pdf|Kongetira, P., Aingaran, K., & Olukotun, K. (2005). Niagara: A 32-Way Multithreaded Sparc Processor. IEEE Micro.}}
+  * {{p253-suleman.pdf|Suleman, M. A., Mutlu, O., Qureshi, M. K., & Patt, Y. N. (2009). Accelerating critical section execution with asymmetric multi-core architectures. Proceedings of the 14th international conference on Architectural support for programming languages and operating systems.}}
+  * {{p441-suleman.pdf|Suleman, M. A., Mutlu, O., Joao, J. A., Khubaib, & Patt, Y. N. (2010). Data marshaling for multi-core architectures. Proceedings of the 37th annual international symposium on Computer architecture.}}
+  * {{p223-joao.pdf|Joao, J. A., Suleman, M. A., Mutlu, O., & Patt, Y. N. (2012). Bottleneck identification and scheduling in multithreaded applications. Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems.}}
+===== Lecture 33 (5/2 Fri.) =====
+** Required: **
+  * None
+** Mentioned during lecture: **
+  * Liu, Jaiyen, Veras, Mutlu, “RAIDR: Retention-Aware Intelligent DRAM Refresh,” ISCA 2012.
+  * Kim, Seshadri, Lee+, “A Case for Exploiting Subarray-Level Parallelism in DRAM,” ISCA 2012.
+  * Lee+, “Tiered-Latency DRAM: A Low Latency and Low Cost DRAM Architecture,” HPCA 2013.
+  * Liu+, “An Experimental Study of Data Retention Behavior in Modern DRAM Devices,” ISCA 2013.
+  * Seshadri+, “RowClone: Fast and Efficient In-DRAM Copy and Initialization of Bulk Data,” MICRO 2013.
+  * Pekhimenko+, “Linearly Compressed Pages: A Main Memory Compression Framework,” MICRO 2013.
+  * Chang+, “Improving DRAM Performance by Parallelizing Refreshes with Accesses,” HPCA 2014.
+  * Khan+, “The Efficacy of Error Mitigation Techniques for DRAM Retention Failures: A Comparative Experimental Study,” SIGMETRICS 2014.
+  * Luo+, “Characterizing Application Memory Error Vulnerability to Optimize Data Center Cost,” DSN 2014.
+  * Kim+, “Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors,” ISCA 2014.
+  * Lee, Ipek, Mutlu, Burger, “Architecting Phase Change Memory as a Scalable DRAM Alternative,” ISCA 2009, CACM 2010, Top Picks 2010.
+  * Meza, Chang, Yoon, Mutlu, Ranganathan, “Enabling Efficient and Scalable Hybrid Memories,” IEEE Comp. Arch. Letters 2012.
+  * Yoon, Meza et al., “Row Buffer Locality Aware Caching Policies for Hybrid Memories,” ICCD 2012.
+  * Kultursay+, “Evaluating STT-RAM as an Energy-Efficient Main Memory Alternative,” ISPASS 2013.
+  * Meza+, “A Case for Efficient Hardware-Software Cooperative Management of Storage and
+Memory,” WEED 2013.
+  * Lee, Ipek, Mutlu, Burger, “Architecting Phase Change Memory as a Scalable DRAM Alternative,” ISCA 2009.
+  * Meza+, “Enabling Efficient and Scalable Hybrid Memories,” IEEE Comp. Arch. Letters, 2012.
+  * Yoon, Meza et al., “Row Buffer Locality Aware Caching Policies for Hybrid Memories,” ICCD 2012 Best Paper Award.

18-447 Introduction to Computer Architecture – Spring 2015

User Tools

Site Tools

Differences

Page Tools