This shows you the differences between two versions of the page.
buzzword [2015/04/10 16:38] kevincha [Lecture 28 (4/8 Wed.)] |
buzzword [2015/04/16 21:42] kevincha |
||
---|---|---|---|
Line 1280: | Line 1280: | ||
* Invalidates the block on a write | * Invalidates the block on a write | ||
* Has an exclusive state | * Has an exclusive state | ||
+ | |||
+ | ===== Lecture 29 (4/10 Fri.) ===== | ||
+ | * MSI coherence protocol | ||
+ | * The problem: unnecessary broadcasts of invalidations | ||
+ | * MESI coherence protocol | ||
+ | * Adds the exclusive state to MSI: the block is the only cached copy and it is clean | ||
+ | * Multiple invalidation tradeoffs | ||
+ | * Problem: memory can be unnecessarily updated | ||
+ | * A possible owner state (MOESI) | ||
+ | * Tradeoffs between snooping and directory based coherence protocols | ||
+ | * Slide 31 has a good summary | ||
+ | * Directory: data structures | ||
+ | * Bit vectors vs. linked lists | ||
+ | * Scalability of directories | ||
+ | * Size? Latency? Thousands of nodes? Best of both snooping and directory? | ||
+ | |||
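The MSI-to-MESI step listed above can be sketched as a small state machine. This is a minimal illustration, not the lecture's own definition: the bus event names (`BusRd`, `BusRdX`, `BusUpgr`) follow common textbook convention and are assumptions here, as are the function names and the `others_have_copy` flag.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


def on_processor_event(state, event, others_have_copy=False):
    """Return (next_state, bus_action) for a local read or write."""
    if event == "read":
        if state is State.I:
            # Read miss: enter E only if no other cache holds the block.
            return (State.S if others_have_copy else State.E, "BusRd")
        return (state, None)  # read hit in M, E, or S
    if event == "write":
        if state in (State.M, State.E):
            # E -> M is silent: no invalidation broadcast is needed,
            # which is the case plain MSI cannot detect.
            return (State.M, None)
        if state is State.S:
            return (State.M, "BusUpgr")  # invalidate the other sharers
        return (State.M, "BusRdX")  # write miss from I
    raise ValueError(f"unknown processor event: {event}")


def on_snoop(state, bus_event):
    """Return (next_state, flush) when another cache's request is observed."""
    if state is State.I:
        return (State.I, False)
    if bus_event == "BusRd":
        # A dirty block is flushed when another cache reads it.
        return (State.S, state is State.M)
    if bus_event in ("BusRdX", "BusUpgr"):
        return (State.I, state is State.M)
    raise ValueError(f"unknown bus event: {bus_event}")
```

In this sketch, the flush on `BusRd` from the M state is the memory update that an owner state (MOESI) is meant to avoid.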
+ | | ||
+ | ===== Lecture 30 (4/13 Mon.) ===== | ||
+ | * In-memory computing | ||
+ | * Design goals of DRAM | ||
+ | * DRAM structures | ||
+ | * Banks | ||
+ | * Capacitors and sense amplifiers | ||
+ | * Trade-offs b/w number of sense amps and cells | ||
+ | * Width of bank I/O vs. row size | ||
+ | * DRAM operations | ||
+ | * ACTIVATE, READ/WRITE, and PRECHARGE | ||
+ | * Trade-offs | ||
+ | * Latency | ||
+ | * Bandwidth: Chip vs. rank vs. bank | ||
+ | * What's the benefit of having 8 chips? | ||
+ | * Parallelism | ||
+ | * RowClone | ||
+ | * What are the problems? | ||
+ | * Copying b/w two rows that share the same sense amplifier | ||
+ | * System software support | ||
+ | * Bitwise AND/OR | ||
+ | |||
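The ACTIVATE, READ/WRITE, and PRECHARGE sequence listed above can be illustrated with a toy model of one bank. This is a rough functional sketch with no timing; the `Bank` class and the `access` helper are hypothetical names introduced only for illustration.

```python
class Bank:
    """One DRAM bank with a single row buffer (the sense amplifiers)."""

    def __init__(self, num_rows, row_size):
        self.rows = [[0] * row_size for _ in range(num_rows)]
        self.open_row = None  # index of the row currently latched, if any

    def activate(self, row):
        if self.open_row is not None:
            raise RuntimeError("PRECHARGE required before another ACTIVATE")
        self.open_row = row

    def read(self, col):
        if self.open_row is None:
            raise RuntimeError("no open row: ACTIVATE first")
        return self.rows[self.open_row][col]

    def write(self, col, value):
        if self.open_row is None:
            raise RuntimeError("no open row: ACTIVATE first")
        self.rows[self.open_row][col] = value

    def precharge(self):
        self.open_row = None


def access(bank, row, col):
    """Read one column and return (value, list of commands issued)."""
    commands = []
    if bank.open_row != row:
        if bank.open_row is not None:
            bank.precharge()  # close the conflicting row first
            commands.append("PRECHARGE")
        bank.activate(row)
        commands.append("ACTIVATE")
    commands.append("READ")
    return bank.read(col), commands
```

A second access to the same row needs only a READ, while an access to a different row pays for PRECHARGE and ACTIVATE as well, which is one source of the latency trade-off noted in the list.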
+ | ===== Lecture 31 (4/15 Wed.) ===== | ||
+ | |||
+ | * Application slowdown | ||
+ | * Interference between different applications | ||
+ | * An application's performance depends on the other applications it is running with | ||
+ | * Predictable performance | ||
+ | * Why is it important? | ||
+ | * Applications that need predictability | ||
+ | * How to predict the performance? | ||
+ | * What information is useful? | ||
+ | * What needs to be guaranteed? | ||
+ | * How to estimate the performance when running with others? | ||
+ | * Easy, just measure the performance while it is running. | ||
+ | * How to estimate the performance when the application is running by itself? | ||
+ | * Hard if there is no profiling. | ||
+ | * The relationship between memory service rate and the performance. | ||
+ | * Key assumption: applications are memory bound | ||
+ | * Behavior of memory-bound applications | ||
+ | * With and without interference | ||
+ | * Memory phase vs. compute phase | ||
+ | * MISE | ||
+ | * Estimating slowdown using request service rate | ||
+ | * Inaccuracy when measuring request service rate alone | ||
+ | * Non-memory-bound applications | ||
+ | * Control slowdown and provide soft guarantees | ||
+ | * Taking the shared cache into account | ||
+ | * MISE model + cache resource management | ||
+ | * Auxiliary tag store | ||
+ | * Separate tag store for different cores | ||
+ | * Cache access rate when running alone vs. when sharing, as the metric to estimate slowdown | ||
+ | * Cache partitioning | ||
+ | * How to determine partitioning | ||
+ | * Utility based cache partitioning | ||
+ | * Others | ||
+ | * Maximum slowdown and fairness metric | ||
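The slowdown estimate outlined above can be written as a short calculation. This sketch assumes the commonly cited form of the MISE model: slowdown is the ratio of the alone request service rate to the shared one, scaled by the fraction of time spent in the memory phase for applications that are not memory bound. The function and parameter names are hypothetical.

```python
def mise_slowdown(alone_rate, shared_rate, memory_fraction=1.0):
    """Estimate slowdown from request service rates.

    alone_rate:      request service rate when running alone (estimated)
    shared_rate:     request service rate measured while sharing memory
    memory_fraction: fraction of time in the memory phase; 1.0 models a
                     fully memory-bound application
    """
    if alone_rate <= 0 or shared_rate <= 0:
        raise ValueError("service rates must be positive")
    if not 0.0 <= memory_fraction <= 1.0:
        raise ValueError("memory_fraction must be in [0, 1]")
    # The compute phase is assumed unaffected by memory interference.
    return (1.0 - memory_fraction) + memory_fraction * (alone_rate / shared_rate)


def maximum_slowdown(slowdowns):
    """Largest slowdown across applications; lower means a fairer system."""
    return max(slowdowns)
```

For example, an application served at half its alone rate has an estimated slowdown of 2.0 if it is fully memory bound, but only 1.5 if it spends half of its time in the compute phase.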
+ | | ||
+ | |||
+ | |
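One way to picture utility-based cache partitioning is a greedy allocation over per-application miss curves. This is a simplified sketch and not necessarily the algorithm covered in the lecture: the miss curves are assumed to be given as input (in hardware they would come from some form of monitoring), and `greedy_partition` is a hypothetical name.

```python
def greedy_partition(miss_curves, total_ways):
    """Assign cache ways one at a time to the application that gains most.

    miss_curves[i][w] is the miss count of application i when it owns
    w ways, for w = 0 .. total_ways.
    """
    alloc = [0] * len(miss_curves)
    for _ in range(total_ways):
        # Marginal utility: misses avoided by granting one more way.
        gains = [curve[ways] - curve[ways + 1]
                 for curve, ways in zip(miss_curves, alloc)]
        winner = max(range(len(gains)), key=gains.__getitem__)
        alloc[winner] += 1
    return alloc


# Two hypothetical applications sharing a 4-way cache.
curves = [
    [100, 60, 40, 35, 33],  # benefits early, then saturates
    [100, 90, 50, 20, 10],  # needs several ways before it benefits
]
allocation = greedy_partition(curves, 4)
```

With these made-up curves the greedy choice gives each application two ways, although giving the second application three ways would produce fewer total misses. One-way-at-a-time greedy allocation can miss such cases when a miss curve is not convex.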