User Tools

Site Tools


buzzword

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
Last revision Both sides next revision
buzzword [2015/04/10 16:38]
kevincha [Lecture 28 (4/8 Wed.)]
buzzword [2015/04/22 16:47]
rachata
Line 1250: Line 1250:
     * Debugging is also difficult (non-deterministic behavior)     * Debugging is also difficult (non-deterministic behavior)
   * Dekker'​s algorithm   * Dekker'​s algorithm
-    * Inconsistency -- the two processors did NOT see the same order of +    * Inconsistency -- the two processors did NOT see the same order of operations to memory
-      ​operations to memory+
   * Sequential consistency   * Sequential consistency
     * Multiple correct global orders     * Multiple correct global orders
Line 1281: Line 1280:
         * Invalidates the block on a write         * Invalidates the block on a write
         * Has an exclusive state         * Has an exclusive state
 +
 +===== Lecture 29 (4/10 Fri.) =====
 +  * MSI coherent protocol
 +    * The problem: unnecessary broadcasts of invalidations
 +  * MESI coherent protocol
 +    * Add the exclusive state: this is the only cache copy and it is a clean state to MSI
 +    * Multiple invalidation tradeoffs
 +    * Problem: memory can be unnecessarily updated
 +    * A possible owner state (MOESI)
 +  * Tradeoffs between snooping and directory based coherence protocols
 +    * Slide 31 has a good summary
 +  * Directory: data structures
 +    * Bit vectors vs. linked lists
 +  * Scalability of directories
 +    * Size? Latency? Thousand of nodes? Best of both snooping and directory?
 +
 +    ​
 +===== Lecture 30 (4/13 Mon.) =====
 +  * In-memory computing
 +  * Design goals of DRAM
 +  * DRAM structures
 +    * Banks
 +    * Capacitors and sense amplifiers
 +    * Trade-offs b/w number of sense amps and cells
 +    * Width of bank I/O vs. row size
 +  * DRAM operations
 +    * ACTIVATE, READ/WRITE, and PRECHARGE
 +  * Trade-offs
 +    * Latency
 +    * Bandwidth: Chip vs. rank vs. bank
 +      * What's the benefit of having 8 chips?
 +    * Parallelism
 +  * RowClone
 +    * What are the problems?
 +    * Copying b/w two rows that share the same sense amplifier
 +    * System software support
 +  * Bitwise AND/OR
 +
 +===== Lecture 31 (4/15 Wed.) =====
 +
 +  * Application slowdown
 +  * Interference between different applications
 +    * Applications'​ performance depends on other applications that they are running with
 +  * Predictable performance
 +    * Why are they important?
 +    * Applications that need predictibility
 +    * How to predict the performance?​
 +      * What information are useful?
 +      * What need to be guarantee?
 +      * How to estimate the performance when running with others?
 +        * Easy, just measure the performance while it is running.
 +      * How to estimate the performance when the application is running by itself.
 +        * Hard if there is no profiling.
 +      * The relationship between memory service rate and the performance.
 +        * Key assumption: applications are memory bound
 +    * Behavior of memory-bound applications
 +      * With and without interference
 +  * Memory phase vs. compute phase
 +  * MISE
 +    * Estimating slowdown using request service rate
 +    * Inaccuracy when measuring request service rate alone
 +    * Non-memory-bound applications
 +    * Control slowdown and provide soft guarantee
 +  * Taking into account of the shared cache
 +    * MISE model + cache resource management
 +    * Aug tag store
 +      * Separate tag store for different cores
 +    * Cache access rate alone and shared as the metric to estimate slowdown
 +  * Cache paritiioning
 +    * How to determine partitioning
 +      * Utility based cache partitioning
 +      * Others
 +  * Maximum slowdown and fairness metric
 +    ​
 +
 +
 +===== Lecture 32 (4/20 Mon.) =====
 +
 +  * Heterogeneous systems
 +    * Asymmetric cores: different types of cores on the chip
 +      * Each of these cores are optimized for different workloads/​requirements/​goals
 +      * Multiple special purpose processors
 +      * Flexible and can adapt to workload behavior
 +      * Disadvantages:​ complex and high overhead
 +    * Examples: CPU-GPU systems, heterogeneity in execution models
 +    * Heterogeneous resources
 +      * Example: reliable and non-reliable DRAM in the same system
 +  * Key problems in modern systems
 +    * Memory system
 +    * Efficiency
 +    * Predictability
 +    * Assymmetric design can help solving these problems
 +  * Serialized code sections
 +    * Bottleneck in multicore execution
 +    * Parallelizable vs. serial portion
 +    * Accelerate critical section
 +    * Cache ping-ponging
 +      * Synchronization latency
 +    * Symmetric vs. assymmetric design
 +  * Large cores + small cores
 +    * Core assymmetry
 +  * Amdahl'​s law with heterogeneous cores
 +  * Parallel bottlenecks
 +    * Resource contention
 +      * Depends on what are running
 +  * Accelerated critical section
 +    * Ship critical sections to large cores
 +    * Small modifications and low overhead
 +    * False serialization might become the bottleneck
 +    * Can reduce parallel throughput
 +    * Effect on private cache misses and shared cache misses
 +    ​
 +  ​
 +
 +    ​
 +    ​
buzzword.txt · Last modified: 2015/04/27 18:20 by rachata