buzzword [2014/04/18 18:19] rachata |
buzzword [2014/04/30 18:12] rachata |
||
---|---|---|---|
Line 1250: | Line 1250: | ||
| | ||
| | ||
- | | + | ===== Lecture 31 (4/28 Mon.) ===== |
+ | * Directory-based cache coherence | ||
+ | * Each directory has to handle validation/invalidation | ||
+ | * Extra cost of synchronization | ||
+ | * Need to ensure race conditions are resolved | ||
+ | * Interconnection | ||
+ | * Topology | ||
+ | * Bus | ||
+ | * Mesh | ||
+ | * Torus | ||
+ | * Tree | ||
+ | * Butterfly | ||
+ | * Ring | ||
+ | * Bi-directional ring | ||
+ | * More scalable | ||
+ | * Hierarchical ring | ||
+ | * Even more scalable | ||
+ | * More complex | ||
+ | * Crossbar | ||
+ | * etc. | ||
+ | * Circuit switching | ||
+ | * Multistage network | ||
+ | * Butterfly | ||
+ | * Delta network | ||
+ | * Handling contention | ||
+ | * Buffering vs. dropping/deflection (no buffering) | ||
+ | * Routing algorithm | ||
+ | * Handling deadlock | ||
+ | * X-Y routing | ||
+ | * Turn model (to avoid deadlocks) | ||
+ | * Add more buffering for an escape path | ||
+ | * Oblivious routing | ||
+ | * Can take different paths | ||
+ | * Dimension-order routing (DOR) between each intermediate location | ||
+ | * Balance network load | ||
+ | * Adaptive routing | ||
+ | * Use the state of the network to determine the route | ||
+ | * Aware of local and/or global congestion | ||
+ | * Non-minimal adaptive routing can suffer from livelock | ||
+ | |||
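The X-Y routing and oblivious routing bullets above can be illustrated with a minimal sketch on a 2D mesh. The function names `xy_route` and `valiant_route`, the `(x, y)` coordinate convention, and the choice of a single uniformly random intermediate node are assumptions made for illustration; the notes only state that oblivious routing applies DOR between each intermediate location.

```python
import random


def xy_route(src, dst):
    """Dimension-order (X-Y) routing: move fully along X, then along Y.

    Returns the list of (x, y) nodes visited, including src and dst.
    """
    x, y = src
    path = [(x, y)]
    step = 1 if dst[0] > x else -1
    while x != dst[0]:          # first dimension: X
        x += step
        path.append((x, y))
    step = 1 if dst[1] > y else -1
    while y != dst[1]:          # second dimension: Y
        y += step
        path.append((x, y))
    return path


def valiant_route(src, dst, width, height, rng=random):
    """Oblivious routing sketch: pick a random intermediate node, then use
    X-Y routing from src to it and from it to dst.

    The route does not depend on network state, but different packets
    between the same endpoints can take different paths.
    """
    mid = (rng.randrange(width), rng.randrange(height))
    first = xy_route(src, mid)
    second = xy_route(mid, dst)
    return first + second[1:]   # drop the duplicated intermediate node


if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1)))
    print(valiant_route((0, 0), (2, 1), width=4, height=4))
```

Because X-Y routing never turns from the Y dimension back into X, it cannot form a cyclic dependency on a mesh, which is why it appears under deadlock handling. The random intermediate hop trades longer paths for better load balance.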
+ | ===== Lecture 32 (4/30 Wed.) ===== | ||
+ | |||
+ | |||
+ | * Serialized code section | ||
+ | * Degrades performance | ||
+ | * Wastes energy | ||
+ | * Heterogeneous cores | ||
+ | * Can execute the serialized portion on a powerful large core | ||
+ | * Tradeoff between multiple small cores, multiple large cores, or heterogeneous cores | ||
+ | * Critical section | ||
+ | * Bottleneck in several multithreaded workloads | ||
+ | * Asymmetry can help | ||
+ | * Accelerated critical sections (ACS) | ||
+ | * Use a large core to run the serialized portion of the code | ||
+ | * How to correctly support ACS | ||
+ | * False serialization | ||
+ | * Handling private/shared data | ||
+ | * Bottleneck identification and scheduling (BIS) | ||
+ | * Identify the bottleneck | ||
+ | * Serial bottleneck | ||
+ | * Barrier | ||
+ | * Critical section | ||
+ | * Pipeline stages | ||
+ | * Application might wait on different types of bottlenecks | ||
+ | * Allow bottleneckcall and bottleneckreturn | ||
+ | * Acceleration can be done in multiple ways | ||
+ | * Ship to a big core | ||
+ | * Increase the frequency | ||
+ | * Prioritize the thread in shared resources (e.g., the memory scheduler always schedules requests from that thread first) | ||
+ | * Bottleneck table keeps track of each thread's bottlenecks and determines their criticality | ||
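The bottleneck table idea can be sketched as a small software model. Everything here is hypothetical: the class name `BottleneckTable`, its method names, and in particular the criticality metric (total cycles that threads have spent waiting on each bottleneck) are assumptions for illustration. The notes only say that the table tracks bottlenecks and determines their criticality.

```python
from collections import defaultdict


class BottleneckTable:
    """Toy model: one entry per bottleneck id (lock, barrier, pipeline stage)."""

    def __init__(self):
        # bottleneck id -> set of thread ids currently waiting on it
        self.waiters = defaultdict(set)
        # bottleneck id -> accumulated thread waiting cycles (assumed metric)
        self.wait_cycles = defaultdict(int)

    def start_wait(self, bid, tid):
        """Thread tid begins waiting on bottleneck bid."""
        self.waiters[bid].add(tid)

    def stop_wait(self, bid, tid):
        """Thread tid is no longer waiting on bottleneck bid."""
        self.waiters[bid].discard(tid)

    def tick(self, cycles=1):
        """Advance time: each waiting thread adds `cycles` to its bottleneck."""
        for bid, threads in self.waiters.items():
            if threads:
                self.wait_cycles[bid] += cycles * len(threads)

    def most_critical(self):
        """Return the bottleneck with the most accumulated waiting, or None."""
        if not self.wait_cycles:
            return None
        return max(self.wait_cycles, key=self.wait_cycles.get)


if __name__ == "__main__":
    table = BottleneckTable()
    table.start_wait("lock_A", tid=1)
    table.start_wait("lock_A", tid=2)
    table.start_wait("barrier_B", tid=3)
    table.tick(10)
    print(table.most_critical())
```

Under this assumed metric, the bottleneck that stalls the most threads for the longest time would be the first candidate for acceleration, for example by shipping it to a big core.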
+ | | ||
+ | | ||
+ | |