  * MOESI
    * Improvement over MESI protocol

===== Lecture 29 (4/18 Wed.) =====

  * Interference
  * Complexity of the memory scheduler
    * Ranking/prioritization has a cost
    * A complex scheduler has higher latency
  * Performance metrics for multicore/multithreaded applications (see the C sketch after this list)
    * Speedup
    * Slowdown
    * Harmonic vs. weighted speedup
  * Fairness metric
    * Maximum slowdown
      * Why does it make sense?
      * Any scenario where it does not make sense?
  * Predictable performance
    * Why is it important?
      * In a server environment, different jobs run on the same server
      * In a mobile environment, there are multiple sources that can slow down others
    * How to relate slowdown to request service rate
    * MISE: soft slowdown guarantee
  * BDI
    * Memory wall
      * What is the concern regarding the memory wall?
    * Size of the cache on the die (CPU die)
    * One possible solution: cache compression
      * What are the problems with existing cache compression mechanisms?
        * Some are too complex
        * Decompression is on the critical path
          * Need to decompress when reading the data -> decompression should not be on the critical path
          * Important factor for performance
    * Software compression is not good enough to compress everything
    * Zero value compression
      * Simple
      * Good compression ratio
      * What if the data does not have many zeroes?
    * Frequent value compression
      * Some values appear frequently
      * Simple and good compression ratio
      * Requires profiling
      * Decompression is complex
    * Frequent pattern compression
      * Still too complex in terms of decompression
    * Base-delta compression (see the second sketch after this list)
      * Easy to decompress but retains the benefit of compression
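
A minimal C sketch of the multiprogram performance and fairness metrics above (weighted speedup, harmonic speedup, maximum slowdown). The per-application IPC values, measured alone vs. shared, are made up for illustration.

<code c>
#include <stdio.h>

#define N 4
/* Hypothetical IPC of each application when it runs alone on the machine
 * vs. when all N applications run together and interfere. */
static const double ipc_alone[N]  = {2.0, 1.5, 1.0, 0.8};
static const double ipc_shared[N] = {1.2, 0.9, 0.7, 0.3};

int main(void) {
    double weighted_speedup = 0.0;  /* sum of per-app speedups            */
    double harmonic_denom   = 0.0;  /* sum of per-app slowdowns           */
    double max_slowdown     = 0.0;  /* unfairness: worst per-app slowdown */

    for (int i = 0; i < N; i++) {
        double speedup  = ipc_shared[i] / ipc_alone[i];
        double slowdown = ipc_alone[i] / ipc_shared[i];
        weighted_speedup += speedup;
        harmonic_denom   += slowdown;
        if (slowdown > max_slowdown)
            max_slowdown = slowdown;
    }

    printf("Weighted speedup:  %.3f\n", weighted_speedup);
    printf("Harmonic speedup:  %.3f\n", (double)N / harmonic_denom);
    printf("Maximum slowdown:  %.3f\n", max_slowdown);
    return 0;
}
</code>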
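
A sketch of the base-delta idea: try to store a 64-byte line as one 8-byte base plus 1-byte deltas. This is only one of the (base size, delta size) combinations a BDI-style cache would attempt, and the example values are made up; the point is that decompression is just an addition per word.

<code c>
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define WORDS 8   /* view the 64-byte line as 8 x 8-byte values */

/* Compress the line as base (8 bytes) + signed 1-byte deltas if possible. */
static bool compress_b8_d1(const uint64_t line[WORDS],
                           uint64_t *base, int8_t deltas[WORDS]) {
    *base = line[0];                       /* first word serves as the base */
    for (int i = 0; i < WORDS; i++) {
        int64_t d = (int64_t)(line[i] - *base);
        if (d < INT8_MIN || d > INT8_MAX)
            return false;                  /* a value is too far from the base */
        deltas[i] = (int8_t)d;
    }
    return true;                           /* 64 bytes stored in 8 + 8 = 16 */
}

/* Decompression is one add per word, so it is cheap and fast. */
static void decompress_b8_d1(uint64_t base, const int8_t deltas[WORDS],
                             uint64_t line[WORDS]) {
    for (int i = 0; i < WORDS; i++)
        line[i] = base + (int64_t)deltas[i];
}

int main(void) {
    uint64_t line[WORDS] = {0x1000, 0x1008, 0x1010, 0x1018,
                            0x1020, 0x1028, 0x1030, 0x1038};
    uint64_t base, out[WORDS];
    int8_t deltas[WORDS];

    if (compress_b8_d1(line, &base, deltas)) {
        decompress_b8_d1(base, deltas, out);
        printf("compressed 64 -> 16 bytes, out[7] = 0x%llx\n",
               (unsigned long long)out[7]);
    }
    return 0;
}
</code>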
     ​     ​

===== Lecture 31 (4/28 Mon.) =====

  * Directory-based cache coherence
    * Each directory has to handle validation/invalidation
    * Extra cost of synchronization
    * Need to ensure race conditions are resolved
  * Interconnection
    * Topology
      * Bus
      * Mesh
        * Torus
      * Tree
      * Butterfly
      * Ring
        * Bi-directional ring
          * More scalable
        * Hierarchical ring
          * Even more scalable
          * More complex
      * Crossbar
      * etc.
    * Circuit switching
    * Multistage network
      * Butterfly
      * Delta network
    * Handling contention
      * Buffering vs. dropping/deflection (no buffering)
    * Routing algorithm
      * Handling deadlock
      * X-Y routing (see the sketch after this list)
        * Turn model (to avoid deadlocks)
      * Add more buffering for an escape path
      * Oblivious routing
        * Can take different paths
          * DOR between each intermediate location
        * Balances network load
      * Adaptive routing
        * Uses the state of the network to determine the route
          * Aware of local and/or global congestion
        * Non-minimal adaptive routing can have livelocks
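
A small sketch of X-Y (dimension-order) routing on a 2D mesh: a packet is routed fully in the X dimension first, then in Y, which restricts the allowed turns and avoids routing deadlock. The router coordinates and port names are made up for illustration.

<code c>
#include <stdio.h>

typedef enum { EAST, WEST, NORTH, SOUTH, LOCAL } Port;

/* One routing decision: finish the X dimension before touching Y. */
static Port xy_route(int cur_x, int cur_y, int dst_x, int dst_y) {
    if (cur_x < dst_x) return EAST;
    if (cur_x > dst_x) return WEST;
    if (cur_y < dst_y) return NORTH;
    if (cur_y > dst_y) return SOUTH;
    return LOCAL;                     /* arrived: eject to the local node */
}

int main(void) {
    static const char *name[] = {"EAST", "WEST", "NORTH", "SOUTH", "LOCAL"};
    int x = 1, y = 3;                 /* packet currently at router (1,3)  */
    Port p;
    /* Walk the packet hop by hop toward destination (4,1). */
    while ((p = xy_route(x, y, 4, 1)) != LOCAL) {
        printf("at (%d,%d) -> go %s\n", x, y, name[p]);
        if      (p == EAST)  x++;
        else if (p == WEST)  x--;
        else if (p == NORTH) y++;
        else                 y--;
    }
    printf("arrived at (%d,%d)\n", x, y);
    return 0;
}
</code>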

===== Lecture 32 (4/30 Wed.) =====

  * Serialized code sections
    * Degrade performance
    * Waste energy
  * Heterogeneous cores
    * Can execute the serialized portion on a powerful large core (see the speedup sketch after this list)
  * Tradeoff between multiple small cores, multiple large cores, or heterogeneous cores
  * Critical sections
    * Bottleneck in several multithreaded workloads
    * Asymmetry can help
    * Accelerated critical sections (ACS)
      * Use a large core to run the serialized portion of the code
      * How to correctly support ACS
      * False serialization
      * Handling private/shared data
    * BIS
      * Identify the bottleneck
        * Serial bottleneck
        * Barrier
        * Critical section
        * Pipeline stages
      * An application might wait on different types of bottlenecks
      * Allow BottleneckCall and BottleneckReturn instructions
      * Acceleration can be done in multiple ways
        * Ship to a big core
        * Increase the frequency
        * Prioritize the thread in shared resources (memory scheduler always schedules requests from that thread first, etc.)
      * A bottleneck table keeps track of different threads' bottlenecks and determines their criticality (see the second sketch after this list)
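
A toy Amdahl's-law-style comparison of a symmetric chip against an asymmetric one (one big core plus many small cores), showing why running the serialized portion on a large core helps. The area budget, the parallel fraction, and the assumption that a big core built from r small-core areas gives sqrt(r) times the serial performance are all modeling assumptions, not numbers from the lecture.

<code c>
#include <stdio.h>
#include <math.h>

#define AREA 16.0        /* chip area budget, in small-core units (assumed) */

/* Assumed model: a big core using r small-core areas runs serial code
 * sqrt(r) times faster than a small core. */
static double perf(double r) { return sqrt(r); }

/* Speedup of AREA small cores over one small core. */
static double symmetric(double par) {
    return 1.0 / ((1.0 - par) + par / AREA);
}

/* Speedup of 1 big core (area r) + (AREA - r) small cores: the serial part
 * runs on the big core, the parallel part runs on all cores together. */
static double asymmetric(double par, double r) {
    return 1.0 / ((1.0 - par) / perf(r) + par / (perf(r) + (AREA - r)));
}

int main(void) {
    double par = 0.90;   /* parallel fraction of the program (assumed) */
    printf("16 small cores:             %.2fx\n", symmetric(par));
    printf("1 big (area 4) + 12 small:  %.2fx\n", asymmetric(par, 4.0));
    return 0;
}
</code>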
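
A tiny sketch of the bottleneck-table idea mentioned above: threads report how long they wait on each bottleneck, and the bottleneck that has accumulated the most waiter cycles is the one worth accelerating (e.g. by shipping it to the big core). The table size, ids, and cycle counts below are made up for illustration.

<code c>
#include <stdint.h>
#include <stdio.h>

#define MAX_BOTTLENECKS 16

struct bn_entry {
    uint32_t id;             /* bottleneck id (e.g. a lock address)   */
    uint64_t waiter_cycles;  /* total cycles threads spent waiting    */
};

static struct bn_entry table[MAX_BOTTLENECKS];
static int entries;

/* Record that some thread waited 'cycles' cycles on bottleneck 'id'. */
static void bottleneck_wait(uint32_t id, uint64_t cycles) {
    for (int i = 0; i < entries; i++)
        if (table[i].id == id) { table[i].waiter_cycles += cycles; return; }
    if (entries < MAX_BOTTLENECKS)
        table[entries++] = (struct bn_entry){ id, cycles };
}

/* The most critical bottleneck is the one with the most waiter cycles. */
static uint32_t most_critical(void) {
    int best = 0;
    for (int i = 1; i < entries; i++)
        if (table[i].waiter_cycles > table[best].waiter_cycles)
            best = i;
    return table[best].id;
}

int main(void) {
    bottleneck_wait(0x40, 1200);   /* made-up ids and wait times */
    bottleneck_wait(0x80,  300);
    bottleneck_wait(0x40,  900);
    printf("accelerate bottleneck 0x%x\n", most_critical());
    return 0;
}
</code>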

===== Lecture 33 (5/2 Fri.) =====

  * DRAM scaling problem
  * Possible solutions to the scaling problem
    * Lower-leakage DRAM
    * Heterogeneous DRAM (TL-DRAM, etc.)
    * Add more functionality to DRAM
    * Denser designs (3D stacking)
    * Different technology
      * NVM
  * Non-volatile memory
    * Resistive memory
      * PCM
        * Inject current to change the phase
        * Scales better than DRAM
          * Multiple bits per cell (see the sketch after this list)
            * Wider resistance range
        * No refresh is needed
        * Downsides: latency and write endurance
      * STT-MRAM
        * Inject current to change the polarity
      * Memristor
        * Inject current to change the structure
    * Persistency - data stays there even without power
      * Unified memory and storage management (persistent data structures) - single-level store
        * Improves energy efficiency and performance
        * Simplifies the programming model
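
A toy illustration of how a wider resistance range gives multiple bits per cell: the range is split into more bands, and the band a cell's resistance falls into is the stored value. The thresholds below are invented; real multi-level PCM uses carefully placed resistance bands and encodings.

<code c>
#include <stdio.h>

/* Map a cell's resistance (ohms) to a 2-bit stored value by splitting the
 * resistance range into four bands. Threshold values are made up. */
static unsigned read_2bit_cell(double r_ohms) {
    if (r_ohms < 1e4) return 0;   /* lowest-resistance band  */
    if (r_ohms < 1e5) return 1;
    if (r_ohms < 1e6) return 2;
    return 3;                     /* highest-resistance band */
}

int main(void) {
    double samples[] = {5e3, 5e4, 5e5, 5e6};   /* one sample per band */
    for (int i = 0; i < 4; i++)
        printf("R = %.0e ohm -> stored value %u\n",
               samples[i], read_2bit_cell(samples[i]));
    return 0;
}
</code>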