      * How many of the demand misses were caused by the prefetcher?
        * Hard to quantify

===== Lecture 26 (4/3 Fri.) =====

  * Feedback directed prefetcher
    * Use the prefetcher's own results as feedback to tune it (e.g., throttle its aggressiveness)
      * Feedback metrics: accuracy, timeliness, cache pollution
  * Markov prefetcher
    * Prefetch based on previous miss history
    * Use a Markov model to predict the next address (see the sketch at the end of this list)
    * Pros: Can cover arbitrary patterns (easy for linked-list or tree traversals)
    * Downside: High cost; cannot help with compulsory misses (no history)
  * Content directed prefetching
    * Identify values in fetched memory that look like pointers (and use them as addresses to prefetch)
    * Not very efficient (hard to figure out which values in a block are pointers)
      * Software can give hints
  * Correlation table
    * Address correlation
  * Execution based prefetcher
    * Helper thread/speculative thread
      * Use another thread to pre-execute a program (see the sketch at the end of this list)
    * Can be software based or hardware based
    * Discovers misses before the main program does (to prefetch data in a timely manner)
    * How do you construct the helper thread?
    * Pre-execute instruction (one example of how to initialize a speculative thread), slide 9
    * Thread-based pre-execution
  * Error tolerance
  * Solutions to errors
    * Tolerate errors
      * New interfaces, new designs
    * Eliminate or minimize errors
      * New technology, system-wide rethinking
    * Embrace errors
      * Map data that can tolerate errors to error-prone areas
  * Hybrid memory systems
    * Combining multiple memory technologies together
  * What can emerging technologies help with?
    * Scalability
    * Lower cost
    * Energy efficiency
  * Possible solutions to the scaling problem
    * Lower-leakage DRAM
    * Heterogeneous DRAM (TL-DRAM, etc.)
    * Add more functionality to DRAM
    * Denser designs (3D stacking)
    * Different technology
      * NVM
  * Charge vs. resistive memory
    * How is data written?
    * How is data read?
  * Non-volatile memory
    * Resistive memory
      * PCM
        * Inject current to change the phase
        * Scales better than DRAM
          * Multiple bits per cell
            * Wider resistance range
        * No refresh is needed
        * Downsides: Latency and write endurance
      * STT-MRAM
        * Inject current to change the polarity
      * Memristor
        * Inject current to change the structure
    * Pros and cons between the different technologies
    * Persistence - data stays there even without power
      * Unified memory and storage management (persistent data structures) - single-level store
        * Improves energy and performance
        * Simplifies the programming model
  * Different design options for DRAM + NVM (a placement sketch follows this list)
    * DRAM as a cache
    * Place some data in DRAM and other data in PCM
      * Based on access characteristics
        * Frequently accessed data that needs lower write latency goes in DRAM
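
A minimal sketch of such a DRAM/PCM placement policy, assuming hypothetical per-page access counters and made-up thresholds (an illustration of the idea, not a design from the lecture):

<code c>
/* Hypothetical page-placement policy for a hybrid DRAM + PCM system.
 * Write-intensive pages go to DRAM (PCM writes are slow and wear out
 * the cells); cold, read-mostly pages go to the denser, cheaper PCM.
 * All thresholds are made up for illustration. */
#include <stdint.h>

enum placement { IN_DRAM, IN_PCM };

struct page_stats {
    uint32_t reads;   /* demand reads in the current epoch  */
    uint32_t writes;  /* demand writes in the current epoch */
};

enum placement place_page(const struct page_stats *p)
{
    if (p->writes > 64)              /* needs DRAM's lower write latency */
        return IN_DRAM;
    if (p->reads + p->writes > 512)  /* very hot data also favors DRAM   */
        return IN_DRAM;
    return IN_PCM;                   /* cold data: PCM's density wins    */
}
</code>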
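
A minimal sketch of the Markov prefetcher mentioned above, assuming a small direct-mapped correlation table that remembers one successor per miss address; the table size and organization are invented for illustration:

<code c>
/* On a miss to address A, prefetch the address that most recently
 * followed A, and record A as the successor of the previous miss.
 * History-based, so it can follow arbitrary patterns (linked lists,
 * trees) but cannot help compulsory misses (no history yet). */
#include <stdint.h>

#define TABLE_ENTRIES 1024

struct entry {
    uint64_t miss_addr;  /* tag: the miss address                */
    uint64_t next_addr;  /* address that last followed this miss */
};

static struct entry table[TABLE_ENTRIES];
static uint64_t last_miss;

static void issue_prefetch(uint64_t addr)
{
    (void)addr;  /* stand-in for handing the address to the prefetch queue */
}

void on_cache_miss(uint64_t addr)
{
    /* Learn: the previous miss is now known to be followed by addr. */
    struct entry *prev = &table[last_miss % TABLE_ENTRIES];
    prev->miss_addr = last_miss;
    prev->next_addr = addr;

    /* Predict: prefetch the recorded successor of this miss, if any. */
    struct entry *e = &table[addr % TABLE_ENTRIES];
    if (e->miss_addr == addr && e->next_addr != 0)
        issue_prefetch(e->next_addr);

    last_miss = addr;
}
</code>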
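
And a minimal sketch of a software helper thread that pre-executes a linked-list traversal; the list layout and pthread-based setup are assumptions for illustration. The helper runs the same pointer chase but only touches each node, so it runs ahead of the main thread and discovers the misses early:

<code c>
#include <pthread.h>
#include <stddef.h>

struct node { struct node *next; int payload[15]; };

/* Stripped-down "address slice": just chase pointers and touch data. */
static void *helper(void *arg)
{
    volatile int sink = 0;
    for (struct node *n = arg; n != NULL; n = n->next)
        sink += n->payload[0];   /* load only: warms up the cache */
    return NULL;
}

int traverse_sum(struct node *head)
{
    pthread_t t;
    int sum = 0;
    pthread_create(&t, NULL, helper, head);  /* launch the pre-execution */
    for (struct node *n = head; n != NULL; n = n->next)
        for (int i = 0; i < 15; i++)
            sum += n->payload[i];            /* full per-node computation */
    pthread_join(&t, NULL);
    return sum;
}
</code>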

===== Lecture 27 (4/6 Mon.) =====
  * Flynn's taxonomy
  * Parallelism
    * Reduces power consumption (dynamic power P ~ C V^2 F; see the worked example at the end of this list)
    * Better cost efficiency and easier to scale
    * Improves dependability (in case one core is faulty)
  * Different types of parallelism
    * Instruction level parallelism
    * Data level parallelism
    * Task level parallelism
  * Task level parallelism
    * Partition a single, potentially big, task into multiple parallel sub-tasks
      * Can be done explicitly (parallel programming by the programmer)
      * Or implicitly (hardware partitions a single thread speculatively)
    * Or, run multiple independent tasks (still improves throughput, but the speedup of any single task is no better; also simpler to implement)
  * Loosely coupled multiprocessor
    * No shared global address space
      * Message passing to communicate between processors
    * Simple to manage memory
  * Tightly coupled multiprocessor
    * Shared global address space
    * Need to ensure consistency of data
    * Programming issues
  * Hardware-based multithreading
    * Coarse grained
    * Fine grained
    * Simultaneous: Dispatch instructions from multiple threads at the same time
  * Parallel speedup
    * Superlinear speedup
  * Utilization, Redundancy, Efficiency
  * Amdahl's law (see the worked example at the end of this list)
    * Maximum speedup
    * Parallel portion is not perfect
      * Serial bottleneck
      * Synchronization cost
      * Load imbalance
        * Some threads have more work and take longer to reach the synchronization point
  * Critical sections (see the sketch at the end of this list)
    * Enforce mutually exclusive access to shared data
  * Issues in parallel programming
    * Correctness
    * Synchronization
    * Consistency
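
A worked example for the P ~ C V^2 F bullet above, assuming the idealized case where supply voltage can scale down linearly with frequency; the numbers are normalized, not measurements:

<code c>
/* Two cores at half frequency (and thus roughly half voltage) finish
 * the same work as one full-speed core, but dynamic power drops to
 * 2 * C * (V/2)^2 * (F/2) = CV^2F / 4, i.e. about a quarter. */
#include <stdio.h>

int main(void)
{
    double C = 1.0, V = 1.0, F = 1.0;               /* normalized baseline   */
    double one_fast = C * V * V * F;                /* one core, full speed  */
    double two_slow = 2 * C * (V/2)*(V/2) * (F/2);  /* two cores, half speed */
    printf("one fast core: %.2f\n", one_fast);      /* 1.00 */
    printf("two slow cores: %.2f\n", two_slow);     /* 0.25 */
    return 0;
}
</code>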
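
A worked example of Amdahl's law with assumed numbers (95% parallel fraction):

<code c>
/* Speedup = 1 / ((1 - p) + p / n) for parallel fraction p on n
 * processors. The serial fraction (1 - p) caps the speedup at
 * 1 / (1 - p) no matter how many processors are added. */
#include <stdio.h>

static double amdahl(double p, double n)
{
    return 1.0 / ((1.0 - p) + p / n);
}

int main(void)
{
    printf("n=16:   %.2fx\n", amdahl(0.95, 16.0));   /* ~9.1x  */
    printf("n=1024: %.2fx\n", amdahl(0.95, 1024.0)); /* ~19.6x */
    printf("bound:  %.2fx\n", 1.0 / (1.0 - 0.95));   /* 20x    */
    return 0;
}
</code>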
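
And a minimal pthread sketch of a critical section; the shared counter and thread function are illustrative:

<code c>
/* The mutex enforces mutually exclusive access to shared_count, so
 * concurrent increments from multiple threads are never lost. */
#include <pthread.h>

static long shared_count = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);    /* enter critical section    */
        shared_count++;               /* one thread at a time here */
        pthread_mutex_unlock(&lock);  /* exit critical section     */
    }
    return NULL;
}
</code>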

===== Lecture 28 (4/8 Wed.) =====