buzzword [2015/02/25 21:05] kevincha
  * Tradeoffs between trace cache/Hyperblock/Superblock/BS-ISA
===== Lecture 17 (2/25 Wed.) =====
  * IA-64
    * EPIC
    * IA-64 instruction bundle
      * Multiple instructions in the bundle along with the template bits
      * Template bits
      * Stop bits
    * Non-faulting loads and exception propagation
    * Aggressive ST-LD reordering
  * Physical memory system
  * Ideal pipelines
  * Ideal cache
    * More capacity
    * Fast
    * Cheap
    * High bandwidth
  * DRAM cell
    * Cheap
    * Senses the perturbation through a sense amplifier
    * Slow and leaky
  * SRAM cell (cross-coupled inverters)
    * Expensive
    * Fast (easier to sense the value in the cell)
  * Memory bank
    * Read access sequence
      * DRAM: Activate -> Read -> Precharge (if needed)
    * What dominates the access latency for DRAM and SRAM
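The DRAM read sequence above (Activate -> Read -> Precharge) can be sketched as a toy timing model. The timing parameters (tRP, tRCD, CL) and their default values are hypothetical illustrations, not numbers from the lecture:

```python
def dram_access_cycles(row_buffer_hit: bool, row_closed: bool,
                       tRP: int = 11, tRCD: int = 11, CL: int = 11) -> int:
    """Cycles for one DRAM read under a simplified timing model.

    tRP (precharge), tRCD (activate), and CL (column read) use
    made-up illustrative defaults.
    """
    if row_buffer_hit:
        # Target row already open in the row buffer: column read only.
        return CL
    cycles = 0
    if not row_closed:
        # A different row is open: precharge (close) it first.
        cycles += tRP
    # Activate the target row, then read the column.
    cycles += tRCD + CL
    return cycles
```

The three cases make the cost asymmetry visible: a row-buffer hit pays only CL, a closed bank pays tRCD + CL, and a conflicting open row pays the full tRP + tRCD + CL.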
  * Scaling issues
    * Hard to scale the cell down to smaller sizes
  * Memory hierarchy
    * Prefetching
    * Caching
  * Spatial and temporal locality
    * Caches can exploit both
      * Recently used data is likely to be accessed again (temporal)
      * Nearby data is likely to be accessed (spatial)
  * Caching in a pipelined design
  * Cache management
    * Manual
      * Data movement is managed manually
      * Embedded processors
      * GPU scratchpad
    * Automatic
      * HW manages data movement
  * Latency analysis
    * Based on the hit/miss status, the current level's access time, and (on a miss) the next level's access time
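The latency analysis above is the usual average memory access time (AMAT) calculation: the current level's access time plus the next level's access time weighted by the miss rate. A minimal sketch with hypothetical hit times and miss rates:

```python
def amat(hit_time: float, miss_rate: float, miss_penalty: float) -> float:
    # Average Memory Access Time: always pay the current level's access
    # time; on a miss (probability miss_rate), also pay the next level.
    return hit_time + miss_rate * miss_penalty

# Levels compose: the L1 miss penalty is itself the AMAT of L2.
# All numbers below are hypothetical.
l2_amat = amat(10, 0.2, 100)      # 10 + 0.2 * 100 = 30.0 cycles
l1_amat = amat(1, 0.1, l2_amat)   # 1 + 0.1 * 30   = 4.0 cycles
```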
  * Cache basics
    * Sets, blocks (lines), placement, replacement, direct-mapped vs. associative caches, etc.
  * Cache access
    * How to access tag and data (in parallel vs. serially)
    * How do the tag and index get used?
    * Modern processors access higher-level caches (L3, for example) serially to save power
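How the tag and index get used can be shown concretely: the low address bits select the byte within a block, the next bits index a set, and the rest form the tag. A sketch assuming power-of-two block sizes and set counts (the 64 B / 64-set geometry in the test is a hypothetical example):

```python
def split_address(addr: int, block_bytes: int, num_sets: int):
    """Split an address into (tag, index, offset) cache fields.

    Assumes block_bytes and num_sets are powers of two.
    """
    offset_bits = block_bytes.bit_length() - 1
    index_bits = num_sets.bit_length() - 1
    offset = addr & (block_bytes - 1)               # byte within the block
    index = (addr >> offset_bits) & (num_sets - 1)  # which set to look in
    tag = addr >> (offset_bits + index_bits)        # compared against stored tags
    return tag, index, offset
```

The index selects one set; the tag is then compared against every way in that set. With serial access, the data array is read only after a tag match, which is the power saving mentioned above.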
  * Costs and benefits of higher associativity
    * Given the associativity, which block should be replaced when the set is full?
  * Replacement policies
    * Random
    * Least recently used (LRU)
    * Least frequently used
    * Least costly to refetch
    * etc.
  * How to implement LRU
    * How to keep track of access ordering
      * Complexity increases rapidly with associativity
    * Approximate LRU
      * Victim and next-victim policy
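True LRU as described above means tracking the full access order of each set, which is what makes it costly in hardware at high associativity. A software sketch of one set (tags only, no data; the tag values in the usage below are arbitrary placeholders):

```python
from collections import OrderedDict

class LRUSet:
    """One cache set with true LRU replacement.

    The OrderedDict keeps tags from least to most recently used;
    hardware approximates this because maintaining the full ordering
    grows expensive with the number of ways.
    """

    def __init__(self, ways: int):
        self.ways = ways
        self.order = OrderedDict()  # tag -> None, oldest first

    def access(self, tag) -> bool:
        """Return True on a hit; on a miss, evict the LRU tag if full."""
        if tag in self.order:
            self.order.move_to_end(tag)      # now most recently used
            return True
        if len(self.order) == self.ways:
            self.order.popitem(last=False)   # evict least recently used
        self.order[tag] = None
        return False
```

For example, in a 2-way set, accessing A, B, A, C evicts B (A was touched more recently), so a later access to B misses again.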