  * Tradeoffs between trace cache/Hyperblock/Superblock/BS-ISA

===== Lecture 17 (2/25 Wed.) =====
  * IA-64
    * EPIC
  * IA-64 instruction bundle
    * Multiple instructions in the bundle along with the template bits
    * Template bits
    * Stop bits
    * Non-faulting loads and exception propagation
  * Aggressive ST-LD reordering
  * Physical memory system
  * Ideal pipelines
  * Ideal cache
    * More capacity
    * Fast
    * Cheap
    * High bandwidth
  * DRAM cell
    * Cheap
    * Sense the perturbation through a sense amplifier
    * Slow and leaky
  * SRAM cell (cross-coupled inverters)
    * Expensive
    * Fast (easier to sense the value in the cell)
  * Memory bank
    * Read access sequence
      * DRAM: Activate -> Read -> Precharge (if needed)
    * What dominates the access latency for DRAM and SRAM
  * Scaling issues
    * Hard to scale the cell to smaller sizes
  * Memory hierarchy
    * Prefetching
    * Caching
  * Spatial and temporal locality
    * Caches can exploit these
    * Recently used data is likely to be accessed again
    * Nearby data is likely to be accessed
  * Caching in a pipelined design
  * Cache management
    * Manual
      * Data movement is managed manually
        * Embedded processors
        * GPU scratchpad
    * Automatic
      * HW manages data movement
  * Latency analysis
    * Based on the hit/miss status, the current level's access time, and the next level's access time (on a miss)
  * Cache basics
    * Set/block (line)/placement/replacement/direct-mapped vs. associative cache/etc.
  * Cache access
    * How to access tag and data (in parallel vs. serially)
    * How do tag and index get used?
    * Modern processors perform serial access for higher-level caches (L3, for example) to save power
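How the tag and index get used can be made concrete with a small sketch. This is not any specific processor's layout; the block size, set count, and address width are assumed for illustration:

```python
# Assumed (illustrative) geometry: 64-byte blocks, 64 sets, direct-mapped.
BLOCK_BITS = 6   # log2(64)  -> byte-offset bits within a block
INDEX_BITS = 6   # log2(64)  -> bits selecting the set

def split_address(addr: int):
    """Split an address into (tag, index, offset).

    offset picks the byte within the block, index picks the set,
    and the remaining high bits form the tag compared on each access.
    """
    offset = addr & ((1 << BLOCK_BITS) - 1)
    index = (addr >> BLOCK_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (BLOCK_BITS + INDEX_BITS)
    return tag, index, offset
```

Two addresses with the same index but different tags map to the same set and can conflict, which is what added associativity mitigates.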
  * Cost and benefit of having more associativity
    * Given the associativity, which block should be replaced if the set is full
    * Replacement policy
      * Random
      * Least recently used (LRU)
      * Least frequently used
      * Least costly to refetch
      * etc.
  * How to implement LRU
    * How to keep track of access ordering
      * Complexity increases rapidly with associativity
    * Approximate LRU
      * Victim and next-victim policy
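A minimal software sketch of exact LRU tracking for one set, assuming a 4-way set by default. Hardware keeps this ordering with counters or a matrix instead of a list, which is why the cost grows quickly with associativity:

```python
from collections import OrderedDict

class LRUSet:
    """One cache set with exact LRU replacement (illustrative sketch)."""

    def __init__(self, ways=4):
        self.ways = ways
        self.blocks = OrderedDict()  # tag -> None, ordered LRU-first

    def access(self, tag):
        """Touch a tag; return the evicted tag on a miss in a full set, else None."""
        if tag in self.blocks:
            self.blocks.move_to_end(tag)  # hit: promote to MRU position
            return None
        victim = None
        if len(self.blocks) == self.ways:
            victim, _ = self.blocks.popitem(last=False)  # evict the LRU block
        self.blocks[tag] = None  # insert the new block as MRU
        return victim
```

The "victim and next-victim" approximation keeps only the two likely eviction candidates instead of a full ordering, trading accuracy for much cheaper bookkeeping.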

===== Lecture 18 (2/27 Fri.) =====
  * Tag store and data store
  * Cache hit rate
  * Average memory access time (AMAT)
  * AMAT vs. stall time
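The standard AMAT formula, with made-up example numbers for illustration:

```python
def amat(hit_time, miss_rate, miss_penalty):
    """AMAT = hit time + miss rate * miss penalty (all times in cycles)."""
    return hit_time + miss_rate * miss_penalty

# Assumed example: 1-cycle hit, 10% miss rate, 100-cycle miss penalty.
# amat(1, 0.10, 100) -> 11.0 cycles
```

Note AMAT and stall time are not the same thing: an out-of-order machine can overlap part of the miss latency with useful work, so reducing AMAT does not reduce stalls one-for-one.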
  * Cache basics
    * Direct-mapped vs. associative cache
    * Set/block (line)/placement/replacement
    * How do tag and index get used?
  * Full associativity
  * Set-associative cache
    * Insertion, promotion, eviction (replacement)
  * Various replacement policies
  * How to implement LRU
    * How to keep track of access ordering
      * Complexity increases rapidly with associativity
    * Approximate LRU
      * Victim and next-victim policy
  * Set thrashing
    * Working set is bigger than the associativity
  * Belady's OPT
    * Is this optimal?
    * Complexity?
  * DRAM as a cache for disk
  * Handling writes
    * Write-through
      * Simpler; all levels have the updated data (no consistency issues)
    * Write-back
      * Needs a modified (dirty) bit to make sure accesses get the updated data
  * Sectored cache
    * Uses subblocks
      * Lower bandwidth
      * More complex
  * Instruction vs. data cache
    * Where to place instructions
      * Unified vs. separated
    * In the first-level cache
  * Cache access
    * First-level access
    * Second-level access
      * When to start the second-level access
  * Cache performance
    * Capacity
    * Block size
    * Associativity
  * Classification of cache misses
===== Lecture 19 (03/02 Mon.) =====
  * Subblocks
  * Victim cache
    * Small, but fully associative cache behind the actual cache
    * Caches recently evicted (victim) blocks
    * Prevents ping-ponging
  * Pseudo-associativity
    * Simpler way to implement an associative cache
  * Skewed associative cache
    * Different hashing function for each way
  * Restructure data access patterns
    * Order of loop traversal
    * Blocking
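Blocking (tiling) restructures a traversal so each tile of the data stays cache-resident before moving on. A minimal sketch, with an assumed tile size of 64:

```python
def sum_blocked(matrix, n, tile=64):
    """Sum an n x n matrix by traversing it in tile x tile blocks.

    Each block is finished before the next begins, so its elements are
    reused while still cached instead of being evicted between passes.
    """
    total = 0
    for ii in range(0, n, tile):
        for jj in range(0, n, tile):
            for i in range(ii, min(ii + tile, n)):
                for j in range(jj, min(jj + tile, n)):
                    total += matrix[i][j]
    return total
```

For a simple sum the result is identical to a row-major loop; blocking pays off in kernels like matrix multiply where the same tile is touched many times.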
  * Memory-level parallelism
    * Cost per miss of parallel cache misses is lower than that of serial misses
  * MSHR
    * Keeps track of pending cache misses
      * Think of this as the load/store buffer-ish for the cache
    * What information goes into the MSHR?
    * When do you access the MSHR?
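One way to picture what goes into an MSHR is a table keyed by the missing block address. The field names and sizes here are purely illustrative, not any specific design:

```python
from dataclasses import dataclass, field

@dataclass
class MSHREntry:
    """Illustrative bookkeeping for one outstanding miss."""
    block_address: int                                     # which block is in flight
    destination_regs: list = field(default_factory=list)   # consumers waiting for the data

class MSHRFile:
    def __init__(self, entries=16):                        # assumed capacity
        self.capacity = entries
        self.pending = {}  # block_address -> MSHREntry

    def on_miss(self, block_address, dest_reg):
        """Accessed on a cache miss: allocate, merge, or stall."""
        entry = self.pending.get(block_address)
        if entry is not None:
            entry.destination_regs.append(dest_reg)        # secondary miss to the same
            return "merged"                                # block: no new memory request
        if len(self.pending) == self.capacity:
            return "stall"                                 # MSHRs full: structural stall
        self.pending[block_address] = MSHREntry(block_address, [dest_reg])
        return "send request"                              # primary miss goes to memory
```

Merging secondary misses is what lets misses overlap (memory-level parallelism) instead of issuing duplicate requests for the same block.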
  * Memory banks
  * Shared caches in multi-core processors
buzzword.txt · Last modified: 2015/04/27 18:20 by rachata