User Tools

Site Tools


buzzword

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
Next revision Both sides next revision
buzzword [2014/03/21 18:16]
rachata
buzzword [2014/03/28 17:39]
rachata
Line 777: Line 777:
       * Reduce interference       * Reduce interference
  
 +===== Lecture 21 (3/24 Mon.) =====
 +
 +
 +
 +  * Different parameters that affect cache miss
 +  * Thrashing
 +  * Different types of cache misses
 +    * Compulsory misses
 +      * Can mitigate with prefetches
 +    * Capacity misses
 +      * More assoc
 +      * Victim cache
 +    * Conflict misses
 +      * Hashing
 +  * Large block vs. small block
 +  * Subblocks
 +  * Victim cache
 +    * Small, but fully assoc. cache behind the actual cache
 +    * Cached misses cache block
 +    * Prevent ping-ponging
 +  * Pseudo associativity
 +    * Simpler way to implement associative cache
 +  * Skewed assoc. cache
 +    * Different hashing functions for each way
 +  * Restructure data access pattern
 +    * Order of loop traversal
 +    * Blocking
 +  * Memory level parallelism
 +    * Cost per miss of a parallel cache miss is less costly compared to serial misses
 +  * MSHR
 +    * Keep track of pending cache
 +      * Think of this as the load/store buffer-ish for cache
 +    * What information goes into the MSHR?
 +    * When do you access the MSHR?
 +
 +
 +===== Lecture 22 (3/26 Wed.) =====
 +
 +
 +
 +  * Multi-porting
 +    * Virtual multi-porting
 +      * Time-share the port, not too scalable but cheap
 +    * True multiporting
 +  * Multiple cache copies
 +  * Banking
 +    * Can have bank conflict
 +    * Extra interconnects across banks
 +    * Address mapping can mitigate bank conflict
 +    * Common in main memory (note that regFile in GPU is also banked, but mainly for the pupose of reducing complexity)
 +  * Accessing DRAM
 +    * Row bits
 +    * Column bits
 +    * Addressibility
 +    * DRAM has its own clock
 +  * DRAM (2T) vs. SRAM (6T)
 +    * Cost
 +    * Latency
 +  * Interleaving in DRAM
 +    * Effects from address mapping on memory interleaving
 +    * Effects from memory access patterns from the program on interleaving
 +  * DRAM Bank
 +    * To minimize the cost of interleaving (Shared the data bus and the command bus)
 +  * DRAM Rank
 +    * Minimize the cost of the chip (a bundle of chips operated together)
 +  * DRAM Channel
 +    * An interface to DRAM, each with its own ranks/banks
 +  * DIMM
 +    * More DIMM adds the interconnect complexity
 +  * List of commands to read/write data into DRAM
 +    * Activate -> read/write -> precharge
 +    * Activate moves data into the row buffer
 +    * Precharge prepare the bank for the next access
 +  * Row buffer hit
 +  * Row buffer conflict
 +  * Scheduling memory requests to lower row conflicts
 +  * Burst mode of DRAM
 +    * Prefetch 32-bits from an 8-bit interface if DRAM needs to read 32 bits
 +  * Address mapping
 +    * Row interleaved
 +    * Cache block interleaved
 +  * Memory controller
 +    * Sending DRAM commands
 +    * Periodically send commands to refresh DRAM cells
 +    * Ensure correctness and data integrity
 +    * Where to place the memory controller
 +      * On CPU chip vs. at the main memory
 +        * Higher BW on-chip
 +    * Determine the order of requests that will be serviced in DRAM
 +      * Request queues that hold requests
 +      * Send requests whenever the request can be sent to the bank
 +      * Determine which command (across banks) should be sent to DRAM
 +  * Priority of demand vs. prefetch requests
 +  * Memory scheduling policies
 +    * FCFS
 +    * FR-FCFS
 +      * Capped FR-FCFS: FR-FCFS with a timeout
 +      * Usually this is done in a command level (read/write commands and precharge/​activate commands)
 +    ​
 +  ​
 +===== Lecture 23 (3/28 Fri.) =====
 +  ​
 +  ​
 +  * DRAM design choices
 +    * Cost/​density/​latency/​BW/​Yield
 +  * Sense Amplifier
 +    * How do they work
 +  * Dual data rate
 +  * Subarray
 +  * Rowclone
 +    * Moving bulk of data from one row to others
 +    * Lower latency and BW when performing copies/​zeroes out the data
 +  * TL-DRAM
 +    * Far segment
 +    * Near segment
 +    * What causes the long latency
 +    * Benefit of TL-DRAM
 +      * TL-DRAM vs. DRAM cache (adding a small cache in DRAM)
 +
 +  ​
buzzword.txt ยท Last modified: 2015/04/27 18:20 by rachata