Differences

This shows you the differences between two versions of the page.

--- buzzword [2014/03/31 18:15]
rachata
+++ buzzword [2014/04/07 18:17]
rachata
@@ Line 878: / Line 878: @@
 ===== Lecture 23 (3/28 Fri.) =====
   * DRAM design choices
     * Cost/density/latency/BW/Yield
@@ Line 939: / Line 938: @@
+===== Lecture 25 (4/2 Wed.) =====
+  * Latency sensitivity
+    * Performance drops a lot when the memory request latency is long
+  * TCM
+    * Tradeoff between throughput and fairness
+    * Latency sensitive cluster (non-intensive cluster)
+      * Ranking based on memory intensity
+    * Bandwidth intensive cluster
+      * Round robin within the cluster
+    * Generally, the latency-sensitive cluster has higher priority
+    * Provide robust fairness vs. throughput
+    * Complexity of TCM?
+  * Different ways to control interference in DRAM
+    * Partitioning of resource
+      * Channel partitioning: map applications that interfere with each other to different channels
+        * Keep track of each application's characteristics
+        * Dedicating a channel might waste bandwidth
+        * Need OS support to determine the channel bits
+    * Source throttling
+      * A controller throttles the cores depending on the performance target
+      * Example: Fairness via source throttling
+        * Detect unfairness and throttle the application that is interfering
+        * How do you estimate slowdown?
+        * Threshold based solution: hard to configure
+    * App/thread scheduling
+      * Critical threads usually stall the progress
+    * Designing DRAM controller
+      * Has to handle the normal DRAM operations
+        * Read/write/refresh/all the timing constraints
+      * Keep track of resources
+      * Assign priorities to different requests
+      * Manage requests to banks
+    * Self-optimizing controller
+      * Use machine learning to improve the DRAM controller
+  * DRAM Refresh
+    * Why does DRAM have to refresh every 64 ms?
+    * Banks are unavailable during refresh
+      * LPDDR mitigates this by using per-bank refresh
+    * Refresh takes longer with larger DRAM
+    * Distributed refresh: stagger row refreshes evenly across each 64 ms window
+      * As opposed to burst refresh (long pause time)
+  * RAIDR: Reduce DRAM refresh by profiling and binning
+    * Some rows do not have to be refreshed very frequently
+      * Profile the rows
+        * High temperature changes the retention time: need online profiling
+  * Bloom filter
+    * Represent set membership
+    * Approximate
+    * Can produce false positives (but never false negatives)
+      * Better/more hash functions help reduce this
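The Bloom filter notes above can be illustrated with a minimal sketch. It assumes the filter records row numbers that need more frequent refresh; the class name, the bit-array size, the number of hash functions, and the use of SHA-256 as the hash are all illustrative choices, not details taken from the lecture.

```python
import hashlib


class BloomFilter:
    """Approximate set membership: false positives possible, no false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits      # hypothetical size, chosen for illustration
        self.num_hashes = num_hashes  # hypothetical number of hash functions
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # True means "possibly in the set"; False means "definitely not".
        return all(self.bits[pos] for pos in self._positions(item))


# Hypothetical row numbers that profiling found to need frequent refresh.
weak_rows = BloomFilter()
for row in (17, 4096, 90210):
    weak_rows.add(row)
```

A query such as `weak_rows.might_contain(17)` always returns `True` for an inserted row. A row that was never inserted usually returns `False`, but may occasionally return `True`; in a refresh setting that kind of error only costs an unnecessary refresh.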
+===== Lecture 26 (4/7 Mon.) =====
+  * Tolerating latency can be costly
+    * Instruction window is complex
+      * Benefit also diminishes
+    * Designing the buffers can be complex
+    * A simpler way to tolerate latency in an out-of-order core is desirable
+  * Different sources that cause the core to stall in OoO
+    * Cache miss
+    * Note that a stall happens if the instruction window is full
+  * Scaling instruction window size is hard
+    * It is better (less complex) to make the window more efficient
+  * Runahead execution
+    * Try to obtain MLP (memory-level parallelism) without increasing the instruction window size
+    * Runahead (i.e. execute ahead) when there is a long-latency memory instruction
+      * A long-latency memory instruction stalls the processor for a while anyway, so it's better to make use of that time
+      * Execute future instructions to generate accurate prefetches
+      * Allow future data to be in the cache
+    * How to support runahead execution?
+      * Need a way to checkpoint the state when entering runahead mode
+      * How to make executing in the wrong path useful?
+      * Need a runahead cache to handle loads/stores in runahead mode (since they are speculative)
+    * Cost and benefit of runahead execution (slide number 27)
+    * Runahead can be inefficient
+      * Some runahead periods are useless
+        * Get rid of useless, inefficient periods
+    * What if there is a dependent cache miss?
+      * Cannot be parallelized in vanilla runahead
+      * Can predict the value of the dependent load
+        * How to predict the address of the load
+          * Delta value information
+          * Stride predictor
+          * AVD prediction
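As one illustration of the address-prediction idea listed above, here is a minimal sketch of a stride predictor. It assumes a table indexed by the load's program counter that remembers the last address and the last observed stride; the class and method names are hypothetical, and real designs typically add confidence counters and a bounded table, which are omitted here.

```python
class StridePredictor:
    """Predict a load's next address as last address + last observed stride."""

    def __init__(self):
        # Hypothetical unbounded table: load PC -> (last_addr, stride).
        self.table = {}

    def predict(self, pc):
        entry = self.table.get(pc)
        if entry is None:
            return None  # no history for this load yet
        last_addr, stride = entry
        return last_addr + stride

    def update(self, pc, addr):
        # Record the actual address and the stride from the previous access.
        entry = self.table.get(pc)
        stride = addr - entry[0] if entry is not None else 0
        self.table[pc] = (addr, stride)


# A load at a hypothetical PC 0x400 walking an array in 64-byte steps.
predictor = StridePredictor()
predictor.update(0x400, 1000)
predictor.update(0x400, 1064)
next_addr = predictor.predict(0x400)  # 1064 + 64 = 1128
```

In runahead mode, a prediction like `next_addr` could be used to issue a prefetch for a load whose real address is not yet available; whether that is useful depends on how regular the access pattern is.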

18-447 Introduction to Computer Architecture – Spring 2015