  * MOESI
    * Improvement over MESI protocol

===== Lecture 29 (4/18 Wed.) =====

  * Interference
  * Complexity of the memory scheduler
    * Ranking/prioritization has a cost
    * A complex scheduler has higher latency
  * Performance metrics for multicore/multithreaded applications (see the C sketch after this list)
    * Speedup
    * Slowdown
    * Harmonic vs. weighted speedup
  * Fairness metric
    * Maximum slowdown
      * Why does it make sense?
      * Any scenario where it does not make sense?
  * Predictable performance
    * Why is it important?
      * In a server environment, different jobs run on the same server
      * In a mobile environment, there are multiple sources that can slow down others
    * How to relate slowdown to request service rate
    * MISE: soft slowdown guarantee
  * BDI
    * Memory wall
      * What is the concern regarding the memory wall?
    * Size of the cache on the die (CPU die)
    * One possible solution: cache compression
      * What are the problems with existing cache compression mechanisms?
        * Some are too complex
        * Decompression is on the critical path
          * Need to decompress when reading the data -> decompression should not be on the critical path
          * Important factor for performance
    * Software compression is not good enough to compress everything
    * Zero value compression
      * Simple
      * Good compression ratio
      * What if the data does not have many zeroes?
    * Frequent value compression
      * Some values appear frequently
      * Simple and good compression ratio
      * Requires profiling
      * Decompression is complex
    * Frequent pattern compression
      * Still too complex in terms of decompression
    * Base-delta compression (see the second sketch after this list)
      * Easy to decompress but retains the benefit of compression
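
A minimal C sketch of the multiprogram performance and fairness metrics above (weighted speedup, harmonic speedup, maximum slowdown). The per-application IPC values, measured alone vs. shared, are made up for illustration.

<code c>
#include <stdio.h>

#define N 4
/* Hypothetical IPC of each application when it runs alone on the machine
 * vs. when all N applications run together and interfere. */
static const double ipc_alone[N]  = {2.0, 1.5, 1.0, 0.8};
static const double ipc_shared[N] = {1.2, 0.9, 0.7, 0.3};

int main(void) {
    double weighted_speedup = 0.0;  /* sum of per-app speedups            */
    double harmonic_denom   = 0.0;  /* sum of per-app slowdowns           */
    double max_slowdown     = 0.0;  /* unfairness: worst per-app slowdown */

    for (int i = 0; i < N; i++) {
        double speedup  = ipc_shared[i] / ipc_alone[i];
        double slowdown = ipc_alone[i] / ipc_shared[i];
        weighted_speedup += speedup;
        harmonic_denom   += slowdown;
        if (slowdown > max_slowdown)
            max_slowdown = slowdown;
    }

    printf("Weighted speedup:  %.3f\n", weighted_speedup);
    printf("Harmonic speedup:  %.3f\n", (double)N / harmonic_denom);
    printf("Maximum slowdown:  %.3f\n", max_slowdown);
    return 0;
}
</code>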
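
A sketch of the base-delta idea: try to store a 64-byte line as one 8-byte base plus 1-byte deltas. This is only one of the (base size, delta size) combinations a BDI-style cache would attempt, and the example values are made up; the point is that decompression is just an addition per word.

<code c>
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define WORDS 8   /* view the 64-byte line as 8 x 8-byte values */

/* Compress the line as base (8 bytes) + signed 1-byte deltas if possible. */
static bool compress_b8_d1(const uint64_t line[WORDS],
                           uint64_t *base, int8_t deltas[WORDS]) {
    *base = line[0];                       /* first word serves as the base */
    for (int i = 0; i < WORDS; i++) {
        int64_t d = (int64_t)(line[i] - *base);
        if (d < INT8_MIN || d > INT8_MAX)
            return false;                  /* a value is too far from the base */
        deltas[i] = (int8_t)d;
    }
    return true;                           /* 64 bytes stored in 8 + 8 = 16 */
}

/* Decompression is one add per word, so it is cheap and fast. */
static void decompress_b8_d1(uint64_t base, const int8_t deltas[WORDS],
                             uint64_t line[WORDS]) {
    for (int i = 0; i < WORDS; i++)
        line[i] = base + (int64_t)deltas[i];
}

int main(void) {
    uint64_t line[WORDS] = {0x1000, 0x1008, 0x1010, 0x1018,
                            0x1020, 0x1028, 0x1030, 0x1038};
    uint64_t base, out[WORDS];
    int8_t deltas[WORDS];

    if (compress_b8_d1(line, &base, deltas)) {
        decompress_b8_d1(base, deltas, out);
        printf("compressed 64 -> 16 bytes, out[7] = 0x%llx\n",
               (unsigned long long)out[7]);
    }
    return 0;
}
</code>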
     ​     ​

===== Lecture 31 (4/28 Mon.) =====

  * Directory-based cache coherence
    * Each directory has to handle validation/invalidation
    * Extra cost of synchronization
    * Need to ensure race conditions are resolved
  * Interconnection
    * Topology
      * Bus
      * Mesh
        * Torus
      * Tree
      * Butterfly
      * Ring
        * Bi-directional ring
          * More scalable
        * Hierarchical ring
          * Even more scalable
          * More complex
      * Crossbar
      * etc.
    * Circuit switching
    * Multistage network
      * Butterfly
      * Delta network
    * Handling contention
      * Buffering vs. dropping/deflection (no buffering)
    * Routing algorithm
      * Handling deadlock
      * X-Y routing (see the sketch after this list)
        * Turn model (to avoid deadlocks)
      * Add more buffering for an escape path
      * Oblivious routing
        * Can take different paths
          * DOR between each intermediate location
        * Balances network load
      * Adaptive routing
        * Uses the state of the network to determine the route
          * Aware of local and/or global congestion
        * Non-minimal adaptive routing can have livelocks
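
A small sketch of X-Y (dimension-order) routing on a 2D mesh: a packet is routed fully in the X dimension first, then in Y, which restricts the allowed turns and avoids routing deadlock. The router coordinates and port names are made up for illustration.

<code c>
#include <stdio.h>

typedef enum { EAST, WEST, NORTH, SOUTH, LOCAL } Port;

/* One routing decision: finish the X dimension before touching Y. */
static Port xy_route(int cur_x, int cur_y, int dst_x, int dst_y) {
    if (cur_x < dst_x) return EAST;
    if (cur_x > dst_x) return WEST;
    if (cur_y < dst_y) return NORTH;
    if (cur_y > dst_y) return SOUTH;
    return LOCAL;                     /* arrived: eject to the local node */
}

int main(void) {
    static const char *name[] = {"EAST", "WEST", "NORTH", "SOUTH", "LOCAL"};
    int x = 1, y = 3;                 /* packet currently at router (1,3)  */
    Port p;
    /* Walk the packet hop by hop toward destination (4,1). */
    while ((p = xy_route(x, y, 4, 1)) != LOCAL) {
        printf("at (%d,%d) -> go %s\n", x, y, name[p]);
        if      (p == EAST)  x++;
        else if (p == WEST)  x--;
        else if (p == NORTH) y++;
        else                 y--;
    }
    printf("arrived at (%d,%d)\n", x, y);
    return 0;
}
</code>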

===== Lecture 32 (4/30 Wed.) =====

  * Serialized code sections
    * Degrade performance
    * Waste energy
  * Heterogeneous cores
    * Can execute the serialized portion on a powerful large core (see the speedup sketch after this list)
  * Tradeoff between multiple small cores, multiple large cores, or heterogeneous cores
  * Critical sections
    * Bottleneck in several multithreaded workloads
    * Asymmetry can help
    * Accelerated critical sections (ACS)
      * Use a large core to run the serialized portion of the code
      * How to correctly support ACS
      * False serialization
      * Handling private/shared data
    * BIS
      * Identify the bottleneck
        * Serial bottleneck
        * Barrier
        * Critical section
        * Pipeline stages
      * An application might wait on different types of bottlenecks
      * Allow BottleneckCall and BottleneckReturn instructions
      * Acceleration can be done in multiple ways
        * Ship to a big core
        * Increase the frequency
        * Prioritize the thread in shared resources (memory scheduler always schedules requests from that thread first, etc.)
      * A bottleneck table keeps track of different threads' bottlenecks and determines their criticality (see the second sketch after this list)
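
A toy Amdahl's-law-style comparison of a symmetric chip against an asymmetric one (one big core plus many small cores), showing why running the serialized portion on a large core helps. The area budget, the parallel fraction, and the assumption that a big core built from r small-core areas gives sqrt(r) times the serial performance are all modeling assumptions, not numbers from the lecture.

<code c>
#include <stdio.h>
#include <math.h>

#define AREA 16.0        /* chip area budget, in small-core units (assumed) */

/* Assumed model: a big core using r small-core areas runs serial code
 * sqrt(r) times faster than a small core. */
static double perf(double r) { return sqrt(r); }

/* Speedup of AREA small cores over one small core. */
static double symmetric(double par) {
    return 1.0 / ((1.0 - par) + par / AREA);
}

/* Speedup of 1 big core (area r) + (AREA - r) small cores: the serial part
 * runs on the big core, the parallel part runs on all cores together. */
static double asymmetric(double par, double r) {
    return 1.0 / ((1.0 - par) / perf(r) + par / (perf(r) + (AREA - r)));
}

int main(void) {
    double par = 0.90;   /* parallel fraction of the program (assumed) */
    printf("16 small cores:             %.2fx\n", symmetric(par));
    printf("1 big (area 4) + 12 small:  %.2fx\n", asymmetric(par, 4.0));
    return 0;
}
</code>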
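
A tiny sketch of the bottleneck-table idea mentioned above: threads report how long they wait on each bottleneck, and the bottleneck that has accumulated the most waiter cycles is the one worth accelerating (e.g. by shipping it to the big core). The table size, ids, and cycle counts below are made up for illustration.

<code c>
#include <stdint.h>
#include <stdio.h>

#define MAX_BOTTLENECKS 16

struct bn_entry {
    uint32_t id;             /* bottleneck id (e.g. a lock address)   */
    uint64_t waiter_cycles;  /* total cycles threads spent waiting    */
};

static struct bn_entry table[MAX_BOTTLENECKS];
static int entries;

/* Record that some thread waited 'cycles' cycles on bottleneck 'id'. */
static void bottleneck_wait(uint32_t id, uint64_t cycles) {
    for (int i = 0; i < entries; i++)
        if (table[i].id == id) { table[i].waiter_cycles += cycles; return; }
    if (entries < MAX_BOTTLENECKS)
        table[entries++] = (struct bn_entry){ id, cycles };
}

/* The most critical bottleneck is the one with the most waiter cycles. */
static uint32_t most_critical(void) {
    int best = 0;
    for (int i = 1; i < entries; i++)
        if (table[i].waiter_cycles > table[best].waiter_cycles)
            best = i;
    return table[best].id;
}

int main(void) {
    bottleneck_wait(0x40, 1200);   /* made-up ids and wait times */
    bottleneck_wait(0x80,  300);
    bottleneck_wait(0x40,  900);
    printf("accelerate bottleneck 0x%x\n", most_critical());
    return 0;
}
</code>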

===== Lecture 33 (5/2 Fri.) =====

  * DRAM scaling problem
  * Possible solutions to the scaling problem
    * Lower-leakage DRAM
    * Heterogeneous DRAM (TL-DRAM, etc.)
    * Add more functionality to DRAM
    * Denser designs (3D stacking)
    * Different technology
      * NVM
  * Non-volatile memory
    * Resistive memory
      * PCM
        * Inject current to change the phase
        * Scales better than DRAM
          * Multiple bits per cell (see the sketch after this list)
            * Wider resistance range
        * No refresh is needed
        * Downsides: latency and write endurance
      * STT-MRAM
        * Inject current to change the polarity
      * Memristor
        * Inject current to change the structure
    * Persistency - data stays there even without power
      * Unified memory and storage management (persistent data structures) - single-level store
        * Improves energy efficiency and performance
        * Simplifies the programming model
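
A toy illustration of how a wider resistance range gives multiple bits per cell: the range is split into more bands, and the band a cell's resistance falls into is the stored value. The thresholds below are invented; real multi-level PCM uses carefully placed resistance bands and encodings.

<code c>
#include <stdio.h>

/* Map a cell's resistance (ohms) to a 2-bit stored value by splitting the
 * resistance range into four bands. Threshold values are made up. */
static unsigned read_2bit_cell(double r_ohms) {
    if (r_ohms < 1e4) return 0;   /* lowest-resistance band  */
    if (r_ohms < 1e5) return 1;
    if (r_ohms < 1e6) return 2;
    return 3;                     /* highest-resistance band */
}

int main(void) {
    double samples[] = {5e3, 5e4, 5e5, 5e6};   /* one sample per band */
    for (int i = 0; i < 4; i++)
        printf("R = %.0e ohm -> stored value %u\n",
               samples[i], read_2bit_cell(samples[i]));
    return 0;
}
</code>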