      * How many of the demand misses were caused by the prefetcher?
        * Hard to quantify

===== Lecture 26 (4/3 Fri.) =====

  * Feedback directed prefetcher
    * Use the prefetcher's own results as feedback to tune it (e.g., throttle its aggressiveness)
      * Feedback metrics: accuracy, timeliness, cache pollution
  * Markov prefetcher
    * Prefetch based on previous miss history
    * Use a Markov model to predict the next address (see the sketch at the end of this list)
    * Pros: Can cover arbitrary patterns (easy for linked-list or tree traversals)
    * Downside: High cost; cannot help with compulsory misses (no history)
  * Content directed prefetching
    * Identify values in fetched memory that look like pointers (and use them as addresses to prefetch)
    * Not very efficient (hard to figure out which values in a block are pointers)
      * Software can give hints
  * Correlation table
    * Address correlation
  * Execution based prefetcher
    * Helper thread/speculative thread
      * Use another thread to pre-execute a program (see the sketch at the end of this list)
    * Can be software based or hardware based
    * Discovers misses before the main program does (to prefetch data in a timely manner)
    * How do you construct the helper thread?
    * Pre-execute instruction (one example of how to initialize a speculative thread), slide 9
    * Thread-based pre-execution
  * Error tolerance
  * Solutions to errors
    * Tolerate errors
      * New interfaces, new designs
    * Eliminate or minimize errors
      * New technology, system-wide rethinking
    * Embrace errors
      * Map data that can tolerate errors to error-prone areas
  * Hybrid memory systems
    * Combining multiple memory technologies together
  * What can emerging technologies help with?
    * Scalability
    * Lower cost
    * Energy efficiency
  * Possible solutions to the scaling problem
    * Lower-leakage DRAM
    * Heterogeneous DRAM (TL-DRAM, etc.)
    * Add more functionality to DRAM
    * Denser designs (3D stacking)
    * Different technology
      * NVM
  * Charge vs. resistive memory
    * How is data written?
    * How is data read?
  * Non-volatile memory
    * Resistive memory
      * PCM
        * Inject current to change the phase
        * Scales better than DRAM
          * Multiple bits per cell
            * Wider resistance range
        * No refresh is needed
        * Downsides: Latency and write endurance
      * STT-MRAM
        * Inject current to change the polarity
      * Memristor
        * Inject current to change the structure
    * Pros and cons between the different technologies
    * Persistence - data stays there even without power
      * Unified memory and storage management (persistent data structures) - single-level store
        * Improves energy and performance
        * Simplifies the programming model
  * Different design options for DRAM + NVM (a placement sketch follows this list)
    * DRAM as a cache
    * Place some data in DRAM and other data in PCM
      * Based on access characteristics
        * Frequently accessed data that needs lower write latency goes in DRAM
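
A minimal sketch of such a DRAM/PCM placement policy, assuming hypothetical per-page access counters and made-up thresholds (an illustration of the idea, not a design from the lecture):

<code c>
/* Hypothetical page-placement policy for a hybrid DRAM + PCM system.
 * Write-intensive pages go to DRAM (PCM writes are slow and wear out
 * the cells); cold, read-mostly pages go to the denser, cheaper PCM.
 * All thresholds are made up for illustration. */
#include <stdint.h>

enum placement { IN_DRAM, IN_PCM };

struct page_stats {
    uint32_t reads;   /* demand reads in the current epoch  */
    uint32_t writes;  /* demand writes in the current epoch */
};

enum placement place_page(const struct page_stats *p)
{
    if (p->writes > 64)              /* needs DRAM's lower write latency */
        return IN_DRAM;
    if (p->reads + p->writes > 512)  /* very hot data also favors DRAM   */
        return IN_DRAM;
    return IN_PCM;                   /* cold data: PCM's density wins    */
}
</code>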
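
A minimal sketch of the Markov prefetcher mentioned above, assuming a small direct-mapped correlation table that remembers one successor per miss address; the table size and organization are invented for illustration:

<code c>
/* On a miss to address A, prefetch the address that most recently
 * followed A, and record A as the successor of the previous miss.
 * History-based, so it can follow arbitrary patterns (linked lists,
 * trees) but cannot help compulsory misses (no history yet). */
#include <stdint.h>

#define TABLE_ENTRIES 1024

struct entry {
    uint64_t miss_addr;  /* tag: the miss address                */
    uint64_t next_addr;  /* address that last followed this miss */
};

static struct entry table[TABLE_ENTRIES];
static uint64_t last_miss;

static void issue_prefetch(uint64_t addr)
{
    (void)addr;  /* stand-in for handing the address to the prefetch queue */
}

void on_cache_miss(uint64_t addr)
{
    /* Learn: the previous miss is now known to be followed by addr. */
    struct entry *prev = &table[last_miss % TABLE_ENTRIES];
    prev->miss_addr = last_miss;
    prev->next_addr = addr;

    /* Predict: prefetch the recorded successor of this miss, if any. */
    struct entry *e = &table[addr % TABLE_ENTRIES];
    if (e->miss_addr == addr && e->next_addr != 0)
        issue_prefetch(e->next_addr);

    last_miss = addr;
}
</code>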
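
And a minimal sketch of a software helper thread that pre-executes a linked-list traversal; the list layout and pthread-based setup are assumptions for illustration. The helper runs the same pointer chase but only touches each node, so it runs ahead of the main thread and discovers the misses early:

<code c>
#include <pthread.h>
#include <stddef.h>

struct node { struct node *next; int payload[15]; };

/* Stripped-down "address slice": just chase pointers and touch data. */
static void *helper(void *arg)
{
    volatile int sink = 0;
    for (struct node *n = arg; n != NULL; n = n->next)
        sink += n->payload[0];   /* load only: warms up the cache */
    return NULL;
}

int traverse_sum(struct node *head)
{
    pthread_t t;
    int sum = 0;
    pthread_create(&t, NULL, helper, head);  /* launch the pre-execution */
    for (struct node *n = head; n != NULL; n = n->next)
        for (int i = 0; i < 15; i++)
            sum += n->payload[i];            /* full per-node computation */
    pthread_join(&t, NULL);
    return sum;
}
</code>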

===== Lecture 27 (4/6 Mon.) =====
  * Flynn's taxonomy
  * Parallelism
    * Reduces power consumption (dynamic power P ~ C V^2 F; see the worked example at the end of this list)
    * Better cost efficiency and easier to scale
    * Improves dependability (in case one core is faulty)
  * Different types of parallelism
    * Instruction level parallelism
    * Data level parallelism
    * Task level parallelism
  * Task level parallelism
    * Partition a single, potentially big, task into multiple parallel sub-tasks
      * Can be done explicitly (parallel programming by the programmer)
      * Or implicitly (hardware partitions a single thread speculatively)
    * Or, run multiple independent tasks (still improves throughput, but the speedup of any single task is no better; also simpler to implement)
  * Loosely coupled multiprocessor
    * No shared global address space
      * Message passing to communicate between processors
    * Simple to manage memory
  * Tightly coupled multiprocessor
    * Shared global address space
    * Need to ensure consistency of data
    * Programming issues
  * Hardware-based multithreading
    * Coarse grained
    * Fine grained
    * Simultaneous: Dispatch instructions from multiple threads at the same time
  * Parallel speedup
    * Superlinear speedup
  * Utilization, Redundancy, Efficiency
  * Amdahl's law (see the worked example at the end of this list)
    * Maximum speedup
    * Parallel portion is not perfect
      * Serial bottleneck
      * Synchronization cost
      * Load imbalance
        * Some threads have more work and take longer to reach the synchronization point
  * Critical sections (see the sketch at the end of this list)
    * Enforce mutually exclusive access to shared data
  * Issues in parallel programming
    * Correctness
    * Synchronization
    * Consistency
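
A worked example for the P ~ C V^2 F bullet above, assuming the idealized case where supply voltage can scale down linearly with frequency; the numbers are normalized, not measurements:

<code c>
/* Two cores at half frequency (and thus roughly half voltage) finish
 * the same work as one full-speed core, but dynamic power drops to
 * 2 * C * (V/2)^2 * (F/2) = CV^2F / 4, i.e. about a quarter. */
#include <stdio.h>

int main(void)
{
    double C = 1.0, V = 1.0, F = 1.0;               /* normalized baseline   */
    double one_fast = C * V * V * F;                /* one core, full speed  */
    double two_slow = 2 * C * (V/2)*(V/2) * (F/2);  /* two cores, half speed */
    printf("one fast core: %.2f\n", one_fast);      /* 1.00 */
    printf("two slow cores: %.2f\n", two_slow);     /* 0.25 */
    return 0;
}
</code>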
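
A worked example of Amdahl's law with assumed numbers (95% parallel fraction):

<code c>
/* Speedup = 1 / ((1 - p) + p / n) for parallel fraction p on n
 * processors. The serial fraction (1 - p) caps the speedup at
 * 1 / (1 - p) no matter how many processors are added. */
#include <stdio.h>

static double amdahl(double p, double n)
{
    return 1.0 / ((1.0 - p) + p / n);
}

int main(void)
{
    printf("n=16:   %.2fx\n", amdahl(0.95, 16.0));   /* ~9.1x  */
    printf("n=1024: %.2fx\n", amdahl(0.95, 1024.0)); /* ~19.6x */
    printf("bound:  %.2fx\n", 1.0 / (1.0 - 0.95));   /* 20x    */
    return 0;
}
</code>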
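
And a minimal pthread sketch of a critical section; the shared counter and thread function are illustrative:

<code c>
/* The mutex enforces mutually exclusive access to shared_count, so
 * concurrent increments from multiple threads are never lost. */
#include <pthread.h>

static long shared_count = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);    /* enter critical section    */
        shared_count++;               /* one thread at a time here */
        pthread_mutex_unlock(&lock);  /* exit critical section     */
    }
    return NULL;
}
</code>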

===== Lecture 28 (4/8 Wed.) =====