Differences

This shows you the differences between two versions of the page.

buzzwords [2010/11/11 05:12]
lsubrama
buzzwords [2010/12/01 23:06] (current)
lsubrama
Line 311: Line 311:
===== Lecture 19 ==== ===== Lecture 19 ====
- +  Main memory system
-Main memory system+
  * Memory hierarchy   * Memory hierarchy
Line 330: Line 329:
    - Memory controller placement     - Memory controller placement
-==== Lecture 20 =====+===== Lecture 20 =====
  * DRAM controller functions   * DRAM controller functions
Line 355: Line 354:
===== Lecture 21 ==== ===== Lecture 21 ====
-Super scalar processing+  Super scalar processing I
  * Types of parallelism   * Types of parallelism
Line 374: Line 373:
===== Lecture 22 ==== ===== Lecture 22 ====
 +  Super scalar processing II
  * Trace Caches   * Trace Caches
Line 399: Line 399:
      - Micro op sequencer       - Micro op sequencer
    - Instruction buffering fetch and decode     - Instruction buffering fetch and decode
 +
 +===== Lecture 23 ====
 +  Superscalar Processing III
 +
 +  * Renaming multiple instructions
 +    - dependency check logic (n^2 comparators)
 +    - help from compiler
 +      * ensure instructions are independent (difficult for wide fetches)
 +      * hardware-software co-design to simplify dependency logic
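
The dependency check for renaming a fetch group can be sketched roughly as follows. This is a minimal illustration, not the lecture's own design: the map table `rat`, the tuple layout `(dest, src1, src2)`, and the sequential free-register counter `next_phys` are all assumptions made for the example. Each source is compared against the destination of every earlier instruction in the same group, which is where the quadratic comparator count comes from.

```python
def rename_group(group, rat, next_phys):
    """Rename one fetch group of (dest, src1, src2) architectural registers.

    rat       -- hypothetical map table: architectural -> physical register
    next_phys -- hypothetical first free physical register number
    Returns (renamed instructions, number of comparisons performed).
    """
    renamed = []
    new_dests = []      # (arch dest, phys dest) of earlier instructions in the group
    comparisons = 0
    for dest, src1, src2 in group:
        phys_srcs = []
        for src in (src1, src2):
            phys = rat[src]  # default: mapping from before this group
            # Compare against every earlier destination in the group;
            # the youngest earlier match overrides the map-table value.
            for earlier_dest, earlier_phys in new_dests:
                comparisons += 1
                if earlier_dest == src:
                    phys = earlier_phys
            phys_srcs.append(phys)
        phys_dest = next_phys
        next_phys += 1
        new_dests.append((dest, phys_dest))
        renamed.append((phys_dest, phys_srcs[0], phys_srcs[1]))
    # The map table is updated once the whole group has been renamed.
    for dest, phys_dest in new_dests:
        rat[dest] = phys_dest
    return renamed, comparisons


rat = {"r1": 1, "r2": 2, "r3": 3}
group = [("r1", "r2", "r3"), ("r2", "r1", "r3"), ("r3", "r1", "r2")]
renamed, comparisons = rename_group(group, rat, next_phys=10)
```

With two sources per instruction, a group of n instructions needs n(n-1) comparisons in this sketch, so the cost grows quadratically with fetch width.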
 +
 +  * Dispatching multiple instructions
 +    - wakeup logic (compare all tags in reservation station with all the tags that are broadcast)
 +    - select logic (hierarchical tree based selection)
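
Wakeup and select can be sketched as below. This is an illustrative model under stated assumptions: the reservation-station entry layout (a dict with a `srcs` list of tag/ready pairs) is hypothetical, and lower index is assumed to mean older. Wakeup compares every source tag with every broadcast tag; select reduces the request vector pairwise, as a tree of arbiters would.

```python
def wakeup(entries, broadcast_tags):
    """Mark a source ready when its tag matches any broadcast tag.

    Returns the number of tag comparisons performed.
    """
    comparisons = 0
    for entry in entries:
        for src in entry["srcs"]:
            for tag in broadcast_tags:
                comparisons += 1
                if src["tag"] == tag:
                    src["ready"] = True
    return comparisons


def select_oldest(requests):
    """Tree-based select: pairwise reduction to the lowest requesting index."""
    level = [i if req else None for i, req in enumerate(requests)]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            left = level[i]
            right = level[i + 1] if i + 1 < len(level) else None
            # Each tree node prefers its left (older) input.
            nxt.append(left if left is not None else right)
        level = nxt
    return level[0] if level else None


entries = [
    {"srcs": [{"tag": 5, "ready": False}, {"tag": 7, "ready": True}]},
    {"srcs": [{"tag": 6, "ready": False}, {"tag": 5, "ready": False}]},
    {"srcs": [{"tag": 6, "ready": False}, {"tag": 7, "ready": True}]},
]
n_compares = wakeup(entries, broadcast_tags=[6])
requests = [all(s["ready"] for s in e["srcs"]) for e in entries]
winner = select_oldest(requests)
```

The reduction takes a number of levels logarithmic in the window size, which is the motivation for a hierarchical tree rather than a linear priority chain.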
 +
 +  * Execute
 +    - enough execution units
 +    - enough forwarding paths (broadcast tag/value to all functional units)
 +
 +  * Reducing dispatch+bypass delays
 +    - clustering (divide window into multiple clusters)
 +    - intra-cluster bypass is fast
 +    - inter-cluster bypass can be slow
 +
 +  * Register file
 +    - need multiple reads/writes per cycle
 +    - Replicate or partition the register files
 +    - using block-structured ISA
 +
 +  * Retirement
 +    - updating architectural register map
 +   
 +
 +===== Lecture 24 ====
 +  Control Flow
 +
 +  * Problem of branches
 +  * Types
 +    * conditional, unconditional, call, return, indirect branches
 +  * Handling conditional branches
 +  * Predicate combining
 +    * condition codes vs condition registers
 +  * Delayed branching
 +  * Fine-grained multi-threading
 +  * Branch prediction
 +    * predicting if an instruction is a branch (predecoding)
 +    * predicting the direction of the branch
 +    * predicting the target address of a branch
 +  * Static branch prediction
 +    * always taken/not taken
 +    * backward taken, forward not taken
 +    * by compiler based on profiling
 +  * Dynamic branch prediction
 +    * last time predictor
 +    * history based predictors
 +    * two-level predictors
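
Two of the schemes listed above, backward-taken/forward-not-taken and the last-time predictor, can be sketched as follows. The table size, the indexing by `pc >> 2` (which assumes 4-byte instructions), and the helper `count_mispredictions` are assumptions made for illustration only.

```python
def btfn_predict(branch_pc, target_pc):
    """Static BTFN: predict taken only if the target is at a lower address."""
    return target_pc < branch_pc


class LastTimePredictor:
    """One bit per table entry: predict the outcome seen last time."""

    def __init__(self, entries=1024):
        self.entries = entries
        self.table = [False] * entries  # initially predict not taken

    def _index(self, pc):
        return (pc >> 2) % self.entries  # assumes 4-byte instructions

    def predict(self, pc):
        return self.table[self._index(pc)]

    def update(self, pc, taken):
        self.table[self._index(pc)] = taken


def count_mispredictions(predictor, pc, outcomes):
    """Hypothetical helper: replay a branch's outcomes and count wrong guesses."""
    wrong = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            wrong += 1
        predictor.update(pc, taken)
    return wrong


loop_branch = ([True] * 9 + [False]) * 2   # 10-iteration loop, run twice
alternating = [True, False] * 5
loop_wrong = count_mispredictions(LastTimePredictor(), 0x400, loop_branch)
alt_wrong = count_mispredictions(LastTimePredictor(), 0x400, alternating)
```

In this sketch the last-time predictor mispredicts twice per loop execution (at entry and at exit) and mispredicts every outcome of a strictly alternating branch, which is the weakness that history-based predictors address.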
 +===== Lecture 25 ====
 +  Control Flow - II
 +
 +  * 2-bit counter based prediction
 +  * Global branch prediction
 +  * Global branch correlation
 +  * Global two-level prediction
 +    - Global history register
 +  * Local two-level prediction
 +    - Pattern history table
 +    - Interference in the pattern history table
 +      - Randomizing the index into the pattern history table
 +      - Agree prediction
 +  * Alpha 21264 Tournament Predictor
 +  * Perceptron branch predictor
 +    - Perceptron - learns a target boolean function of N inputs
 +  * Call and Return Prediction
 +  * Indirect branch prediction
 +    - Virtual Conditional Branch prediction
 +  * Branch prediction issues
 +    - Need to know a branch as soon as it is fetched
 +    - Latency
 +    - State recovery upon misprediction
 +  * Predicated execution
 +
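
A global two-level predictor built from 2-bit counters can be sketched as below. This is a simplified model under assumptions: the pattern history table is indexed by the global history register alone (the branch address is ignored), counters start in the weakly-not-taken state, and the class and helper names are hypothetical.

```python
class TwoBitCounter:
    """Saturating counter: states 0 and 1 predict not taken, 2 and 3 predict taken."""

    def __init__(self, state=1):
        self.state = state

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


class GlobalTwoLevelPredictor:
    """Global history register indexes a pattern history table of 2-bit counters."""

    def __init__(self, history_bits=4):
        self.history_bits = history_bits
        self.ghr = 0
        self.pht = [TwoBitCounter() for _ in range(1 << history_bits)]

    def predict(self, pc):
        return self.pht[self.ghr].predict()  # pc unused in this simplified form

    def update(self, pc, taken):
        self.pht[self.ghr].update(taken)
        mask = (1 << self.history_bits) - 1
        self.ghr = ((self.ghr << 1) | int(taken)) & mask


def count_mispredictions(predictor, outcomes, pc=0x400):
    """Hypothetical helper: replay outcomes and count wrong predictions."""
    wrong = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            wrong += 1
        predictor.update(pc, taken)
    return wrong


alternating = [True, False] * 20
loop_branch = ([True] * 9 + [False]) * 2
# history_bits=0 degenerates to a single 2-bit counter.
single_counter_alt = count_mispredictions(GlobalTwoLevelPredictor(0), alternating)
single_counter_loop = count_mispredictions(GlobalTwoLevelPredictor(0), loop_branch)
two_level_alt = count_mispredictions(GlobalTwoLevelPredictor(2), alternating)
```

With zero history bits the model is just one 2-bit counter, which handles the loop branch well but fails on the alternating pattern; two bits of global history are enough for the table to learn that pattern after a short warm-up.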
 +===== Lecture 26 ====
 +  Control Flow - III & Concurrency
 +
 +  * Predicated Execution
 +    - Predication decisions at the compiler
 +    - Rename stage modifications
 +  * Limitations of predication
 +    - Adaptivity
 +    - Complex Control Flow Graphs
 +    - ISA support
 +  * Wish branches
 +    - Wish jump/join
 +    - Wish loop
 +  * Wish branches vs Predicated Execution
 +  * Wish branches vs Branch prediction
 +  * Diverge-Merge Processor
 +  * Dynamic-Hammock
 +  * Multi-path Execution
 +  * Research issues in control flow handling
 +    - Hardware/software cooperation
 +    - Fetch gating
 +    - Recycling useful work done on wrong path
 +  Concurrency
 +  * Classification of machines
 +    - SISD
 +    - SIMD
 +    - MIMD
 +  * Decoupled Access/Execute
 +  * Astronautics ZS-1
 +  * Loop unrolling
 +
 +===== Lecture 27 ====
 +  VLIW
 +
 +  * Each VLIW instruction - a bundle of independent instructions (identified by compiler)
 +  * Each instruction bundle executed by hardware in lockstep
 +  * Commercial VLIW machines
 +    - TI C6000, TriMedia, STMicro
 +  * Intel IA-64 - Partially VLIW
 +  * Encoding VLIW NOPs
 +  * Static Instruction Scheduling for VLIW
 +  * Code motion - Safety & Legality
 +  * Trace scheduling
 +  * List scheduling
 +  * Superblock scheduling
 +  * Hyperblock scheduling
 +  * The Intel IA-64 architecture
 +    - No lock step execution of a bundle
 +    - Specify dependencies between instructions within a bundle
 +    - Template bits
 +  * What hinders static code motion?
 +    - Exceptions
 +    - Loads/Stores
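
List scheduling into fixed-width VLIW bundles can be sketched as follows. This is a minimal illustration under simplifying assumptions: every operation has unit latency, any operation can go in any slot, priority is the longest dependence chain to the end of the block, and empty slots are padded with an explicit `"nop"`. The function name and the `deps` dictionary format are hypothetical.

```python
def list_schedule(deps, width):
    """Greedy list scheduling of a dependence graph into bundles.

    deps  -- dict mapping each op to the set of ops it depends on
    width -- number of slots per VLIW bundle
    Returns a list of bundles, each a list of `width` op names or "nop".
    """
    succs = {op: [] for op in deps}
    for op, preds in deps.items():
        for p in preds:
            succs[p].append(op)

    height = {}

    def get_height(op):
        # Length of the longest dependence chain starting at op.
        if op not in height:
            height[op] = 1 + max((get_height(s) for s in succs[op]), default=0)
        return height[op]

    done = set()
    remaining = set(deps)
    bundles = []
    while remaining:
        # An op is ready once all of its predecessors issued in earlier bundles.
        ready = [op for op in remaining if deps[op] <= done]
        if not ready:
            raise ValueError("dependence graph contains a cycle")
        ready.sort(key=lambda op: (-get_height(op), op))
        issued = ready[:width]
        bundles.append(issued + ["nop"] * (width - len(issued)))
        done.update(issued)
        remaining.difference_update(issued)
    return bundles


deps = {"a": set(), "b": set(), "c": {"a", "b"}, "d": {"c"}, "e": set()}
bundles = list_schedule(deps, width=2)
```

The padded slot in the last bundle shows why NOP encoding matters for VLIW code size: whenever the compiler cannot find enough independent operations, the unused slots still have to be represented somehow.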
