This shows you the differences between two versions of the page.
buzzword [2014/01/17 19:19] rachata |
buzzword [2014/01/31 19:15] rachata |
||
---|---|---|---|
Line 94: | Line 94: | ||
* Tradeoffs between control-driven and data-driven | * Tradeoffs between control-driven and data-driven | ||
* Which is easier to program? | * Which is easier to program? | ||
- | * Which is easier to compile? | + | * Which is easier to compile? |
- | * Which is more parallel (does that mean it is faster?) | + | * Which is more parallel (does that mean it is faster?) |
- | * Which machines are more complex to design? | + | * Which machines are more complex to design? |
* In control flow, when a program is stopped, there is a pointer to the current state (precise state). | * In control flow, when a program is stopped, there is a pointer to the current state (precise state). | ||
* ISA vs. Microarchitecture | * ISA vs. Microarchitecture | ||
* Semantics in the ISA | * Semantics in the ISA | ||
- | * uArch should obey the ISA | + | * uArch should obey the ISA |
- | * Changing ISA is costly, can affect compatibility. | + | * Changing ISA is costly, can affect compatibility. |
* Instruction pointers | * Instruction pointers | ||
* uArch techniques: common and powerful techniques break the von Neumann model if done at the ISA level | * uArch techniques: common and powerful techniques break the von Neumann model if done at the ISA level | ||
Line 164: | Line 164: | ||
* Optimization opportunity | * Optimization opportunity | ||
+ | ===== Lecture 4 (1/22 Wed.) ===== | ||
+ | |||
+ | * Semantic gap | ||
+ | * Small vs. Large semantic gap (CISC vs. RISC) | ||
+ | * Benefit of RISC vs. CISC | ||
+ | * Micro operations/microcode | ||
+ | * Translate complex instructions into smaller instructions | ||
+ | * Parallelism (motivation for RISC) | ||
+ | * Compiler optimization | ||
+ | * Code optimization through translation | ||
+ | * VLIW | ||
+ | * Fixed vs. variable length instructions | ||
+ | * Tradeoffs | ||
+ | * Alignment issues? (fetch/decode) | ||
+ | * Decoding issues? | ||
+ | * Code size? | ||
+ | * Adding additional instructions? | ||
+ | * Memory bandwidth and cache utilization? | ||
+ | * Energy? | ||
+ | * Encoding in variable length instructions | ||
+ | * Structure of Alpha instructions and other uniform decode instructions | ||
+ | * Different types of instructions | ||
+ | * Benefit of knowing the type of an instruction | ||
+ | * Speculatively operate on future instructions | ||
+ | * x86 and other non-uniform decode instructions | ||
+ | * Tradeoff vs. uniform decode | ||
+ | * Tradeoffs for different number of registers | ||
+ | * Spilling into memory if the number of registers is small | ||
+ | * Compiler optimization on how to manage which value to keep/spill | ||
+ | * Addressing modes | ||
+ | * Benefits? | ||
+ | * Types? | ||
+ | * Different uses of addressing modes? | ||
+ | * Various ISA-level tradeoffs | ||
+ | * Virtual memory | ||
+ | * Unaligned memory access/aligned memory access | ||
+ | * Cost vs. benefit of unaligned access | ||
+ | * ISA specification | ||
+ | * Things you have to obey/specify in the ISA specification | ||
+ | * Architectural states | ||
+ | * Microarchitecture implements how arch. state A is transformed to the next arch. state A' | ||
+ | * Single cycle machines | ||
+ | * Critical path in the single cycle machine | ||
+ | * Multi cycle machines | ||
+ | * Functional units | ||
+ | * Performance metrics | ||
+ | * CPI/IPC | ||
+ | * CPI of a single cycle microarchitecture | ||
+ | |||
+ | ===== Lecture 5 (1/24 Fri.) ===== | ||
+ | |||
+ | * Instruction processing | ||
+ | * Fetch | ||
+ | * Decode | ||
+ | * Execute | ||
+ | * Memory fetch | ||
+ | * Writeback | ||
+ | * Datapath & Control logic in microprocessors | ||
+ | * Different types of instructions (I-type, R-type, etc.) | ||
+ | * Control flow instructions | ||
+ | * Non-control flow instructions | ||
+ | * Delay slot/Delayed branch | ||
+ | * Single cycle control logic | ||
+ | * Lockstep | ||
+ | * Critical path analysis | ||
+ | * Critical path of a single cycle processor | ||
+ | * Combinational logic & Sequential logic | ||
+ | * Control store | ||
+ | * Tradeoffs of a single cycle uarch | ||
+ | * Dynamic power/Static power | ||
+ | * Speedup calculation | ||
+ | * Parallelism | ||
+ | * Serial bottleneck | ||
+ | * Amdahl's bottleneck | ||
+ | * Design principles | ||
+ | * Common case design | ||
+ | * Critical path design | ||
+ | * Balanced designs | ||
+ | * Multi cycle design | ||
+ | |||
+ | ===== Lecture 6 (1/27 Mon.) ===== | ||
+ | |||
+ | * Microcoded/Microprogrammed machines | ||
+ | * States | ||
+ | * Microinstructions | ||
+ | * Microsequencing | ||
+ | * Control store - Produces control signals | ||
+ | * Microsequencer | ||
+ | * Control signal | ||
+ | * What do they have to control? | ||
+ | * Instruction processing cycle | ||
+ | * Latch signals | ||
+ | * State machine | ||
+ | * State variables | ||
+ | * Condition code | ||
+ | * Steering bits | ||
+ | * Branch enable logic | ||
+ | * Difference between gating and loading? (write enable vs. driving the bus) | ||
+ | * Memory mapped I/O | ||
+ | * Hardwired logic | ||
+ | * What control signals come from hardwired logic? | ||
+ | * Variable latency memory | ||
+ | * Handling interrupts | ||
+ | * Difference between interrupts and exceptions | ||
+ | * Emulator (i.e. uCode allows a minimal datapath to emulate the ISA) | ||
+ | * Updating machine behavior | ||
+ | * Horizontal microcode | ||
+ | * Vertical microcode | ||
+ | * Primitives | ||
+ | |||
+ | ===== Lecture 7 (1/29 Wed.) ===== | ||
+ | |||
+ | * Pipelining | ||
+ | * Limitations of the multi-programmed design | ||
+ | * Idle resources | ||
+ | * Throughput of a pipelined design | ||
+ | * What dictates the throughput of a pipelined design? | ||
+ | * Latency of the pipelined design | ||
+ | * Dependency | ||
+ | * Overhead of pipelining | ||
+ | * Latch cost? | ||
+ | * Data forwarding/bypassing | ||
+ | * What is the ideal pipeline? | ||
+ | * External fragmentation | ||
+ | * Issues in pipeline designs | ||
+ | * Stalling | ||
+ | * Dependency (Hazard) | ||
+ | * Flow dependence | ||
+ | * Output dependence | ||
+ | * Anti dependence | ||
+ | * How to handle them? | ||
+ | * Resource contention | ||
+ | * Keeping the pipeline full | ||
+ | * Handling exceptions/interrupts | ||
+ | * Pipeline flush | ||
+ | * Speculation | ||
+ | * Interlocking | ||
+ | * Multipath execution | ||
+ | * Fine grain multithreading | ||
+ | * No-op (Bubbles in the pipeline) | ||
+ | * Valid bits in the instructions | ||
+ | |||
+ | ===== Lecture 8 (1/31 Fri.) ===== | ||
+ | * Branch prediction | ||
+ | * Different types of data dependence | ||
+ | * Pipeline stalls | ||
+ | * Bubbles | ||
+ | * How to handle stalls | ||
+ | * Stall conditions | ||
+ | * Stall signals | ||
+ | * Dependences | ||
+ | * Distance between dependences | ||
+ | * Data forwarding/bypassing | ||
+ | * Maintaining the correct dataflow | ||
+ | * Different ways to design data forwarding path/logic | ||
+ | * Different techniques to handle interlocking | ||
+ | * SW based | ||
+ | * HW based | ||
+ | * Profiling | ||
+ | * Static profiling | ||
+ | * Help from the software (compiler) | ||
+ | * Superblock optimization | ||
+ | * Analyzing basic blocks | ||
+ | * How to deal with branches? | ||
+ | * Branch prediction | ||
+ | * Delayed branching (branch delay slot) | ||
+ | * Forward control flow/backward control flow | ||
+ | * Branch prediction accuracy | ||
+ | * Profile guided code positioning | ||
+ | * Position the code based on the profile info. | ||
+ | * Try to make the next sequential instruction be the next inst. to be executed | ||
+ | * Trace cache | ||
+ | * Predicate combining (combine predicate for a branch instruction) | ||
+ | * Predicated execution (control dependence becomes data dependence) | ||
+ | * Definition of basic blocks | ||
+ | * Control flow graph |
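
The Lecture 7 and Lecture 8 lists name three kinds of data dependence (flow, output, anti). As a rough illustration only, the sketch below classifies the dependences between two instructions in program order. The `Inst` record and the `classify_dependences` helper are hypothetical names introduced here, not something from the lectures, and the model assumes simple register-to-register instructions with one destination.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inst:
    """Hypothetical instruction model: dst <- op(srcs). Not from the lectures."""
    dst: str
    srcs: tuple


def classify_dependences(earlier, later):
    """Return the set of dependences that `later` has on `earlier`.

    `earlier` precedes `later` in program order.
    """
    found = set()
    if earlier.dst in later.srcs:
        found.add("flow")    # read-after-write: later reads what earlier writes
    if later.dst in earlier.srcs:
        found.add("anti")    # write-after-read: later overwrites what earlier reads
    if later.dst == earlier.dst:
        found.add("output")  # write-after-write: both write the same register
    return found


i1 = Inst("r3", ("r1", "r2"))  # r3 <- r1 op r2
i2 = Inst("r5", ("r3", "r4"))  # r5 <- r3 op r4 (reads r3)
i3 = Inst("r1", ("r6", "r7"))  # r1 <- r6 op r7 (overwrites r1)
i4 = Inst("r3", ("r8", "r9"))  # r3 <- r8 op r9 (overwrites r3)

print(classify_dependences(i1, i2))
print(classify_dependences(i1, i3))
print(classify_dependences(i1, i4))
```

Only the flow dependence carries a real value from one instruction to the next. Anti and output dependences come from reusing a register name, which is why they are usually treated differently from flow dependences.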