====== Buzzwords ======
Buzzwords are terms mentioned during lecture that are particularly important to understand thoroughly. This page tracks the buzzwords for each of the lectures and can be used as a reference for finding gaps in your understanding of the course material.

===== Lecture 1 (1/12 Mon.) =====

  * Level of transformation
  * Algorithm
  * Compiler
  * Cross abstraction layers
  * Tradeoffs
  * Caches
  * DRAM/memory controller
  * DRAM banks
  * Row buffer hit/miss
  * Row buffer locality
  * Unfairness
  * Memory performance hog
  * Shared DRAM memory system
  * Streaming access vs. random access
  * Memory scheduling policies
  * Scheduling priority
  * Retention time of DRAM
  * Process variation
  * Retention time profile
  * Power consumption
  * Bloom filter (see the sketch after this list)
  * Hamming code
  * Hamming distance
  * DRAM row hammer
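A Bloom filter is worth pinning down with code. Below is a minimal sketch, not lecture material: the bit-array size and the two hash functions are invented for illustration. It shows the key property behind the buzzword: a query can return a false positive, but never a false negative. One reason it appears alongside the DRAM retention buzzwords is that a Bloom filter can compactly over-approximate a set such as "rows with short retention time."

<code c>
#include <stdint.h>
#include <stdio.h>

/* Toy Bloom filter: a 64-bit array and two made-up hash functions.
 * Real filters pick the array size and hash count for a target
 * false-positive rate. */
static uint64_t bits;

static unsigned h1(uint32_t x) { return (x * 2654435761u) % 64; }
static unsigned h2(uint32_t x) { return (x * 40503u + 7u) % 64; }

static void bf_insert(uint32_t x) {
    bits |= (1ULL << h1(x)) | (1ULL << h2(x));
}

/* 1 = "possibly present" (false positives happen when other
 * insertions set the same bits); 0 = "definitely not present". */
static int bf_query(uint32_t x) {
    return ((bits >> h1(x)) & 1) && ((bits >> h2(x)) & 1);
}

int main(void) {
    bf_insert(12);
    bf_insert(34);
    printf("12 -> %d, 99 -> %d\n", bf_query(12), bf_query(99));
    return 0;
}
</code>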
===== Lecture 2 (1/14 Wed.) =====

  * Moore's Law
  * Algorithm --> step-by-step procedure to solve a problem
  * in-order execution
  * out-of-order execution
  * Scaling issue
    * Transistors are getting smaller
  * Key components of a computer
  * Design points
    * Design processors to meet the design points
  * Software stack
  * Design decisions
  * Datacenters
  * Reliability problems that cause errors
  * Analogies from Kuhn's "The Structure of Scientific Revolutions" (recommended book)
  * Computation
  * Communication
  * Storage
    * DRAM
    * NVRAM (non-volatile memory): PCM, STT-MRAM
    * Storage (Flash/hard drive)
  * Von Neumann Model (control flow model)
    * Stored program computer
    * Properties of the Von Neumann Model: stored program, sequential instruction processing
    * Unified memory
    * When is an instruction interpreted as an instruction (as opposed to a datum)?
    * Program counter
    * Examples: x86, ARM, Alpha, IBM Power series, SPARC, MIPS
  * Data flow model
  * Data flow machine
  * Tradeoffs between control-driven and data-driven
    * Which are easier to program?
    * Which are easier to compile?
    * Which are more parallel (and does that mean faster)?
    * Which machines are more complex to design?
  * In control flow, when a program is stopped, there is a pointer to the current state (precise state).
  * ISA vs. microarchitecture
    * Semantics in the ISA
    * uArch should obey the ISA
    * Changing the ISA is costly; it can affect compatibility.
  * Instruction pointers
  * uArch techniques: common and powerful techniques break the Von Neumann model if done at the ISA level
    * Out-of-order execution
    * etc.
  * Design techniques
    * Adder implementation (bit serial, ripple carry, carry lookahead); see the sketch after this list
    * Connection Machine (an example of a machine that used bit-serial adders to trade latency for more parallelism)
  * Microprocessor: ISA + uArch + circuits
    * What is part of the ISA? Instructions, memory, etc.
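The adder bullet hides a classic latency/cost tradeoff that a few lines of C can make concrete. This is an illustrative model, not lecture code: the loop mimics a ripple-carry adder's carry chain, which is exactly what carry-lookahead logic spends extra gates to shorten, and what a bit-serial adder accepts in exchange for minimal hardware.

<code c>
#include <stdint.h>
#include <stdio.h>

/* One full adder per bit position. Bit i cannot be computed until
 * the carry out of bit i-1 is known, so the critical path grows
 * linearly with the word width. Carry-lookahead computes carries in
 * parallel to cut this chain; bit-serial reuses one full adder over
 * many cycles. */
static uint32_t ripple_add(uint32_t a, uint32_t b) {
    uint32_t sum = 0, carry = 0;
    for (int i = 0; i < 32; i++) {
        uint32_t ai = (a >> i) & 1u, bi = (b >> i) & 1u;
        sum  |= (ai ^ bi ^ carry) << i;                   /* sum bit   */
        carry = (ai & bi) | (ai & carry) | (bi & carry);  /* carry out */
    }
    return sum;
}

int main(void) {
    printf("%u\n", ripple_add(1234u, 4321u)); /* prints 5555 */
    return 0;
}
</code>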
===== Lecture 3 (1/17 Fri.) =====

  * Microarchitecture
  * Three major tradeoffs of computer architecture
  * Macro-architecture
  * LC-3b ISA
    * Unused instructions
    * Bit steering
  * Instruction processing style
  * 0,1,2,3 address machines
    * Stack machine
    * Accumulator machine
    * 2-operand machine
    * 3-operand machine
    * Tradeoffs between 0,1,2,3 address machines
    * Postfix notation (see the sketch after this list)
  * Instructions/Opcode/Operand specifiers (i.e. addressing modes)
  * Simple vs. complex data types (and their tradeoffs)
  * Semantic gap and level
  * Translation layer
  * Addressability
  * Byte/bit addressable machines
  * Virtual memory
  * Big/little endian
  * Benefits of having registers (data locality)
  * Programmer visible (architectural) state
    * Programmers can access this directly
    * What are the benefits?
  * Microarchitectural state
    * Programmers cannot access this directly
  * Evolution of registers (from accumulators to registers)
  * Different types of instructions
    * Control instructions
    * Data instructions
    * Operation instructions
  * Addressing modes
    * Tradeoffs (complexity, flexibility, etc.)
  * Orthogonal ISA
    * Addressing modes that are orthogonal to instruction types
  * I/O devices
  * Vectored vs. non-vectored interrupts
  * Complex vs. simple instructions
    * Tradeoffs
  * RISC vs. CISC
    * Tradeoff
    * Backward compatibility
    * Performance
    * Optimization opportunity
  * Translation
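Postfix notation and the 0-address (stack) machine are two views of the same idea, which a tiny sketch can show. The "instruction set" below is hypothetical, just push/pop in C: each operation pops its operands off the stack and pushes the result, so instructions carry no operand specifiers at all.

<code c>
#include <stdio.h>

/* Evaluate the postfix expression "3 4 + 2 *" the way a 0-address
 * machine would: PUSH 3; PUSH 4; ADD; PUSH 2; MUL.
 * ADD and MUL name no operands; they implicitly use the stack. */
static int stack[16];
static int sp = 0;

static void push(int v) { stack[sp++] = v; }
static int  pop(void)   { return stack[--sp]; }

int main(void) {
    push(3);
    push(4);
    push(pop() + pop());    /* ADD -> 7  */
    push(2);
    push(pop() * pop());    /* MUL -> 14 */
    printf("%d\n", pop());  /* (3 + 4) * 2 = 14 */
    return 0;
}
</code>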
===== Lecture 4 (1/22 Wed.) =====

  * Semantic gap
    * Small vs. large semantic gap (CISC vs. RISC)
    * Benefits of RISC vs. CISC
  * Micro operations/microcode
    * Translate complex instructions into smaller instructions
  * Parallelism (motivation for RISC)
  * Compiler optimization
    * Code optimization through translation
  * VLIW
  * Fixed vs. variable length instructions
    * Tradeoffs
      * Alignment issues? (fetch/decode)
      * Decoding issues?
      * Code size?
      * Adding additional instructions?
      * Memory bandwidth and cache utilization?
      * Energy?
    * Encoding in variable length instructions
  * Structure of Alpha instructions and other uniform decode instructions
    * Different types of instructions
    * Benefit of knowing the instruction type early
      * Speculatively operate on future instructions
  * x86 and other non-uniform decode instructions
    * Tradeoff vs. uniform decode
  * Tradeoffs for different numbers of registers
    * Spilling into memory if the number of registers is small
    * Compiler optimization on which values to keep in registers and which to spill
  * Addressing modes
    * Benefits?
    * Types?
    * Different uses of addressing modes?
  * Various ISA-level tradeoffs
  * Virtual memory
  * Unaligned vs. aligned memory access
    * Cost vs. benefit of unaligned access
  * ISA specification
    * Things you have to obey/specify in the ISA specification
  * Architectural states
    * Microarchitecture implements how arch. state A is transformed to the next arch. state A'
  * Single cycle machines
    * Critical path in the single cycle machine
  * Multi cycle machines
  * Functional units
  * Performance metrics (see the worked example after this list)
    * CPI/IPC
    * CPI of a single cycle microarchitecture
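The performance-metric buzzwords combine into one equation. Here is a short worked example with invented numbers, showing why a single-cycle machine's CPI of 1 is not automatically a win:

<code latex>
% Execution time = instruction count x CPI x clock period:
T = N \cdot \mathrm{CPI} \cdot t_{clk}

% Single-cycle uarch: CPI = 1, but the cycle must fit the slowest
% instruction, say t_clk = 2 ns, for N = 10^9 instructions:
T_{single} = 10^9 \cdot 1 \cdot 2\,\mathrm{ns} = 2\,\mathrm{s}

% Multi-cycle uarch: CPI = 4, but a shorter cycle, t_clk = 0.5 ns:
T_{multi} = 10^9 \cdot 4 \cdot 0.5\,\mathrm{ns} = 2\,\mathrm{s}
</code>

The two designs tie here only because of the invented numbers; the point is that CPI and clock period trade off against each other.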
===== Lecture 5 (1/24 Fri.) =====

  * Instruction processing
    * Fetch
    * Decode
    * Execute
    * Memory fetch
    * Writeback
  * Datapath & control logic in microprocessors
  * Different types of instructions (I-type, R-type, etc.)
  * Control flow instructions
  * Non-control flow instructions
  * Delayed slot/delayed branch
  * Single cycle control logic
  * Lockstep
  * Critical path analysis
    * Critical path of a single cycle processor
  * Combinational logic & sequential logic
  * Control store
  * Tradeoffs of a single cycle uarch
  * Dynamic power/static power
  * Speedup calculation (see the worked example after this list)
    * Parallelism
    * Serial bottleneck
    * Amdahl's bottleneck
  * Design principles
    * Common case design
    * Critical path design
    * Balanced designs
  * Multi cycle design
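Speedup and the serial bottleneck are captured by Amdahl's Law; the numbers below are invented for illustration:

<code latex>
% p = fraction of execution that can be sped up, s = its speedup:
\mathrm{Speedup} = \frac{1}{(1 - p) + p/s}

% Example: p = 0.9, s = 10:
\mathrm{Speedup} = \frac{1}{0.1 + 0.09} \approx 5.26

% Even as s -> infinity, the serial 10% caps the speedup at 10x:
% this is the serial (Amdahl's) bottleneck.
</code>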
===== Lecture 6 (1/27 Mon.) =====

  * Microcoded/microprogrammed machines
    * States
    * Microinstructions
    * Microsequencing
    * Control store - produces the control signals (see the sketch after this list)
    * Microsequencer
    * Control signals
      * What do they have to control?
  * Instruction processing cycle
  * Latch signals
  * State machine
  * State variables
  * Condition codes
  * Steering bits
  * Branch enable logic
  * Difference between gating and loading? (write enable vs. driving the bus)
  * Memory mapped I/O
  * Hardwired logic
    * What control signals come from hardwired logic?
  * Variable latency memory
  * Handling interrupts
  * Difference between interrupts and exceptions
  * Emulator (i.e., uCode allows a minimal datapath to emulate the ISA)
  * Updating machine behavior
  * Horizontal microcode
  * Vertical microcode
  * Primitives
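How a control store "produces the control signals" can be sketched as a lookup table plus a micro-PC. Everything below is a toy: the signal names and the three-row control store are invented, and a real microengine (e.g., the LC-3b's) is far wider, with next-state selection that also depends on the instruction and condition codes.

<code c>
#include <stdint.h>
#include <stdio.h>

/* Each microinstruction = control signal bits + next micro-address. */
typedef struct {
    uint8_t ld_pc, ld_ir, ld_reg;  /* latch (load-enable) signals  */
    uint8_t gate_alu;              /* drive the bus from the ALU   */
    uint8_t next;                  /* microsequencer: next address */
} MicroInst;

static const MicroInst ustore[] = {
    /* 0: FETCH   */ {1, 1, 0, 0, 1},
    /* 1: DECODE  */ {0, 0, 0, 0, 2},
    /* 2: EXECUTE */ {0, 0, 1, 1, 0},   /* then back to FETCH */
};

int main(void) {
    uint8_t upc = 0;  /* micro-PC */
    for (int cycle = 0; cycle < 6; cycle++) {
        const MicroInst *u = &ustore[upc];
        printf("cycle %d: uPC=%u ld_pc=%u ld_ir=%u ld_reg=%u gate_alu=%u\n",
               cycle, upc, u->ld_pc, u->ld_ir, u->ld_reg, u->gate_alu);
        upc = u->next;  /* microsequencing: follow the next-address field */
    }
    return 0;
}
</code>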
===== Lecture 7 (1/29 Wed.) =====

  * Pipelining
  * Limitations of the multi-cycle design
    * Idle resources
  * Throughput of a pipelined design (see the calculation sketch after this list)
    * What dictates the throughput of a pipelined design?
  * Latency of the pipelined design
  * Dependency
  * Overhead of pipelining
    * Latch cost?
  * Data forwarding/bypassing
  * What is the ideal pipeline?
  * External fragmentation
  * Issues in pipeline designs
    * Stalling
    * Dependency (hazard)
      * Flow dependence
      * Output dependence
      * Anti dependence
      * How to handle them?
    * Resource contention
    * Keeping the pipeline full
    * Handling exceptions/interrupts
      * Pipeline flush
    * Speculation
    * Interlocking
    * Multipath execution
    * Fine grain multithreading
    * No-op (bubbles in the pipeline)
    * Valid bits in the instructions
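The throughput and latency bullets reduce to simple arithmetic. The sketch below uses invented numbers (a 20 ns datapath, 0.5 ns latch overhead) to show the ideal-pipeline tradeoff: cycle time shrinks roughly as 1/k, raising throughput, while the latch overhead makes each instruction's end-to-end latency grow.

<code c>
#include <stdio.h>

/* k-stage pipeline model: cycle = T/k + latch overhead.
 * n instructions need (k - 1) fill cycles plus n cycles,
 * i.e., (k + n - 1) cycles total. All numbers are illustrative. */
int main(void) {
    const double T = 20.0, latch = 0.5;   /* ns */
    const int n = 1000;                   /* instructions */
    for (int k = 1; k <= 8; k *= 2) {
        double cycle   = T / k + latch;          /* stage + latch   */
        double latency = k * cycle;              /* per instruction */
        double total   = (k + n - 1) * cycle;    /* fill + stream   */
        printf("k=%d: cycle=%.2f ns, latency=%.2f ns, total=%.0f ns\n",
               k, cycle, latency, total);
    }
    return 0;
}
</code>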
===== Lecture 8 (1/31 Fri.) =====

  * Branch prediction
  * Different types of data dependence (see the sketch after this list)
  * Pipeline stalls
    * Bubbles
    * How to handle stalls
    * Stall conditions
    * Stall signals
  * Dependences
    * Distance between dependences
  * Data forwarding/bypassing
    * Maintaining the correct dataflow
    * Different ways to design the data forwarding path/logic
  * Different techniques to handle interlocking
    * SW based
    * HW based
  * Profiling
    * Static profiling
    * Help from the software (compiler)
  * Superblock optimization
    * Analyzing basic blocks
  * How to deal with branches?
    * Branch prediction
    * Delayed branching (branch delay slot)
    * Forward control flow/backward control flow
    * Branch prediction accuracy
  * Profile guided code positioning
    * Based on the profile info, position the code accordingly
    * Try to make the next sequential instruction be the next inst. to be executed
  * Trace cache
  * Predicate combining (combine predicates for a branch instruction)
  * Predicated execution (control dependence becomes data dependence)
  * Definition of basic blocks
  * Control flow graph
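The three data dependence types are easiest to see in a concrete sequence. The register names below are arbitrary; the C assignments just stand in for instructions writing and reading registers.

<code c>
#include <stdio.h>

/* i1: r1 = r2 + r3   writes r1
 * i2: r4 = r1 + 1    reads r1  -> flow (true/RAW) dependence on i1
 * i3: r1 = r5 + r6   writes r1 -> output (WAW) dependence on i1,
 *                                 anti (WAR) dependence on i2
 * Flow dependences must be honored (forwarding shortens the wait);
 * anti/output dependences are name conflicts that register
 * renaming can eliminate. */
int main(void) {
    int r2 = 1, r3 = 2, r5 = 3, r6 = 4;
    int r1 = r2 + r3;   /* i1 */
    int r4 = r1 + 1;    /* i2: must see i1's value of r1 */
    r1 = r5 + r6;       /* i3: must not clobber r1 before i2 reads it */
    printf("r1=%d r4=%d\n", r1, r4);
    return 0;
}
</code>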
===== Lecture 9 (2/3 Mon.) =====

  * Delayed branching
    * Benefit? What does it eliminate?
    * Downside?
    * Delayed branching in SPARC (with squashing)
    * Backward compatibility with the delay slot
    * What should be filled into the delay slot?
    * How to ensure correctness
  * Fine-grained multithreading
    * Fetch from different threads
    * What are the issues? (what if the program doesn't have many threads?)
    * CDC 6000
    * Denelcor HEP
    * No dependency checking
    * Inst. from different threads can fill in the bubbles
    * Cost?
  * Simultaneous multithreading
  * Branch prediction
    * Guess what to fetch next
    * Misprediction penalty
    * Need to guess the direction and the target
  * How to perform the performance analysis?
    * Given the branch prediction accuracy and the penalty, how to compute the cost of a branch misprediction
    * Given the number of instructions, the percentage of branches, the prediction accuracy, and the penalty, how to compute the cost coming from branch mispredictions
    * How many extra instructions are being fetched?
    * What is the performance degradation?
  * How to reduce the miss penalty?
    * Predicting the next address (non PC+4 address)
  * Branch target buffer (BTB)
    * Predicting the target address of the branch
  * Global branch history - for directions
  * Can use the compiler to profile and get more info
    * The input set dictates the accuracy
    * Adds time to compilation
  * Heuristics that are common and don't require profiling
    * Might be inaccurate
    * Does not require profiling
  * Programmer can tell the hardware (via pragmas (hints))
    * For example, x86 has the hint bit
  * Dynamic branch prediction
    * Last time predictor
    * Two-bit counter based prediction (see the sketch after this list)
      * One more bit for hysteresis
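The two-bit counter and its hysteresis bit fit in a few lines. This illustrative sketch keeps a single counter; a real predictor indexes a table of such counters by (part of) the branch PC.

<code c>
#include <stdio.h>

/* 2-bit saturating counter: states 0-1 predict not-taken, 2-3
 * predict taken. The extra bit gives hysteresis: one atypical
 * outcome (e.g., a loop exit) moves the counter one step but does
 * not flip the prediction, unlike a last-time (1-bit) predictor. */
static int ctr = 3;  /* start at "strongly taken" */

static int  predict(void)     { return ctr >= 2; }  /* 1 = taken */
static void update(int taken) {
    if (taken  && ctr < 3) ctr++;
    if (!taken && ctr > 0) ctr--;
}

int main(void) {
    /* A loop branch: taken, taken, one exit, then taken again. */
    int outcomes[] = {1, 1, 0, 1, 1};
    for (int i = 0; i < 5; i++) {
        printf("predicted %d, actual %d\n", predict(), outcomes[i]);
        update(outcomes[i]);  /* mispredicts only the single exit */
    }
    return 0;
}
</code>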
===== Lecture 10 (2/5 Wed.) =====

  * Branch prediction accuracy
    * Why is it so important?
    * Difference between 99% accuracy and 98% accuracy
    * Cost of a misprediction when the pipeline is very deep
  * Value prediction
  * Global branch correlation
    * Some branches are correlated
  * Local branch correlation
    * Some branches can depend on the results of past branches
  * Pattern history table
    * Records global taken/not taken results
    * Cost vs. accuracy (What do you record? The PC? Just taken/not taken info?)
  * One-level branch predictor
    * What information is used?
  * Two-level branch prediction
    * What entries do you keep in the global history?
    * What entries do you keep in the local history?
    * How many tables?
    * Cost of training a table
    * What are the purposes of each table?
    * Potential problems of a two-level history
  * GShare predictor (see the sketch after this list)
    * The global history is hashed (XORed) with the PC
    * Stores both global history and PC information in one combined index
    * How do you use the information? Why is the XOR result still usable?
  * Slides (pages 16-18) give a good overview of one- and two-level predictors
  * Warmup cost of the branch predictor
    * Hybrid solution? The fast-warmup predictor is used first, then switch to the slower one
    * Tournament predictor (Alpha 21264)
  * Other types of branch predictors
    * Using machine learning?
    * Geometric history length
      * Look at branches far behind (but using a geometric step)
  * Predicated execution - eliminates branches
    * What are the tradeoffs?
    * What if the block is big? (can lead to executing a lot of useless work)
    * Allows easier code optimization
      * From the compiler's PoV, predicated execution combines multiple basic blocks into one bigger basic block
    * Reduces control dependences
    * Needs ISA support
  * Wish branches
    * Compiler generates both predicated and non-predicated code
    * HW decides which one to use
      * Use branch prediction on easy-to-predict code
      * Use predicated execution on hard-to-predict code
    * Compiler can be more aggressive in optimizing the code
    * What are the tradeoffs? (slide 47)
  * Multi-path execution
    * Execute both paths
    * Can lead to wasted work
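A minimal gshare sketch in C (the table size, history length, and example branch are all invented): the global history register is XORed with low PC bits to index a pattern history table of two-bit counters, so one branch can train different counters under different global histories.

<code c>
#include <stdint.h>
#include <stdio.h>

#define PHT_BITS 10
#define PHT_SIZE (1 << PHT_BITS)

static uint8_t  pht[PHT_SIZE];  /* 2-bit counters, initially 0 (not-taken) */
static uint32_t ghr;            /* global history register */

/* gshare index: fold branch identity (PC) and recent global
 * outcomes (GHR) into one PHT index via XOR. */
static uint32_t pht_index(uint32_t pc) {
    return (pc ^ ghr) & (PHT_SIZE - 1);
}

static int predict(uint32_t pc) { return pht[pht_index(pc)] >= 2; }

static void update(uint32_t pc, int taken) {
    uint8_t *c = &pht[pht_index(pc)];  /* index before shifting GHR */
    if (taken  && *c < 3) (*c)++;
    if (!taken && *c > 0) (*c)--;
    ghr = (ghr << 1) | (uint32_t)(taken & 1);  /* record the outcome */
}

int main(void) {
    uint32_t pc = 0x400123;  /* hypothetical branch address */
    for (int i = 0; i < 8; i++) {
        int taken = i & 1;   /* alternating pattern: N,T,N,T,... */
        printf("pred=%d actual=%d\n", predict(pc), taken);
        update(pc, taken);
    }
    return 0;
}
</code>

After warmup, the alternating pattern maps the taken and not-taken cases to different PHT entries (the GHR differs), which a single two-bit counter indexed by the PC alone could not learn.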