====== Buzzwords ======

Buzzwords are terms mentioned during lecture that are particularly important to understand thoroughly. This page tracks the buzzwords for each lecture and can be used as a reference for finding gaps in your understanding of the course material.

===== Lecture 1 (1/12 Mon.) =====

  * Level of transformation
    * Algorithm
    * Compiler
  * Cross abstraction layers
  * Tradeoffs
  * Caches
  * DRAM/memory controller
  * DRAM banks
  * Row buffer hit/miss
  * Row buffer locality
  * Unfairness
  * Memory performance hog
  * Shared DRAM memory system
  * Streaming access vs. random access
  * Memory scheduling policies
  * Scheduling priority
  * Retention time of DRAM
  * Process variation
  * Retention time profile
  * Power consumption
  * Bloom filter (see the minimal sketch at the end of this list)
  * Hamming code
  * Hamming distance
  * DRAM row hammer
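A minimal Bloom filter sketch in C, to make the buzzword concrete: insertions set bits, and queries can return false positives but never false negatives. The table size and the two multiplicative hash functions below are illustrative assumptions, not values given in lecture.

<code c>
#include <stdint.h>

#define BLOOM_BITS 1024          /* illustrative size */

static uint8_t bloom[BLOOM_BITS / 8];

/* Simple multiplicative hash; the constants are arbitrary seeds. */
static uint32_t hash(uint32_t key, uint32_t seed) {
    key ^= seed;
    key *= 2654435761u;
    return key % BLOOM_BITS;
}

void bloom_insert(uint32_t key) {
    uint32_t h1 = hash(key, 17), h2 = hash(key, 31);
    bloom[h1 / 8] |= 1u << (h1 % 8);
    bloom[h2 / 8] |= 1u << (h2 % 8);
}

/* 0 = definitely not inserted; 1 = maybe inserted (false positive possible). */
int bloom_query(uint32_t key) {
    uint32_t h1 = hash(key, 17), h2 = hash(key, 31);
    return ((bloom[h1 / 8] >> (h1 % 8)) & 1) &&
           ((bloom[h2 / 8] >> (h2 % 8)) & 1);
}
</code>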

===== Lecture 2 (1/14 Wed.) =====

  * Moore's Law
  * Algorithm --> step-by-step procedure to solve a problem
  * in-order execution
  * out-of-order execution
  * Scaling issue
    * Transistors are getting smaller
  * Key components of a computer
  * Design points
    * Design processors to meet the design points
  * Software stack
  * Design decisions
  * Datacenters
  * Reliability problems that cause errors
  * Analogies from Kuhn's "The Structure of Scientific Revolutions" (Recommended book)
  * Computation
  * Communication
  * Storage
    * DRAM
    * NVRAM (Non-volatile memory): PCM, STT-MRAM
    * Storage (Flash/Harddrive)
  * Von Neumann Model (Control flow model)
    * Stored program computer
    * Properties of the Von Neumann Model: stored program, sequential instruction processing
    * Unified memory
    * When is an instruction interpreted as an instruction (as opposed to a datum)?
    * Program counter
    * Examples: x86, ARM, Alpha, IBM Power series, SPARC, MIPS
  * Data flow model
    * Data flow machine
  * Tradeoffs between control-driven and data-driven
    * Which are easier to program?
    * Which are easier to compile?
    * Which are more parallel (and does that mean faster)?
    * Which machines are more complex to design?
    * In control flow, when a program stops, there is a pointer to the current state (precise state).
  * ISA vs. Microarchitecture
    * Semantics in the ISA
    * uArch should obey the ISA
    * Changing the ISA is costly and can affect compatibility.
  * Instruction pointers
  * uArch techniques: common and powerful techniques break the Von Neumann model if done at the ISA level
    * Out-of-order execution
    * etc.
  * Design techniques
    * Adder implementation (bit serial, ripple carry, carry lookahead); see the adder sketch at the end of this list
    * Connection Machine (an example of a machine that uses bit-serial adders to trade off latency for more parallelism)
  * Microprocessor: ISA + uArch + circuits
    * What is part of the ISA? Instructions, memory, etc.
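To make the adder tradeoff concrete, here is a C simulation of a ripple-carry adder (a sketch, not course-provided code): each bit's carry-out feeds the next bit's carry-in, so delay grows linearly with width, which is exactly the serial chain that carry-lookahead designs attack.

<code c>
#include <stdint.h>

/* Ripple-carry addition, one full adder per bit. The loop models the
   hardware's serial carry chain: bit i cannot finish before bit i-1. */
uint32_t ripple_add(uint32_t a, uint32_t b) {
    uint32_t sum = 0, carry = 0;
    for (int i = 0; i < 32; i++) {
        uint32_t ai = (a >> i) & 1, bi = (b >> i) & 1;
        sum |= (ai ^ bi ^ carry) << i;                    /* full-adder sum */
        carry = (ai & bi) | (ai & carry) | (bi & carry);  /* carry-out */
    }
    return sum;
}
</code>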

===== Lecture 3 (1/16 Fri.) =====

  * Microarchitecture
  * Three major tradeoffs of computer architecture
  * Macro-architecture
  * LC-3b ISA
    * Unused instructions
    * Bit steering
  * Instruction processing style
  * 0,1,2,3 address machines
    * Stack machine
    * Accumulator machine
    * 2-operand machine
    * 3-operand machine
    * Tradeoffs between 0,1,2,3 address machines
  * Postfix notation (see the stack-machine sketch at the end of this list)
  * Instructions/Opcode/Operand specifiers (i.e. addressing modes)
  * Simple vs. complex data types (and their tradeoffs)
  * Semantic gap and level
  * Translation layer
  * Addressability
  * Byte/bit addressable machines
  * Virtual memory
  * Big/little endian
  * Benefits of having registers (data locality)
  * Programmer visible (architectural) state
    * Programmers can access this directly
    * What are the benefits?
  * Microarchitectural state
    * Programmers cannot access this directly
  * Evolution of registers (from accumulators to registers)
  * Different types of instructions
    * Control instructions
    * Data instructions
    * Operation instructions
  * Addressing modes
    * Tradeoffs (complexity, flexibility, etc.)
  * Orthogonal ISA
    * Addressing modes that are orthogonal to instruction types
  * I/O devices
  * Vectored vs. non-vectored interrupts
  * Complex vs. simple instructions
    * Tradeoffs
  * RISC vs. CISC
    * Tradeoff
    * Backward compatibility
    * Performance
    * Optimization opportunity
    * Translation
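A toy illustration of 0-address (stack machine) execution and postfix notation, assuming nothing beyond the textbook example (2 + 3) * 4, whose postfix form is 2 3 + 4 *: operators carry no operand specifiers because they implicitly pop the stack.

<code c>
#include <stdio.h>

/* Executes the postfix program "2 3 + 4 *" on a tiny operand stack.
   Each comment shows the 0-address instruction being modeled. */
int main(void) {
    int stack[16], sp = 0;
    stack[sp++] = 2;                   /* PUSH 2 */
    stack[sp++] = 3;                   /* PUSH 3 */
    sp--; stack[sp - 1] += stack[sp];  /* ADD: pop two, push sum     */
    stack[sp++] = 4;                   /* PUSH 4 */
    sp--; stack[sp - 1] *= stack[sp];  /* MUL: pop two, push product */
    printf("%d\n", stack[sp - 1]);     /* prints 20 */
    return 0;
}
</code>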

===== Lecture 4 (1/21 Wed.) =====

  * Fixed vs. variable length instructions
  * Huffman encoding
  * Uniform vs. non-uniform decode
  * Registers
    * Tradeoffs in the number of registers
  * Alignment
    * How does MIPS load words across the alignment boundary?

===== Lecture 5 (1/26 Mon.) =====

  * Tradeoffs in ISA: instruction length
    * Uniform vs. non-uniform
  * Design point/Use cases
    * What dictates the design point?
  * Architectural states
  * uArch
    * How to implement the ISA in the uArch
    * Different stages in the uArch
  * Clock cycles
  * Multi-cycle machine
  * Datapath and control logic
    * Control signals
  * Execution time of instructions/programs
    * Metrics and what they mean (see the worked example at the end of this list)
  * Instruction processing
    * Fetch
    * Memory fetch
    * Writeback
  * Encoding and semantics
  * Different types of instructions (I-type, R-type, etc.)
  * Control flow instructions
  * Critical path analysis
    * Critical path of a single cycle processor
  * What is in the control signals?
  * Combinational logic & Sequential logic
  * Control store
  * Tradeoffs of a single cycle uarch
  * Design principles
    * Common case design
    * Critical path design
    * Balanced designs
  * Dynamic power/Static power
    * Increases in power due to frequency
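A worked example of the basic performance metrics, using made-up numbers rather than course data: execution time = instruction count x CPI x clock period, and IPC is simply 1/CPI.

<code c>
#include <stdio.h>

int main(void) {
    double insts   = 1e9;   /* dynamic instruction count (assumed) */
    double cpi     = 1.5;   /* average cycles per instruction (assumed) */
    double clk_ghz = 2.0;   /* clock frequency in GHz (assumed) */

    double seconds = insts * cpi / (clk_ghz * 1e9);
    printf("time = %.3f s, IPC = %.3f\n", seconds, 1.0 / cpi);  /* 0.750 s, 0.667 */
    return 0;
}
</code>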

===== Lecture 6 (1/28 Wed.) =====

  * Design principles
    * Common case design
    * Critical path design
    * Balanced designs
  * Multi cycle design
  * Microcoded/Microprogrammed machines
    * States
    * Translation from one state to another
    * Microinstructions
    * Microsequencing
  * Vertical microcode
  * Primitives

===== Lecture 7 (1/30 Fri.) =====

  * Emulator (i.e. uCode allows a minimal datapath to emulate the ISA)
  * Updating machine behavior
  * Horizontal microcode (see the control-store sketch at the end of this list)
  * Vertical microcode
  * Primitives
  * Nanocode and millicode
    * What are the differences between nano/milli/microcode?
  * Microprogrammed vs. hardwired control
  * Pipelining
    * Limitations of the multi-cycle design
  * Pipeline flush
  * Speculation
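A sketch of what a control store might look like in C, purely illustrative (the field names and the three-entry routine are invented): each microinstruction is a control word whose fields drive the datapath for one cycle, and a sequencer field selects the next microinstruction.

<code c>
#include <stdint.h>

/* One "horizontal" microinstruction: roughly one field per control signal,
   so little decoding is needed. A vertical store would instead pack short
   encoded words that are decoded every cycle (fewer bits, more logic). */
typedef struct {
    uint8_t alu_op;     /* operation the ALU performs this cycle */
    uint8_t reg_write;  /* latch the ALU result into a register? */
    uint8_t mem_read;   /* drive a memory read this cycle? */
    uint8_t next;       /* index of the next microinstruction */
} uinstr_t;

/* Invented 3-step routine: fetch, decode, then an ADD that loops to fetch. */
static const uinstr_t control_store[] = {
    { .alu_op = 0, .reg_write = 0, .mem_read = 1, .next = 1 },  /* fetch  */
    { .alu_op = 0, .reg_write = 0, .mem_read = 0, .next = 2 },  /* decode */
    { .alu_op = 1, .reg_write = 1, .mem_read = 0, .next = 0 },  /* add    */
};
</code>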

===== Lecture 8 (2/2 Mon.) =====

  * Interlocking
  * Multipath execution
  * No-op (bubbles in the pipeline)
  * Valid bits in the instructions
  * Branch prediction
  * Different types of data dependence
  * Based on the profile info, position the code accordingly
    * Try to make the next sequential instruction be the next instruction to be executed
  * Predicate combining (combine predicates for a branch instruction)
  * Predicated execution (control dependence becomes data dependence)

===== Lecture 9 (2/4 Wed.) =====

  * Definition of basic blocks
  * Control flow graph
  * Delayed branching
    * Benefit?
    * Might be inaccurate
    * Does not require profiling
  * Static branch prediction
    * Programmer provides pragmas, hinting the likelihood of a taken/not-taken branch
    * For example, x86 has the hint bit
  * Dynamic branch prediction
    * One more bit for hysteresis (see the two-bit counter sketch at the end of this list)
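A minimal two-bit saturating counter in C, showing why the extra bit gives hysteresis: a single mispredict moves a strongly-taken branch only to weakly-taken, so the prediction does not flip. This is a generic sketch of the standard scheme, not code from the course.

<code c>
typedef enum { STRONG_NT, WEAK_NT, WEAK_T, STRONG_T } ctr_t;

int predict_taken(ctr_t c) { return c >= WEAK_T; }

/* Saturate at both ends; one wrong outcome never flips a strong state. */
ctr_t update(ctr_t c, int taken) {
    if (taken) return c == STRONG_T  ? STRONG_T  : c + 1;
    else       return c == STRONG_NT ? STRONG_NT : c - 1;
}
</code>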
- | + | ===== Lecture 10 (2/6 Fri.) ===== | |
- | ===== Lecture 10 (2/5 Wed.) ===== | + | |
* Branch prediction accuracy | * Branch prediction accuracy | ||
* Why are they very important? | * Why are they very important? | ||
* Differences between 99% accuracy and 98% accuracy | * Differences between 99% accuracy and 98% accuracy | ||
* Cost of a misprediction when the pipeline is veryd eep | * Cost of a misprediction when the pipeline is veryd eep | ||
  * Global branch correlation
    * Some branches are correlated
  * What information is used?
  * Two-level branch prediction
    * What entries do you keep in the global history?
    * What entries do you keep in the local history?
    * How many tables?
  * Store both the GHR and the PC in one combined value (see the gshare-style sketch at the end of this list)
    * How do you use the information? Why is the XOR result still usable?
  * Warmup cost of the branch predictor
    * Hybrid solution? The fast-warmup predictor is used first, then switch to the slower one.
  * Tournament predictor (Alpha 21264)
  * Predicated execution - eliminate branches
    * What are the tradeoffs?
    * What if the block is big (can lead to executing a lot of useless work)?
    * Allows easier code optimization
    * From the compiler PoV, predicated execution combines multiple basic blocks into one bigger basic block
  * Execute both paths
    * Can lead to wasted work
  * VLIW
  * SuperScalar
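A gshare-style sketch in C of how the GHR and the PC can be combined into one index (the table size and shift amounts are assumptions): the XOR hashes the pair, and the result is still usable because distinct (PC, history) combinations mostly map to distinct counters.

<code c>
#include <stdint.h>

#define IDX_BITS 12
static uint8_t pht[1 << IDX_BITS];  /* 2-bit counters, one byte each */
static uint32_t ghr;                /* global history register */

static uint32_t gshare_index(uint32_t pc) {
    return ((pc >> 2) ^ ghr) & ((1u << IDX_BITS) - 1);  /* GHR xor PC bits */
}

int gshare_predict(uint32_t pc) { return pht[gshare_index(pc)] >= 2; }

void gshare_update(uint32_t pc, int taken) {
    uint32_t i = gshare_index(pc);
    if (taken  && pht[i] < 3) pht[i]++;         /* 2-bit saturating counter */
    if (!taken && pht[i] > 0) pht[i]--;
    ghr = (ghr << 1) | (uint32_t)(taken != 0);  /* shift outcome into history */
}
</code>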

===== Lecture 11 (2/11 Wed.) =====

  * Geometric GHR length for branch prediction
  * Perceptron branch predictor
  * Multi-cycle executions (different functional units take different numbers of cycles)
    * Instructions can retire out-of-order
  * Exceptions and Interrupts
    * When are they handled?
    * Why should some interrupts be handled right away?
  * Precise exceptions
    * Arch. state should be consistent before handling the exception/interrupt
    * Easier to restart the processes
  * How to ensure precise exceptions?
    * Tradeoffs between each method
  * Reorder buffer (see the sketch at the end of this list)
    * Reorder results before they are visible to the arch. state
    * Where to get the value from (forwarding path? reorder buffer?)
    * Extra logic to check where the youngest instruction/value is
  * Content addressable search (CAM)
    * A lot of comparators
  * Different ways to simplify the reorder buffer
  * Future file (commonly used, along with the reorder buffer)
    * Keep two sets of register files
      * An updated value (speculative), called the future file
      * A backup value (to restore the state quickly)
    * Doubles the cost of the regfile, but reduces the area as you don't have to use a content addressable memory (compared to the ROB alone)
  * Checkpointing
    * Advantages?
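A reorder-buffer sketch in C (the structure and field names are illustrative, and full/empty checks are omitted): results are buffered out of order but committed to architectural state strictly from the head, which is what makes exceptions precise.

<code c>
#include <stdint.h>

#define ROB_SIZE 64

typedef struct {
    int      valid;     /* entry allocated */
    int      done;      /* result produced but not yet committed */
    int      dest_reg;  /* architectural destination register */
    uint64_t value;     /* speculative result */
} rob_entry_t;

static rob_entry_t rob[ROB_SIZE];
static int head, tail;  /* commit from head, allocate at tail */

/* Allocate in fetch order; the returned id is the tag broadcast with results. */
int rob_alloc(int dest_reg) {
    int id = tail;
    rob[id] = (rob_entry_t){ .valid = 1, .dest_reg = dest_reg };
    tail = (tail + 1) % ROB_SIZE;
    return id;
}

/* Commit in order: stop at the first incomplete instruction, so younger
   speculative values never become architecturally visible early. */
void rob_commit(uint64_t arch_regs[]) {
    while (rob[head].valid && rob[head].done) {
        arch_regs[rob[head].dest_reg] = rob[head].value;
        rob[head].valid = 0;
        head = (head + 1) % ROB_SIZE;
    }
}
</code>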

===== Lecture 12 (2/13 Fri.) =====

+ | * Renaming | ||
+ | * Register renaming table | ||
+ | * Predictor (branch predictor, cache line predictor ...) | ||
+ | * Power budget (and its importance) | ||
+ | * Architectural state, precise state | ||
+ | * Memory dependence is known dynamically | ||
+ | * Register state is not shared across threads/processors | ||
+ | * Memory state is shared across threads/processors | ||
+ | * How to maintain speculative memory states | ||
+ | * Write buffers (helps simplify the process of checking the reorder buffer) | ||
+ | * Overall OoO mechanism | ||
+ | * What are other ways of eliminating dispatch stalls | ||
+ | * Dispatch when the sources are ready | ||
+ | * Retired instructions make the source available | ||
+ | * Register renaming | ||
+ | * Reservation station | ||
+ | * What goes into the reservation station | ||
+ | * Tags required in the reservation station | ||
+ | * Tomasulo's algorithm | ||
+ | * Without precise exception, OoO is hard to debug | ||
+ | * Arch. register ID | ||
+ | * Examples in the slides | ||
+ | * Slides 28 --> register renaming | ||
+ | * Slides 30-35 --> Exercise (also on the board) | ||
+ | * This will be usefull for the midterm | ||
+ | * Register aliasing table | ||
+ | * Broadcasting tags | ||
+ | * Using dataflow |