=====Buzzwords=====

=====Lecture 1=====
  * Architecture of Parallel Computers
  * Fundamentals and Tradeoffs
  * Static and Dynamic Scheduling
  * Parallel Task Assignment
  * Static/Dynamic
  * Task Queues
  * Task Stealing

=====Lecture 2=====
  * Parallel Computer
  * SISD, SIMD, MISD, MIMD
  * Performance
  * Power consumption
  * Cost efficiency
  * Scalability
  * Complexity
  * Dependability
  * Instruction Level Parallelism
  * Data Parallelism
  * Task Level Parallelism
  * Parallel programming
  * Thread level speculation
  * Loosely/Tightly coupled multiprocessors
  * Shared memory synchronization
  * Cache consistency
  * Ordering of memory operations
  * Hardware-based Multithreading
  * Coarse grained
  * Fine grained
  * Simultaneous
  * Amdahl's Law
  * Serial bottleneck
  * Synchronization overhead
  * Load imbalance overhead
  * Resource sharing overhead
  * Superlinear Speedup
  * Unfair comparisons
  * Memory/cache effect
  * Utilization, Redundancy, Efficiency
  * Parallel Programming
  * Parallel and Serial Bottlenecks

=====Lecture 3=====
  * Programming Models vs. Architectures
  * Shared memory programming model
  * Message passing programming model
  * Shared memory hardware
  * Message passing hardware
  * Communication abstraction
  * Generic Parallel Machine
  * Data Flow Graph
  * Synchronization
  * Application Binary Interface (ABI)
  * Data parallel programming model
  * Data parallel hardware
  * Connection Machine
  * Data flow programming model
  * Data flow hardware
  * Scalability
  * Interconnection Schemes
  * Uniform Memory/Cache Access (UMA/UCA)
  * Memory latency
  * Memory bandwidth
  * Symmetric multiprocessing (SMP)
  * Data placement
  * Non-Uniform Memory/Cache Access (NUMA/NUCA)
  * Local and remote memories
  * Critical path of memory access

=====Lecture 4=====
  * Multi-Core Processors
  * Technology scaling
  * Transistors and die area
  * Large Superscalar
  * Single-thread performance
  * Instruction issue queue
  * Multi-ported register file
  * Loop-level parallelism
  * Multiprogramming
  * Bigger caches
  * Multithreading
  * Thread-level parallelism
  * Resource sharing
  * Integrating platform components
  * Clustered superscalar processor
  * Inter-cluster bypass
  * Traditional symmetric multiprocessors

=====Lecture 5=====
  * Chip Multiprocessor (CMP)
  * Workload Characteristics
  * Instruction Level Parallelism (ILP)
  * Piranha CMP
  * Processing Node
  * Coherence Protocol Engine
  * I/O Node
  * Sun Niagara (UltraSPARC T1)
  * Niagara Core
  * Sun Niagara II (UltraSPARC T2)
  * Chip Multithreading (CMT)
  * Sun Rock
  * Runahead Execution
  * Memory Level Parallelism (MLP)
  * IBM POWER4
  * IBM POWER5
  * IBM POWER6
  * IBM POWER7
  * Large vs. Small Cores
  * Tile-Large vs. Tile-Small
  * Asymmetric Chip Multiprocessor (ACMP)
  * Serial Bottlenecks
  * Amdahl's Law
  * Asymmetric vs. Symmetric Cores
  * Frequency Boosting
  * EPI Throttling
  * Dynamic voltage frequency scaling (DVFS)

=====Lecture 6=====
  * EPI Throttling
  * Asymmetric Chip Multiprocessor (ACMP)
  * Energy Efficiency
  * Programmer effort
  * Shared Resource Management
  * Serialized Code Sections
  * Accelerated Critical Sections (ACS)
  * Bottleneck Identification and Scheduling (BIS)

=====Lecture 7=====
  * Main Memory
  * Memory Capacity
  * Memory Latency
  * Memory Bandwidth
  * Memory Energy/Power
  * Technology Scaling
  * DRAM Scaling
  * Charge Memory
  * Resistive Memory
  * Non-volatile Memory
  * Phase Change Memory (PCM)
  * Hybrid Memory
  * Write Filtering
  * Row-Locality Aware Data Placement
  * Tags in Memory
  * Dynamic Data Transfer Granularity
  * Memory Security

=====Lecture 8=====
  * Barriers
  * Thread Waiting
  * Bottleneck Acceleration
  * False Serialization
  * Starvation
  * Preemptive Acceleration
  * Staged Execution Model
  * Segment Spawning
  * Inter-segment data
  * Generator instruction
  * Data Marshaling
  * Pipeline Parallelism
  * Coverage, Accuracy, Timeliness

=====Lecture 9=====
  * Memory Scheduling
  * Fairness-Throughput
  * Thread cluster
  * Memory intensity
  * CPU-GPU Systems
  * Heterogeneous Memory Systems
  * Thread
  * Multitasking
  * Thread context
  * Hardware Multithreading
  * Latency tolerance
  * Fine-grained Multithreading
  * Pipeline utilization
  * Coarse-grained Multithreading
  * Stall events
  * Thread Switching Urgency
  * Fairness

=====Lecture 10=====
  * Fine-grained Multithreading
  * Coarse-grained Multithreading
  * Fairness and throughput
  * Thread Switching Urgency
  * Simultaneous Multithreading
  * Functional Unit Utilization
  * Superscalar Out-of-Order Pipeline
  * SMT Pipeline
  * SMT Scalability
  * SMT Fetch Policy
  * Long Latency Loads
  * Memory-Level Parallelism (MLP)
  * Runahead Threads
  * Thread Priority Support
  * Thread Throttling

=====Lecture 11=====
  * Utility cache partitioning
  * Cache capacity
  * Cache data compression
  * Frequent value compression
  * Frequent pattern compression
  * Low dynamic range
  * Base+Delta encoding
  * Main memory compression
  * IBM MXT
  * Linearly compressed pages

=====Lecture 13=====
  * Fault and Error
  * Fault Detection
  * Fault Tolerance
  * Transient Fault
  * Permanent Fault
  * Space redundancy
  * Time redundancy
  * Lockstepping
  * Simultaneous Redundant Threading (SRT)
  * Sphere of Replication
  * Input Replication
  * Output Comparison
  * Branch Outcome Queue
  * Line Prediction Queue
  * Chip Level Redundant Threading
  * Exception Handling
  * Helper Threading for Prefetching
  * Thread-Based Pre-Execution

=====Lecture 15=====
  * Slipstreaming
  * Instruction Removal
  * Dual Core Execution
  * Thread Level Speculation
  * Conflict Detection
  * Speculative Parallelization
  * Inter-Thread Communication
  * Data Dependences and Versioning
  * Speculative Memory State
  * Multiscalar Processor

=====Lecture 16=====
  * Multiscalar Processor
  * Multiscalar Tasks
  * Register Forwarding
  * Task Sequencing
  * Inter-Task Dependences
  * Address Resolution Buffer
  * Memory Dependence Prediction
  * Store-Load Dependencies
  * Memory Disambiguation
  * Speculative Lock Elision
  * Atomicity
  * Speculative Parallelization
  * Accelerating Critical Section
  * Transactional Lock Removal

=====Lecture 17=====
  * Interconnection Network
  * Network Topology
  * Bus
  * Crossbar
  * Ring
  * Mesh
  * Torus
  * Tree
  * Hypercube
  * Multistage Logarithmic Network
  * Circuit vs. Packet Switching
  * Flow Control
  * Head of Line Blocking
  * Virtual Channel Flow Control
  * Communicating Buffer Availability

=====Lecture 18=====
  * Routing
  * Deadlock
  * Router Design
  * Router Pipeline Optimizations
  * Interconnection Network Performance
  * Packet Scheduling
  * Bufferless Deflection Routing
  * Livelock
  * Packet Reassembly
  * Golden Packet
  * Minimally-Buffered Deflection Routing
  * Side Buffer
  * Heterogeneous Adaptive Throttling
  * Application-Aware Source Throttling
  * Dynamic Throttling Rate Adjustment

=====Lecture 20=====
  * Locks vs. Transactions
  * Transactional Memory
  * Logging/buffering
  * Conflict detection
  * Abort/rollback
  * Commit
  * Routing
  * Deterministic
  * Oblivious
  * Adaptive
  * Deadlock

=====Lecture 21=====
  * Packet Scheduling
  * Stall Time Criticality
  * Memory Level Parallelism
  * Shortest Job First Principle
  * Application Aware
  * Packet Ranking and Batching
  * Slack of Packets
  * Packet Prioritizing using Slack
  * Starvation Avoidance
  * 2-D Mesh, Concentration, Replication
  * Flattened Butterfly
  * Multidrop Express Channels (MECS)
  * Kilo-NoC
  * Network-on-Chip (NoC) Quality of Service (QoS)
  * Topology-Aware QoS

=====Lecture 22=====
  * Data Flow
  * Data Flow Nodes
  * Data Flow Graphs
  * Control Flow vs. Data Flow
  * Static Data Flow
  * Reentrant code (Function calls, Loops)
  * Dynamic Data Flow
  * Frame Pointer
  * Tagging
  * Data Structures
  * I-Structure
  * MIT Tagged Token Data Flow Architecture
  * Manchester Data Flow Machine
  * Combining Data Flow and Control Flow

=====Lecture 23=====
  * Combining Data Flow and Control Flow
  * Macro Dataflow
  * Restricted Data Flow
  * Systolic Architecture
  * Systolic Computation
  * Pipeline Parallelism

=====Lecture 24=====
  * Resource Sharing
  * Shared Resource Management and QoS
  * Resource Sharing vs. Partitioning
  * Multi-core Caching
  * Shared Cache Management
  * Sharing in Main Memory
  * Memory Controller
  * Inter-Thread Interference
  * QoS-Aware Memory Scheduling
  * Stall-Time Fairness
  * Bank Parallelism-Awareness
  * Request Batching
  * Shortest Stall-Time First Ranking
  * Memory Episode Lengths
  * Least Attained Service

=====Lecture 25=====
  * QoS-Aware Memory Request Scheduling
  * Smart/Dumb Resources
  * Throughput vs. Fairness
  * Thread Cluster Memory Scheduling
  * Clustering Threads
  * CPU-GPU Systems
  * Staged Memory Scheduling
  * Parallel Application Memory QoS

=====Lecture 26=====
  * QoS-Aware Memory Systems
  * Smart vs. Dumb Resources
  * Memory Channel Partitioning
  * Application-Awareness
  * Multiple Channels
  * Memory Intensity
  * Row Buffer Locality
  * Preferred Channel
  * Integrated Memory Partitioning and Scheduling
  * Fairness via Source Throttling
  * Dynamic Request Throttling
  * Estimating System Unfairness
  * Inter-Core Interference
  * Row Buffer Interference
  * Memory Interference-induced Slowdown Estimation
  * Shared Memory Performance Predictability
  * Shared Resource Interference
  * Memory Phase Fraction
  * Alone Request Service Rate
  * Shared Request Service Rate
  * "Soft" Slowdown Guarantees

=====Lecture 27=====
  * CPU-GPU Memory Scheduling
  * Batch Formation
  * Batch Scheduler
  * DRAM Command Scheduler
  * Prefetcher Accuracy
  * Feedback-Directed Prefetching
  * Hierarchical Prefetcher Aggressiveness Control
  * Inter-Core Cache Pollution
  * Global Control
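
Several recurring entries above (Amdahl's Law, serial bottleneck, superlinear speedup in Lectures 2 and 5) hinge on one formula. A minimal sketch for reference — the helper below is not from the course materials, just an illustration of the standard formulation Speedup(n) = 1 / ((1 - p) + p/n), where p is the parallelizable fraction and n the number of processors:

```python
# Hypothetical helper illustrating Amdahl's Law: the serial fraction
# (1 - p) caps speedup no matter how many processors are added.

def amdahl_speedup(parallel_fraction: float, n_processors: int) -> float:
    """Speedup = 1 / ((1 - p) + p / n)."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_processors)

# With 95% parallel code, even unlimited cores cannot exceed 1 / 0.05 = 20x
# speedup -- the "serial bottleneck" listed under Lecture 2.
for n in (2, 8, 64, 1024):
    print(n, round(amdahl_speedup(0.95, n), 2))
```

As n grows, the p/n term vanishes and the speedup saturates at 1/(1 - p); this limit is what asymmetric designs such as the ACMP (Lecture 5) attack by running the serial portion on one large core.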