Differences
This shows you the differences between two versions of the page.
buzzwords [2012/09/27 02:44] hanbiny |
buzzwords [2014/09/02 03:31] (current) |
||
---|---|---|---|
Line 181: | Line 181: | ||
* Thread Switching Urgency | * Thread Switching Urgency | ||
* Fairness | * Fairness | ||
+ | |||
+ | =====Lecture 10===== | ||
+ | * Fine-grained Multithreading | ||
+ | * Coarse-grained Multithreading | ||
+ | * Fairness and Throughput | ||
+ | * Thread Switching Urgency | ||
+ | * Simultaneous Multithreading | ||
+ | * Functional Unit Utilization | ||
+ | * Superscalar Out-of-Order Pipeline | ||
+ | * SMT Pipeline | ||
+ | * SMT Scalability | ||
+ | * SMT Fetch Policy | ||
+ | * Long Latency Loads | ||
+ | * Memory-Level Parallelism (MLP) | ||
+ | * Runahead Threads | ||
+ | * Thread Priority Support | ||
+ | * Thread Throttling | ||
+ | |||
+ | =====Lecture 11===== | ||
+ | * Utility-based cache partitioning | ||
+ | * Cache capacity | ||
+ | * Cache data compression | ||
+ | * Frequent value compression | ||
+ | * Frequent pattern compression | ||
+ | * Low dynamic range | ||
+ | * Base+Delta encoding | ||
+ | * Main memory compression | ||
+ | * IBM MXT | ||
+ | * Linearly compressed pages | ||
+ | |||
+ | =====Lecture 13===== | ||
+ | * Fault and Error | ||
+ | * Fault Detection | ||
+ | * Fault Tolerance | ||
+ | * Transient Fault | ||
+ | * Permanent Fault | ||
+ | * Space redundancy | ||
+ | * Time redundancy | ||
+ | * Lockstepping | ||
+ | * Simultaneous Redundant Threading (SRT) | ||
+ | * Sphere of Replication | ||
+ | * Input Replication | ||
+ | * Output Comparison | ||
+ | * Branch Outcome Queue | ||
+ | * Line Prediction Queue | ||
+ | * Chip-Level Redundant Threading | ||
+ | * Exception Handling | ||
+ | * Helper Threading for Prefetching | ||
+ | * Thread-Based Pre-Execution | ||
+ | |||
+ | =====Lecture 15===== | ||
+ | * Slipstreaming | ||
+ | * Instruction Removal | ||
+ | * Dual Core Execution | ||
+ | * Thread-Level Speculation | ||
+ | * Conflict Detection | ||
+ | * Speculative Parallelization | ||
+ | * Inter-Thread Communication | ||
+ | * Data Dependences and Versioning | ||
+ | * Speculative Memory State | ||
+ | * Multiscalar Processor | ||
+ | |||
+ | =====Lecture 16===== | ||
+ | * Multiscalar Processor | ||
+ | * Multiscalar Tasks | ||
+ | * Register Forwarding | ||
+ | * Task Sequencing | ||
+ | * Inter-Task Dependences | ||
+ | * Address Resolution Buffer | ||
+ | * Memory Dependence Prediction | ||
+ | * Store-Load Dependences | ||
+ | * Memory Disambiguation | ||
+ | * Speculative Lock Elision | ||
+ | * Atomicity | ||
+ | * Speculative Parallelization | ||
+ | * Accelerating Critical Sections | ||
+ | * Transactional Lock Removal | ||
+ | |||
+ | =====Lecture 17===== | ||
+ | * Interconnection Network | ||
+ | * Network Topology | ||
+ | * Bus | ||
+ | * Crossbar | ||
+ | * Ring | ||
+ | * Mesh | ||
+ | * Torus | ||
+ | * Tree | ||
+ | * Hypercube | ||
+ | * Multistage Logarithmic Network | ||
+ | * Circuit vs. Packet Switching | ||
+ | * Flow Control | ||
+ | * Head of Line Blocking | ||
+ | * Virtual Channel Flow Control | ||
+ | * Communicating Buffer Availability | ||
+ | |||
+ | =====Lecture 18===== | ||
+ | * Routing | ||
+ | * Deadlock | ||
+ | * Router Design | ||
+ | * Router Pipeline Optimizations | ||
+ | * Interconnection Network Performance | ||
+ | * Packet Scheduling | ||
+ | * Bufferless Deflection Routing | ||
+ | * Livelock | ||
+ | * Packet Reassembly | ||
+ | * Golden Packet | ||
+ | * Minimally-Buffered Deflection Routing | ||
+ | * Side Buffer | ||
+ | * Heterogeneous Adaptive Throttling | ||
+ | * Application-Aware Source Throttling | ||
+ | * Dynamic Throttling Rate Adjustment | ||
+ | |||
+ | =====Lecture 20===== | ||
+ | * Locks vs. Transactions | ||
+ | * Transactional Memory | ||
+ | * Logging/buffering | ||
+ | * Conflict detection | ||
+ | * Abort/rollback | ||
+ | * Commit | ||
+ | * Routing | ||
+ | * Deterministic | ||
+ | * Oblivious | ||
+ | * Adaptive | ||
+ | * Deadlock | ||
+ | |||
+ | =====Lecture 21===== | ||
+ | * Packet Scheduling | ||
+ | * Stall Time Criticality | ||
+ | * Memory-Level Parallelism | ||
+ | * Shortest Job First Principle | ||
+ | * Application-Awareness | ||
+ | * Packet Ranking and Batching | ||
+ | * Slack of Packets | ||
+ | * Packet Prioritizing using Slack | ||
+ | * Starvation Avoidance | ||
+ | * 2-D Mesh, Concentration, Replication | ||
+ | * Flattened Butterfly | ||
+ | * Multidrop Express Channels (MECS) | ||
+ | * Kilo-NoC | ||
+ | * Network-on-Chip (NoC) Quality of Service (QoS) | ||
+ | * Topology-Aware QoS | ||
+ | |||
+ | =====Lecture 22===== | ||
+ | * Data Flow | ||
+ | * Data Flow Nodes | ||
+ | * Data Flow Graphs | ||
+ | * Control Flow vs. Data Flow | ||
+ | * Static Data Flow | ||
+ | * Reentrant code (Function calls, Loops) | ||
+ | * Dynamic Data Flow | ||
+ | * Frame Pointer | ||
+ | * Tagging | ||
+ | * Data Structures | ||
+ | * I-Structure | ||
+ | * MIT Tagged Token Data Flow Architecture | ||
+ | * Manchester Data Flow Machine | ||
+ | * Combining Data Flow and Control Flow | ||
+ | |||
+ | =====Lecture 23===== | ||
+ | * Combining Data Flow and Control Flow | ||
+ | * Macro Dataflow | ||
+ | * Restricted Data Flow | ||
+ | * Systolic Architecture | ||
+ | * Systolic Computation | ||
+ | * Pipeline Parallelism | ||
+ | |||
+ | =====Lecture 24===== | ||
+ | * Resource Sharing | ||
+ | * Shared Resource Management and QoS | ||
+ | * Resource Sharing vs. Partitioning | ||
+ | * Multi-core Caching | ||
+ | * Shared Cache Management | ||
+ | * Sharing in Main Memory | ||
+ | * Memory Controller | ||
+ | * Inter-Thread Interference | ||
+ | * QoS-Aware Memory Scheduling | ||
+ | * Stall-Time Fairness | ||
+ | * Bank Parallelism-Awareness | ||
+ | * Request Batching | ||
+ | * Shortest Stall-Time First Ranking | ||
+ | * Memory Episode Lengths | ||
+ | * Least Attained Service | ||
+ | |||
+ | =====Lecture 25===== | ||
+ | * QoS-Aware Memory Request Scheduling | ||
+ | * Smart/Dumb Resources | ||
+ | * Throughput vs. Fairness | ||
+ | * Thread Cluster Memory Scheduling | ||
+ | * Clustering Threads | ||
+ | * CPU-GPU Systems | ||
+ | * Staged Memory Scheduling | ||
+ | * Parallel Application Memory QoS | ||
+ | |||
+ | =====Lecture 26===== | ||
+ | * QoS-Aware Memory Systems | ||
+ | * Smart vs. Dumb Resources | ||
+ | * Memory Channel Partitioning | ||
+ | * Application-Awareness | ||
+ | * Multiple Channels | ||
+ | * Memory Intensity | ||
+ | * Row Buffer Locality | ||
+ | * Preferred Channel | ||
+ | * Integrated Memory Partitioning and Scheduling | ||
+ | * Fairness via Source Throttling | ||
+ | * Dynamic Request Throttling | ||
+ | * Estimating System Unfairness | ||
+ | * Inter-Core Interference | ||
+ | * Row Buffer Interference | ||
+ | * Memory Interference-induced Slowdown Estimation | ||
+ | * Shared Memory Performance Predictability | ||
+ | * Shared Resource Interference | ||
+ | * Memory Phase Fraction | ||
+ | * Alone Request Service Rate | ||
+ | * Shared Request Service Rate | ||
+ | * “Soft” Slowdown Guarantees | ||
+ | |||
+ | =====Lecture 27===== | ||
+ | * CPU-GPU Memory Scheduling | ||
+ | * Batch Formation | ||
+ | * Batch Scheduler | ||
+ | * DRAM Command Scheduler | ||
+ | * Prefetcher Accuracy | ||
+ | * Feedback-Directed Prefetching | ||
+ | * Hierarchical Prefetcher Aggressiveness Control | ||
+ | * Inter-Core Cache Pollution | ||
+ | * Global Control |