Differences
This shows you the differences between two versions of the page.
readings [2010/11/09 02:47] vseshadr |
readings [2010/12/04 06:00] (current) vseshadr |
||
---|---|---|---|
Line 170: | Line 170: | ||
== Required Readings (old) == | == Required Readings (old) == | ||
* {{microarchitectureofsuperscalar.pdf|Smith and Sohi, "The Microarchitecture of Superscalar Processors," Proc IEEE 1995}} | * {{microarchitectureofsuperscalar.pdf|Smith and Sohi, "The Microarchitecture of Superscalar Processors," Proc IEEE 1995}} | ||
- | |||
- | |||
- | == Recommended Readings == | ||
- | * {{tracecache.pdf|Rotenberg et al., "Trace Cache: a Low Latency Approach to High Bandwidth Instruction Fetching," MICRO 1996}} | ||
* {{onpipeliningdynamicinstructionschedulinglogic.pdf|Stark, Brown, Patt, "On pipelining dynamic instruction scheduling logic," MICRO 2000}} | * {{onpipeliningdynamicinstructionschedulinglogic.pdf|Stark, Brown, Patt, "On pipelining dynamic instruction scheduling logic," MICRO 2000}} | ||
* {{Themicroarchitectureofthepentium4processor.pdf|Boggs et al., "The microarchitecture of the Pentium 4 processor," Intel Technology Journal, 2001}} | * {{Themicroarchitectureofthepentium4processor.pdf|Boggs et al., "The microarchitecture of the Pentium 4 processor," Intel Technology Journal, 2001}} | ||
* {{21264microprocessor.pdf|Kessler, "The Alpha 21264 microprocessor," IEEE Micro, March-April 1999}} | * {{21264microprocessor.pdf|Kessler, "The Alpha 21264 microprocessor," IEEE Micro, March-April 1999}} | ||
+ | == Recommended Readings == | ||
+ | * {{tracecache.pdf|Rotenberg et al., "Trace Cache: a Low Latency Approach to High Bandwidth Instruction Fetching," MICRO 1996}} | ||
+ | |||
+ | ===== For Lecture 21 ===== | ||
+ | Same as previous lecture | ||
+ | ===== For Lecture 22 ===== | ||
+ | Same as previous lecture | ||
+ | ===== For Lecture 23 ===== | ||
+ | Same as previous lecture | ||
+ | |||
+ | ===== For Lecture 24 ===== | ||
+ | == Required Readings == | ||
+ | * {{conbiningbranchpredictors.pdf|McFarling, "Combining Branch Predictors," DEC WRL TR, 1993}} | ||
+ | * {{increasingprocessorperformance.pdf|Sprangle and Carmean, "Increasing Processor Performance by Implementing Deeper Pipelines," ISCA 2002}} | ||
+ | |||
+ | == Recommended Readings == | ||
+ | * {{analysisofcorrelationandpredictability.pdf|Evers et al., "An Analysis of Correlation and Predictability: What Makes Two-Level Branch Predictors Work," ISCA 1998}} | ||
+ | * {{alternativeimplementationoftwolevelbp.pdf|Yeh and Patt, "Alternative Implementations of Two-Level Adaptive Branch Prediction," ISCA 1992}} | ||
+ | * {{availableilpforsuperscalar.pdf|Jouppi and Wall, "Available instruction-level parallelism for superscalar and superpipelined machines," ASPLOS 1989}} | ||
+ | * {{divergemergeprocessors.pdf|Kim et al., "Diverge-Merge Processor (DMP): Dynamic Predicated Execution of Complex Control-Flow Graphs Based on Frequently Executed Paths," MICRO 2006}} | ||
+ | * {{dynamicbranchpredictionwithperceptrons.pdf|Jimenez and Lin, "Dynamic Branch Prediction with Perceptrons," HPCA 2001}} | ||
+ | |||
+ | ===== For Lecture 25 ===== | ||
+ | Same as previous lecture | ||
+ | |||
+ | ===== For Lecture 26 ===== | ||
+ | |||
+ | === Control Flow III === | ||
+ | |||
+ | == Recommended Readings == | ||
+ | * {{wishbranches.pdf|Kim et al., "Wish Branches: Enabling Adaptive and Aggressive Predicated Execution," IEEE Micro Top Picks, Jan/Feb 2006}} | ||
+ | * {{divergemergeprocessors.pdf|Kim et al., "Diverge-Merge Processor: Generalized and Energy-Efficient Dynamic Predication," IEEE Micro Top Picks, Jan/Feb 2007}} | ||
+ | |||
+ | === Alternative Approaches to Concurrency === | ||
+ | == Required Readings == | ||
+ | * {{vliweli.pdf|Fisher, "Very Long Instruction Word architectures and the ELI-512," ISCA 1983}} | ||
+ | * {{introducingia64.pdf|Huck et al., "Introducing the IA-64 Architecture," IEEE Micro 2000}} | ||
+ | |||
+ | == Recommended Readings == | ||
+ | * {{cray1computersystem.pdf|Russell, "The CRAY-1 computer system," CACM 1978}} | ||
+ | * {{ilpprocessing.pdf|Rau and Fisher, "Instruction-level parallel processing: history, overview, and perspective," Journal of Supercomputing, 1993}} | ||
+ | * {{instructionschedulingforilpprocessors.pdf|Faraboschi et al., "Instruction Scheduling for Instruction Level Parallel Processors," Proc. IEEE, Nov. 2001}} | ||
+ | |||
+ | ===== For Lecture 26 ===== | ||
+ | Same as previous lecture (Alternative Approaches to Concurrency) | ||
+ | |||
+ | ===== For Lecture 27 ===== | ||
+ | == Required Readings == | ||
+ | * {{nvidiatesla.pdf|Lindholm et al., "NVIDIA Tesla: A Unified Graphics and Computing Architecture," IEEE Micro 2008}} | ||
+ | * {{cray1computersystem.pdf|Russell, "The CRAY-1 computer system," CACM 1978}} | ||
+ | |||
+ | == Recommended Readings == | ||
+ | * {{dynamicwarpformation.pdf|Fung et al., "Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow," MICRO 2007}} | ||
+ | * {{qilin.pdf|Luk et al., "Qilin: Exploiting Parallelism on Heterogeneous Multiprocessors with Adaptive Mapping," MICRO 2009}} |