Differences

This shows you the differences between two versions of the page.

readings [2010/11/09 02:47]
vseshadr
readings [2010/12/04 06:00] (current)
vseshadr
Line 170: Line 170:
== Required Readings (old) ==
  * {{microarchitectureofsuperscalar.pdf|Smith and Sohi, "The Microarchitecture of Superscalar Processors," Proc IEEE 1995}}
-   
- 
-== Recommended Readings == 
-  * {{tracecache.pdf|Rotenberg et al., "Trace Cache: a Low Latency Approach to High Bandwidth Instruction Fetching," MICRO 1996}} 
  * {{onpipeliningdynamicinstructionschedulinglogic.pdf|Stark, Brown, Patt, "On pipelining dynamic instruction scheduling logic," MICRO 2000}}
  * {{Themicroarchitectureofthepentium4processor.pdf|Boggs et al., "The microarchitecture of the Pentium 4 processor," Intel Technology Journal, 2001}}
  * {{21264microprocessor.pdf|Kessler, "The Alpha 21264 microprocessor," IEEE Micro, March-April 1999}}
 +== Recommended Readings ==
 +  * {{tracecache.pdf|Rotenberg et al., "Trace Cache: a Low Latency Approach to High Bandwidth Instruction Fetching," MICRO 1996}}
 + 
 +===== For Lecture 21 =====
 +Same as previous lecture
 +===== For Lecture 22 =====
 +Same as previous lecture
 +===== For Lecture 23 =====
 +Same as previous lecture
 +
 +===== For Lecture 24 =====
 +== Required Readings ==
 +  * {{conbiningbranchpredictors.pdf|McFarling, "Combining Branch Predictors," DEC WRL TR, 1993}}
 +  * {{increasingprocessorperformance.pdf|Sprangle and Carmean, "Increasing Processor Performance by Implementing Deeper Pipelines," ISCA 2002}}
 +
 +== Recommended Readings ==
 +  * {{analysisofcorrelationandpredictability.pdf|Evers et al., "An Analysis of Correlation and Predictability: What Makes Two-Level Branch Predictors Work," ISCA 1998}}
 +  * {{alternativeimplementationoftwolevelbp.pdf|Yeh and Patt, "Alternative Implementations of Two-Level Adaptive Branch Prediction," ISCA 1992}}
 +  * {{availableilpforsuperscalar.pdf|Jouppi and Wall, "Available instruction-level parallelism for superscalar and superpipelined machines," ASPLOS 1989}}
 +  * {{divergemergeprocessors.pdf|Kim et al., "Diverge-Merge Processor (DMP): Dynamic Predicated Execution of Complex Control-Flow Graphs Based on Frequently Executed Paths," MICRO 2006}}
 +  * {{dynamicbranchpredictionwithperceptrons.pdf|Jimenez and Lin, "Dynamic Branch Prediction with Perceptrons," HPCA 2001}}
 +
 +===== For Lecture 25 =====
 +Same as previous lecture
 +
 +===== For Lecture 26 =====
 +
 +=== Control Flow III ===
 +
 +== Recommended Readings ==
 +  * {{wishbranches.pdf|Kim et al., "Wish Branches: Enabling Adaptive and Aggressive Predicated Execution," IEEE Micro Top Picks, Jan/Feb 2006}}
 +  * {{divergemergeprocessors.pdf|Kim et al., "Diverge-Merge Processor: Generalized and Energy-Efficient Dynamic Predication," IEEE Micro Top Picks, Jan/Feb 2007}}
 + 
 +=== Alternative Approaches to Concurrency ===
 +== Required Readings ==
 +  * {{vliweli.pdf|Fisher, "Very Long Instruction Word architectures and the ELI-512," ISCA 1983}}
 +  * {{introducingia64.pdf|Huck et al., "Introducing the IA-64 Architecture," IEEE Micro 2000}}
 +
 +== Recommended Readings ==
 +  * {{cray1computersystem.pdf|Russell, "The CRAY-1 computer system," CACM 1978}}
 +  * {{ilpprocessing.pdf|Rau and Fisher, "Instruction-level parallel processing: history, overview, and perspective," Journal of Supercomputing, 1993}}
 +  * {{instructionschedulingforilpprocessors.pdf|Faraboschi et al., "Instruction Scheduling for Instruction Level Parallel Processors," Proc. IEEE, Nov. 2001}}
 +
 +===== For Lecture 26 =====
 +Same as previous lecture (Alternative Approaches to Concurrency)
 +
 +===== For Lecture 27 =====
 +== Required Readings ==
 +  * {{nvidiatesla.pdf|Lindholm et al., "NVIDIA Tesla: A Unified Graphics and Computing Architecture," IEEE Micro 2008}}
 +  * {{cray1computersystem.pdf|Russell, "The CRAY-1 computer system," CACM 1978}}
 +
 +== Recommended Readings ==
 +  * {{dynamicwarpformation.pdf|Fung et al., "Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow," MICRO 2007}}
 +  * {{qilin.pdf|Luk et al., "Qilin: Exploiting Parallelism on Heterogeneous Multiprocessors with Adaptive Mapping," MICRO 2009}}
