Architecture Research Term Project

You will carry out a semester-long term project on a topic based on content from this course. Your project must begin with a research question, design and implement a system and/or architecture addressing that research question, and use the implementation to perform a quantitative evaluation. The choice of project topic is deliberately open-ended. You are expected to identify an interesting question that does not have an obvious answer, and to answer that question with your project. As a rough guide to scale, your project should be about the size and complexity of a paper that you might see published at a workshop.

In planning your project topic, you are encouraged to consider outside academic interests (ML, HCI, AI, PL, etc.) in the context of parallel, heterogeneous, and emerging computer architectures. Your project could be an implementation of an idea that we read about in class, perhaps transplanted to a new context (e.g., deterministic execution for GPU/CPU systems), or something completely novel (e.g., an FPGA-based accelerator for neural network computer vision workloads). Your project should not be a measurement study only (e.g., "how fast is PARSEC on an Intel CPU?"); however, a research project that includes a well-crafted measurement study as a contribution could be very interesting.

You will have to choose a project topic before we have covered all of the topics for the semester. To help identify project ideas, read the abstracts in the reading list and skim any papers that seem interesting.

Deliverables

  • Team Roster and Preliminary Project Ideas - Form a team of 2 people and write a sentence or two about a few ideas that you might want to work on. These ideas are non-binding and intended to help you start brainstorming.
  • Project Proposal - A detailed project proposal document of no more than 3 pages. This proposal must include the research question you are asking, describe your project's design space assessment, sketch an architecture design, and describe your evaluation plan, including necessary hardware and software tools.
  • Project Infrastructure Check-in / Mid-Point Status Report - A document of no more than 2 pages describing your progress so far. You must include evidence that you have started working with the infrastructure that you will use for your project. For instance, if you are using an architectural simulator, you must show that you have gotten the simulator running on your machine and that you can process and understand its log files. If you are using an FPGA or other hardware board, you must show that you have "Hello World" or better running by this point. If your project has changed direction or scope since the initial project proposal, explain how in detail in this document.
  • Project Evaluation Methodology Writeup and Preliminary Results - A document with the structure, but not necessarily all of the content, of your final report (see below). The only parts of the document that we will be checking are the methodology / experimental setup and the results. We will expect to see a detailed description of your experimental plan and data analysis methodology, as well as a concise description of what you intend to show experimentally in your project. In your evaluation section, we expect to see plotted (or tabulated, as appropriate) results for at least one of your main findings. Data do not have to be finalized at this point, but you must demonstrate to us that you have a process in place for generating data, and that you have a practical plan for designing, running, and interpreting experiments.
  • Project Final Report - A full description of your project, including the key question, design space, design choices, and quantitative evaluation methodology and results. This document should be no more than 10 pages in conference format: 2-column, 11pt font, with figures. This document should be written like the research papers that we are reading in class. Your report should have the following sections: (i) Introduction, (ii) Background and Motivation, (iii) Your Main Idea, (iv) Your Architecture and/or Implementation, (v) Your Methodology and Experimental Design, (vi) Experimental Results, (vii) Conclusions, and (viii) a separate, per-student submission detailing your personal contributions to the project and the contributions that you perceive your teammates to have made.
  • Project Final Presentation - An in-class presentation about your project including the important details from your project final report. You will have 10 minutes (I will be timing very precisely and I will cut you off!) to present. Your presentation must be focused, concise, and informative. You must discuss your main research question, describe your project and implementation, show experimental results, and help us interpret them to reach the right conclusion.

Project Ideas

    1. Architecture and systems support for MCUNetv2/v3 in memory-constrained dataflow processors
    2. Architecture support for sparse and irregular computation in resource-constrained ultra-low-power systems
    3. Architecture support for time-multiplexing of processing elements in an ordered-dataflow CGRA
    4. Computer architecture support for encrypted dataflow processing
    5. Analysis and design of approximate computer architectures for in-satellite, radiation-tolerant computing
    6. Cache flush management strategies for intermittent energy-harvesting computers
    7. Memory consistency models for ordered dataflow
    8. Defining a memory consistency model and synchronization primitives for a processing-in-memory system
    9. Evaluation of performance bottlenecks in encrypted computing systems
    10. Intermittently powered dataflow CGRA architecture
    11. Software or architecture support for resource-constrained or intermittent graph processing
    12. Distributed, intermittent Deep Neural Network training system
    13. Energy-harvesting simulation infrastructure: power and performance model
    14. Measurement study of architectural implications of non-volatile technology as storage or logic in a CPU
    15. Relaxed memory consistency for FPGA/CPU SoCs
    16. Performance and correctness impact of approximate synchronization operations on neural network or computer vision applications
    17. Hardware support for data-centric synchronization / per-address memory fences
    18. Heterogeneous memory consistency for CPU+FPGA systems with per-FPGA-state-machine consistency guarantees
    19. Design and evaluation of an intermittent reconfigurable architecture
    20. Approximate, compressive cache, LLC or main memory
    21. Data-race detection or SC-violation detection in a reconfigurable computing device or heterogeneous FPGA/CPU-based system
    22. Application study: precision vs. performance trade-off in a parallel system with approximate cache coherence
    23. Deterministic parallel computation in an FPGA
    24. Application study: when is it beneficial to execute code on a GPU or FPGA in parallel with execution on a CPU?
    25. Symbolic execution to evaluate candidate power schedules for programs running on intermittently powered devices
    26. 3D-stacked, processing-in-memory to accelerate garbage collection or other pointer-chasing analysis
    27. Deterministic transactional execution with weak isolation guarantees
    28. Approximate, parallel scatter/gather or reduction
    29. Performance and power model and assessment of a "perpetual" solar-powered, fully-nonvolatile processor
    30. Using shared memory communication graphs to predict magnitude/importance of shared value updates
    31. Cache architecture and memory hierarchy design for heterogeneous CPU/GPU/Accelerator architecture
    32. Feasibility assessment and performance model of porting TensorFlow kernels to an FPGA
    33. Environmental impact assessment and mitigation strategy for current and future cloud machine learning
    34. Hardware concurrency bug detection for FPGA designs

    Benchmarks

    1. PARSEC (parallel applications)
    2. Rodinia (heterogeneous parallel applications)

    Simulators and Tools

    1. Sniper (easy-to-use architecture simulator)
    2. gem5 (very detailed architecture simulator)
    3. MARSSx86 (detailed architecture simulator)
    4. McPAT (architectural power modeling)
    5. CACTI (power modeling)
    6. Pin (binary instrumentation)
    7. LLVM (compiler infrastructure)
    8. Z3 (SMT solver)
    9. KLEE (C/C++ symbolic execution engine)
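
    Example: Summarizing Results

The evaluation deliverables above ask for a repeatable process for generating and tabulating data. As one possible illustration (not a required format), the sketch below computes a per-benchmark speedup from repeated runs and a geometric-mean speedup across benchmarks. The benchmark names and runtimes are hypothetical placeholders, not measured results, and the helper names `speedup` and `geomean` are made up for this example.

```python
import math
import statistics


def speedup(baseline_runs, variant_runs):
    """Speedup of the variant over the baseline, using mean runtimes."""
    return statistics.mean(baseline_runs) / statistics.mean(variant_runs)


def geomean(values):
    """Geometric mean, a common way to average ratios such as speedups."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical runtimes in seconds (placeholders, not real measurements).
runs = {
    "bench_a": {"baseline": [10.0, 10.2, 9.8], "variant": [5.0, 5.1, 4.9]},
    "bench_b": {"baseline": [8.0, 8.0, 8.0], "variant": [4.0, 4.0, 4.0]},
}

speedups = {
    name: speedup(r["baseline"], r["variant"]) for name, r in runs.items()
}

# Tabulate per-benchmark speedups and the overall geometric mean.
for name, s in speedups.items():
    print(f"{name:10s} {s:6.2f}x")
print(f"{'geomean':10s} {geomean(list(speedups.values())):6.2f}x")
```

Reporting the spread across repeated runs (for example, with `statistics.stdev`) alongside the mean helps readers judge whether a difference is meaningful.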