Team Status Report
UPDATES
This week, we had a presentation! We’ve posted the slides from that, as well as the poster for our demo, under media.
RISKS
We have a viable demo currently. We have some improvements that we’d like to make in terms of processing speed, but our current demo constitutes our minimum viable project.
DESIGN CHANGES
The DMA solution from last week was successful, so we have incorporated that into our project. The DMA engine transfers pixel data to the BRAM, then transfers the resulting energies back to the DRAM when the computation is complete.
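As a rough illustration of this round trip, the sketch below models the flow in software. Pixels are copied into a staging buffer (standing in for the BRAM), an energy is computed per pixel, and the results are copied back out (standing in for the DRAM). The gradient-magnitude energy function and the names `compute_energy` and `dma_round_trip` are assumptions made for illustration; this report does not show the actual hardware kernel or DMA driver calls.

```python
def compute_energy(pixels):
    """Per-pixel energy as |dx| + |dy|, clamping neighbours at the borders.

    Assumed energy function; the project's real kernel may differ.
    """
    h, w = len(pixels), len(pixels[0])
    energy = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            left = pixels[y][max(x - 1, 0)]
            right = pixels[y][min(x + 1, w - 1)]
            up = pixels[max(y - 1, 0)][x]
            down = pixels[min(y + 1, h - 1)][x]
            energy[y][x] = abs(right - left) + abs(down - up)
    return energy


def dma_round_trip(dram_frame):
    """Model the transfer in, the computation, and the transfer back out."""
    # "DMA in": copy the pixel data into a buffer standing in for the BRAM.
    bram = [row[:] for row in dram_frame]
    energies = compute_energy(bram)
    # "DMA out": copy the energies back to a buffer standing in for the DRAM.
    return [row[:] for row in energies]


# Tiny grayscale frame with one bright horizontal stripe in the middle row.
frame = [
    [10, 10, 10, 10],
    [10, 50, 50, 10],
    [10, 10, 10, 10],
]
energy_map = dma_round_trip(frame)
```

The copies here only mark where the two transfers happen; on the board those steps would be carried out by the DMA engine rather than by the processor.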
Maxwell Johnson
Personal work and Progress:
I spent most of this week debugging the DMA solution. Now that the system works, we can officially say that DMA is correct and faster than the alternative. For a 3.7 second video, the energy computation time decreased from 144s to 11s. These are the newest timing data I’ve collected:
- RX Time: 10.432682 s
- Total compute time: 81.239788 s
  - Total finding seams time: 65.049371 s
    - Min Cut Time (full frame): 62.142221 s
      - Sift Time: 61.899109 s
      - Search Time: 0.241102 s
    - Incr Cut Time (partial frame): 2.907150 s
  - Total time deleting seams: 5.033181 s
  - Total energy map time (incl. DMA): 10.929733 s
- TX Time: 8.325749 s
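To make the breakdown easier to read, here is a small sketch that turns the timings above into shares of the total compute time, along with the speedup of the energy computation quoted earlier (144 s to 11 s). The variable and function names are just for illustration.

```python
# Timings in seconds, copied from the list above.
total_compute = 81.239788
parts = {
    "finding seams": 65.049371,
    "deleting seams": 5.033181,
    "energy map (incl. DMA)": 10.929733,
}
min_cut_full = 62.142221


def share(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


for name, seconds in parts.items():
    print(f"{name}: {share(seconds, total_compute):.1f}% of compute time")

print(f"full-frame min cut: {share(min_cut_full, total_compute):.1f}% of compute time")

# Energy computation time before and after DMA, as reported above.
energy_speedup = 144 / 11
print(f"energy computation speedup from DMA: about {energy_speedup:.1f}x")
```

The three parts sum to about 81.01 s, slightly under the measured 81.24 s total, so a small amount of compute time is not attributed to any of them.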
The bottleneck is finding the min cut of every frame. Because this is only used to determine the radix frame and then to make the cut on the radix frame, it is not a task that deserves to take about 76% of our compute time (62.1 s of 81.2 s). Today, I experimented with other techniques for choosing a radix frame. Choosing a random frame creates a video with similar, though slightly worse, quality. Once Riki completes the script to analyze video quality using a quantitative energy function, we can decide on the proper tradeoff between processing time and output quality.
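The difference between the two strategies can be sketched as follows. This is only an illustration under stated assumptions: the report does not say how the radix frame is picked from the per-frame min cuts, so the sketch assumes it is the frame whose cheapest vertical seam has the lowest total energy, and it uses a standard dynamic-programming seam search as a stand-in for our min-cut routine. All function names are hypothetical.

```python
import random


def min_seam_cost(energy):
    """Cost of the cheapest top-to-bottom seam, found by dynamic programming.

    Stand-in for the project's min-cut routine (an assumption).
    """
    prev = list(energy[0])
    for row in energy[1:]:
        w = len(row)
        cur = []
        for x in range(w):
            lo = max(x - 1, 0)
            hi = min(x + 1, w - 1)
            # A seam may continue from the pixel above or its two neighbours.
            cur.append(row[x] + min(prev[lo:hi + 1]))
        prev = cur
    return min(prev)


def choose_radix_exhaustive(energy_maps):
    """Search every frame: one full seam search per frame."""
    costs = [min_seam_cost(e) for e in energy_maps]
    return costs.index(min(costs))


def choose_radix_random(energy_maps, rng):
    """Pick a frame at random: no per-frame seam search at all."""
    return rng.randrange(len(energy_maps))


# Three toy 3x3 energy maps standing in for three video frames.
frames = [
    [[5, 9, 9], [9, 5, 9], [9, 9, 5]],
    [[1, 9, 9], [1, 9, 9], [1, 9, 9]],
    [[9, 9, 9], [9, 9, 9], [9, 9, 9]],
]
exhaustive_choice = choose_radix_exhaustive(frames)
random_choice = choose_radix_random(frames, random.Random(0))
```

The exhaustive version does work proportional to the number of frames before any cut is made, while the random version needs a single seam search on the chosen frame afterwards, which is where the time savings would come from.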
Deliverables:
Tomorrow, we’re all planning to work on some validation and optimization. My main goal is to successfully overlap the computation done by the system on chip with the DMA. Currently, the system waits for the transfer to complete before continuing to process pixels, which is a wasted opportunity for computation. Following the demo, our deliverable will be the paper.
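One common way to get this kind of overlap is double buffering: start the transfer of the next chunk, then process the current chunk while that transfer is in flight. The sketch below shows only the control flow, with a worker thread standing in for the DMA engine. The `transfer` and `process` functions are placeholders, not our actual driver or kernel code, and a Python thread will not show a real speedup the way an independent hardware engine would.

```python
from concurrent.futures import ThreadPoolExecutor


def transfer(chunk):
    """Stand-in for a DMA transfer: returns a copy of the chunk."""
    return list(chunk)


def process(buffer):
    """Stand-in for the per-pixel work done on a transferred buffer."""
    return sum(p * p for p in buffer)


def run_sequential(chunks):
    """Current behaviour: wait for each transfer, then process it."""
    return [process(transfer(c)) for c in chunks]


def run_overlapped(chunks):
    """Double buffering: prefetch chunk i+1 while processing chunk i."""
    results = []
    if not chunks:
        return results
    with ThreadPoolExecutor(max_workers=1) as dma:
        pending = dma.submit(transfer, chunks[0])
        for i in range(len(chunks)):
            buffer = pending.result()  # wait only for the current chunk
            if i + 1 < len(chunks):
                # Kick off the next transfer before starting to compute.
                pending = dma.submit(transfer, chunks[i + 1])
            results.append(process(buffer))
    return results


chunks = [[1, 2, 3], [4, 5], [6]]
sequential = run_sequential(chunks)
overlapped = run_overlapped(chunks)
```

Both versions produce the same results; only the order in which transfers are issued changes, so the transfer of one chunk can proceed while the previous one is being processed.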
John Zhang
Personal Work
Progress
Deliverables
Riki Khorana
Personal Work and Progress
– Worked on the presentation slides
– Worked especially on summarizing and clarifying the algorithm
– Tried to recreate a lost progress video, but failed
– Worked on optimization by threading seam finding in the Vivado SDK
Deliverables
– Demo this Monday – work on further optimization
– Final report coming up soon