Adolfo’s Status Report for 4/26

Tess finished the memory bank controllers, which allow us to play more complex games that require more memory. Meanwhile, I worked on an alternate solution to get the controller to work with the FPGA. I made a controller driver on an RPi and then sent signals to the FPGA through GPIO. With this, I was able to play Tetris and Doctor Mario. I also fixed the timer bugs, which means our timer passes all the tests and, as a result, the sprites in Doctor Mario no longer play at super-fast speed. With these fixes, we found no bugs while playing the games.

This concludes our main requirements for the FPGA portion of the project. I then restructured the MBC code to work better with the FPGA and got an MBC1 game working, Super Mario World. This shows that our memory bank controller works in synthesis as well and that more complex games give our emulator no issues. We’re going to try to get the off-board memory flashed with games so we can try out games that are notoriously difficult to emulate and need a bit more space than the FPGA has. We don’t expect them to work 100% since our PPU isn’t 100% accurate (this is the hardest part to get right in an emulator), but we expect them to be in an otherwise playable state. This is a stretch goal since Tess and I are a bit busy this week, so we may not be able to put in as much work as we did in the previous weeks.
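As a rough illustration of what an MBC1 controller does, here is a minimal Python model of its ROM address translation. It is a sketch under stated assumptions, not our RTL: the class and method names are made up for the example, it assumes a power-of-two bank count, and it ignores the upper bank bits, RAM banking, and the mode select register.

```python
BANK_SIZE = 0x4000  # each ROM bank is 16 KiB


class Mbc1Sketch:
    """Hypothetical, simplified model of MBC1 ROM banking."""

    def __init__(self, num_banks):
        self.num_banks = num_banks  # assumed to be a power of two
        self.bank = 1               # switchable window starts on bank 1

    def write(self, addr, value):
        # Writes to 0x2000-0x3FFF select the lower 5 bits of the ROM bank.
        if 0x2000 <= addr <= 0x3FFF:
            bank = value & 0x1F
            if bank == 0:
                bank = 1            # bank 0 cannot be mapped here; 0 acts as 1
            self.bank = bank & (self.num_banks - 1)

    def rom_offset(self, addr):
        # 0x0000-0x3FFF always reads bank 0; 0x4000-0x7FFF reads the selected bank.
        if addr < 0x4000:
            return addr
        return self.bank * BANK_SIZE + (addr - 0x4000)


mbc = Mbc1Sketch(num_banks=4)
mbc.write(0x2100, 3)
offset = mbc.rom_offset(0x4000)  # first byte of bank 3
```

The game only ever sees a 32 KiB window; the controller turns bank-select writes into a larger offset into the cartridge ROM.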

We are happy with our results, as we have achieved more than we had originally aimed for on the FPGA side, even before the COVID issues. Hopefully, we’ll get the external memory on the FPGA to work so we can reach the stretch goal of playing games like Pokemon.

Adolfo’s Status Report for 4/12

This week Tess and I worked on the CPU side to get it up to the standards needed to run most games. I worked on automating our tests. Afterward, we started running Mooneye’s tests and Blargg’s instruction tests. They are the gold standard for emulators and are commonly used to compare the accuracy of emulators against each other. Tess and I debugged the CPU this whole week, and in the end we managed to pass all of Blargg’s tests. This is a significant milestone, given that it was our main accuracy requirement for the CPU. We are currently working on passing Mooneye’s tests. We passed the CPU tests, but we are still working through the tests that stress the Gameboy’s peripherals.
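For automated checking, a pass/fail test can be as small as the sketch below. It assumes the Mooneye convention as I understand it: a passing ROM leaves the Fibonacci numbers 3, 5, 8, 13, 21, 34 in registers B, C, D, E, H, L, and a failing ROM leaves 0x42 in all of them. The function name and the register-dump format are hypothetical and not taken from our test scripts.

```python
# Register values a Mooneye test ROM is assumed to leave behind on success.
PASS_SIGNATURE = {"B": 3, "C": 5, "D": 8, "E": 13, "H": 21, "L": 34}


def mooneye_passed(regs):
    """Return True if a register dump (name -> value) matches the pass signature."""
    return all(regs.get(name) == value for name, value in PASS_SIGNATURE.items())


# Example dumps, e.g. parsed from simulator output (format is an assumption).
good = {"A": 0, "B": 3, "C": 5, "D": 8, "E": 13, "H": 21, "L": 34}
bad = {name: 0x42 for name in "BCDEHL"}
results = (mooneye_passed(good), mooneye_passed(bad))
```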

Apart from the CPU work, I worked on getting DMA and the PPU working with the CPU. I managed to get our setup to pass Mooneye’s DMA tests and am currently working on getting the PPU to pass its tests, although VerilogBoy, our reference or “goal” for performance, fails all the PPU tests. We’ll see if we can improve upon them!

The PPU saw a lot of progress. I first worked on getting it to display an image in simulation. Shoutout to Ford, who let me use the script from his project to output what the emulator displays in sim. With this, I was able to get the PPU to display a pattern. The more challenging part was getting the PPU to display its output on the FPGA. It proved to be non-trivial because the VGA runs on a 25MHz clock and the PPU runs at 4.19MHz. I remedied this by adding a framebuffer to the VGA and putting an async FIFO between the two, a solution that took a long time to get right. After that, everything worked perfectly! Finally, I was able to get the PPU to display images rendered by the CPU in simulation. Tess is working on getting the tests to run on the FPGA to see if it also works there.
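One common way to make an async FIFO safe across two clock domains is to pass its read and write pointers as Gray code, so that only one bit changes per increment. The Python sketch below only illustrates that property; it is not taken from our design, and the function names and 4-bit pointer width are assumptions for the example.

```python
def bin_to_gray(n):
    """Convert a binary count to Gray code."""
    return n ^ (n >> 1)


def gray_to_bin(g):
    """Convert Gray code back to a binary count."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


PTR_BITS = 4  # hypothetical pointer width
DEPTH = 1 << PTR_BITS

# Number of bits that flip between each pointer value and the next, wrap included.
flips = [
    bin(bin_to_gray(i) ^ bin_to_gray((i + 1) % DEPTH)).count("1")
    for i in range(DEPTH)
]
```

Because every step flips exactly one bit, a pointer sampled in the other clock domain mid-transition is either the old value or the new one, never a mix of the two.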

This was a very exciting week in terms of progress. We are now able to match some of the well-known emulators in CPU, timer, and DMA performance and accuracy. This upcoming week our goal is to get our setup running Tetris and Doctor Mario, two of the first games most emulators run. To be able to run the rest of the tests we’ll need to add memory banking.

Adolfo’s Status Report 04/5/2020

This week Tess and I worked on getting the CPU to an almost finished state. We wrote tests that stressed each of the different possible instructions that the CPU can run and debugged the relevant parts. We spent the whole week on this. At the beginning of the week I got the testing infrastructure working. Testing assembly files is really easy now, since I made a testbench that loads files without the need to recompile and also made a bash script that automatically generates .hex files to load into memory. On Friday, we managed to pass all the tests we wrote for all the supported instructions. The CPU is able to run assembly programs of varying complexity. The hardest test we wrote is Fibonacci, which stresses our different load instructions, arithmetic instructions, conditional execution, and function calls. We also had more focused tests that targeted specific instructions.
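The .hex generation step can be sketched as follows. The real tool is a bash script; this Python version is only an illustration. It assumes the testbench loads memory with Verilog’s `$readmemh` in a one-byte-per-line layout, and the function names and the optional padding size are hypothetical.

```python
from pathlib import Path


def rom_to_hex_lines(rom, size=0):
    """Return one two-digit hex byte per line, zero-padded up to `size` bytes."""
    padded = bytes(rom).ljust(size, b"\x00")
    return [f"{byte:02x}" for byte in padded]


def write_hex(src, dst, size=0):
    """Convert an assembled binary at `src` into a text .hex file at `dst`."""
    lines = rom_to_hex_lines(Path(src).read_bytes(), size)
    Path(dst).write_text("\n".join(lines) + "\n")
    return len(lines)


# LD A,5 followed by HALT, padded to four bytes.
example = rom_to_hex_lines(bytes([0x3E, 0x05, 0x76]), size=4)
```

Since the memory image is read when the simulation starts, swapping the .hex file is enough to run a different test without recompiling.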

This weekend we worked on the peripherals, and I worked on the PPU. The PPU FSM and the fetcher are both working correctly. Unfortunately, PPU testing got delayed by issues with setting up the new board. Fortunately, the setup issues didn’t last too long, and I was able to get the VGA test running again on the new board. This wasn’t a waste of time since I got the pin assignments for the board and also figured out how to upload everything. This was the first time any of us had worked with the new board with respect to uploading FPGA files and setting pin assignments.

Tess and I also researched the functionality of the other peripherals and got the DMA and the timer working. We also have a rough draft of the MMU and memory bank switching. We were very excited since we are now able to run some of Mooneye’s tests, which are the benchmarks for cycle accuracy. We did not pass because of small off-by-one errors, which we hope to fix with an improved timer design.
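To show where off-by-one timer errors tend to come from, here is a simplified Python model of the timer as it is usually described in the community docs: DIV is the upper byte of a 16-bit internal counter, and TIMA increments on the falling edge of one counter bit selected by TAC. This is a sketch, not our RTL. The class name is hypothetical, and it ignores details such as the one-cycle delay before TIMA reloads and the effect of writes to DIV.

```python
# Counter bit watched for each TAC frequency select value (bits 1-0 of TAC).
TAC_BIT = {0: 9, 1: 3, 2: 5, 3: 7}


class TimerSketch:
    """Hypothetical, simplified Game Boy timer model."""

    def __init__(self):
        self.counter = 0   # 16-bit internal counter, one step per clock tick
        self.tima = 0
        self.tma = 0
        self.tac = 0
        self.irq = False

    @property
    def div(self):
        return (self.counter >> 8) & 0xFF  # DIV is the counter's upper byte

    def _signal(self):
        enabled = (self.tac >> 2) & 1
        return enabled & (self.counter >> TAC_BIT[self.tac & 3]) & 1

    def tick(self):
        before = self._signal()
        self.counter = (self.counter + 1) & 0xFFFF
        if before == 1 and self._signal() == 0:  # falling edge
            self.tima += 1
            if self.tima > 0xFF:                 # overflow: reload and interrupt
                self.tima = self.tma
                self.irq = True


timer = TimerSketch()
timer.tac = 0b101  # enabled, TIMA increments every 16 clock ticks
for _ in range(16):
    timer.tick()
```

Because TIMA depends on an edge of the shared counter rather than on its own divider, anything that changes the counter or TAC mid-count can shift an increment by a cycle.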

These were really useful resources for the peripherals and the instruction timings (I was able to merge typo fixes in one of them!):

Cycle Accurate GB timings

Gameboy Complete Technical Reference 

These were the assemblers and linkers that we used to compile our tests:

RGBDS

WLA-DX

For this week we want to finish the simulation testing and go on to FPGA testing. I plan to finish the work on the PPU.

Adolfo’s Status Report for 03/29/2020

This week I finished the decoder and FSM for the CPU. Furthermore, Tess and I worked on hooking everything up to get it ready for unit testing. We did this by writing a magic memory module which will load instruction files that we generate. Currently, I am still working on getting unit testing up and running for the CPU. I found some GB assemblers which I’ll use to write tests. I also worked on a few PPU components, and I am trying to display a basic background.

For the next week, I hope to have a good testing infrastructure up and running and start the process of testing the whole CPU. Furthermore, I want to get the PPU to display a background of any color, and if I have the time perhaps a valid background from some game.

Adolfo’s Status Report 03/21/2020

During spring break and this week, I worked on implementing and fleshing out the FSMs for the multi-cycle instructions. I also created a skeleton for the decoder module, which will work with the FSMs to control the pipeline. For this, I added a new module to our datapath which I thought was necessary to reduce clutter in the FSMs. This module is the FSM manager. It will communicate with the decoder and the rest of the datapath to decide which FSM to use and which of the control signals to use.  The plan was to finish the FSM, but I got sidetracked due to recent events.

The goal for this upcoming week is to finish the CPU and start testing it. Tess and I have started looking into how to get simulation going and how to instantiate block RAM so we can start testing while the SoC component gets developed. I am confident in the progress that we have so far, and hopefully, we’ll finish everything on time.

The other thing I have been planning to pick up is the PPU work. The progress I have so far is the FSM and the skeleton of the main PPU module. I still need to flesh out the sub-modules and start looking into testing it in simulation.

Finally, I plan on going back to clean up and comment the code that is only half commented. We have to keep in mind that one of the objectives is to have a well-documented project to help out everyone else in the emulator community who wants to build a hardware emulator.

Goals for next week:

  • Finish CPU and start testing
  • Clean up and document CPU code.
  • Pick up work on PPU again.

Adolfo’s Status Update for 2/29/2020

This week I changed my focus to the design review and document. I took this as an opportunity to further understand the PPU and how it interacts with other systems. I think this helped a lot since I now have a much clearer view of how the PPU and its interactions with the rest of the system will play out. I am excited to start working on this part of the system once the CPU is done.

We decided to be proactive about the design document, so we started working on it throughout the week. To do this, we had to do further research, and we also incorporated the feedback from the presentations to flesh out parts of our project that both we and our classmates thought were not well defined, such as our metrics and evaluation. To do so, we looked at the performance of other emulators, which have had varying levels of success, and tried to pick a target from among them. We are going to use VerilogBoy as our baseline performance goal because it is the best emulator that we found in our area (hardware).

For the design document, I wrote the abstract, the design requirements, the architecture and principle of operation, the CPU sub-system description, the PPU sub-system description, and related work.

Next week we plan to go full force into implementation and have ambitious goals for the end of spring break. My personal plans for the end of spring break are as follows:

  • Get a working CPU implementation with interrupts disabled.
  • Get a working PPU implementation, try some display tests.

Failing to achieve these goals may put the success of our project into question since we still have to deal with integration.

Adolfo’s Status Update for 2/22/2020

This week I managed to get all the information we need to make the PPU. I used various sources to get a good idea of the timings and the overall architecture of the PPU and I think I’ve achieved a good understanding of its inner workings. I used the following sources:

I also browsed through some emulator forums’ threads to check for more specific architecture details about the Gameboy since these docs didn’t mention specific hardware details that could be useful for achieving maximum PPU fidelity. In the end, I decided that it may be a bit hard to get an architecturally accurate PPU representation, so I will just attempt to have one that matches the timings and expected values at each moment.

After this, this is the datapath I came up with:

A PPU “cycle” is divided into 4 stages, which give the datapath above a bit more meaning:

  • OAM Search: In this stage, the sprite fetcher will look for the sprites that are visible on the current line and put them in the sprite buffer. It scans the 40 entries in OAM and can select up to 10 of them per line.
  • Pixel Transfer: In this stage, the fetcher will start fetching tiles from memory and put them in the pixel FIFO. Since it fetches a whole tile row at a time (8 pixels), the FIFO supports pushing 8 items at once. Based on the registers (which are not shown), the fetcher may fetch sprites, background tiles, or window tiles. The pixel FIFO pushes the pixel at the front onto the screen.
  • H-Blank: This is the time between each row of the screen. The CPU usually does some work here.
  • V-Blank: This is the time between being done with the screen and starting to render the next screen. The CPU usually uses this time to modify VRAM.
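The four stages above can be sketched as a function of the current line and the position within that line. This is a minimal Python sketch, not the PPU design itself. It uses the commonly documented timings (456 dots per line, 154 lines per frame, 80 dots of OAM search) and assumes a fixed 172-dot pixel transfer, which in reality varies with sprites and scrolling. The function and constant names are made up for the example.

```python
DOTS_PER_LINE = 456        # clock ticks per scanline
VISIBLE_LINES = 144        # lines 0-143 are drawn
TOTAL_LINES = 154          # lines 144-153 are V-Blank
OAM_SEARCH_DOTS = 80
PIXEL_TRANSFER_DOTS = 172  # assumption: shortest case, treated as fixed here


def ppu_stage(line, dot):
    """Return the PPU stage for a given scanline and dot within that line."""
    if line >= VISIBLE_LINES:
        return "V-Blank"
    if dot < OAM_SEARCH_DOTS:
        return "OAM Search"
    if dot < OAM_SEARCH_DOTS + PIXEL_TRANSFER_DOTS:
        return "Pixel Transfer"
    return "H-Blank"


def dots_per_frame():
    """Total clock ticks in one full frame."""
    return DOTS_PER_LINE * TOTAL_LINES


stages_on_line_0 = [ppu_stage(0, d) for d in (0, 80, 252, 455)]
```

With a 4.19MHz clock, one frame of this length works out to roughly 59.7 frames per second.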

I will further expand on these in the design review.
My goals for next week are as follows:

  • Start implementing the PPU
  • Finish the CPU design and start implementing it.
  • Work on the design report.

Adolfo’s Status Update for 2/15/2020

This week I worked with the rest of the group on figuring out the CPU’s design. I found this document which gave accurate timings for each instruction (it is incomplete, though, as it is a work in progress). With it, we were able to draw a datapath we believe accurately represents the Gameboy’s pipeline. Afterwards, Tess and I tried to figure out the implementation details of multi-cycle instructions. We decided to solve this problem with an FSM which would control the different stages of a multi-cycle instruction. The basic idea is to have the FSM divide each multi-cycle instruction into smaller single-cycle instructions. We believe this mimics the Gameboy’s behavior best because the emulator community has observed that flags get set multiple times in the middle of a multi-cycle instruction.
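The idea of splitting a multi-cycle instruction into single-cycle steps can be sketched in Python as below. This is only an illustration: the micro-op names and the table contents are hypothetical, and only the cycle counts (1, 2, and 4 machine cycles for NOP, LD A,(HL), and PUSH BC) follow the published instruction timings.

```python
# Hypothetical micro-op names; one list entry per machine cycle.
MICRO_OPS = {
    "NOP":       ["fetch"],
    "LD A,(HL)": ["fetch", "read_mem_hl"],
    "PUSH BC":   ["fetch", "internal_delay", "write_high_byte", "write_low_byte"],
}


class InstructionFSM:
    """Steps through the single-cycle micro-ops of one instruction."""

    def __init__(self):
        self.steps = []
        self.index = 0

    def start(self, mnemonic):
        self.steps = MICRO_OPS[mnemonic]
        self.index = 0

    def tick(self):
        """Advance one machine cycle and return (micro_op, done)."""
        op = self.steps[self.index]
        self.index += 1
        return op, self.index == len(self.steps)


def run(mnemonic):
    """Return the sequence of micro-ops issued for one instruction."""
    fsm = InstructionFSM()
    fsm.start(mnemonic)
    trace, done = [], False
    while not done:
        op, done = fsm.tick()
        trace.append(op)
    return trace


push_trace = run("PUSH BC")
```

Each micro-op drives the datapath for exactly one cycle, so intermediate results such as flags can be updated between steps of the same instruction.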

On my graphics work, I did some research on the inner workings of the Gameboy PPU. My next action item based on this will be to sketch out a datapath for it so that we can talk about it in the design review and so I can get started with implementing it. Furthermore, I got the VGA controller working and tested it on the screens in the labs. It has a small bug with the bottom line of the display. I will try to figure it out sometime this week, but it’s not a priority. One thing that I will have to figure out is how to display the Gameboi’s output, since its resolution doesn’t match the 640×480 resolution of the lab’s screens. I was thinking of having each pixel of the original Gameboy’s screen cover multiple pixels on the lab screen. This is a discussion for the future.
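As a starting point for that discussion, here is a small Python sketch of integer scaling. It assumes the Gameboy’s 160×144 screen and picks the largest whole-number factor that fits in 640×480, centering the image with a border. The function and constant names are made up for the example, and nothing here is a decision we have made.

```python
GB_W, GB_H = 160, 144    # Gameboy screen resolution
VGA_W, VGA_H = 640, 480  # lab monitor resolution

# Largest integer factor that fits in both dimensions.
SCALE = min(VGA_W // GB_W, VGA_H // GB_H)
# Border sizes that center the scaled image on the VGA frame.
X_OFF = (VGA_W - GB_W * SCALE) // 2
Y_OFF = (VGA_H - GB_H * SCALE) // 2


def vga_to_gb(x, y):
    """Map a VGA pixel to the Gameboy pixel it shows, or None in the border."""
    if x < X_OFF or y < Y_OFF:
        return None
    gx, gy = (x - X_OFF) // SCALE, (y - Y_OFF) // SCALE
    if gx >= GB_W or gy >= GB_H:
        return None
    return gx, gy


corner = vga_to_gb(X_OFF, Y_OFF)  # top-left Gameboy pixel
```

With these numbers each Gameboy pixel becomes a 3×3 block, giving a 480×432 image with 80-pixel borders on the sides and 24-pixel borders on the top and bottom.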

Action items for next report:

  • Make a PPU datapath drawing
  • Get the Gameboi decoder and datapath implemented
  • Fix the VGA glitch