Andy’s Status Report for 05/08/2021

Over this past week (the last week of classes!) I finished the NES emulator port and made some bug fixes to the sprite engine. I also worked with Joseph to create the poster for our project.

This week’s portion of the emulator port focused on translating the game’s calls to the NES PPU into calls to our PPU, one PPU subsection at a time. When the emulation signals to the application that the emulated graphics card has finished drawing the frame, the emulator inspects VRAM and recompiles the NES name tables into a partial name table for our PPU. It also updates the palette and OAM settings. These updates are sent to our PPU for rendering. Pattern RAM may be updated too, but it is not recompiled on the fly and is only sent when the application actually changes it (pattern RAM is by far the largest and most expensive region to transfer). Other changes were also made to the emulation for compatibility, but those details are far less interesting and relevant.
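The end-of-frame update can be sketched in C roughly as follows. This is a minimal model, not the emulator's actual code: every type and function name is hypothetical, name table and OAM recompilation are left out because their formats are not described here, and the only hard numbers are the NES's own 8 KiB of pattern data and 32 bytes of palette RAM.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout; sizes are those of the NES itself. */
enum { PATTERN_BYTES = 8192, PALETTE_BYTES = 32 };

typedef struct {
    uint8_t pattern[PATTERN_BYTES];
    uint8_t palette[PALETTE_BYTES];
    bool    pattern_dirty;     /* set whenever the game changes pattern RAM */
} nes_vram_t;

typedef struct {
    uint8_t  pattern[PATTERN_BYTES];
    uint8_t  palette[PALETTE_BYTES];
    unsigned pattern_uploads;  /* counts the expensive transfers */
    unsigned frames;
} host_ppu_t;

/* Record a game write to pattern RAM and mark it for re-upload. */
void nes_write_pattern(nes_vram_t *v, size_t addr, uint8_t value)
{
    if (addr < PATTERN_BYTES && v->pattern[addr] != value) {
        v->pattern[addr] = value;
        v->pattern_dirty = true;
    }
}

/* Called when the emulated PPU signals that the frame is finished. */
void sync_frame(nes_vram_t *v, host_ppu_t *ppu)
{
    /* Cheap state is refreshed every frame. */
    memcpy(ppu->palette, v->palette, PALETTE_BYTES);

    /* Pattern RAM is only copied when the game actually changed it. */
    if (v->pattern_dirty) {
        memcpy(ppu->pattern, v->pattern, PATTERN_BYTES);
        ppu->pattern_uploads++;
        v->pattern_dirty = false;
    }
    ppu->frames++;
}
```

The dirty flag is what keeps the large transfer off the per-frame path: a game that never rewrites its pattern data pays for the upload at most once.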

After bringing up the video output for the emulator, I noticed some bugs in the sprite engine hardware that had to be addressed. Notably, the sprite engine embarrassingly refused to draw more than one sprite per scanline due to a misunderstanding of what the $right() function in SystemVerilog does (I expected it to grab the least significant set bit; it actually returns the right-hand index bound of the dimension, so I just got bit 0). Additionally, a more obscure bug caused the sprite engine to ignore the last sprite defined in OAM at all times, which caused some flickering issues in NES games (the reasons make perfect sense but require understanding how NES games and hardware interact).
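The gap between "bit 0" and "the least significant set bit" is easy to see in a small C model. The function names below are made up for illustration, and what the mask represents inside the sprite engine is not stated in this report.

```c
#include <stdint.h>

/* Bit 0 of the mask: 1 only when the lowest bit happens to be set. */
unsigned bit_zero(uint16_t mask)
{
    return mask & 1u;
}

/* Index of the least significant set bit, or -1 if no bit is set. */
int lowest_set_index(uint16_t mask)
{
    if (mask == 0)
        return -1;
    int i = 0;
    while (((mask >> i) & 1u) == 0)
        i++;
    return i;
}

/* The least significant set bit isolated as a one-hot mask. */
uint16_t lowest_set_onehot(uint16_t mask)
{
    return (uint16_t)(mask & (0u - mask));
}
```

For a mask such as 0x68, bit 0 is clear, while the least significant set bit sits at index 3. Logic that only ever looks at bit 0 never sees it.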

With all these issues fixed, a number of games work well enough on the FP-GAme port of my NES emulator to make for a fun and exciting demo. Notably, Mega Man 1/2, Metroid, and The Legend of Zelda work very well. Games like Super Mario Bros and Castlevania rely on a hardware feature of the NES’s PPU that ours does not support, namely the changing of the horizontal scroll value of the background layer in the middle of the frame. This means those games function, but have a graphical glitch that causes their menu items to “shake” as the game scrolls (the actual problem is that the menu graphics can only move in 8-pixel increments and the rest of the screen can move 1 pixel at a time). We plan to demo the games that work well so as to not confuse our audience with problems caused by emulation issues.

Over the course of this last week, we will create the final video and put the finishing touches on the final report.

Andy’s Status Report for 05/01/2021

This week, I was able to finish the sprite engine. Then, Joseph and I worked together to debug it and the library interface that is used to communicate with it. As seems to always be the case, debugging took more time than I expected. However, the sprite engine and library are now fully operational.

After some discussion at the beginning of the week, we decided to move away from the notion of making an entire game for our system. Instead, Joseph has made a small tech demo, and I’m porting an NES emulator that I wrote a few years ago to our system (while having it use our graphics card for hardware acceleration, so that it shows off the capabilities of our console and library).

As such, after the sprite engine was finished, the remainder of my time has been spent working on the port. At the moment, the emulator itself can be built for our console and against our library, and the games will load and run. The audio and controller interfaces have been successfully moved from SDL2 to the FP-GAme library. Unfortunately, I haven’t finished the graphics card port yet (so I can hear Mega Man 2 and navigate the menus, but not see it; that’s fine, the music was the best part of that game anyway), but I didn’t expect to, so I’m still on schedule. The port will need to include functions to translate calls to the NES graphics card into calls to our PPU, so it’ll take a bit more work than the other pieces. My current goal is to have that done before Sunday night, so that we can include it in the presentation.

As part of my work on porting the emulator, I was able to set up a more consistent build environment for user mode C/C++ applications (oh yeah, our library works with C++ now because that’s what my emulator was written in). Before, we had been building all of our user mode programs against the C standard libraries installed on whatever system we were building on. That isn’t a great setup, so we now build against the libraries provided by our cross compiler.

Over the course of this coming week (the last week!), I’ll be finishing the NES emulator port and whatever else needs finishing (documentation, small tests, etc).

Andy’s Status Report for 04/24/2021

These past two weeks, aside from the ethics assignment, I focused on fixing the APU (which happened very early in the first week) and designing and implementing the sprite engine.

I’ve attached a diagram of the internal logic of the sprite engine. It’s currently roughly 80% done, and I expect to finish the remainder over this weekend. On the diagram, everything is implemented except for the OAM scanner and memory port wrappers.

The sprite engine itself works as follows. The engine will receive a signal from the tile engines that they have finished buffering their data, and that the sprite engine may now access the M10Ks, which function as VRAM. The OAM scanner will then begin to scan OAM (Object Attribute Memory) for sprite data that corresponds to the current scanline being rendered. The first sixteen pieces of this data found will be sent to the sprite manager.
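As a rough software model of the OAM scan, the loop below walks OAM in order and keeps the first sixteen entries that overlap the current scanline. The entry layout is hypothetical (the real OAM format is not given in this report), and the hardware scans over several clock cycles rather than in a single loop.

```c
#include <stddef.h>
#include <stdint.h>

enum { MAX_SPRITES_PER_LINE = 16 };

/* Hypothetical OAM entry; the real layout is not described here. */
typedef struct {
    uint16_t x, y;     /* top-left corner on screen */
    uint8_t  height;   /* in pixels */
    uint8_t  tile;     /* index into pattern memory */
} oam_entry_t;

/* Stores the indices of the first 16 entries that overlap `scanline`
 * into `out` and returns how many were found. */
size_t oam_scan(const oam_entry_t *oam, size_t n, uint16_t scanline,
                size_t out[MAX_SPRITES_PER_LINE])
{
    size_t found = 0;
    for (size_t i = 0; i < n && found < MAX_SPRITES_PER_LINE; i++) {
        unsigned top  = oam[i].y;
        unsigned line = scanline;
        if (line >= top && line - top < oam[i].height)
            out[found++] = i;
    }
    return found;
}
```

Note that the loop bound covers every entry, including the final one in OAM.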

The sprite manager, on receiving an Object Attribute, will ask the scanner to halt while it accesses pattern memory for the sprite’s visual data. It will then send its data to the sprite file, which is drawn in the diagram as 16 sprite units and a sprite tournament mux.

The sprite units are linked in a chain, where the last unit in the chain is connected to the sprite manager. After receiving a valid sprite, each sprite unit will send the data down the chain until this is no longer possible. Having the sprite units connected in a chain means we don’t need a large number of muxes to allow the sprite manager to write to each unit individually, and the cycle delay of things moving down the chain is invisible to the rest of the board, as the sprite tournament logic will hide it. The sprite tournament logic is simply a special mux that chooses a sprite based on its visual attributes.
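The chain and the tournament can be modeled in C along these lines. This is only a behavioral sketch under stated assumptions: the field names are invented, color 0 is assumed to mean transparent, and the tournament rule used here (lowest-numbered unit with an opaque pixel wins) is a stand-in, since the actual selection criteria are not spelled out above.

```c
#include <stdbool.h>
#include <stdint.h>

enum { UNITS = 16 };

typedef struct {
    bool     valid;
    uint16_t x;       /* left edge of the sprite on this scanline */
    uint8_t  width;
    uint8_t  color;   /* 0 is treated as transparent (assumption) */
} sprite_unit_t;

/* The manager writes into the last unit; fails while it is occupied. */
bool chain_push(sprite_unit_t u[UNITS], sprite_unit_t s)
{
    if (u[UNITS - 1].valid)
        return false;
    s.valid = true;
    u[UNITS - 1] = s;
    return true;
}

/* One clock: each sprite moves one slot toward unit 0 if that slot is
 * free. Walking from the front moves every sprite at most once. */
void chain_step(sprite_unit_t u[UNITS])
{
    for (int i = 0; i + 1 < UNITS; i++) {
        if (!u[i].valid && u[i + 1].valid) {
            u[i] = u[i + 1];
            u[i + 1].valid = false;
        }
    }
}

/* Tournament mux: returns the winning unit for pixel x, or -1. */
int tournament(const sprite_unit_t u[UNITS], uint16_t x)
{
    for (int i = 0; i < UNITS; i++) {
        if (!u[i].valid || u[i].color == 0 || x < u[i].x)
            continue;
        unsigned dx = (unsigned)x - u[i].x;
        if (dx < u[i].width)
            return i;
    }
    return -1;
}
```

Because the tournament only looks at which units are valid and in what order, it does not matter how far along the chain a sprite has traveled, which is one way to read the claim that the chain delay stays hidden.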

The sprite engine is the last piece of hardware we need to implement. After it’s done, we’ll just need to do some more tests and finalize our documentation before working on the game. I expect work on the game to begin around the middle of this coming week.

Sprite Engine Diagram: Sprite Engine V2

Andy’s Status Report for 04/10/2021

The plan for this week had been for me to finish up the APU and then work on the sprite engine. Unfortunately, I hit a hard wall with communication between the APU and the CPU. Some progress has been made on that front, and I’m now able to send a 1 kHz sine wave from a C program through the APU driver to the APU. Unfortunately, some corruption issues are preventing non-static data from being sent through.

Still, this does mean that the APU is fully written, just not fully debugged. The user space library for the APU is complete and tested; sending signals to a user process as a kind of user-mode interrupt works great. The kernel module is mostly written, but has the aforementioned corruption issues. The hardware, after some intense debugging and scrutiny from both Joseph and myself, seems to be fully operational.
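On the user side, receiving a signal as a "user-mode interrupt" looks roughly like the sketch below. It uses the standard POSIX sigaction call; the choice of SIGUSR1 and all function and variable names are assumptions, since the signal the APU library actually uses is not stated here.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set from the handler; sig_atomic_t is safe to write in signal context. */
volatile sig_atomic_t apu_needs_samples = 0;

/* Keep the handler tiny: just note that more samples are wanted. */
void on_apu_signal(int signo)
{
    (void)signo;
    apu_needs_samples = 1;
}

/* Install the handler. SIGUSR1 is an assumption made for this sketch. */
int apu_install_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_apu_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;  /* restart interrupted system calls */
    return sigaction(SIGUSR1, &sa, NULL);
}
```

The main loop would then poll `apu_needs_samples`, clear it, and refill the audio buffer outside of signal context.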

I suspect that the issue is with the kernel module, and not the user space library or test program. It seems likely that I haven’t set up the APU kernel buffer for DMA correctly, and so that will be where I investigate next. The running theory is that our level 1 cache isn’t being flushed, and so we’re only getting some of the samples out of the kernel buffer into the APU, and are thus seeing corruption.

Thanks to the slack time we allocated for the end of this semester, there’s still a good chance that we’ll be able to get everything done on time. The plan for this coming week is for me to fix the APU and then deal with the sprite engine in any remaining time I have. After that, I’ll have a week to finish the sprite engine and two to work on the test game.

Andy’s Status Report for 04/03/2021

This week, I finished up the controller module (which now fully works) and made significant progress toward a finished audio implementation. Work on the sprite engine hasn’t begun yet, so we are still behind (arguably more so), but we’ve both increased the amount of time we’re spending on this project. All things considered, I’d still say this week went well and I’m optimistic that we’ll be able to catch up.

As I projected in my previous report, I was able to finish the controller kernel module over that weekend. The changes necessary to get the controller module working were largely uninteresting: I simply had to build a new Linux kernel and swap it in for the one the provided Linux image used. Then, there were some bug fixes for the kernel module and the user space library functions. All of this was completed over the weekend, and the controller library works as intended.

For audio, so far the actual device is fully described in Verilog and has undergone a reasonable amount of testing. I was successfully able to play a 1 kHz sine wave using the APU. With the hardware itself working, my main focus now is on bringing up the software side of the APU. The communication between the CPU and the APU is set up, but untested. Around half the kernel module is done, and the user space portion of the APU library is specified but unwritten. All told, I’ve got ~400 lines of C code to look forward to, but I’m no stranger to that :). Hopefully, I can get most of it done over the weekend and dedicate some real time to bringing up the sprite engine next week. I’ve got some grading work to do for OS over the weekend, though, so that will probably eat up quite a bit of my time, unfortunately.

I don’t have anything super interesting to share this week. Next week, I should have a video of a working audio demo (that isn’t just a test sine wave). For now, here are the Verilog files for the APU. Note that I wound up not using their premade I2S module because it was trash. Instead, I threw together something with a much nicer interface.

APU + I2S files: https://drive.google.com/file/d/14EN3CveBKY3m0okfWgywr5-sRoCuts16/view?usp=sharing

 

Andy’s Status Report for 03/27/2021

Over the past few weeks, I’ve focused on understanding and implementing a kernel driver for our system. In our original schedule, this task was supposed to take around a week. That has turned out not to be the case due to me running a bit behind and complications with the implementation of the kernel driver.

 

What I do have is a user space front end to the controller kernel driver and a kernel space driver file that has not been compiled yet. Theoretically, this should work fine, but I have been unable to compile the kernel module due to complications with building against the pre-built kernel provided by Terasic. As far as I can tell, they do not provide the tools necessary to build against the provided kernel, and so a new one must be built from scratch and supplied to our board. I’ve begun this process, and hope to have the kernel built and booting by the end of today. If all goes well, I’ll be able to jump straight into testing the controller module tonight/tomorrow.

 

Due to the excessive and frustrating amount of time it has taken to write my first kernel module, audio has been pushed back to the body of this coming week. I don’t anticipate audio taking much time, as Joseph has a firm understanding of what would be the hardest part (communication with DDR3 and the CPU). Aside from this, it will be a relatively simple FSM that reads from memory and sends data over I2S. I believe the driver will be simple as well, considering I’ve learned some useful tools while reading up for the controller driver (e.g., I can create a device file and arrange it so that the write system call sends samples to the kernel driver).
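From user space, pushing samples through a device file with the write system call could look like the sketch below. The helper name is made up, and any device path used with it (for example a hypothetical /dev/fp_game_apu) is an assumption, since the driver had not been written at this point.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Write all n bytes to fd, retrying on short writes and EINTR.
 * Returns 0 on success and -1 on error (errno is left set). */
int write_all(int fd, const uint8_t *buf, size_t n)
{
    while (n > 0) {
        ssize_t w = write(fd, buf, n);
        if (w < 0) {
            if (errno == EINTR)
                continue;      /* interrupted by a signal: try again */
            return -1;
        }
        buf += w;              /* skip past what the driver accepted */
        n   -= (size_t)w;
    }
    return 0;
}
```

A caller would open the device file once with open(path, O_WRONLY) and then hand each block of samples to write_all. Handling short writes matters here because a driver with a fixed-size kernel buffer may accept only part of a request.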

 

Once audio has been finished, I’ll be working on the sprite engine.

 

Drafts of the user space and kernel space implementation of the controller driver are available here:

https://drive.google.com/file/d/1LZ3EGkWE5TbSmbO2-qg8U7c6oalinqTu/view?usp=sharing

Andy’s Status Report for 03/13/2021

This week, my main task was to create the memory mapped I/O necessary to send controller input from the FPGA to the ARM core. In our design, we chose to use the lightweight AXI bus of the Cyclone V FPGA to accomplish this. Additionally, I set up my FPGA with a Linux image that allows for a UART console and wrote a brief controller read program to test the communication. Finally, I set up an ARM cross-compilation development environment on my computer, using the arm-none-linux-gnueabihf toolchain provided by Arm, in order to compile the controller test program.

 

Under the current settings, the controller input is available to the ARM core at the base address of the lightweight AXI memory mapped I/O space (address 0xFF200000). The controller test uses /dev/mem to mmap this address into its address space and read from it, then prints the results to the console.
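A minimal sketch of that /dev/mem approach is shown below. Only the base address 0xFF200000 comes from this report; the helper name, the 32-bit register width, and the idea that the controller state sits in the low 16 bits are assumptions. The path is a parameter so the same helper can be exercised against an ordinary file.

```c
#define _FILE_OFFSET_BITS 64   /* base address does not fit a 32-bit off_t */
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define LW_AXI_BASE 0xFF200000u   /* lightweight AXI base, per the report */

/* Map `len` bytes of `path` starting at `offset` (page aligned) as
 * read-only registers. Returns NULL on failure. */
volatile uint32_t *map_registers(const char *path, off_t offset, size_t len)
{
    int fd = open(path, O_RDONLY | O_SYNC);
    if (fd < 0)
        return NULL;

    void *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, offset);
    close(fd);   /* the mapping stays valid after the descriptor closes */

    return p == MAP_FAILED ? NULL : (volatile uint32_t *)p;
}
```

On the board this would be called as map_registers("/dev/mem", LW_AXI_BASE, 4096) from a process with root privileges, followed by something like regs[0] & 0xFFFF to read the controller state. The volatile qualifier keeps the compiler from caching a register read.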

 

Due to my unfamiliarity with Qsys, bringing up this communication took longer than I expected. When all was said and done, I did manage to get communication working, though I ran out of time and was unable to finish the controller section of our interface. That would require wrapping it in a kernel module with a call safely exposed to the user, plus definitions to ease parsing buttons. Because of this, I’m slightly behind, but I anticipate being able to catch up again next week.

 

A video of me running the controller test is available here: https://www.youtube.com/watch?v=jU2mdtBN-I0

 

The controller test:

https://drive.google.com/file/d/1U8mn_7gTbG2sWP3rkUN0vx-6YtHiBfZu/view?usp=sharing

Andy’s Status Report for 03/06/2021

This week, I documented the internal structure of the audio and controller modules, and created block diagrams for those modules. I also detailed the controller FSM to aid in the implementation of the controller module.

On the subject of the controller module, I implemented and tested that this week. The protocol has been detailed in previous reports, but essentially the module is a timer running at 60Hz and a clock divider that generates the clock for the controller. Once the timer fires, the FSM waits to sync up with the controller clock and then begins the input request protocol. As discussed before, the controller is wired over GPIO. In the tester, the current input state of the controller is displayed on the LEDs of the DE10-Nano. Since the DE10-Nano only has 8 LEDs and the controller state is 16 bits wide, switch 0 selects whether the upper or lower byte of the controller state is shown. After a few bugs and re-familiarizing myself with SystemVerilog, it works like a dream. The LEDs light up in response to my input with no perceptible delay. Unfortunately, a few of the wires for my controller port fell apart while I was moving the board, so I didn’t get the chance to capture video of the demo (at any rate, the LEDs are small and probably wouldn’t show up well anyway).
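The LED view and the timer arithmetic are simple enough to model in C. The real logic lives in SystemVerilog; the function names here are invented, which switch position shows which byte is an assumption, and the 50 MHz figure in the note that follows is only an example input, not a statement about the clock this design uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pick which byte of the 16-bit controller state drives the 8 LEDs.
 * Assumption: switch 0 up shows the upper byte, down shows the lower. */
uint8_t led_view(uint16_t state, bool sw0)
{
    return sw0 ? (uint8_t)(state >> 8) : (uint8_t)(state & 0xFFu);
}

/* Number of source clock ticks between events at `event_hz`,
 * rounded down. This is the value a timer would count to. */
uint32_t ticks_per_event(uint32_t clk_hz, uint32_t event_hz)
{
    return clk_hz / event_hz;
}
```

With a 50 MHz source clock, for instance, a 60Hz timer would fire every 833,333 ticks.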

I’ve attached the controller and audio diagrams, as well as the controller module implementation and test source code. Note that we’re managing our code on a private GitHub repository, and the tossing of zips is just to ease distribution over our blog.

Diagrams:

Audio

Controller

Controller test source code:

https://drive.google.com/file/d/1dZzLapQ6Y364X1jte-TLIdnNxRoFoYm6/view?usp=sharing

Andy’s Status Report for 02/27/2021

This week, aside from helping with the presentation and discussing the overall plan with Joseph, I worked on research for the Audio, Input, Software, and Communication systems. I’ve attached a copy of my notes.

Suffice it to say, the current plan for audio is to have a single channel of 8-bit PCM audio at 32 kHz, which will be representative of the GBA in sound power. Note that our users will be able to produce better sound, though, as they aren’t bound to a 16MB cartridge. Audio will be output by sending samples supplied by the CPU over I2S to the HDMI output. The interface itself hasn’t been designed yet, as I’m not sold on how it should be done specifically, so I’m a touch behind. I plan to get that done as soon as possible, and I’m confident I can be back on track by the end of next week.

For input, we plan to use a SNES controller over GPIO. I go over the details of the SNES controller communication protocol in my notes. It’s a simple protocol that’s clocked well below what our FPGA will be clocked at. It should be relatively simple to bring up, as long as we can find controller ports.
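For a sense of how simple the protocol is, here is a C sketch of decoding one poll. After the latch pulse, a standard SNES pad shifts out 16 bits on its data line, one per clock, active low, in the order listed in the enum. How the bits are packed into a 16-bit word is an assumption made for this example.

```c
#include <stdint.h>

/* Button order as shifted out by a standard SNES pad, first bit first.
 * The remaining four bits of the 16 are unused on a standard pad. */
enum {
    SNES_B, SNES_Y, SNES_SELECT, SNES_START,
    SNES_UP, SNES_DOWN, SNES_LEFT, SNES_RIGHT,
    SNES_A, SNES_X, SNES_L, SNES_R
};

/* line[i] is the data-line level sampled on the i-th clock after latch.
 * The line is active low, so a level of 0 means "pressed".
 * Bit i of the result is 1 when button i is pressed. */
uint16_t snes_decode(const uint8_t line[16])
{
    uint16_t state = 0;
    for (int i = 0; i < 16; i++) {
        if (line[i] == 0)
            state |= (uint16_t)(1u << i);
    }
    return state;
}
```

Checking a button is then a single mask test, such as state & (1u << SNES_START).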

For software, I took another look at the C standard and realized that our requirement of providing it to the user is completely infeasible if we are also creating our own kernel. The C standard has a lot to say about what an OS should provide, and we won’t be providing much of it. I’ve narrowed it down to a useful subset, which is listed in my notes.

For communication, I spent way too much time trying to understand how the FPGA and HPS communicate with each other. From what I can gather, the last GB of the HPS’ address space is mapped to the HPS-to-FPGA AXI bridge, and reading from/writing to it will send a signal to the FPGA that it can interpret and respond to. I think I’ll have to run some experiments when I get my board to convince myself of this. I’ll also need to spend some more time with the Cyclone V manual, it seems.

Finally, I also wrote a small C program to convert 16-bit 32 kHz mono Microsoft PCM wave audio to pure 8-bit samples. It works with all the wave files I’ve exported from Audacity in this format, though I didn’t rigorously test it. I did this so that once the audio system is brought up, we have working audio to make a direct comparison against. The program will also output 8-bit audio in the wave format, so that it can be played on a computer and compared to our FPGA’s output. The source code for the program has been attached as well, though I’ve not tried building it on Windows/Mac.
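The core of such a conversion is a one-line mapping per sample. The sketch below is not the attached program. It assumes the output should be unsigned 8-bit, which is the convention for 8-bit data in wave files; whether the "pure" samples sent to the hardware are signed or unsigned is not stated here, and the function names are made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert one signed 16-bit sample to unsigned 8-bit by shifting the
 * range from [-32768, 32767] to [0, 65535] and keeping the top byte. */
uint8_t pcm16_to_u8(int16_t s)
{
    return (uint8_t)((s + 32768) >> 8);
}

/* Convert a block of n samples. */
void pcm16_block_to_u8(const int16_t *in, uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = pcm16_to_u8(in[i]);
}
```

Silence (0) maps to 128, the midpoint of the unsigned range. The low byte of each sample is simply discarded, so this is truncation with no dithering.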

Program source

Design Notes

Andy’s Status Report for 02/20/2021

This past week, outside of our group meetings, I looked into what would become our requirements for the processor in our handheld. Previously, all we really had nailed down was that we wanted an Arm processor clocked at 16MHz or higher. We can now specify some more detail on that.

 

Based on our project’s overall goal of providing a safe development environment, we decided to create a small software kernel for our console. This means our processor must provide the tools to implement such a kernel, namely memory protection. Arm processors do offer memory protection, but as an optional component, so we’ll need to ensure that if we use an Arm processor it offers memory protection. We must also ensure that any processor we use allows us to externally send interrupts to it. This will become important when managing our graphics and sound output. Finally, we’ll need a timing register to allow our kernel to support some basic timing facilities. Currently, I think our best bet is to simply expose a configurable timing interrupt to the user. Such an interrupt would hand off to a user-provided function at a specified interval and allow the user to perform time-sensitive operations. Doing so would allow the user to track the passage of time and implement time-based features like multi-threaded code, should they wish to. Due to timing constraints, providing the user with a full threading library is outside the scope of our project.

 

Additionally, I was able to get firmer requirements on the memory of our console. The GBA offers 256KB of slow RAM + 32KB of fast RAM, with the latter often used to cache full Arm instructions, as it was the only RAM in the console to use a 32-bit bus. Since our project is aiming to provide a similar experience to consoles like the GBA, we must provide at least 512KB of RAM. Since we’ll be including a kernel with our console, we should provide more than that, likely closer to a megabyte. In any case, I doubt this requirement will be difficult to meet. The more interesting one is the long term storage requirement. GBA cartridges ranged in size from 4MB to 32MB. We’ll need to ensure that our console offers long term storage in this range, and preferably said storage will not be unbearably slow. One option could be to ensure that we find an FPGA which offers at least this much flash storage.

 

At this moment, we’re right on schedule. In this next week, we will finish off our proposal presentation and research the hardware we will use for our project.

 

The basic GBA hardware information was obtained here:

https://www.copetti.org/writings/consoles/game-boy-advance/

 

The cartridge size information was obtained from here:

https://projectpokemon.org/home/forums/topic/22987-what-are-all-possible-rom-sizes-for-clean-dumps-gb-gbc-gba-nds/