I recently wasted a good chunk of a weekend trying to debug a problem, and I feel like ranting about that a bit. Let me set the scene:
The EEPROM
The Project:65 homebrew computer stores its firmware in a 28c65 EEPROM. The firmware includes things like the system reset routine, device drivers, a command-line monitor, and some useful library routines.
When I started this project, chips like the 28c65 were readily available–I got mine from either Digikey or Mouser–but in the last few years new chips have gotten scarce. If you want one today, you have to go to eBay and cross your fingers that what you get is what the listing says it is.
That’s a concern for me, since I only have one of these chips and I’ve already plugged it in backwards on at least one occasion. I pulled the chip out in order to burn a new firmware image, stuck it back in the wrong way around, and I didn’t even realize what had happened until I smelled burning plastic. Surprisingly, the 28c65 still worked after it had cooled down, and has continued to work ever since.
After that I wanted to avoid taking the chip out of the breadboard any more often than I had to, so I added a way to reprogram it in situ. I talked about the details way back in this post, but the only hardware modification needed was to add a “write protect” switch, which connected the 28c65’s Write Enable pin to either +5 volts or to the Write Enable signal. (I talked about the WE signal logic quite a bit in the previous installment.) I also wrote a program called “insitu” that would download a firmware image using XModem and write it to the EEPROM. Of course, it was still possible for a buggy firmware to brick the computer, forcing me to pull the EEPROM and reprogram it externally, but this was still a big help.
After my last post about improvements to the P:65’s expansion board, I decided it was time for another firmware update. Mainly, I wanted the computer to reset the expansion board to a known state during reset. I also had some improved I/O routines that I’d been working on, so they got included in the update. Unfortunately, I messed up the layout of the new image such that a bunch of important things–like the system reset vector–were shifted three bytes from where they were supposed to be. When I tried to reboot after the firmware update, the P:65 was completely stuck.
The Programmer
This was an annoyance, of course, but it was also an opportunity, because I’d recently upgraded my homebrew EEPROM programmer, and I hadn’t had the chance to do anything useful with it yet.
The first version of the programmer, described here and here, was put together using an Arduino Uno and a pair of MCP23017 port expanders. It also used an Arduino ethernet shield to provide an SD card for storing firmware images. In the years since then I’ve made some minor changes, like updating the design to support the 28c256 (a 32 KB EEPROM), and recently I ditched the SD card in favor of downloading images via XModem (the same way the “insitu” program does).
The biggest change, though, happened this past November, when I swapped out the original breadboard version of the programmer circuit for a PCB. This was my first attempt at PCB design, so I made sure to keep it very simple: Just a small board with room for the 23017s, a ZIF socket, and a connector to hook up the Arduino. When the boards arrived from JLCPCB a few weeks later I soldered on some sockets and did some very basic testing. At any rate, I made sure that I could write an image to an EEPROM using the new PCB version of the programmer, and then read the same image back out. Then I got distracted by the holidays and other projects and set it aside until I needed it.
Now that I needed it, I quickly fixed the glitch in my new firmware image, pulled the 28c65 out of the P:65, and rewrote it. And this new version didn’t work either.
Only mildly concerned, I pulled the 28c65 again and burned it with an older firmware image that I knew was good. And that one didn’t work, either.
The Debugging
Things deteriorated rapidly at that point. I had been kind of fumble-fingered while extracting the EEPROM from the P:65 breadboard, so I thought the most likely problem was that one of the wires of the address bus or data bus had come loose–or at least had been disturbed enough to have a dodgy connection with the twelve-year-old breadboard.
An initial check for loose wires didn’t turn anything up, but the computer was acting in an inconsistent sort of way that didn’t feel like a simple software problem. For example, I have a set of LEDs attached to the address bus, and during a normal reset those LEDs blink in a familiar pattern. Now, during some resets they would blink randomly before stopping; other times, they would not change at all. The blinkenlights attached to the expansion board would sometimes change during a reset, and sometimes not.
I hooked up my logic analyzer to the data bus, but that didn’t provide much illumination either. I expected to see the CPU reading the reset vector location out of a table at the top of the ROM, but I never saw the expected value ($00 followed by $E0).
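For anyone wondering where that expected value comes from: the 6502 fetches its reset vector from $FFFC (low byte) and $FFFD (high byte). With the ROM mapped at $E000-$FFFF, those two bytes sit near the very end of the 8 KB image. The snippet below is only a sketch of that arithmetic. The helper names are made up for illustration and are not taken from the P:65 firmware.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: ROM occupies $E000-$FFFF, as in the P:65 memory map.
constexpr uint16_t ROM_BASE = 0xE000;
// The 6502 reads its reset vector from $FFFC (low byte) and $FFFD (high byte).
constexpr uint16_t RESET_VECTOR = 0xFFFC;

// Offset of a CPU address inside the ROM image (hypothetical helper).
constexpr uint16_t rom_offset(uint16_t cpu_address) {
    return static_cast<uint16_t>(cpu_address - ROM_BASE);
}

// Combine the two bytes in the order they appear on the data bus.
constexpr uint16_t vector_target(uint8_t low, uint8_t high) {
    return static_cast<uint16_t>(low | (high << 8));
}

// Sanity checks for the values quoted in the text.
void check_reset_vector() {
    assert(rom_offset(RESET_VECTOR) == 0x1FFC);   // third- and fourth-to-last bytes of the image
    assert(vector_target(0x00, 0xE0) == 0xE000);  // $00 then $E0 means "start at $E000"
}
```

So seeing $00 followed by $E0 on the data bus right after reset would have meant the CPU was being pointed at the start of the ROM.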
I thought connecting the logic analyzer to the address bus might be more informative, but I just didn’t have an easy way to get at those lines with the logic analyzer probes.
After debugging late into the night, and for most of the next day, I was about at the end of my rope. I was actually working myself up to the idea that I was going to have to dismantle the entire memory breadboard and start over from scratch, so that I could test everything in a more systematic way.
It was a really good time to stop and think.
The Assumption
Throughout my debugging, I had it in my head that the problem was with the computer, and not with the EEPROM Programmer. Why was that? Well, it had been a few months since I’d put together the new PCB version, and I remembered that I had tested it… but I didn’t actually remember how thoroughly I’d tested it, or how thoroughly I hadn’t.
Suppose there was a problem with the programmer, some kind of problem that hadn’t been triggered in my testing. What could I do to validate that? I’d need another programmer to test against. Which, actually, I kind of did.
When I built the new PCB programmer, I stuck the old breadboard version in a drawer, intact except for the MCP23017s that I scavenged for the new one. I’d been intending to reuse it, but then I got a stack of new breadboards for Christmas, so I hadn’t needed it yet (Thanks, Mom!). All I had to do was swap the MCP23017s over, reprogram the EEPROM with the breadboard programmer, and test it out…
And the P:65 lived again.
Well.
So, what kind of problem could cause the PCB EEPROM programmer to be able to write an image and read the image back with perfect accuracy, only for the image to not work when read from another board? Well, there are actually lots of ways for that to happen. For example, swapping two of the data lines connecting the MCP23017s to the EEPROM would cause this sort of behavior. I loaded up the schematics for the PCB and, because I was looking for it, I saw it right away. It wasn’t just two lines. The entire lower byte of the address bus was connected backwards from how I had intended it. I could kick myself for not noticing it sooner.
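Here is a rough desktop model of that failure mode, in case it helps to see it spelled out. It is not the programmer's real code: the function names, the in-memory "chip", and the test pattern are all assumptions made up for this sketch. The point is that when writes and reads go through the same scrambled address lines, the scrambling cancels out, so a write-then-verify test passes even though the bytes are sitting in the wrong cells.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t ROM_SIZE = 8192;  // 8 KB, the size of the 28c65
using Rom = std::array<uint8_t, ROM_SIZE>;

// Model of the wiring error: bit i of the low address byte lands on line 7-i.
uint16_t miswired(uint16_t address) {
    uint16_t low = 0;
    for (int bit = 0; bit < 8; ++bit) {
        if (address & (1u << bit)) {
            low = static_cast<uint16_t>(low | (1u << (7 - bit)));
        }
    }
    return static_cast<uint16_t>((address & 0xFF00) | low);
}

// The programmer writes through the scrambled address lines...
void programmer_write(Rom& chip, const Rom& image) {
    for (std::size_t a = 0; a < ROM_SIZE; ++a) {
        chip[miswired(static_cast<uint16_t>(a))] = image[a];
    }
}

// ...and reads back through the very same lines.
bool programmer_verify(const Rom& chip, const Rom& image) {
    for (std::size_t a = 0; a < ROM_SIZE; ++a) {
        if (chip[miswired(static_cast<uint16_t>(a))] != image[a]) return false;
    }
    return true;
}

void demonstrate_miswiring() {
    Rom image{};
    for (std::size_t a = 0; a < ROM_SIZE; ++a) {
        image[a] = static_cast<uint8_t>(a & 0xFF);  // made-up test pattern
    }
    Rom chip{};
    programmer_write(chip, image);
    assert(programmer_verify(chip, image));  // the programmer's own check passes
    assert(chip != image);                   // but a correctly wired board sees garbage
}
```

A test that would have caught this is anything that reads the chip through a second, independent path, which is effectively what the P:65 itself ended up doing.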
Luckily, this is the kind of hardware problem that you can actually work around in software. In this case, I just had to tweak the code in the Arduino program that controls the programmer, so that it would reverse the bits in the lower byte of the address before writing it to the MCP23017. It’s just a few lines of code:
byte low_byte = address & 0x00FF;
// Reverse the bits to work around an error in the PCB.
// Fix this in version 2.2 of the programmer PCB!
low_byte = ((low_byte & 0x000f) << 4) | ((low_byte & 0x00f0) >> 4);
low_byte = ((low_byte & 0x0033) << 2) | ((low_byte & 0x00cc) >> 2);
low_byte = ((low_byte & 0x0055) << 1) | ((low_byte & 0x00aa) >> 1);
With this change in place, the PCB programmer now produces the same results as the breadboard programmer. And there was much rejoicing.
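If you want to convince yourself that those three mask-and-shift lines really do reverse a byte, it is cheap to check all 256 values on a desktop machine. The version below is a sketch under a couple of assumptions: it uses uint8_t in place of the Arduino byte type, and the function names are mine rather than anything in the eeprom-programmer repository.

```cpp
#include <cassert>
#include <cstdint>

// Same three swaps as the Arduino snippet: nibbles, then bit pairs, then neighbors.
uint8_t reverse_bits(uint8_t x) {
    x = static_cast<uint8_t>(((x & 0x0F) << 4) | ((x & 0xF0) >> 4));
    x = static_cast<uint8_t>(((x & 0x33) << 2) | ((x & 0xCC) >> 2));
    x = static_cast<uint8_t>(((x & 0x55) << 1) | ((x & 0xAA) >> 1));
    return x;
}

// Straightforward bit-by-bit reference to compare against.
uint8_t reverse_bits_slow(uint8_t x) {
    uint8_t result = 0;
    for (int bit = 0; bit < 8; ++bit) {
        if (x & (1u << bit)) {
            result = static_cast<uint8_t>(result | (1u << (7 - bit)));
        }
    }
    return result;
}

void check_reverse_bits() {
    for (int value = 0; value < 256; ++value) {
        uint8_t x = static_cast<uint8_t>(value);
        assert(reverse_bits(x) == reverse_bits_slow(x));
        assert(reverse_bits(reverse_bits(x)) == x);  // reversing twice is a no-op
    }
}
```

Because the reversal undoes itself, the same correction works whether the programmer is writing to an address or reading from it.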
The Assets
Over the last few months I’ve been trying to get the source code for these projects uploaded to GitHub, so that curious readers can check them out:
- project65: This repository includes source for the P:65 computer’s firmware and several programs, including the “insitu” firmware updater. No schematics yet, but those should be coming soon.
- eeprom-programmer: This repository includes the Arduino project for the programmer, the breadboard layout, and schematics and Gerbers for the PCB version.
Memory Map
I got a question about what the current memory map for the P:65 looks like, so here it is:
$0000-$7FFF RAM (32 KB CY7C199)
$8000-$9FFF Unused (8 KB)
$A000-$BFFF Expansion Board
$C000-$DFFF I/O: 6522 VIA (controls UART & SD card)
$E000-$FFFF ROM (8 KB AT28C65)
At some point I’ll probably rearrange things so the expansion board and 6522 use smaller chunks of memory. That’d free up space for memory expansion or other devices.
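For what it's worth, here is the same map written out as code. This is just the table above restated, not the P:65's actual address-decoding logic, and the Region names and decode function are made up for the example. Everything above RAM splits into 8 KB blocks, so the top three address bits are enough to pick a region.

```cpp
#include <cassert>
#include <cstdint>

// Region names are illustrative; they mirror the rows of the memory map.
enum class Region { Ram, Unused, Expansion, Io, Rom };

Region decode(uint16_t address) {
    if (address < 0x8000) {
        return Region::Ram;                // $0000-$7FFF
    }
    switch (address >> 13) {               // A15..A13 select an 8 KB block
        case 4:  return Region::Unused;    // $8000-$9FFF
        case 5:  return Region::Expansion; // $A000-$BFFF
        case 6:  return Region::Io;        // $C000-$DFFF
        default: return Region::Rom;       // $E000-$FFFF
    }
}

void check_memory_map() {
    assert(decode(0x0000) == Region::Ram);
    assert(decode(0x7FFF) == Region::Ram);
    assert(decode(0x8000) == Region::Unused);
    assert(decode(0xA000) == Region::Expansion);
    assert(decode(0xC000) == Region::Io);
    assert(decode(0xFFFC) == Region::Rom);  // the reset vector lives in ROM
}
```

Laid out this way, it is easy to see how much room a full 8 KB block is for a single expansion board or a single 6522.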