I like to think that I’m a reasonably clever guy, and that I can deal with the complexities of software development with some degree of aplomb. Still, when you’re trying to navigate several different partially-incompatible standards, it’s easy to get confused, make bad decisions, and completely break your entire program.
And all I was trying to do was get upper and lower case letters to play nice with each other.
Let’s start from the top. I’m writing code on Windows with Emacs and compiling it with the cc65 toolchain. So the program text, including string literals, is in ASCII (or the ASCII-ish part of ISO-8859-1). In this encoding, upper-case letters are in the range 65-90 and lower-case letters are in the range 97-122.
The Commodore 64 doesn’t use “modern” ASCII, which dates to 1967. Instead, it uses a character set colloquially known as PETSCII, based on a 1963 version of ASCII (or so Wikipedia tells me). In its default character set, PETSCII includes only upper-case letters (with A to Z still in the range 65 through 90).
However! The C64 also has a “text mode”, which displays both upper and lower case characters. In this character set, lower case letters are at positions 65 to 90, and upper case characters are at 97 to 122. This is the opposite of ASCII.
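To make the flip concrete, here is a minimal sketch of the case swap between ASCII and the mixed-case character set as described above. The function name is made up for illustration, and the letters are written as numbers on purpose: under cc65, a character literal like 'A' would itself be run through the target's character mapping.

```c
/* Hypothetical helper, not part of cc65: converts one ASCII letter to
   the code that the C64's mixed-case character set uses for the same
   glyph. Ranges are numeric so that a compiler's own character
   translation cannot alter them. */
unsigned char ascii_to_c64_mixed_case(unsigned char c)
{
    if (c >= 65 && c <= 90) {              /* ASCII upper case */
        return (unsigned char)(c + 32);    /* upper case lives at 97..122 */
    }
    if (c >= 97 && c <= 122) {             /* ASCII lower case */
        return (unsigned char)(c - 32);    /* lower case lives at 65..90 */
    }
    return c;                              /* leave everything else alone */
}
```

Applying the function twice returns the original value, which is a handy sanity check that the two ranges really are mirror images of each other.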
It gets worse. The character values that actually get drawn to the screen belong to a set of screen display codes, which also serve as indices into the character set graphics. In this mapping, when using text mode, lower case letters exist at locations 1-26, and upper case characters are back at 65-90! Also, there are “inverted” versions of the characters at 129-154 and 193-218.
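The numbers above hide a simple pattern: each “inverted” code is the normal code with the top bit set (1 + 128 = 129, 65 + 128 = 193). A small sketch of that arithmetic, with a hypothetical helper name and an arbitrary 0xFF marker for anything that isn’t a letter:

```c
#define REVERSE_BIT 0x80   /* adding 128 selects the inverted glyph */
#define NOT_A_LETTER 0xFF  /* marker used by this sketch only */

/* Hypothetical helper: screen display code for an ASCII letter when the
   mixed-case character set is active. Pass reversed != 0 to get the
   inverted version of the glyph. */
unsigned char letter_to_screen_code(unsigned char ascii, int reversed)
{
    unsigned char code;

    if (ascii >= 97 && ascii <= 122) {
        code = (unsigned char)(ascii - 96);   /* a..z -> 1..26 */
    } else if (ascii >= 65 && ascii <= 90) {
        code = ascii;                         /* A..Z -> 65..90 */
    } else {
        return NOT_A_LETTER;
    }
    return reversed ? (unsigned char)(code | REVERSE_BIT) : code;
}
```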
To handle this mess, the cc65 compiler applies a character mapping function to string literals. With the default C64 mapping, upper case characters are mapped from 65-90 (ASCII) to 193-218 (the domain of inverted capitals in C64 screen codes), while lower case letters are mapped from 97-122 (ASCII) to… 65-90 (capitals in PETSCII and screen codes).
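Written out as code, the letter part of that mapping looks roughly like this. This is only an illustration of the arithmetic described above, not cc65’s actual implementation: the compiler does this at compile time, the function names here are invented, and the real table may treat non-letter characters differently, which this sketch ignores.

```c
/* Illustrative only: the letter mapping described in the text, applied
   at run time. Names are hypothetical. */
unsigned char map_letter_like_cc65_c64(unsigned char c)
{
    if (c >= 65 && c <= 90) {
        return (unsigned char)(c + 128);   /* ASCII A..Z -> 193..218 */
    }
    if (c >= 97 && c <= 122) {
        return (unsigned char)(c - 32);    /* ASCII a..z -> 65..90 */
    }
    return c;                              /* non-letters passed through */
}

/* Maps a NUL-terminated ASCII string into out, which must have room
   for the same number of bytes plus the terminator. */
void map_string_like_cc65_c64(const char *ascii, unsigned char *out)
{
    while (*ascii != 0) {
        *out++ = map_letter_like_cc65_c64((unsigned char)*ascii++);
    }
    *out = 0;
}
```

So a literal that reads “Hi” in the source (72, 105 in ASCII) would end up in the compiled program as the bytes 200, 73.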
If you send one of these strings to a cc65 library function like cprintf, the characters get further mapped into screen codes, so that upper and lower case letters appear as they were written in the original string in the source code.
If you’re just using the standard library IO, this actually works pretty smoothly. My problem is that I’m using cc65 library functions like cputs and cprintf, but for my “console” – the part of the screen where commands and conversation text are displayed – I’m bypassing that system and writing characters directly to the screen.
When I initially wrote that, all the text strings I was writing were in lower case in the C source code, and showed up as all-caps on screen. Which worked okay, but as I was reacquainting myself with the code I decided it was ugly. And I decided I would fix it. And that was a terrible, terrible mistake.
Unfortunately, I got it in my head that there had to be a solution to this mapping problem that would allow strings sent to the library functions to be written “correctly”, and allow strings that were blasted to the console to be written “correctly”, and allow strings that were written in the conversation scripting language to be written “correctly”, and that wouldn’t break things like C64 disk functions where I had to actually write out particular character sequences to the disk’s command channel.
I tried! Oh, I tried. I changed the character mapping settings inside cc65 to match the screen codes, and when that failed I mapped them to ASCII. I added a separate mapping function to the script compiler. I rearranged the custom character ROM so that the order of characters more closely matched traditional ASCII. And I broke the console display. I broke formatting of the cprintf output. I broke disk functions that were expecting particular character strings. I broke compilation by mapping characters to the same space PETSCII used to map the cursor keys.
Then I reverted all that cruft, and fixed my console print routine so that it mapped characters from PETSCII to screen codes the same way that cc65’s IO routines did. Like I should have done in the first place.
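For the curious, the fix amounts to something like the sketch below. It is not my actual routine, and the names are made up for illustration; it uses the commonly documented PETSCII-to-screen-code offsets for the printable ranges that matter here and simply blanks out everything else. It writes into a plain buffer so it can be tried anywhere; on the real machine the destination would be screen memory, which starts at $0400 by default.

```c
#include <stddef.h>

/* Hypothetical helper: converts one PETSCII byte to a screen display
   code. Only the ranges needed for ordinary text are handled. */
unsigned char petscii_to_screen(unsigned char p)
{
    if (p >= 32 && p <= 63) {
        return p;                          /* space, digits, punctuation */
    }
    if (p >= 64 && p <= 95) {
        return (unsigned char)(p - 64);    /* 65..90 -> 1..26 */
    }
    if (p >= 96 && p <= 127) {
        return (unsigned char)(p - 32);    /* 97..122 -> 65..90 */
    }
    if (p >= 192 && p <= 223) {
        return (unsigned char)(p - 128);   /* 193..218 -> 65..90 */
    }
    return 32;  /* control codes and the rest: blank, in this sketch */
}

/* Converts a NUL-terminated PETSCII string into at most `cells` screen
   codes and returns how many cells were written. */
size_t console_write(unsigned char *screen, size_t cells,
                     const unsigned char *petscii)
{
    size_t i = 0;

    while (i < cells && petscii[i] != 0) {
        screen[i] = petscii_to_screen(petscii[i]);
        ++i;
    }
    return i;
}
```

Because the string literals have already been through the compiler’s mapping by the time this runs, the console ends up showing the same upper and lower case that was typed in the source, just as the library routines do.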
Some days you can only write the right code after exhausting all the other options.