After I figured out that the save game files in Questron were memory dumps of the BASIC variables section, I had to find out how to interpret that data. Luckily, a lot of old books about Commodore 64 programming are now available in various places on the web. In particular, I referred to Jim Butterfield’s Machine Language for the Commodore 64, 128, and other Commodore Computers and Compute!’s Vic-20 and Commodore 64 Tool Kit: BASIC by Dan Heeb.
And by the way, a lot of this should be pretty applicable to other 6502 computers, since most of them used versions of Microsoft 6502 BASIC.
The variables space is divided into three sections – the scalar variables, array variables, and character data for strings. The scalar section is pretty straightforward. Each variable is described using 7 bytes. The first two bytes are the variable’s name, but there’s a trick: the type of the variable is also indicated by setting the high bit in one byte or the other. The variable is a float if neither high bit is set, an int if both are set, and a character string if only the high bit in the second letter of the name is set.
This implementation detail actually explains some of the behavior of the BASIC interpreter. For example, C64 BASIC only cares about the first two letters of a variable name. You can use more, but it won’t matter – the interpreter treats FOO and FOOBAR as the same variable. On the other hand, variables with different types can have their own entries. So the floating-point variable HP would be distinct from the string HP$. I vaguely remember those rules from way back when I first learned BASIC, but it’s neat now to see where they came from.
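As a rough sketch of the naming rules above, the two name bytes can be split into a name and a type like this. The function name `decode_name` and the "other" fallback (for the case where only the first high bit is set, which isn't covered above) are just illustrative choices, not anything from the game or the books.

```python
def decode_name(b0, b1):
    """Return (name, kind) for the two name bytes of a variable entry.

    Hypothetical helper: the type rules follow the description above.
    """
    hi0 = bool(b0 & 0x80)
    hi1 = bool(b1 & 0x80)

    # Strip the type bits to recover the one- or two-character name.
    # A one-letter name leaves the second character as zero.
    name = chr(b0 & 0x7F)
    if b1 & 0x7F:
        name += chr(b1 & 0x7F)

    if hi0 and hi1:
        return name + "%", "int"
    if hi1:
        return name + "$", "string"
    if hi0:
        # Not one of the three cases described above.
        return name, "other"
    return name, "float"


print(decode_name(0x48, 0x50))  # name bytes "HP", no high bits set
print(decode_name(0xCB, 0xC1))  # "KA" with both high bits set
print(decode_name(0x4E, 0xC1))  # "NA" with only the second high bit set
```

Since only two name bytes are stored, FOO and FOOBAR both end up as the same `FO` entry, which matches the interpreter behavior described above.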
The remaining 5 bytes of data are the value of the variable. For integer variables, two bytes are used to store a 16-bit integer value. For strings, one byte stores the string length and two more are a pointer into the string data area. Floating point variables – which are kind of the default – are more complicated. And as for why the authors of Questron used floating point variables to store fields like hit points or food? Who knows.
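The integer and string cases are simple enough to sketch directly. This is a minimal example that makes some assumptions of its own: the integer is taken to be signed and stored high byte first, the string pointer is taken to be a little-endian C64 address, and `memory` and `base` are hypothetical parameters standing in for the dumped bytes and the address the dump starts at.

```python
def decode_int(value):
    """Decode the 5 value bytes of an integer variable.

    Assumes a signed 16-bit value stored high byte first; the
    remaining three bytes are ignored.
    """
    return int.from_bytes(bytes(value[0:2]), "big", signed=True)


def decode_string(value, memory, base):
    """Decode the 5 value bytes of a string variable.

    value[0] is the length, value[1:3] is assumed to be a little-endian
    pointer. `memory` is the dumped data and `base` is the (assumed)
    C64 address corresponding to memory[0].
    """
    length = value[0]
    pointer = value[1] | (value[2] << 8)
    start = pointer - base
    return bytes(memory[start:start + length]).decode("ascii", errors="replace")


# Made-up dump fragment, purely for illustration.
memory = b"xxCORONAX"
print(decode_int([0x00, 0x0F, 0, 0, 0]))
print(decode_string([7, 0x02, 0x08, 0, 0], memory, base=0x0800))
```

Plain ASCII decoding is good enough for uppercase letters and digits, but other PETSCII characters would need a proper translation table.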
Commodore BASIC uses 5 bytes for “packed” floats. One byte is the exponent and 4 bytes are the mantissa, but it cheats: the leading 1 bit of the mantissa is always set for a normalized number, so it isn’t stored, and that position holds the sign bit instead. I wrote a Python function to see if I could figure out the decoding:
    def decodePackedFloat (data):
        """decodes the 5-byte packed c64 BASIC float in data, and returns a
        float value. See appendix F of Butterfield"""

        # data[0] is both the exponent and a zero flag. If it's zero, the
        # whole number is 0.
        if data[0] == 0:
            return 0.0

        # The packed representation has an 8 bit exponent and a 32 bit mantissa.
        # If exponent is 128, all 32-bits of mantissa should be to the right
        # of the decimal point - which we get by multiplying the mantissa by
        # 2^-32.
        exponent = data[0] - 128 - 32

        # Since the highest bit of mantissa is always going to be a 1, the
        # packed format can cheat and use that high bit for something else -
        # the sign of the number.
        if data[1] >= 128:
            sign = -1
        else:
            sign = 1

        # Build the mantissa out of the last 4 elements of data. Note we're
        # making sure the high bit is 1. Also note that the number is stored
        # big-endian, which is very un-C64-like. I blame Bill Gates.
        mantissa = ((data[1] | 0x80) << 24) + (data[2] << 16) + (data[3] << 8) + data[4]

        # put together our final result
        return sign * mantissa * pow (2, exponent)
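One way to sanity-check a decoder like this is to feed it a few byte patterns whose values can be worked out by hand. The sketch below is a compact restatement of the same idea using `int.from_bytes` and `math.ldexp`. The name `decode_packed_float` and the test vectors are my own, derived from the format as described above rather than taken from a real save file.

```python
import math


def decode_packed_float(data):
    """Decode a 5-byte packed float (exponent byte + 4 mantissa bytes)."""
    if data[0] == 0:
        return 0.0
    # The top bit of the first mantissa byte is the sign.
    sign = -1.0 if data[1] & 0x80 else 1.0
    # Restore the implied leading 1 bit of the big-endian mantissa.
    mantissa = int.from_bytes(bytes(data[1:5]), "big") | 0x80000000
    # Exponent is biased by 128; the mantissa is a 32-bit fraction.
    return sign * math.ldexp(mantissa, data[0] - 128 - 32)


# Hand-computed examples: 0.5, 1, -1, 10 and 240.
for raw in ([0x80, 0, 0, 0, 0],
            [0x81, 0, 0, 0, 0],
            [0x81, 0x80, 0, 0, 0],
            [0x84, 0x20, 0, 0, 0],
            [0x88, 0x70, 0, 0, 0]):
    print(bytes(raw).hex(" "), "->", decode_packed_float(raw))
```

The last pattern works out to 240 because 240 is 0.9375 × 2^8: the exponent byte is 128 + 8, and the mantissa 0xF0000000 is stored as 0x70000000 with the sign bit cleared.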
The array variables are a bit more complicated, especially since you can theoretically have an array of up to 256 dimensions – you just wouldn’t be able to write a BASIC line long enough to dereference it. Arrays use the same naming convention as scalar variables, and the same rules apply. After the name there are two bytes storing the total size of the array entry, one byte storing the number of dimensions, and then the size of each dimension stored as two bytes (the dimensions are stored in reverse order). The array data follows, I believe in column-major order.
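Here is a minimal sketch of reading an array header laid out that way. It assumes the total size is little-endian and counts from the start of the entry, and that each dimension size is big-endian. The function name `parse_array_header` and the dictionary keys are hypothetical, and the example bytes are made up rather than taken from a save file.

```python
def parse_array_header(buf, offset=0):
    """Parse one array header starting at buf[offset].

    Assumed layout: 2 name bytes, 2-byte little-endian total size,
    1-byte dimension count, then one 2-byte big-endian size per
    dimension, stored last dimension first.
    """
    name_bytes = bytes(buf[offset:offset + 2])
    total_size = buf[offset + 2] | (buf[offset + 3] << 8)
    ndims = buf[offset + 4]

    dims = []
    pos = offset + 5
    for _ in range(ndims):
        dims.append((buf[pos] << 8) | buf[pos + 1])
        pos += 2
    dims.reverse()  # undo the reversed storage order

    return {
        "name_bytes": name_bytes,
        "total_size": total_size,
        "dims": dims,
        "data_offset": pos,                  # first byte of element data
        "next_offset": offset + total_size,  # assumes size includes header
    }


# Made-up header for a two-dimensional array with sizes 3 and 5.
header = bytes([0x58, 0x00, 0x54, 0x00, 0x02, 0x00, 0x05, 0x00, 0x03])
print(parse_array_header(header))
```

Walking the whole array section would then be a matter of calling this repeatedly at `next_offset`, if the assumption about what the total size covers holds.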
I had a hard time with the array data, and the thing that threw me was that the total size of the array entry is stored in little-endian order, but the sizes of the individual dimensions are apparently big-endian. This actually contradicts the diagrams in the Tool Kit: BASIC book. I did some peeking and poking on actual hardware to convince myself I had the right order, and then my decoding routine started giving good results. Here are some of the more interesting BASIC variables I found:
    A$  = CORONAX-V    // Save game filename
    NA$ = CORONAX      // Character name
    HP  = 240.0        // Hit points
    TM  = 83.0         // Time
    GD  = 338.0        // Gold
    FO  = 207.0        // Food
    TW$ = GERALDTOWN   // Last town entered
    BU  = 100.0        // Bank balance?
    KA% = 15           // Charisma
    ST% = 15           // Strength
    AG% = 20           // Dexterity
    ZT% = 15           // Intelligence?
    SM% = 15           // Stamina?
The array variables are mostly used to store lists of items, names, or locations, for example:
    RA$ = ['SERF', 'PAGE', 'SQUIRE', 'SOLDIER', 'KNIGHT', 'BARON']
    WE$ = ['BARE HANDS', 'SLING', 'WHIP', 'ROPE N HOOKS', 'MORNINGSTAR', 'FLAIL', 'CLUB', 'IRON HAMMER', 'MACE', 'CUTLASS', 'SHORT BOW', 'BATTLE AXE', 'LANCE', 'GREAT SWORD', 'LONG BOW', 'MUSKET']
    AR$ = ['NOTHING', 'RAWHIDE', 'SHIELD', 'CHAIN MAIL', 'PLATE MAIL', 'MAGIC SHIELD', 'ENCHANTED PLATE']
    MA$ = ['NOTHING', 'MAGIC MISSILE', 'FIRE BALL', 'STONE SPELL', 'ARMOUR ENHANCE', 'WALL PASS', 'HIT POINT']
    IT$ = ['NOTHING', 'HOLY WATER', 'TRUMPET', 'MAGIC POWDER', 'MAGIC FLUTE', 'BOOK OF MAGIC', 'COMPASS', 'A', 'IRON KEY', 'DIAMOND RING', 'RUBY KEY', 'SILVER KEY', 'EMERALD KEY', 'LEAD KEY', 'GOLD KEY']
    OP$ = ['OPERATE', 'DRINK', 'BLOW', 'SPRINKLE', 'PLAY', 'RECITE', 'READ', '*', 'USE', 'WEAR', 'USE', 'USE', 'USE', 'USE', 'USE']
    TR$ = ['ON FOOT', 'HORSE', 'WAM LAMA', 'RAFT', 'CLIPPER', 'EAGLE']
    TW$ = ['GERALDTOWN', 'HIDDEN PORT', 'PRISON MINES', 'FORT CAVERN', 'MALL CAVE', 'RIVER JUNCTION', 'LOST CHASM', 'FREE TOWN', 'LAKE CENTRE', 'LAGOON', 'GAMBLERS GROTTO', 'GHOST HOLLOW', 'WIMP CAVE', 'BLIND PASS', 'ROYAL CITY', 'SWAMP', 'ISLAND', 'OCEAN', 'BAY RIDGE', 'ISLAND VIEW', 'LIZARD CROSSING', 'SWAMP FORT', 'DEVIL LAKE POST', 'SNAKE LANDING', 'MOUNTAIN CATACOMBS', 'DUNGEON OF DEATH', "MANTOR'S MOUNTAIN"]
    TE$ = ['WATER', 'GRASS', 'JUNGLE', 'MOUNTAIN', 'SWAMP']
    MU$ = ['PIERCING PUNGIE', 'SLIME SWIMMER', 'LEACH WOMAN', 'MASHER WHALE', 'HYDRO SNAKE', 'PIT SCREAMER', 'STRANGLER FIEND', 'ROT WEED', 'ALBINO LEACH', 'GAR MIND FLAYER', 'STONE AXE BEAK', 'PHAZOR SPIDER', 'JACKAL RAM', 'LEOPARD YETTI', 'IRISH STALKER', 'BEAR', 'WOODS OGRE', 'GORILLA', 'BLOODHOUND GHOUL', 'FLESH FEELER', 'ARMY SCORPION', 'BLACK KNIGHT', 'BANDIT', 'MASTON CENTAUR', 'DIRT WEIRD', 'WRENTION WARRIOR', 'BLIND BLOOD DOG', 'HIGH ELF', 'NAGA PILGRIM', 'FAUN NYMPH', 'SHEDU MONK', 'MERCHANT']
So now I pretty much have all the information I need to create a character editor – I just need to decide how to do it. I’ve been using Python for analyzing disks and prototyping stuff, but if I want a piece of code that’ll run on an actual C64, it might be time to dust off cc65 again.