Back in October, I mentioned that I’d been working my way through the Baking Pi course – an introduction to bare metal programming for the Raspberry Pi computer. I’d just started getting into the graphics routines, but around that time I started focusing on my Project:65 computer, and I set the Pi aside for a while.
Since the Raspberry Pi’s first birthday was this week, I figured it was high time I got back into it and actually did something with it. In fact, I’ve got a couple of ideas. For example, I’d like to use the Pi to control my EEPROM burner and give it a more convenient interface. But first, I figured it was time to get back into the Baking Pi tutorials.
It’s probably worth mentioning the big hullabaloo about the Raspberry Pi’s graphics drivers that happened a few months back, because it ties in directly to how the Baking Pi tutorials handle rendering. When the Pi was first released, there was very little documentation about the graphics system. The tutorials took the simplest possible approach: Get the graphics driver to give us a writable framebuffer (this in itself had to be partly reverse engineered) and then do all the graphics as software rendering into the buffer.
More information has been released since then, but in a way that hasn’t entirely made the community happy. A software driver for the Pi has been released, but it turns out the driver doesn’t actually do very much. Most of the real functionality is in the GPU’s firmware, which remains closed source. Now, every GPU has some functionality in firmware, but this is apparently an extreme case.
From an Open Source zealot’s point of view, it’s an unsatisfactory situation. From my point of view doing bare-iron development for fun, it means there are actually two options: Software rendering, like in the Baking Pi tutorials, or talking directly to the OpenGL ES library that’s been crammed into the firmware blob. The latter is certainly a possibility, but probably more complex than I want to try to do in ARM assembly.
When I was working on this stuff back in October, I left off just as things were getting interesting. I’d just finished the line drawing routines, and so the stuff I’ve been doing this week is all about rendering and formatting text. And I promise that’s a lot more exciting than it sounds.
Just getting a character up on the screen is quite a bit of work. The tutorial takes the expedient approach of linking the bitmaps of the font directly into the kernel image. It’s the only option, since Baking Pi never gets around to discussing filesystem access (which, admittedly, would be a big topic). So rendering a character becomes a task of finding its image data in memory and copying it into the right location in the framebuffer, one pixel at a time. You’re basically looking at pointer arithmetic and a few nested loops. It’s a fair amount of code, but none of it is very complex.
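As a rough illustration of those nested loops, here is a sketch in C rather than ARM assembly. The glyph size (8x16), the one-bit-per-pixel layout with bit 0 as the leftmost pixel, the 16-bit pixel format, and the names Framebuffer and draw_char are all assumptions made for the example, not details taken from the tutorial.

```c
#include <stddef.h>
#include <stdint.h>

enum { GLYPH_WIDTH = 8, GLYPH_HEIGHT = 16 };  /* assumed glyph size */

/* Hypothetical framebuffer description; the field names are illustrative. */
typedef struct {
    uint16_t *pixels;  /* start of pixel memory, 16 bits per pixel (assumed) */
    uint32_t  width;   /* visible width in pixels */
    uint32_t  height;  /* visible height in pixels */
    uint32_t  stride;  /* pixels per row in memory */
} Framebuffer;

/*
 * Draw one glyph with its top-left corner at (x, y).
 * Assumed font layout: GLYPH_HEIGHT bytes per character code, one byte
 * per row, bit 0 is the leftmost pixel of the row.
 */
void draw_char(Framebuffer *fb, const uint8_t *font, unsigned char ch,
               uint32_t x, uint32_t y, uint16_t colour)
{
    /* Pointer arithmetic: locate this character's bitmap in the font. */
    const uint8_t *glyph = font + (size_t)ch * GLYPH_HEIGHT;

    for (uint32_t row = 0; row < GLYPH_HEIGHT; row++) {
        uint8_t bits = glyph[row];
        for (uint32_t col = 0; col < GLYPH_WIDTH; col++) {
            if (((bits >> col) & 1u) == 0)
                continue;                     /* background pixel: skip */
            uint32_t px = x + col;
            uint32_t py = y + row;
            if (px >= fb->width || py >= fb->height)
                continue;                     /* clip anything off-screen */
            fb->pixels[(size_t)py * fb->stride + px] = colour;
        }
    }
}
```

Only the set bits are written, so whatever background is already in the framebuffer shows through around the glyph.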
Of course, I was coming to this project after a break of nearly four months, which adds its own complexity. I stepped away in the first place because of limited time, but also because switching between 6502 assembly and the much more complex ARM assembly for different projects was making my head hurt. After all, this is one of the reasons that high-level languages were invented in the first place.
Once I sat down and concentrated, I was able to “swap in” the ARM assembly I’d learned before, but it took a while before I really felt up to speed. 6502 stuff I can mostly write from memory, but for the ARM CPU I can’t get very far without a reference sheet. It’s important to remember that the ARM assembly language is much more complex: There are more opcodes, they take more arguments, and they have more variants and options.
A good example of how the complexity and options can trip you up happened to me when I started to work on routines to format numbers for printing. For example, I was trying to convert the number 456 to the character string “456” – but all I got was the “6”. I banged my head against this problem for more than an hour before I realized how trivial my mistake was. All the math was fine, and all the digits calculated correctly. But when I copied the character representation of each digit into my string buffer, I was using a 32-bit store (STR) instead of an 8-bit store (STRB). String-handling madness ensued.
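To show why the width of the store matters, here is a small C model of the two instructions. It is only a sketch: store_byte and store_word are hypothetical stand-ins for STRB and a little-endian STR, the model ignores ARM's alignment rules, and the digits are written right to left as an assumption. The exact symptom depends on the order and alignment of the stores, so this is not a reproduction of the bug described above.

```c
#include <stddef.h>
#include <stdint.h>

/* Models an 8-bit store (STRB): exactly one byte changes. */
static void store_byte(uint8_t *mem, size_t addr, uint32_t value)
{
    mem[addr] = (uint8_t)value;
}

/*
 * Models a little-endian 32-bit store (STR): four bytes change.
 * Real hardware also has alignment rules, which this model ignores.
 */
static void store_word(uint8_t *mem, size_t addr, uint32_t value)
{
    for (size_t i = 0; i < 4; i++)
        mem[addr + i] = (uint8_t)(value >> (8 * i));
}

/*
 * Write the digits of 456 into mem[0..2], least significant digit first,
 * moving from right to left. mem must hold at least 6 bytes, because the
 * word store at offset 2 touches mem[2..5].
 */
void write_456(uint8_t *mem, int use_word_store)
{
    const char digits[3] = { '6', '5', '4' };
    for (size_t i = 0; i < 3; i++) {
        size_t addr = 2 - i;
        if (use_word_store)
            store_word(mem, addr, (uint32_t)digits[i]);
        else
            store_byte(mem, addr, (uint32_t)digits[i]);
    }
}
```

With byte stores the buffer ends up holding '4', '5', '6'. With word stores, each character drags three zero bytes along with it, and those zeros overwrite the digits stored earlier, so in this model only the last digit written survives.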
Of course, it didn’t help that I’d been spending so much time lately doing 6502 assembly, where all operations are 8-bit operations. A 32-bit CPU is a different kind of beast altogether.
Speaking of calculations, one of the really fun things about this part of the tutorial was learning a method for division of binary integers. One of the first really complicated bits of 6502 assembly I learned was a software multiply routine for 16-bit integers. Well, the Pi’s ARM CPU may have a built-in multiply, but it still doesn’t have a division operation.
In the tutorial, the division routine is used to convert numbers to different numeric bases for display. You divide the number by the base, and the remainder is the rightmost digit of the result. Then you repeat the process with the quotient, producing one more digit each time, until the quotient reaches zero.
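The divide-and-take-the-remainder loop looks roughly like this in C. This is a sketch of the general technique, not the tutorial's routine: the name format_unsigned, the buffer handling, and the lowercase digit table are assumptions, and C's built-in / and % stand in for the software division the ARM code needs.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Write value in the given base (2..36) into buf as a NUL-terminated
 * string. Returns the number of digits written, or 0 if the base is
 * invalid or buf is too small.
 */
size_t format_unsigned(uint32_t value, uint32_t base, char *buf, size_t buf_size)
{
    static const char digits[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    char tmp[32];  /* 32 binary digits is the worst case for 32 bits */
    size_t n = 0;

    if (base < 2 || base > 36)
        return 0;

    /* Each remainder is the next digit, starting from the rightmost. */
    do {
        tmp[n++] = digits[value % base];
        value /= base;
    } while (value != 0);

    if (n + 1 > buf_size)
        return 0;

    /* The digits came out right to left, so reverse them into buf. */
    for (size_t i = 0; i < n; i++)
        buf[i] = tmp[n - 1 - i];
    buf[n] = '\0';
    return n;
}
```

For 456 in base 10 the remainders come out as 6, 5, 4, which is why the digits have to be reversed (or written from the end of the buffer backwards) before printing.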
The actual division operation is done in binary, and is conceptually pretty similar to conventional long division. One advantage of doing it in binary is that you never have to do any multiplication – it’s pretty much all handled by a series of shifts and subtractions. As with multiplication, it’s an expensive operation, but pretty understandable once you’ve worked through a few examples in your head.
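Here is one common way to write shift-and-subtract division, sketched in C. It is a generic textbook variant and may well differ from the routine the tutorial uses; the names DivResult and divide_u32, and the choice to return a zero quotient for a zero divisor, are assumptions for the example.

```c
#include <stdint.h>

typedef struct {
    uint32_t quotient;
    uint32_t remainder;
} DivResult;

/*
 * Binary long division using only shifts, compares and subtractions.
 * Dividing by zero has no meaningful answer; this sketch arbitrarily
 * returns quotient 0 and the dividend as the remainder.
 */
DivResult divide_u32(uint32_t dividend, uint32_t divisor)
{
    DivResult result = { 0, dividend };
    if (divisor == 0)
        return result;

    uint64_t rem = 0;   /* 64 bits so the shift below can never overflow */
    uint32_t quot = 0;

    for (int bit = 31; bit >= 0; bit--) {
        /* "Bring down" the next bit of the dividend, as in long division. */
        rem = (rem << 1) | ((dividend >> bit) & 1u);

        /* In binary the divisor goes in either once or not at all. */
        if (rem >= divisor) {
            rem -= divisor;
            quot |= (uint32_t)1 << bit;
        }
    }

    result.quotient = quot;
    result.remainder = (uint32_t)rem;
    return result;
}
```

Because each quotient digit is either 0 or 1, the "how many times does it go in" step of decimal long division collapses into a single comparison, which is why no multiplication is needed.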
In order to keep my fun, colorful background while working on the string formatting and drawing routines, I also added a couple routines for drawing filled or outlined rectangles. Quickly putting together those two routines made me think about some of the tradeoffs in the Baking Pi tutorials – specifically, performance versus robustness.
For example, the tutorial’s implementation of the Bresenham line-drawing algorithm is done entirely in screen coordinates. Each pixel is sent to a DrawPixel routine that checks if the pixel is on-screen and then calculates its position in the framebuffer’s memory. There are ways to combine these operations for a significant speed boost, and in assembly written for a C64 demo, those optimizations are done as a matter of course. The approach here is more like traditional application development: small, well-encapsulated functions with error-checking and a lot of abstraction. This is a great way to write software, but it does kind of defeat the purpose of writing it in assembly language instead of C.
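The tradeoff is easier to see with a simpler shape than a Bresenham line. The C sketch below draws a horizontal run of pixels two ways: once through a checked per-pixel routine, and once with the clipping and address calculation hoisted out of the loop. The Screen struct, the 16-bit pixel format, and the function names are assumptions for the example and are not taken from the tutorial.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical screen description with 16-bit pixels (assumed). */
typedef struct {
    uint16_t *pixels;
    uint32_t  width;
    uint32_t  height;
} Screen;

/* Encapsulated style: bounds check and address calculation per pixel. */
void draw_pixel(Screen *s, uint32_t x, uint32_t y, uint16_t colour)
{
    if (x >= s->width || y >= s->height)
        return;
    s->pixels[(size_t)y * s->width + x] = colour;
}

/* Draw pixels x0 <= x < x1 on row y, one checked call per pixel. */
void hline_simple(Screen *s, uint32_t x0, uint32_t x1, uint32_t y,
                  uint16_t colour)
{
    for (uint32_t x = x0; x < x1; x++)
        draw_pixel(s, x, y, colour);
}

/* Same result, but clip once and compute the start address once. */
void hline_fast(Screen *s, uint32_t x0, uint32_t x1, uint32_t y,
                uint16_t colour)
{
    if (y >= s->height)
        return;
    if (x1 > s->width)
        x1 = s->width;
    if (x0 >= x1)
        return;

    uint16_t *p = s->pixels + (size_t)y * s->width + x0;
    for (uint32_t n = x1 - x0; n > 0; n--)
        *p++ = colour;
}
```

The second version does less work per pixel, but the clipping logic now lives inside the drawing routine, which is exactly the kind of entanglement the small-function style avoids.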
This is something I’ve been thinking about for future projects. I’ve been doing assembly language coding as a hobby because I appreciate the thrill of communicating with the machine on that low level. On the other hand, it does take a long time to write anything really substantial. It might be time to start combining the high-level languages with the languages of the raw iron. One way to do that would be to add some functions written in C to the Baking Pi’s pseudo-OS. I’d probably learn a lot about ABIs and the practicalities of linker configuration by doing that. Alternatively, I would like to play with the CC65 compiler for the C64 (or my own Project:65). Just ideas for now, but anything’s fair game.