Amiga

All posts tagged Amiga

Adding some color using the Copper

Posted by coronax on July 14, 2015

Posted in: Programming, Retrocomputing. Tagged: Amiga, computer graphics, Retrochallenge.

One of the Amiga’s big claims to fame is its custom chipset, which gave it multimedia capabilities that were basically unheard of 30 years ago. I spent my retro-time this weekend learning a few new old tricks.

The Copper is a simple but powerful coprocessor that’s part of the Amiga’s Agnus chip. Out of all the custom chips, it’s the only one capable of running its own (exceedingly simple) program. It has three instructions:

WAIT – wait for the beam to reach a certain position on screen.

MOVE – write a value to one of the custom chip registers.

SKIP – skip the next instruction if the beam has already reached a certain position on screen. Apparently you can use SKIP to create loops, but I haven’t mucked with that yet.

What can you actually do with that? Quite a few things. The Copper is what lets the Amiga display multiple screens with different resolutions or palettes simultaneously. It’s used to create special graphics modes like “Sliced HAM”, and can even draw blocky graphics all on its own.

On older computers like the Commodore 64 or Atari 2600, a lot of graphics routines relied on executing code or changing registers when the beam reached a certain point on screen, and the entire program had to be written around that. The Copper makes this a lot easier, because it’ll run through its program every frame while the CPU – and your program – is taking care of other stuff.

I decided to use the Copper to add some color and flair to my graphics program. My first attempt borrowed from an example in the RKRM – it changed the background from a flat gray to a rainbow of horizontal stripes. The Copper program (also called a “Copper list”) was simple – every 10 lines or so it would write a new value to color register 0, which contains the background color. Even though my graphics were only using a single bitplane, I suddenly had an image with almost two dozen distinct colors.
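As a rough illustration of what such a Copper list looks like in memory, here is a minimal sketch that builds one as raw 16-bit instruction words. This is not the graphics.library route the post describes; the helper names (copper_wait, copper_move, build_stripes) are made up for this example, and it only handles scan lines 0 to 255. Each Copper instruction is two words: a MOVE has bit 0 of the first word clear, a WAIT has it set.

```c
#include <stdint.h>
#include <stddef.h>

/* Offset of color register 0 (the background color) from the custom chip base. */
#define COLOR00 0x0180u

/* Append a WAIT for the start of scan line `line` (0..255 only in this sketch). */
static size_t copper_wait(uint16_t *list, size_t n, unsigned line)
{
    list[n++] = (uint16_t)((line << 8) | 0x0001u); /* line in the high byte, bit 0 set = WAIT */
    list[n++] = 0xFFFEu;                           /* compare every beam position bit */
    return n;
}

/* Append a MOVE that writes `value` to the custom register at offset `reg`. */
static size_t copper_move(uint16_t *list, size_t n, unsigned reg, uint16_t value)
{
    list[n++] = (uint16_t)(reg & 0x01FEu);         /* bit 0 clear = MOVE */
    list[n++] = value;
    return n;
}

/* Build a striped background: a new 12-bit 0x0RGB color every `step` lines,
   starting at `first_line`. Returns the number of words written. */
static size_t build_stripes(uint16_t *list, const uint16_t *colors,
                            size_t ncolors, unsigned first_line, unsigned step)
{
    size_t n = 0;
    for (size_t i = 0; i < ncolors; i++) {
        n = copper_wait(list, n, first_line + (unsigned)i * step);
        n = copper_move(list, n, COLOR00, colors[i]);
    }
    /* Conventional end marker: wait for a beam position that never occurs. */
    list[n++] = 0xFFFFu;
    list[n++] = 0xFFFEu;
    return n;
}
```

With two colors, a first line of 0x2C and a step of 10, this produces ten words, beginning with 0x2C01, 0xFFFE, 0x0180 and the first color. On real hardware the list would also have to live in chip memory and be installed through the OS or the Copper's location registers, which this sketch leaves out.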

Compared to my trials with the back-face culling algorithm last week, getting the Copper list stuff to work was a breeze. I did run into one “old fashioned C compiler gotcha”. In K&R style C, functions didn’t have to be declared before being used, but without the declaration the compiler would assume that all the arguments were supposed to be ints, or at least int-sized. I forgot to include the header file for graphics.library’s Copper list utility functions, which are actually supposed to take 16-bit words as arguments. The result was some weird errors, because the values I was passing were being promoted from words to ints behind my back (and right under my nose)!

Once I got that straightened out, I decided to have some fun. I wrote a Copper list that was my best attempt at a twilight sky over barren ground. The effect kind of worked, but it still needed something. It didn’t exactly need my mediocre artistic skills, but that’s what it got.

DeluxePaint IV’s exotic two-color mode.

Amiga displays are composed of multiple bitplanes. That’s not the way most modern graphics hardware works, but it has its pros and cons. One pro is that I can draw my graphics in one bitplane without modifying another one. So I added a second bitplane to the display, and to fill it in I created a picture in DeluxePaint IV – just some pyramids, a full moon, and some stars. I converted the picture to a C source code file using a program called “IFF to Source” by J.Tyberghein, which I vaguely remember installing on this hard drive image back in the early 90s.

With two bitplanes I have four colors. The background is 0, and the pyramids and moon are color 2. The rotating cube uses colors 1 and 3 (depending on whether a line overlaps the picture or not). I added a couple more entries to my copper list so that color 2 would be pale white at the top (for the moon and stars) and dark gray for the bottom (the pyramids).
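To make the color numbering concrete, here is a small sketch of how two bitplanes combine into a color register index. It assumes the cube is drawn in plane 0 and the picture sits in plane 1, which is consistent with the colors listed above; the function names are invented for the example, and the planes are plain byte arrays rather than real chip-memory bitmaps.

```c
#include <stdint.h>

/* Read the bit for pixel (x, y) from one bitplane. Within each byte the
   leftmost pixel is the most significant bit. */
static int plane_bit(const uint8_t *plane, int bytes_per_row, int x, int y)
{
    uint8_t byte = plane[y * bytes_per_row + (x >> 3)];
    return (byte >> (7 - (x & 7))) & 1;
}

/* Combine two bitplanes into a color index from 0 to 3:
   plane 0 supplies bit 0 and plane 1 supplies bit 1. */
static int color_index(const uint8_t *plane0, const uint8_t *plane1,
                       int bytes_per_row, int x, int y)
{
    return plane_bit(plane0, bytes_per_row, x, y)
         | (plane_bit(plane1, bytes_per_row, x, y) << 1);
}
```

A pixel set only in plane 0 gets color 1 (a cube line over the sky), one set only in plane 1 gets color 2 (moon or pyramid), and one set in both gets color 3 (a cube line crossing the picture). That is why drawing into plane 0 never disturbs the picture in plane 1.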

The best part is that none of this Copper list stuff affected my frame rate. The Copper program has no noticeable effect on rendering time. The background details image is copied into the frame buffer once at the start of the program (well, twice since my graphics are double-buffered), and after that I don’t have to do anything with it. I like the overall effect – it’s a lot more fun than staring at gray and black all the time. The video at the top shows the transition of how things developed over the course of the weekend.

The Back Side of the Cube

Posted by coronax on July 9, 2015

Posted in: Programming, Retrocomputing. Tagged: Amiga, computer graphics, Retrochallenge.

With back-face culling, it just about looks like a solid object.

After restructuring my graphics program, one of the first features I wanted to add was back-face culling. That’s a basic graphics technique where you only draw a polygon if it’s actually facing towards the camera. For example, pick up a Rubik’s Cube (you do have a Rubik’s Cube within reach, don’t you?). No matter how you rotate the Cube, you can’t see more than three sides of it at once. As long as the shape you’re drawing is convex, you can cull the back-facing polygons without any visible effect. And if you’re rendering in line mode, like I’ve been doing so far, you can get a sort of hidden-line removal out of it.

In principle, back-face culling is really simple. Let’s take one side of the Rubik’s Cube, which is a square polygon. You can think of each of its edges as a vector. If you take the cross product of two of those edges, you get a vector that’s perpendicular to the polygon and facing out from the “front”.

Next, you figure out a vector that goes from the camera to one of the vertices of the polygon. Then you calculate the dot product of those two vectors. If the dot product is positive, the polygon is facing away from the camera and you don’t have to draw it.
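The test described above can be sketched in a few lines of C. This version uses floats for readability rather than fixed-point math, the type and function names are made up for the example, and it assumes the three vertices are consecutive and listed counter-clockwise as seen from the front of the polygon.

```c
typedef struct { float x, y, z; } Vec3;

static Vec3 vec_sub(Vec3 a, Vec3 b)
{
    Vec3 r = { a.x - b.x, a.y - b.y, a.z - b.z };
    return r;
}

static Vec3 vec_cross(Vec3 a, Vec3 b)
{
    Vec3 r = { a.y * b.z - a.z * b.y,
               a.z * b.x - a.x * b.z,
               a.x * b.y - a.y * b.x };
    return r;
}

static float vec_dot(Vec3 a, Vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Returns 1 if the polygon faces away from the camera, 0 otherwise.
   v0, v1, v2 are consecutive vertices in counter-clockwise order
   (seen from the front), all in the same coordinate system as `camera`. */
static int is_back_facing(Vec3 v0, Vec3 v1, Vec3 v2, Vec3 camera)
{
    Vec3 normal = vec_cross(vec_sub(v1, v0), vec_sub(v2, v1)); /* points out of the front */
    Vec3 view   = vec_sub(v0, camera);                         /* camera to a vertex */
    return vec_dot(normal, view) > 0.0f;
}
```

For a camera at the origin and a square at z = -5, the vertices (-1,-1,-5), (1,-1,-5), (1,1,-5) give a normal pointing back toward the camera, so the face is drawn. Listing the same vertices in the opposite order flips the normal and the face is culled.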

I figured it was a nice simple problem that I could hack out quickly. How could I mess up something that simple?

Ah. Let me count the ways…

First, I needed to change my data structure. So far, I’d only used vertices and edges (where an edge was just a pair of indices into the array of vertices). Now I needed to keep track of faces, which are polygons composed of multiple edges. Since each edge can be shared by two faces, I added a boolean to the edge to check whether it’d already been drawn each frame. So far, nice and easy.

There is one problem with using the cross-product of two edges to figure out the normal vector of a face: something called polygon winding. In order to make sure the normal is pointing outward, instead of inward, you have to pick your edge vectors so that they’re going counter-clockwise around the edge of the polygon (looking down from whichever side you want to be the “top”). My shared edge structs didn’t quite solve the problem – there was no way to define them so that they’d be counter-clockwise for every face referencing them. I could make it work, but it was easier to add a list of vertex indices in the correct order to each shape. It’s a little redundant, but right now speed is more important to me than memory efficiency, and it got the job done:

typedef struct Edge
{
    int p1;    // index to points
    int p2;    // index to points
    int drawn; // 1 if already drawn
} Edge;

typedef struct Face
{
    // vertex indices, in CCW winding order
    int mPoints[5];
    // edges, so we can keep track of which ones are
    // already drawn.
    int mEdges[5];
    int mNumEdges; // also number of points.  Convenient, huh?
} Face;
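Here is one way the drawn flag might be used each frame, as a sketch rather than the actual drawing loop from this project. The structs are repeated so the block compiles on its own; the `visible` array and the draw callback are assumptions made for the example (the callback may be null, in which case the function only counts lines).

```c
typedef struct Edge { int p1; int p2; int drawn; } Edge;

typedef struct Face
{
    int mPoints[5];
    int mEdges[5];
    int mNumEdges;
} Face;

/* Hypothetical callback that draws a line between two vertex indices. */
typedef void (*DrawLineFn)(int p1, int p2, void *ctx);

/* Draw every edge of every visible face exactly once.
   visible[f] is nonzero if face f passed the back-face test.
   Returns the number of lines drawn. */
static int draw_visible_faces(const Face *faces, const int *visible, int num_faces,
                              Edge *edges, int num_edges,
                              DrawLineFn draw, void *ctx)
{
    int lines = 0;

    /* Clear the flags at the start of the frame. */
    for (int i = 0; i < num_edges; i++)
        edges[i].drawn = 0;

    for (int f = 0; f < num_faces; f++) {
        if (!visible[f])
            continue;
        for (int e = 0; e < faces[f].mNumEdges; e++) {
            Edge *edge = &edges[faces[f].mEdges[e]];
            if (edge->drawn)
                continue;            /* already drawn by a neighbouring face */
            if (draw)
                draw(edge->p1, edge->p2, ctx);
            edge->drawn = 1;
            lines++;
        }
    }
    return lines;
}
```

Two quads that share one edge have eight edge references but only seven distinct edges, so this loop draws seven lines when both are visible.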

Of course, now that I had all that data structure to work with, I had to get it all filled in correctly, which was a real pain. I want to do other objects, but I may have to figure out how to define them programmatically.

With all those pieces in place, I could test my back-face culling routine and see it fail miserably. Um, yay?

Knuth writes somewhere about the dangers of premature optimization, but I ran into optimizations I added a year and a half ago and promptly forgot about. I only rediscovered them when they blew up in my face.

For example, during the Retrochallenge 2014 Winter Warmup, I took a shortcut when multiplying vertices by my transformation matrix: I only computed the X and Y values. I didn’t have a depth buffer or anything, so I didn’t need to compute Z. Cue me, last night, looking baffled when every point of a cube had Z == 0. Oops.

The biggest optimization I did last year was my fixed-point math macros, which let me compute real number values using only integer math operations, which are much faster than any of the Amiga’s floating-point libraries. The downside is that the fixed-point numbers have a limited range – plus or minus about 32,000, in this case. For a while, I thought I had a problem where I was overflowing those values. I probably was, but that was just a sign of a bigger, dumber mistake.

It’s important, when working in 3D, to always know what coordinate system you’re working in, and what set of transformations you’re applying to your data. I got this wrong at the outset, and my first attempt at a fix was different but still wrong. That cost me a long time: I had to go back to basics and look at the actual numbers in a very simple situation before I understood what was going on.

In my first implementation, I would create a single transformation matrix for each frame that was applied to my input data. A fixed set of vertices went in one end, and a set of screen coordinates came out the other. So this matrix looked something like this:

Screen × Projection × Camera_Translation × Model_Transform

Now, screen coordinates were never going to be the right frame of reference for the culling calculations (I realized after staring at results for an hour). I needed to see my vertices somewhere in the middle of that pipeline, which unfortunately meant that each vertex would have to be transformed twice. Oh, my poor framerate!

Unfortunately, for some reason I decided to compute the vertices with Projection × Camera_Translation × Model_Transform. This was the wrong choice because of the way the perspective projection interacts with the Z values of the vertices. But I couldn’t figure out that that was my problem until, in a desperate bid to simplify the situation, I replaced the perspective projection with an orthographic projection – and things started working the way they were supposed to.

By the way, in the course of this process I discovered that my camera translation had been wrong the whole time, and the shape was actually being rendered behind the camera! The only reason I was seeing anything at all was that, to save time, I never bothered testing the front or back clipping planes(!) This also meant that the shapes were all being drawn flipped along the Z axis, adding to my consternation. I swear I actually do know how this stuff is supposed to work. Please believe me!

To finally get things working, I ran the back-face culling routine on the polygon vertices after they were multiplied by the Camera_Translation and Model_Transform matrices, but not the Projection or Screen transforms. At long last, bingo! A cube that you can’t see through to the other side!

It’s not a perfect victory, though. It’s slower than it was. The extra calculations I’m doing cost more than the time I’m saving by not drawing the lines for the back-facing polygons. Now that the math’s right, we can fix some of that up – for example, it might be faster to calculate the normals once and then just transform them each frame. After that, we’ll see.

Memory pains

Posted by coronax on July 5, 2015

Posted in: Programming, Retrocomputing. Tagged: Amiga, Retrochallenge.

It’s not exactly a “snow crash”, but it’s still kind of cool-looking.

I knew right away that I wanted to make some architectural improvements before getting too far into this year’s project. For example, I wanted to be able to read geometry definitions from a file, rather than having them hardcoded into the application. I also wanted to make the organization of the code better encapsulated. My brain likes to think about software in an object-oriented way, even if I’m using a language like plain old C. All that class stuff is just syntactic sugar anyway, right?

I figured this would be a quick task that wouldn’t cause any trouble. My first test ended with a screenful of garbage followed by a Guru Meditation error. The gray and green lines suggest that not only was I not drawing graphics in the right location, I wasn’t drawing them in the right bitplane. Somehow I’d scribbled into memory far away from where I’d intended to, and the results were fatal.

This is an important reminder of a – perhaps the – key point of developing Amiga software. The Amiga is a multitasking computer running many user and system processes in a single shared memory space with absolutely no protection between processes. This combination is very efficient and very powerful, but it’s also extremely brittle. It’s trivially easy for one runaway program to trash the entire system, like mine had just done.

That particular error turned out to be a straightforward type mismatch in a call to fscanf(), which completely screwed up my vertex data. Since there’s no memory protection, and since the line drawing function didn’t check for out-of-bounds coordinates, it happily scribbled pixels across a bunch of neighboring memory regions (including the bitmaps for the other bitplanes of the image).
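The post doesn't say exactly which mismatch it was, so the following is only a hypothetical illustration of this class of bug: reading 16-bit vertex coordinates with a conversion meant for a full int. The Vertex16 type and parse_vertex function are invented for the example, and it uses sscanf() on a string instead of fscanf() on a file so it can run anywhere. The format rules are the same for both.

```c
#include <stdio.h>

/* Hypothetical vertex with 16-bit coordinates. */
typedef struct { short x, y, z; } Vertex16;

/* Parse "x y z" from a line of text. Returns 1 on success, 0 otherwise. */
static int parse_vertex(const char *line, Vertex16 *v)
{
    /* %hd matches a pointer to short. Writing %d here instead would make
       sscanf store a full int through each pointer, overwriting whatever
       sits next to the short in memory. */
    return sscanf(line, "%hd %hd %hd", &v->x, &v->y, &v->z) == 3;
}
```

Checking the return value against the number of expected fields also catches malformed lines before bad coordinates ever reach the line drawing code.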

Once things were working again, I made the mistake of looking for trouble. I said the Amiga OS was brittle, but in spite of that – or actually, because of that – a lot of Amiga software is very robust. Of course you don’t want an application crash to hang the system, but there’s even more to it than that. Since all Amiga programs share the same memory space, memory leaks are a critical problem – that leaked memory is effectively gone until the system reboots. Good Amiga programs are very careful about letting go of their resources.

My program? Not so good. I was losing 1104 bytes from the system each time I ran the program. Now in all honesty I could have just ignored that. Considering the features I want to add, I’ll probably be crashing the machine way too frequently for 1104 bytes a pop to become a bottleneck. But darn it, this was a matter of pride.

Over the course of several pretty frustrating hours, I worked the memory leak down to 656 bytes and then to 368. Eventually I got it down to 224, and there I was stuck for a while.

Most of the leaked bytes came from an obvious source. Shortly after the end of the 2014 Winter Warmup, I rewrote the screen setup code so that it used the Amiga’s graphics.library directly, instead of the higher-level intuition.library. This gave me a noticeable boost to the framerate, and was totally worth it. However, I got distracted by other things and never really cleaned up the code. So the setup and teardown were organized differently, and it was easy to miss things.

Honestly, the graphics.library API is kind of a mess – when Commodore released Workbench 2.04, they added a lot of options for more flexibly controlling the Amiga’s video hardware. To do that, they added some new data structures. So for each of the standard OS structures that define a screen, like View and ViewPort, there are “associated” structs like ViewExtra and ViewPortExtra. In what may have been a stab at planning for the future, those “Extra” structs (but not anything else) are allocated and deallocated with a special set of function calls (GfxNew() and GfxFree()) instead of using malloc() or the OS-defined AllocMem().

By the way, when you’re allocating things using a bunch of different function calls (malloc(), AllocMem(), GfxNew(), etc.), it’s important that your allocators and deallocators match. Don’t FreeMem() memory that was malloc()ed, or vice versa, and definitely don’t use FreeVec() when you mean FreeMem() or even GfxFree().

I eventually made sure that every View, ViewPort, Extra object, ColorMap, RastPort, BitMap, etc., was deallocated using the right function when the program terminated, but it still wasn’t enough.

The last couple of glitches I found were the sort of things an experienced programmer could look at for an hour before suddenly saying, “Oh, duh!” and instantly fixing it. For example, the system FreeMem call takes the size of the memory to free as an argument. In a case like that it’s really easy to type “FreeMem (view, sizeof (view))” (where view is a “struct View*”) instead of “FreeMem (view, sizeof (struct View))”. And of course, both lines make perfect sense to the compiler. Problems like that make me miss C++11 and RAII.
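The difference between the two sizeof expressions is easy to see in isolation. This sketch uses a made-up stand-in struct (DemoView), not the real struct View from the graphics headers, and two tiny helper functions that exist only to expose the values.

```c
#include <stddef.h>

/* Hypothetical stand-in for an OS structure; not the real struct View. */
struct DemoView { long fields[8]; };

/* sizeof applied to the pointer variable: the size of a pointer. */
static size_t size_of_pointer(void)
{
    struct DemoView *view = 0;
    return sizeof(view);
}

/* sizeof applied to what the pointer points at: the size of the struct.
   sizeof does not evaluate its operand, so the null pointer is never read. */
static size_t size_of_struct(void)
{
    struct DemoView *view = 0;
    return sizeof(*view);
}
```

Writing sizeof(*view) instead of spelling out the type keeps the size tied to the variable, so the free call stays right even if the pointer's type changes later.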

Eventually I found the last problem: I was creating the same ColorMap structure in two different places but only deallocating it once. With that out of the way, I did a test and found that the system available memory was exactly the same before and after running the graphics program. Hurray, at last! With the basics out of the way, maybe we can do something fun now.

Afterwards, the citizens of Ames gathered to celebrate my success. Or something.

Retrochallenge 2015 07

Posted by coronax on July 1, 2015

Posted in: Programming, Retrocomputing. Tagged: Amiga, computer graphics, Retrochallenge.

I’ve added to my collection of Amiga ROM Kernel manuals since the last time I did an Amiga project. I think they’ll come in handy.

So here I am with another last-minute entry into the Retrochallenge. I really thought I was going to be thoroughly prepared with something really ambitious this time. Yeah, who was I kidding?

Since it’s short notice, I’m picking up with something already in progress. Back in the 2014 challenge I experimented with 3D polygon graphics on the Amiga 500. I got the math for the 3D transforms all worked out, and even figured out a set of fixed-point math operations to speed things up, but I was never satisfied with the speed I got.

This year, I’ll try to make things more interesting, more colorful, and hopefully faster. I also hope to branch out from just the polygon drawing and look at some of the other graphical wizardry the Amiga is capable of.

For my first night on the project, I’m just picking up the pieces, checking out a fresh copy of the source code, and figuring out where I left off. It’s funny how code you’re looking at for the first time in 17 months is never as clearly documented as you remember it being…

The Amiga 500 was my main computer from about 1990 to mid 1995 – long after Commodore’s bankruptcy. To get in the mood, I’m listening to authentic 90s music on authentic 90s media.

Running with the numbers

Posted by coronax on January 31, 2014

Posted in: Programming, Retrocomputing. Tagged: Amiga, C, computer graphics, Retrochallenge.

Knuth is quoted as saying “premature optimization is the root of all evil,” which is why I’ve procrastinated so long before trying to speed up my 3D graphics code.

In my last exciting blog post, I made a cube spin on an Amiga 500 running at 7 MHz. The win here was correctness; I had demonstrated that my implementation of the algorithms of 3D rendering was correct. I could take a set of points, rotate and translate them in space, and then project them onto a 2D plane. No problem. Except.

It was too slow.

97 milliseconds of doom

With the first version of my code, I was getting a meager 10 FPS. Not that exciting, when you consider that I was only transforming 8 vertices and drawing 12 lines. I didn’t really expect my C code to run as fast as an Amiga demo written in 68000 assembly, but I figured I could do better than that.

My first step was to figure out where my time was being spent. The Amiga does have a high-resolution timer, and I used that to run a number of tests. This is an area where I’ve done formal research in the past, but for now I just wanted some quick-and-dirty average numbers. Here’s an example of what I found:

Step	Average time
Clear buffer	6.8 ms
Update Transformations	28.3 ms
Transform Vertices	26.6 ms
Draw Lines	4.3 ms
Swap Buffers	31.0 ms
Total	97.0 ms

So there are two places where I’m spending a lot of time. The first is in the geometry calculations – creating transformation matrices and then transforming vertices. That’s not surprising – those functions involve a lot of multiplication of floating point numbers, and even a little division. The 68000 doesn’t have built-in support for floating point numbers, and my Amiga doesn’t have a math coprocessor (which, once upon a time, was a separate microchip sitting on the motherboard of your computer). It has to emulate floating point operations, and that’s really expensive.

(In fact, it’s such a big deal that the Amiga OS includes three different libraries of floating point operations, with different tradeoffs of speed and precision. For the record, my numbers above used the “Motorola fast floating point” libraries.)

The other place where I’m losing time is when I’m swapping buffers – 31 ms is a long time! Now, it’s important to understand what happens in that part of the code. My drawing routine is double-buffered – I’m drawing to one set of bitplanes while another set is being displayed. When I finish, I need to tell the graphics hardware to start displaying the newer frame of graphics. But that swap can only happen during the monitor’s vertical blanking interval, which means that I might have to wait as much as 17 ms to swap buffers.

That’s still a lot less than 31 ms, so what’s going on? My screen code and double-buffering code all use the Amiga’s high-level UI library, Intuition, so I suspect that’s where a lot of the overhead is coming from. If I were to take over the display and use the graphics.library functions directly, I might get better results.

Fixing the Math

I knew I was only going to have a couple more evenings before the end of January and the end of the Winter Warmup, so I had to pick which one of these problem areas to work on. I decided to focus on speeding up the geometry transforms, since I had some ideas I wanted to pursue and it seemed like the most generally applicable choice.

The first thing I did, and the hardest, was to rewrite my vertex and matrix operations to get rid of the floating-point math, replacing it all with fixed-point operations.

Fixed-point math is an approach to representing fractional numbers that used to be fairly popular when computers didn’t have floating-point units. The idea is that you multiply your numbers by a scaling factor so that they can be represented as integers.

Here’s an example. Say you have the numbers 3.4 and 2.2, and you want to represent them as fixed-point numbers with a scaling factor of 100. The numbers would be represented as 340 and 220. To add or subtract fixed-point numbers, you just add or subtract the representations, so 340 + 220 = 560, which is 5.6 when you divide it by the scale factor.

Multiplication and division are more complicated, since you’re also multiplying or dividing the scale factor, and you have to tread carefully to handle that without losing data to overflow or underflow. The multiplication algorithm I ended up with was to divide each number by the square root of the scale factor, and then multiply them together, like so: 340 x 220 => 34 x 22 = 748, which is 7.48 when you convert it back to a floating point number (and yes, 3.4 x 2.2 = 7.48. I checked).

Now all this multiplying and dividing by scale factors probably doesn’t sound very fast. But if your scaling factor is 65536 – i.e. 2^16 – then multiplying and dividing is just shifting the bits of your number to the left or right. I just baked these routines into a handful of C preprocessor macros that I could use when rewriting my matrix routines:

#define FIXSHIFT 16        // shift 16 bits = scale factor 65536
#define HALFSHIFT 8        // half of FIXSHIFT, used by multiply and divide
// convert float to fix (and back)
#define FLOATTOFIX(x) ((int)((x) * (1<<FIXSHIFT)))
#define FIXTOFLOAT(x) ((float)(x) / (1<<FIXSHIFT))
// convert int to fix (and back)
#define INTTOFIX(x) ((x)<<FIXSHIFT)
#define FIXTOINT(x) ((x)>>FIXSHIFT)
// multiply and divide
#define FIXMULT(x,y) (((x)>>HALFSHIFT)*((y)>>HALFSHIFT))
#define FIXDIV(x,y) (((x)/((y)>>HALFSHIFT))<<HALFSHIFT)
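To see the macros at work on the 3.4 × 2.2 example from earlier, here is a small sketch. The macros are repeated (with the divisor argument parenthesized) so the block compiles on its own, and it uses int32_t instead of int on the assumption that the fixed-point values are meant to be 32 bits wide. The two demo functions are made up for this example.

```c
#include <stdint.h>

#define FIXSHIFT 16   /* 16.16 fixed point: scale factor 65536 */
#define HALFSHIFT 8

#define FLOATTOFIX(x) ((int32_t)((x) * (1 << FIXSHIFT)))
#define FIXTOFLOAT(x) ((float)(x) / (1 << FIXSHIFT))
#define FIXMULT(x,y)  (((x) >> HALFSHIFT) * ((y) >> HALFSHIFT))
#define FIXDIV(x,y)   (((x) / ((y) >> HALFSHIFT)) << HALFSHIFT)

/* Multiply two real numbers by way of fixed point and convert back. */
static float fix_mult_demo(float a, float b)
{
    int32_t fa = FLOATTOFIX(a);
    int32_t fb = FLOATTOFIX(b);
    return FIXTOFLOAT(FIXMULT(fa, fb));
}

/* Divide two real numbers by way of fixed point and convert back. */
static float fix_div_demo(float a, float b)
{
    int32_t fa = FLOATTOFIX(a);
    int32_t fb = FLOATTOFIX(b);
    return FIXTOFLOAT(FIXDIV(fa, fb));
}
```

fix_mult_demo(3.4f, 2.2f) comes out close to 7.48 but not exactly, because each operand loses its low 8 bits before the multiply. That is the price of keeping the intermediate product inside 32 bits.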

Whew! That was a lot of work. And the result of all that meddling with my mathematics? 12.8 FPS!

Okay, that doesn’t sound like much. But think about it this way – my geometry transformations went from taking 54.9 ms to about 34 ms. That’s pretty significant, I think.

Or just get rid of it

With my mind thoroughly in the matrix code, I started looking for other optimizations. I’d made my multiplications and divisions faster, but could I get rid of some of them entirely?

Every frame, I’m creating three different rotation matrices – one for each axis – and multiplying them together. I could, algebraically, work out what that resulting matrix should look like and just plug in my sine and cosine values for the amount of rotation. More generally, there are shortcuts you can take anytime you’re multiplying two affine transformation matrices (like rotations or translations) together. In that case, you know the bottom row of each matrix is [0 0 0 1], so there are a bunch of terms you can eliminate.

I did the algebra on paper, and then made a special function for multiplying affine transformation matrices together. That eliminated something like 56 fixed-point multiplications each frame. I ran some tests and found that gave me almost another 2 FPS.
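The affine shortcut can be sketched as follows. This is not the function from the project: it uses floats instead of the fixed-point macros, and the Mat4 type and function names are invented for the example. It assumes both inputs have a bottom row of [0 0 0 1], so that row of the result is known in advance and the terms that multiply by its zeros drop out.

```c
typedef struct { float m[4][4]; } Mat4;   /* m[row][column] */

/* General 4x4 product: 64 multiplications. */
static Mat4 mat_mult(const Mat4 *a, const Mat4 *b)
{
    Mat4 r;
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++) {
            r.m[i][j] = 0.0f;
            for (int k = 0; k < 4; k++)
                r.m[i][j] += a->m[i][k] * b->m[k][j];
        }
    return r;
}

/* Product of two affine matrices: 36 multiplications.
   Assumes the bottom row of both a and b is [0 0 0 1]. */
static Mat4 mat_mult_affine(const Mat4 *a, const Mat4 *b)
{
    Mat4 r;
    for (int i = 0; i < 3; i++) {
        /* Upper-left 3x3 block: b's bottom row contributes nothing. */
        for (int j = 0; j < 3; j++)
            r.m[i][j] = a->m[i][0] * b->m[0][j]
                      + a->m[i][1] * b->m[1][j]
                      + a->m[i][2] * b->m[2][j];
        /* Translation column: b's bottom-right 1 passes a's translation through. */
        r.m[i][3] = a->m[i][0] * b->m[0][3]
                  + a->m[i][1] * b->m[1][3]
                  + a->m[i][2] * b->m[2][3]
                  + a->m[i][3];
    }
    r.m[3][0] = 0.0f; r.m[3][1] = 0.0f; r.m[3][2] = 0.0f; r.m[3][3] = 1.0f;
    return r;
}
```

That saves 28 multiplications per product compared with the general routine. Combining three rotation matrices takes two products, which would account for a saving of 56, though that is only a guess at where the figure above comes from.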

Transcendental defenestration

After all this, I was still doing three cosine and three sine operations each frame. Transcendental floating-point operations are expensive. I couldn’t just throw them out the window, but instead I replaced the function calls with lookups in a pregenerated table.

I hadn’t separated out the trig functions when I was taking my performance measurements, so I wasn’t sure how much difference that would make. I was pleasantly surprised by my final result – 16.8 FPS. That’s almost a 70% improvement over where I started a couple days ago.

By the way, all those measurements above were done at regular old 7 MHz speed. With my new ACA 500, the frame rate gets a small boost up to 18.6 FPS. That just makes it all the more obvious that the current bottleneck is that buffer swapping time.

And we’re done

I’m pretty much out of time for the Retrochallenge Winter Warmup, so that’s where we’ll pause for now. The current framerate looks reasonably smooth, and the math is correct, so at least it’s a good starting point for future projects.

There are some more optimizations I could make on the geometry calculations side, but right now the time it takes to swap buffers is overshadowing that significantly.

And, of course, I’ve also got new hardware considerations. I want to do a lot more analysis with the ACA 500 in place on my A500, and get a better sense of how much difference it makes.

If I want to get really serious, I could do a lot better job gathering timing information. It would be interesting to see how much variation there is from frame to frame (rather than collecting averages over multiple frames like I did above), and also to see exactly what effect waiting for the vertical sync has on performance.

So there we are: It was a pretty fun Retrochallenge project, and it definitely exercised some brain muscles. And like any really interesting project, it’s not really done. It’s just time to decide what to do next.

Computed in real time

Posted by coronax on January 23, 2014

Posted in: Programming, Retrocomputing. Tagged: Amiga, C, computer graphics, Retrochallenge.

Screenshot of a rotating 3D cube. Still not demoscene-worthy, but at least the math is solid.

On Saturday, I finally had a chance to dig into the 3D math part of this project – and then it’s taken me until tonight to write about it. But you can see the results in the picture – ooh! It’s a cube! And it rotates! Exciting!

I will point out that this cube is, as they say in the demos, entirely dynamically computed in real time! Of course, it’s still not up to snuff with what you’d see in a good demo for the Amiga 500. Those demoscene coders can do filled-polygon cubes and still get a better framerate than I’ve got so far. But it’s a start.

Writing the code for the math was actually a lot of fun. I’m using the standard approach from computer graphics where the entire problem of converting a 3D scene into a 2D projection in screen coordinates is handled by a bunch of linear algebra. I’ll describe the broad strokes of what I did below, but it’s a lot to discuss and I’ll probably do it badly. There are lots of books on the topic; the one I used when I needed a refresher on some of the details was an older edition of Edward Angel’s Interactive Computer Graphics.

The basic idea is that you represent each point in the geometry by a 4-element vector like [x, y, z, w]. Then you create a bunch of 4×4 matrices to do all your translations, rotations, scaling, and perspective transformation. You then multiply all these transformations together into a final transformation matrix. When you multiply the transformation matrix and point, you get a vector [x’, y’, z’, w’]. In the transformed vector, x’ and y’ are the screen coordinates of the point (with an important caveat involving that w’ term that I’ll get to later). Whew!

And yes, this means that to figure out the two-dimensional projection of a three-dimensional object, we’re using four-dimensional math. Makes you appreciate your graphics card a little more, doesn’t it?

For my first test, I wanted to keep things simple. A cube has 8 vertices, and my transformation was built from three matrices:

Rotation matrix. I wanted to spin the cube around the X axis. I increment the amount of rotation every frame to provide the animation.
Projection matrix. The projection matrix maps an object in 3D space to a 2D projection. The simplest kind is an orthographic projection, where objects don’t get smaller when they’re farther away, and parallel lines don’t converge in the distance. Using this for the first test also let me fudge a little with my “camera positioning”.
Viewport matrix. The projection matrix gave me (x,y) coordinates on a scale from (-1,-1) to (1,1). The viewport matrix transforms that into screen coordinates that we can use to draw lines on the Amiga’s display. It’s basically scaling the x and y coordinates and adding offsets so that we get numbers that will fit on a 320×200 screen. The tricky part is that I wanted +Y to be “up” in my world coordinates, but the +Y axis goes down in Amiga screen coordinates. I was able to flip the Y coordinates by adding a negative scale factor.
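The viewport step in particular is easy to show in code. This is a sketch with floats and invented names (Mat4, Vec4, make_viewport, mat_apply), not the matrix code from the project. It maps x and y from the range -1..1 onto a screen of the given size, using a negative Y scale so that +Y in world space ends up pointing up on screen.

```c
typedef struct { float m[4][4]; } Mat4;   /* m[row][column] */
typedef struct { float x, y, z, w; } Vec4;

/* Build a viewport matrix for a screen of width x height pixels. */
static Mat4 make_viewport(float width, float height)
{
    Mat4 v = {{
        { width / 2.0f, 0.0f,           0.0f, width / 2.0f  },  /* scale x, then shift right */
        { 0.0f,         -height / 2.0f, 0.0f, height / 2.0f },  /* flip y, then shift down   */
        { 0.0f,         0.0f,           1.0f, 0.0f          },
        { 0.0f,         0.0f,           0.0f, 1.0f          }
    }};
    return v;
}

/* Multiply a 4x4 matrix by a column vector. */
static Vec4 mat_apply(const Mat4 *m, Vec4 p)
{
    Vec4 r;
    r.x = m->m[0][0] * p.x + m->m[0][1] * p.y + m->m[0][2] * p.z + m->m[0][3] * p.w;
    r.y = m->m[1][0] * p.x + m->m[1][1] * p.y + m->m[1][2] * p.z + m->m[1][3] * p.w;
    r.z = m->m[2][0] * p.x + m->m[2][1] * p.y + m->m[2][2] * p.z + m->m[2][3] * p.w;
    r.w = m->m[3][0] * p.x + m->m[3][1] * p.y + m->m[3][2] * p.z + m->m[3][3] * p.w;
    return r;
}
```

With make_viewport(320, 200), the point (-1, 1) lands on the top-left corner (0, 0), the origin lands on (160, 100), and (1, -1) lands on (320, 200). A real routine would probably scale to one pixel less than the screen size so the far edge stays inside the bitmap.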

Matrix multiplication is not commutative, so these matrices had to be multiplied together in the correct order to have the desired effect. The intuitive way to understand it is that the “rightmost” matrix is applied to the geometry first. With that in mind, my final transformation matrix T ends up as:

T = viewport matrix * projection matrix * rotation matrix

Now, the viewport and projection matrices are constant, so I was able to multiply them together once at the start of the program and save the result. Then, for every frame I had to calculate the rotation matrix and recreate T. (It’s safe to save the result of V * P because matrix multiplication is associative).

I had fun writing and testing the matrix code. Doing it in C felt a little weird, because my first instinct as a C++ programmer is to create classes for matrices and vertices. Even without the language support, I tried to implement those data structures in a relatively object-oriented way, as a set of well-defined C functions to operate on matrix structs.

I tried to do as much of it as I could without looking up references, including working out on paper the format of the rotation matrices (I couldn’t remember off the top of my head exactly where the sine and cosine terms go to rotate around various axes). I did refer to the book when it came time to do the projection matrix functions, since those involve more complicated shear and nonuniform scale operations. With that refresher, I was able to come up with the viewport transform on my own (with a little trial and error).

This code went together quickly. The trickiest problem turned out to be a sneaky typo in the code to generate the orthographic projection. The most time-consuming part was getting the viewport matrix to position the drawing area correctly. My failed attempts usually resulted in lines being drawn partly or entirely off the edges of the screen. On a shared-memory system like the Amiga, “drawing off the edges of the screen” is another way to say “scribbling on a random chunk of memory”, with usually catastrophic results.

Once I got a cube that was spinning, instead of a computer that was crashing and rebooting, I wanted to do something a little more complicated. I added some more rotations around the other axes, so that the cube would spin and wobble in a more interesting pattern. I also wanted to switch the orthographic matrix out in favor of a proper perspective view (where objects get smaller in the distance and lines converge toward a vanishing point).

Using a perspective view also meant I had to think a little more about my camera position. I added a translation matrix to move the camera (or, alternately, move the world relative to the camera) away from the cube, so that the cube would be entirely within the viewing volume specified by the perspective transform. So it looked something like this:

T = viewport * perspective * translation * X rotation * Y rotation * Z rotation

And again, I could calculate (viewport * perspective * translation) just once, and multiply in the rotation matrices with each frame.

The immediate results of trying this were a fine-looking set of lines scribbled all over my drawing screen. And over some unrelated windows. And then the dreaded Guru Meditation error. Obviously, I’d forgotten something.

Remember how I said a point was defined by four numbers – x, y, z, and w? And that we were doing four-dimensional math? 3D graphics uses something called homogeneous coordinates to represent points (I dare you to read the Wikipedia page in the link). That w parameter is always a 1, and as long as you’re only performing affine transformations (translations, rotations, etc.), w’ will also always be 1.

But a perspective transform is not affine (it doesn’t preserve parallel lines, for one thing), and as a result w’ after a perspective transform is not 1. In order to get our coordinates back down into a regular 2D space, we have to calculate x’/w’ and y’/w’. Once I added those divisions to my code, my cube looked much more like a cube. I just wish I’d remembered to get some screen shots of the more interesting, um, attempts.
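The missing step boils down to a couple of divisions. The sketch below uses a deliberately stripped-down perspective step, not the projection matrix from the book: it assumes the camera sits at the origin looking down the negative Z axis, so that w’ is simply -z. The names are made up for the example.

```c
typedef struct { float x, y, z, w; } Vec4;

/* Minimal perspective step (an assumption for this sketch): camera at the
   origin looking down -Z, so the transformed w is just the distance -z. */
static Vec4 simple_perspective(Vec4 p)
{
    Vec4 r = { p.x, p.y, p.z, -p.z };
    return r;
}

/* The perspective divide: turn [x', y', z', w'] into 2D coordinates.
   Returns 0 without touching the outputs if w' is zero, 1 otherwise. */
static int perspective_divide(Vec4 p, float *sx, float *sy)
{
    if (p.w == 0.0f)
        return 0;
    *sx = p.x / p.w;
    *sy = p.y / p.w;
    return 1;
}
```

The point (2, 1, -4) projects to (0.5, 0.25), while the same point twice as far away at z = -8 projects to (0.25, 0.125). That shrinking with distance is exactly what the orthographic version never did. Skipping the divide leaves the raw x’ and y’ values, which is one way to end up drawing far outside the screen.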

There are a couple more places I could go with this, including hidden line removal and filled polygon surfaces, but the first thing I want to do is take some measurements and see if I can speed it up at least a little.

CJ's Project Blog

software, hardware, and occasional geekery