Building a Renderer with Metal (Part 1)

Kemal Enver
6 min read · Sep 12, 2021


There have been incredible advances in the world of computer graphics over the past thirty years. Graphics hardware is more powerful and capable than ever, and developer tools have advanced in many ways. One of the nicest things about Metal is the tooling that’s available to debug and visualise what’s happening under the hood.

These changes have contributed to the evolution of graphics APIs, and it can be daunting to know where to begin the journey of learning so much material. Although APIs have become more complex there are some core mathematical concepts, particularly in the field of linear algebra, that will always remain relevant—no matter what tools and APIs you use.

This series of posts will provide a brief overview of where we have come from, where we are now, and how to get set up and going in a pragmatic way so you’re ready to learn more advanced concepts and techniques when you need to.

Over the course of the series I will explain how to build a basic 3D renderer in Swift and Metal in a way that makes it reusable and simple to understand. Along the way I’ll cover the core principles required to create this renderer, including how to use the Metal API, and the core mathematics that drive 3D rendering. I’ll also provide references to further reading for when you’re ready to explore further.

An overview of defunct rendering techniques

My first experience with rendering was with OpenGL while I was studying at university around 2002. I had further exposure to this technology while working at a game studio and a CAD (Computer Aided Design) business.

The APIs emerging in the early ’90s were a foray into programming hardware dedicated to rendering graphics. Up to this point, graphics applications did all of the drawing and necessary calculations in software running on general-purpose CPUs.

OpenGL was one of these first APIs. It was (and still is!) a cross-platform, low-level graphics API, now maintained by the Khronos Group and based on SGI’s earlier proprietary API, IRIS GL. By using OpenGL a developer could target any graphics card with an OpenGL driver (or even fall back to software rendering). Other APIs were making moves around this time, such as DirectX, which only worked on Microsoft Windows systems (a bit like how Metal only works on Apple’s platforms today). This overview will cover the general capabilities and methods available to developers in those early days by focusing on OpenGL.

In its infancy, OpenGL was built around a state-machine model. As a developer you would write code that set states, drew things to the screen, and then reset those states. This style of development was called immediate mode and was conceptually fairly straightforward to understand.

Drawing a coloured triangle using OpenGL immediate mode. This is easy to understand but very inefficient due to having to call a function for every vertex and its attributes.
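As a rough sketch of that style (not the post’s original listing), immediate-mode code for a coloured triangle might look something like the snippet below. It assumes a window and a legacy OpenGL context have already been created, and the function name is purely illustrative:

    #include <OpenGL/gl.h>   /* legacy fixed-function OpenGL; header path varies by platform */

    /* Immediate mode: one function call per colour and per vertex. */
    void drawTriangle(void)
    {
        glBegin(GL_TRIANGLES);
            glColor3f(1.0f, 0.0f, 0.0f);   /* red */
            glVertex2f(-0.5f, -0.5f);

            glColor3f(0.0f, 1.0f, 0.0f);   /* green */
            glVertex2f(0.5f, -0.5f);

            glColor3f(0.0f, 0.0f, 1.0f);   /* blue */
            glVertex2f(0.0f, 0.5f);
        glEnd();
    }

Even this tiny example needs six calls just to describe three vertices; scale that up to thousands of vertices per frame and the cost becomes obvious.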
Hello Triangle - The OpenGL code above produces something similar to this. This particular triangle was rendered using Metal on an iPad.

Sending commands to the graphics card in this way is synchronous and repetitive—which has serious consequences for performance. In this environment the hardware is often idle, waiting for the application to set all the required states and send data.

To get around these limitations, OpenGL needed enhancements. These enhancements were collectively dubbed retained mode. It’s worth noting that retained mode is a collection of APIs rather than an official term; it’s better to think of it as ‘not immediate mode’.

One of the goals of these enhancements was to minimise the number of functions that needed to be called, and to minimise data copying. In immediate mode rendering, every vertex required a function call. For a large 3D model there may be thousands of vertices to loop over, which eventually becomes a serious bottleneck. Below is a brief explanation of some of the things that changed. Don’t worry, you don’t need to understand the details, only that things evolved to solve problems as they arose.

The first enhancement to address these bottlenecks was called a vertex attribute array. Instead of calling a function many times for a set of vertices, a pointer was set to a chunk of memory that held all of the data. This was an improvement, but it still had a drawback: the application code could change the data in the array at any time. This meant OpenGL had to keep its own copy of the data, which required an expensive copy on every draw call.

An example of setting a vertex array pointer and drawing the vertices.
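As an illustration of that idea (again a sketch rather than the post’s original listing, reusing the hypothetical triangle data from the earlier snippet), the pattern looks roughly like this:

    #include <OpenGL/gl.h>

    /* Interleaved data: x, y followed by r, g, b for each of the three vertices. */
    static const GLfloat triangle[] = {
        -0.5f, -0.5f,   1.0f, 0.0f, 0.0f,
         0.5f, -0.5f,   0.0f, 1.0f, 0.0f,
         0.0f,  0.5f,   0.0f, 0.0f, 1.0f,
    };

    void drawTriangleWithArrays(void)
    {
        glEnableClientState(GL_VERTEX_ARRAY);
        glEnableClientState(GL_COLOR_ARRAY);

        /* Point OpenGL at the application's memory; the driver may copy it on every draw. */
        glVertexPointer(2, GL_FLOAT, 5 * sizeof(GLfloat), triangle);
        glColorPointer(3, GL_FLOAT, 5 * sizeof(GLfloat), triangle + 2);

        /* One call draws all three vertices instead of one call per attribute. */
        glDrawArrays(GL_TRIANGLES, 0, 3);

        glDisableClientState(GL_COLOR_ARRAY);
        glDisableClientState(GL_VERTEX_ARRAY);
    }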

The next evolution came with VBOs (vertex buffer objects). These let the application upload vertex data once into memory managed by OpenGL, so the data no longer had to be recopied on every draw; the application only updates the buffer when the data actually changes.
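Continuing the same hypothetical triangle, a VBO version might look something like the sketch below. The data is uploaded once into a buffer that OpenGL manages, and at draw time the pointer arguments become byte offsets into that buffer rather than addresses in application memory:

    #include <OpenGL/gl.h>

    static GLuint vbo;

    /* Called once at start-up: copy the vertex data into an OpenGL-managed buffer. */
    void createTriangleBuffer(const GLfloat *data, GLsizeiptr size)
    {
        glGenBuffers(1, &vbo);
        glBindBuffer(GL_ARRAY_BUFFER, vbo);
        glBufferData(GL_ARRAY_BUFFER, size, data, GL_STATIC_DRAW);
    }

    /* Called every frame: no vertex data is copied, only a handful of state changes. */
    void drawTriangleWithVBO(void)
    {
        glBindBuffer(GL_ARRAY_BUFFER, vbo);
        glEnableClientState(GL_VERTEX_ARRAY);
        glEnableClientState(GL_COLOR_ARRAY);

        /* With a buffer bound, the "pointer" arguments are byte offsets into it. */
        glVertexPointer(2, GL_FLOAT, 5 * sizeof(GLfloat), (const void *)0);
        glColorPointer(3, GL_FLOAT, 5 * sizeof(GLfloat), (const void *)(2 * sizeof(GLfloat)));

        glDrawArrays(GL_TRIANGLES, 0, 3);

        glDisableClientState(GL_COLOR_ARRAY);
        glDisableClientState(GL_VERTEX_ARRAY);
        glBindBuffer(GL_ARRAY_BUFFER, 0);
    }

If the data does change later, the application updates the buffer explicitly (for example with glBufferSubData) instead of the driver recopying everything on every draw.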

There have been more enhancements since, which have further resolved some of the early deficiencies.

Recent versions of OpenGL no longer support immediate mode, and every enhancement since the first version has reduced inefficiency by putting more control in the hands of developers. Although a lot has been improved, the API has grown more complex and harder to learn.

One of the attractive properties of a modern API like Metal is that it doesn’t carry forward all of the legacy issues that affect older APIs such as OpenGL.

Should you learn Metal?

If you’re interested in this series you’re likely feeling a bit overwhelmed by all of the technology and jargon that surrounds 3D graphics. In the section above I mentioned a bunch of things which probably sounded really confusing.

Before you begin, consider whether something as low level as Metal is the right tool for the job.

If you’re interested in creating a game, starting with Metal is probably a bad idea. You should consider learning a dedicated game engine such as Unreal, Unity, SceneKit, RealityKit (for AR), or Godot. These tools will allow you to render graphics, but will also provide you with other things such as physics engines (to handle collisions), UI components for drawing menus, sound APIs for music and sound effects, networking libraries for multiplayer games, and a multitude of other benefits. The feature set of the renderers in these tools is likely to be many times greater than what an individual, especially a beginner, could expect to create on their own.

Given the above, there are still some great reasons to learn how to write your own renderer! If you’re writing something with a unique graphical style, writing your own renderer will undoubtedly give you more control. If the number of renderer features you need is fairly small, you can absolutely build your own, adding more features as required. By doing this you have decades of graphics literature and research to fall back on and learn from!

Writing your own renderer is a great learning experience. The Metal API, and others like it such as Vulkan and DirectX, bring you close to the hardware and teach you a lot about how things work. By persisting you’ll also become intimately familiar with geometry and linear algebra, which is a skill in itself.

What’s next?

I’m expecting this series to consist of about 20 posts over the course of a year. I won’t be providing GitHub code, but will instead be making everything as clear as I can in this series. This technology has a steep learning curve, but by writing your own renderer and typing out the code (possibly multiple times), the material will stick and you’ll be grateful for it later.

After each post I’ll encourage you to read any references and play around with what has been covered.

In the next post I’ll be walking through how to set up a Mac environment to begin developing your renderer. At the end of that post you’ll understand the render process at a very high level, and where you as a developer fit in.

Follow me on Twitter: @kemalenver

Glossary

Below are some brief descriptions of the terms used in the article.

  • Vertex (plural: vertices)
    A point in space, e.g. (x, y) or (x, y, z). 3D models are made up of many vertices.



Written by Kemal Enver

Software Engineer with a passion for iOS development. I’ve recently become a Technology Leader, and rediscovered a love of 3D graphics.
