Adrian Courrèges

DOOM (2016) - Graphics Study

Back in 1993, DOOM pioneered fundamental changes in game design and mechanics. It was a worldwide phenomenon that propelled iconic figures like John Carmack and John Romero to fame.

23 years later, id Software now belongs to Zenimax and all the original founders are gone, but that didn’t prevent the team at id from showing all its talent by delivering a great game.


The new DOOM is a perfect addition to the franchise. It relies on the new id Tech 6 engine, where ex-Crytek developer Tiago Sousa assumed the role of lead renderer programmer after John Carmack’s departure.
Historically, id Software has been known for open-sourcing its engines after a few years, which often leads to nice remakes and breakdowns. Whether this will hold true for id Tech 6 remains to be seen, but we don’t necessarily need the source code to appreciate the nice graphics techniques implemented in the engine.

How a Frame is Rendered

We’ll examine the scene below where the player attacks a Gore Nest defended by some Possessed enemies, right after obtaining the Praetor Suit at the beginning of the game.

Unlike most Windows games released these days, DOOM doesn’t use Direct3D but offers OpenGL and Vulkan backends.
Vulkan being the new hot thing, and Baldur Karlsson having recently added support for it in RenderDoc, it was hard to resist peeking into DOOM’s internals. The following observations are based on the game running with Vulkan on a GTX 980 with all the settings on Ultra; some are guesses, others are taken from the SIGGRAPH presentation by Tiago Sousa and Jean Geffroy.

Mega-Texture Update

The first step is the Mega-Texture update, a technique already present in id Tech 5 (used in RAGE) and now also used in DOOM.
To give a very basic explanation, the idea is that a few huge textures (16k x 8k in DOOM) are allocated in GPU memory, each of them being a collection of 128x128 tiles.

16k x 8k storage with 128 x 128 pages

All these tiles are supposed to represent the ideal set of actual textures, at the right mipmap level, that the pixel shaders will need later to render the particular scene you’re looking at.
When a pixel shader reads from a “virtual texture” it simply ends up reading from some of these 128x128 physical tiles.
Of course, depending on where the player is looking, this set is going to change: new models appear on screen referencing other virtual textures, new tiles must be streamed in, old ones streamed out…
So at the beginning of a frame, DOOM updates a few tiles through vkCmdCopyBufferToImage to bring some actual texture data into GPU memory.
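To make this more concrete, here is a minimal sketch of what uploading one tile could look like. The TileUpdate struct and all the handle names are hypothetical; only vkCmdCopyBufferToImage and its structures are the actual Vulkan API.

```cpp
#include <vulkan/vulkan.h>

// Hypothetical descriptor: which tile of the atlas to update, and where
// its texel data sits inside a staging buffer.
struct TileUpdate { uint32_t tileX, tileY; VkDeviceSize stagingOffset; };

// Records the copy of one 128x128 tile into the 16k x 8k physical texture.
// The image is assumed to already be in VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL.
void recordTileUpload(VkCommandBuffer cmd, VkBuffer staging,
                      VkImage megaTexture, const TileUpdate& tile)
{
    VkBufferImageCopy region{};
    region.bufferOffset      = tile.stagingOffset;
    region.bufferRowLength   = 0;  // 0 = tightly packed
    region.bufferImageHeight = 0;
    region.imageSubresource  = {VK_IMAGE_ASPECT_COLOR_BIT, 0, 0, 1};
    region.imageOffset       = {int32_t(tile.tileX * 128), int32_t(tile.tileY * 128), 0};
    region.imageExtent       = {128, 128, 1};

    vkCmdCopyBufferToImage(cmd, staging, megaTexture,
                           VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, 1, &region);
}
```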

More information about Mega-Textures here and here.

Shadow Map Atlas

For each light casting a shadow, a unique depth map is generated and saved into one tile of a giant 8k x 8k texture atlas. However, not every single depth map is recalculated for every frame: DOOM heavily re-uses the result of the previous frame and regenerates only the depth maps that need to be updated.

8k x 8k Depth Buffer (Previous Frame / Current Frame)

When a light is static and casts shadows only onto static objects, it makes sense to simply keep its depth map as-is instead of doing unnecessary re-calculations. If some enemy is moving under the light though, the depth map must be generated again.
Depth map sizes can vary depending on the light’s distance from the camera; re-generated depth maps also don’t necessarily stay inside the same tile of the atlas.
DOOM has specific optimizations like caching the static portion of a depth map, then computing only the dynamic meshes’ projection and compositing the two results.
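A toy version of that per-light decision could look like the sketch below; all the names are illustrative, not id Tech 6's actual code.

```cpp
// Illustrative per-light shadow state.
struct ShadowLight {
    bool isStatic;          // the light itself never moves
    bool hasDynamicCaster;  // a moving mesh currently intersects its frustum
};

// Decide whether this light's tile in the 8k x 8k atlas must be re-rendered
// this frame, or whether last frame's result can be kept as-is.
bool needsShadowUpdate(const ShadowLight& light)
{
    // Static light shining only on static geometry: reuse the cached depth map.
    if (light.isStatic && !light.hasDynamicCaster)
        return false;

    // Otherwise re-render, ideally compositing the cached static depth
    // with a fresh projection of the dynamic meshes only.
    return true;
}
```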

Getting Banned from Google Play

A couple of weeks ago I received an email from Google Play notifying me that my open-source game Exp3D had been suspended for a policy violation.

Different things can happen to an application when it’s in violation.
It can be removed: this is like a warning, your application is not visible on the market anymore but you get the chance to fix the problem and republish it.
An application can also be suspended – as in banned forever: all your download figures, stats and ratings disappear. If you have an associated AdMob account, your old APK will stop serving ads. In the Google Play Developer Console you cannot even access the details of your application; everything is locked.

My game got suspended without warning – one strike and you’re out.

But Why?

Which unforgivable sin did I commit? Copyright violation, a DMCA take-down? No, I designed all the art myself: the 3D models, sprites, textures, artwork… The few 3rd-party assets I’m using were either public domain or diligently purchased (I bought a license for the music tracks).

What Google was reproaching me for was a “Metadata policy violation”.
It’s not the kind of “metadata” the NSA collects; I’m not aggregating private data about my players. My game is actually very privacy-friendly: the code is on GitHub, and the only permission required is for the IAPs. This is its installation dialog:

By metadata Google basically means description on the market. The exact terms are on this page.

Playing by the Rules

Google has a handy example of a “bad market description”; let’s see:

So… you can be banned for something as small as providing “excessive details”.
At that point Google might as well say “we’ll suspend you for whatever reason we like”.

I had quotes in my description, but they were from official review websites, not from users, so I guess that part is OK.
The main reason I was banned is that I used other game names in my description, like “Ikaruga” and “DoDonPachi”. I never made a secret that these games were a source of inspiration for Exp3D, and I didn’t think it was inappropriate to mention them since they are relevant to the game.

But anyway, my intent isn’t to criticize the policy: you’re on a platform, you play by the rules whether you agree with them or not. If Google wants to impose such restrictions, so be it.
What I do have a problem with is the fact that I was banned without any warning – no chance to fix or even know about the issue before being taken down.
I hadn’t made any change to my market description for over 3 years, so this ban came out of the blue.

Options

Google gives you 2 choices when your application is suspended.

You can start over: upload a new APK with a different package name and URL, but it means losing all your users and stats. And you’re warned that should you commit a violation again, your Google account may be terminated altogether.

Option #2 is to file an appeal. If you look online you’ll see that most appeals get rejected. You can’t plead for a “second chance”: you must first declare that you have not violated any policy and that the takedown was a mistake.

Happy Ending

I filed an appeal without much hope, and surprisingly, 2 days later my application was reinstated.
I admit I wasn’t really counting on it.
The application status changed from “suspended” to “removed”, which meant I had a chance to make adjustments and make it visible again on the market.

A week later AdMob resumed serving ads and all was back to normal.

Not sure how I feel about all this. Mixed feelings.
I’m relieved my game is back, but none of this would have happened if some Google bot (I presume it was an algorithm) hadn’t been so heavy-handed in the first place.

How many other applications have been unfairly removed after the first strike?
The priority should be banning malware from the store, not a perfectly harmless space shooter.

GTA V - Graphics Study - Part 3

Post Processing Effects

Post-effects are performed once the scene has been rendered, to enhance the image, fix artifacts, change the mood…
We saw in Part 1 how a few post-effects like bloom, anti-aliasing or tone-mapping were applied. But there are several other effects used in GTA V.

Lens Flares & Light Streaks

When light goes through a real-world lens, scattering and internal reflections sometimes cause artifacts.
What I call “lens flares” here is a collection of bright spots along the axis defined by a bright light source and the center of the screen. There are also “light streaks”, rays originating from the light source. These artifacts are very common in movies; when they are added to a game frame they can give a kind of “cinematic” feeling.

Lens Flares & Light Streaks

There are usually 2 ways to render such artifacts:

  • image-based: extracting the brightest areas, duplicating and deforming them. Works for any number of bright light sources.
  • sprite-based: adding textured-sprites and managing their positions manually. Each light source must be handled separately but artists can have more control over the artifact shape, color, intensity…

GTA V actually uses both techniques: the image-based approach adds a subtle blue halo at the bottom-left corner of the image, which is actually a mirrored copy of the bright-pass buffer. But the most visible artifacts in this scene result from the sprite-based approach, which is applied to the sun only. First, light streaks are added by rendering 12 rotated quads centered on the sun. Then, for the lens flares, 70 sprites are drawn along the “sun – screen center” axis, as sketched in the snippet further below. The artifacts come closer to each other as the camera points towards the sun.

Base → Base + Light Streaks → Base + Light Streaks + Lens Flares (+ Wireframe Highlight)
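Here is a rough sketch of that sprite placement, positions only; the FlareSprite struct and every name below are hypothetical.

```cpp
#include <vector>

// Hypothetical flare description: 't' parameterizes the position along the
// sun -> screen-center axis (0 = on the sun, 1 = at the screen center,
// values > 1 continue past the center).
struct FlareSprite { float t; float size; };

// Computes each sprite's position in normalized screen space [0,1]^2.
// As the camera turns towards the sun, the sun approaches (0.5, 0.5), the
// axis shrinks, and the artifacts bunch up together.
void placeFlares(float sunX, float sunY,
                 const std::vector<FlareSprite>& sprites,
                 std::vector<float>& outXY)
{
    const float axisX = 0.5f - sunX;
    const float axisY = 0.5f - sunY;
    for (const FlareSprite& s : sprites) {
        outXY.push_back(sunX + s.t * axisX);
        outXY.push_back(sunY + s.t * axisY);
        // a textured quad of size s.size would then be drawn at this point
    }
}
```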

There are several sprites the engine uses to simulate different lens artifacts:

Base → Base + Flares (+ Wireframe)

GTA V is all about attention to detail, and lens flares are no exception: their size is proportional to the aperture of the camera. So if you suddenly look towards the sun, the lens flares are big at first, but as the aperture narrows to lower the exposure, the lens flares become smaller too. The animation below illustrates the phenomenon.
Another nice detail: if you switch to the first-person view, there are barely any lens flares visible, because we are now seeing through human eyes, not through a camera anymore.

Large Aperture / Small Aperture

GTA V - Graphics Study - Part 2

Level of Detail

If there’s one domain where Rockstar outperforms the competition, it’s LOD. The world of Los Santos exists in many different versions with more or fewer details/polygons, everything being streamed live while you play, without a blocking loading screen. It makes the whole experience so much more immersive.

Lights

All the little lights you see in the far distance are real, you can drive towards them and find the bulb that casts the light.

– Aaron Garbut

This is what Aaron Garbut, one of the Rockstar North founders and its art director, declared shortly before the PS3 release.
How accurate is his statement? Let’s consider this night scene:

Light Bulbs (Before / After)

Well, it simply is the truth: every single small light spot you can see is a quad rendered with a small 32x32 texture like the one on the right.

They are all heavily batched into instanced geometry, but it still represents tens of thousands of polygons pushed to the GPU.

Light Bulbs – Wireframe – Depth Test Pass/Fail
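To give an idea of how such batching can work, here is a hedged D3D11 sketch: the instance layout and every name are made up, only DrawIndexedInstanced is the real API.

```cpp
#include <d3d11.h>
#include <cstdint>

// Hypothetical per-instance data: one entry per light bulb.
struct BulbInstance { float worldPos[3]; uint32_t colorRGBA; };

// One shared quad (4 vertices / 6 indices) is bound once; per-bulb data
// comes from an instance buffer, so thousands of bulbs cost a single call.
void drawLightBulbs(ID3D11DeviceContext* ctx, UINT bulbCount)
{
    // Vertex/index/instance buffers and shaders are assumed already bound.
    ctx->DrawIndexedInstanced(6,          // index count of the shared quad
                              bulbCount,  // one instance per bulb
                              0, 0, 0);
}
```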

And these are not just static geometry: the car headlights also move along the roads, updated in real-time. Of course, at this distance there is no need to render the full car models; 2 headlights are enough to create the illusion. But if you decide to go near some distant light, the LOD increases as you come closer, and eventually the full model of the car is drawn.

Low-Poly Meshes

Let’s go back to the frame we dissected previously. Some really vast portions of the world are rendered in a single draw call. For example, consider the rendering of the hill below:

Hill rendered in a single draw-call

So what exactly is that tiny hill far away?

Turns out it’s not small at all: it’s actually Vinewood Hills, a big area spanning several square kilometers, with dozens of houses and buildings.
There’s the Galileo Observatory sitting on the top of the hill, the Sisyphus Theater, Lake Vinewood… all areas which, when you explore them or drive by them, are rendered with thousands of draw calls and tons of polygons.

But in our case, this zone is far away, so a low-poly version is rendered: a single draw-call pushing only 2500 triangles.

Low-Poly Model of Vinewood Hills

All the rendering is done with a single mesh reading from one diffuse texture. Even if some tools exist to convert a mesh to a lower-poly version, they can’t fully automate the process, and I wouldn’t be surprised if the 3D artists at Rockstar spent days fine-tuning the meshes manually.

GTA V - Graphics Study

The Grand Theft Auto series has come a long way since the first installment came out back in 1997. About 2 years ago, Rockstar released GTA V. The game was an instant success, selling 11 million units in the first 24 hours and instantly smashing 7 world records.

Having played it on PS3, I was quite impressed by the level of polish and the technical quality of the game.
Nothing kills immersion more than a loading screen: in GTA V you can play for hours, driving hundreds of kilometers across a huge open world without a single interruption. Considering the heavy streaming of assets going on and the specs of the PS3 (256MB of RAM and 256MB of video memory), I’m amazed the game doesn’t crash after 20 minutes; it’s a real technical feat.

Here I will be talking about the PC version in DirectX 11 mode, which eats up several GBs of memory from both the RAM and the GPU. Even if my observations are PC-specific, I believe many can apply to the PS4 and to a certain extent the PS3.

Dissecting a Frame

So here is the frame we’ll examine: Michael, in front of his fancy Rapid GT, the beautiful city of Los Santos in the background.

GTA V uses a deferred rendering pipeline, working with many HDR buffers. These buffers can’t be displayed correctly as-is on a monitor, so I post-processed them with a simple Reinhard operator to bring them back to 8 bits per channel.
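For reference, the simple Reinhard operator maps an HDR value x to x / (1 + x); a minimal version of such a conversion (gamma handling omitted) could look like this:

```cpp
#include <algorithm>
#include <cstdint>

// Tone-maps one HDR channel value down to 8 bits with the simple Reinhard
// operator: ldr = hdr / (1 + hdr), which maps [0, inf) into [0, 1).
uint8_t reinhardTo8Bit(float hdr)
{
    float ldr = hdr / (1.0f + hdr);
    return static_cast<uint8_t>(std::min(ldr, 1.0f) * 255.0f + 0.5f);
}
```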

Environment Cubemap

As a first step, the game renders a cubemap of the environment. This cubemap is generated in realtime every frame; its purpose is to help render realistic reflections later. This part is forward-rendered.
How is such a cubemap rendered? For those not familiar with the technique, it is just like taking a panoramic picture in the real world: put the camera on a tripod, imagine you’re standing right in the middle of a big cube, and shoot at the 6 faces of the cube one by one, rotating by 90° each time.

This is exactly what the game does: each face is rendered into a 128x128 HDR texture. The first face is rendered like this:

The same process is repeated for the 5 remaining faces, and we finally obtain the cubemap:

Each face is rendered with about 30 draw calls; the meshes are very low-poly and only the “scenery” is drawn (terrain, sky, certain buildings) – characters and cars are not rendered.
This is why in the game your own car reflects the environment quite well, but other cars are not reflected, and neither are characters.
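For illustration, the six camera orientations typically used for such a capture look like this; a sketch following the usual cubemap conventions, the game’s exact axes may differ.

```cpp
// Each of the six faces is rendered with a 90-degree field of view and a
// 1:1 aspect ratio, into its own 128x128 HDR target.
struct Vec3 { float x, y, z; };
struct CubeFace { Vec3 forward; Vec3 up; };

const CubeFace kCubeFaces[6] = {
    {{+1, 0, 0}, {0, +1, 0}},  // +X
    {{-1, 0, 0}, {0, +1, 0}},  // -X
    {{0, +1, 0}, {0, 0, -1}},  // +Y (different up vector to stay valid)
    {{0, -1, 0}, {0, 0, +1}},  // -Y
    {{0, 0, +1}, {0, +1, 0}},  // +Z
    {{0, 0, -1}, {0, +1, 0}},  // -Z
};
// For each face: build a look-at view matrix from (forward, up),
// render the scene, then move on to the next face.
```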

Cubemap to Dual-Paraboloid Map

The environment cubemap we obtained is then converted to a dual-paraboloid map.
The cube is simply projected into a different space; the projection looks similar to sphere mapping but with 2 “hemispheres”.

Why such a conversion? I guess it is (as always) about optimization: with a cubemap the fragment shader can potentially access 6 faces of 128x128 texels; the dual-paraboloid map brings it down to 2 “hemispheres” of 128x128. Even better: since the camera is most of the time on top of the car, most of the accesses will hit the top hemisphere.
The dual-paraboloid projection preserves the details of the reflection right at the top and the bottom, at the expense of the sides. For GTA that’s fine: the car roofs and hoods are usually facing up, so they mainly need the reflection from the top to look good.

Plus, cubemaps can be problematic around their edges: if each face is mip-mapped independently, some seams will be noticeable around the borders, and older-generation GPUs don’t support filtering across faces. A dual-paraboloid map does not suffer from such issues; it can be mip-mapped easily without creating seams.
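The projection math itself is compact. Below is a sketch of the standard dual-paraboloid lookup, i.e. the textbook formulation, not necessarily GTA V’s exact convention:

```cpp
#include <cmath>

struct Float2 { float u, v; };
struct Float3 { float x, y, z; };

// Maps a normalized direction 'd' to [0,1]^2 coordinates in one of the two
// paraboloid maps; 'front' tells which hemisphere map to sample.
Float2 dualParaboloidUV(const Float3& d, bool& front)
{
    front = (d.z >= 0.0f);
    const float denom = 1.0f + std::fabs(d.z);  // paraboloid projection
    // (d.x / denom, d.y / denom) lies in [-1, 1]; remap to [0, 1].
    return { (d.x / denom) * 0.5f + 0.5f,
             (d.y / denom) * 0.5f + 0.5f };
}
```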

Update: as pointed out in the comments below, it seems GTA IV also relied on a dual-paraboloid map, except it was not produced as a post-process from a cubemap: the meshes were distorted directly by a vertex shader.

Culling and Level of Detail

This step is processed by a compute shader, so I don’t have any illustration for it.
Depending on its distance from the camera, an object will be drawn with a lower- or higher-poly mesh, or not drawn at all.
For example, beyond a certain distance the grass and flowers are never rendered. So this step calculates, for each object, whether it will be rendered and at which LOD, along the lines of the toy snippet below.
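This is only the spirit of the decision; the distance thresholds are entirely made up.

```cpp
// Toy LOD selection. Returns -1 when the object should not be drawn at all.
int selectLod(float distanceToCamera, bool isSmallDetail /* grass, flowers */)
{
    if (isSmallDetail && distanceToCamera > 50.0f)  return -1;  // culled
    if (distanceToCamera > 200.0f)                  return 2;   // low-poly mesh
    if (distanceToCamera > 50.0f)                   return 1;   // medium
    return 0;                                                   // full detail
}
```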

This is actually where the pipeline differs between the PS3 (which lacks compute shader support) and a PC or a PS4: on the PS3 all these calculations would have to be run on the Cell’s SPUs.

G-Buffer Generation

The “main” rendering happens here. All the visible meshes are drawn one by one, but instead of calculating the shading immediately, the draw calls simply output some shading-related information into several buffers, collectively called the G-Buffer. GTA V uses MRT (multiple render targets) so each draw call can output to 5 render targets at once.
Later, all these buffers are combined to calculate the final shading of each pixel. Hence the name “deferred”, in opposition to “forward” rendering, where each draw call calculates the final shading value of a pixel.
For this step only the opaque objects are drawn; transparent meshes like glass need special treatment in a deferred pipeline and will be handled later.
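In D3D11 terms, binding such a layout is a one-liner. A hedged sketch: the target count matches the 5 mentioned above, but the formats and contents are not specified here.

```cpp
#include <d3d11.h>

// Binds 5 render targets plus depth: every draw call of the G-Buffer pass
// then writes to SV_Target0 .. SV_Target4 from its pixel shader.
void bindGBuffer(ID3D11DeviceContext* ctx,
                 ID3D11RenderTargetView* const rtvs[5],
                 ID3D11DepthStencilView* depth)
{
    ctx->OMSetRenderTargets(5, rtvs, depth);
}
```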

G-Buffer Generation: 15% → 30% → 50% → 75% → 100%