The RTX Scam

In late 2018, NVIDIA launched RTX (Ray Traced Texel Extreme) with its 20 series cards. RTX promises image rendering using real-time light path tracing, instead of the pre-computed shadows and lighting effects of rasterization, which had been the norm until then (and actually still is, more on that later). The 20 series introduced what NVIDIA called RT Cores, special cores designed to accelerate the computation; the existing CUDA cores could not be repurposed for it. This meant anybody wanting to enable RT in their games would need to buy a new GPU from the Big Green.

Almost five years later, what impact does Ray Tracing have on a typical gamer's experience while actually playing? The answer to this question matters more than the number of games adopting the technology, which stands at roughly 80 titles among the thousands released since the first RTX card launched. (We can set aside the debate on gameplay, which arguably matters more than graphical fidelity to old-school gamers like me, because graphical fidelity is exactly what we are discussing here.)

Disclaimer: This article does not aim to discredit the research into path tracing for rendering images. It aims to find out whether it makes sense in the current retail space of video games.

Before proceeding with my observations, let's first see what others have observed, as per one of several such YouTube videos:

Control


With RT on, the reflections are simply more pronounced than with RT off, which renders the floor as a blurry mess. However, while playing the game, the player is more interested in finding the supernatural beings than noticing the window reflection on the floor in most cases. Further, some may feel the reflection in the RT-on image is overdone (see the next example).

Watch Dogs Legion

Again, RT on seems to simply make reflections shinier, something older titles like GTA V or Red Dead Redemption 2 handled through reflection quality settings and screen-space reflections. The RT implementation here actually seems overdone, with the road almost appearing to be made of glass.

Note these two examples from the above video; we will come back to them later. Watch the video in full for more examples.

Saints Row the Third: Remastered

The first suspicion I personally had that the improvement in visual fidelity was questionable came while playing Saints Row The Third Remastered, which the Epic Games Store gave away for free at one point. I was surprised at how good the reflections looked in the rain-soaked streets of Steelport while being completely rasterized. Especially impressive were the reflections in the small puddles scattered across the ground, with neon lights perfectly mirrored. Reflections are not all that Ray Tracing has to offer in theory, but when it was first launched, most demos were shiny and stuck to this context, e.g. Metro Exodus or Battlefield's European city level. Later, of course, Cyberpunk 2077 implemented path tracing that goes beyond simple reflections, and it might still be one of the best RT implementations yet, but that does not absolve most of the 80-odd titles mentioned above, which have implemented RT only in the context of reflections, as is evident from the video linked above.

Hitman 2

The second suspicion came while playing Hitman 2 (2018). The Glacier engine has some really great reflections and overall visual fidelity. The levels of the Isle of Sgàil, Hawke's Bay, Mumbai, Paris, and Thailand in particular offer immersive enough fidelity while the player is finding ways to avoid detection and execute the targets for a Silent Assassin rank (or similar). Hitman is one of the few games where the fidelity of the immediate surroundings, including objects and reflections, matters greatly for immersion, and IO Interactive delivered that even on low-end hardware. I ran it at 1440p 60 fps on Linux through Steam/Proton with a 1070 Ti, and there is really not much missing if one plays on an older card not capable of RT. One would not miss out on anything that Hitman, a genre-defining game, has to offer.

Ghostwire: Tokyo

My suspicion was cemented further recently while playing Ghostwire: Tokyo. This is a 2022 game, which arguably should be too new for a 1070 Ti. But I was wrong! Below are some screenshots taken while playing the game on that card, which does not support RT, so this is purely rasterized rendering (though Ghostwire does support RT too).

Image 1 shows the shadows cast by the phone booth. The others are more representative of reflection rendering in the rain-soaked Shibuya district of Tokyo, with its neon boards and lights (this, and Cyberpunk's Night City, are probably the best locations to showcase any RT implementation).

Image 6 in particular includes the reflection of the neon billboard on the right, which displayed an animated pattern in the game, and the reflections stayed in sync with the billboard's motion.

Again, all the screenshots use rasterized rendering, without any RT. Not particularly related, but the gameplay is at 1440p 60 fps, which the GTX 1070 Ti is perfectly capable of rendering while undervolted to keep temperatures in check, with 8 GB of VRAM, the same amount offered on a 3070 Ti two generations later.

Now let's go back to the two examples from the YouTube video above, Control and Watch Dogs. Compare their "RT on" screenshots with those of Ghostwire with RT off.

As we can see, Ghostwire: Tokyo can achieve in rasterized mode similar visual fidelity to what Control or Watch Dogs achieve with RT.

Of course, a scene using RT will have more accurate reflections and lighting (duh, we're not discounting the theory), but at what cost, and for what benefit? There is no denying that the visual fidelity above in Ghostwire is good enough for most gamers while actually playing, comparable to RT enabled on a capable card. A well-done rasterized game can deliver as immersive an experience as the shiny RT that needs more VRAM, consumes more power, runs hotter, and ultimately adds no tangible benefit to the game over the old technology. So if a game can offer this much fidelity with rasterization alone, why do we need RT, which is still not remotely efficient even five years after launch?

Is RT only so that NVIDIA can sell more cards?

And this is not generalising based on one game only. As mentioned, Saints Row The Third and Hitman also have great rasterized reflections. Suffice it to say that immersive visual fidelity does not depend on RT; it can be had with rasterization too, provided the devs use the right engine and design the lighting and effects well. My observations also cover RT rendering in Cyberpunk and Bright Memory: Infinite, the two better examples in the RT domain, and even games like A Plague Tale: Requiem and Metro Exodus, where the setting does not leave much demand for RT.

Recently I came across another YouTube video discussing why no one is interested in new video cards. Based on my experience with a non-RT card from six years back in at least three titles, I think we know the answer. Companies like NVIDIA have no new hardware revolution going on that can considerably impact a gamer's immersion. The shortage of the last 2-3 years was due to crypto mining, not RT demand. Cards from 5-6 years back can play rasterized games at 1440p/1080p 60 fps just fine. NVIDIA probably saw this coming, so they tried to push RT as the next big revolution in gaming.

RT is more like creating a non-existent problem, then offering a solution.

Five years later, fewer than 2% of the games launched use RT, and even those that do fare pretty well on rasterization (there is no game that works only with RT, by the way, so RT will always be a nice-to-have addition, not a requirement for gaming). Meanwhile, NVIDIA did not increase the VRAM on its 70-class cards for three generations, until the recently launched 4070.

As for the 40 series, amusingly, NVIDIA is now pushing DLSS 3, an upscaling and frame-generation technology, on these cards. It seems they themselves have moved on from RT to more traditional problems in graphics rendering (outliers like Cyberpunk's path-traced "Overdrive" mode notwithstanding). They needed something "new" to sell the 40 series cards, but thankfully we do not see too many suckers this time.

Two more "banes" for new graphics cards. The first is the 1080p-1440p "sweet spot" for gaming resolution. Most people don't need 4K, because 1440p gives good enough fidelity with much reduced aliasing, whether viewed on a 1440p or (even better) a 1080p monitor at 2 feet, or on a 4K TV at 6-8 feet. Try playing a game on a 4K TV at 1080p and see if, at 6 feet, you can notice the difference from 4K or even 1440p. This is also why the heavily advertised "4K" consoles, the Series X and PS5, often dynamically downscale even their "quality" modes to around 1440p. Gamers would rather have locked framerates like the console generations of yore, so Microsoft and Sony prioritise exactly that (even a consistent 30 fps without a single frame drop may feel better to some gamers than 60 fps with dips to 50).

We have almost arrived at the final resolution in video games

Occasional frame drops at 60 fps will remain in the near future too, because that number depends heavily on the CPU and its cache, and only recently have we started exploring CPU designs with larger caches (AMD's Ryzen X3D parts, though they still need to solve their thermal challenges) and performance/efficiency cores (Intel, from Alder Lake onwards); such "better" CPUs have not yet penetrated the entry and mid-range market.

The second "bane" for new card sales is AMD's FSR, which also works on NVIDIA cards as old as the 10 series. If a game is not well optimised, FSR can do wonders. I have used FSR occasionally in Ghostwire with my 1070 Ti, and it drastically reduced power consumption without sacrificing too much fidelity (I play at 1440p on a 1080p monitor, which might help as I don't need AA). Still, FSR is ultimately a stopgap for games that are not well optimised. Going by recent examples like the Resident Evil titles or God of War, a well-designed and well-optimised game will look and play great without RT.

We don't need RT. Immersion depends on the game's design, engine, and optimisation, not on NVIDIA's proprietary technology. So the next time someone tries to sell you an RT card, remember to ask what exactly RT offers that you would miss in rasterization during actual gameplay.

Linux Gaming Experience – 5

In this 5th part of an ongoing series, we look at Dynamic Super Resolution (beyond-1080p gaming on a budget) and Vertical Sync settings for NVIDIA graphics cards on Linux. Though the distribution should not matter, the settings below were tested on Manjaro Linux (with NVIDIA proprietary drivers).

5/n: NVIDIA DSR and V-Sync on Linux

Dynamic Super Resolution (DSR)

Dynamic Super Resolution is a great way of getting anti-aliasing with visual fidelity comparable to the more resource-intensive Multi-Sample Anti-Aliasing (MSAA), as long as there is enough VRAM on the graphics card. The game is rendered at a resolution higher than the monitor's native one, and the image is then downscaled to native resolution, smoothing jagged edges across the whole frame (essentially super-sampling). MSAA, by contrast, is applied by the game engine and only takes extra samples along geometry edges to limit the work, yet it still tends to cause a noticeable FPS hit, and its quality and cost vary from game to game because every engine implements it differently; some do it better than others. So if the card has enough VRAM for the higher-resolution render target, DSR gives us consistent AA without depending on the engine's implementation at all. The card still does more work than rendering at 1080p, but the result is predictable, and using DSR removes the bother of per-game AA settings.

For NVIDIA cards on Windows, this is set pretty easily through the NVIDIA Control Panel. Enable the desired DSR Factors in the panel, and the higher resolutions get unlocked in-game (the game now thinks the monitor supports them), so we can select them like any other resolution. The factors are ratios of total pixel count over the native resolution: on a 1080p monitor, factors of 1.78x and 4.00x unlock 1440p and 2160p respectively in the game's video settings, since 1.78 ≈ (2560×1440)/(1920×1080) and 4.00 = (3840×2160)/(1920×1080). The game then renders at that resolution and the graphics card downscales the image to 1080p, providing super-sampled anti-aliasing essentially for free.

On Linux, however, DSR factors are not yet exposed in the NVIDIA Settings panel. To achieve the same thing, we explicitly set a larger input viewport resolution. So if we want 1440p DSR on a 1080p monitor, we set ViewPortIn to 2560×1440 (assuming the monitor has a 16:9 aspect ratio). ViewPortOut is automatically set to the native resolution, and by default "Panning" auto-populates with whatever we set in ViewPortIn. This also means we cannot have multiple factors at once; we can only use one DSR factor at a time. For 2160p, for example, we would set ViewPortIn to 3840×2160, which then also becomes the desktop resolution.
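For quick testing before touching any config files, the same thing can be applied at runtime through nvidia-settings' metamode assignment. This is a minimal sketch; the mode name may need adjusting for your monitor, and the change only lasts until the X session restarts:

$ nvidia-settings --assign CurrentMetaMode="nvidia-auto-select +0+0 {ViewPortIn=2560x1440, ViewPortOut=1920x1080+0+0}"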

Desktop resolution

This setting changes the desktop resolution too, until it is manually reverted. We are essentially rendering 1440p frames and having the card downscale them to 1080p for the monitor (assuming a 1080p monitor). So we get an artificial, scaled resolution of 1440p, downscaled to the native 1080p so the monitor can display it, as that is all it can do. But since the source image is 1440p, the result is sharper and anti-aliased.

Setting ViewPortIn to a resolution higher than native also increases the perceptible real estate on the screen. In my opinion, that is actually a good thing, and one need not be a gamer to appreciate it. If you are into programming, you will appreciate it even more.

For example, a 27″ monitor is really too big for 1080p; the ideal resolution for that size is 1440p, and this is an effective way to get more out of the monitor. Fonts get a pleasant size and pixel pitch at this resolution. Of course, there will be artifacts if you look too closely, depending on the font being used, but you have to be really close, say 3-4 inches from the screen; anything beyond 1 ft should not be discernible, and most people sit at least 2 ft away. We might need to tinker a bit with font size, scale factor, and sub-pixel anti-aliasing settings, but it won't take long to arrive at the right combination. Mostly we will have to increase the font size and/or the scale factor.

Another way to "undo" the change for the desktop (but keep it in NVIDIA settings for gaming) is to use display scaling, which most desktop environments like GNOME/Budgie support. So if we set DSR (ViewPortIn) to 2160p and then set display scaling to 200%, we get the same effective real estate as 1080p. In my experience, however, I have really not seen any deal-breaking aliasing with DSR, and I am happy with the pixel pitch that comes out of it.
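For example, on GNOME-based desktops the integer scale factor can also be set from a terminal; the exact key and supported values can differ across versions, so treat this as a sketch:

$ gsettings set org.gnome.desktop.interface scaling-factor 2    # 200% scaling
$ gsettings set org.gnome.desktop.interface scaling-factor 0    # back to auto-detect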

NVIDIA Settings

We can set these values in the Linux NVIDIA Settings panel, in the "X Server Display Configuration" section, as shown in Fig. 1. Assuming the NVIDIA proprietary drivers are installed, launch the settings as root from a terminal using

$ sudo nvidia-settings &

We need to open it as root because we will be saving the settings file to

/etc/X11/xorg.conf


Fig. 1: NVIDIA Viewport and Composition settings in Linux

After setting the required value in the ViewPortIn field, and cross-checking the values for ViewPortOut (should be the native resolution) and Panning (should be the same as ViewPortIn), click the "Save to X Configuration File" button to save the settings. This opens a popup with the location pre-populated to somewhere inside /etc/X11/ as shown earlier. Save the file; the "Merge with existing file" option can be left selected.
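For reference, the relevant part of the saved configuration usually ends up looking something like the sketch below; the identifiers and the mode string will differ per setup:

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    Option         "metamodes" "nvidia-auto-select +0+0 {ViewPortIn=2560x1440, ViewPortOut=1920x1080+0+0}"
EndSection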

For Manjaro, the location can be different because we will read this file through the mhwd utility. Basically, after saving the file, we need to enable it. In most distributions, that means adding the nvidia-settings line below to ~/.xinitrc, before the final exec line (shown for context), then saving and re-logging in (i.e. restarting the X server):

$ vim ~/.xinitrc

# nvidia-settings
nvidia-settings --load-config-only

exec $(get_session)

For Manjaro, however, we need to point mhwd at the xorg file explicitly. So save the file from the NVIDIA Settings panel to another location, i.e.

/etc/X11/mhwd.d/nvidia.conf

then configure to read this file in mhwd using

$ sudo mhwd-gpu --setmod nvidia --setxorg /etc/X11/mhwd.d/nvidia.conf

and reboot.

Frame Limiting and Vertical Sync

Another thing we might want to do is achieve V-Sync through NVIDIA instead of the individual game engine, to prevent screen tearing. Tearing often shows up when the video card can render frames faster than the monitor's native refresh rate: portions of the screen get updated from different frames, causing a visible tear at the border between them and breaking immersion. The way to fix it is to set a frame limit and/or enable V-Sync. The in-game versions of these settings, even when available, often do not work the way we want, or at the very least cause more trouble than they solve, such as additional stuttering or lag.

A frame limit is best achieved using a third-party monitoring tool like RivaTuner Statistics Server on Windows (a companion application of MSI Afterburner), or MangoHud on Linux (see the previous article in this series). With MangoHud, we can limit the framerate in MangoHud.conf, one of the possible locations of which is:

$ vim ~/.config/MangoHud/MangoHud.conf

In this file, we set the fps_limit variable to the frame limit we want:

fps_limit=60

Now, if we launch the game through MangoHud, or enable it via the game's Steam launch options (see the MangoHud readme), the framerate will not exceed 60 FPS. This might already be enough to prevent screen tearing in most games, but for a complete solution we should also enable V-Sync through NVIDIA.
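For example, a native Linux game can be wrapped directly from a terminal (the binary name here is just a placeholder):

$ mangohud ./my-game-binary

For a Steam title, set the game's launch options to:

mangohud %command%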

The FPS often takes a hit if we enable V-Sync in-game, so it is preferable to do it through NVIDIA settings. Again, this is pretty easy on Windows, where we just set V-Sync to "On" or "Adaptive" depending on whether the card can take it. It is not much more difficult to set it up on Linux either.

NVIDIA Settings

We can achieve vertical sync using NVIDIA Settings, in the same configuration section, by enabling the "Force Full Composition Pipeline" checkbox as shown in Fig. 1. Doing this automatically selects the "Force Composition Pipeline" checkbox as well. This forces rendering at the monitor's native refresh rate; on my monitor that is 60 Hz, and it works flawlessly without any screen tearing.
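The same can be done from a terminal with a metamode assignment; a minimal sketch, assuming the default display and mode (if DSR is also in use, the ViewPortIn/ViewPortOut values from earlier go inside the same braces):

$ nvidia-settings --assign CurrentMetaMode="nvidia-auto-select +0+0 {ForceFullCompositionPipeline=On}"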

Note that if the game is too much work for the card, i.e. it cannot render at 60 FPS in the first place, then these settings (one or both) will cause stuttering, as the renderer waits for the next frame interval before proceeding.

In some games these settings can actually cause stutters, annoying to the point of making the game unplayable. Should that happen, first make sure the graphics settings other than resolution are toned down enough for the card to render at the locked framerate. In some cases we might have to trade off AA against a smooth framerate, in which case we can revert the DSR changes. Alternatively, try setting fps_limit to 30 or 45 in MangoHud.conf, or commenting it out altogether. If it still does not work, disable the above checkbox in NVIDIA settings. In any case, screen tearing is only an issue when the framerate goes above 60; we won't see it otherwise.

Conclusion

It is really easy to achieve DSR and V-Sync on Linux, for efficient AA and/or a (screen) tear-free experience, if the hardware supports it. Most games released up to 2017 should be playable with DSR at 1440p (from native 1080p) on a card with 4+ GB of VRAM; those that aren't can be tried with lower texture or other graphics settings, and may still look better with DSR, since VRAM is the limiting factor for rendering high-resolution textures. And if the card can push more than 60 FPS, we can also enable V-Sync for the perfect PC gaming experience.

As we have seen so far in my series on Linux gaming, the above settings bring the experience even closer to that on Windows. Now if only we could have 1:1 parity between the Windows and Linux control panel settings for NVIDIA. There is no reason for them to differ: most functions are supported by the hardware on Linux too, they may just be named differently or lack a Linux wrapper call, which would not be too much effort to write. Then we would be one step closer to parity with Windows. We have come a long way already, so I have hopes…

Reset Nvidia Drivers on Linux

A major update (especially one involving the kernel) can break a perfectly working desktop environment. Of late, most popular distributions have sorted this out quite well; I have had a really good experience with Pop!_OS and Solus on this front. But there can always be a corner case where it occurs.

For example, just for fun, I put an 11-year-old 9600 GT card in the box to see if it still works. Then Pop!_OS updated on its schedule, including the kernel, and it broke the Nvidia driver.

Firstly, not all driver versions will work, as this is an old card. The last version that supports it is 340.x, whereas the latest that Pop installs (and ships in its ISO image) is 440.x; no version after 340.x lists the 9600 GT among its supported cards. So we need to stick with 340.x here. I had downloaded and installed this version of the driver after putting in the old card.

It is always a good idea to have a working driver saved somewhere on disk, so that we can fall back to it in these situations. Before updating the system kernel, download the driver from the Nvidia website: search for the card, select Linux 64-bit, and download the file

NVIDIA-Linux-x86_64-340.108.run

to somewhere on the disk. Note the location, say

$HOME/Downloads

Of course, for other cards, download the latest version available for them. If you have a card released in the last 5 years, the latest drivers should do.

Now, install the OS updates normally. After the reboot, if the UI (i.e. the login screen) does not come up, fall back to a terminal using Ctrl+Alt+F3. Log in with your user and password, and navigate to the location where the driver was downloaded.
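The .run installer refuses to proceed while an X server is active, so if it complains about that at a later step, the display manager may still be running in the background and needs to be stopped first. A hedged example, since the service name depends on the desktop in use:

$ sudo systemctl stop gdm    # or sddm / lightdm, depending on the desktop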

Make the file executable

$ cd $HOME/Downloads
$ chmod 755 NVIDIA-Linux-x86_64-340.108.run

Run as root

$ sudo ./NVIDIA-Linux-x86_64-340.108.run

Follow the prompts. Most options can simply be accepted; however, depending on the driver version's compatibility with the kernel, the DKMS module may fail to build (as it did in my case). In that case, re-run the installer without selecting the DKMS option.

Also install the 32-bit compatibility libraries when prompted, as Steam and the like need them. Once done,

$ reboot

and we should now boot straight to the login screen.
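To confirm the driver actually loaded, something like the following should work after logging in (nvidia-smi may report only limited information on cards this old):

$ lsmod | grep nvidia
$ nvidia-smi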

Ray Tracing

With the launch of RTX cards by Nvidia, ray tracing is suddenly the hottest technology on the market. The technology itself is not new, but using the concept efficiently in real time was science fiction even a few years ago.

Ray tracing means rendering a scene in real time by tracing light from its sources through the scene. If you are playing FHD @ 60 Hz, that means rendering 60 frames of 1920×1080 pixels every second. Most games today are rendered using rasterization, where the final bitmaps (matrices holding RGBA values at each coordinate/pixel) are produced from pre-baked or approximated lighting, whereas ray tracing computes these values on the go depending on the light sources present in a given scene.

If we compare a scene ray-traced on the new RTX graphics cards against one rendered by some of the latest game engines (without ray tracing), it is hard to distinguish the two, because contemporary game engines have become that good. With so little difference in visual fidelity, most ray-tracing demos resort to over-the-top reflections that look like plastic, in an attempt to sell the new 'shiny' (pun intended) thing. Fortunately, shiny reflections and ultra shadows are not what ray tracing is actually about.

The real promise of ray tracing would be to have exactly one high-resolution texture (map) for each object in the game, and then render that same object differently under varied lighting. If a game engine supported this, the disk size of the game would shrink considerably: instead of shipping multiple maps, one for each lighting condition the engine needs to support, it would ship only one (at the highest resolution, since upscaling would otherwise be required and does not preserve fidelity), with the variations rendered in real time. Ray tracing effectively moves the requirement from storage to processing.

The 1199 dollar question (pun intended), though, is whether this is going to happen now for ray-tracing-enabled games, in a way that makes these cards worth it. A simple comparison of the disk space requirements of Shadow of the Tomb Raider (40 GB, ray-tracing support) and Rise of the Tomb Raider (25 GB, raster graphics) shows that it is not the case. One justification for the higher disk space may be that the new game is also built for raster-based rendering, so it has to carry the excess baggage; the thing to look for, then, is a separate, lighter build (game distribution) exclusively for ray tracing, and as of now, there isn't one.

This is nothing more than the established cycle of technology: ray tracing (i.e. efficient rendering using the concept) is new, and game engines (i.e. games) are not there yet to fully utilize its potential. In a couple of years we will see new game engines doing real-time ray-traced rendering, and only then will ray-tracing cards be useful. And just like any other new technology, prices that are sky-high now will drop to sane levels. Till then, it is wiser to hold on to those TFLOPS kings, or maybe even invest in one, as their prices should fall now.

OS problems due to Intel EIST turned off

On non-Intel motherboards that support overclocking (OC), some CPU features may be disabled in the BIOS's default settings. This is not the case with Intel motherboards, which normally do not support overclocking and hence ship with their BIOS factory-defaulted to sensible values. What can follow on the OC motherboards is that Windows may not install or start properly, drivers may not be identified, and so on, leading to BSODs with seemingly random STOP errors.

One cause of such STOP errors is the EIST setting in the BIOS. Enhanced Intel SpeedStep Technology (EIST) is a feature of recent Intel processors that lets the OS scale the CPU clock frequency and power consumption as required. If this feature is turned off in the BIOS, it can lead to BSODs. A BSOD is usually associated with a STOP error code, and Microsoft's tech support site has information on each code; but with this problem one may get random error codes, which leads to confusion. Say the memory happens to be accessed at the instant the problem occurs: Windows will give a STOP error code corresponding to a RAM failure. On the next attempt it may be the optical drive (during identification of its drivers) or even the mouse! So the actual problem cannot be correctly deduced from the error code alone. On Linux this may not cause an issue, but one may still get a kernel panic while booting the OS.

The main culprit can be the EIST setting being disabled by default. So if you have a non-Intel motherboard and are struggling to get past STOP errors during Windows installation or startup, check in the BIOS settings whether EIST is turned off:

1. Power on the system and enter the BIOS by pressing F2 or Del (depending on the BIOS manufacturer).
2. In the BIOS menu, look for Advanced CPU settings/ Cell Menu/ Overclocking settings, and enter it.
3. Look for Intel EIST and enable the setting.
4. Save the BIOS configuration (usually F10).

At this point the system will restart, and hopefully the problem will be resolved. Note that this may have to be done again if the BIOS is reset to defaults or the CMOS battery has died (in which case you also get a CMOS checksum error on boot, because the battery can no longer power the BIOS memory when mains power is off, so the stored values are lost).