Review and testing of the NVIDIA GeForce GTX TITAN X: a massacre of the innocents. Overview of the NVIDIA TITAN X video adapter: big Maxwell

Nvidia Geforce GTX Titan X

The most powerful single-processor accelerator

  • Part 2 - Practical acquaintance

Due to the late receipt of a test sample of the new accelerator (and the software for it), as well as the participation of our author Alexei Berillo in GTC, the parts of this review devoted to the architecture of the new Nvidia product and to the analysis of synthetic tests will be published later (in about a week). For now we present a material that acquaints readers with the features of the video card, as well as with the results of gaming tests.

Device(s)



Nvidia Geforce GTX Titan X 12288 MB 384-bit GDDR5 PCI-E

Parameter | Value | Nominal (reference) value
GPU | Geforce GTX Titan X (GM200) |
Interface | PCI Express x16 |
GPU (ROP) operating frequency, MHz | 1000-1075 | 1000-1075
Memory frequency (physical (effective)), MHz | 1750 (7000) | 1750 (7000)
Memory bus width, bit | 384 |
Number of compute units in the GPU / unit frequency, MHz | 24 / 1000-1075 | 24 / 1000-1075
Number of operations (ALU) per unit | 128 |
Total number of operations (ALU) | 3072 |
Number of texture units (BLF/TLF/ANIS) | 192 |
Number of rasterization units (ROP) | 96 |
Dimensions, mm | 270×100×35 | 270×100×35
Number of system unit slots occupied by the video card | 2 | 2
PCB color | black | black
Power consumption (peak 3D / 2D mode / "sleep" mode), W | 257/98/14 | 257/98/14
Noise level (2D mode / 2D video playback / maximum 3D mode), dBA | 20/21/29.5 |
Output connectors | 1×DVI (Dual-Link/HDMI), 1×HDMI 2.0, 3×DisplayPort 1.2 |
Multi-GPU support | SLI |
Maximum number of receivers/monitors for simultaneous image output | 4 | 4
Auxiliary power: number of 8-pin connectors | 1 | 1
Auxiliary power: number of 6-pin connectors | 1 | 1
Maximum 2D resolution: DP/HDMI/Dual-Link DVI/Single-Link DVI | |
Maximum 3D resolution: DP/HDMI/Dual-Link DVI/Single-Link DVI | 3840×2400 / 3840×2400 / 2560×1600 / 1920×1200 |

Local memory configuration

The card has 12288 MB of GDDR5 SDRAM in 24 chips of 4 Gb each (12 on each side of the PCB).
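As a quick sanity check, the capacity and peak bandwidth follow directly from the chip count, chip density, bus width and effective memory clock listed in the table above (a minimal sketch, not vendor data):

```python
# Sanity check of the Titan X memory configuration (figures from the specification table above).
chips = 24                    # GDDR5 chips on the board
chip_density_gbit = 4         # 4 Gb per chip
bus_width_bits = 384
effective_clock_mhz = 7000    # effective GDDR5 data rate

capacity_mb = chips * chip_density_gbit * 1024 // 8               # 24 * 4 Gb = 96 Gb = 12288 MB
bandwidth_gb_s = bus_width_bits / 8 * effective_clock_mhz / 1000  # bytes per transfer * GT/s

print(capacity_mb)                 # 12288
print(round(bandwidth_gb_s, 1))    # 336.0 GB/s
```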

As synthetic tests for DirectX 11, we used examples from the Microsoft and AMD SDKs, as well as an Nvidia demo program. From the DirectX SDK (February 2010) we took HDRToneMappingCS11.exe and NBodyGravityCS11.exe. We also used applications from both video chip manufacturers: DetailTessellation11 and PNTriangles11 were taken from the ATI Radeon SDK (they are also included in the DirectX SDK), and additionally Nvidia's Realistic Water Terrain demo, also known as Island11, was used.

Synthetic tests were carried out on the following video cards:

  • Geforce GTX Titan X with standard parameters (abbreviated GTX Titan X)
  • Geforce GTX Titan Z with standard parameters (abbreviated GTX Titan Z)
  • Geforce GTX 980 with standard parameters (abbreviated GTX 980)
  • Radeon R9 295X2 with standard parameters (abbreviated R9 295X2)
  • Radeon R9 290X with standard parameters (abbreviated R9 290X)

To analyze the performance of the new model of the Geforce GTX Titan X video card, these solutions were chosen for the following reasons. The Geforce GTX 980 is based on a graphics processor of the same Maxwell architecture, but of a lower level - GM204, and it will be very interesting for us to evaluate what the complication of the chip to GM200 gave. Well, the Geforce GTX Titan Z dual-chip video card was taken just for reference - as the most productive Nvidia video card based on a pair of GK110 chips of the previous Kepler architecture.

From the rival company AMD, we also chose two graphics cards for our comparison. They differ greatly in principle, although they are based on the same Hawaii GPUs: they simply carry a different number of GPUs and differ in positioning and price. The Geforce GTX Titan X has no direct price competitors, so we took the most powerful dual-chip video card, the Radeon R9 295X2, even though such a comparison is not very interesting technically. For a more technical comparison, the competitor's fastest single-chip video card, the Radeon R9 290X, was taken, even though it was released quite a while ago and is based on a GPU of clearly lower complexity. But there is simply no other choice among AMD solutions.

Direct3D 10: PS 4.0 pixel shader tests (texturing, looping)

We abandoned the outdated DirectX 9 benchmarks, as super-powerful solutions like the Geforce GTX Titan X do not show particularly revealing results in them, always being limited by memory bandwidth, fill rate or texturing. Not to mention that dual-chip video cards do not always work correctly in such applications, and we have two of them in this comparison.

The second version of RightMark3D includes two already familiar PS 3.0 tests under Direct3D 9, which were rewritten for DirectX 10, as well as two more new tests. The first pair added the ability to enable self-shadowing and shader supersampling, which additionally increases the load on video chips.

These tests measure the performance of looping pixel shaders with a large number of texture samples (up to several hundred samples per pixel in the heaviest mode) and a relatively small ALU load. In other words, they measure the speed of texture fetches and the efficiency of branching in the pixel shader.

The first pixel shader test is Fur. At the lowest settings, it uses 15 to 30 texture samples from the height map and two samples from the main texture. The "High" Effect detail mode increases the number of samples to 40-80, enabling "shader" supersampling raises it to 60-120 samples, and the "High" mode combined with SSAA is the heaviest case, with 160 to 320 samples from the height map.

Let's first check the modes without supersampling enabled, they are relatively simple, and the ratio of results in the "Low" and "High" modes should be approximately the same.

Performance in this test depends on the number and efficiency of TMUs, and the efficiency of executing complex programs also has an effect. In the version without supersampling, the effective fill rate and memory bandwidth additionally influence performance. Results at the "High" detail level are up to one and a half times lower than at "Low".

In the tasks of procedural fur rendering with a large number of texture selections, with the release of video chips based on the GCN architecture, AMD has long since seized the lead. It is Radeon boards that are still the best in these comparisons to this day, which indicates that they are more efficient in carrying out these programs. This conclusion is confirmed by today's comparison - the Nvidia video card we are considering lost even to the outdated single-chip Radeon R9 290X, not to mention the closest price competitor from AMD.

In the first Direct3D 10 test, the new Geforce GTX Titan X turned out to be slightly faster than its younger sibling based on a chip of the same architecture, the GTX 980, which trails by only 9-12%. This result can be explained by the noticeably lower texturing speed of the GTX 980, which also lags behind in other parameters, although the point is clearly not in ALU performance. The dual-chip Titan Z is faster, but not as fast as the Radeon R9 295X2.

Let's look at the result of the same test, but with "shader" supersampling turned on, which quadruples the work: in such a situation, something should change, and memory bandwidth with fillrate will have less effect:

In difficult conditions, the new Geforce GTX Titan X is already more noticeably ahead of the younger model of the same generation, the GTX 980, being faster by a decent 33-39%, which is much closer to the theoretical difference between them. And the gap to the competitors, the Radeon R9 295X2 and R9 290X, has narrowed: the new product from Nvidia has almost caught up with the single-chip Radeon. The dual-chip one, however, remains far ahead, since AMD chips handle per-pixel computations well and are very strong in such calculations.

The next DX10 test measures the performance of executing complex looping pixel shaders with a large number of texture fetches and is called Steep Parallax Mapping. At low settings, it uses 10 to 50 texture samples from the heightmap and three samples from the main textures. When you turn on heavy mode with self-shadowing, the number of samples is doubled, and supersampling quadruples this number. The most complex test mode with supersampling and self-shadowing selects from 80 to 400 texture values, that is, eight times more than the simple mode. We first check simple options without supersampling:

The second Direct3D 10 pixel shader test is more interesting from a practical point of view, since parallax mapping varieties are widely used in games, and heavy variants, like steep parallax mapping, have long been used in many projects, for example, in Crysis, Lost Planet and many other games. In addition, in our test, in addition to supersampling, you can turn on self-shadowing, which increases the load on the video chip by about two times - this mode is called "High".

The diagram is generally similar to the previous one, again without supersampling enabled, and this time the new Geforce GTX Titan X turned out to be a little closer to the GTX Titan Z, losing not much to the dual-chip board based on a pair of Kepler family GPUs. Under different conditions, the new product is 14-19% ahead of the previous top model of the current generation from Nvidia, and in the comparison with AMD video cards something has changed as well: this time the new GTX Titan X falls only very slightly behind the Radeon R9 290X. The dual-chip R9 295X2, however, is far ahead of everyone. Let's see what enabling supersampling changes:

When supersampling and self-shadowing are enabled, the task becomes more difficult, the combined inclusion of two options at once increases the load on the cards by almost eight times, causing a serious drop in performance. The difference between the speed indicators of the tested video cards has changed slightly, although the inclusion of supersampling has less effect than in the previous case.

AMD Radeon graphics solutions perform more efficiently in this D3D10 pixel shader test than competing Geforce boards, but the new GM200 chip changes the situation for the better: the Geforce GTX Titan X board based on the Maxwell architecture chip is now ahead of the Radeon R9 290X in all conditions (which is, however, based on a noticeably less complex GPU). The dual-chip solution based on a pair of Hawaii chips remains the leader, but compared to other Nvidia solutions the new product does well. It showed speed almost at the level of the dual-chip Geforce GTX Titan Z and outperformed the Geforce GTX 980 by 28-33%.

Direct3D 10: PS 4.0 Pixel Shader Benchmarks (Computing)

The next couple of pixel shader tests contain the minimum number of texture fetches to reduce the impact of TMU performance. They use a large number of arithmetic operations, and they measure precisely the mathematical performance of video chips, the speed of execution of arithmetic instructions in the pixel shader.

The first math test is Mineral. This is a complex procedural texturing test that uses only two texture data samples and 65 sin and cos instructions.

The results of purely mathematical tests usually correspond to the difference in clock frequencies and the number of compute units, but only approximately, since the results are also affected by how efficiently those units are used in specific tasks, by driver optimization, by the latest frequency and power management systems, and even by memory bandwidth bottlenecks. In the case of the Mineral test, the new Geforce GTX Titan X is only 10% faster than the GTX 980 based on the GM204 chip of the same generation, and the dual-chip GTX Titan Z was not particularly fast in this test either: something is clearly preventing the Nvidia boards from stretching their legs.

The comparison of the Geforce GTX Titan X with competing boards from AMD would not look so sad if the GPUs in the R9 290X and Titan X were close in complexity. But the GM200 is much larger than Hawaii, so its narrow win is only natural. Nvidia's architecture upgrade from Kepler to Maxwell has brought the new chips closer to competing AMD solutions in such tests. But even the cheaper dual-chip Radeon R9 295X2 is noticeably faster.

Let's consider the second shader computation test, called Fire. It is heavier on the ALUs: there is only one texture fetch, and the number of sin and cos instructions is doubled, to 130. Let's see what has changed with the increased load:

In the second mathematical test from RightMark, we see different relative results for the video cards. The new Geforce GTX Titan X is now ahead of the GTX 980, built on a chip of the same graphics architecture, by a larger margin (20%), and the dual-chip Geforce is very close to the new product: Maxwell copes with computational tasks much better than Kepler.

The Radeon R9 290X is left behind, but as we already wrote, the Hawaii GPU is noticeably simpler than the GM200, and this difference is logical. But although the dual-chip Radeon R9 295X2 continues to be the leader in math tests, in general, the new Nvidia video chip performed well in such tasks, although it did not reach the theoretical difference with the GM204.

Direct3D 10: Geometry Shader Tests

There are two geometry shader speed tests in RightMark3D 2.0. The first one, called "Galaxy", uses a technique similar to "point sprites" from previous versions of Direct3D. It animates a particle system on the GPU: a geometry shader creates four vertices from each point, forming a particle. Similar algorithms should be widely used in future DirectX 10 games.
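As a rough illustration of what such a geometry shader does, here is a simplified CPU-side sketch (not the actual RightMark code): each particle position is expanded into the four corner vertices of a quad.

```python
def expand_point_to_quad(p, size):
    """Conceptual sketch of the 'Galaxy' geometry shader stage:
    one input point p = (x, y, z) becomes four vertices of a particle quad
    (here simply axis-aligned in the XY plane for clarity)."""
    x, y, z = p
    h = size / 2.0
    return [(x - h, y - h, z), (x + h, y - h, z),
            (x - h, y + h, z), (x + h, y + h, z)]

# One point in, four vertices out: the amount of geometry grows on the GPU itself.
print(expand_point_to_quad((0.0, 0.0, 0.0), 1.0))
```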

Changing the balance in the geometry shader tests does not affect the final rendering result, the final image is always exactly the same, only the scene processing methods change. The "GS load" parameter determines in which shader the calculations are performed - in vertex or geometry. The number of calculations is always the same.

Let's consider the first version of the "Galaxy" test, with calculations in the vertex shader, for three levels of geometric complexity:

The ratio of speeds with different geometric complexity of the scenes is approximately the same for all solutions, the performance corresponds to the number of points, with each step the FPS drop is close to twofold. This task is very simple for powerful modern video cards, and performance in it is limited by the speed of geometry processing, and sometimes by memory bandwidth and/or fillrate.

The difference between the results of video cards based on Nvidia and AMD chips is usually in favor of the solutions of the Californian company, and it is due to differences in the geometric pipelines of the chips of these companies. In this case too, the top Nvidia video chips have many geometry processing units, so the gain is obvious. In geometry tests, Geforce boards are always more competitive than Radeon.

The new Geforce GTX Titan X model slightly lags behind the dual-chip GTX Titan Z board on previous generation GPUs, but it outperforms the GTX 980 by 12-25%. Radeon graphics cards show markedly different results, as the R9 295X2 is based on a pair of GPUs, and only it can compete with the novelty in this test, and the Radeon R9 290X was an outsider. Let's see how the situation changes when transferring part of the calculations to the geometry shader:

When the load in this test is shifted, the numbers change only slightly, both for AMD boards and for Nvidia solutions, and it does not really change anything. Video cards in this geometry shader test react weakly to changes in the GS load parameter, which is responsible for transferring part of the calculations to the geometry shader, so the conclusions remain the same.

Unfortunately, "Hyperlight" is the second test of geometry shaders, which demonstrates the use of several techniques at once: instancing, stream output, buffer load, which uses dynamic geometry creation by drawing to two buffers, as well as a new Direct3D 10 feature - stream output, on All modern AMD graphics cards just don't work. At some point, another Catalyst driver update caused this test to stop running on Catalyst boards, and this has not been fixed for several years now.

Direct3D 10: texture fetch rate from vertex shaders

The "Vertex Texture Fetch" tests measure the speed of a large number of texture fetches from a vertex shader. The tests are similar in essence, so the ratio between the results of the cards in the "Earth" and "Waves" tests should be approximately the same. Both tests use displacement mapping based on texture sampling data, the only significant difference is that the "Waves" test uses conditional jumps, while the "Earth" test does not.

Consider the first test "Earth", first in "Effect detail Low" mode:

Our previous studies have shown that both fill rate and memory bandwidth can affect the results of this test, which is clearly visible in the results of the Nvidia boards, especially in simple modes. The new video card from Nvidia shows a speed that is clearly lower than it should be here: all Geforce boards ended up at approximately the same level, which clearly does not correspond to theory. In all modes they clearly run into some bottleneck, apparently memory bandwidth. However, the Radeon R9 295X2 is also nowhere near twice as fast as the R9 290X.

By the way, this time AMD's single-chip board turned out to be stronger than all Nvidia's boards in light mode and approximately at their level in hard mode. Well, the dual-chip Radeon R9 295X2 again became the leader of our comparison. Let's look at the performance in the same test with an increased number of texture fetches:

The situation in the diagram has changed slightly: in heavy modes AMD's single-chip solution falls noticeably further behind the Geforce boards. The new Geforce GTX Titan X showed speeds up to 14% higher than the Geforce GTX 980 and outperformed the single-chip Radeon in all modes except the lightest, because of the same bottleneck. Compared with AMD's dual-chip solution, the Titan X was able to fight in the heavy mode, showing similar performance, but it lags behind in the light modes.

Let's consider the results of the second test of texture fetches from vertex shaders. The Waves test has fewer samples, but it uses conditional jumps. The number of bilinear texture samples in this case is up to 14 ("Effect detail Low") or up to 24 ("Effect detail High") per vertex. The complexity of the geometry changes similarly to the previous test.

The results in the second "Waves" vertex texturing test are nothing like what we saw in the previous diagrams. The speed of all Geforces in this test has seriously deteriorated, and the new Nvidia Geforce GTX Titan X is only slightly faster than the GTX 980, lagging behind the dual-chip Titan Z. Compared to the competitors, both Radeon boards showed the best performance in this test in all modes. Let's consider the second version of the same task:

As the task gets more complex in the second texture sampling test, the speed of all solutions drops, but Nvidia video cards suffer more, including the model under consideration. Almost nothing changes in the conclusions: the new Geforce GTX Titan X is 10-30% faster than the GTX 980, while lagging behind both the dual-chip Titan Z and both Radeon boards. The Radeon R9 295X2 was far ahead in these tests, and from a theoretical point of view this is simply inexplicable, other than by insufficient optimization on Nvidia's part.

3DMark Vantage: Feature tests

Synthetic tests from the 3DMark Vantage package will show us what we previously missed. Feature tests from this test package have DirectX 10 support, are still relevant and interesting because they differ from ours. When analyzing the results of the latest video card Geforce GTX Titan X in this package, we will draw some new and useful conclusions that have eluded us in tests from RightMark family packages.

Feature Test 1: Texture Fill

The first test measures the performance of texture fetch units. It fills a rectangle with values read from a small texture using multiple texture coordinates that change every frame.

The efficiency of AMD and Nvidia video cards in Futuremark's texture test is quite high, and the final figures of the different models are close to the corresponding theoretical parameters. So, the difference in speed between the GTX Titan X and the GTX 980 turned out to be 38% in favor of the GM200-based solution, which is close to theory, since the new product has one and a half times more TMUs, but they operate at a lower frequency. Naturally, the lag behind the dual-chip GTX Titan Z remains, as its two GPUs together have a higher texturing speed.
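The underlying theory is simple arithmetic: a sketch using the TMU counts from this review and the clock frequencies quoted elsewhere in this material (the measured 38% falls between the base-clock and boost-clock estimates):

```python
# Theoretical texel fill rate = number of TMUs * GPU clock, in Gtexels/s.
def texel_rate(tmus, clock_mhz):
    return tmus * clock_mhz / 1000.0

titan_x_base, titan_x_boost = texel_rate(192, 1000), texel_rate(192, 1177)
gtx_980_base, gtx_980_boost = texel_rate(128, 1127), texel_rate(128, 1253)

print(round(titan_x_base / gtx_980_base - 1, 2))    # ~0.33 -> +33% at base clocks
print(round(titan_x_boost / gtx_980_boost - 1, 2))  # ~0.41 -> +41% at boost clocks
```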

As for comparing the texturing speed of the new top Nvidia video card with similarly priced solutions from the competitor, here the new product is inferior to the dual-chip rival, its relative neighbor in the price niche, but it is ahead of the Radeon R9 290X, although not by much. AMD graphics cards still handle texturing a little more efficiently.

Feature Test 2: Color Fill

The second task is the fill rate test. It uses a very simple pixel shader that does not limit performance. The interpolated color value is written to an offscreen buffer (render target) using alpha blending. It uses a 16-bit FP16 off-screen buffer, the most commonly used in games that use HDR rendering, so this test is quite timely.

The numbers of the second 3DMark Vantage subtest show the performance of ROP units, without taking into account the amount of video memory bandwidth (the so-called "effective fill rate"), and the test measures exactly the performance of ROP. The Geforce GTX Titan X board we are reviewing today is noticeably ahead of both Nvidia boards, the GTX 980 and even the GTX Titan Z, outperforming the single-chip board based on GM204 by as much as 45% - the number of ROPs and their efficiency in the top GPU of the Maxwell architecture is excellent!
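The same back-of-the-envelope estimate can be made for pixel fill rate (a sketch; ROP counts are from this review, and the measured 45% lands closer to the raw 50% difference in ROP count than to the base-clock estimate):

```python
# Theoretical pixel fill rate = number of ROPs * GPU clock, in Gpixels/s.
def pixel_rate(rops, clock_mhz):
    return rops * clock_mhz / 1000.0

titan_x = pixel_rate(96, 1000)    # 96 ROPs at a 1000 MHz base clock
gtx_980 = pixel_rate(64, 1127)    # 64 ROPs at a 1127 MHz base clock

print(round(titan_x / gtx_980 - 1, 2))   # ~0.33, i.e. +33%; ROP count alone gives +50%
```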

And if we compare the scene fill speed of the new Geforce GTX Titan X with AMD video cards, then the Nvidia board shows the best fill speed in this test even in comparison with the most powerful dual-chip Radeon R9 295X2, not to mention the Radeon R9 290X, which lags considerably behind. The large number of ROP units and the optimizations for framebuffer data compression did their job.

Feature Test 3: Parallax Occlusion Mapping

One of the most interesting feature tests, since this technique is already used in games. It draws one quadrilateral (more precisely, two triangles) using the special Parallax Occlusion Mapping technique, which imitates complex geometry. Rather resource-intensive ray tracing operations and a high-resolution depth map are used. This surface is also shaded using the heavy Strauss algorithm. This is a test of a very complex and heavy pixel shader for a video chip, which contains numerous texture fetches during ray tracing, dynamic branching, and complex Strauss lighting calculations.
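To give an idea of why this shader is so heavy, below is a minimal CPU-side sketch of the parallax occlusion ray march (a simplified illustration, not the Futuremark shader; the `heightmap` function stands in for the depth-map texture fetch):

```python
import numpy as np

def parallax_occlusion_uv(heightmap, uv, view_ts, num_steps=32, height_scale=0.05):
    """Minimal sketch of the parallax occlusion mapping ray march.
    heightmap: function (u, v) -> surface height in [0, 1] (stands in for a texture fetch)
    uv:        starting texture coordinate
    view_ts:   normalized view vector in tangent space, z > 0
    Returns the texture coordinate where the view ray first dips below the surface."""
    max_offset = np.array(view_ts[:2]) / view_ts[2] * height_scale  # total UV shift over the layer
    step_uv = max_offset / num_steps
    layer_step = 1.0 / num_steps

    cur_uv = np.array(uv, dtype=float)
    ray_height = 1.0                          # start at the top of the height layer
    while ray_height > heightmap(*cur_uv):    # one texture fetch per step, with a branch
        cur_uv -= step_uv
        ray_height -= layer_step
        if ray_height <= 0.0:
            break
    return tuple(cur_uv)

# Example: a simple procedural "bump" height field instead of a real depth map.
bumps = lambda u, v: 0.5 + 0.5 * np.sin(20 * u) * np.sin(20 * v)
print(parallax_occlusion_uv(bumps, (0.3, 0.7), (0.4, 0.2, 0.9)))
```

Per pixel, the real shader performs this march against a high-resolution depth map and then runs the Strauss lighting model on top, which is why the test loads TMUs, ALUs and branching hardware at the same time.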

This test from the 3DMark Vantage package differs from the previous ones in that the results in it depend not only on the speed of mathematical calculations, the efficiency of branch execution, or the speed of texture fetches, but on several parameters simultaneously. To achieve high speed in this task, the correct balance of the GPU is important, as well as the efficiency of executing complex shaders.

In this case, both mathematical and texture performance are important, and in this "synthetics" from 3DMark Vantage, the new Geforce GTX Titan X board turned out to be more than a third faster than the model based on the GPU of the same Maxwell architecture. And even the dual-chip Kepler in the form of the GTX Titan Z outperformed the novelty by less than 10%.

Nvidia's single-chip top-end board clearly outperformed the single-chip Radeon R9 290X in this test, but both are seriously outperformed by the dual-chip Radeon R9 295X2. GPUs from AMD are somewhat more efficient than Nvidia chips in this task, and the R9 295X2 has two of them.

Feature Test 4: GPU Cloth

The fourth test is interesting because it calculates physical interactions (cloth simulation) on the video chip. Vertex simulation is used, combining the operation of vertex and geometry shaders over several passes. Stream out is used to transfer vertices from one simulation pass to another. Thus, the performance of vertex and geometry shader execution and the stream out speed are tested.

The rendering speed in this test also depends on several parameters at once, and the main factors should be geometry processing performance and the efficiency of geometry shaders. That is, the strengths of Nvidia chips should show up, but alas, we saw a very strange result (which we rechecked): the new Nvidia video card showed, to put it mildly, not very high speed. The Geforce GTX Titan X showed the worst result of all solutions in this subtest, lagging behind even the GTX 980 by almost 20%!

Well, the comparison with Radeon boards in this test is just as unsightly for a new product. Despite the theoretically smaller number of geometric execution units and the geometric performance lag of AMD chips compared to competing solutions, both Radeon boards work very efficiently in this test and outperform all three Geforce boards presented in comparison. Again, it looks like a lack of optimization in Nvidia drivers for a specific task.

Feature Test 5: GPU Particles

A test for physical simulation of effects based on particle systems calculated using a video chip. Vertex simulation is also used, each vertex represents a single particle. Stream out is used for the same purpose as in the previous test. Several hundred thousand particles are calculated, all are animated separately, their collisions with the height map are also calculated.

Similar to one of our RightMark3D 2.0 tests, the particles are drawn using a geometry shader that creates four vertices from each point to form the particle. But the test loads shader blocks with vertex calculations most of all, stream out is also tested.

In the second "geometric" test from 3DMark Vantage, the situation has seriously changed, this time all Geforces already show a more or less normal result, although the dual-chip Radeon still remains in the lead. The new GTX Titan X model is 24% faster than its sister GTX 980 and about the same time behind the dual-GPU Titan Z on the previous generation GPU.

The comparison of Nvidia's novelty with competing video cards from AMD this time is more positive - it showed the result between two boards from the rival company, and turned out to be closer to the Radeon R9 295X2, which has two GPUs. The novelty is far ahead of the Radeon R9 290X and this clearly shows us how different two seemingly similar tests can be: cloth simulation and particle system simulation.

Feature Test 6: Perlin Noise

The last feature test of the Vantage package is a mathematically intensive test of the video chip, it calculates several octaves of the Perlin noise algorithm in the pixel shader. Each color channel uses its own noise function to increase the load on the video chip. Perlin noise is a standard algorithm often used in procedural texturing, it uses a lot of mathematical calculations.
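To illustrate the structure of such a shader, here is a minimal sketch of the octave summation (the `noise2d` function is a cheap hash-based stand-in, not a true Perlin gradient noise, and the whole snippet is an illustration rather than the Futuremark code):

```python
import math

def noise2d(x, y):
    # Cheap hash-based value noise as a stand-in for real Perlin gradient noise.
    n = math.sin(x * 12.9898 + y * 78.233) * 43758.5453
    return n - math.floor(n)   # pseudo-random value in [0, 1)

def fbm(x, y, octaves=5, lacunarity=2.0, gain=0.5):
    """Sum several octaves of noise, as the Perlin Noise feature test does per color
    channel: each octave doubles the frequency and halves the amplitude."""
    total, amplitude, frequency = 0.0, 1.0, 1.0
    for _ in range(octaves):
        total += amplitude * noise2d(x * frequency, y * frequency)
        amplitude *= gain
        frequency *= lacunarity
    return total

# One such evaluation per pixel and per color channel is almost pure ALU work.
print(round(fbm(0.25, 0.75), 4))
```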

In this case, the performance of solutions does not quite match the theory, although it is close to what we saw in similar tests. In the mathematical test from the Futuremark package, which shows the peak performance of video chips in limit tasks, we see a different distribution of results compared to similar tests from our test package.

We have known for a long time that AMD video chips with GCN architecture still cope with such tasks better than competitor solutions, especially in cases where intensive "mathematics" is performed. But Nvidia's new top model is based on the large GM200 chip, and so the Geforce GTX Titan X performed noticeably better than the Radeon R9 290X in this test.

If we compare the new product with the best model of the Geforce GTX 900 family, then in this test the difference between them was almost 40% - in favor of the video card we are considering today, of course. This is also close to the theoretical difference. Not a bad result for the Titan X, only the dual-chip Radeon R9 295X2 was ahead, and far ahead.

Direct3D 11: Compute Shaders

To test Nvidia's recently released top-of-the-line solution for tasks that use DirectX 11 features such as tessellation and compute shaders, we used SDK examples and demos from Microsoft, Nvidia, and AMD.

First, we'll look at benchmarks that use compute shaders. Their appearance is one of the most important innovations in the latest versions of the DX API, and they are already used in modern games to perform various tasks: post-processing, simulations and so on. The first test shows an example of HDR rendering with tone mapping from the DirectX SDK, with post-processing that uses pixel and compute shaders.

The calculation speed in the compute and pixel shaders is approximately the same for all AMD and Nvidia boards; differences were observed only for video cards based on GPUs of previous architectures. Judging by our previous tests, the results in this task often depend not so much on mathematical power and computational efficiency as on other factors, such as memory bandwidth.

In this case, the new top-end graphics card is faster than the single-chip Geforce GTX 980 and Radeon R9 290X, but behind the dual-chip R9 295X2, which is understandable, because it has the power of a pair of R9 290X. If we compare the new product with the Geforce GTX 980, the board of the Californian company considered today is 34-36% faster: exactly according to theory.

The second compute shader test is also taken from the Microsoft DirectX SDK and shows an N-body gravity computation problem: a simulation of a dynamic particle system subject to physical forces such as gravity.
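A minimal CPU sketch of the underlying computation (with the gravitational constant set to 1 and a softening term; this illustrates the O(N²) work pattern, not the SDK's compute shader itself):

```python
import numpy as np

def nbody_step(pos, vel, mass, dt=0.01, softening=1e-3):
    """One naive O(N^2) integration step of a gravitational N-body system,
    the same kind of work NBodyGravityCS11 offloads to a compute shader."""
    diff = pos[None, :, :] - pos[:, None, :]          # pairwise displacements, shape (N, N, 3)
    dist2 = (diff ** 2).sum(-1) + softening ** 2      # softened squared distances
    inv_d3 = dist2 ** -1.5
    np.fill_diagonal(inv_d3, 0.0)                     # no self-interaction
    acc = (diff * (mass[None, :, None] * inv_d3[:, :, None])).sum(axis=1)
    vel = vel + acc * dt
    pos = pos + vel * dt
    return pos, vel

# Tiny example: 4 random bodies (a compute shader would do this for many thousands).
rng = np.random.default_rng(0)
p, v, m = rng.random((4, 3)), np.zeros((4, 3)), np.ones(4)
p, v = nbody_step(p, v, m)
print(p)
```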

In this test, most often there is an emphasis on the speed of execution of complex mathematical calculations, geometry processing and the efficiency of code execution with branching. And in this DX11 test, the alignment of forces between the solutions of two different companies turned out to be completely different - clearly in favor of Geforce video cards.

However, the results of a pair of Nvidia solutions based on different chips are also strange - Geforce GTX Titan X and GTX 980 are almost equal, they are separated by only 5% difference in performance. Dual-chip rendering does not work in this task, so the rivals (single-chip and dual-chip Radeon models) are about equal in speed. Well, the GTX Titan X is three times ahead of them. It seems that this task is calculated much more efficiently on GPUs of the Maxwell architecture, which we noted earlier.

Direct3D 11: Tessellation performance

Compute shaders are very important, but another major new feature in Direct3D 11 is hardware tessellation. We considered it in great detail in our theoretical article about Nvidia GF100. Tessellation has been used in DX11 games for a long time, such as STALKER: Call of Pripyat, DiRT 2, Aliens vs Predator, Metro Last Light, Civilization V, Crysis 3, Battlefield 3 and others. Some of them use tessellation for character models, others to simulate a realistic water surface or landscape.

There are several different schemes for partitioning graphic primitives (tessellation), for example Phong tessellation, PN Triangles and Catmull-Clark subdivision. The PN Triangles partitioning scheme is used in STALKER: Call of Pripyat, and Phong tessellation in Metro 2033. These methods are relatively quick and easy to implement into the game development process and existing engines, which is why they have become popular.

The first tessellation test will be the Detail Tessellation example from the ATI Radeon SDK. It implements not only tessellation, but also two different per-pixel processing techniques: simple normal mapping and parallax occlusion mapping. Well, let's compare the DX11 solutions from AMD and Nvidia in different conditions:

In the simple bump mapping test, the speed of the boards is not very indicative, since this task became too easy long ago, and performance in it depends on memory bandwidth or fill rate. Today's hero of the review is 23% ahead of the previous top model Geforce GTX 980 based on the GM204 chip and slightly inferior to its competitor in the form of the Radeon R9 290X. The dual-chip version is even a little faster.

In the second subtest with more complex per-pixel calculations, the new product is already 34% faster than the Geforce GTX 980, which is closer to the theoretical difference between them. And this time the Titan X is already a little faster than its nominal single-chip competitor based on a single Hawaii. Since the two chips in the Radeon R9 295X2 scale perfectly, this task is completed even faster on it. Although the performance of mathematical calculations in pixel shaders is higher for GCN architecture chips, the release of Maxwell architecture solutions has improved Nvidia's position.

In the light tessellation subtest, the recently announced Nvidia board is again only a quarter faster than the Geforce GTX 980 - perhaps the speed is limited by memory bandwidth, since texturing in this test has almost no effect. If we compare the new product with AMD boards in this subtest, then the Nvidia board is again inferior to both Radeons, since in this tessellation test the triangle splitting is very moderate and the geometric performance does not limit the overall rendering speed.

The second tessellation performance test will be another example for 3D developers from the ATI Radeon SDK: PN Triangles. In fact, both examples are also included in the DX SDK, so we are sure that game developers create their own code based on them. We tested this example with different tessellation factors to see how much they affect overall performance.

In this test, more complex geometry is used, so a comparison of the geometric power of the various solutions leads to different conclusions. The modern solutions presented in this material cope quite well with light and medium geometric loads, showing high speed. But while the one and two Hawaii GPUs in the Radeon R9 290X and R9 295X2 perform well in light conditions, Nvidia's boards come out on top under heavy load. So, in the most difficult modes, the Geforce GTX Titan X presented today is already noticeably faster than the dual-chip Radeon.

As for the comparison of Nvidia boards based on GM200 and GM204 chips, the Geforce GTX Titan X model under consideration today increases its advantage with an increase in the geometric load, since in the light mode everything depends on the memory bandwidth. As a result, the new product is ahead of the Geforce GTX 980 board, depending on the complexity of the mode, by up to 31%.

Let's take a look at the results of another test, the Nvidia Realistic Water Terrain demo program, also known as Island. This demo uses tessellation and displacement mapping to render a realistic looking ocean surface and terrain.

The Island test is not a purely synthetic test of geometric GPU performance, as it contains both complex pixel and compute shaders, and such a load is closer to real games that use all GPU units, not just the geometric ones, as in the previous geometry tests. Although the load on the geometry processing units still remains the main one, memory bandwidth, for example, can also have an effect.

We test all video cards at four different tessellation factors; in this case, the setting is called Dynamic Tessellation LOD. At the first triangle splitting factor, the speed is not limited by the performance of the geometry units, and the Radeon video cards show a rather high result, especially the dual-chip R9 295X2, which even surpasses the result of the just-announced Geforce GTX Titan X. But already at the next geometric load levels, the performance of the Radeon cards decreases, and the Nvidia solutions take the lead.

The advantage of the new Nvidia board based on the GM200 video chip over its rivals in such tests is already quite decent, and even multiple. If we compare Geforce GTX Titan X with GTX 980, then the difference between their performance reaches 37-42%, which is perfectly explained by theory and exactly corresponds to it. Maxwell GPUs are noticeably more efficient in mixed workloads, switching quickly from graphics to computing and back again, and the Titan X is much faster than even the dual-chip Radeon R9 295X2 in this test.

After analyzing the results of the synthetic tests of the new Nvidia Geforce GTX Titan X video card based on the new top-end GM200 GPU, and considering the results of other video card models from both manufacturers of discrete video chips, we can conclude that the video card we are considering today should be the fastest single-chip card on the market, competing with the strongest dual-chip graphics card from AMD. Overall, this is a worthy successor to the Geforce GTX Titan Black, a powerful single-chip solution.

The new graphics card from Nvidia shows pretty strong results in synthetics - in many tests, though not in all. Radeon and Geforce traditionally have different strengths. In a large number of tests, the two GPUs in the Radeon R9 295X2 model were faster, including due to the higher overall memory bandwidth and texturing speed with very efficient execution of computational tasks. But in other cases, the top graphics processor of the Maxwell architecture wins back, especially in geometric tests and tessellation examples.

However, in real gaming applications, everything will be somewhat different, compared to "synthetics" and the Geforce GTX Titan X should show a speed significantly higher than the level of single-chip Geforce GTX 980, and even more so the Radeon R9 290X. And it is difficult to compare the novelty with the dual-chip Radeon R9 295X2 - systems based on two or more GPUs have their own unpleasant features, although they provide an increase in the average frame rate with proper optimization.

But the architectural features and functionality are clearly in favor of Nvidia's premium solution. The Geforce GTX Titan X consumes much less energy than the Radeon R9 295X2, and in terms of energy efficiency the new Nvidia model is very strong; this is a distinctive feature of the Maxwell architecture. We should not forget the greater functionality of Nvidia's new product: there is support for Feature Level 12.1 in DirectX 12, VXGI hardware acceleration, the new MFAA anti-aliasing method and other technologies. We already spoke about the market point of view in the first part: in the elite segment, not that much depends on the price. The main thing is that the solution should be as functional and productive as possible in gaming applications; simply put, it should be the best at everything.

In order to evaluate the speed of the new product in games, in the next part of our material we will determine the performance of the Geforce GTX Titan X in our set of gaming projects and compare it with the performance of competitors, including evaluating how justified the retail price of the new product is from the point of view of enthusiasts, and also find out how much faster it is than the Geforce GTX 980 in games.

The Asus ProArt PA249Q monitor for the work computer was provided by Asustek.
The Cougar 700K keyboard for the work computer was provided by Cougar.

The appearance of a large GPU based on the Maxwell architecture was inevitable; the only question was when and in what form. As a result, the assumption was justified that the GM200 would repeat the path of its counterpart from the Kepler family, the GK110, making its debut as part of an accelerator under the TITAN brand.

NVIDIA GeForce GTX TITAN X

There was very little time to test the new video card this time, so the review will be compressed. Discarding unnecessary arguments, let's get straight to the point. The Maxwell architecture, in comparison with Kepler, is characterized by a simplified and optimized structure of streaming multiprocessors (SMM), which made it possible to radically reduce the SMM area, while maintaining 90% of the previous performance. In addition, GM200 belongs to the second iteration of the Maxwell architecture, like the previously released GM204 (GeForce GTX 970/980) and GM206 (GeForce GTX 960) chips. As a result, it has a more productive geometry engine PolyMorph Engine version 3.0 and supports some computing functions at the hardware level, which are likely to be included in the new Direct3D 12 feature level, and are also necessary for hardware acceleration of NVIDIA's VXGI global illumination technology. For a more detailed description of the first and second generation Maxwell architecture, we refer readers to the GeForce GTX 750 Ti and GeForce GTX 980 reviews.

NVIDIA GM200 GPU Block Diagram

Qualitatively, the GM200 GPU does not differ from the lower-end GPUs in the line, except that only the GM206 has a dedicated H.265 (HEVC) video decoder. The differences are purely quantitative. GM200 includes an unprecedented number of transistors, 8 billion, so there are one and a half to two times more computing units in it than in GM204 (depending on which ones you count). In addition, the 384-bit memory bus has returned to service. Compared to the GK110 chip, the new flagship GPU does not look intimidatingly more powerful, but, for example, the number of ROPs has doubled, which makes the GM200 well prepared for 4K resolution.

In terms of support for double-precision calculations, the GM200 is no different from the GM204. Each SMM contains only four FP64-capable CUDA cores, so the combined performance under this load is 1/32 of FP32.
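The 1/32 figure follows directly from the per-multiprocessor core counts (a quick check using the unit counts quoted in this review):

```python
smm_count = 24           # streaming multiprocessors in the full GM200
fp32_per_smm = 128       # CUDA cores per SMM
fp64_per_smm = 4         # FP64-capable cores per SMM

fp32_total = smm_count * fp32_per_smm   # 3072
fp64_total = smm_count * fp64_per_smm   # 96
print(fp32_total, fp64_total, fp64_total / fp32_total)  # 3072 96 0.03125 = 1/32
```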

⇡ Specifications, price

TITAN X uses the most powerful version of the GM200 core with a full set of active computing units. The base frequency of the GPU is 1000 MHz, Boost Clock is 1076 MHz. The memory operates at the frequency standard for Maxwell-based products, 7012 MHz. But the volume is unprecedented for gaming video cards: 12 GB (and TITAN X is primarily a gaming video card, at least until the GM200 appears in the main, "numbered" GeForce line).
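For reference, peak single-precision throughput at these clocks works out as follows (a sketch assuming 2 FLOPs per CUDA core per clock, i.e. one fused multiply-add):

```python
cuda_cores = 3072
base_clock_ghz, boost_clock_ghz = 1.000, 1.076
flops_per_core_per_clock = 2   # one FMA counts as two floating-point operations

for clock in (base_clock_ghz, boost_clock_ghz):
    tflops = cuda_cores * flops_per_core_per_clock * clock / 1000.0
    print(round(tflops, 2))    # ~6.14 TFLOPS at base, ~6.61 TFLOPS at boost
```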

Suggested retail prices for TITAN X were announced in the last hours before the review was published. For the US market, the price is set at $999, the same as the first TITAN based on the GK110.

Note: the prices in the table for the GeForce GTX 780 Ti and TITAN Black are at the time the latter were discontinued.

Specification comparison table: GeForce GTX 780 Ti, GeForce GTX TITAN Black, GeForce GTX 980, GeForce GTX TITAN X (GPU: code name, transistor count, Base Clock / Boost Clock, number of CUDA cores, number of texture units; video memory: bus width, chip type, real (effective) clock, volume; TDP, W; RRP for the US market, $).

⇡ Construction

Since the very first "Titan", NVIDIA has been using the same cooling system in top-end video cards, with some variations. TITAN X stands out among its predecessors only with an absolutely black body (only two inserts on the sides remained unpainted).

NVIDIA GeForce GTX TITAN X

The back plate that the GeForce GTX 980 was experimentally equipped with is again absent on the TITAN X, even though some of the memory chips are soldered on the back of the board. GDDR5 chips, however, do not generally require additional cooling.

NVIDIA GeForce GTX TITAN X rear view

But the heatsink with the vapor chamber returned, which in the GTX 980 was replaced by a simpler option.

NVIDIA GeForce GTX TITAN X, cooling system

NVIDIA GeForce GTX TITAN X, cooling system

NVIDIA GeForce GTX TITAN X, cooling system

The video card has three DisplayPort connectors and one each - HDMI and Dual-Link DVI-I.

⇡ Board

The design of the printed circuit board, which is not surprising, evokes associations with a series of video adapters based on the GK110 chip. The voltage converter is built according to the 6 + 2 scheme (the number of phases for powering the GPU and memory chips, respectively). Power is supplied through one 8-pin and one 6-pin connector. But we see the ON Semiconductor NCP81174 GPU power controller here for the first time.

24 memory chips SK hynix H5GQ4H24MFR-R2C with a nominal frequency of 7 GHz are located on both sides of the board.

NVIDIA GeForce GTX TITAN X, printed circuit board, front side

NVIDIA GeForce GTX TITAN X, printed circuit board, rear side

Test bench, testing methodology

Power-saving CPU technologies are disabled in all tests. In the NVIDIA driver settings, the CPU is selected as the processor for PhysX calculations. In the AMD drivers, the Tessellation setting is changed from AMD Optimized to Use application settings.

Benchmarks: synthetic

Program | Settings | Resolution
3DMark 2011, Extreme test | - | -
3DMark, Fire Strike test (not Extreme) | - | -
Unigine Heaven 4 | DirectX 11, max. quality, tessellation in Extreme mode, AF 16x, MSAA 4x | 1920×1080 / 2560×1440

Benchmarks: games

Program | Settings | Anisotropic filtering, full-screen anti-aliasing | Resolution
Far Cry 3 + FRAPS | DirectX 11, max. quality, HDAO. Beginning of the Secure the Outpost mission | AF, MSAA 4x | 2560×1440 / 3840×2160
Tomb Raider, built-in benchmark | Max. quality | AF 16x, SSAA 4x | 2560×1440 / 3840×2160
BioShock Infinite, built-in benchmark | Max. quality. Postprocessing: Normal | AF 16x, FXAA | 2560×1440 / 3840×2160
Crysis 3 + FRAPS | Max. quality. Beginning of the Post Human mission | AF 16x, MSAA 4x | 2560×1440 / 3840×2160
Metro: Last Light, built-in benchmark | Max. quality | AF 16x, SSAA 4x | 2560×1440 / 3840×2160
Company of Heroes 2, built-in benchmark | Max. quality | AF, SSAA 4x | 2560×1440 / 3840×2160
Battlefield 4 + FRAPS | Max. quality. The start of the Tashgar mission | AF 16x, MSAA 4x + FXAA | 2560×1440 / 3840×2160
Thief, built-in benchmark | Max. quality | AF 16x, SSAA 4x + FXAA | 2560×1440 / 3840×2160
Alien: Isolation | Max. quality | AF 16x, SMAA T2X | 2560×1440 / 3840×2160

Test participants

The following video cards took part in performance testing:

  • NVIDIA GeForce GTX TITAN X (1000/7012 MHz, 12 GB);

⇡ Clock speeds, power consumption, temperature, overclocking

The GM200 runs at a base frequency that the GK110 never reached in its reference specifications. In addition, GPU Boost acts very aggressively, raising the frequency up to 1177 MHz. At the same time, the processor is content with a voltage of 1.174 V, which is lower than that of top-end products based on the GK110.

The BIOS settings allow you to increase the power limit to 110% and add 83 mV to the maximum GPU voltage. In fact, the voltage rises only to 1.23 V, but at the same time several additional frequency steps/VIDs are unlocked: the difference between the base frequency and the maximum frequency recorded in dynamics increases to 203 MHz.

Overclocking the video card made it possible to reach the base frequency of 1252 MHz, and frequencies up to 1455 MHz were observed in dynamics. The video memory was able to add 1.2 GHz, successfully operating at an effective frequency of 8212 MHz.

Model | Base Clock, MHz | Max. Boost Clock, MHz | Base Clock, MHz (overclocked) | Max. recorded Boost Clock, MHz (overclocked)
GeForce GTX TITAN X 1000 1177 (+177) 1252 1455 (+203)
GeForce GTX 980 1127 1253 (+126) 1387 1526 (+139)
GeForce GTX TITAN Black 889 1032 (+143) 1100 1262 (+162)
GeForce GTX TITAN 836 1006 (+145) 966 1150 (+184)
GeForce GTX 780 Ti 876 1020 (+144) 986 1130 (+144)
GeForce GTX 780 863 1006 (+143) 1053 1215 (+162)
GeForce GTX 770 1046 1176 (+130) 1190 1333 (+143)

In terms of power consumption, the TITAN X is close to the GTX 780 Ti and far exceeds the GTX 980. Contrary to expectations, in Crysis 3 there is no significant difference between the TITAN X and the Radeon R9 290X, but in FurMark the R9 290X (like the R9 280X) heats up more and noticeably exceeds the TITAN X in power draw.

Overclocking the TITAN X increases power consumption by 5-25 W, depending on which test's results you rely on: FurMark or Crysis 3.

The maximum temperature allowed for the GPU is determined by the BIOS settings, so the TITAN X does not go beyond the set 83 °C. At the same time, the cooling system's turbine spins up to 49% of its maximum speed, 2339 rpm. At first glance this is quite a lot, but in fact the noise from the cooler is quite acceptable.

⇡ Performance: synthetic benchmarks

  • TITAN X impresses from the very first test. Compared to the GTX 780 Ti and Radeon R9 290X, the graphics card is one and a half times faster.
  • With the Radeon R9 280X and GeForce GTX 770 - adapters based on the once top GPUs - the difference is more than twofold.

  • All of the above is also true for 3DMark 2013.

Unigine Heaven 4

  • TITAN X maintains a lead of about 50% over the GTX 780 Ti and Radeon R9 290X at WQHD resolution. By the way, unlike 3DMark, the GTX 980 is no better than the GTX 780 Ti in this test.
  • At Ultra HD resolution, the previous graphics adapters have narrowed the gap, and yet TITAN X is head and shoulders above all rivals.

⇡ Performance: games

This time we will deviate from the standard form of describing the game tests. In the case of the TITAN X, it is completely pointless to detail for each game which video card is faster. In all games, the new "Titan" is ahead of its rivals by a colossal margin. The quantitative indicators tend toward the formula: TITAN X is 30-50% faster than the GeForce GTX 780 Ti and Radeon R9 290X, and often twice as fast as the Radeon R9 280X and GeForce GTX 770. The only intrigue is looking for fluctuations within this corridor in one direction or the other. In addition, there is a unique case: TITAN X enjoys a frame rate of 24 FPS in Far Cry 4 at Ultra HD resolution with MSAA 4x, while its rivals cannot get out of the 5-7 FPS hole (and the GeForce GTX 770 even less). Here, apparently, the Titan's 12 GB of memory came into play: even the 4 GB that the Radeon R9 290X is equipped with is not enough for such settings in FC4.

Tomb Raider

Bioshock Infinite

Crysis 3

⇡ Performance: Computing

Video decoding (DXVA Checker, Decode Benchmark)

  • The dedicated H.264 decoder in the GM200 is the same as in other chips in the Maxwell family. Its performance is more than enough to play videos with resolutions up to Ultra HD and frame rates of 60 Hz and higher.
  • Among discrete AMD video adapters, only the Radeon R9 285 can boast of this. The GeForce GTX 780 Ti is capable of delivering up to 35 FPS at a resolution of 3840 × 2160.
  • CPUs with 6-8 x86 cores are better suited for fast decoding for video conversion, but the fixed-function block does this work with less power consumption and, after all, it simply comes bundled with the powerful GPU.

  • The only GPU with full hardware H.265 decoding is the GM206 in the GeForce GTX 960. Other representatives of the Maxwell architecture, as well as Kepler, perform some of the operations on the H.264 decoder pipeline. The rest falls on the central processor.
  • The performance of all these adapters with a good CPU is enough to play video at any reasonable resolution and frame rate. For speed work, a GTX 960 or a powerful CPU is better suited.

Luxmark: Room (Complex Benchmark)

  • The Maxwell architecture shows a surprising performance boost over Kepler in this task, making the TITAN X double the modest result of the GeForce GTX 780 Ti and outperform the Radeon R9 290X. However, this does not mean that LuxMark results are representative of any ray tracing task.
  • The difference between TITAN X and GeForce GTX 980 is not as huge as in gaming tests.

Sony Vegas Pro 13

  • AMD video adapters continue to lead the way in video rendering. And TITAN X does not stand out in the group of the most productive NVIDIA devices.

CompuBench CL: Ocean Surface Simulation

  • TITAN X takes the palm from the Radeon R9 290X and makes up for the failure of the GeForce GTX 980, which finds this test surprisingly difficult.

CompuBench CL: Particle Simulation

  • Here, in contrast, the GTX 980 took a big step forward from the GTX 780 Ti, and the TITAN X built on the success. The Radeon R9 290X is no match for NVIDIA's flagship.

SiSoftware Sandra 2015: Scientific Analysis

  • In terms of double precision (FP64), AMD accelerators are still unmatched, and even the Radeon R9 280X based on a far from new GPU can outperform TITAN X.
  • Among the "green" TITAN X predictably leads in performance in FP64, especially compared to the frankly weak GTX 980.
  • In FP32 computing, TITAN X stands out sharply from all NVIDIA graphics cards. Only it provides a level of performance comparable to that of the Radeon R9 290X.

⇡ Conclusions

Considering that discrete GPU production is still stuck at the 28 nm process, the GeForce GTX TITAN X results look fantastic. At the same TDP as the GK110-based graphics adapters, TITAN X achieves 130-150% of the performance of accelerators such as the GTX 780 Ti and Radeon R9 290X. If we take the first 28 nm GPUs, the GK104 (GTX 680, GTX 770) and Radeon R9 280X, then TITAN X often outperforms them by a factor of two.

TITAN X, like its predecessors in this position, is extremely expensive for a single-GPU graphics card. The positioning has not changed from previous Titans. Firstly, it is an alternative to SLI configurations of two discrete GeForce GTX 980s: although the potential performance of the tandem is higher, a single GPU delivers more predictable performance. Secondly, compact PCs that have no room for two video cards. And finally, non-graphics computing (GP-GPU). Although FP64 performance in the GM200 is limited to 1/32 of FP32, TITAN X partly compensates for this limitation with sheer GPU brute force. In addition, FP32 computing dominates in "prosumer" loads (ray tracing, accelerated video rendering), and in this discipline the GM200 is at least as good as the best AMD products, and often outperforms them, just as in the gaming tests.

We present the basic detailed material devoted to the study of the Nvidia Geforce GTX Titan X.

Object of study: 3D graphics accelerator (video card) Nvidia Geforce GTX Titan X 12288 MB 384-bit GDDR5 PCI-E

Developer Details: Nvidia Corporation (Nvidia trademark) was founded in 1993 in the United States. Headquarters are in Santa Clara, California. The company develops graphics processors and technologies. Until 1999, the main brand was Riva (Riva 128/TNT/TNT2); from 1999 to the present it has been Geforce. In 2000, the assets of 3dfx Interactive were acquired, after which the 3dfx/Voodoo trademarks were transferred to Nvidia. The company has no manufacturing of its own. The total number of employees (including regional offices) is about 5,000 people.

Part 1: Theory and architecture

As you already know, back in the middle of last month, Nvidia released a new top-end video card called Geforce GTX Titan X, which has become the most powerful on the market. We immediately released a detailed review of this new product, but it contained only practical studies, without a theoretical part and synthetic tests. This happened due to various circumstances, including those beyond our control. But today we are correcting this defect and will take a closer look at the March novelty - nothing has happened in a month to make it lose its relevance.

Back in 2013, Nvidia released the first solution of a new brand of video cards Geforce GTX Titan, named after a supercomputer at the Oak Ridge National Laboratory. The first model of the new line-up set new records for both performance and price, with an MSRP of $999 for the US market. It was the first high-end Titan series graphics card, which then continued with the not-so-popular dual-chip Titan Z and the accelerated Titan Black, which received a fully unlocked GK110 revision B GPU.

And now, in the spring of 2015, it is time for another Nvidia novelty from the "titanium" premium series. The GTX Titan X was first revealed by company president Jensen Huang at the GDC 2015 game developer conference, at the Epic Unreal Engine event. In fact, the video card was already quietly present at the show, installed in many demo stands, but Jensen presented it officially.

Before the release of the Geforce GTX Titan X, the fastest single-chip video card was the Geforce GTX 980, based on the GM204 chip of the same Maxwell graphics architecture, introduced last September. This model is very energy efficient, delivering decent processing power while consuming only 165W of power - that is, it is twice as energy efficient as the previous generation Geforce.

At the same time, Maxwell GPUs support the upcoming DirectX 12 (including Feature Level 12.1) and other recent graphics technologies from the company: Nvidia Voxel Global Illumination (VXGI, which we wrote about in the GTX 980 article), the new Multi-Frame Sampled Anti-Aliasing (MFAA) method, Dynamic Super Resolution (DSR) and more. The combination of performance, energy efficiency and features made the GM204 chip the best advanced graphics processor at the time of its release.

But everything changes eventually, and the GPU with 2048 cores and 128 texture units has been succeeded by a new GPU based on the same second-generation Maxwell architecture (we remember the first generation from the GM107 chip on which the Geforce GTX 750 Ti video card is based) and with the same capabilities, but with 3072 CUDA cores and 192 texture units - all packed into 8 billion transistors. Naturally, the Geforce GTX Titan X immediately became the most powerful solution.

In fact, the top-of-the-line second-generation Maxwell chip, which we now know under the code name GM200, was ready at Nvidia for some time before its announcement. It simply did not make much sense to release another top-end graphics card while the Geforce GTX 980 based on the GM204 was doing a great job of being the world's fastest single-chip graphics card. Nvidia waited for some time for a more powerful competing solution from AMD built on the same 28 nm process technology, but never saw one.

So that the product would not go stale even in the absence of real competition, it was decided to release it anyway, securing for Nvidia the title of the company that produces the most powerful GPUs. Indeed, there was no point in waiting for the opponent's move, since it had been postponed at least until June - waiting that long simply makes no business sense. And if need be, an even more powerful video card based on the same GPU, but operating at a higher frequency, can always be released later.

But why do we need such powerful solutions in the era of multi-platform games with rather average GPU power requirements? Firstly, the first gaming applications using the capabilities of DirectX 12, even if they are multi-platform, should appear very soon - after all, the PC versions of such applications almost always offer better graphics, additional effects and higher resolution textures. Secondly, DirectX 11 games have already been released that can use all the capabilities of the most powerful GPUs - like Grand Theft Auto V, which we will discuss in more detail below.

It is important that Nvidia's Maxwell graphics solutions fully support the so-called Feature Level 12.1 feature level from DirectX 12 - the highest known at the moment. Nvidia has been providing game developers with drivers for the upcoming version of DirectX for a long time, and now they are available to users who have installed the Microsoft Windows 10 Technical Preview. Not surprisingly, it was the Geforce GTX Titan X video cards that were used to demonstrate the capabilities of DirectX 12 at the Game Developers Conference, where the model was first shown.

Since the Nvidia video card model under consideration is based on the top-end second-generation GPU of the Maxwell architecture, which we have already reviewed in detail and which is similar to the previous Kepler architecture, it is useful to familiarize yourself with our earlier articles about Nvidia video cards before reading this material:

  • Nvidia Geforce GTX 970 - A good replacement for the GTX 770
  • Nvidia Geforce GTX 980 - Follower of Geforce GTX 680, outperforming even GTX 780 Ti
  • Nvidia Geforce GTX 750 Ti - Maxwell starts small... despite Maxwell
  • Nvidia Geforce GTX 680 - the new single-chip leader in 3D graphics

So, let's take a look at the detailed specifications of the Geforce GTX Titan X video card based on the GM200 GPU.

Graphic accelerator Geforce GTX Titan X
Chip code name: GM200
Production technology: 28 nm
Number of transistors: about 8 billion
Core area: approx. 600 mm²
Architecture: unified, with an array of common processors for stream processing of numerous data types: vertices, pixels, etc.
DirectX hardware support: DirectX 12, with support for Feature Level 12.1
Memory bus: 384-bit; six independent 64-bit memory controllers with GDDR5 support
GPU frequency: 1000 (1075) MHz
Computing blocks: 24 streaming multiprocessors comprising 3072 single- and double-precision floating-point scalar ALUs (FP64 at 1/32 of the FP32 rate), compliant with the IEEE 754-2008 standard
Texturing blocks: 192 texture addressing and filtering units with support for FP16 and FP32 texture components and trilinear and anisotropic filtering for all texture formats
Rasterization units (ROPs): 6 wide ROP blocks (96 pixels) with support for various anti-aliasing modes, including programmable ones and FP16/FP32 frame buffer formats; the blocks consist of an array of configurable ALUs and are responsible for depth generation and comparison, multisampling and blending
Monitor support: integrated support for up to four monitors connected via Dual-Link DVI, HDMI 2.0 and DisplayPort 1.2
Geforce GTX Titan X reference graphics card specifications
Core frequency: 1000 (1075) MHz
Number of universal processors: 3072
Number of texture units: 192
Number of blending units (ROPs): 96
Effective memory frequency: 7000 (4×1750) MHz
Memory type: GDDR5
Memory bus: 384-bit
Memory size: 12 GB
Memory bandwidth: 336.5 GB/s
Computing performance (FP32): up to 7 teraflops
Theoretical maximum fill rate: 96 gigapixels/s
Theoretical texture sampling rate: 192 gigatexels/s
Bus: PCI Express 3.0
Connectors: one Dual-Link DVI, one HDMI 2.0 and three DisplayPort 1.2
Power consumption: up to 250 W
Auxiliary power: one 8-pin and one 6-pin connector
Number of slots occupied in the system chassis: 2
Recommended price: $999 (US), 74,990 RUB (Russia)
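
For reference, the headline throughput figures in the table follow directly from the unit counts and clocks listed above. A minimal Python sketch of the arithmetic (FMA counted as two operations; the small gap to the quoted "up to 7 teraflops" comes from the GPU boosting above its rated 1075 MHz in practice):

```python
# Sanity check of the theoretical figures from the specification table.
CUDA_CORES = 3072
BASE_CLOCK_GHZ = 1.000          # rated base clock
BOOST_CLOCK_GHZ = 1.075         # rated boost clock
ROPS = 96
TMUS = 192
BUS_WIDTH_BITS = 384
EFFECTIVE_MEM_CLOCK_GHZ = 7.0   # GDDR5 effective data rate

fp32_tflops = CUDA_CORES * 2 * BOOST_CLOCK_GHZ / 1000         # 2 FLOPs per FMA
pixel_fill_gpix = ROPS * BASE_CLOCK_GHZ                       # gigapixels/s
texel_rate_gtex = TMUS * BASE_CLOCK_GHZ                       # gigatexels/s
bandwidth_gbs = BUS_WIDTH_BITS / 8 * EFFECTIVE_MEM_CLOCK_GHZ  # GB/s

print(f"FP32: ~{fp32_tflops:.1f} TFLOPS, fill rate: {pixel_fill_gpix:.0f} Gpix/s, "
      f"texturing: {texel_rate_gtex:.0f} Gtex/s, bandwidth: ~{bandwidth_gbs:.0f} GB/s")
```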

The new Geforce GTX Titan X model received a name that continues Nvidia's line of premium solutions with their particular positioning - the letter X was simply appended to it. The new model replaced the Geforce GTX Titan Black and now sits at the very top of the company's current product line. Above it remains only the dual-chip Geforce GTX Titan Z (which can hardly be counted any more), and below it are the single-chip GTX 980 and GTX 970 models. The new card is the highest-performing single-chip solution on the market.

The Nvidia model in question is based on the GM200 chip, which has a 384-bit memory bus, and its memory runs at 7 GHz, giving a peak bandwidth of 336.5 GB/s - one and a half times more than in the GTX 980. This is quite an impressive figure, especially if we recall the new on-chip data compression methods used in second-generation Maxwell, which help use the available memory bandwidth much more efficiently than the competitor's GPUs do.
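
A quick way to see where the "one and a half times" figure comes from (a minimal sketch; the GTX 980 values of a 256-bit bus and the same 7 GHz effective memory clock are its published reference specs):

```python
# Peak memory bandwidth of Titan X versus GTX 980.
titan_x_gbs = 384 / 8 * 7.0   # 336 GB/s on the 384-bit bus
gtx_980_gbs = 256 / 8 * 7.0   # 224 GB/s on the 256-bit bus
print(f"ratio: {titan_x_gbs / gtx_980_gbs:.2f}x")   # 1.50x
```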

With such a memory bus, the amount of video memory installed on the card could be either 6 or 12 GB, but for the elite model the decision was made to install 12 GB, continuing the trend set by the first GTX Titan models. This is more than enough to run any 3D application regardless of quality settings - this amount of video memory suffices for absolutely any existing game at any screen resolution and quality level, which makes the Geforce GTX Titan X especially tempting from a long-term perspective: its owner will never run out of video memory.

The official power consumption figure for the Geforce GTX Titan X is 250 W - the same as other single-chip solutions in the elite Titan series. Interestingly, 250 W is about 50% more than the GTX 980, and the number of main functional blocks has grown by roughly the same proportion. The rather high consumption does not cause any problems: the reference cooler does an excellent job of dissipating that much heat, and enthusiast systems have long been ready for this level of power consumption after the GTX Titan and GTX 780 Ti.

Architecture

The Geforce GTX Titan X video card is based on the new GM200 graphics processor, which includes all the architectural features of the GM204 chip, so everything said in our GTX 980 article fully applies to the premium new product - we advise you to read that material first, as it examines the architectural features of Maxwell in detail.

The GM200 GPU can be called an extreme version of the GM204, the largest possible within the 28 nm process. The new chip is bigger, much faster and more power-hungry. According to Nvidia, the "big Maxwell" includes 8 billion transistors covering an area of about 600 mm², making it the company's largest graphics processor. The "big Maxwell" has 50% more stream processors, 50% more ROPs and a 50% wider memory interface, which is why its area is almost one and a half times larger.

Architecturally, the GM200 video chip is fully consistent with the younger GM204 model: it likewise consists of GPC clusters, each containing several SM multiprocessors. The top graphics processor contains six GPC clusters with 24 multiprocessors in total, giving it 3072 CUDA cores, while texture operations (sampling and filtering) are performed by 192 texture units. With a base frequency of 1 GHz, the throughput of the texture units is 192 gigatexels/s, which is more than a third higher than the corresponding figure for the company's previous most powerful video card, the Geforce GTX 980.

The second-generation Maxwell multiprocessor is divided into four blocks of 32 CUDA cores each (128 cores per SMM in total), and each block has its own resources for instruction dispatch, scheduling and instruction stream buffering. Because each compute block has its own dispatcher units, the CUDA cores are used more efficiently than in Kepler, which also reduces GPU power consumption. The multiprocessor itself is unchanged compared to the GM204.
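
The shader hierarchy described above can be summed up in a few lines; this is just an illustrative sketch of the published unit counts:

```python
# GM200 shader hierarchy: 6 GPCs x 4 SMMs x (4 partitions of 32 CUDA cores).
GPCS = 6
SMMS_PER_GPC = 4
PARTITIONS_PER_SMM = 4
CORES_PER_PARTITION = 32

smm_total = GPCS * SMMS_PER_GPC                                    # 24 multiprocessors
cuda_cores = smm_total * PARTITIONS_PER_SMM * CORES_PER_PARTITION  # 3072 cores
print(smm_total, cuda_cores)
```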

To improve the efficiency of using caches in the GPU, numerous changes have been made to the memory subsystem. Each of the multiprocessors in the GM200 has a dedicated 96 KB of shared memory, and the first level and texture caches are combined into 24 KB blocks - two blocks per multiprocessor (48 KB in total per SMM). Previous generation Kepler GPUs had only 64 KB of shared memory, which also acted as a L1 cache. As a result of all the changes, the efficiency of Maxwell CUDA cores is about 1.4 times higher than in a similar Kepler chip, and the energy efficiency of the new chips is about twice as high.
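
Per chip, the on-die memory figures quoted above add up as follows (a simple sketch based on the stated per-SMM sizes):

```python
# Totals implied by the per-SMM figures: 96 KB shared memory and
# two combined 24 KB L1/texture blocks per multiprocessor.
SMMS = 24
SHARED_KB_PER_SMM = 96
L1_TEX_KB_PER_SMM = 2 * 24

total_shared_kb = SMMS * SHARED_KB_PER_SMM   # 2304 KB of shared memory on the chip
total_l1_tex_kb = SMMS * L1_TEX_KB_PER_SMM   # 1152 KB of combined L1/texture cache
print(total_shared_kb, total_l1_tex_kb)
```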

In general, everything is arranged in the GM200 graphics processor in exactly the same way as in the GM204 chip we reviewed in 2014. They did not even touch the computing cores that can perform double-precision floating-point operations at a rate of only 1/32 of the speed of single-precision calculations - just like the Geforce GTX 980. It seems that Nvidia recognized that the release of specialized solutions for the professional market (GK210) and for gaming (GM200) is quite justified.

The memory subsystem of the GM200 is strengthened compared to the GM204 - it is based on six 64-bit memory controllers, which together make up a 384-bit bus. The memory chips operate at an effective frequency of 7 GHz, which gives a peak bandwidth of 336.5 GB/s - one and a half times higher than that of the Geforce GTX 980. And do not forget about Nvidia's new data compression methods, which allow greater effective memory bandwidth to be achieved compared to previous products on the same 384-bit bus. In our review of the Geforce GTX 980 we examined this innovation of the second generation of Maxwell chips in detail; it gives them roughly a quarter more efficient use of video memory compared to Kepler.

Like all recent Geforce graphics cards, the GTX Titan X model has a base frequency - the minimum for GPU operation in 3D mode, as well as a Boost Clock turbo frequency. The base frequency for the novelty is 1000 MHz, and the Boost Clock frequency is 1075 MHz. As before, turbo frequency means only the average frequency of the GPU for a certain set of gaming applications and other 3D tasks used by Nvidia, and the actual frequency of operation can be higher - it depends on the 3D load and conditions (temperature, power consumption etc.)

It turns out that the GPU frequency of the new product is about 10% higher than that of the GTX Titan Black, but lower than that of the GTX 980, since large GPUs always have to be clocked lower (and the GM200 is noticeably larger in area than the GM204). Therefore, the overall 3D performance of the novelty should be about 33% higher than that of the GTX 980, especially when comparing boost frequencies.
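
The "about 33%" estimate follows from simple ALU-throughput arithmetic; a sketch assuming the published boost clocks (1075 MHz for the Titan X, 1216 MHz for the GTX 980):

```python
# Relative peak shader throughput: cores x boost clock.
titan_x = 3072 * 1075
gtx_980 = 2048 * 1216
print(f"{titan_x / gtx_980:.2f}x")   # ~1.33x, i.e. roughly a third faster
```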

In all other respects, the GM200 chip is exactly the same as the GM204 - the solutions are identical in their capabilities and supported technologies. Even the modules for working with displays and video data were left exactly the same as those of the GM204, on which the Geforce GTX 980 model is based. Accordingly, everything that we wrote about the GTX 980 and GTX 970 fully applies to the Titan X.

Therefore, for all other questions of the functional subtleties of the novelty, you can refer to the Geforce GTX 980 and GTX 750 Ti reviews, in which we wrote in detail about the Maxwell architecture, the device of streaming multiprocessors (Streaming Multiprocessor - SMM), the organization of the memory subsystem and some other architectural differences. You can also check out features like hardware support for accelerated VXGI global illumination calculation, new full-screen anti-aliasing methods, and improved DirectX 12 graphics API capabilities.

Solving problems with the development of new technical processes

We can confidently say that everyone on the video card market has long been tired of the 28 nm process technology - we have been living with it for the fourth year now. At first TSMC could not make a step forward at all, and then it seemed that 20 nm production could be launched, but it turned out to be of no use for large GPUs: the yield of good chips is rather low, and no advantages over the well-worn 28 nm node were found. Therefore, Nvidia and AMD had to squeeze as much as possible out of the existing possibilities, and in the case of the Maxwell chips Nvidia clearly succeeded. In terms of performance and energy efficiency, GPUs of this architecture became a clear step forward, to which AMD simply has not responded - at least not yet.

So, from GM204, Nvidia engineers were able to squeeze much more performance compared to GK104, with the same level of power consumption, although the chip increased by a third, and the higher density of transistors made it possible to increase their number even more - from 3.5 billion to 5.2 billion. It is clear that in such conditions the GM204 included much more execution units, which resulted in greater 3D performance.

But in the case of the largest chip of the Maxwell architecture, Nvidia's designers could not increase the die size too much. The GK110 already has an area of about 550 mm², and increasing it by a third, or even a quarter, was not an option - such a GPU would become too complex and expensive to manufacture. Something had to be sacrificed (relative to the older Kepler), and that something was double-precision performance: its rate in the GM200 is exactly the same as in the other Maxwell solutions, whereas the older Kepler was more versatile, suited to both graphics and any non-graphics calculations.

This decision did not come cheap in Kepler's case - too much of that chip's area was occupied by FP64 CUDA cores and other specialized computing units. In the case of the big Maxwell, it was decided to make do with graphics tasks, and it was simply built as an enlarged version of the GM204. The new GM200 chip has become purely graphics-oriented: it has no dedicated FP64 blocks, and the FP64 rate remains only 1/32 of FP32. Instead, most of the area that FP64 ALUs occupied in the GK110 was freed up, and more graphics-relevant FP32 ALUs were placed there.

This move made it possible to significantly increase graphics (and, if we count FP32 workloads, compute) performance relative to the GK110 without increasing power consumption and with only a slight increase in die area - less than 10%. Interestingly, Nvidia deliberately went for a separation of graphics and compute chips this time. Although the GM200 remains very capable in FP32 calculations, and specialized Tesla products for single-precision computing, sufficient for many scientific tasks, are quite conceivable on its basis, the Tesla K40 remains the most productive option for FP64 work.

This, by the way, is the difference from the original Titan: the first solution of the line could also be used professionally for double-precision calculations, since its FP64 rate was 1/3 of FP32. Many researchers used the GTX Titan as a starter card for their CUDA applications and tasks and then successfully moved on to Tesla solutions. The GTX Titan X is no longer suitable for this; you will have to wait for next-generation GPUs - unless, of course, they too are split from the outset into graphics and compute chips.
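
To put the FP64 trade-off in numbers, here is a rough comparison under assumed boost clocks (876 MHz for the original GK110-based Titan, 1075 MHz for the Titan X); real figures depend on the actual clocks and, for the original Titan, on the driver mode that enables full-rate FP64:

```python
# Theoretical FP32 and FP64 throughput, in TFLOPS (2 FLOPs per FMA).
titan_fp32   = 2688 * 2 * 0.876 / 1000    # ~4.7 TFLOPS
titan_fp64   = titan_fp32 / 3             # ~1.6 TFLOPS at the 1/3 rate
titan_x_fp32 = 3072 * 2 * 1.075 / 1000    # ~6.6 TFLOPS
titan_x_fp64 = titan_x_fp32 / 32          # ~0.2 TFLOPS at the 1/32 rate
print(f"{titan_fp64:.2f} vs {titan_x_fp64:.2f} TFLOPS FP64")
```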

Expansion cards already have such a division - the Tesla K80 model contains a pair of GK210 chips, which are not used in video cards and differ from the GK110 in a doubled register file and shared memory for greater performance of computing tasks. It turns out that the GK210 can be considered an exclusively "computing" processor, and the GM200 - a purely "graphics" one (with a certain degree of conventionality, because both GPUs have the same capabilities, just different specializations).

Let's see what happens in the next generations of Nvidia's graphics architectures, which are already produced on a "thinner" technical process - perhaps such a separation will not be needed in them, at least at first. Or vice versa, we will immediately see a strict division into GPU models with different specializations (computational models will have more computing capabilities, and graphics models - TMU and ROP blocks, for example), although the architecture will remain the same.

Features of the design of the video card

But back to the Geforce GTX Titan X. This is a powerful video card intended for PC gaming enthusiasts, so it should also have an appropriate appearance - an original and solid design of the board and cooler. Like previous solutions of the Titan line, the Geforce GTX Titan X is clad in an aluminum shell, which gives the card a truly premium look - it really does look solid.

The cooling system is also very impressive: the Titan X cooler uses a copper-alloy vapor chamber that cools the GM200 GPU. The vapor chamber is connected to a large dual-slot aluminum-alloy heatsink, which dissipates the heat transferred from the video chip. The fan exhausts heated air outside the PC case, which has a positive effect on the overall thermal conditions in the system. The fan stays very quiet even when overclocked and under prolonged load, and as a result the 250 W GTX Titan X is one of the quietest graphics cards in its class.

Unlike the reference board Geforce GTX 980, the new product does not contain a special removable plate that covers the rear surface of the board - this is done to ensure maximum air flow to the PCB for cooling it. The board is powered by a set of one 8-pin and one 6-pin PCI Express auxiliary power connectors.

Since the Geforce GTX Titan X is designed for enthusiasts who prefer solutions with maximum performance, all the components of the new video card were selected with this in mind and even with some reserve in terms of features and characteristics.

For example, to power the graphics processor, the Geforce GTX Titan X uses a 6-phase power supply system with headroom for additional boost, and a separate two-phase system supplies the GDDR5 memory. The 6+2-phase power system gives the card more than enough power even when overclocking. Thus, the Titan X reference board can deliver up to 275 W to the GPU, provided the power target is raised to its maximum of 110%.
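
The 275 W figure is simply the 250 W board power multiplied by the 110% power-target ceiling:

```python
# Maximum board power with the power-target slider at its 110% limit.
board_power_w = 250
power_target = 1.10
print(board_power_w * power_target)   # 275.0 W
```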

Also, to further improve the overclocking potential, the cooling of all new components was improved, compared to the original Geforce GTX Titan video card - the redesigned board and cooler led to improved overclocking capabilities. As a result, almost all Titan X samples are capable of operating at frequencies up to 1.4 GHz or more - with the same reference air cooler.

The length of the Geforce GTX Titan X reference board is 267 mm, and it has the following image outputs: one Dual-Link DVI, one HDMI 2.0 and three DisplayPort. The Geforce GTX Titan X supports display output at up to 5K resolution and is yet another graphics card with HDMI 2.0 support - something the competitor still lacks - which allows the new product to be connected to 4K TVs, providing maximum picture quality at a high 60 Hz refresh rate.

Game developer support

Nvidia has always been a company that works very closely with software makers, and especially with game developers. Take PhysX, one of the most popular game physics engines, used for more than 10 years in more than 500 games. Its widespread adoption is due, among other things, to its integration into some of the most popular game engines: Unreal Engine 3 and Unreal Engine 4. And at the Game Developers Conference 2015, Nvidia announced free access to the source code of the CPU-focused part of PhysX 3.3.3 for C++ developers, in versions for Windows, Linux, OS X and Android.

Developers will now be able to modify the engine's PhysX code however they wish, and the modifications can even then be incorporated into the core Nvidia PhysX code. By opening the source of PhysX to the public, Nvidia has given access to its physics engine to an even wider range of game application developers who can use this advanced physics engine in their games.

Nvidia continues to promote another technology of its own, the rather new VXGI dynamic global illumination simulation algorithm, which includes support for special hardware acceleration on video cards with second-generation Maxwell GPUs, such as the Geforce GTX Titan X.

The introduction of VXGI into the game will allow developers to provide a very high-quality calculation of dynamic global illumination in real time, using all the capabilities of modern GPUs and providing the highest performance. To understand the importance of calculating global illumination (rendering taking into account not only direct illumination from light sources, but also its reflection from all objects in the scene), just look at a couple of pictures - with and without GI enabled:

It is clear that this example is artificial, and in reality game designers use special methods to simulate global shading, placing additional lights or using pre-calculated lighting - but before the advent of VXGI they were either not fully dynamic (pre-calculated for static geometry) or did not have sufficient realism and/or performance. In future games, it is quite possible to use VXGI, and not only on top GPUs.

The VXGI technique has been very popular with game developers. At least many of them have tried the method in test scenes, are very excited about the results and are considering incorporating it into their games. And here is another scene with a high-quality calculation of global illumination - it also shows how important it is to take into account the rays of light reflected from all surfaces of the scene:

Until developers implement VXGI in their own engines, they can use a special VXGI-enabled build of Unreal Engine 4 published on GitHub, which is provided to all interested developers - this makes it possible to quickly integrate VXGI into their game (and not only game!) projects built on this popular engine. However, it will require some modifications: VXGI cannot simply be "switched on".

Let us consider another Nvidia technology: full-screen anti-aliasing using the MFAA method, which provides excellent performance together with acceptable anti-aliasing quality. We have already written about this method, so we will only briefly recap its essence and prospects. MFAA support is one of the key features of Maxwell GPUs compared to the previous generation. Using the ability to program the positions of anti-aliasing samples in MSAA, these sample positions change every frame in such a way that MFAA is almost full-fledged MSAA, but with less load on the GPU.
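
A conceptual sketch of that idea, assuming (purely for illustration) two samples per pixel whose positions alternate between two patterns on even and odd frames, with the resolve also drawing on the previous frame; the actual sample patterns and filter Nvidia uses may differ:

```python
# Illustrative MFAA-style temporal sample alternation (not Nvidia's actual patterns).
PATTERN_A = [(0.25, 0.25), (0.75, 0.75)]
PATTERN_B = [(0.75, 0.25), (0.25, 0.75)]

def sample_positions(frame_index):
    """Two sub-pixel sample offsets used on this frame."""
    return PATTERN_A if frame_index % 2 == 0 else PATTERN_B

def resolve(current_samples, previous_samples):
    """Average this frame's 2 samples with last frame's 2: ~4x coverage at ~2x cost."""
    samples = list(current_samples) + list(previous_samples)
    return sum(samples) / len(samples)

# Example: per-pixel luminance samples from two consecutive frames.
print(resolve([0.8, 0.6], [0.7, 0.5]))   # 0.65
```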

As a result, the picture with MFAA enabled looks almost like with MSAA, but the performance loss is much lower. For example, MFAA 4x provides speeds on par with MSAA 2x, and anti-aliasing quality is close to MSAA 4x. Therefore, in those games where the performance is not enough to achieve a high frame rate, the use of MFAA will be fully justified and can improve the quality. Here is an example of the resulting performance with MSAA and MFAA on a Titan X graphics card compared to a regular Titan (in 4K resolution):

The MFAA anti-aliasing method is compatible with all DirectX 10 and DirectX 11 games with MSAA support (with the exception of rare projects like Dead Rising 3, Dragon Age 2 and Max Payne 3). MFAA can be manually enabled in the Nvidia Control Panel. Also, MFAA is integrated into Geforce Experience, and this method will be automatically enabled for different games in case of optimization using Geforce Experience. The only problem is that at the moment MFAA is still not compatible with Nvidia SLI technology, which they promise to fix in future versions of video drivers.

Modern games on Geforce GTX Titan X

With all its power and capabilities, the Geforce GTX Titan X can cope not only with current games but also with future projects supporting the upcoming DirectX 12 - at maximum quality settings, with full-screen anti-aliasing enabled and at high rendering resolutions such as 4K.

At high resolutions with anti-aliasing enabled, a powerful memory subsystem becomes especially important, and here the Geforce GTX Titan X is fully covered: a 384-bit memory interface and chips running at an effective 7 GHz provide a bandwidth of 336.5 GB/s - not a record, but decidedly decent.

And it is also very important that all data fit into the video memory, since when MSAA is enabled at 4K resolution in many games, the amount of video memory is simply not enough - more than 4 GB of memory is needed. And Titan X has not just 6 GB, but as much as 12 GB of video memory, because this line is created for those enthusiasts who do not tolerate compromises. It is clear that with such an amount of on-board memory, the player does not need to think about whether the performance of the game in high resolution will decrease when multisampling is enabled - in all games at any settings, 12 GB will be more than enough.
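
A back-of-the-envelope estimate of why multisampling at 4K is memory-hungry (a sketch with assumed 32-bit color and 32-bit depth/stencil targets; real games add multi-gigabyte texture sets and many more buffers on top):

```python
# Approximate size of a single 4x MSAA color+depth render target at 4K.
width, height = 3840, 2160
bytes_per_sample = 4 + 4      # 32-bit color + 32-bit depth/stencil
msaa_samples = 4

size_mb = width * height * bytes_per_sample * msaa_samples / 2**20
print(f"~{size_mb:.0f} MB")   # ~253 MB for just one multisampled target
```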

At the moment, in absolutely any game, you can set any settings and choose any resolution - Titan X will provide sufficient frame rates under (almost) any conditions. Here are the games Nvidia chose to demonstrate the performance of their solution:

As you can see, a frame rate of 40 FPS or more is achieved in most of the "heaviest" modern games with full-screen anti-aliasing enabled, including projects such as Far Cry 4 - in that game, at Ultra settings with anti-aliasing in 4K resolution, acceptable rendering speed can be reached only on the Titan X or on multi-chip configurations.

And with the release of future games supporting DirectX 12, we can expect an even greater increase in GPU and video memory requirements - improved rendering quality does not come "for free". Incidentally, at that time Nvidia had not yet tested its Titan X in the very latest game, released only recently: the PC version of Grand Theft Auto V. This game series is the most popular among modern projects, casting you as various criminal characters in the scenery of the city of Los Santos, suspiciously similar to the real Los Angeles. The PC version of GTAV was highly anticipated and was finally released in mid-April, a month after the Titan X.

Even the console versions (we are talking about the current generation consoles, of course) of the Grand Theft Auto V game were quite good in terms of picture quality, and the PC version of the game offers several more opportunities to improve it: a significantly increased draw distance (objects, effects, shadows), the ability to play at 60 FPS or more, including resolutions up to 4K. In addition, they promise rich and dense traffic, a lot of dynamic objects in the scene, improved weather effects, shadows, lighting, etc.

The use of a couple of Nvidia GameWorks technologies further improved the picture quality in GTAV. Recall that GameWorks is a special platform for game and graphics developers that provides them with 3D technologies and utilities tuned for Nvidia video cards. Adding GameWorks technologies to games makes it relatively easy to achieve high-quality simulation of realistic smoke, fur and hair, waves, as well as global illumination and other effects. GameWorks makes developers' lives much easier by providing examples, libraries and SDKs ready to be used in game code.

Grand Theft Auto V uses a couple of these technologies from Nvidia: ShadowWorks Percentage-Closer Soft Shadows (PCSS) and Temporal Anti-Aliasing (TXAA), which improve the already good graphics in the game. PCSS is a special shadow rendering technique that has better quality than typical soft shadow methods. PCSS has three advantages: the softness of shadow edges depends on the distance between the object that casts the shadow and the surface on which it is drawn, it also provides better filtering that reduces the number of artifacts in the form of jagged edges of shadows, and the use of a shadow buffer allows you to correctly handle shadow intersections from different objects and prevent the appearance of "double" shadows.
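
The key relationship behind PCSS is the standard percentage-closer soft shadows estimate, in which the blur radius grows with the occluder-to-receiver distance; the sketch below illustrates that formula and is not GTAV's actual shader code:

```python
# Penumbra (blur) width estimate used by percentage-closer soft shadows.
def penumbra_width(light_size, receiver_depth, blocker_depth):
    """Filter radius in shadow-map space; larger means softer shadow edges."""
    return light_size * (receiver_depth - blocker_depth) / blocker_depth

print(penumbra_width(0.5, receiver_depth=10.0, blocker_depth=9.5))  # occluder near the ground -> sharp
print(penumbra_width(0.5, receiver_depth=10.0, blocker_depth=2.0))  # occluder far from the ground -> soft
```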

As a result, when PCSS is enabled, the game provides soft, realistic, dynamic shadows that are far better than what we've seen on game consoles. And for a game like Grand Theft Auto V with a bright sun constantly moving across the horizon, the quality of the shadows is very important, they are always in sight. From the following screenshots, you can see the difference between the two highest quality methods used in the game (AMD algorithm versus Nvidia method):

It is clearly seen that the PCSS method produces soft shadow edges that become progressively more blurred the greater the distance between the object casting the shadow and the surface receiving it. At the same time, enabling PCSS has almost no effect on final in-game performance. While this method provides better shadow quality and realism, turning the option on is virtually "free" in performance terms.

Another important addition to the PC version of GTAV is the Nvidia TXAA anti-aliasing method. Temporal Anti-Aliasing is a new anti-aliasing algorithm designed specifically to address the problems of conventional anti-aliasing methods visible in motion, where individual pixels flicker. To filter pixels on screen, this method uses samples not only inside the pixel but also outside it, together with samples from previous frames, which yields a film-like filtering quality.
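
At its core, a temporal method like this blends the current frame's jittered samples with a reprojected history of previous frames; a minimal sketch of that accumulation step (the blend factor and names are illustrative, TXAA's real filter is considerably more elaborate):

```python
# Exponential temporal accumulation: the essence of temporal anti-aliasing resolves.
def temporal_resolve(current_rgb, history_rgb, blend=0.1):
    """Lower 'blend' keeps more history (smoother, but more prone to ghosting)."""
    return tuple(blend * c + (1.0 - blend) * h
                 for c, h in zip(current_rgb, history_rgb))

print(temporal_resolve((1.0, 0.0, 0.0), (0.0, 0.0, 1.0)))   # mostly history, a bit of current
```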

The advantage of the method over MSAA is especially noticeable on such objects with translucent surfaces as grass, tree leaves and fence nets. TXAA also helps smooth out pixel-by-pixel effects. In general, the method is very high quality and approaches the quality of professional methods used in 3D graphics, but the result after TXAA is slightly more blurry compared to MSAA, which is not to the liking of all users.

The performance hit from enabling TXAA depends on the game and conditions, and correlates mainly with the speed of MSAA, which is also used in this method. But compared to pure post-processing anti-aliasing methods like FXAA, which provide maximum speed at lower quality, TXAA aims to maximize quality at some additional performance penalty. But with such richness and detail in the world, as we see in Grand Theft Auto V, the inclusion of high-quality anti-aliasing will be quite useful.

The PC version of the game has rich graphics settings that let you balance picture quality against the required performance. GTAV on PC delivers acceptable rendering speed and quality on all Nvidia solutions, starting roughly from the Geforce GTX 660. To fully enjoy all of the game's graphical effects, however, something like a Geforce GTX 970/980 or even a Titan X is recommended.

To check the settings, a performance test is built into the game - this benchmark contains five scenes close to real gameplay, which will allow you to evaluate the rendering speed in the game on a PC with different hardware configurations. But owners of Nvidia graphics cards can do it easier by optimizing the game for their own PC using Geforce Experience. This software will select and adjust the optimal settings while maintaining a playable rendering speed - and all this is done with the click of a button. Geforce Experience will find the best combination of features for both the Geforce GTX 660 with a FullHD monitor and the Titan X with a 4K TV, providing the best settings for a particular system.

Full support for GTAV appeared in the new Geforce driver build, version 350.12 WHQL, which has a special optimized profile for this application. This driver version provides optimal in-game performance, including with other Nvidia technologies: 3D Vision, 4K Surround, Dynamic Super Resolution (DSR), GameStream, G-SYNC (Surround), Multi-Frame Sampled Anti-Aliasing (MFAA), Percentage-Closer Soft Shadows (PCSS), SLI and more.

The 350.12 WHQL driver also contains updated SLI profiles for several games, including a new profile for Grand Theft Auto V. In addition to SLI profiles, the driver updates and adds 3D Vision profiles; the GTAV profile is rated "Excellent", meaning excellent stereo image quality in this game - owners of the appropriate glasses and monitors should give it a try.

Support for virtual reality technologies

The topic of virtual reality (Virtual Reality - VR) is now one of the loudest in the gaming industry. In many ways, the Oculus company, which was then acquired by Facebook, is “to blame” for the revival of interest in VR. So far, they have only shown prototypes or SDKs, but they have plans to release a commercial version of the Oculus Rift helmet later this year. Other companies are also not left out. For example, the well-known company Valve has announced plans to partner with HTC to release its own virtual reality helmet also by the end of 2015.

Naturally, GPU manufacturers also see a future in VR, and Nvidia is working closely with suppliers of software and hardware solutions for virtual reality to ensure they work as comfortably as possible with Geforce video cards (or even Tegra - who knows?). And these are not just marketing slogans: for VR to be comfortable, several problems need to be solved, including reducing the latency between the player's action (a head movement) and the display of that movement on screen - too much lag does not just spoil the virtual reality experience, it can cause so-called motion sickness.

To reduce this latency, Nvidia's VR Direct software supports a feature called asynchronous time warp. With asynchronous time warp, a scene rendered some time ago can be shifted based on head movements captured later by the headset's sensors. This reduces the latency between the action and the display of the image, since the GPU does not have to recalculate the whole frame before the shift. Nvidia already provides driver support to VR application developers, and they can apply asynchronous time warp in their software.
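
The idea can be illustrated with a toy example: instead of re-rendering, the finished frame is shifted according to how far the head has turned since the frame was rendered. The function and the flat 2D-shift approximation below are purely illustrative, not Nvidia's implementation:

```python
# Toy model of asynchronous time warp: re-project a finished frame using the
# latest head orientation instead of rendering a new one.
def timewarp_shift_px(rendered_yaw_deg, latest_yaw_deg, pixels_per_degree=20.0):
    """Horizontal shift to apply to the already-rendered frame, in pixels."""
    return (latest_yaw_deg - rendered_yaw_deg) * pixels_per_degree

# Frame was rendered for a 30.0 degree yaw, but the head is now at 30.8 degrees:
print(timewarp_shift_px(30.0, 30.8))   # shift the image by ~16 px before scan-out
```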

Besides reducing output latency, achieving comfortable gameplay in a virtual reality headset requires not just a high frame rate but also the smoothest possible delivery of frames for each eye. Accordingly, once future-generation VR headsets catch on, many players will want to try them in modern games that are very demanding on GPU power, and in some cases a two-chip SLI configuration of a pair of powerful video cards like the Geforce GTX Titan X will be needed.

To ensure maximum comfort in such cases, Nvidia offers VR SLI technology, which allows game developers to assign a specific GPU from a pair to each eye in order to reduce latency and improve performance. In this case, the picture for the left eye will be rendered by one GPU, and for the right eye - by the second GPU. This obvious solution reduces latency and is ideal for VR applications.
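
Schematically, the distribution of work under VR SLI looks like this (the scheduling interface below is invented purely for illustration; the real mechanism lives in the driver and the game engine):

```python
# Conceptual sketch of VR SLI: one GPU per eye, both halves rendered in parallel.
EYES = ("left", "right")

def render_stereo_frame(scene, gpus):
    """Submit each eye's view to its own GPU; returns a frame per eye."""
    assert len(gpus) == 2, "VR SLI pairs exactly two GPUs"
    return {eye: gpu.render(scene, view=eye) for eye, gpu in zip(EYES, gpus)}
```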

So far, VR SLI and asynchronous time warp are not available in Nvidia's public drivers, but this is not particularly necessary, since their use requires changes to the game's executable code. And pre-release Geforce video drivers with support for VR SLI and Asynchronous Time Warp are available to select Nvidia partners such as Epic, Crytek, Valve, and Oculus. Well, the public driver will be released closer to the release of the final VR products on sale.

In addition, a graphics card as powerful as the Geforce GTX Titan X was used in many virtual reality demonstrations at this year's Game Developers Conference 2015. Just a few examples: "Thief in the Shadows", a joint development of Nvidia, Epic, Oculus and WETA Digital, the visual effects studio behind The Hobbit film trilogy; "Back to Dinosaur Island", a reboot of Crytek's famous 14-year-old X-Isle: Dinosaur Island demo; and Valve's "Portal", "Job Simulator", "TheBluVR" and "Gallery". In short, all that remains is for VR headsets to go on sale, and Nvidia will be ready.

Conclusions on the theoretical part

From an architectural point of view, the new top-end GPU of the second generation of the Maxwell architecture turned out to be very interesting. Like its siblings, the GM200 takes the best of the company's past architectures, with added functionality and all the improvements of Maxwell's second generation. Therefore, functionally, the novelty looks just fine, corresponding to the models of the Geforce GTX 900 line. With the help of a serious upgrade of the execution units, Nvidia engineers achieved a doubling of the performance-to-power consumption ratio in Maxwell, while adding functionality - we recall hardware support for VXGI global illumination acceleration and graphics API DirectX 12.

The top-of-the-line Geforce GTX Titan X graphics card is designed for ultra-enthusiast gamers who want the ultimate in quality and performance from the latest PC games, running at the highest resolutions, highest quality settings, full-screen anti-aliasing, and all at an acceptable frame rate. On the one hand, few games require such a powerful GPU, and you can install a couple of less expensive video cards. On the other hand, due to the problems of multi-chip solutions with increased latency and uneven frame rates, many players will prefer one powerful GPU to a pair of less powerful ones. Not to mention that a single-chip card will also provide lower power consumption and noise from the cooling system.

Naturally, in such conditions, the main issue for Geforce GTX Titan X is the price of the solution. But the fact is that it is sold in a niche where the concepts of price justification and value for money are simply not needed - solutions with maximum performance always cost significantly more than those close to them, but still not as productive. And the Titan X is an extremely powerful and expensive graphics card for those who are willing to pay for maximum speed in 3D applications.

The Geforce GTX Titan X is positioned as a premium (luxury, elite - whatever you want to call it) video card, and there should be no complaints about the recommended price - especially since the previous solutions in the line (GTX Titan and GTX Titan Black) initially cost exactly the same - $999 . This is the solution for those who need the fastest GPU in existence, despite its price. Moreover, for the richest enthusiasts and record holders in 3D benchmarks, systems with three and even four Titan X video cards are available - these are simply the fastest video systems in the world.

And the Titan X fully lives up to these expectations: even on its own, the top-end novelty delivers the highest frame rates in all gaming applications and under almost all conditions (resolutions and settings), and its 12 GB of fast GDDR5 video memory means you will not have to worry about running out of local memory for several years to come - even future-generation games with DirectX 12 support simply will not be able to fill that memory to the point where it becomes insufficient.

As with the first GTX Titan in 2013, the GTX Titan X sets a new bar for performance and functionality in the premium graphics segment. In its time the GTX Titan became a rather successful product for Nvidia, and there is no doubt that the GTX Titan X will repeat its predecessor's success. Moreover, the model based on the largest Maxwell video chip has become the most productive on the market without any reservations. And since video cards like the GTX Titan X are manufactured by Nvidia itself, which sells reference samples to its partners, there have been no availability problems in stores from the very moment of the announcement.

The GTX Titan X lives up to its highest level in every way: the most powerful GPU from the Maxwell family, the excellent design of graphics cards in the style of previous Titan models, as well as the excellent cooling system - efficient and quiet. In terms of 3D rendering speed, this is the best video card of our time, offering more than a third more performance compared to the best models that came out before Titan X - like the Geforce GTX 980. And if you do not consider dual-chip video systems (like a couple of the same GTX 980 or one Radeon R9 295X2 from a competitor that has problems inherent in multi-chip configurations), then Titan X can be called the best solution for non-poor enthusiasts.

In the next part of our material, we will examine the rendering speed of the new Nvidia Geforce GTX Titan X video card in practice, comparing its speed with the performance of the most powerful video systems from AMD and with the performance of Nvidia's predecessors, first in our usual set of synthetic tests, and then in games.

In the summer of 2016, the public was presented with a new flagship graphics card from NVIDIA. The Nvidia Titan X gaming video card is single-chip, and its architecture is based on the manufacturer's Pascal design (the GP102 GPU). At the time of its presentation, the Geforce GTX Titan X was rightfully considered the most powerful gaming video adapter.

GPU. The GPU has 3584 CUDA cores with a base frequency of 1417 MHz; the boost clock reaches about 1531 MHz.

Memory. The flagship was presented with 12 GB of memory, though a version with half the capacity appeared later. The memory data rate reaches 10 Gbps per pin, and the 384-bit memory bus delivers a bandwidth of 480 GB/s. GDDR5X memory chips are used, so even in a 6 GB configuration performance remains high.
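
The 480 GB/s figure follows directly from the bus width and the GDDR5X data rate quoted above:

```python
# Peak memory bandwidth of the Pascal-based Titan X.
bus_width_bits = 384
data_rate_gbps = 10            # GDDR5X, per pin
print(bus_width_bits / 8 * data_rate_gbps)   # 480.0 GB/s
```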

Other characteristics of the Titan X. The number of ALUs is 3584, there are 96 ROPs and 192 texture units. The card supports resolutions up to 7680×4320 and offers a set of modern connectors - DP 1.4, HDMI 2.0b, DL-DVI - along with HDCP 2.2 support.

The video card uses the PCIe 3.0 bus. For full power delivery, the power supply must provide additional 8-pin and 6-pin connectors. The card occupies two slots on the motherboard (SLI configurations of 2, 3 and 4 cards are possible).

The height of the graphics card is 4.376″ and the length is 10.5″. It is recommended to use power supplies with a power of 600 W or more.

Video card overview

The manufacturer's main emphasis was on improving graphics for VR, as well as on full DirectX 12 support. Gaming performance can be raised slightly further by overclocking the GTX Titan X 12 GB.


The Pascal architecture is aimed squarely at VR gaming. Thanks to the high-speed FinFET process, maximum smoothness is achieved when using a headset. The Geforce Titan X Pascal model is fully compatible with VRWorks, which gives the effect of complete immersion with the ability to experience the physics and tactile sensations of the game.

Instead of copper heat pipes, a vapor chamber is used here. The maximum temperature is 94 degrees (according to the manufacturer's website), although in tests the average temperature is 83-85 degrees. As this temperature is approached, the cooler's blower spins up; if that is not enough, the graphics chip's clock frequency is reduced. The noise from the blower is quite noticeable, so if this matters to the user, water cooling is a better option - solutions for this model already exist.

Mining Performance Improvement

The company has focused on gaming performance. Compared with its predecessor, the Geforce GTX Titan X 12 GB does not improve mining results, yet its power consumption is higher. Titan-series graphics cards stand out for their FP32 and INT8 compute performance, which allows the series to be regarded as professional-class accelerators. However, the model with the GM200 chip is not one of them: many tests show degraded performance in hash calculations and other such operations. Its cryptocurrency mining performance is only 37.45 MHash/s.

We do not recommend using the Titan X for mining cryptocurrencies. Even tuning the Nvidia Titan X for performance will not deliver the same results as a Radeon Vega in the same price bracket, let alone a Tesla.

A newer card from the same manufacturer delivers around 2.5 times the performance: when overclocked, the Titan V reached 82.07 MHash/s.

Test results in games

If we compare the Titan X Pascal with other cards, it is 20-25% faster than the manufacturer's previous model and nearly twice as fast as its single-chip competitor, the Radeon R9 Fury X.

In all games at 4K/UltraHD resolution the picture remains smooth. Good results were also achieved in tests using SLI mode.

Comparison of video cards from different manufacturers

The price of a Titan X 12 Gb video card starts at $1200 and depends on the manufacturer and the amount of memory.

Below is a comparison of products from different manufacturers (* means the value is the same for all):

Product: Palit GeForce GTX TITAN X | MSI GeForce GTX TITAN X | ASUS GeForce GTX TITAN X

Primary feature list
Video card type: gaming *
Name of the GPU: NVIDIA GeForce GTX TITAN X *
Manufacturer code: NE5XTIX015KB-PG600F *
GPU codename: GM200 *
Technical process: 28 nm *
Supported monitors: four *
Maximum resolution (GM200): 5120×3200 *

List of specifications
GPU frequency: 1000 MHz *
Memory: 12288 MB *
Memory type: GDDR5 *
Memory frequency: 7000 MHz | 7010 MHz | 7010 MHz
Memory bus width: 384 bit *
RAMDAC frequency: 400 MHz *
CrossFire/SLI support: possible *
Quad SLI support: possible *

Connection specifications
Connectors: support for HDCP, HDMI, DisplayPort ×3 *
HDMI version: 2.0 *

Math block
Number of universal processors: 3072 *
Shader version: 5.0 *
Number of texture blocks: 192 *
Number of rasterization blocks: 96 *

Additional features
Dimensions: 267×112 mm | 280×111 mm | 267×111 mm
Number of occupied slots: 2 *
Price: 74,300 rub. | 75,000 rub. | 75,400 rub.

The comparison table shows that the various manufacturers stick to the reference specification; the differences are insignificant - slightly different memory frequencies and adapter dimensions.

This model is no longer on sale from any manufacturer. In early 2018, a new card was introduced that outperforms it several times over both in games and in cryptocurrency mining.