NVIDIA has renewed its range of high-end graphics chips with the arrival of GeForce 8800. From the outset, this graphics card goes hand in hand with Windows Vista, since GeForce 8800 is the first graphics chip compatible with DirectX 10, and this support is native. That works out rather well, since Windows Vista will be the only operating system to benefit from DirectX 10. GeForce 8800 represents a genuine renewal for NVIDIA, whose last major 3D architecture dates back to GeForce 6800, i.e. mid-2004. Whereas GeForce 7 essentially reused the NV40 architecture with several improvements, GeForce 8, for its part, inaugurates a brand-new architecture.

The result of almost four years of development, G80, alias GeForce 8800, will have cost some 400 million dollars in research and development. Suffice it to say that NVIDIA is betting big on this new chip, whose architecture is, contrary to what had been suggested in the past, completely unified. Among NVIDIA's objectives when designing G80 are of course support for DirectX 10, the new 3D API from Microsoft, but also performance in DirectX 9.0 and finally image quality, an area where ATI has had the advantage lately with its latest Radeon X1900. Before presenting the GeForce 8800 architecture, we will go over the new functions of DirectX 10, then we will discover the card together, and finally we will move on to the benchmarks.

Be warned: the pages that follow are relatively technical. Although the presentation of DirectX 10 is essential for properly understanding the operation and architecture of G80, just like the presentation of that architecture itself, we invite readers who prefer to do so to skip directly to the benchmarks.

The new major version of Microsoft's consumer operating system, named Windows Vista, integrates a brand-new 3D programming interface. This interface, which has borne almost every name under the sun (Windows Graphics Foundation, DirectX Next, etc.), is finally called DirectX 10 and is reserved exclusively for Windows Vista, Microsoft having no intention of adding support under Windows XP. Although DirectX 10 succeeds DirectX 9.0, everything concerning graphics rendering was completely reviewed and rethought. DirectX 10 indeed abandons all the fixed functions inherited from previous versions of the API and is thus compatible only with DirectX 10 graphics chips. It is precisely for this reason that Windows Vista also integrates an adapted version of DirectX 9.0, named Direct3D 9Ex, to allow current games to run correctly on the new system.

By starting again from a clean slate, Microsoft aims, as before, to make its API lighter so that it spends less time on the processor. Dropping support for fixed functions is not the only reason for the reduced processor consumption of DirectX 10. For example, where DirectX 9.0 systematically validates resources at each use, DirectX 10 does so only once, at creation, which saves processor cycles. This is of course not the only change, since DirectX 10 offers a brand-new 3D pipeline as well as a brand-new programming model. Microsoft's engineers have indeed unified the instructions, whether for programming pixel shaders or vertex shaders, again with the aim of making programmers' lives easier. That said, this does not mean that a DirectX 10 graphics chip must have a unified architecture. A new version of DirectX generally means a new version of the Shader Model, and with DirectX 10 Microsoft adds support for Shader Model 4.0.

Geometry with DirectX 10!

Among the new features of Shader Model 4.0, we of course note the appearance of Geometry Shaders. The idea here is to allow the GPU to generate geometry and no longer only to manipulate it (which is what Vertex Shaders are for). Geometry Shaders operate on primitives (points, lines or triangles) to ultimately create whole shapes, a task formerly reserved for the CPU. Although a Geometry Shader cannot generate types of primitives different from those it works on, it can on the other hand access information about adjacent shapes, which will be useful, for example, for computing edges. Working from the data provided by the Vertex Shaders, which sit immediately before them in the new DirectX 3D pipeline, Geometry Shaders will be used to increase the geometric detail of 3D objects for more realism. This means adding polygons to an existing model, something that cannot be done with Vertex Shaders. Beyond that, Geometry Shaders should allow other uses such as particle system generation or fur-type effects. NVIDIA also mentions hair simulation effects. That said, the power required to process Geometry Shaders is such that the first generations of DirectX 10 GPUs should not be able to make intensive use of them.

The move to Shader Model 4.0 comes with many other new features, such as Stream Output, which makes it possible to write to memory the data resulting from processing at the Vertex Shader or Geometry Shader level, before the Pixel Shaders are processed. This should be particularly useful when a calculation requires several passes, since until now the GPU could write only pixels to memory. In addition to the unification of the instructions at the hardware level, the limits that were imposed at the software level on the programming of vertex or pixel shaders are now unified as well. As with each new version of DirectX, the maximum number of instructions increases, from 512 to 65,536, whereas the number of executed instructions becomes unlimited. The number of temporary registers was also revised upward, since it goes from 32 to 4096. Of course, graphics processor manufacturers will not have to integrate that many registers in their chips, but their drivers must manage them.

Other new features of DirectX 10:

Still with the aim of reducing the processor load generated by its API, Microsoft implements new state management with DirectX 10 and generalizes the use of constant buffers. These two functions should make it possible to address one of the shortcomings of DirectX 9.0: the impossibility of performing repetitive operations in batches. Because of this limitation, developers were forced to send a succession of commands to carry out such a task. With DirectX 10, programmers have access to five new state objects which capture the main properties of each part of the graphics pipeline. Constants, which are preset values used as parameters by programs, benefit from a memory buffer which can hold a total of 4096 constants. More precisely, DirectX 10 comprises 16 constant buffers, each able to hold 4096 constants. All the constants contained in a buffer can in addition be updated with a single command.

NVIDIA GeForce 8800: DirectX 10 constant buffers

The reduced processor load of DirectX 10 makes it possible to render a greater number of objects
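To put these figures in perspective, here is a minimal Python sketch of the constant buffer capacity described above, together with a deliberately simplistic cost model of our own: one command per constant without batching, one command per buffer with it. The function names and the cost model are illustrative assumptions and are not part of the DirectX API.

```python
# Figures quoted in the article for DirectX 10 constant buffers.
CONSTANT_BUFFERS = 16
CONSTANTS_PER_BUFFER = 4096


def total_constants(buffers=CONSTANT_BUFFERS, per_buffer=CONSTANTS_PER_BUFFER):
    """Total number of constants addressable across all buffers."""
    return buffers * per_buffer


def commands_needed(num_constants, batched):
    """Toy cost model (an assumption, not a measurement).

    Without batching, every constant is sent with its own command.
    With batching, a whole buffer is updated with a single command.
    """
    if not batched:
        return num_constants
    # Ceiling division: number of buffers needed to hold the constants.
    return -(-num_constants // CONSTANTS_PER_BUFFER)


if __name__ == "__main__":
    print("Total constants:", total_constants())
    print("Commands, one per constant:", commands_needed(5000, batched=False))
    print("Commands, one per buffer:", commands_needed(5000, batched=True))
```

Under this toy model, updating 5000 constants takes 5000 commands one at a time but only 2 when whole buffers are updated at once, which is the kind of saving the API change is aiming for.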

There are also changes to report in texture management, since Microsoft increases the number of textures handled and changes their access mode. We thus go from 16 to 128 textures, and this also applies to Vertex Shaders, which previously had to make do with a maximum of 4 textures, as well as to the new Geometry Shaders. Texture resolution was also improved, and DirectX 10 handles textures with a resolution of 8192x8192 pixels. This figure should be compared with the maximum texture resolution handled by DirectX 9.0 graphics chips: 4096x4096 pixels. Still on the subject of textures, Microsoft introduces texture arrays to simplify their management. Tied to Shader Model 4.0, this function makes it possible to use many textures in a shader and no longer just a handful, as is the case with DirectX 9.0. Under DirectX 9.0, developers had to create textures whose only purpose was to work around the significant latency caused by switching textures.

NVIDIA GeForce 8800: DirectX 10 texture arrays

Illustration of the use of texture arrays for increased detail

Microsoft also takes the opportunity with DirectX 10 to make support for FP16 filtering mandatory, just like access to and filtering of shadow maps. And since good news never comes alone, Microsoft's developers offer a new type of texture access. This new instruction allows a specific pixel to be fetched using unnormalized coordinates. This should also make it possible to use textures to store something other than image data, while allowing the GPU to access the data in the same manner as a CPU. The MRT (Multiple Render Target) function of DirectX 9.0 is improved with DirectX 10, since the number of targets goes from four to eight. Let us recall that MRT makes it possible to generate several different outputs in a single pass.

The other notable evolution brought by Microsoft's API relates to calculation precision. Although this remains fixed at the FP32 format, Microsoft imposes new constraints on GPU manufacturers by preventing them from handling special numbers or rounding precision in their own way, as is the case with the GeForce 6/7 and Radeon X1000 generations. Microsoft is thus trying to bring GPUs closer to the IEEE 754 standard which is common on CPUs, even if DirectX 10 is currently not fully compliant with IEEE 754. To this rather significant change is added full support for 32-bit integers, in addition to the support for 32-bit floating point already found in DirectX 9.0c. On the HDR (High Dynamic Range) side, DirectX 10 introduces two new HDR formats close to FP16 but requiring less memory. The first format, R11G11B10, is optimized for floating-point texture storage with, as its name suggests, 11 bits for the red and green components and 10 bits for blue. The second HDR format is designed to be used with render targets, and since each color is stored on an even smaller number of bits, it also helps reduce the usually high bandwidth costs generated by HDR.

NVIDIA GeForce 8800: new DirectX 10 HDR formats

Two new HDR formats with DirectX 10
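The memory saving of the R11G11B10 format is easy to check with a little arithmetic. The Python sketch below compares it with an FP16 surface; we assume here that the FP16 surface stores four 16-bit components (RGBA), and the 1600x1200 resolution is simply an example of ours.

```python
def bits_per_pixel(*channel_bits):
    """Sum the bits of every channel stored for one pixel."""
    return sum(channel_bits)


def surface_bytes(width, height, bpp):
    """Memory footprint of an uncompressed surface, in bytes."""
    return width * height * bpp // 8


# R11G11B10 as described in the article: 11 + 11 + 10 bits.
R11G11B10_BPP = bits_per_pixel(11, 11, 10)

# Assumption: an FP16 surface with four 16-bit components (RGBA).
FP16_RGBA_BPP = bits_per_pixel(16, 16, 16, 16)

if __name__ == "__main__":
    for name, bpp in (("R11G11B10", R11G11B10_BPP), ("FP16 RGBA", FP16_RGBA_BPP)):
        size = surface_bytes(1600, 1200, bpp)
        print(f"{name}: {bpp} bits per pixel, {size} bytes at 1600x1200")
```

Under these assumptions the packed format needs half the memory, and therefore half the bandwidth per pixel written or read, of the FP16 surface.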

With regard to Instancing, i.e. the possibility of drawing several copies of the same object in a single pass (concretely, the GPU computes the texture of a soldier and displays about fifteen of them on screen), DirectX 10 makes it possible to lift certain limitations. For example, the objects in question no longer need to use the same texture, thanks to the texture arrays mentioned earlier. Objects benefiting from Instancing can also use different shaders. In practice, Instancing version 2.0, as some developers like to call it, should avoid a proliferation of identical clones on screen, since each copy can look different.

Example of a game using Instancing to multiply the number of characters

Let us finish by mentioning antialiasing in DirectX 10 and the fact that multisampling remains optional. While waiting for DirectX 10.1, which should revisit this point, Microsoft introduces a new method, Alpha to Coverage, aimed at solving aliasing problems on polygons containing transparent portions; this kind of problem particularly affects outdoor scenes. In practice, Alpha to Coverage renders polygons containing partially transparent values with multisample antialiasing. This is a feature rather close to the Transparency Antialiasing that ATI and NVIDIA offer with Radeon X1000 and GeForce 7.

NVIDIA GeForce 8800: DirectX 10 Alpha to Coverage

DirectX 10 and Alpha to Coverage
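The general idea of Alpha to Coverage can be sketched in a few lines of Python: the alpha value of a fragment is converted into a mask saying how many of the pixel's multisample positions the fragment covers. The rounding rule and the choice of which samples are switched on are simplifying assumptions of ours; real hardware is free to use a different mapping.

```python
def alpha_to_coverage(alpha, samples=4):
    """Convert an alpha value in [0, 1] into a multisample coverage mask.

    Simplified model: the number of covered samples is alpha * samples,
    rounded to the nearest integer, and the lowest-numbered samples are
    the ones switched on.
    """
    covered = int(alpha * samples + 0.5)
    covered = max(0, min(samples, covered))
    return (1 << covered) - 1


if __name__ == "__main__":
    for alpha in (0.0, 0.3, 0.5, 1.0):
        mask = alpha_to_coverage(alpha)
        print(f"alpha={alpha}: mask={mask:04b}")
```

Because the resolve step of multisample antialiasing averages the samples, a half-transparent texel ends up contributing to about half of the pixel, which softens the edges of foliage or fences without sorting the polygons.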

NVIDIA G80: unified architecture!

We may as well warn you right away: it is better to forget everything you know about GeForce 6 and GeForce 7 when approaching G80. At the architectural level this is a major revolution, with NVIDIA adopting, to general surprise, a unified architecture. It should be said that DirectX 10 does not require a unified architecture for the graphics processor, which everyone had interpreted as a sign that G80 would ultimately be a kind of hybrid chip. That is not the case. At this stage we should recall that ATI was the first graphics chip manufacturer to offer a completely unified architecture, with the chip of the Xbox 360.

But before continuing, we need to clarify exactly what unified architecture means. Until now, the operation of the great majority of graphics processors has relied on a pipeline. Data arrives from the CPU and is processed stage by stage as it advances through the various levels of the pipeline. Since DirectX 8, the first stage consists of processing the vertex shaders, before assembling primitives such as lines, points or triangles. Then our primitives, these outlines of geometric figures, pass through the pixel pipeline to undergo shading operations before arriving at the ROPs (Raster Operations). At this stage, our embryonic pixels must still undergo a whole series of operations such as the Z test or antialiasing before being regarded as true pixels, once they are stored in video memory.

NVIDIA GeForce 8800: the traditional pipeline

Classical GPU architecture

With a unified architecture, there is no longer any question of partitioning the graphics chip into specific computing units dedicated to one task or another, producing a linear progression of pixels. No, all the computing units of GeForce 8800 are versatile and thus capable of processing any type of data (vertices, pixels, geometry, etc.). The number one advantage of this architecture is that it allows the dynamic distribution of tasks between the various computing units, according to the needs of the application. Whereas classical architectures have a fixed number of units dedicated to vertex processing, and likewise a fixed number of pipeline stages for processing pixels, the G80 architecture is programmable and can, again depending on the application, assign its units mostly to pixel shader processing, mostly to vertex shader processing, or to both at the same time. Currently, the majority of 3D programs require more computing power for pixel shaders than for vertex shaders, which explains why traditional graphics chips have more pixel shader units than vertex units. NVIDIA points out that applications generally have differing needs and that certain scenes of the same game can be limited by vertex computing power (for example). The unification of the architecture therefore makes it possible to allocate the maximum resources needed for the calculation in progress at any given moment during the execution of any 3D application.

Why unify the architecture… according to NVIDIA

NVIDIA: exit the pipeline, welcome to the Streaming Processor
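The benefit of dynamic distribution can be illustrated with a toy model in Python. The unit counts and workloads below are hypothetical figures chosen for the example, and the model ignores everything except raw arithmetic throughput: with fixed units, the slower of the two stages sets the pace, whereas with unified units the whole workload is shared by every unit.

```python
def frame_time_fixed(vertex_work, pixel_work, vertex_units, pixel_units):
    """Dedicated units: the frame is as slow as its busiest stage."""
    return max(vertex_work / vertex_units, pixel_work / pixel_units)


def frame_time_unified(vertex_work, pixel_work, total_units):
    """Unified units: any unit can take vertex or pixel work."""
    return (vertex_work + pixel_work) / total_units


if __name__ == "__main__":
    # Hypothetical chip: 8 vertex units + 24 pixel units, or 32 unified units.
    scenes = {"vertex-heavy scene": (160, 96), "pixel-heavy scene": (16, 240)}
    for name, (vertex_work, pixel_work) in scenes.items():
        fixed = frame_time_fixed(vertex_work, pixel_work, 8, 24)
        unified = frame_time_unified(vertex_work, pixel_work, 32)
        print(f"{name}: fixed={fixed}, unified={unified}")
```

In this model the unified chip is never slower than the partitioned one, and the gap is largest precisely when the workload does not match the fixed split, as in the vertex-heavy scene.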

In its GTX version, GeForce 8800 comprises 128 general-purpose computing units called Streaming Processors. Very different from current units, these Streaming Processors are scalar rather than vector units. Like Intel's SSE instructions, the principle of vector computing is to work simultaneously on an array of values. Typically, for a GPU, this means processing the various components of the same element (the X, Y, Z coordinates of a vertex or the RGBA color values of a pixel). The principle of a scalar unit is to work on only one such value at a time. Admittedly, vector operations are in the majority overall; however, 3D programs also make use of scalar calculations, and in this precise case vector units are ill-suited. According to analyses carried out by NVIDIA, which says it studied several hundred shader programs, these increasingly resort to scalar calculations, in particular when the shaders are long and complex, which would justify the move to a scalar architecture.

View of the G80 architecture of GeForce 8800
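The argument in favor of scalar units can be made concrete with a small Python sketch. It compares one 4-wide vector unit with four scalar units on a made-up instruction stream; the stream itself is an assumption of ours, and the model ignores dependencies between instructions, so it shows a best case for the scalar design.

```python
from math import ceil


def vector_cycles(op_widths, lanes=4):
    """Each instruction occupies the whole vector unit, even if it only
    uses one of its lanes."""
    return sum(ceil(width / lanes) for width in op_widths)


def scalar_cycles(op_widths, units=4):
    """Scalar units process individual components, so they can be packed
    tightly (dependencies between instructions are ignored here)."""
    return ceil(sum(op_widths) / units)


def vector_utilization(op_widths, lanes=4):
    """Fraction of vector lanes doing useful work."""
    return sum(op_widths) / (vector_cycles(op_widths, lanes) * lanes)


if __name__ == "__main__":
    # Hypothetical shader: number of components touched by each instruction.
    shader = [4, 1, 1, 3, 1]
    print("Vector unit cycles:", vector_cycles(shader))
    print("Scalar units cycles:", scalar_cycles(shader))
    print("Vector lane utilization:", vector_utilization(shader))
```

On this example the vector unit needs 5 cycles and keeps only half of its lanes busy, while four scalar units finish in 3 cycles, which illustrates why shaders rich in scalar operations favor the scalar approach.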

In the GeForce 8 architecture, NVIDIA groups Streaming Processors in sets of 8, and each arithmetic unit includes a total of 16 of these processors. Each processor supports the IEEE 754 standard for floating-point precision and can simultaneously execute two scalar instructions (dual-issue) of type MAD (multiply-add) and MUL.

If we wanted to draw a parallel with existing architectures, we would be tempted to say that GeForce 8800 is equivalent to a graphics chip with 32 pipelines. That would not be entirely accurate: while for vector operations GeForce 8800 behaves like a chip with 32 pipelines, the same does not hold when scalar operations are considered. In short, G80 brings much greater flexibility in the execution of shaders. This is all the more true as the operating frequency of the Streaming Processors differs from that of the rest of the chip: while the chip as a whole runs at 575 MHz, the Streaming Processors operate at 1.35 GHz (at least on the 8800 GTX model).
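The figures above lend themselves to a quick back-of-the-envelope calculation, sketched here in Python. Two assumptions are ours: a traditional pipeline is taken to be 4 components wide, and a MAD is counted as two floating-point operations and a MUL as one, which gives a theoretical peak only.

```python
STREAM_PROCESSORS = 128   # GeForce 8800 GTX
SHADER_CLOCK_MHZ = 1350   # 1.35 GHz
VECTOR_WIDTH = 4          # assumption: one classic pipeline = 4 components
FLOPS_PER_CLOCK = 2 + 1   # assumption: MAD counts as 2 operations, MUL as 1


def equivalent_vector_pipelines(processors=STREAM_PROCESSORS, width=VECTOR_WIDTH):
    """Number of 4-wide pipelines the scalar processors add up to."""
    return processors // width


def peak_mflops(processors=STREAM_PROCESSORS, clock_mhz=SHADER_CLOCK_MHZ,
                flops_per_clock=FLOPS_PER_CLOCK):
    """Theoretical peak in millions of floating-point operations per second."""
    return processors * clock_mhz * flops_per_clock


if __name__ == "__main__":
    print("Equivalent vector pipelines:", equivalent_vector_pipelines())
    print("Theoretical peak:", peak_mflops() / 1000, "GFLOPS")
```

The 32-pipeline comparison therefore corresponds to 128 scalar processors divided by 4 components, while the separate 1.35 GHz shader clock is what pushes the theoretical peak well beyond what the 575 MHz core clock alone would suggest.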

Improvements and Early-Z

With GeForce 8, NVIDIA also addresses one of the major problems of its GeForce 7: branching performance. Since Shader Model 3.0, programmers can add conditions (what are called branches) to their shaders in order to skip a whole block of instructions. Every GPU breaks the scene up into several batches of pixels, and each unit executes the shaders on the pixels in question. The problem with branches is that if all the pixels do not satisfy the same conditions, the program, or at least part of the program, must be executed several times. As a result, the size of the pixel batches (also called granularity) is critical to obtaining good performance. With GeForce 7, the pipelines worked on blocks of 1024 pixels, rendering the concept of branching pointless. In order to offer real support for Shader Model 3.0, ATI had introduced a granularity of 16 pixels on its X1800 (before moving to 48 pixels on X1900). GeForce 8800 takes a step in the right direction by likewise offering a finer granularity, ranging between 16 (4x4) and 32 (8x4) pixels depending on the case.
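The effect of granularity can be simulated with a short Python sketch. It is a simplified one-dimensional model of our own: pixels are laid out in a row rather than in 2D tiles, the per-pixel costs of the two sides of the branch are arbitrary, and a batch whose pixels disagree on the condition pays for both sides.

```python
def shading_cost(mask, block_size, cost_true=10, cost_false=2):
    """Total shading cost when pixels are processed in batches.

    mask holds the branch condition for every pixel. A batch in which
    every pixel agrees runs only one side of the branch; a mixed batch
    has to run both sides for all of its pixels.
    """
    total = 0
    for start in range(0, len(mask), block_size):
        block = mask[start:start + block_size]
        per_pixel = 0
        if any(block):
            per_pixel += cost_true
        if not all(block):
            per_pixel += cost_false
        total += per_pixel * len(block)
    return total


if __name__ == "__main__":
    # Hypothetical scene: the first 520 of 1024 pixels take the costly branch.
    mask = [i < 520 for i in range(1024)]
    for block_size in (1024, 32, 16):
        print(f"granularity {block_size}: cost {shading_cost(mask, block_size)}")
```

With a single 1024-pixel batch every pixel pays for both sides of the branch, whereas with batches of 16 or 32 pixels only the one batch straddling the boundary does, which roughly halves the total cost in this example.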

Still with the aim of improving the efficiency of its chip, NVIDIA introduces Early-Z. The idea is to eliminate as early as possible the pixels known not to be visible on screen. Without being completely new, NVIDIA's approach with Early-Z consists of discarding useless pixels upstream in the pipeline (what is called Z-Cull), so as to avoid wasting resources needlessly. Every GPU has techniques intended to solve this problem, such as Hyper-Z at ATI.

Diagram of Early-Z operation
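The principle can be sketched in Python by counting how many fragments actually get shaded with and without an early depth test. This is a simplified software model of our own (smaller depth means closer to the camera, one depth value per pixel) and not a description of NVIDIA's hardware.

```python
def count_shaded(fragments, early_z):
    """Count the fragments that run the (expensive) pixel shader.

    fragments is a list of (pixel, depth) pairs in submission order.
    With early_z disabled, every fragment is shaded and the depth test
    only decides afterwards whether the result is kept.
    With early_z enabled, hidden fragments are rejected before shading.
    """
    depth_buffer = {}
    shaded = 0
    for pixel, depth in fragments:
        visible = depth < depth_buffer.get(pixel, float("inf"))
        if early_z and not visible:
            continue  # rejected before any shading work is spent
        shaded += 1
        if visible:
            depth_buffer[pixel] = depth
    return shaded


if __name__ == "__main__":
    # Hypothetical fragments: four overlap on pixel (0, 0), one hits (1, 0).
    fragments = [((0, 0), 0.5), ((0, 0), 0.8), ((0, 0), 0.3),
                 ((0, 0), 0.9), ((1, 0), 0.4)]
    print("Shaded without early Z:", count_shaded(fragments, early_z=False))
    print("Shaded with early Z:", count_shaded(fragments, early_z=True))
```

In this example two of the five fragments are hidden behind geometry already drawn, so the early test saves two shader executions; the saving grows with the amount of overdraw in the scene.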

NVIDIA G80: Beyond Streaming Processors…

Of course the G80 architecture cannot be reduced to its Streaming Processors alone: each block of Streaming Processors also comprises four texture units. Completely decoupled from the processors and running at 575 MHz on GeForce 8800 GTX, the bilinear texture filtering units number 64 in total and are actually grouped in pairs. This impressive number of texture units allows NVIDIA to claim the ability to process 64 pixels per clock cycle for texture filtering or 32 pixels per clock cycle for texture addressing. These figures should be compared with the ability of GeForce 7900 to process 24 pixels per clock for simple filtering. As mentioned above, the texture units are decoupled from the processors. This detail is important since it makes it possible to save GPU cycles during texture addressing or filtering operations, for example. With a traditional graphics chip like GeForce 7, a texture address calculation was generally interleaved with the shaders' arithmetic operations, thus preventing the shader processor from being used until the textures had been fetched. Thanks to the decoupled operation of the texture units and the Streaming Processors, GeForce 8 can hide the latency caused by texture accesses rather effectively.

Let us continue our review of the G80 architecture by turning to the ROPs. Separate from the arithmetic units containing the famous processors, the ROP (Raster Operation) partitions number six, and their role is still to write pixels to memory, in other words the last operation needed to generate an image. In terms of characteristics, NVIDIA indicates that each of the six ROP partitions of G80 can process 4 pixels, for a total of 24 pixels per clock cycle when processing complete pixels (color and Z). If only Z processing is applied, the ROPs of GeForce 8800 can process 192 pixels per clock cycle with one sample per pixel. Let us recall that, against the 24 ROPs of GeForce 8, GeForce 7 had 16 units.
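These per-clock figures can be turned into theoretical fill rates with a few lines of Python. We assume here that the ROPs run at the 575 MHz core clock of GeForce 8800 GTX, which the figures quoted above do not state explicitly, so the results should be read as an order of magnitude.

```python
ROP_PARTITIONS = 6
PIXELS_PER_PARTITION = 4          # complete pixels (color and Z) per clock
Z_ONLY_PIXELS_PER_CLOCK = 192     # figure quoted by NVIDIA
CORE_CLOCK_MHZ = 575              # assumption: ROPs run at the core clock


def color_z_pixels_per_clock(partitions=ROP_PARTITIONS,
                             per_partition=PIXELS_PER_PARTITION):
    """Complete pixels written per clock across all ROP partitions."""
    return partitions * per_partition


def fill_rate_mpixels(pixels_per_clock, clock_mhz=CORE_CLOCK_MHZ):
    """Theoretical fill rate in millions of pixels per second."""
    return pixels_per_clock * clock_mhz


if __name__ == "__main__":
    full = color_z_pixels_per_clock()
    print("Color + Z:", full, "pixels/clock,", fill_rate_mpixels(full), "Mpixels/s")
    print("Z only:", Z_ONLY_PIXELS_PER_CLOCK, "pixels/clock,",
          fill_rate_mpixels(Z_ONLY_PIXELS_PER_CLOCK), "Mpixels/s")
    print("Z-only speed-up:", Z_ONLY_PIXELS_PER_CLOCK // full, "x")
```

The Z-only rate is thus eight times the rate for complete pixels, which is what makes depth-only passes, such as shadow map rendering, comparatively cheap.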

In addition to Early-Z mentioned earlier, NVIDIA also made improvements to traditional Z-Culling, which remains relevant, since neither of the two methods can handle every situation at the level of an individual pixel. NVIDIA claims a processing speed four times higher than GeForce 7900 GTX for testing pixel visibility. The goal is always to check whether a pixel is hidden by another, in order to avoid needlessly wasting GPU resources on displaying a pixel that will ultimately be invisible to our eyes. The ROP subsystem of G80 naturally handles multisampling (MSAA) or supersampling (SSAA) antialiasing, not forgetting transparency antialiasing. For HDR, blending of FP16 and FP32 render targets is logically handled by the ROPs and, as the DirectX 10 specifications require, 8 render targets (MRT or Multiple Render Targets) can be used.

Lastly, the concept of the memory controller also evolves with GeForce 8, since the chip, in its GTX version, contains six memory partitions matched with the six ROP partitions. Each partition has a 64-bit interface, making GeForce 8800 GTX a chip whose memory controller is ultimately 384 bits wide. That allows NVIDIA to equip the high-end card with 768 MB of GDDR3 memory. NVIDIA did not make the jump to GDDR4.

The Lumenex engine: anisotropic filtering, antialiasing and HDR on the menu!

Behind the marketing name Lumenex, a name which, for the record, succeeds the CineFX introduced with the disastrous GeForce FX, NVIDIA offers significant improvements in image quality. One of the most notable concerns the quality of texture filtering. We mentioned above the power of the texture units in G80 compared with its predecessors; it makes it possible to avoid resorting to filtering optimizations that are often too aggressive. We remember that the GeForce 7 architecture was regularly criticized for the poor quality of its filtering. That is no longer the case, at least with GeForce 8, since NVIDIA finally offers anisotropic filtering worthy of the name, as can be seen in the screenshot below! Note that by default, the driver now uses the high-quality mode!

NVIDIA GeForce 8800: D3D AFTester - G71

Quality of 8x anisotropic filtering, from left to right: GeForce 7900, Radeon X1950, then GeForce 8800

But this is not the only notable new feature of Lumenex, which NVIDIA incidentally presents as the new standard in image quality (...): the firm also offers improvements to antialiasing. Thus CSAA, or Coverage Sampling Anti-Aliasing, technology makes its appearance. The objective is always the same: to reduce as much as possible the staircase effects that appear on the objects making up a 3D scene, while trying not to hurt performance too much! Traditionally, antialiasing is a particularly heavy technique which weighs severely on the performance of the graphics chip. Using multiple sub-pixel samples to calculate the color of our objects, antialiasing requires ever more samples to deliver optimal quality. But with more samples to be stored in memory and calculated, a proportional increase in resources is needed to maintain a decent level of performance.

NVIDIA therefore introduces CSAA, which relies on a new algorithm that makes intelligent use of coverage information in order to avoid clogging the memory bus with a multiplication of samples. Sparing with details on the exact operation of its CSAA technology, NVIDIA indicates that it offers the quality of 16x antialiasing with a performance hit no greater than when traditional 4x antialiasing is enabled. Concretely, CSAA acts as multisampling antialiasing (MSAA) whose coverage mask is programmable: put plainly, only certain areas of the image benefit from AA processing. Eventually, with DirectX 10.1, the programmer will be able to define the areas of the image requiring more antialiasing. Today, since developers cannot yet do this themselves, it is the driver that determines these areas, which largely reduces the appeal of this function. In practice, the contribution of CSAA does not seem obvious in terms of graphics quality, if we are to believe these screenshots from Half-Life 2: Lost Coast at 1600x1200:

NVIDIA GeForce 8800: CSAA test - antialiasing (G80)

Antialiasing tests with GeForce 8800, from left to right: no antialiasing, standard 4x antialiasing, 16x CSAA antialiasing

It is difficult to see a real difference in quality between rendering with traditional 4x antialiasing and with 16x CSAA in the same game (Half-Life 2: Lost Coast). On the control side, enabling CSAA requires a little configuration. It is enough to select the option “enhance the application setting”, or to go further and choose one of the four new antialiasing modes available: 8x, 8xQ, 16x and 16xQ. The good news is that, according to NVIDIA, the technology is compatible with all games. We do, however, have serious doubts in view of our screenshots :-).

Activation of CSAA in the GeForce 8800 drivers

But Lumenex does not stop there, since NVIDIA offers, for the first time, the use of antialiasing together with HDR effects! Let us remember that ATI had until now been alone, with its Radeon X1000, in offering the ability to use both HDR and antialiasing. NVIDIA thus catches up on this precise point, which should satisfy most users. And since we are speaking of HDR, it should be noted that with the Lumenex engine NVIDIA introduces improved HDR support. Let us recall that HDR, or High Dynamic Range, makes it possible to display scenes with very strong variations in brightness, to reproduce more closely natural effects such as backlighting or the reflection of light. Following Microsoft's recommendations for DirectX 10, NVIDIA offers HDR compliant with the OpenEXR specifications, with 32-bit floating-point precision per component, i.e. 128-bit HDR. Compared with GeForce 7, NVIDIA thus offers twice the precision, the graphics chips of the G7x generation being limited to floating-point precision of 16 bits per component.

Let us finish with the display engine which, like that of Radeon X1000, is of the 10-bit type. Whereas a graphics chip traditionally integrates an 8-bit display engine, for a total of 16.7 million displayable colors, GeForce 8 benefits from a 10-bit engine which can display more than a billion different colors. NVIDIA nevertheless runs up against the same problem as ATI, namely the lack of 10-bit content and the scarcity of 10-bit screens: the majority of flat-panel displays are indeed of the 8-bit type.
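The color counts follow directly from the number of bits per component, as this short Python sketch shows. It assumes three color components (red, green and blue) per pixel.

```python
def displayable_colors(bits_per_component, components=3):
    """Number of distinct colors for a given precision per component."""
    return 2 ** (bits_per_component * components)


if __name__ == "__main__":
    eight_bit = displayable_colors(8)
    ten_bit = displayable_colors(10)
    print("8-bit engine:", eight_bit, "colors")
    print("10-bit engine:", ten_bit, "colors")
    print("Ratio:", ten_bit // eight_bit)
```

Two extra bits per component multiply the palette by 64, which is how we get from roughly 16.7 million colors to slightly more than a billion.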

NVIDIA CUDA: Compute Unified Device Architecture

With its GeForce 8800, NVIDIA takes a step toward developers wishing to use the graphics processor as a kind of coprocessor that can relieve the CPU of certain calculations. The goal is to turn the graphics chip into a general-purpose computing unit, and the massively parallel architecture of G80 lends itself to this. One need only replace G80's units of work, namely pixels, with threads, in other words processes. But before going further, let us recall that ATI was the first to take a step in this direction, with its support for the GPGPU (General-Purpose computing on Graphics Processing Units) project at the launch of its Radeon X1800, then with the arrival of a version of Folding@Home adapted to its Radeon X1900 chips. At ATI this is called Stream Computing, whereas at NVIDIA we talk about Thread Computing.

Why Thread Computing? Quite simply because, unlike ATI's Stream Computing, NVIDIA's solution would allow various threads to be processed and to communicate with each other. Another advantage offered by CUDA: the use of a programming language that could hardly be more standard. Consequently, it is possible to program G80 as one would program any x86 processor. Of course, CUDA relies on an additional instruction set, but NVIDIA has been particularly discreet about it. The company does not wish to reveal the CUDA instruction set publicly, reserving it for some tests.
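To give an idea of what threads that communicate with each other make possible, here is a sketch of a classic cooperative pattern, the tree reduction, written in plain Python. It is only a sequential simulation of our own and uses no CUDA API: the `shared` list stands in for memory visible to all the threads of a group, and each pass of the `while` loop stands in for a step after which all threads wait for one another.

```python
def block_reduce_sum(values):
    """Sum a non-empty list the way a group of cooperating threads would.

    At every step, thread number tid adds to its own slot the partial
    result left by a neighbouring thread in the shared array, so the
    number of active threads is halved each time.
    """
    shared = list(values)  # stands in for memory shared by the threads
    count = len(shared)
    stride = 1
    while stride < count:
        # All "threads" of one step finish before the next step starts.
        for tid in range(0, count, 2 * stride):
            if tid + stride < count:
                shared[tid] += shared[tid + stride]
        stride *= 2
    return shared[0]


if __name__ == "__main__":
    print(block_reduce_sum(range(1, 9)))
```

Eight values are summed in three steps instead of seven sequential additions, but this only works because each thread can read what its neighbour has just written, which is precisely the kind of exchange that a pure streaming model does not offer.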

NVIDIA announces that its CUDA technology can take advantage of SLI, and that it is possible to use the same graphics chip simultaneously for CUDA processing and for 3D. For the moment CUDA remains rather theoretical on the whole, since no concrete application benefits from it yet; we will return to it in more detail later.

NVIDIA PureVideo

Introduced with GeForce 6200 several months ago, PureVideo technology raised some smiles in its early days with many applications. However, it must be acknowledged that with new drivers and new generations of graphics chips, it has been largely improved. Today, the number one benefit of PureVideo is its ability to relieve the CPU when decoding video files in the H.264, VC-1, WMV, WMV HD and MPEG-2 HD formats.

To this end, GeForce 8800 integrates a block of transistors dedicated exclusively to video processing. However, enthusiasts will probably be disappointed to learn that NVIDIA has made no changes to the PureVideo logic. We thus find the same capabilities as with GeForce 7, apart from two new functions offered at the driver level: the PureVideo noise reduction and edge enhancement filters can now operate on high-definition video, and the power of the chip can be used to contribute to decoding.

NVIDIA G80: a standard... chip!

Technically, GeForce 8800 may be a standard graphics chip, but it is the most impressive we have ever seen. Its size is simply enormous, since no fewer than 681 million transistors drive it! That is more than twice the number of transistors in GeForce 7900 GTX (278 million) and in Core 2 Duo (291 million). Probably because of this particularly high number of transistors, NVIDIA did not want to take the risk of launching production of this big chip on a new 0.08 µm (80 nm) process. G80 thus remains on a 90 nm process at TSMC, just like the GeForce 7 series. The packaging of the chip was updated, since the die is covered with a very imposing metal heat spreader of the same kind as the one fitted to Athlon 64 processors.

NVIDIA GeForce 8800: the chip, here in its GTS version

GeForce 8800 introduces for the first time what NVIDIA calls the NV I/O 1. This is an additional chip about which NVIDIA gives few details and which integrates a good part of the 2D display logic, namely the RAMDACs, the TMDS transmitters and other transcoders in charge of the final display. Note that Dual-Link DVI is still included, as is the dual 400 MHz RAMDAC. Note also that the HDCP keys are stored in an external CryptoROM and not directly in the NV I/O.

The NVIO chip accompanying GeForce 8800

As for power consumption, and despite the fears one might have had, NVIDIA has ultimately remained fairly reasonable. According to our power consumption measurements, GeForce 8800 GTS consumes slightly less than a GeForce 7950 GX2, whereas the GTX model is far from it. NVIDIA recommends a 450-watt power supply with a 12-volt rail of 30 A for GeForce 8800 GTX. More modest, GeForce 8800 GTS will make do with a 400-watt power supply with a 12-volt rail of 26 A. In terms of thermal power, NVIDIA indicates a maximum TDP of 147 watts for GeForce 8800 GTS against 177 watts for GeForce 8800 GTX. Here, on the other hand, the TDP has almost doubled, which is never a good sign, since GeForce 7900 GTX had a TDP of 90 watts…

	Total system power consumption
GeForce 7900 GTX	256 Watts
GeForce 7950 GX2	277 Watts
GeForce 8800 GTS	273 Watts
GeForce 8800 GTX	308 Watts
Radeon X1950 XTX	280 Watts
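
The supply and thermal figures quoted in this section lend themselves to a quick sanity check. The Python sketch below reuses only numbers from the text and the table; the function and variable names are illustrative, and reading a rail rating as volts times amps is simply the usual P = V x I relation, not a formula given by NVIDIA.

```python
# Figures quoted in the article; all names here are illustrative only.
RECOMMENDED_12V_AMPS = {"GeForce 8800 GTS": 26, "GeForce 8800 GTX": 30}
TDP_WATTS = {
    "GeForce 7900 GTX": 90,
    "GeForce 8800 GTS": 147,
    "GeForce 8800 GTX": 177,
}
SYSTEM_WATTS = {
    "GeForce 7900 GTX": 256,
    "GeForce 7950 GX2": 277,
    "GeForce 8800 GTS": 273,
    "GeForce 8800 GTX": 308,
    "Radeon X1950 XTX": 280,
}


def rail_power_watts(amps, volts=12.0):
    """Power available on a supply rail, from P = V * I."""
    return volts * amps


def tdp_ratio(card, reference="GeForce 7900 GTX"):
    """How many times higher the TDP of `card` is than that of `reference`."""
    return TDP_WATTS[card] / TDP_WATTS[reference]


def system_delta_watts(card, reference):
    """Difference in measured total system consumption, in watts."""
    return SYSTEM_WATTS[card] - SYSTEM_WATTS[reference]


if __name__ == "__main__":
    for card, amps in RECOMMENDED_12V_AMPS.items():
        print(f"{card}: 12 V rail at {amps} A = {rail_power_watts(amps):.0f} W")
    for card in ("GeForce 8800 GTS", "GeForce 8800 GTX"):
        print(f"{card}: TDP is {tdp_ratio(card):.2f}x that of the GeForce 7900 GTX")
        delta = system_delta_watts(card, "GeForce 7950 GX2")
        print(f"{card}: {delta:+d} W at the system level vs GeForce 7950 GX2")
```

Under these assumptions, the recommended 12-volt lines correspond to 312 and 360 watts, and the GTX's TDP comes out at a little under twice that of the GeForce 7900 GTX.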

GeForce 8800 GTS and GTX: specifications

The arrival of the G80 is the occasion for NVIDIA to introduce a new GTS version… And unlike the previous generation, where the GeForce 7900 GT shared the same architecture as the GTX model, there are serious differences between the two variants of the new GeForce 8800. The GeForce 8800 GTS has 96 stream processors, 20 ROP units and a 320-bit memory interface. Its frequencies are 500 MHz for the chip, 1.2 GHz for the stream processors and 800 MHz for the memory. For its part, the GeForce 8800 GTX has 128 stream processors, 24 ROP units and a 384-bit memory interface. Its frequencies are 575 MHz for the chip, 1.35 GHz for the stream processors and 900 MHz for the memory. Where the GeForce 8800 GTS carries 640 MB of memory, the GTX model has 768 MB. Lastly, in terms of theoretical memory bandwidth, the GeForce 8800 GTS claims 64 GB/s against 86.4 GB/s for its big brother, the GTX model.

With such specifications, you will have understood that the GeForce 8800 GTX has 33% more stream processors and 20% more onboard memory than the GTS model. It remains to be seen in the following pages how that translates in practice.
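
The relative gaps between the two models can be checked directly from the figures quoted in this section. The following Python sketch is only a convenience calculation; the dictionary layout and function name are assumptions made for illustration.

```python
# Specifications as quoted in the text; the structure is illustrative only.
SPECS = {
    "GeForce 8800 GTS": {
        "stream_processors": 96,
        "rop_units": 20,
        "memory_mb": 640,
        "memory_bus_bits": 320,
        "core_mhz": 500,
        "shader_mhz": 1200,
        "memory_mhz": 800,
    },
    "GeForce 8800 GTX": {
        "stream_processors": 128,
        "rop_units": 24,
        "memory_mb": 768,
        "memory_bus_bits": 384,
        "core_mhz": 575,
        "shader_mhz": 1350,
        "memory_mhz": 900,
    },
}


def gtx_advantage_percent(field):
    """Percentage by which the GTX exceeds the GTS for one specification."""
    gts = SPECS["GeForce 8800 GTS"][field]
    gtx = SPECS["GeForce 8800 GTX"][field]
    return (gtx / gts - 1.0) * 100.0


if __name__ == "__main__":
    for field in SPECS["GeForce 8800 GTS"]:
        print(f"{field}: GTX is {gtx_advantage_percent(field):.1f}% higher")
```

With these inputs, the stream processor count differs by about a third, memory size, bus width and ROP count by a fifth, and the clocks by 12.5% to 15%.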

	GeForce 8800 GTS	GeForce 8800 GTX
Interface	PCI-Express 16x	PCI-Express 16x
Process	0.09µ	0.09µ
Transistors	681 million	681 million
RAMDAC	2 x 400 MHz	2 x 400 MHz
T&L	DirectX 10	DirectX 10
Stream processors	96	128
ROP units	20	24
Onboard memory	640 MB	768 MB
Memory interface	320 bits	384 bits
Memory bandwidth	64 GB/s	86.4 GB/s
GPU frequency	500 MHz	575 MHz
Stream processor frequency	1200 MHz	1350 MHz
Memory frequency	800 MHz	900 MHz
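
The theoretical bandwidth figures follow from the memory interface width and the memory frequency. As a minimal sketch in Python, assuming that GDDR3 transfers data twice per clock cycle and that the quoted 800 and 900 MHz are the actual memory clocks (an assumption, though it does reproduce the 64 and 86.4 GB/s given in the text), with 1 GB taken as 10^9 bytes:

```python
def memory_bandwidth_gb_s(bus_width_bits, memory_clock_mhz, transfers_per_clock=2):
    """Theoretical memory bandwidth in GB/s (1 GB = 10**9 bytes).

    transfers_per_clock=2 is an assumption: it models double data rate memory.
    """
    bytes_per_transfer = bus_width_bits / 8          # bus width in bytes
    mb_per_second = bytes_per_transfer * memory_clock_mhz * transfers_per_clock
    return mb_per_second / 1000.0                    # MB/s -> GB/s


if __name__ == "__main__":
    print("GeForce 8800 GTS:", memory_bandwidth_gb_s(320, 800), "GB/s")
    print("GeForce 8800 GTX:", memory_bandwidth_gb_s(384, 900), "GB/s")
```

The GTX's advantage in bandwidth therefore comes from both its wider bus and its higher memory clock.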

The cards: GeForce 8800 GTS from MSI, GeForce 8800 GTX from Sparkle

MSI and Sparkle are among the first manufacturers to have been able to send us cards based on the GeForce 8800, so we will quite naturally use their cards for this test. Note that, as is very often the case with a new high-end generation, all manufacturers are in the same boat, since Asus has exclusivity over the manufacture of cards carrying the G80! Also note that the PCB differs between the GeForce 8800 GTS and the GeForce 8800 GTX. Offered exclusively in PCI-Express 16x format, the GeForce 8 cards will not come in AGP format, the chip being incompatible with the HSI bridge developed by NVIDIA a few years ago.

With its GeForce 8800 GTS, MSI delivers a rather unusual card with a black PCB! Measuring about 23 centimetres, the card adopts a new kind of cooling system that proves rather monstrous. The cooler consists of a thick copper base that rests on the graphics chip and is connected to a large metal heatsink via a heat pipe. The whole assembly is covered by a plastic shroud. This is the CoolerMaster TM63 model. Heat is expelled from the card through openings placed on its upper part. With such a cooling system we could legitimately fear that the card would be particularly noisy, which fortunately is not the case, the cooler very often being quieter than that of the GeForce 7950 GX2.

With only one PCI-Express power connector, the card has one SLI connector and uses the NV I/O chip to manage the video outputs. On that front, we naturally find two Dual-Link DVI ports as well as a video output. Equipped with 640 MB of video memory, the card uses ten Samsung GDDR3 chips rated at 1.2 ns, and its frequencies match those recommended by NVIDIA: 500 MHz for the chip and 800 MHz for the memory.

On Sparkle's side, the GeForce 8800 GTX proves much larger than its little sister, the GeForce 8800 GTS. The PCB remains black, but it is 27 centimetres long! By comparison, the GeForce 7950 GX2 measures 23 centimetres. As a result, the card will have to go into a large case. Another notable difference from the GeForce 8800 GTS: the GTX model has two PCI-Express power connectors! Likewise, two SLI connectors are present on the card. The cooling system is identical in design to that of the GeForce 8800 GTS, although slightly larger. The fan again comes from CoolerMaster, this time a TM61 model.

Our GeForce 8800 GTX card comes with 768 MB of video memory, using 12 Samsung GDDR3 memory chips rated at 1.2 ns. The NV I/O chip is of course present, and the card's frequencies conform to those recommended by NVIDIA: 575 MHz for the graphics chip and 900 MHz for the memory. Sparkle's GeForce 8800 GTX offers two Dual-Link DVI connectors. Since Sparkle sent us a retail version of its card, we can briefly describe the contents of the box, which are rather sparse. We find a short manual, a driver CD, a single DVI/VGA adapter, an S-Video cable, a component output cable and two Molex to PCI-Express power adapters. The manufacturer does not forget the software side and supplies the full version of the game Call of Duty 2 as well as PowerDVD 6.

To evaluate the performance of the new GeForce 8800 GTS and GeForce 8800 GTX, we used the following configuration:

Intel Core 2 Extreme X6800 2.93 GHz,
eVGA nForce 680i SLI motherboard,
2 x 1 GB Corsair Twin2X 6400C3,
Western Digital Raptor 150 hard disk - Serial-ATA 150

As you will have noted, our motherboard uses the new NVIDIA nForce 680i SLI chipset, which we will cover in our next review. All tests were run under Windows XP Professional Service Pack 2, and our system used the latest BIOS and drivers available at the time of testing. It should be noted that the GeForce 8 drivers do not officially support SLI operation.

We compare here the MSI GeForce 8800 GTS and the Sparkle GeForce 8800 GTX with the GeForce 7900 GTX and the GeForce 7950 GX2, but also with an SLI pair of GeForce 7900 GTX. For the GeForce 7 graphics cards we used the ForceWare 96.94 driver, whereas we used ForceWare 96.97 for the GeForce 8 cards. We will of course not fail to pit the GeForce 7/8 against their ATI competitors, and for that we used the following configuration:

Intel Core 2 Extreme X6800 2.93 GHz,
Asus P5W-DH Deluxe motherboard (BIOS 1601),
2 x 1 GB Corsair Twin2X 6400C3,
Western Digital Raptor 150 hard disk - Serial-ATA 150

On the ATI side, we tested the Radeon X1950 XTX as well as a CrossFire pair of Radeon X1950 XTX, with the CATALYST 6.10 driver.

NVIDIA 8800 GTX - 8800 GTS 3DMark 06

NVIDIA GeForce 8800 - G80 - 3DMark 06

We start, as usual, with 3DMark 06, and right away the GeForce 8800 GTX shows a very strong result, similar to that of our CrossFire Radeon X1950 XTX in 2560x1600! Against the SLI GeForce 7900 GTX, NVIDIA's newcomer is 31% faster! And if we compare the score of the GeForce 8800 GTX with that of a single GeForce 7900 GTX, the gap reaches 137% in 2560x1600! Suffice it to say that ATI's Radeon X1950 XTX cannot keep up, since even the GeForce 8800 GTS proves faster. The MSI card is here 31% ahead of the Radeon X1950 XTX in 1920x1200.
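
Since all the comparisons in these pages are expressed as "X% faster", it can help to convert such figures into multipliers. The short Python sketch below shows the arithmetic; the 137% and 31% values come from the text, while the function names and the two scores used in the example are purely hypothetical and are not measurements from this test.

```python
def speedup_factor(gain_percent):
    """Turn 'X% faster' into a multiplier (137% faster -> 2.37x)."""
    return 1.0 + gain_percent / 100.0


def percent_faster(score_fast, score_slow):
    """Express the gap between two scores as 'X% faster'."""
    return (score_fast / score_slow - 1.0) * 100.0


def relative_performance_percent(gain_percent):
    """Share of the faster card's score that the slower card reaches."""
    return 100.0 / speedup_factor(gain_percent)


if __name__ == "__main__":
    # Figures quoted in the text for 3DMark 06.
    print(f"137% faster = {speedup_factor(137):.2f}x")
    print(f"the slower card reaches {relative_performance_percent(137):.1f}% of that score")
    print(f"31% faster = {speedup_factor(31):.2f}x")

    # Hypothetical scores, for illustration only.
    print(f"{percent_faster(2370, 1000):.0f}% faster")
```

A gap of 137% thus means that the GeForce 8800 GTX scores about 2.4 times as much as a single GeForce 7900 GTX at that resolution.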

NVIDIA 8800 GTX - 8800 GTS Half-Life 2 - Lost Coast - HDR

NVIDIA GeForce 8800 - G80 - Half Life 2 Lost Coast

Sparkle's GeForce 8800 GTX is once again in the lead, even in 2560x1600, where it does better than the SLI and CrossFire systems. However, the advantage of the GeForce 8800 GTX over the multi-GPU configurations is much smaller here. Against the GeForce 7950 GX2, the GeForce 8800 GTS is on par, while the GTX model is 35% faster in 2560x1600. And compared with the GeForce 7900 GTX and the Radeon X1950 XTX, which are tied in 2560x1600, the GeForce 8800 GTX delivers about 93% higher performance.

NVIDIA 8800 GTX - 8800 GTS Splinter Cell Chaos Theory - 1.05

NVIDIA GeForce 8800 - G80 - Splinter Cell

For the NSA's best-known agent, our friend Sam Fisher, the GeForce 8800 is a blessing. At least in 1600x1200, where the Sparkle card proves incredibly fast. We ran this test several times and the result at this resolution was always the same, which is rather strange. At higher resolutions the card's performance is much lower, and it even gives way to the SLI GeForce 7900 GTX, which then takes the lead. In 1920x1200, the GeForce 8800 GTS is as fast as the Radeon X1950 XTX, while it is 15% ahead of the GeForce 7900 GTX. But that is nothing compared with the GeForce 8800 GTX, which is 62% faster than the former high-end GeForce 7, the GeForce 7900 GTX. Here the performance gap between the GeForce 8800 GTS and the GeForce 8800 GTX reaches 42%.

NVIDIA 8800 GTX - 8800 GTS Quake 4 - v1.2

We move on to Quake 4, which is still based on the Doom 3 graphics engine. Here the GeForce 8800 GTX is in the lead and slightly outpaces our CrossFire Radeon X1950 XTX configuration as well as the SLI GeForce 7900 GTX. Compared with a single Radeon X1950 XTX, the GeForce 8800 GTX is 15% faster in 1920x1200, and the gap grows to 31% in 2560x1600. The GeForce 7900 GTX takes a beating here, since it is left far behind: MSI's GeForce 8800 GTS is already 35% faster, while the GeForce 8800 GTX shows a 70% higher framerate, still in 2560x1600!

NVIDIA 8800 GTX - 8800 GTS Call Of Duty 2 - v1.3

NVIDIA GeForce 8800 - G80 - Call Of Duty 2

Version 1.3 of Call Of Duty 2 gives the lead to the G80. Sparkle's GeForce 8800 GTX dominates the test, whereas MSI's GTS model ranks third, very slightly behind the SLI GeForce 7900 GTX. In 1600x1200 the difference between the two GeForce 8 cards is 24%, and it climbs to 34% in 2560x1600. In 1920x1200, the resolution of 24-inch widescreen monitors, the GeForce 8800 GTX is 57% more powerful than the GeForce 7900 GTX.

NVIDIA 8800 GTX - 8800 GTS F.E.A.R. Extraction Point

Based on the F.E.A.R. graphics engine, this latest game is particularly favorable to the GeForce 8800 GTX, which performs on par with the SLI GeForce 7900 GTX in 1600x1200 and 1920x1200 and regains the advantage in 2560x1600. Against the CrossFire Radeon X1950 XTX, NVIDIA's newcomer is 46% faster! But let us point out that, unless the Extraction Point executable is renamed to that of F.E.A.R., CrossFire acceleration is not operational: by default, the CATALYST 6.10 driver offers no acceleration for this expansion. The more modest GeForce 8800 GTS remains 60% faster than the GeForce 7900 GTX, and therefore than the Radeon X1950 XTX, which here performs on par with the last representative of the GeForce 7 line.

NVIDIA 8800 GTX - 8800 GTS Company Of Heroes - v1.2

Very recent, Company Of Heroes is an RTS, in other words a real-time strategy game. The result: the GeForce 8800 GTX takes the honors, being nearly 138% faster than the Radeon X1950 XTX in 1920x1200! This enormous figure also holds against the GeForce 7900 GTX, the two cards being tied. MSI's GeForce 8800 GTS delivers good performance and ranks third behind the SLI GeForce 7900 GTX. This test also highlights the uselessness of CrossFire here, and we will underline the very unpleasant bug that affects the GeForce 7950 GX2, with absolutely unbearable texture flickering (NVIDIA indicates that this bug will be corrected by patch 1.3 of the game).

NVIDIA 8800 GTX - 8800 GTS Tomb Raider Legend - v1.2

Lara Croft does not contradict our previous tests and puts our two GeForce 8 cards in the lead. In 1600x1200, the GeForce 8800 GTX delivers 37% higher performance. Compared with the GeForce 7900 GTX, the Sparkle card is practically three times faster in 2560x1600. And against ATI's Radeon X1950 XTX, the gap reaches 130% in 2560x1600! Enormous, quite simply.

NVIDIA 8800 GTX - 8800 GTS Far Cry v1.4 - HDR 7

Not exactly new, Far Cry has kept fairly close pace with graphics developments, supporting in turn geometry instancing, Shader Model 3.0 and HDR. One problem here: you will note the absence of a CrossFire Radeon X1950 XTX result in 2560x1600, as the game unfortunately would not start at this resolution. Far Cry is very favorable to the GeForce 8800 GTX, which remains the fastest at all resolutions, even if the SLI GeForce 7900 GTX is not far behind. In 1600x1200, the GeForce 8800 GTX is 84% faster than the Radeon X1950 XTX. The gap is even larger against a single GeForce 7900 GTX, while the GeForce 8800 GTX posts performance 35% higher than the GTS model. The latter is slightly better than the GeForce 7950 GX2 at all resolutions.

NVIDIA 8800 GTX - 8800 GTS Need For Speed Most Wanted - v1.3

The balance again tips clearly in favor of the GeForce 8800 GTX, but the surprise comes from the CrossFire Radeon X1950 XTX, which proves quite powerful here. Against a single Radeon X1950 XTX, the GeForce 8800 GTX is 86% faster! The GeForce 8800 GTS does a little better than the 7950 GX2 model, with 20% better performance in 2560x1600.

NVIDIA 8800 GTX - 8800 GTS F.E.A.R. Extraction Point - FSAA 4x - Anisotropic Filtering 16x

With the filtering functions enabled in F.E.A.R., CrossFire comes back into its own and moves ahead of the GeForce 8800 GTX. The latter nevertheless remains the fastest chip, with performance 36% higher than the Radeon X1950 XTX in 1920x1200, but this advantage falls to 14% in 2560x1600. We will note the weak performance of the GeForce 8800 GTS, since the GTX model is here up to 94% faster in 2560x1600! Curious; could it be a driver problem?

NVIDIA 8800 GTX - 8800 GTS Quake 4 - v1.2 - FSAA 4x

Back to Quake 4, this time with 4x antialiasing enabled. In the lead: the GeForce 8800 GTX, followed, in 2560x1600, by the Radeon X1950 XTX. Sparkle's GeForce 8800 GTX is 15% faster than ATI's high-end card, and against MSI's GTS model the advantage climbs to 55% in 2560x1600. Comparing the GeForce 8800 GTX with the GeForce 7900 GTX, we note that the newcomer is 54% faster in 1920x1200, whereas in 2560x1600 the advantage of the GeForce 8 reaches 110%!

NVIDIA 8800 GTX - 8800 GTS Call Of Duty 2 - v1.3 - FSAA 4x - Anisotropic Filtering

Still in Call Of Duty 2, but this time with the filtering functions enabled directly in the game. Without much surprise, the GeForce 8800 GTX remains well in the lead, ahead of the SLI GeForce 7900 GTX. And the difference between the two solutions is far from negligible! In 1920x1200, the GeForce 8800 GTX is 26% faster, and this figure climbs to 35% in 2560x1600. The GeForce 8800 GTS proves particularly convincing here, being as fast as the SLI GeForce 7900 GTX. Against the Radeon X1950 XTX, MSI's GeForce 8800 GTS is 13% faster in 1600x1200, and the advantage of NVIDIA's latest chip keeps growing as the resolution increases. Thus in 2560x1600, the GeForce 8800 GTS is 28% faster than the latest ATI chip.

NVIDIA 8800 GTX - 8800 GTS Half-Life 2 - Lost Coast - FSAA 4x - AF 8x

With Half-Life 2, enabling the filtering functions again tips the balance in favor of the GeForce 8800 GTX. It dominates its competitors, and in 2560x1600 it ties with the CrossFire Radeon X1950 XTX. In 1600x1200, the GeForce 8800 GTX is 33% more powerful than the GeForce 7950 GX2 and 77% faster than the GeForce 7900 GTX. The Radeon X1950 XTX being tied with the GeForce 7900 GTX, its gap to the GeForce 8800 GTX is therefore identical.

NVIDIA 8800 GTX - 8800 GTS Test CSAA - Half-Life 2 Lost Coast - GeForce 8800 GS

We finish our series of tests with an overview of the performance recorded with the GeForce 8800 GS when enabling CSAA in Half-Life 2 Lost Coast. We compare these results with those obtained with 4x antialiasing and without antialiasing. Note that in all cases 8x anisotropic filtering is active. As you can see, enabling CSAA 16x is not entirely without effect on performance, which is approximately 30% lower than with 4x antialiasing enabled in the game.

Conclusion

Having reached the end of this article, we must draw up a first assessment of the G80, NVIDIA's new high-end graphics chip. Surprising: that is the first word that comes to mind when considering the GeForce 8800. It must be said that the chip is very different from what we expected, NVIDIA having opted for a brand-new unified architecture in which each computing unit is able to process any type of data. The bet has paid off, since the performance is there! With DirectX 9.0c applications, performance is sometimes doubled compared with the GeForce 7900 GTX or ATI's current offering! On the architectural level, the chip thus seems successful overall, even if some will regret the ultimately rather low number of ROP units.

As good things never come alone, NVIDIA corrects the quality problems that plagued the GeForce 7. With the GeForce 8, users finally benefit from anisotropic filtering quality worthy of the name. Better still, it is even possible to use antialiasing together with HDR effects! Two changes that are more than welcome and that address the main defects of the GeForce 7. NVIDIA also takes the opportunity to introduce a new type of antialiasing, CSAA, but we were not able to see any real visual difference.

The GeForce 8800 also stands out as the first DirectX 10 graphics chip. Available only under Windows Vista, DirectX 10 should be the basis of a number of new games in the coming months. Crysis, the successor to Far Cry, should thus benefit from the additional features offered by DirectX 10. However, it is difficult to draw any conclusion today about how the GeForce 8800 will behave with DirectX 10 games. This is the great unknown, because the effectiveness of the G80 architecture will depend mainly on the choices that developers make.

With 681 million transistors, the chip requires a rather massive cooling system and a power supply that can handle a heavy load. While for a high-end card these two parameters matter less than for a mid-range solution, all the more so as the cooling fortunately remains quiet, the size of the card is simply impressive! The GeForce 8800 GTX is 27 centimetres long, and it is clear that such a card will not fit in every case… And even if it fits, it may block Serial-ATA ports or hard disks. As for the price, it hurts too, because NVIDIA announces a recommended retail price of 649 euros including all taxes! And while the availability of the GeForce 8800 GTX was already announced as particularly tight, NVIDIA has to face a last-minute problem, with a defective resistor on the first GeForce 8800 GTX samples leading to a systematic recall of the defective units! A more reasonable size and price come with the GTS version, which is less powerful, by up to 30% compared with the GTX version.