Tuesday, March 27, 2007

Quadtet: Nvidia GeForce 7900 Quad SLI Performance Review.

When announcing the quad SLI technology at CES 2006, Nvidia Corp. stated that there are some gamers who desire extreme performance, quality and exclusivity, but it did not begin shipping such products immediately. At CeBIT 2006 the company stated that systems featuring four GPUs were shipping, even though not all system makers can supply such computers even now. Two announcements and several systems shipped to end-users later, we are publishing a performance preview of the much-hyped technology to reveal the truth about it. Check out how a GeForce 7900 quad SLI performs against its rivals in over 10 games.

The market for high-end graphics sub-systems is changing: those who were video game enthusiasts a decade ago have grown up, but retained their passion for gaming. Now they do not want to just play; they want to enjoy the best quality and very high frame rates. Today they have much larger PC budgets than 8-10 years ago, and they not only want to be on the bleeding edge of technology, but can afford it.

About three years ago technology enthusiasts were happy with a single Radeon 9800 Pro, but a couple of years back Nvidia introduced its scalable link interface (SLI) technology, which allows two graphics cards to work in tandem to deliver higher frame rates and some improvements in quality.

The high end of the enthusiast community adopted multi-GPU technology rather quickly, but the total available market for such extreme graphics sub-systems did not increase much: the majority of computer gamers do not buy graphics cards more expensive than $399. Then Nvidia introduced its GeForce 7800 GTX at an extreme price of $599, and those who could afford it bought the boards once again despite the price. It looks like the number of customers who buy the cream of the crop does not grow; they simply buy more cards in their desire for the best.

If hardcore technology and video game enthusiasts, or at least some of them, can afford virtually anything, why not offer them something that promises the absolute highest performance and amazing quality? Why not sell them four graphics cards at once?


This is the idea behind quad SLI: to offer extreme gamers a truly extreme solution. But does the first implementation of quad SLI really bring extreme speeds and quality? Let’s find out together!

Graphics Not for Everyone

Quad SLI implies that users get four premium graphics processors that should allow them to attain very high speeds. It means that users have to buy four graphics chips with memory, each of which could sell for $500 as a single graphics card. Obviously, a quad SLI solution will be extremely expensive – about $2000 for the graphics sub-system.

But quad SLI is not meant to be inexpensive. It is meant to offer something that nothing else can and to make its owners feel like owners of a Ferrari or Lamborghini, luxurious sports cars. Very few people can afford those automobiles, while considerably more can acquire less expensive mass-produced sports cars and tune them to performance that is not significantly below that of a new Ferrari. Still, owning a black Lamborghini should bring a totally different feeling than owning a blue Nissan. The same applies to quad SLI: some graphics cards may perform very well, but the quad should perform faster.

Historically, there were premium-class graphics sub-systems: two Voodoo2 in scan-line interleaving (SLI) mode, GeForce 2 Ultra, Radeon X850 XT Platinum Edition, GeForce 7800 GTX 512 SLI and so on. However, they did not cost $2000 and the price-point is the main feature that distinguishes Nvidia’s quad SLI from all the other solutions.

However, such a high price-point needs several justifications and the main one is the best possible experience of a premium-class product:

* It outperforms all other graphics sub-systems on the market.
* It offers something which the rest simply cannot.
* It does not have any issues with the drivers or the software.
* It is quiet.
* Its owners must not experience any kind of issues – hardware-related, software-related or anything-else-related.

In the end, quad SLI must be the best of the best.

Nvidia is trying very hard to offer quad SLI in the best possible way: through system integrators, even though it continuously emphasizes that someday quad SLI computers will be assembled at home. Selling quad SLI through system makers for now is motivated by several factors:

* A quad SLI graphics sub-system is expected to be power hungry, and system integrators must ensure that it is supplied with enough power.
* Quad SLI has four fans and requires proper airflow inside the computer case: system integrators should make sure that the system has appropriate cooling, yet is not too noisy.
* Finally, the tech support of system integrators can address individual problems of their customers, making the overall quad SLI experience more pleasant.

What system integrators cannot improve, however, is performance in games, driver quality, compatibility with games and so on; these have to be addressed by Nvidia.

At the moment Nvidia allows GeForce 7900 quad SLI systems to be sold to end-users; however, it reportedly does not allow system integrators to let the press benchmark the systems, for some reason. Nevertheless, it was reported in the media that some members of the press are still permitted to review quad SLI systems as ready-to-go personal computers, which means that articles about quad SLI should contain a lot of information about the systems and some benchmarks of the GeForce 7900 quad SLI. This once again emphasizes that potential customers will get reviews of the “quad SLI experience”, not just pure numbers.

Being a technology-related web-site, X-bit labs decided to skip the system part of the GeForce 7900 quad SLI preview and do classic technology research (albeit a bit shortened, for several reasons), something which FHM or Playboy magazines do not typically do.


Quad SLI Explained: PCI Express x48 Bridge Improves the Bandwidth

Certainly, developing a system that uses four graphics processors to render a frame is not easy and involves both hardware and software expertise. One of the problems that developers of multi-GPU technologies usually face is the bandwidth between graphics chips. The processors have to communicate with each other in various cases, especially in “scissor” mode, also known as “split frame rendering”, when the card that is going to display a frame should get the rest of it from another board as rapidly as possible.

In order to ensure that there is plenty of bandwidth between the chips and, perhaps, to provide its GPUs with some new capabilities, Nvidia has developed a special PCI Express x48 bridge, which, basically speaking, allows two GPUs to communicate with each other over a full-speed PCI Express x16 bus and also to have a “shared” PCI Express x16 link to the rest of the system.
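To put the lane arithmetic in perspective, here is a minimal sketch of how an x48 bridge might break down into three x16 ports, as the description above suggests. The figure of 250 MB/s per lane in each direction is our assumption for first-generation PCI Express, and the port names are purely illustrative; neither comes from Nvidia.

```python
# Assumption: first-generation PCI Express, 250 MB/s per lane per direction.
PCIE1_MB_PER_LANE = 250

# Illustrative port names; the bridge is described as linking two GPUs
# and sharing one x16 connection to the rest of the system.
BRIDGE_PORTS = {"gpu0": 16, "gpu1": 16, "host": 16}


def link_bandwidth_mb(lanes: int) -> int:
    """One-way bandwidth of a link with the given lane count, in MB/s."""
    return lanes * PCIE1_MB_PER_LANE


total_lanes = sum(BRIDGE_PORTS.values())           # 48 lanes in total
host_bandwidth = link_bandwidth_mb(BRIDGE_PORTS["host"])

print(total_lanes, host_bandwidth)
```

Under these assumptions both GPUs on one card compete for the same x16 link to the system, while each keeps a full x16 path to the bridge.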


Such an approach not only ensures a relatively speedy chip-to-chip interconnect, but also allows such dual-chip graphics cards to work on mainboards that do not officially support the SLI technology, for instance, on Intel’s platforms. In the case of its 4-way SLI, Nvidia ensures that the bandwidth provided to each chip is optimal.


It is not clear whether such a method with an additional chip is efficient in terms of performance compared to methods with no bridges, but it may have a tangible advantage if the chip (or driver) can determine the priority of each data stream, something which may reduce data latencies. Moreover, Nvidia has been using its PCI Express <=> AGP bridge for several years now without a problem.

All 4 One: Quad SLI Rendering Modes Explained

Currently Nvidia offers four SLI rendering modes that are different from each other:

In Alternate Frame Rendering, the driver divides workload by alternating GPUs every frame. For example, on a system with two SLI-enabled GPUs, frame 0 would be rendered by GPU0, frame 1 would be rendered by GPU1, frame 2 would be rendered by GPU0, and so on. Nvidia says that this is typically the preferred SLI rendering mode as it divides workload evenly between GPUs and requires little inter-GPU communication, allowing for up to a 1.9x performance increase in case of 2 GPUs.


We have not seen any statements from Nvidia that describe the benefits of AFR in a 4-GPU environment, even though the company states that the basic AFR principles work here as well: GPU0 renders frame 0, GPU1 renders frame 1, GPU2 renders frame 2 and GPU3 renders frame 3.
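The round-robin principle described above can be sketched in a couple of lines. This is only an illustration of the frame-to-GPU mapping, not driver code; the function name is ours.

```python
def afr_gpu_for_frame(frame: int, num_gpus: int) -> int:
    """Index of the GPU that renders the given frame under plain AFR."""
    return frame % num_gpus


# Two GPUs alternate; four GPUs take every fourth frame each.
print([afr_gpu_for_frame(f, 2) for f in range(4)])  # [0, 1, 0, 1]
print([afr_gpu_for_frame(f, 4) for f in range(6)])  # [0, 1, 2, 3, 0, 1]
```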

AFR works quite well in 2-GPU environments, even when render-to-texture (RTT) operations are used (necessary for environment mapping, shadow mapping and many other effects), which require graphics chips in SLI mode to broadcast any changes in render targets to each other. Nvidia admits that this typically results in a large data copy, causing bus traffic and synchronization overhead. To avoid this, application developers have to clear the color of texture render targets (RTs) every frame by calling a special command (“Clear()”) on the surface that corresponds to the texture. The problem is that some special cases need the rendering results of the previous frame to render the next one, for example, when the previous frame is used to approximate scene luminance for tone mapping, which means the RT must not be cleared. In that case game developers have to allocate a separate RT for each GPU (if an application uses one texture RT and happens to be running on an SLI system with two GPUs, Nvidia advises allocating two RTs instead: on even frames, perform all RTT operations on renderTarget0, and on odd frames perform all RTT operations on renderTarget1). However, it is unclear whether all programmers have already adopted the technique for two GPUs, let alone four.
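The per-GPU render target advice can be illustrated with a small sketch. This is not Direct3D code: the `RenderTargetPool` class is a hypothetical stand-in of ours, and only the renderTarget0/renderTarget1 naming follows the recommendation quoted above. Extending the pool to four entries is our extrapolation for quad SLI.

```python
class RenderTargetPool:
    """Hypothetical pool holding one render target per GPU."""

    def __init__(self, num_gpus: int):
        # Placeholder names instead of real GPU surfaces.
        self.targets = [f"renderTarget{i}" for i in range(num_gpus)]

    def target_for_frame(self, frame: int) -> str:
        # Under AFR, frame N is rendered by GPU N mod num_gpus, so each
        # GPU keeps writing to its own target and nothing is broadcast.
        return self.targets[frame % len(self.targets)]


pool = RenderTargetPool(num_gpus=2)
print([pool.target_for_frame(f) for f in range(4)])
```

With two GPUs, even frames land on renderTarget0 and odd frames on renderTarget1, exactly as in the recommendation.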


In Split Frame Rendering, the driver will clip the scene into multiple regions and assign the rendering workload for these regions to different GPUs. For example, on a system with two SLI-enabled GPUs, the screen may be divided vertically, with GPU0 rendering the top region and GPU1 rendering the bottom region. Rendering is also dynamically load balanced, so the scene division will change whenever the driver determines that one GPU is working more than another. According to Nvidia, this SLI rendering mode is typically not as desirable as AFR mode, since some rendering work is duplicated and communications overhead is higher. We suspect that SFR mode for 4 GPUs has even higher overheads and hence should deliver somewhat lower efficiency compared to the 2-GPU SFR.
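Nvidia has not published how the dynamic load balancing works, so the following is only a sketch of one plausible scheme: a simple proportional rule of our own that shrinks the region of whichever GPU took longer on the previous frame. The function name, the `gain` parameter and the clamping limits are all assumptions.

```python
def rebalance_split(split: float, time_top: float, time_bottom: float,
                    gain: float = 0.5) -> float:
    """Return the new share of screen height given to the top GPU.

    split       -- current share of the screen rendered by the top GPU (0..1)
    time_top    -- time the top GPU spent on the last frame
    time_bottom -- time the bottom GPU spent on the last frame
    """
    total = time_top + time_bottom
    if total <= 0:
        return split
    # Positive when the top GPU was slower, negative when it was faster.
    imbalance = (time_top - time_bottom) / total
    new_split = split * (1 - gain * imbalance)
    # Keep both regions non-trivial (limits chosen arbitrarily).
    return min(0.9, max(0.1, new_split))


print(rebalance_split(0.5, 12.0, 8.0))  # top GPU was slower, its share shrinks
```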


Specifically for quad SLI, Nvidia introduced the so-called AFR of SFR: frame 0 is rendered in scissor mode by GPU0 and GPU1, whereas frame 1 is rendered by GPU2 and GPU3.
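The AFR of SFR mapping can be sketched as follows. The function is our own illustration of the pairing described above and assumes exactly four GPUs grouped as (GPU0, GPU1) and (GPU2, GPU3).

```python
from typing import Tuple


def afr_of_sfr_gpus(frame: int) -> Tuple[int, int]:
    """Pair of GPUs that split the given frame between them."""
    pair = frame % 2          # frames alternate between the two pairs
    return (2 * pair, 2 * pair + 1)


print([afr_of_sfr_gpus(f) for f in range(4)])
# [(0, 1), (2, 3), (0, 1), (2, 3)]
```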

In compatibility mode, only one GPU is active and all other GPUs are idle. This offers no performance benefit, but ensures compatibility.

Finally, there is a mode called SLI antialiasing (SLI AA) that can blend the results of antialiased rendering from several GPUs into the final frame. In addition to the 8x and 16x modes, 4-way SLI adds a 32xs mode (which means that every GPU renders its frame using the 8xs algorithm with a certain offset and then the main GPU combines the results into one frame) that promises ultimate quality.


We should note that since there are 4 GPUs now, the SLI AA patterns that were effective for the 2-way solution no longer work here, and Nvidia has probably employed some new methods.
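The idea behind SLI AA 32xs can be sketched as simple arithmetic plus a blending step. We assume here that the main GPU combines the four images with a plain per-pixel average; Nvidia has not detailed the actual compositing, and the function names are ours.

```python
def effective_samples(num_gpus: int, samples_per_gpu: int) -> int:
    """Total samples per pixel when every GPU uses its own sample offsets."""
    return num_gpus * samples_per_gpu


def blend_frames(frames):
    """Average the same pixel across all GPU outputs (assumed compositing)."""
    count = len(frames)
    return [sum(pixel_values) / count for pixel_values in zip(*frames)]


# Four GPUs, each rendering with the 8xs algorithm.
print(effective_samples(4, 8))  # 32

# Toy two-pixel "frames", one per GPU; the first pixel lies on an edge.
gpu_outputs = [[0, 4], [4, 4], [8, 4], [4, 4]]
print(blend_frames(gpu_outputs))  # [4.0, 4.0]
```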

GeForce 7900 GX2: PCB Design

This is not the first attempt to make a graphics card with two high-performance graphics processors on board. We could recall the ASUS EN6800GT Dual, EN7800GT Dual or Gigabyte GV-3D1-68GT as examples.

And each time the developer faced a lot of problems, because it’s not just about putting two graphics chips on the PCB. Placing the memory chips, wiring the power circuit, and ensuring that the result can work at high frequencies are just some of the problems to be solved. It’s a terribly difficult job to design a PCB that would meet all the requirements, yet would have a normal size (installing an ASUS EN6800GT Dual into a standard system case was really troublesome).

All dual-chip graphics cards have been “single-storied” until the GeForce 7900 GX2. The developers have been sticking to traditional PCB designs although the concept of a two-storied graphics card isn’t new – it was first implemented back in 1998 in the Quantum 3D Obsidian2 X-24. That card was an ordinary Voodoo2 SLI made as a two-storied card, each storey accommodating one set of Voodoo2 chips.

Nvidia’s engineers took this path as they were developing their GeForce 7900 GX2. This graphics card is in fact two cards placed one above another.


The GeForce 7900 GX2 is also the longest graphics card to date. Its height is no greater than that of existing graphics cards, so there should be no problems if you try to install it into narrow system cases, but its length means that not all PC cases can be used to build a Quad SLI system. In some system cases the hard drive cage may get in the card’s way. On the other hand, the GeForce 7900 GX2 is only going to ship in ready-made computers for now, so you won’t have to solve that problem on your own.


Note that the left part of the top PCB is almost empty, but that’s not a waste of space, as you will see shortly. Seats for two more DVI-I connectors can be seen nearby. They are installed on the professional Quadro FX 4500 X2 graphics card that uses the same PCB design and can connect to four monitors simultaneously.

Despite the low power consumption of the G71 chip and the relatively low graphics memory frequency, the power circuit on the GeForce 7900 GX2 is rather complex and each PCB of this card is equipped with an independent voltage regulator. The MOSFETs in the bottom right corner of each PCB are cooled with a small aluminum heatsink. There’s a risk of overheating since the bottom heatsink is fully covered by the top PCB. That’s why the system case of a Quad SLI computer must be well ventilated. The PCI Express power connectors are rotated by 90 degrees relative to their usual position, probably to avoid making the card even longer. Each GeForce 7900 GX2 has two such connectors, so four external power connectors are attached to a whole Quad SLI subsystem that consists of two such cards.

The PCBs that make up a GeForce 7900 GX2 are connected to each other by means of metal hexagonal standoffs and screws. We took the screws out to access the bottom PCB and we also removed both coolers. Although the cooler of the bottom card is completely covered by the top card, overheating is not likely: there are special holes in the appropriate part of the PCB. So, the GeForce 7900 GX2 takes its air for cooling from both sides at once.


The empty space you could see on the top PCB is occupied by the PCI Express x48 switch chip and by two diagonally placed connectors for transferring data. Analogous connectors can be seen on the reverse side of the top PCB, and the PCBs communicate via two textolite contact panels. The reason for this connector position is very simple. Since PCI Express lanes should be of the same length for each point-to-point connection, a vertical connector placement would have required either turning some chips by 45 degrees or making the card layout much more complex. So, the diagonal connector placement is a pretty logical move here.


It’s through this connector that the top graphics processor receives data from the switch chip, which is directly attached to the PCI Express bus via a standard x16 connector on the bottom PCB. The second output of the x48 chip is directly linked to the bottom G71 processor.


The use of a switch chip allows the GeForce 7900 GX2 to work normally on any mainboard that is based on the nForce4 SLI X16 or any other SLI-compatible chipset from Nvidia. Moreover, as we see from the situation in the professional graphics cards market and in particular solutions like Nvidia Quadro FX 4500 X2, a switch chip like that may allow using the card on platforms that officially do not support SLI technology.


As required by the design of the quad SLI platform, the GeForce 7900 GX2 is equipped with two MIO connectors shifted relative to one another in order not to get in the way of the connecting bridges.

We removed the coolers to have a better view of the design of the PCBs near the graphics processor and memory chips. Each PCB of the GeForce 7900 GX2 carries a G71 D-N chip and 8 chips of GDDR3 SDRAM.


The graphics chips are marked as G71 D-N but they do not differ externally from those installed on the GeForce 7900 GTX and 7900 GT: they have the same revision A2 and lack a protective frame on the die package. It’s important to note that the graphics processors are clocked at a lower frequency on the GeForce 7900 GX2 than on the GeForce 7900 GTX: 500MHz against 650MHz.

Nvidia had no choice but to reduce the GPU frequency. Although the G71 chip features low heat dissipation, the cramped two-storied design of the GeForce 7900 GX2 leaves no room for an efficient cooling solution. At full frequencies the overall power consumption of the card would be over 160W, whereas reducing the GPU and memory frequencies helped keep it within 145W. And after all, it would be much more difficult to ensure stable operation of the complex two-storied solution at high frequencies.

The Samsung K4J52324QC-BC14 chips installed on the GeForce 7900 GX2 have 512Mbit capacity, work at 1.8V and can be clocked at 700 (1400) MHz. At this frequency the power consumption of each chip is about 1.7W, but since they are clocked at 600 (1200) MHz on the GeForce 7900 GX2, their power consumption is lower – 1.55W per chip. The 16 chips consume a total of about 25W. The chips are placed in the same manner as on other current top-end graphics cards; the total amount of graphics memory on the GeForce 7900 GX2 is 1 gigabyte. Two such cards will have a total of 2GB, but you should be aware that the memory amount does not add up in multi-GPU systems, so only 512 megabytes of memory will be available to applications, i.e. as much as each graphics processor in a quad SLI system has access to.
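The memory figures above are easy to verify. The short calculation below uses only the numbers quoted in this section; the constant names are ours.

```python
CHIP_CAPACITY_MBIT = 512   # capacity of one GDDR3 chip
CHIPS_PER_PCB = 8          # each PCB (one GPU) carries 8 chips
PCBS_PER_CARD = 2          # a GeForce 7900 GX2 is two PCBs
CARDS_IN_QUAD_SLI = 2
WATTS_PER_CHIP = 1.55      # at 600 (1200) MHz

per_gpu_mb = CHIP_CAPACITY_MBIT * CHIPS_PER_PCB // 8    # Mbit -> MB
per_card_mb = per_gpu_mb * PCBS_PER_CARD
quad_sli_mb = per_card_mb * CARDS_IN_QUAD_SLI
memory_power_w = CHIPS_PER_PCB * PCBS_PER_CARD * WATTS_PER_CHIP

print(per_gpu_mb, per_card_mb, quad_sli_mb)   # 512 1024 2048
print(round(memory_power_w, 1))               # 24.8
```

Only the per-GPU figure matters to applications, since every GPU has to hold its own copy of the data.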

We guess the use of slow memory may harm the GeForce 7900 GX2 in particular and the quad SLI concept at large, because it is the memory subsystem performance that matters in high resolutions and extreme full-screen antialiasing modes. Since the memory of the GeForce 7900 GX2 card works at 600 (1200) MHz, its speed is definitely not high enough to satisfy the growing demand for memory bandwidth in a system like that. In other words, the performance boost from the four graphics processors may be negated by the slow memory of each individual graphics processor. We’re going to check this supposition in our practical tests soon.
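To get a feel for the numbers, the sketch below estimates peak memory bandwidth per GPU at the actual 600 (1200) MHz clock and at the 700 (1400) MHz the chips are rated for. The 256-bit bus width is our assumption (eight chips with a 32-bit interface each); it is not stated above.

```python
def bandwidth_gb_s(effective_mhz: float, bus_bits: int = 256) -> float:
    """Peak memory bandwidth in GB/s for a given effective data rate.

    bus_bits defaults to 256, which is an assumption on our part.
    """
    bytes_per_transfer = bus_bits / 8
    return effective_mhz * bytes_per_transfer / 1000  # MB/s -> GB/s


actual = bandwidth_gb_s(1200)   # memory as clocked on the GeForce 7900 GX2
rated = bandwidth_gb_s(1400)    # what the same chips are rated for

print(round(actual, 1), round(rated, 1))  # 38.4 44.8
```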


GeForce 7900 GX2: Cooling System

The design of the cooling system of the GeForce 7900 GX2 is necessitated by the design of the graphics card itself. The card consists of two PCBs placed one above another, so Nvidia had to use thin, single-slot coolers. Theoretically, the developer might have put a larger cooler at least on the top PCB, but this would make it impossible to install two GeForce 7900 GX2 into one system with closely placed PCI Express x16 slots.

The coolers are a variation of the reference cooler of the GeForce 7800 GTX which we described in our review.


There’s a U-shaped heat pipe on the cooler’s sole that directly contacts the GPU die surface. The two heatsink sections are made of thin aluminum plates. The whole arrangement is covered with a plastic cap that doesn’t let the air stream leave the heatsink.

Airflow is created by a small blower that differs from the one used on the GeForce 7800 GTX only in the color of the blades. Here, they are made of a translucent acid-green UV-reactive plastic. The fans use a 4-wire connection, which implies an intelligent fan speed control system. It doesn’t guarantee quiet operation, though.

As we noted above, the cooler on the bottom PCB of the GeForce 7900 GX2 takes its air not from the thin slit between the fan’s blades and the top PCB, but through the special holes in the PCB. This doesn’t help much in a system with two GeForce 7900 GX2 cards: only the bottom card’s top cooler and the top card’s bottom cooler are going to get enough air for cooling. So, a Quad SLI system needs a well-ventilated system case or the graphics cards may just overheat. This is a problem for system integrators, though, because GeForce 7900 GX2 cards aren’t yet sold individually.

The slits in the graphics card’s mounting bracket seem to imply that the hot air is exhausted outside of the system case, but this is not implemented despite the correct orientation of the airflow from the top cooler. The reason is that there’s no air-directing casing on the way between the heatsink and the bracket. We guess they just installed a standard bracket from the GeForce 7900 GTX here.

By the way, it is possible the diagonal placement of the connector that links the top and bottom PCBs is due to cooling efficiency considerations. Oriented like that, the connecting card turns the airflow from the bottom cooler by 90 degrees and drives it away from the graphics card.


There are special projections at the external side of the cooler’s aluminum casing with which it contacts the memory chips, via heat-conductive pads. It also cools the X48 switch chip. The traditional dark-gray and very thick thermal paste is used as thermal interface here.

Overall, the cooling system of the GeForce 7900 GX2 is a compromise between cooling efficiency and compact size. The developers had to work within the limits imposed by the design features of this graphics card. The cooler should do its job well considering the low heat dissipation of the G71 chip, but you are not guaranteed a comfortable level of noise. The fans will probably have to work at increased speeds to cool the graphics processor and the memory chips in such cramped conditions.


Meet the Extreme High Definition: 2560*1600 in Action

Virtually all computer games today run quite well in 1600x1200 resolution with anisotropic filtering and all quality settings set to the maximum on a Radeon X1900 XT graphics card. However, there are quite a number of advanced users who want to enjoy their video games on 24” or 30” displays with 1920x1200 or 2560x1600 resolution. Nvidia believes that for this kind of customer even two graphics cards are not enough and four will be exactly what satisfies their demanding needs.

Nvidia calls the 2560x1600 resolution XHD, or extreme high definition, and specifically targets its quad SLI offering at those who happen to own a large 30” display and want to play games in its native wide-screen resolution.
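A quick calculation shows how much more work per frame the XHD resolution implies compared to the resolutions mentioned above, counting pixels only and ignoring everything else that affects performance.

```python
def pixels(width: int, height: int) -> int:
    """Number of pixels rendered per frame at the given resolution."""
    return width * height


xhd = pixels(2560, 1600)
print(xhd)                                   # 4096000
print(round(xhd / pixels(1600, 1200), 2))    # 2.13
print(round(xhd / pixels(1920, 1200), 2))    # 1.78
```

In other words, 2560x1600 has more than twice as many pixels to fill as 1600x1200.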


In addition to the truly impressive image quality that the 2560x1600 resolution provides, wide-screen displays with a 16:9 or 16:10 ratio give games a wider field of view, which makes the overall gaming experience even more immersive, while professional gamers may see more opponents and thus become more successful in online gaming. In fact, even fans of strategy games will take advantage of a wide-screen display.

While the advantages of 30” displays are indisputable, there are several points that should be considered:

* 30” displays from Apple or Dell start at $2500.
* The panels used in large displays are not really fast enough for rapid gaming.
* Very few games today support resolutions higher than 1600x1200, but those that do almost certainly support 1920x1200 and 2560x1600.

Meet the Sleeping Beauty: 32x Antialiasing Checked Out

Keeping in mind the points above and also considering another target group for quad SLI – those with smaller monitors who still demand extreme 3D quality on their screens – Nvidia introduced a new method of antialiasing called SLI AA 32xs. There are quite a few 20”+ displays on the market that only support 1600x1200 resolution, which is good by itself, but may not be enough for enthusiasts, who constantly want more. Therefore, bringing a more advanced antialiasing pattern was certainly a logical option for Nvidia Corp.

As explained above, SLI AA 32xs is a “mix” of four frames rendered with the 8xs (4x multi-sampling + 2x supersampling) FSAA algorithm. Nvidia’s 8xs is known for being very performance hungry; additionally, in the case of 4 GPUs there is an extra performance hit caused by the need to blend the frames into one.

In fact, 32xs antialiasing along with a high resolution provides virtually the most astonishing image quality that we have ever seen. For instance, in Half-Life 2 the scenes truly look like photographs, whereas the woods of The Elder Scrolls: Oblivion impress with their detail and accuracy. Unfortunately, you really have to be careful about enabling SLI AA 32xs on the GeForce 7900 quad SLI right now, as in quite a lot of titles it cannot guarantee a smooth frame rate. It thus becomes a sleeping beauty that will be awakened by the next generations of GeForce graphics chips.


Meet the Testbeds and Methods

Unfortunately, we did not have enough time with the hardware at hand to test all the settings and all the games that we typically use. Therefore, we consider this article a preview of the technology, with a full review coming out at a later date.


In order to test the GeForce 7900 quad SLI configuration along with its competitors – GeForce 7900 GTX SLI and Radeon X1900 XT CrossFire – we took two systems from an authorized computer builder based on the Radeon X1900 XT CrossFire and the GeForce 7900 quad SLI and configured just like our typical testbeds. The systems were built using high-quality CoolerMaster CMStacker cases which are big enough to install the long boards and which also provide very advanced cooling: up to four 12cm fans may be installed into the side window and one more sits in the backside.


The GeForce 7900 quad SLI-based system was built by a system integrator, just like Nvidia recommends, and used the drivers that the actual consumers get when they acquire a machine featuring four GeForce 7900-based graphics cards.


We used our own hard drives for the appropriate platforms, with all the drivers and games pre-installed, in order to save time on installation procedures, but we had to use the new Nvidia ForceWare driver that ships with quad SLI systems.

In the end, we had the following testing platform:

* AMD Athlon 64 FX-60 CPU (2.60GHz, 2x1MB L2 cache)
* ASUS A8N32-SLI Deluxe mainboard (Nvidia nForce4 SLI X16 chipset) for Nvidia GeForce cards
* ASUS A8R32-MVP Deluxe mainboard (ATI CrossFire Xpress 3200 chipset) for ATI Radeon cards
* OCZ PC3200 Platinum EL DDR SDRAM (2 x 1GB, CL2-3-2-5)
* Maxtor MaXLine III 7B250S0 hard disk drive (Serial ATA-150, 16MB buffer)
* Creative SoundBlaster Audigy 2 sound card
* Enermax 660W power supply unit
* Apple Cinema HD 30” display (30”, 2560x1600@75Hz max display mode)
* Microsoft Windows XP Pro SP2 with DirectX 9.0c
* ATI Catalyst 6.3/6.4
* Nvidia ForceWare 87.24

We set up the ATI and Nvidia drivers in the same way as always:

ATI Catalyst:

* Catalyst A.I.: Standard
* Mipmap Detail Level: Quality
* Wait for vertical refresh: Always off
* Adaptive antialiasing: Off
* Temporal antialiasing: Off
* Quality AF: Off
* Other settings: default

Nvidia ForceWare:

* Image Settings: Quality
* Vertical sync: Off
* Trilinear optimization: On
* Anisotropic mip filter optimization: Off
* Anisotropic sample optimization: On
* Gamma correct antialiasing: On
* Transparency antialiasing: Off
* Other settings: default

We selected the highest graphics quality settings in each game, identical for graphics cards from ATI and Nvidia. We did not edit the configuration files of the games, but sometimes we used console commands to get higher resolutions than the game offers. To measure performance we either used the integrated tools of the games we tested or, if there were none available, resorted to the Fraps utility. Where possible, we measured minimum performance as well.
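For readers who want to process their own measurements, here is a minimal sketch of how average and minimum frame rates can be derived from a list of per-frame render times. It does not reproduce how Fraps or any in-game benchmark computes its figures: the function is hypothetical, and "minimum" here simply means the rate implied by the single slowest frame.

```python
def fps_stats(frame_times_ms):
    """Return (average fps, worst-frame fps) for per-frame times in ms."""
    if not frame_times_ms:
        raise ValueError("no frames recorded")
    # Average: frames rendered divided by total elapsed time.
    average = 1000 * len(frame_times_ms) / sum(frame_times_ms)
    # Minimum: instantaneous rate of the slowest single frame.
    minimum = 1000 / max(frame_times_ms)
    return average, minimum


# Made-up sample: three 10 ms frames and one 20 ms hitch.
print(fps_stats([10, 10, 20, 10]))  # (80.0, 50.0)
```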

To load the video subsystem to the full extent and to minimize the influence of the CPU speed on the performance results we didn’t test the systems in the “pure speed” mode. We only ran the tests in “eye candy” mode with full-screen anti-aliasing and anisotropic filtering as well as in “extreme eye candy mode” with SLI AA or Super AA.

We turned on full-screen antialiasing and anisotropic filtering from the game’s own menu if possible. Otherwise we forced the necessary mode from the ATI Catalyst and Nvidia ForceWare graphics card driver. We didn’t test anything in overclocked mode, because of the lack of time. We didn’t test certain settings due to time constraints as well.

Bang Bang: Here Come Problems

Before we proceed with the benchmark scores, we would like to stress that Nvidia GeForce 7900 quad SLI technology does not seem to be mature enough so far. We have experienced a lot of significant and insignificant issues with nearly all the games we have tested with it, including 3DMark benchmarks, except just a few.

The Chronicles of Riddick performed well and did not even crash; however, SLI AA worked incorrectly: images doubled, tripled, quadrupled and so on.


A really unpleasant experience, which makes SLI AA useless in this game for now. It is impossible to catch such issues in a screenshot (as screenshot basically copies the frame-buffer of the card that outputs the image), but we can show you a photograph.


In Doom III, Quake 4 and Serious Sam 2 we noticed exactly the same effects as in The Chronicles of Riddick: once SLI AA and relatively high resolutions are enabled, we see the aforementioned AA artifacts, which were probably caused by the frame compositing mechanism not working as it is supposed to.

In Half-Life 2 we noticed some artifacts on certain reflective surfaces after changing the resolution during the game. The issue is definitely not significant, but is not a pleasant one either. Reportedly, this may be a peculiarity of Nvidia GeForce-based graphics cards, which means that it happens even on single-GPU systems.


The most notable rendering-related problem could be observed in the Pacific Fighters game, which is renowned for vertex texturing and some other Nvidia-only techniques. Apparently, neither water, nor earth, nor skies could be displayed.

Far Cry crashed pretty often in both SLI AA and high-resolution modes, but we have to admit that the game itself does not support ultra-high resolutions out of the box, and we could not benchmark the Radeon X1900 XT CrossFire in 2560x1600 either.

Call of Duty 2 usually froze after a resolution change, which made it nearly impossible to benchmark. Battlefield 2 did not start at all.

Serious Sam 2 could not be benchmarked with 32xs AA in 2560x1600. While such a setting is useless because it is too slow, the issue still does not improve the overall impression of the quad SLI platform.

3DMark05 crashed with SLI AA enabled in higher resolutions, while The Elder Scrolls IV: Oblivion froze when we attempted to take screenshots in 2560x1600 with 32xs SLI AA enabled.

During the testing we also experienced numerous crashes and freezes in different situations, primarily in high resolutions or with high levels of SLI AA enabled, which is why we could not complete quite a number of test runs. After numerous crashes we decided to switch to a 1000W power supply unit bought in North America, as we believed that the 660W power supply shipped with the system (there are no higher-wattage power supplies in Europe) could not supply enough power for the machine. Even after the 1kW PSU was installed, the crashes and freezes remained.

The continuous crashes, freezes and problems of the currently shipping Nvidia GeForce quad SLI systems may be associated with early revisions of the hardware, but as the hardware is shipping to customers, any issues with stability are totally unacceptable.

Performance in First-Person 3D Shooters
- Call of Duty 2 -

The first game in our test session, Battlefield 2, refused to launch on the new platform for some reason, so we loaded Call of Duty 2, only to find that the game freezes after the resolution is changed in the menus. As a result, we could only test the GeForce 7900 quad SLI platform in 1600x1200 and 1920x1200. In the former, the four-GPU complex from Nvidia suffers a defeat, but in the latter it is much faster than the Radeon X1900 XT CrossFire.

What we see is that in 1280x1024, where we did not manage to test the GeForce 7900 quad SLI, the Radeon X1900 XT CrossFire leaves the dual GeForce 7900 GTX configuration behind. The Radeon X1900 XT CrossFire tandem is considerably faster than the GeForce 7900 quad SLI in 1600x1200, although its lead over the GeForce 7900 GTX SLI is not large. In 1920x1200 the quad SLI claims victory, but its result is hardly a confident one.


- Chronicles of Riddick -

In 1600x1200 the Quad SLI system outperforms both the GeForce 7900 GTX SLI and the Radeon X1900 XT CrossFire, but it does not demonstrate any breakthrough performance: all multi-GPU platforms deliver over 100fps in this test, so any of them ensures a comfortable gaming experience with FSAA 4x and AF 16x enabled.

The same is true for 1920x1200. In 2560x1600, despite the OpenGL renderer and stencil shadows, the Quad SLI and the Radeon X1900 XT CrossFire perform almost equally fast. The Nvidia platform is less than 10% faster; however, neither of them provides a performance level high enough for comfortable gameplay. The reason for the low performance of the Quad SLI system is evident: slow memory.

The benefits of four graphics processors combined in a single configuration are more evident in FSAA 8x mode: the Quad SLI is almost 1.5 times faster than the GeForce 7900 GTX SLI already in 1600x1200 and provides quite acceptable performance in 1920x1200. However, the next resolution, 2560x1600, still remains unattainable, even though the performance drops less than it did with FSAA 4x.
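
As a rough illustration of what "almost 1.5 times faster" means for multi-GPU scaling, the sketch below divides the observed speedup by the increase in available GPU resources. The function name is hypothetical, and the 650MHz core clock of the GeForce 7900 GTX is an assumption taken from its public specification rather than from this review; the 500MHz clock of the quad SLI GPUs is the figure quoted in the 3DMark05 section.

```python
def scaling_efficiency(speedup, resource_ratio):
    """Fraction of the ideal speedup that was actually achieved."""
    return speedup / resource_ratio

# From the text: quad SLI is almost 1.5x faster than GTX SLI at FSAA 8x, 1600x1200.
observed_speedup = 1.5

# Four GPUs against two.
gpu_count_ratio = 4 / 2

# Assumption: 7900 GTX core at 650 MHz (public spec); quad SLI GPUs at 500 MHz.
clock_adjusted_ratio = (4 * 500) / (2 * 650)

by_gpu_count = scaling_efficiency(observed_speedup, gpu_count_ratio)
by_gpu_clock = scaling_efficiency(observed_speedup, clock_adjusted_ratio)

print(f"Efficiency relative to GPU count: {by_gpu_count:.0%}")
print(f"Efficiency relative to GPU count x clock: {by_gpu_clock:.1%}")
```

Under these assumptions the quad SLI realizes about three quarters of the ideal doubling by GPU count, and comes close to the clock-adjusted ideal in this one mode. The same helper can be applied to any other pair of results from the charts.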

The next FSAA mode, FSAA 16x, is too heavy for the Quad SLI, at least from 1600x1200 upwards. Slow memory prevents the new platform from showing its potential and causes its complete failure in 1920x1200 and 2560x1600. Radeon X1900 XT CrossFire outperforms the Quad SLI system by 19% and 28% respectively.

As for the unique FSAA 32x mode, which is supported only by the GeForce 7900 quad SLI platform, we did not see the system work any wonders here. Even in 1280x1024 the average performance reached only 38fps, and in higher resolutions it dropped below 25fps. So SLI AA 32x is of purely theoretical interest in The Chronicles of Riddick: you will hardly spend all that money on the GeForce 7900 quad SLI platform to play in 1280x1024 with artifacts…


- Doom III -

GeForce 7900 quad SLI manages to reach the performance level of GeForce 7900 GTX SLI in 1600x1200. In all lower resolutions the quad-processor Nvidia platform lags behind the dual-processor one.

Quad SLI starts showing its advantages in FSAA 8x + AF 16x mode. Here two GeForce 7900 GX2 cards working together outpace a pair of GeForce 7900 GTX cards also working in SLI mode. In 1600x1200 the performance gain reaches about 15%.

This performance advantage grows to 70% when we use a more advanced FSAA mode; however, the overall performance level drops below what is acceptable for first-person 3D shooters. Radeon X1900 XT CrossFire falls only 15% behind the GeForce 7900 quad SLI in 1600x1200.

Unfortunately, we could not obtain acceptable results in SLI AA 32x mode, just as in The Chronicles of Riddick discussed above. Even the fact that the Doom III engine favors the Nvidia architecture did not help here: the average performance of today’s hero was only 37fps in 1280x1024.

- Far Cry -

Testing the GeForce 7900 quad SLI platform in Far Cry was akin to walking through a minefield. The platform did not like our demo recorded on the Pier level: it produced artifacts or crashed without HDR enabled. Surprisingly, it worked fine with HDR activated in 1280x1024 and 1920x1200 (but not in 1600x1200), which once again proves that the platform is pretty capricious at the moment.

The GeForce 7900 quad SLI manages to leave the Radeon X1900 XT CrossFire behind in 1920x1200 thanks to its raw computing power; however, this win comes amid a great deal of instability.

- F.E.A.R. -

We see pretty good results in F.E.A.R.: GeForce 7900 quad SLI manages to outperform GeForce 7900 GTX SLI already in FSAA 4x mode, where it is about 25% faster. Radeon X1900 XT CrossFire yields to the quad SLI system, but also demonstrates pretty decent results. In some complex scenes, however, the performance still drops below 25fps.

When we switch from FSAA 4x to FSAA 8x, GeForce 7900 quad SLI remains the only graphics system that can provide acceptable performance in 1600x1200. We cannot say the same about Radeon X1900 XT CrossFire, although it gets noticeably ahead of GeForce 7900 GTX SLI in this mode.

With FSAA 16x activated, GeForce 7900 Quad SLI cannot outperform Radeon X1900 XT CrossFire despite its four graphics processors working together.

Quad SLI demonstrated 40fps in 1280x1024, which is a pretty good result for SLI AA 32x mode. However, in 1600x1200 the quad-GPU system is only half as fast. In other words, we once again do not see any real advantage of the top anti-aliasing mode.


- Half-Life 2 -

Quad SLI system cannot really boast much here: it yields to GeForce 7900 GTX SLI as well as to Radeon X1900 XT CrossFire.

In FSAA 8x mode it finally manages to outperform its dual-GPU counterpart, GeForce 7900 GTX SLI. However, Radeon X1900 XT CrossFire remains undefeated. The leader is not too far ahead (the gap is only 6%-8% in 1600x1200), but the fact is undeniable: under equal testing conditions the GeForce 7900 quad SLI platform is slower than the dual-GPU CrossFire.

Radeon X1900 XT CrossFire supports Super AA 14x and hence works under slightly lighter conditions; even so, it has a tremendous performance advantage over the GeForce 7900 quad SLI. At the same time, GeForce 7900 quad SLI gets quite far ahead of the GeForce 7900 GTX SLI: in 1600x1200 the advantage is close to 70%.

Since Half-Life 2 is not a resource-demanding game, we managed to get our Quad SLI system to perform at an acceptable level in SLI AA 32x, although only in 1280x1024. So far this is the only game where this mode is actually playable.


- Quake 4 -

In 1280x1024 the CPU limits the system performance. In 1600x1200, however, GeForce 7900 quad SLI does not show any remarkable results, yielding quite significantly to GeForce 7900 GTX SLI and a little less to Radeon X1900 XT CrossFire, both of which sport only two graphics processors! Still, all systems here perform at over 110fps, which is more than enough for any fan of first-person 3D shooters.

The new Nvidia solution manages to finally leave behind Radeon X1900 XT CrossFire and catch up with GeForce 7900 GTX SLI in FSAA 8x + AF 16x mode. In this case all multi-GPU systems again cope successfully with the workload and reach 100+ fps in 1600x1200.

GeForce 7900 quad SLI manages to perform better than the dual-GPU SLI system in SLI AA 16x in 1600x1200. It is about 50% ahead, and 89fps is a very good result keeping in mind how many resources the SLI AA 16x mode eats up. However, we clearly see that in Quake 4, whose engine is very favorable to GeForce 6/7 hardware, the Radeon X1900 XT CrossFire system is 10% faster than the GeForce 7900 quad SLI!

Although Quake 4 is a relatively contemporary game, it does not demand much from the graphics subsystem. As a result, GeForce 7900 Quad SLI can show acceptable results even in the extreme anti-aliasing mode, SLI AA 32x, but only in 1280x1024. In 1600x1200 the average performance of the Quad SLI system drops below 50fps. That is definitely not enough for comfortable gameplay, because in scenes with complex graphics and plenty of enemies the performance can drop dramatically. We should also remember that there are slight SLI AA-related artifacts in Quake 4.


- Serious Sam 2 -

Serious Sam 2 is a pretty demanding game; nevertheless, all multi-GPU systems perform quite fast in standard resolutions in FSAA 4x mode. In 1920x1200 only GeForce 7900 quad SLI shows acceptable results, and in 2560x1600 the results are of interest only from a theoretical perspective. In particular, they show that the Quad SLI system suffers from insufficient memory bandwidth, because it is just a tiny bit ahead of Radeon X1900 XT CrossFire: no more than 10%-12%. The latter has to work in not very favorable conditions, because the game is rich in pixel shaders that send a lot of texturing requests. The average number of samples per shader is 4, but sometimes it reaches 7-8.

In 1600x1200 with FSAA 8x enabled only GeForce 7900 quad SLI manages to retain a nearly comfortable performance level, while the dual-GPU solutions fall about 36% behind. In higher resolutions even the quad SLI cannot provide an acceptable performance level of about 55-60fps. So, even if your monitor supports 1920x1200, you will still have to play in a lower resolution or switch to FSAA 4x.

This test mode is a stumbling block for all multi-GPU systems; however, GeForce 7900 quad SLI and Radeon X1900 XT CrossFire perform almost equally fast in 1600x1200 and 1920x1200. This shows once again that the quad SLI system suffers from insufficient memory bandwidth.

Of course, you shouldn’t even dream of playing in SLI AA 32x mode: the average performance will reach 30fps at best. Besides, Quad SLI platform simply refused to work in any resolution higher than 1600x1200: the system would inevitably crash.


- The Elder Scrolls IV: Oblivion -

We didn’t test this game with FSAA enabled, because HDR support would be disabled in this case.

In the tunnels of the Imperial City, GeForce 7900 quad SLI outperforms Radeon X1900 XT CrossFire already in 1280x1024, and as the resolution grows the gap widens, reaching its maximum of 50% in the highest supported resolution of 2560x1600. Even there you can still enjoy the gameplay, as the GeForce 7900 quad SLI never drops below 30fps.

The open spaces of Cyrodiil push the graphics subsystem to its limits; however, GeForce 7900 quad SLI manages to provide quite an acceptable level of performance. Note how performance changes as the screen resolution grows: the quad SLI system slows down less than the Radeon X1900 XT CrossFire system. As we move from 1280x1024 to 2560x1600, GeForce 7900 quad SLI performance gets only 20%-22% lower, while the performance of the Radeon X1900 XT CrossFire platform drops by a good 60%.
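
To put those percentages in context, it helps to see how much the per-frame pixel workload grows across the resolutions used in this review. The following is a minimal sketch; the function names are hypothetical helpers, and the frame rates passed to `fps_drop_percent` are placeholders to be replaced with numbers read off the charts.

```python
# Resolutions used in this review, as (width, height).
RESOLUTIONS = [(1280, 1024), (1600, 1200), (1920, 1200), (2560, 1600)]

def pixel_count(width, height):
    """Number of pixels rendered per frame at a given resolution."""
    return width * height

def relative_load(resolution, baseline=(1280, 1024)):
    """Pixel workload of a resolution relative to the baseline resolution."""
    return pixel_count(*resolution) / pixel_count(*baseline)

def fps_drop_percent(fps_low_res, fps_high_res):
    """Percentage by which the frame rate falls when the resolution is raised."""
    return (fps_low_res - fps_high_res) / fps_low_res * 100.0

for width, height in RESOLUTIONS:
    load = relative_load((width, height))
    print(f"{width}x{height}: {pixel_count(width, height):,} pixels, {load:.2f}x baseline")
```

2560x1600 carries 3.125 times as many pixels as 1280x1024, so a drop of only 20%-22% suggests the quad SLI system is far from fill-rate-bound at the lower resolution in this scene.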


- Project Snowblind -

For some reason, Project Snowblind offered only 1600x1200 resolution in addition to 1024x768, 800x600 and so on.

Project Snowblind cannot boast impressive effects and is also known for poor scaling with multi-GPU technologies, which is why it does not make much sense to comment on the numbers. However, it is worth pointing out that even in this game SLI AA 32xs cannot offer enough speed.

Performance in Third-Person 3D Shooters

Splinter Cell: Chaos Theory

The performance in Splinter Cell: Chaos Theory depends a lot on the pixel shader performance. These shaders contain complex mathematical calculations and create various visual effects in the game.

GeForce 7900 quad SLI is not very effective here, to say the least. This platform loses even to GeForce 7900 GTX SLI, not to mention Radeon X1900 XT CrossFire, although they both have the same total number of pixel processors.

In 2560x1600 the ATI Technologies solution wins with an almost twofold advantage: it guarantees excellent performance, which never drops below 45fps. GeForce 7900 quad SLI can achieve something like that only in the lower 16:10 resolution of 1920x1200.


- Performance in Simulators -
X3: Reunion

Both the GeForce 7900 quad SLI and the Radeon X1900 XT CrossFire support all resolutions, including 2560x1600. The ATI platform, however, is substantially faster in all resolutions despite pitting two GPUs against Nvidia’s four.

Just like the previous game, the X3: Reunion simulator is rich in shader effects, and just as before the ATI Technologies solution takes the lead here. We believe this could be an indication that Nvidia’s platform cannot distribute the workload between its four GPUs efficiently enough, while for the dual-GPU ATI platform this is not a problem at all.

This situation is similar to what we have just seen in FSAA 4x + AF 16x mode: the Radeon X1900 XT CrossFire manages to leave the quad-GPU rival behind by a pretty large margin, even though it uses 14x FSAA, not 16x FSAA.

If you enable SLI AA 32x, you will not be able to play this game even in 1280x1024, not to mention any higher resolutions. Once again the support of this ultra-resource-hungry anti-aliasing mode doesn’t do us any real good.


Synthetic Benchmarks
Futuremark 3DMark05 build 120

You can’t see any advantages of a GeForce 7900 quad SLI system if you test it in 3DMark05 at the default settings because the benchmark defaults to 1024x768 resolution without full-screen antialiasing. This explains the defeat of the 4-processor complex from Nvidia.

-Game 1-

Although we turned on full-screen antialiasing for separate tests, we can’t see that the GeForce 7900 quad SLI has any advantage over the ordinary dual-GPU systems at 4x FSAA. On the contrary, the new solution from Nvidia is slower!

-Game 2-

A considerable performance gain can only be observed when we use the next level of FSAA: the GeForce 7900 quad SLI is about 38-40% ahead of the Radeon X1900 XT CrossFire and GeForce 7900 GTX SLI here.

The quad SLI platform is far behind its dual-chip opponents in the second test, probably due to the specifics of the scene which doesn’t require a high fill rate. Instead, such parameters as pixel shader performance and efficient rendering of shadows and lighting are important here. Considering that the graphics processors of the quad SLI system work at only 500MHz, its defeat here seems logical.

It’s quite different in the 8x FSAA mode: the GeForce 7900 Quad SLI is again much faster than its opponents at this level of antialiasing.

-Game 3-

The third test requires a high texturing speed as well as fast processing of complex shaders. But even with four graphics processors working together, the GeForce 7900 quad SLI is again slower than the dual-chip GeForce 7900 GTX SLI and the Radeon X1900 XT CrossFire.

Just like in the two previous tests, the GeForce 7900 quad SLI is ahead of the GeForce 7900 GTX SLI and of the Radeon X1900 XT CrossFire when we switch from 4x FSAA to 8x FSAA (8x SLI AA). The advantage is over 60% and the low memory bandwidth (at a frequency of only 600 (1200) MHz) doesn’t ruin the performance of Nvidia’s four-GPU subsystem in this mode.
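
The memory frequency quoted above translates into peak bandwidth as effective clock times bus width. The sketch below works this out per GPU; note that the 256-bit bus width is an assumption based on the card's public specification and is not stated in this section, and the function name is a hypothetical helper.

```python
def memory_bandwidth_gb_s(effective_mhz, bus_width_bits):
    """Peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return effective_mhz * 1e6 * bytes_per_transfer / 1e9

# From the text: memory runs at 600 MHz, i.e. 1200 MHz effective (DDR).
QUAD_SLI_EFFECTIVE_MHZ = 1200

# Assumption (not from this section): 256-bit memory bus per GPU.
ASSUMED_BUS_WIDTH_BITS = 256

per_gpu = memory_bandwidth_gb_s(QUAD_SLI_EFFECTIVE_MHZ, ASSUMED_BUS_WIDTH_BITS)
print(f"Peak memory bandwidth per GPU: {per_gpu:.1f} GB/s")
```

Each GPU in a multi-GPU setup works out of its own memory, so the per-GPU figure is the relevant one when comparing against cards with faster memory; substitute their clocks and bus widths to compare.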


Futuremark 3DMark06 build 120

Just as in 3DMark05, the GeForce 7900 quad SLI has a considerably lower overall score than its opponents.

The GeForce 7900 quad SLI is about as fast as the Radeon X1900 XT CrossFire in the SM2.0 tests, but these tests aren’t well suited for the latter: they require a high fill rate and do not use complex pixel shaders. The failure of the quad SLI platform here means that Nvidia’s got a lot of work to do yet: its four-GPU system is slower than the architecturally similar dual-GPU system in tests that are favorable for this architecture.

Doing somewhat better in the SM3.0/HDR tests, the GeForce 7900 quad SLI is at least no worse than the Radeon X1900 XT CrossFire. That’s a good result considering that these graphics subsystems have the same number of pixel processors but Nvidia’s new solution has a lower GPU clock rate.

Shader Model 2.0 Game 1

The first SM2.0 graphics test is sensitive to the texturing speed, but the GeForce 7900 quad SLI isn’t faster than the Radeon X1900 XT CrossFire although the latter has fewer TMUs (32 against 96). The overall efficiency of the four-GPU systems seems to be low.
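
The TMU counts above can be turned into a theoretical peak texel fill rate (TMUs times core clock), which shows how large the on-paper advantage is that the quad SLI fails to convert. This is only a sketch: the 500MHz clock is the one quoted earlier in this review, while the 625MHz Radeon X1900 XT core clock is an assumption from its public specification, and the function name is hypothetical.

```python
def texel_fill_rate(tmus, core_mhz):
    """Theoretical peak texel fill rate in gigatexels per second."""
    return tmus * core_mhz / 1000.0

# From the text: 96 TMUs in total, GPUs clocked at 500 MHz.
quad_sli = texel_fill_rate(tmus=96, core_mhz=500)

# From the text: 32 TMUs in total. Assumption: 625 MHz core clock.
crossfire = texel_fill_rate(tmus=32, core_mhz=625)

print(f"Quad SLI:  {quad_sli:.1f} GTexels/s")
print(f"CrossFire: {crossfire:.1f} GTexels/s")
print(f"On-paper advantage: {quad_sli / crossfire:.1f}x")
```

With these inputs the quad SLI has more than twice the theoretical texturing throughput, which is why a tie or a loss in a texturing-sensitive test points to poor multi-GPU efficiency rather than to a lack of raw resources.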

It’s almost the same in the second SM2.0 test: the GeForce 7900 quad SLI is slower than the Radeon X1900 XT CrossFire in all the resolutions, including 2560x1600. The speed of this test depends but little on the fill rate parameter, but this doesn’t matter much when it comes to the standings of the tested graphics subsystems.


Shader Model 3.0 Game 1

Both a high scene fill rate and fast execution of numerous pixel shaders are important for the first SM3.0/HDR test, and it’s here that the advantage of the Radeon X1900 XT CrossFire is well outlined. The gap diminishes in higher resolutions, from about 30% to 20%. So, the four chips of the GeForce 7900 quad SLI platform obviously have a lower efficiency than the two chips of the Radeon X1900 XT CrossFire in this test.

The second SM3.0/HDR test isn’t as difficult as the first one, focusing on realistic dynamic shadows and HDR lighting. Anyway, the Radeon X1900 XT CrossFire outperforms the GeForce 7900 GTX in all the resolutions here.



Conclusion

After looking at the benchmark numbers and the experience we had with the GeForce 7900 quad SLI, let us try to summarize everything and draw a conclusion about the technology.

Nvidia managed to claim the lead in consumer-oriented multi-GPU technologies by introducing quad SLI, a technique that allows four graphics processors to work together. While formally the company claims the lead in the “constructors’ championship”, the current implementation of quad SLI leaves much to be desired.

The quad SLI is here, and it is now shipping from a few companies. The reason why several big names, such as Alienware, are not yet shipping quad SLI systems commercially despite the formal launch is that the technology does not seem to be ready for commercial systems: users who buy these graphics cards to play games at extreme quality will almost surely run into significant trouble with freezes, crashes, quality issues and so on. It turns out that Nvidia’s current quad SLI is not a product for luxury buyers, as they desire stability and performance, not compatibility issues.

The quad SLI technology indisputably has potential: already it demonstrates the highest scores in such games as F.E.A.R. and Far Cry with HDR enabled, posts the best high-resolution numbers in The Elder Scrolls IV: Oblivion, wins the Chronicles of Riddick tests, produces amazing quality with 32xs SLI AA enabled, and does a plethora of great things. At the same time, it crashes in 3DMark05, Far Cry and numerous other games and produces artifacts when SLI AA is activated in The Chronicles of Riddick and Serious Sam 2, all of which degrades the value of this technology for the user right here and right now.

Lamborghini cars may be known for spending quite some time in service centers, but they are faster than Porsches. The Nvidia GeForce 7900 quad SLI, meanwhile, is not faster than the ATI Radeon X1900 XT CrossFire (which is known for high performance in high resolutions and with FSAA) across the board and may even lose to a dual GeForce 7900 GTX setup. Furthermore, at present quad SLI has many issues with stability and compatibility.

Basically, the lack of proper driver tweaking or game optimizations, or the low clock speeds of the core and memory, do not allow the quad SLI to show its real strength in extreme resolutions and with extreme antialiasing: even where its performance numbers in 2560x1600 are higher than those of its rivals, they are in many cases not sufficient for actual gameplay, which makes such a victory useless. At the same time, few games actually support resolutions higher than 1600x1200, which means that even though the boards are capable of handling such resolutions, in many cases they will not get the opportunity. 32xs antialiasing is certainly an interesting option, but in the absolute majority of cases enabling it at a high resolution means that the quartet of GeForce 7900 processors will not deliver a sufficient frame rate.

When ATI CrossFire technology was commercially released in September 2005, we criticized it for its peculiarities (a maximum resolution of 1600x1200@60Hz), poor scaling and some other issues. In the end, CrossFire has matured significantly, shed its disadvantages and made its way through the thorns to the stars. Can Nvidia’s quad SLI do the same?


Credits to X-bit labs.
