INTRODUCTION
Every two years, NVIDIA engineers set out to design a new GPU architecture. The architecture defines a GPU's building blocks, how these blocks are connected, and how they work. The architecture blueprint is the basis for a family of chips that serves a whole spectrum of systems, from consumer computers, or PCs, to workstations and supercomputers.
Two years
ago NVIDIA lunched Fermi which is named after the Italian nuclear physicist
Enrico Fermi. This new architecture brought two new key advancements. Firstly,
it brought full geometry processing to the GPU, enabling a key DirectX 11
technique called tessellation with displacement mapping. This technique was
used in Battlefield 3 and Crysis 2 and improved the geometric realism of water,
terrain and game characteristics.
Last month NVIDIA launched Kepler, the much anticipated successor to the Fermi architecture. Kepler is currently the fastest GPU on the market, and it is also power efficient. Feature-wise, NVIDIA has added new technologies that fundamentally improve the smoothness of each frame and the richness of the overall experience.
When NVIDIA launched Fermi with the GeForce GTX 480, gamers got top performance for the time, but the card suffered from fan noise and high power consumption. With Kepler, the NVIDIA team focused on building a graphics card with quiet fans and low power consumption. To achieve these two key goals, the team needed to redesign the streaming multiprocessor, which is the most important building block of the GPU. With this redesigned streaming multiprocessor, Kepler achieves optimal performance per watt. The second key feature is GPU Boost, which makes the graphics card's performance dynamic. This means that when you play games or work with pictures and video, GPU Boost automatically increases the clock speed to improve performance within the card's power budget.
Kepler's new SM, called SMX, is a radical departure from past designs. SMX eliminates Fermi's 2x processor clock and uses the same base clock across the GPU. To balance out this change, SMX uses an ultra-wide design with 192 CUDA cores. With this new SMX, Kepler can do twice as much work per watt as a standard SM. Put another way, given a watt of power, Kepler's SMX can do twice the amount of work as Fermi's SM. The benefit of this power improvement is most obvious when you plug a GeForce GTX 680 into your system. If you've owned a high-end graphics card before, you'll know that it requires an 8-pin and a 6-pin PCI-E power connector. With the GTX 680, you need just two 6-pin connectors. This is because the card draws at most 195 watts of power, compared to 244 watts on the GeForce GTX 580. If there ever was a middleweight boxer that fought like a heavyweight, it would be the GeForce GTX 680.
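To see how those connector requirements line up with the quoted power draws, here is a rough back-of-the-envelope check in Python. It assumes the standard PCI Express delivery limits of 75 watts from the slot, 75 watts per 6-pin connector, and 150 watts per 8-pin connector; the 244 W and 195 W figures come from the text above, and the margins are only an illustration, not board specifications.

    # Rough power-budget check for the connector configurations mentioned above.
    # Assumed PCI Express limits: 75 W from the x16 slot, 75 W per 6-pin, 150 W per 8-pin.
    SLOT_W, SIX_PIN_W, EIGHT_PIN_W = 75, 75, 150

    def available_power(six_pins=0, eight_pins=0):
        """Total board power deliverable by the slot plus the listed connectors."""
        return SLOT_W + six_pins * SIX_PIN_W + eight_pins * EIGHT_PIN_W

    cards = {
        "GeForce GTX 580 (244 W)": (244, available_power(six_pins=1, eight_pins=1)),  # 300 W available
        "GeForce GTX 680 (195 W)": (195, available_power(six_pins=2)),                # 225 W available
    }

    for name, (draw, budget) in cards.items():
        print(f"{name}: {budget} W deliverable, {budget - draw} W of margin")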
GPU Boost
SMX doubled the performance per watt, but what if the GPU wasn't actually using its full power capacity? Think of the GPU as a light bulb: what if a 100 watt bulb was sometimes running at 90 watts, or even 80 watts? As it turns out, that's exactly how GPUs behave today.
The reason for this is actually pretty simple. Like light bulbs, GPUs are designed to operate under a certain wattage. This number is called the thermal design point, or TDP. For a high-end GPU, the TDP has typically been about 250 watts. You can interpret this number as saying: this GPU's cooler can remove 250 watts of heat from the GPU. If the GPU goes over this limit for an extended period of time, it will be forced to throttle down its clock speed to prevent overheating. What this also means is that, to get maximum performance, the GPU should operate close to its TDP without ever exceeding it. In reality, GPUs rarely reach their TDP even when playing the most intensive 3D games. This is because different games consume different amounts of power, and the GPU's TDP is measured using the worst case. Popular games like Battlefield 3 or Crysis 2 consume far less power than a GPU's TDP rating; only a few synthetic benchmarks can push GPUs to their TDP limit. For example, say your GPU has a TDP of 200 watts. This means that in the worst case, your GPU will consume 200 watts of power. If you happen to be playing Battlefield 3, it may draw as little as 150 watts. In theory, your GPU could safely operate at a higher clock speed to tap into this available headroom. But since it doesn't know the power requirements of the application ahead of time, it sticks to the most conservative clock speed. Only when you quit the game does it drop to a lower clock speed for the desktop environment.
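To put a number on that wasted headroom, here is a tiny Python sketch using the example figures above (a 200 watt TDP and roughly 150 watts drawn in Battlefield 3). The values are illustrative, not measurements.

    def unused_headroom(tdp_watts, measured_draw_watts):
        """Power left on the table when the GPU sticks to a fixed, conservative clock."""
        headroom = tdp_watts - measured_draw_watts
        return headroom, 100.0 * headroom / tdp_watts

    # Example from the text: 200 W TDP, ~150 W actually drawn in a game.
    watts, percent = unused_headroom(200, 150)
    print(f"{watts} W of headroom ({percent:.0f}% of the power budget) goes unused")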
GPU Boost changes all this. Instead of running the GPU at a clock speed based on the most power-hungry app, GPU Boost automatically adjusts the clock speed based on the power consumed by the currently running app. To take our Battlefield 3 example, instead of running at 150 watts and leaving performance on the table, GPU Boost will dynamically increase the clock speed to take advantage of the extra power headroom. Different games use different amounts of power; GPU Boost monitors power consumption in real time and increases the clock speed when there's available headroom.
How GPU Boost Works
The most important thing to understand about GPU Boost is that it works through real-time hardware monitoring as opposed to application-based profiles. As an algorithm, it attempts to find the appropriate GPU frequency and voltage for a given moment in time. It does this by reading a wide swath of data such as GPU temperature, hardware utilization, and power consumption. Depending on these conditions, it raises the clock and voltage accordingly to extract maximum performance within the available power envelope. Because all of this is done via real-time hardware monitoring, GPU Boost requires no application profiles. As new games are released, even if you don't update your drivers, GPU Boost "just works."
The GPU Boost algorithm takes in a variety of operating parameters and outputs the optimal GPU clock and voltage. Currently it does not alter memory frequency or voltage, though it has the option to do so.
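As a way to picture that feedback loop, the Python sketch below models a simplified boost controller that reads temperature, utilization, and power draw each tick and nudges the clock up or down to stay inside the limits. It is purely illustrative: the thresholds, step size, and telemetry values are assumptions, not NVIDIA's actual algorithm or published parameters.

    # Illustrative model of a boost-style feedback loop (not NVIDIA's actual algorithm).
    from dataclasses import dataclass

    @dataclass
    class Telemetry:
        power_watts: float      # board power draw
        temperature_c: float    # GPU temperature
        utilization: float      # 0.0 .. 1.0

    BASE_CLOCK_MHZ = 1006       # minimum 3D clock, from the spec discussed below
    MAX_CLOCK_MHZ = 1110        # hypothetical ceiling
    CLOCK_STEP_MHZ = 13         # hypothetical step size
    POWER_TARGET_W = 195
    TEMP_LIMIT_C = 95

    def next_clock(current_mhz: float, t: Telemetry) -> float:
        """Raise the clock when there is power/thermal headroom, lower it otherwise."""
        over_budget = t.power_watts > POWER_TARGET_W or t.temperature_c > TEMP_LIMIT_C
        busy = t.utilization > 0.9
        if over_budget:
            return max(BASE_CLOCK_MHZ, current_mhz - CLOCK_STEP_MHZ)
        if busy:
            return min(MAX_CLOCK_MHZ, current_mhz + CLOCK_STEP_MHZ)
        return current_mhz

    # One simulated run with made-up telemetry samples.
    clock = BASE_CLOCK_MHZ
    for sample in [Telemetry(150, 70, 0.95), Telemetry(170, 74, 0.97), Telemetry(200, 80, 0.99)]:
        clock = next_clock(clock, sample)
        print(f"power={sample.power_watts}W temp={sample.temperature_c}C -> clock={clock} MHz")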
How Much Boost?
Because GPU Boost happens in real time and the boost factor varies depending on exactly what's being rendered, it's hard to pin the performance gain down to a single number. To help clarify the typical performance gain, all Kepler GPUs with GPU Boost list two clock speeds on their specification sheets: the base clock and the boost clock. The base clock equates to the current graphics clock on all NVIDIA GPUs. For Kepler, that's also the minimum clock speed that the GPU cores will run at in an intensive 3D application. The boost clock is the typical clock speed that the GPU will run at in a 3D application. For example, the GeForce GTX 680 has a base clock of 1006 MHz and a boost clock of 1058 MHz. What this means is that in intensive 3D games, the lowest the GPU will run at is 1006 MHz, but most of the time it will likely run at around 1058 MHz. It won't run at exactly this speed; based on real-time monitoring and feedback, it may go higher or lower, but in most cases it will run close to this speed.
GPU Boost doesn't take away from overclocking. In fact, with GPU Boost, you now have more than one way to overclock your GPU. You can still increase the base clock just like before, and the boost clock will increase correspondingly. Alternatively, you can increase the power target. This is most useful for games that already consume nearly 100% of the power target.
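The two overclocking knobs can be pictured with a simple model: a clock offset shifts both the base and boost clocks, while a higher power target gives the boost algorithm more room before it has to back off. The function below is only a toy illustration of that relationship; the offset behavior and the 110% figure are assumptions, not tuning guidance.

    # Toy model of the two overclocking knobs (illustrative, not a spec).
    BASE_CLOCK_MHZ = 1006
    BOOST_CLOCK_MHZ = 1058
    POWER_TARGET_W = 195

    def overclocked_settings(clock_offset_mhz=0, power_target_percent=100):
        """Return (base clock, boost clock, power target) after applying the two knobs."""
        base = BASE_CLOCK_MHZ + clock_offset_mhz      # raising the base clock...
        boost = BOOST_CLOCK_MHZ + clock_offset_mhz    # ...raises the boost clock with it
        target = POWER_TARGET_W * power_target_percent / 100.0
        return base, boost, target

    print(overclocked_settings(clock_offset_mhz=100))        # (1106, 1158, 195.0)
    print(overclocked_settings(power_target_percent=110))    # more headroom for games near the limit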
Genuinely Smoother Gameplay
Despite the gorgeous graphics seen in many of today's games, there are still some highly distracting artifacts that appear in gameplay despite developers' best efforts to suppress them. The most jarring of these is screen tearing. Tearing is easily observed when the mouse is panned from side to side: the screen appears to be torn between multiple frames, with an intense flickering effect. Tearing tends to be aggravated when the framerate is high, since a large number of frames are in flight at a given time, causing multiple bands of tearing.
FXAA: Anti-Aliasing at Warp Speed
Nothing ruins a beautiful-looking game like jaggies. They make straight lines look crooked and generate distracting crawling patterns when the camera is in motion. The fix for jaggies is anti-aliasing, but today's methods of anti-aliasing are very costly to frame rates. To make matters worse, their effectiveness at removing jaggies has diminished in modern game engines.
Almost all games make use of a form of anti-aliasing called multi-sample anti-aliasing (MSAA). MSAA renders the scene with extra samples per pixel and then resolves them down to reduce the appearance of aliasing. The main problem with this technique is that it requires a tremendous amount of video memory; for example, 4x MSAA requires four times the video memory of standard rendering. In practice, a lot of gamers are forced to disable MSAA in order to maintain reasonable performance.
FXAA is a new way of performing anti-aliasing that's fast, effective, and optimized for modern game engines. Instead of rendering everything at four times the resolution, FXAA picks out the edges in a frame based on contrast detection. It then smooths out the aliased edges based on their gradient. All of this is done as a lightweight, post-processing shader. FXAA is not only faster than 4x MSAA, but it produces higher-quality results in game engines that make use of extensive post-processing.
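To make the idea of contrast-based edge detection and gradient-based smoothing concrete, here is a heavily simplified Python sketch. It is not the actual FXAA shader, which runs on the GPU and performs a directional edge search with sub-pixel blending; it only shows the shape of the technique: compute luminance, flag pixels whose local contrast exceeds a threshold, and blend them toward their neighbors.

    # Greatly simplified, CPU-side sketch of an FXAA-like post-process pass.
    # Not the real FXAA shader: real FXAA does a directional edge search and
    # sub-pixel blending, and runs as a pixel shader on the GPU.

    def luminance(rgb):
        r, g, b = rgb
        return 0.299 * r + 0.587 * g + 0.114 * b   # standard luma weights

    def fxaa_like_pass(image, threshold=0.1):
        """image: 2D list of (r, g, b) tuples in [0, 1]. Returns a smoothed copy."""
        h, w = len(image), len(image[0])
        out = [row[:] for row in image]
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                center = image[y][x]
                neighbors = [image[y - 1][x], image[y + 1][x], image[y][x - 1], image[y][x + 1]]
                lumas = [luminance(p) for p in neighbors] + [luminance(center)]
                contrast = max(lumas) - min(lumas)
                if contrast > threshold:   # edge detected: blend center with its neighbors
                    out[y][x] = tuple(
                        0.5 * center[c] + 0.5 * sum(p[c] for p in neighbors) / 4.0
                        for c in range(3)
                    )
        return out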
Compared to 4x MSAA, FXAA produces comparable if not smoother edges. But unlike 4x MSAA, it consumes no additional memory and runs almost as fast as no anti-aliasing at all. FXAA has the added benefit that it works on transparent geometry such as foliage and helps reduce shader-based aliasing that often appears on shiny materials.
Comparison of MSAA (Antialiasing Deferred) vs. FXAA (Antialiasing Post) performance in Battlefield 3.
While FXAA is available in a handful of games today, NVIDIA has integrated it into the control panel with the R300 series driver. This means you'll be able to enable it in hundreds of games, even legacy titles that don't support anti-aliasing.
TXAA: Even Higher Quality Than FXAA
Computer-generated effects in films spend a huge amount of computing resources on anti-aliasing. For games to truly approach film quality, developers need new anti-aliasing techniques that bring even higher quality without compromising performance.
With Kepler, NVIDIA has invented an even higher-quality AA mode called TXAA that is designed for direct integration into game engines. TXAA combines the raw power of MSAA with sophisticated resolve filters similar to those employed in CG films. In addition, TXAA can also jitter sample locations between frames for even higher quality.
TXAA is available in two modes: TXAA 1 and TXAA 2. TXAA 1 offers visual quality on par with 8x MSAA at performance similar to 2x MSAA, while TXAA 2 offers image quality that's superior to 8x MSAA with performance comparable to 4x MSAA.
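The details of TXAA's resolve filter and jitter pattern are not described here, so the sketch below is only an illustration of the general idea of jittering sample positions between frames. It uses a Halton sequence, a common choice for temporal anti-aliasing in general, and should not be read as a description of TXAA itself.

    # Illustrative sub-pixel jitter for temporal anti-aliasing (not TXAA's actual scheme).

    def halton(index, base):
        """Low-discrepancy Halton value in [0, 1) for the given index and base."""
        result, f = 0.0, 1.0 / base
        while index > 0:
            result += f * (index % base)
            index //= base
            f /= base
        return result

    def jitter_offsets(num_frames):
        """Per-frame (x, y) offsets in the range [-0.5, 0.5) pixels."""
        return [(halton(i + 1, 2) - 0.5, halton(i + 1, 3) - 0.5) for i in range(num_frames)]

    for frame, (dx, dy) in enumerate(jitter_offsets(4)):
        print(f"frame {frame}: jitter ({dx:+.3f}, {dy:+.3f}) px")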
Like FXAA, TXAA will first be integrated directly into game engines. The following games, engines, and developers have committed to offering TXAA support: MechWarrior Online, Secret World, Eve Online, Borderlands 2, Unreal Engine 4, BitSquid, Slant Six Games, and Crytek.
Conclusion
The new Kepler architecture with GPU Boost makes the GTX 680 the fastest graphics card, with lower power consumption and less noise. The NVIDIA team wanted to make gaming faster and smoother, and FXAA and TXAA deliver both: games get super-smooth edges without tanking performance. Adaptive V-Sync improves upon a feature that many gamers swear by; with the new Kepler architecture you can play with V-Sync enabled and not worry about sudden dips in frame rate. A single GPU now powers a full NVIDIA Surround setup plus an accessory display, and there is simply no better way to play a racing game or flight simulator. Add in NVIDIA PhysX technology, and the GTX 680 delivers an amazingly rich gaming experience. Kepler is the outcome of over four years of research and development by some of the best and brightest engineers at NVIDIA.