This post was last edited by ~DeatHMooN~ on 28-9-2009 09:18
DirectX 11
Aah finally, a new DirectX. It's funny how most game developers skipped DX10 really. Face it, if there are not enough changes over DX9 then why should software houses invest in a new code path and thus spend extra money on development? This literally was a problem with DX generation 10. Next to that, add the stupendous limitation from Microsoft to limit DX10 to Windows Vista only. Probably the most horrendous call Microsoft ever made for an operating system.
Good news though, DirectX 11 is an extensive step upwards for both developers and gamers. Developers can speed up their games and improve them with more complex shaders and a few new tricks like tessellation. Gamers on their side can have faster running games with some really cool new eye candy. This is the new shader palette for developers to use: Vertex, Hull, Domain, Geometry, Pixel and Compute Shaders. With the compute shader comes DirectCompute as well, allowing Windows Vista or Windows 7 to utilize the GPU directly from within Windows. It's a first step, but quite a number of applications that would benefit from GPU computing can now make use of it. This really is a revolutionary step in development, as parallel processing can be really helpful in specific situations and software.
Here are the most prominent new features of DirectX 11 (and I'm keeping this as simple as possible) that will affect you directly:
- Shader model 5.0
- Multi-threading
- DirectCompute 11 - Physics and AI
- Hardware Tessellation
- Better shadows
- HDR Texture compression
Let's highlight and discuss the five more important features that will affect you the most.
Multi-threaded rendering
Much like modern day applications and processors, it is now possible to fire off code and datasets towards the GPU from multiple threads; we call this multi-threaded rendering. Your gain here is efficiency. If an instruction or shader has to be queued up (single threaded), that creates latency, a delay. The GPU as such can now handle all the data completely threaded. And that means better overall performance.
Think of a hundred cars that have to move over a single lane road from point A to B.
Now imagine a hundred lane road where all hundred cars have a lane available.
Which approach do you think would get all cars to point B the quickest? Exactly. I'll probably receive a few emails from programmers and developers for this oversimplified explanation though.
Fact is, and this you need to remember with multi-threaded rendering, DirectX will take better advantage of all the available processing cores.
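For the programmers among you, here is a rough sketch of the idea in Python. Direct3D 11 exposes this through deferred contexts that record command lists on worker threads, which the main thread then submits. The code below only mimics that structure: `record_commands` and `build_frame` are made-up names, the "commands" are plain strings, and Python threads will not give you a real speed-up here. Treat it as an illustration of the pattern, not as how the API is called.

```python
from concurrent.futures import ThreadPoolExecutor

def record_commands(chunk_id, objects):
    """Record draw commands for one chunk of the scene (hypothetical helper)."""
    return [f"chunk {chunk_id}: draw {name}" for name in objects]

def build_frame(scene_chunks, workers=4):
    """Record every chunk on a worker thread, then submit in a fixed order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so submission stays deterministic
        command_lists = list(
            pool.map(lambda item: record_commands(*item), enumerate(scene_chunks))
        )
    frame = []
    for commands in command_lists:  # single submission step on the main thread
        frame.extend(commands)
    return frame

chunks = [["terrain", "sky"], ["mech"], ["smoke", "hud"]]
frame = build_frame(chunks)
```

The point is that recording happens in parallel while the final submission order stays the same as in the single-threaded case.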
DirectCompute 11
Another new feature in DirectX 11 I find very exciting is DirectCompute. It allows Windows 7 and Vista developers to make use of the parallel processing power of modern video cards; software developers will have access to the GPU and can use it to help out the system processor with tasks that involve say, high-quality video playback or high performance transcoding.
In its most simple explanatory form DirectCompute allows access to the GPU for Stream Computing (acceleration, post processing, whatever). As such DirectCompute allows for easier access to the GPU’s many cores for parallel processing, thus utilizing the GPU for stuff other than gaming. Examples here are Stream Computing and transcoding videos over the GPU (which is something we'll be testing later on in this review as well).
What about games you ask? Well, you could implement and use DirectCompute 11 for image processing and filtering (post processing), order independent transparency (really cool feature where you could see through an object like it was made out of glass), shadow rendering, physics, Artificial Intelligence and sure... Ray Tracing as well (though very limited).
I just touched on order independent transparency (OIT) and quickly wanted to show you that feature through a little video.
Now ATI will very likely release this footage at high quality sometime this week, but I made a recording of an OIT technology demo. The quality is poor as it is HD camera footage recorded from a regular monitor. But you'll get the idea; in this demo we'll use a "Mech" and apply the OIT technology (proper rendering of sorted transparent geometry). Look closely at how you can see through objects like that Mech as if it were a 3D X-ray. You can actually use this feature in smoke, fire, hair, foliage, fences, grates and so on. In this particular demo DirectCompute is utilized to enable single pass transparent pixel sorting.
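To give an idea of what "transparent pixel sorting" means, here is a small CPU-side sketch for a single pixel. It is not ATI's demo code; the function name, the fragment layout and the convention that a larger depth is farther away are all assumptions for illustration. Each pixel collects its transparent fragments in whatever order they were drawn, sorts them by depth, and blends them back to front.

```python
def blend_pixel(fragments, background):
    """Blend transparent fragments over a background colour.

    fragments: list of (depth, (r, g, b), alpha); larger depth = farther away
    (an assumed convention for this sketch).
    """
    color = background
    # Sort far-to-near so each nearer fragment is blended over what lies behind it
    for _depth, rgb, alpha in sorted(fragments, key=lambda f: f[0], reverse=True):
        color = tuple(alpha * c + (1.0 - alpha) * b for c, b in zip(rgb, color))
    return color

red_near = (1.0, (1.0, 0.0, 0.0), 0.5)
blue_far = (2.0, (0.0, 0.0, 1.0), 0.5)
black = (0.0, 0.0, 0.0)

# Submission order no longer matters: both calls give the same pixel
first = blend_pixel([red_near, blue_far], black)
second = blend_pixel([blue_far, red_near], black)
```

Because the sort happens per pixel, the application does not have to pre-sort its transparent geometry, which is the "order independent" part.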
Some stats: the environment is built out of 343 thousand triangles, the Mech is built out of 262 thousand triangles.
Where will DirectX Compute Shaders be used first? Well, it seems like the optimization and enhancement of post processing routines may well be an area that sees an immediate benefit. Compute Shaders are another area of functionality where DirectX 10 and DirectX 10.1 graphics processors will gain benefits under the DirectX 11 runtime.
The DirectX 11 API doesn't just have specifications for Compute Shaders 5.0, but also 4.1 and 4.0, and as such ATI Radeon HD 4000 Series graphics cards, which are DirectX 10.1 class products, will actually fall into the Compute Shader 4.1 profile, bringing more functionality to developers.
NVIDIA will benefit from DirectCompute model 4.0 only as their GPUs are DX class 10.0.
| DirectX 11 Feature | DX10 GPU | DX 10.1 GPU | DX 11 GPU |
|---|---|---|---|
| Tessellation | No | No | Yes |
| Shader Model 5.0 | No | No | Yes |
| DirectCompute 11 | No | No | Yes |
| DirectCompute 10.1 | No | Yes | Yes |
| DirectCompute 10 | Yes | Yes | Yes |
| Multi-threading | Yes | Yes | Yes |
| HDR Texture compression | No | No | Yes |
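As a quick illustration of the profile mapping described above, the lookup below picks a compute shader profile from the hardware class. The profile strings `cs_4_0`, `cs_4_1` and `cs_5_0` are the HLSL compile targets for these three levels; the function name and the dictionary are just a sketch and not part of any API.

```python
# HLSL compute shader compile targets per hardware class
CS_PROFILES = {
    "10.0": "cs_4_0",   # DirectCompute 10
    "10.1": "cs_4_1",   # DirectCompute 10.1
    "11.0": "cs_5_0",   # DirectCompute 11
}

def compute_profile(hardware_class):
    """Return the compute shader profile for a DirectX hardware class (sketch)."""
    try:
        return CS_PROFILES[hardware_class]
    except KeyError:
        raise ValueError(f"no compute shader profile for DirectX {hardware_class}")

radeon_hd_4000 = compute_profile("10.1")
dx10_class_gpu = compute_profile("10.0")
```

A DirectX 9 class card falls outside the table and raises an error in this sketch.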
Shader model 5.0
DirectX 11 also introduces Shader Model 5 for High Level Shader Language (HLSL), providing a better way for graphics programmers to implement shader programs. It adds double-precision support and lets programmers tackle shader specialization with polymorphism, objects, and interfaces.
We could go horribly in-depth on Shader Model 5.0 but that would take us too far. Better, more and longer shaders is what you need to remember.
Hardware Tessellation
One feature that I am really excited about personally is that we'll finally have a hardware tessellation unit inside the GPU that DirectX can utilize. But what is hardware tessellation you might ask? We are going to spend an entire page on this new feature that DX11 class graphics cards from both NVIDIA and ATI will have embedded.
Well... allow me, did you grab a cup of coffee already?
What is tessellation? Simply put, it's adding more detail to 3D objects in real-time. And with the arrival of DX11 class graphics cards ATI and NVIDIA now include a hardware tessellation unit inside the GPU, a programmable tessellation unit.
Tessellation simply means increasing your polygon count to get more detail. Look at the image below.
Tessellation is the process of subdividing a surface into smaller shapes. To describe object surface patterns, tessellation breaks down the surface of an object into manageable polygons. Triangles or quadrilaterals are two commonly used polygons in drawing graphical objects because computer hardware can easily manipulate and calculate these two simple polygons. An object is divided into quads and subdivided into triangles for convenient calculation.
Now in the first frame you can see a face. There's a small number of polygons in there. It's anno 2009, and we demand more detailed objects in our 3D scenes. So by recursively applying a subdivision rule we can increase the number of polygons. Now look at the second and third faces. There's so much more detail. This process can now be done 100% at GPU level in hardware without a significant impact on performance.
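To make "recursively applying a subdivision rule" a bit more concrete, here is a minimal Python sketch. It uses plain midpoint subdivision, where every triangle is split into four smaller ones; that is an assumption chosen for simplicity and not the rule the DX11 tessellator applies. It also only adds triangles; a real pipeline would additionally move the new vertices (for example with a displacement map) to add actual shape detail.

```python
def midpoint(a, b):
    """Point halfway between two 3D vertices."""
    return tuple((x + y) / 2.0 for x, y in zip(a, b))

def subdivide(triangles, levels):
    """Split every triangle into four, `levels` times over."""
    for _ in range(levels):
        refined = []
        for a, b, c in triangles:
            ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
            # three corner triangles plus the one in the middle
            refined += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
        triangles = refined
    return triangles

base = [((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))]
detailed = subdivide(base, 3)   # 4 * 4 * 4 triangles from a single one
```

Each level multiplies the triangle count by four, which is why doing this in dedicated hardware instead of shipping the extra geometry from the CPU matters.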
For DirectX 11 the tessellation portion of the pipeline has been wrapped with two new shader types that can be used, the Hull Shader and the Domain Shader.
Now some of you might have noticed it already from previous reviews. Tessellation isn't new, ATI already had a hardware tessellation unit in their GPUs for years. But the older units could not be addressed whatsoever in DirectX. The tessellation units featured in the ATI Radeon HD 2000, HD 3000 and HD 4000 series are all very much based on the same functionality found in the XBOX 360 'Xenos' graphics chip.
Some more examples -- Another good example for the usage of tessellation would be terrain building. This technique is especially useful for creating complex-looking terrain using a combination of very simple base geometry with a height map and a texture map. And perhaps more interesting is that this generated terrain can be deformed dynamically by manipulating the height map.
A scene could have much polygonal complexity closer to the viewer or camera, and fewer polygons as distance from the camera increases.
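That distance rule can be sketched in a few lines. The function below picks a tessellation factor that falls off linearly with distance from the camera. The ceiling of 64 matches the maximum tessellation factor in Direct3D 11; the `near` and `far` distances and the linear falloff are made-up values for illustration, and a real game would tune them per scene.

```python
def tess_factor(distance, near=10.0, far=200.0, max_factor=64.0, min_factor=1.0):
    """Tessellation factor for a patch at `distance` from the camera (sketch)."""
    if distance <= near:
        return max_factor          # full detail right in front of the viewer
    if distance >= far:
        return min_factor          # base geometry only in the distance
    t = (distance - near) / (far - near)   # 0.0 at near, 1.0 at far
    return max_factor + t * (min_factor - max_factor)

close_up = tess_factor(5.0)
halfway = tess_factor(105.0)
horizon = tess_factor(500.0)
```

In a DX11 pipeline a calculation like this would typically live in the hull shader, which outputs the factors the tessellator then uses.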
Anyway, though technical and somewhat difficult to explain, try and remember this... tessellation will allow much higher quality rendering and animation at very low GPU compute cost. The generic rule here is the more tessellation, the slower the GPU gets, yet since there's now dedicated core logic for it on the GPU, it's fast and can boost your detail massively, thus giving an impression of sharpness and much finer quality.
As stated, the new DX11 tessellation unit is programmable through two new shaders, the Domain and Hull shader. And remember, the higher the level of tessellation, the closer to realism the sharpness of the surface approaches.
DX11 - HDR texture compression
We doubt you care much about this info, but this is something developers like and requested. With DX11 also come the new texture compression methods BC6 and BC7. Microsoft boasts that these two compression formats are the best they can offer for the ratio of high-quality over performance.
Block compression 6 (BC6) compresses high dynamic range (HDR) data at a ratio of 6:1, given hardware support for decompression. BC7 offers 3:1 compression ratios for 8-bit low dynamic range (LDR) data.
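Where do 6:1 and 3:1 come from? Both formats store a block of 4x4 texels in 128 bits, which works out to 8 bits per texel. The little calculation below compares that with uncompressed data, assuming three 16-bit half-float channels for the HDR case and three 8-bit channels for the LDR case (no alpha channel in either).

```python
BLOCK_BYTES = 16        # BC6 and BC7 both store a 4x4 texel block in 128 bits
TEXELS_PER_BLOCK = 16

def compression_ratio(uncompressed_bits_per_texel):
    """Ratio of uncompressed size to block-compressed size."""
    compressed_bits_per_texel = BLOCK_BYTES * 8 / TEXELS_PER_BLOCK  # 8 bits
    return uncompressed_bits_per_texel / compressed_bits_per_texel

hdr_rgb = 3 * 16        # three half-float channels per texel (assumed layout)
ldr_rgb = 3 * 8         # three 8-bit channels per texel (assumed layout)

bc6_ratio = compression_ratio(hdr_rgb)
bc7_ratio = compression_ratio(ldr_rgb)
```

With an 8-bit alpha channel added, the LDR source grows to 32 bits per texel, so the ratio for that case would be higher than 3:1.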
Anyway I'd like to end this little chapter on DX11 now. Some soon to be released games that are DX11 compatible will be:
- Aliens vs. Predator (February 2010)
- BattleForge (DX11 patch expected in October)
- DiRT 2 (December 2009)
- S.T.A.L.K.E.R: Call of Pripyat (October 2009)
- Lord of the Rings Online (Q1 2010)
And sure, this is just a handful, but in the upcoming year expect a lot of titles as DX11 is the way to go for developers. Okay, enough about DirectX 11. One last thing, DirectX 11 will become available on both Windows Vista and Windows 7.
ATI Eyefinity
Okay, the next new hot feature for ATI Radeon graphics cards was already announced, ATI's Eyefinity. ATI introduces Eyefinity technology on their Radeon HD 5000 series graphics cards. This literally boils down to multi-monitor desktop and gaming nirvana! You will have no problem connecting say, three 30" monitors at 2560x1600. The graphics card can take that resolution and in fact combine the screen resolution and play in it.
We can explain this really simply though; you guys remember our Matrox Triplehead2Go reviews, right? Well, ATI's Series 5000 graphics cards will be able to drive one to six monitors per graphics card. We've seen and tested this live in action, and it works really nicely. You can combine monitors and get your groove on up to 7680x3200 pixels separated over several monitors -- multiple monitors to be used as a single display. I think the limit is even 8000x8000 pixels, but don't hold me to that.
So some examples of what you can do here:
- Single monitor setup at 2560x1600
- Dual monitor setup at 2560x1600 per monitor
- Three-monitor setup at 2560x1600 per monitor
- Six-monitor setup at 1920x1080 per monitor
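If you want to work out the combined desktop size for setups like these, the arithmetic is simple. The sketch below assumes the monitors are arranged in a regular grid of identical panels and ignores the bezels between them; the function name is made up and the grid layouts are my own examples.

```python
def combined_resolution(columns, rows, width, height):
    """Total desktop size for a grid of identical monitors (bezels ignored)."""
    return columns * width, rows * height

three_wide = combined_resolution(3, 1, 2560, 1600)   # three 30" panels side by side
six_large = combined_resolution(3, 2, 2560, 1600)    # 3x2 grid of 30" panels
six_hd = combined_resolution(3, 2, 1920, 1080)       # 3x2 grid of 1080p panels
```

The 3x2 grid of 2560x1600 panels is the layout that adds up to the 7680x3200 figure mentioned earlier.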
Eyefinity is looking really nice, and sure we also understand that 99% of you guys will never use more than two monitors. That other 1% definitely matches the Guru3D audience. Personally I like to game on three screens. It's really immersive.
Mind you that for six monitor support a special edition (Eyefinity6) card will be launched with six DisplayPort connectors. Your average Radeon HD 5870 will have three or four monitor outputs. In fact the reference 5870 has two DVI, one HDMI and one DisplayPort connector all on one card. If you are bold enough to go for a multi-monitor setup, it really is ideal to get three screens for flight sims, racing games, role-playing games, real-time strategy, first-person shooters and sure, even multimedia apps.
At ATI's press events they hooked up the Radeon HD 5870 to half a dozen DisplayPort monitors that were running at their full resolution, merging all six into a single image to hit a phenomenal live display. Eyefinity is modular and thus allows users to rearrange the number of discrete images created, in addition to their shape, according to their liking. Guru3D users and gamers will no doubt find this setup to their liking. It will be interesting to learn just what kind of living room you have if you were to employ such a configuration. Please post your setups in our forums.
Also a note -- we'll be publishing a dedicated article on Eyefinity in the future, but we expect this to be a great feature for all kinds of simulations; the flight-sim community must be going wild for sure, all right!
Power Consumption
One of the biggest accomplishments of the series 5000 graphics cards is the enhancement in the power design; the implementation of voltage and clock regulation is even more dynamic -- power management at a new level.
So we'll look purely at the Radeon HD 5870 now. In IDLE the GPU will clock down and lower its voltages on both GPU and memory. Have a look:
| GPU | Radeon HD 4870 | Radeon HD 5850 | Radeon HD 5870 |
|---|---|---|---|
| Max. Board Power (TDP) | 160W | 170W | 188W |
| Idle Board Power | 90W | 27W | 27W |

The card obviously achieves a low 27W IDLE power consumption by clocking down with several power states. Thus a low engine (core) clock frequency with lowered voltages and lower GDDR5 memory power. It's amazing though, as your generic high-end graphics card would normally consume 50~60 Watts when it idles in Windows.
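To put the idle numbers from the table into perspective, here is a quick back-of-the-envelope calculation. The 90W and 27W figures come from the table; the eight idle hours per day is purely my assumption, so plug in your own usage.

```python
def idle_saving(old_watts, new_watts):
    """Absolute and relative drop in idle board power."""
    saved = old_watts - new_watts
    return saved, saved / old_watts

def yearly_kwh(watts, hours_per_day):
    """Energy in kWh per year for a constant draw over the given daily hours."""
    return watts * hours_per_day * 365 / 1000.0

saved_watts, saved_fraction = idle_saving(90, 27)   # HD 4870 vs HD 5870 idle
kwh_per_year = yearly_kwh(saved_watts, 8)           # 8 idle hours/day is assumed
```

Going from 90W to 27W is a 63W drop, or 70% less idle power than the Radeon HD 4870.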
Things get even better though; the performance of the graphics card compared to the last generation products has nearly doubled, yet the 5870 has a TDP (peak wattage) of only 188 Watts. We think that is just awesome.
Though we haven't tested it yet, ATI also incorporated a new technology feature called ULPS -- Ultra Low Power State for multi-GPU configurations. We need to look into this, but typically with multiple GPUs installed you'd have a high IDLE power consumption; this seems to have been improved. More on that in another article though.
We will test power consumption later on in this article.
Universal Video Decoder 2.0
Always worth a mention is UVD, short for Universal Video Decoder. With proper 3rd party software like WinDVD or PowerDVD you can enable support for UVD 2.0, which provides hardware acceleration of the H.264 and VC-1 high definition video formats used by Blu-ray and HD DVD. The video processor allows the GPU to apply hardware acceleration and video processing functions while keeping power consumption & CPU utilization low.
You will have sheer decoding precision on the Radeon 5000 series. Low CPU utilization whilst scoring maximum image quality. One improvement has been made as well; you can now for example upscale your 1920x1080 streams fine to a 2560x1600 sized monitor (no more black borders).
New in the GPU architecture of the series 5000 is an updated video engine. It's really not massively different compared to the old UVD engine, yet has two new additions for post-processing, decoding and enhancing video streams. Dual stream decoding is one of the new features. For example, if you play back a Blu-ray movie and simultaneously want to see a director's commentary (guided by video) you can now look at both the movie and in a smaller screen see the additional content (like picture-in-picture). Obviously this is Blu-ray 2.0 compatibility here, and the additional content is an actual feature of the movie. But definitely fun to see.
New in Enhanced UVD 2.0
- Hardware accelerated decode of two 1080p HD streams
- Compatible with Windows Aero mode - playback of HD videos while Aero remains enabled
- Video gamma - independent gamma control from Windows desktop.
- Brighter whites - Blue Stretch processing increases the blue value of white colors for bright videos
- Dynamic Video Range - Controls levels of black and white during playback
A recently added feature also is Dynamic Contrast Enhancement. It does pretty much what the name says; Dynamic Contrast Enhancement technology will improve the contrast ratios in videos in real-time, on the fly. It's a bit of a tricky thing to do, as there are certain situations where you do not want your contrast increased.
Another feature is Dynamic Color Enhancement. It's pretty much a color tone enhancement feature and will slightly enforce a color correction where it's needed. We'll show you that in a bit as I quite like this feature; it makes certain aspects of a movie a little more vivid.
Directly tied to the UVD engine is obviously also sound. AMD's Radeon series 3000, 4000 and 5000 cards can pass lossless sound directly through the HDMI connector. This has been upgraded as it's now possible to have 7.1 channel lossless sound at 192kHz / 24-bit. The HDMI audio output follows HDMI standard 1.3a and now also supports Dolby TrueHD and DTS-HD audio. Obviously there is also support for standard PCM, AC-3 and DTS.
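For a feel of how much data 7.1 channel 192kHz / 24-bit audio is, the snippet below calculates the raw bit rate if that stream were sent as uncompressed PCM. Dolby TrueHD and DTS-HD are losslessly compressed, so their actual bit rates sit below this figure; the function name is just for illustration.

```python
def pcm_bitrate(channels, sample_rate_hz, bits_per_sample):
    """Raw bit rate in bits per second for uncompressed PCM audio."""
    return channels * sample_rate_hz * bits_per_sample

# 7.1 surround means 8 discrete channels
bits_per_second = pcm_bitrate(8, 192_000, 24)
megabits_per_second = bits_per_second / 1_000_000
```

That works out to just under 37 Mbit/s for the audio alone.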
To be able to play back high-def content you'll still need software like WinDVD or PowerDVD, an HD source (Blu-ray player) and an HDCP monitor or television. For those interested in MKV / x264 GPU based content acceleration, playback and image quality enhancements, please read this guide we have written. We spotted this lovely little free application to manage this.
ATI Stream
In the current day and age there is more to graphics cards than just playing games. More and more non-gaming related features can and are being offloaded to the GPU. Roughly a year ago ATI introduced ATI Stream. This is a software layer that allows software developers to 'speak' with the GPU and have it process data using your graphics card. This really is the most simple & basic description I can give it. I have no idea where ATI Stream will be heading now that DirectCompute is available.
In this article we'll show you a test where we utilize ATI Stream and NVIDIA CUDA to transcode videos over the GPU.
Now I'd like to point you towards one thing you should all do with your GPU when it's doing nothing.
Folding@home using the ATI Radeon series 5000 GPU
Folding@home is a project where you can have your GPU or CPU (when the PC is not used) help out fighting diseases by folding proteins. Over the past 12 months a lot of progress has been made between the two parties involved. And right now there is a GPU folding client available that works with Radeon 5000 series graphics processors. It is ATI Stream based, meaning that all Stream ready GPUs can start folding.
The Guru3D team is ranking in the Folding@Home top 90, yes... I'm very proud of our guys crunching these numbers, especially since there are tens of thousands of other teams. The client is out; if possible please join team Guru3D and let's fold away some nasty stuff. The good thing is, you won't even notice that it's running.
Our Folding@home info can be found here:
- Team Guru3D Homepage
- Team Guru3D support forums
Our Guru3D team number is 69411 and if you decide to purchase a 4000 series product, guys, promise me you'll use it to fold for us. By making this move my dear friends, there are now 70 million GPUs available to compute the biggest mysteries in diseases and illnesses. Again, let's make Team Guru3D the biggest one available guys, join our team.
Radeon HD 5870 GPU wafer