Time spy benchmark records
9/19/2023

Time Spy engine

To fully take advantage of the performance improvements that DirectX 12 offers, Time Spy uses a custom game engine developed in-house from the ground up. The engine was created with the input and expertise of AMD, Intel, Microsoft, NVIDIA, and the other members of the UL Benchmark Development Program.

The rendering, including scene update, visibility evaluation, and command list building, is done with multiple CPU threads using one thread per available logical CPU core. This reduces CPU load by utilizing multiple cores.

The engine supports the most common type of multi-GPU configuration, i.e. two identical GPU adapters in Crossfire/SLI, by using explicit multi-adapter with a linked-node configuration to implement explicit alternate frame rendering. Heterogeneous adapters are not supported.

The Umbra occlusion library (version 3.3.17 or newer) is used to accelerate and optimize object visibility evaluation for all cameras, including the main camera and light views used for shadow map rendering. The culling runs on the CPU and does not consume GPU resources.

One descriptor heap is created for each descriptor type when the scene is loaded. Hardware Tier 1 is sufficient for containing all the required descriptors in the heaps. Root signature constants and descriptors are used when suitable.

Implicit resource heaps created by ID3D12Device::CreateCommittedResource() are used for most resources. Explicitly created heaps are used for some target resources to reduce memory consumption by placing resources that are not needed at the same time on top of each other.

Asynchronous compute

Asynchronous compute is utilized heavily to overlap multiple rendering passes for maximum utilization of the GPU. Async compute workload per frame varies between 10-20%.

The engine supports Phong tessellation and displacement-map-based detail tessellation. Tessellation factors are adjusted to achieve the desired edge length for the output geometry on the render target (G-buffer, shadow map or other). Additionally, patches that are back-facing and patches that are outside of the view frustum are culled by setting the tessellation factor to zero. Tessellation is turned entirely off by disabling hull and domain shaders when the size of an object's bounding box on the render target drops below a given threshold. If an object has several geometry LODs, tessellation is used on the most detailed LOD.

First, all opaque objects are drawn into the G-buffer. In the second step, transparent objects are rendered to an A-buffer, which is then resolved on top of surface illumination later on.

Geometry rendering uses a LOD system to reduce the number of vertices and triangles for objects that are far away. This also results in a bigger on-screen triangle size.

The material system uses physically-based materials. Opaque objects are rendered directly to the G-buffer. The G-buffer is composed of textures such as depth, normal, albedo, material attributes, and luminance. A material might not use all target textures. For example, a luminance texture is only written into when drawing geometries with luminous materials.

Transparent objects

For rendering transparent geometries, the engine uses a variant of an order-independent transparency technique called Adaptive Transparency (Salvi et al.). Simply put, a per-pixel list of fragments is created for which a visibility function (accumulated transparency) is approximated. The fragments are blended according to the visibility function and illuminated in the lighting pass to allow them to be rendered in any order. The A-buffer is drawn after the G-buffer to fully take advantage of early depth tests.

In addition to the per-pixel lists of fragments, per-quad (2x2) lists of fragments are created. The per-quad lists can be used for selected renderables instead of the per-pixel lists. This saves memory when per-pixel information is not required for a visually satisfying result. When rendering to per-quad lists, a half-resolution viewport and depth texture are used to ignore fragments behind opaque surfaces. When resolving the A-buffer fragments for each pixel, both the per-pixel list and the per-quad list are read and blended in the correct order. Each per-quad list is read for four pixels in the resolve pass.
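The visibility function mentioned under Transparent objects can be illustrated with a small CPU-side sketch. This is not the engine's code: the `Fragment` struct and the `visibility` and `resolve` functions are hypothetical names, color is reduced to a single channel, and the visibility function is evaluated exactly rather than approximated with a fixed number of nodes as Adaptive Transparency does. It only shows why blending by accumulated transparency makes the result independent of the order in which fragments were submitted.

```cpp
#include <vector>

// Hypothetical fragment record (assumption): the engine's actual
// A-buffer layout is not described in the text.
struct Fragment {
    float depth;  // distance from the camera, smaller is closer
    float alpha;  // opacity in [0, 1]
    float color;  // single channel for brevity
};

// Exact visibility at depth z: the product of (1 - alpha) over all
// fragments strictly closer than z. Fragments at equal depth do not
// occlude each other in this sketch.
float visibility(const std::vector<Fragment>& frags, float z) {
    float vis = 1.0f;
    for (const Fragment& f : frags) {
        if (f.depth < z) {
            vis *= (1.0f - f.alpha);
        }
    }
    return vis;
}

// Resolve one pixel: every fragment contributes
// color * alpha * visibility(depth), and the background is attenuated
// by the total transmittance. No sorting is needed, so the list may be
// in any order.
float resolve(const std::vector<Fragment>& frags, float background) {
    float result = 0.0f;
    float transmittance = 1.0f;
    for (const Fragment& f : frags) {
        result += f.color * f.alpha * visibility(frags, f.depth);
        transmittance *= (1.0f - f.alpha);
    }
    return result + background * transmittance;
}
```

For example, a half-transparent white fragment at depth 1 in front of a half-transparent black fragment at depth 2, over a white background, resolves to 0.75 whichever fragment is listed first. The per-fragment loop here costs quadratic time in the list length, which is acceptable for a sketch but is one reason a real implementation would compress the visibility function instead.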