Saturday, 28 December 2013

Updating Vertex Buffers in DX11: StreamOut vs Compute

This post benchmarks an update of 125,000 vertices (three floats each: x, y, z), comparing double-buffered StreamOut against a compute shader that writes to a single buffer.

The compute C++ code is:
 // Bind the vertex buffer's UAV as the compute shader's output
 dx11::setUnorderedAccessView(effect,"targetVertexBuffer",vertexBuffer.unorderedAccessView);
 ApplyPass(pContext,effect->GetTechniqueByName("move_particles_compute")->GetPassByIndex(0));
 // Launch 5 x 5 x 5 = 125 thread groups
 pContext->Dispatch(5,5,5);

The StreamOut C++ code is:

 pContext->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_POINTLIST );
 pContext->IASetInputLayout( m_pVtxDecl );
    
 // Bind the current vertex buffer as the input
 ID3D11Buffer *pBuffer = vertexBuffer.vertexBuffer;
 UINT stride = sizeof(vec3);
 UINT offset[1] = { 0 };
 pContext->IASetVertexBuffers( 0, 1, &pBuffer, &stride, offset );

 // Point to the correct output buffer
 pContext->SOSetTargets( 1, &m_pVertexBufferSwap, offset );

 // Apply the technique's first pass, then draw
 techniqueMoveParticles->GetPassByIndex(0)->Apply(0,pContext);

 pContext->Draw(125000 , 0 );

 // Unbind the stream-output target so the buffer can be used as an input again
 pBuffer = NULL;
 pContext->SOSetTargets( 1, &pBuffer, offset );

 // Swap buffers
 ID3D11Buffer* pTemp  = m_pVertexBufferSwap;
 m_pVertexBufferSwap  = vertexBuffer.vertexBuffer;
 vertexBuffer.vertexBuffer = pTemp;
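
The double-buffering pattern used here (read from one buffer, write to the other, then swap) can be illustrated on the CPU. This is only a minimal sketch, not the GPU code: the `Vec3` and `PingPong` types and the `step` function are hypothetical names introduced for illustration, and the "move" applied to each vertex is a placeholder.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the post's vec3 (three floats).
struct Vec3 { float x, y, z; };

// Hypothetical CPU analogue of the two GPU buffers.
struct PingPong {
    std::vector<Vec3> front; // read during this update (like the IA vertex buffer)
    std::vector<Vec3> back;  // written during this update (like the SO target)

    explicit PingPong(std::size_t n)
        : front(n, Vec3{0.0f, 0.0f, 0.0f}), back(n, Vec3{0.0f, 0.0f, 0.0f}) {}

    // Placeholder update: shift every vertex along x by dt.
    void step(float dt) {
        for (std::size_t i = 0; i < front.size(); ++i) {
            back[i] = Vec3{front[i].x + dt, front[i].y, front[i].z};
        }
        // After the update the freshly written data becomes the input
        // for the next step, exactly like the pointer swap above.
        std::swap(front, back);
    }
};
```

The point of the swap is that the input is never overwritten while it is still being read, which is why StreamOut needs two buffers where compute, in principle, needs only one.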

And the results: the compute pass takes 0.014 ms and the StreamOut (geometry shader) pass takes 0.16 ms. However, it turns out that only StreamOut is actually updating all of the vertices. Compute updates about 8000, then silently gives up.
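
As a sanity check on the figure of about 8000, it may be worth comparing the number of threads launched against the vertex count. The shader's `[numthreads]` declaration is not shown in this post, so the group size of 64 below is purely an assumption; the helper names are hypothetical. The sketch only shows the arithmetic: `Dispatch(5,5,5)` launches 125 groups, and 125 groups of 64 threads would give exactly 8000 threads.

```cpp
#include <cstdint>

// Total threads launched by Dispatch(gx, gy, gz) when each group
// contains threadsPerGroup threads (the product of the numthreads values).
constexpr std::uint64_t threadsLaunched(std::uint32_t gx, std::uint32_t gy,
                                        std::uint32_t gz,
                                        std::uint32_t threadsPerGroup) {
    return static_cast<std::uint64_t>(gx) * gy * gz * threadsPerGroup;
}

// Number of groups needed to cover `count` elements (ceiling division).
constexpr std::uint32_t groupsNeeded(std::uint32_t count,
                                     std::uint32_t threadsPerGroup) {
    return (count + threadsPerGroup - 1) / threadsPerGroup;
}

// Assumed group size of 64: covers only 8000 of the 125,000 vertices.
static_assert(threadsLaunched(5, 5, 5, 64) == 8000, "125 groups x 64 threads");
// A group size of 1000 would be needed for 125 groups to cover everything.
static_assert(threadsLaunched(5, 5, 5, 1000) == 125000, "125 groups x 1000 threads");
```

If the group size really were 64, covering all 125,000 vertices would take `groupsNeeded(125000, 64)` groups rather than 125. This is only a possible contributing factor to check against the actual shader source.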

Turning on the debug layer using the DirectX Control Panel, we learn that passing a vertex buffer's UAV to a shader as a structured buffer is NOT supported. So StreamOut is actually the only option here.