Tuesday 21 October 2014

Custom Build Tools in Visual Studio 2012, 2013 - "Pay no attention to the API behind that curtain!"

So I want to add a custom file type - a new kind of shader I'm going to call a psfx file. I have an exe I've created to build it into binaries. All I need now is to make Visual Studio recognize psfx files, and call "Psfx.exe" to build them into .psfxo files. Simple eh? I used to do this in VS2008:


Now let's try it in VS2012. Let's see what MS has to say:
In earlier releases, a rule file is an XML-based file that has a .rules file name extension. A rule file lets you define custom build rules and incorporate them into the build process of a Visual C++ project. A custom build rule, which can be associated with one or more file name extensions, lets you pass input files to a tool that creates one or more output files.

In this release, custom build rules are represented by three file types, .xml, .props, and .targets, instead of a .rules file. When a .rules file that was created by using an earlier release of Visual C++ is migrated to the current release, equivalent .xml, .props, and .targets files are created and stored in your project together with the original .rules file.


Important
In the current release, the IDE does not support the creation of new rules. For that reason, the easiest way to use a rule file from a project that was created by using an earlier release of Visual C++ is to migrate the project to the current release.

Urk.

OK. Here's what I did next. I happened to still have VS 2008 on my machine - I've been meaning to get rid of it for a while now. I made a blank project, and created a new rule - "The Floop Rule" - for files with the .floop extension. VS built this beautifully simple, sensible file, FloopFileName.rules:

<VisualStudioToolFile Name="Floop Rule File Display Name" Version="8.00">
    <Rules>
        <CustomBuildRule
            Name="Floop Rule"
            DisplayName="Floop Rule Display Name"
            CommandLine="Floop.exe [inputs]"
            ExecutionDescription="Executing Floop Rule"
            FileExtensions="*.floop"
            Outputs="$(InputName).floopout">
            <Properties>
            </Properties>
        </CustomBuildRule>
    </Rules>
</VisualStudioToolFile>
468 bytes of perfection. Then I opened the .vcproj file in Visual Studio 2012, and converted it to a .vcxproj. Now my nice simple .rules file has been converted into FloopRuleFileName.props (879 bytes), FloopRuleFileName.targets (3.27k) and FloopRuleFileName.xml (4.37k). You can tell that this is improved technology, because of the huge file sizes.

Anyway, when I want to create a new rule in future, I can just copy these three files, replacing the following elements:

"Floop Rule File Display Name"
"Floop.exe"
"*.floop"
"Floop_Rule"
"floopout"



And so can you. To add the rule to a project, you'll need:
<ImportGroup Label="ExtensionSettings">
    <Import Project="FloopRuleFileName.props" />
  </ImportGroup>

and


<ImportGroup Label="ExtensionTargets">
    <Import Project="FloopRuleFileName.targets" />
  </ImportGroup>
Insert these as text in the appropriate places - the ExtensionSettings group near the top of the .vcxproj (after the Microsoft.Cpp.props import), and the ExtensionTargets group near the end (after the Microsoft.Cpp.targets import) - as seen in Temp.vcxproj.

Tuesday 14 October 2014

Cross-API Rendering: An Aside

Just as an aside, here's an insight into why I need to wrap up some perfectly serviceable rendering API's. Suppose I want to create a structured buffer to be used in a compute shader, for both read-only and read-write usage.
Here's how I want to declare it:
simul::crossplatform::StructuredBuffer<vec2> tempBuffer;

Here's how I'd like to initialize it:
tempBuffer.RestoreDeviceObjects(renderPlatform,num_elements,true);

- where passing "true" in the third element means "make it writeable by compute shaders".

Now in DirectX 11, I'd declare it like this:
ID3D11Buffer *pBuffer_Tmp;
ID3D11UnorderedAccessView *pUAV_Tmp;
ID3D11ShaderResourceView *pSRV_Tmp;

And initialize it like so (get ready):

  D3D11_BUFFER_DESC buf_desc;
  buf_desc.ByteWidth = sizeof(float) * 2 * num_elements;
  buf_desc.Usage = D3D11_USAGE_DEFAULT;
  buf_desc.BindFlags = D3D11_BIND_UNORDERED_ACCESS | D3D11_BIND_SHADER_RESOURCE;
  buf_desc.CPUAccessFlags = 0;
  buf_desc.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_STRUCTURED;
  buf_desc.StructureByteStride = sizeof(float) * 2;

  renderPlatform->AsD3D11Device()->CreateBuffer(&buf_desc, NULL, &pBuffer_Tmp);
  assert(pBuffer_Tmp);

  // Temp unordered access view
  D3D11_UNORDERED_ACCESS_VIEW_DESC uav_desc;
  uav_desc.Format = DXGI_FORMAT_UNKNOWN;
  uav_desc.ViewDimension = D3D11_UAV_DIMENSION_BUFFER;
  uav_desc.Buffer.FirstElement = 0;
  uav_desc.Buffer.NumElements = num_elements;
  uav_desc.Buffer.Flags = 0;

  renderPlatform->AsD3D11Device()->CreateUnorderedAccessView(pBuffer_Tmp, &uav_desc, &pUAV_Tmp);

  // Temp shader resource view
  D3D11_SHADER_RESOURCE_VIEW_DESC srv_desc;
  srv_desc.Format = DXGI_FORMAT_UNKNOWN;
  srv_desc.ViewDimension = D3D11_SRV_DIMENSION_BUFFER;
  srv_desc.Buffer.FirstElement = 0;
  srv_desc.Buffer.NumElements = num_elements;

  renderPlatform->AsD3D11Device()->CreateShaderResourceView(pBuffer_Tmp, &srv_desc, &pSRV_Tmp);

So, that was a bit longer, with a lot of redundant "match-up" values - you can't create an unordered access view of pBuffer_Tmp unless it was created with BindFlags containing D3D11_BIND_UNORDERED_ACCESS, for example. Now I know you can do clever things like create a view into a specific part of a buffer, and so on, but let's at least make a neat interface for the 99% case.
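
For comparison, here's a minimal sketch of what the wrapper's interface might look like. RestoreDeviceObjects is as used above; the other members are illustrative, not necessarily the real library's:

template<class T> class StructuredBuffer
{
public:
    // Create the API objects. Under D3D11 this wraps exactly the
    // buffer/UAV/SRV creation shown above; a GL implementation would
    // create a shader storage buffer instead.
    void RestoreDeviceObjects(RenderPlatform *r,int num_elements,bool computable);
    // Release the API objects.
    void InvalidateDeviceObjects();
    // Bind to a named slot in the shader, for reading (SRV) or,
    // where the shader declares it read-write, for writing (UAV).
    void Apply(DeviceContext &deviceContext,Effect *effect,const char *name);
};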

Monday 13 October 2014

Cross-API Rendering 2: RenderPlatform - the main interface


This is the definition for RenderPlatform, the core of the cross-API rendering interface.

virtual ID3D11Device *AsD3D11Device()=0;
We've forward-declared ID3D11Device as a struct. Non-D3D11 implementations return NULL. This is a convenience function for the D3D11 implementation.

virtual void RestoreDeviceObjects(void*);
We call this once, when the 3D graphics device has been initialized, and pass the API-specific device pointer/identifier.

virtual void InvalidateDeviceObjects();

We call this once, when the 3D graphics device object is being shut down.

virtual void RecompileShaders();

This is optional - call this to recompile the standard shaders. RenderPlatform implementations will contain some standard shaders for things like drawing text and putting textured quads to the screen. It's handy to be able to force a recompile when the shader source is modified.

DeviceContext &GetImmediateContext();

This returns an object containing immediate-context API-specific values. Context is a funny thing: some API's have it explicitly, some don't. Commands executed with the immediate context are sent straight away to the GPU.
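
As a sketch (the real contents are implementation-defined), DeviceContext might carry something like this - assuming a ViewStruct holding the current camera matrices:

struct DeviceContext
{
    // The API-specific context, e.g. an ID3D11DeviceContext* under D3D11;
    // unused by API's that have no explicit context object.
    void *platform_context;
    // Current view and projection matrices - used e.g. by DrawDepth below.
    ViewStruct viewStruct;
};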

virtual void PushTexturePath (const char *pathUtf8)=0;
virtual void PopTexturePath ()=0;

For resources, like textures, a stack is a great way of telling the renderer where to find them, because it handles multiple locations, and prioritizes them based on what was added last.
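
A minimal sketch of how an implementation might keep that stack:

std::vector<std::string> texturePathStack;

void PushTexturePath(const char *pathUtf8)
{
    texturePathStack.push_back(pathUtf8);
}
void PopTexturePath()
{
    texturePathStack.pop_back();
}
// When loading a texture, try the paths from the back of the vector
// (the most recently pushed) first.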

virtual void DispatchCompute (DeviceContext &deviceContext,int w,int l,int d)=0;
Most modern API's support compute shaders, and they pretty much all work like this - you apply the shader, then call "Dispatch" with three integers, representing the width, length, and depth of the compute block.
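
For example, to run a compute shader over a 512x512 texture with a thread-group size of 8x8x1 (the Apply/Unapply calls are illustrative - whatever your effect system uses to bind the shader):

  effect->Apply(deviceContext,tech,0);       // bind the compute shader
  renderPlatform->DispatchCompute(deviceContext,512/8,512/8,1);
  effect->Unapply(deviceContext);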

virtual void Draw  (DeviceContext &deviceContext,int num_verts,int start_vert)=0;
virtual void DrawIndexed (DeviceContext &deviceContext,int num_indices,int start_index=0,int base_vertex=0)=0;


And with the demise of OpenGL's immediate mode, most API's do their drawing this way: either you'll do a draw call with a specified number of vertices from the current buffer, or you'll do an indexed draw call, with a given set of indices.

virtual void DrawTexture (DeviceContext &deviceContext,int x,int y,int dx,int dy,crossplatform::Texture *tex,float mult=1.f,bool blend=false)=0;
virtual void DrawDepth  (DeviceContext &deviceContext,int x,int y,int dx,int dy,crossplatform::Texture *tex,const crossplatform::Viewport *v=NULL)=0;


These are two functions for drawing an onscreen quad. DrawTexture just puts the specified texture to the screen at the specified coordinates. DrawDepth draws a depth buffer: we needed a special function for this because depth values are often stored in a highly non-linear way, so that it's hard to see the detail in them - values are either very close to 1, or very close to zero. Implementations of DrawDepth get around this by using the projection matrix (part of the DeviceContext).
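
For instance, with a standard perspective projection with near plane n and far plane f, an implementation might recover the view-space distance from a stored depth value z like this (a sketch - the exact mapping varies between API's, and between normal and reversed depth):

float LinearDepth(float z,float n,float f)
{
    // Inverts the projection's depth mapping z = f*(d-n)/(d*(f-n)),
    // where d is the view-space distance:
    return n*f/(f-z*(f-n));   // z=0 gives n, z=1 gives f
}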

virtual void DrawQuad(DeviceContext &deviceContext)=0;

Here we just issue a draw call with four vertices - and usually, no vertex buffer. Modern shader languages can look at the vertex index instead of needing any actual vertex data. So the vertex shader can infer vertex position from the index - e.g. for fullscreen quads.

virtual void Print  (DeviceContext &deviceContext,int x,int y,const char *text,const float* colr=NULL,const float* bkg=NULL);

It is tremendously useful to be able to put text to the screen. This function doesn't need to be efficient: you probably won't use it for nice-looking in-game text, it's for debugging. We pass four floats (including alpha opacity) for the colour (white if NULL), and four for the background - if NULL, we don't draw the background.
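
Typical usage - say, white text on a semi-transparent black background:

  static const float white[]  ={1.f,1.f,1.f,1.f};
  static const float semiblk[]={0.f,0.f,0.f,0.5f};
  renderPlatform->Print(deviceContext,8,8,"Debug text",white,semiblk);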

virtual void PrintAt3dPos  (DeviceContext &deviceContext,const float *pos3,const char *text,const float* colr,int offsetx=0,int offsety=0,bool centred=false)  =0;

Another super-useful debugging function - draw the given text at the specified 3D position (but keep the text itself the same pixel size so it's readable).

virtual void Draw2dLines (DeviceContext &deviceContext,Vertext *lines,int vertex_count,bool strip)  =0;
virtual void DrawLines  (DeviceContext &deviceContext,Vertext *lines,int count,bool strip=false, bool test_depth=false,bool view_centred=false)  =0;
virtual void DrawCircle   (DeviceContext &deviceContext,const float *dir,float rads,const float *colr,bool fill=false)  =0;

Because each API has platform-specific stuff it needs to do for different kinds of rendering object, we define crossplatform base classes for them, and derived platform-specific classes.

virtual Texture *CreateTexture(const char *lFileNameUtf8=NULL) =0;
virtual BaseFramebuffer *CreateFramebuffer()=0;
virtual SamplerState *CreateSamplerState(SamplerStateDesc *) =0;
Effect *CreateEffect(const char *filename_utf8);
virtual Effect *CreateEffect  (const char *filename_utf8,const std::map<std::string,std::string> &defines)=0;
virtual Buffer *CreateBuffer()=0;
virtual Layout *CreateLayout(int num_elements,const LayoutDesc *layoutDesc) =0;
virtual RenderState *CreateRenderState(const RenderStateDesc &desc)=0;
virtual Query *CreateQuery(QueryType q)=0;

Each of these functions creates an instance of the platform-specific derived class, and returns a pointer to it. When you're done with the pointer, you just delete it. Later on we'll need to think about memory allocation - if the RenderPlatform is creating these objects, and we're deleting them in some other class, we need to provide some kind of memory allocator interface. But for now, we'll go with new/delete.
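
So typical usage looks like this (the filename is just for illustration):

  crossplatform::Texture *texture=renderPlatform->CreateTexture("noise.png");
  // ... use the texture in rendering ...
  delete texture;   // for now, plain delete frees the platform-specific object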

virtual Mesh *CreateMesh() =0;
virtual Light *CreateLight() =0;
virtual Material *CreateMaterial() =0;

Here's where we get a bit high-level. Neither OpenGL nor DirectX defines things like lights or materials - those are more like game-engine objects. But I think it's worthwhile to add them to the interface.

virtual void SetVertexBuffers(DeviceContext &deviceContext,int slot,int num_buffers,Buffer **buffers,const crossplatform::Layout *layout)=0;

Activate the specified vertex buffers in preparation for rendering.

virtual void SetStreamOutTarget    (DeviceContext &deviceContext,Buffer *buffer)=0;

Graphics hardware can write to vertex buffers using vertex and geometry shaders; we use this function to set the target buffer.
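
A sketch of how that might be used - draw with stream-output active, then detach the target:

  renderPlatform->SetStreamOutTarget(deviceContext,vertexBuffer);
  // ... draw: vertex/geometry shader output is captured into vertexBuffer ...
  renderPlatform->SetStreamOutTarget(deviceContext,NULL);   // detach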

virtual void     ActivateRenderTargets(DeviceContext &deviceContext,int num,Texture **targs,Texture *depth)=0;

Make the specified rendertargets and optional depth target active.

virtual void SetViewports(DeviceContext &deviceContext,int num,Viewport *vps)=0;
virtual Viewport GetViewport(DeviceContext &deviceContext,int index)=0;

Set the given viewports, or get the viewport at the given index.

virtual void     SetIndexBuffer     (DeviceContext &deviceContext,Buffer *buffer)=0;

Activate the specified index buffer in preparation for rendering.

virtual void     SetTopology      (DeviceContext &deviceContext,Topology t)=0;

Set the topology for subsequent draw calls, e.g. TRIANGLELIST.

virtual void     EnsureEffectIsBuilt    (const char *filename_utf8,const std::vector<EffectDefineOptions> &options);

This function ensures that the named shader is compiled with all the possible combinations of #defines given in options.

Called to store the render state - blending, depth check, etc. - for later retrieval with RestoreRenderState. Some platforms may not support this.

virtual void     StoreRenderState    (DeviceContext &deviceContext)=0;
virtual void     RestoreRenderState    (DeviceContext &deviceContext)=0;

Called to restore the render state previously stored with StoreRenderState. There must be exactly one call of RestoreRenderState for each StoreRenderState call, and they can be nested.
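
For example:

  renderPlatform->StoreRenderState(deviceContext);
      // ... change blend, depth-test state etc. freely here ...
      renderPlatform->StoreRenderState(deviceContext);    // nesting is allowed
      // ...
      renderPlatform->RestoreRenderState(deviceContext);
  renderPlatform->RestoreRenderState(deviceContext);      // original state restored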

virtual void     SetRenderState     (DeviceContext &deviceContext,const RenderState *s)=0;

Apply the RenderState to the device context - e.g. blend state, depth masking etc.

virtual void     SetStandardRenderState   (DeviceContext &deviceContext,StandardRenderState s);

Apply a standard renderstate - e.g. opaque blending. This is a shortcut for the 95% case - the states we'll be using a lot.

virtual void     PushRenderTargets(DeviceContext &deviceContext)=0;

Store the current rendertargets and viewports at the top of a stack. It must always be followed, eventually, by:

virtual void     PopRenderTargets(DeviceContext &deviceContext)=0;

This will restore rendertargets and viewports from the top of the stack.
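
So a typical render-to-texture pass that leaves the caller's targets untouched might look like this (the framebuffer's Activate/Deactivate calls are illustrative):

  renderPlatform->PushRenderTargets(deviceContext);
  framebuffer->Activate(deviceContext);      // make the offscreen target current
  // ... draw into the framebuffer ...
  framebuffer->Deactivate(deviceContext);
  renderPlatform->PopRenderTargets(deviceContext);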

Thursday 9 October 2014

Cross-API Rendering 1: Creating a Graphics API Abstract Layer for multiple API's


Writing rendering code for games today often involves supporting multiple platforms - PC of course, but also consoles like PS4 and Xbox One. Mac and Linux support are also becoming desirable - take a look at Steam on Linux for example.

So it's a bit awkward, when writing software to go across these platforms, that you have to use different graphics API's. For Windows, it's DirectX 11 - soon to be DX12. Some Windows users - particularly simulation people - still prefer OpenGL.

So we need a layer of commands between generic C++ code and a graphics API that could be DirectX 11, OpenGL 3+, or a console API, depending on the circumstances.

First question: why have a new API? Why not just choose a standard one, say OpenGL 3.0, then write adaptors for all the others, so that we can write our render code IN GL 3.0 and run it on any target? Firstly, I prefer a higher-level interface than GL, DirectX or any other API offers. I want the minimum number of commands to achieve any task, so none of the existing API's fit the bill. Secondly, the way the abstracted API works is by sending ALL the information that we could possibly need. That is, we're sending MORE information than DX11 needs, more than GL 3.0 needs. We're sending enough information that any KNOWN API will be able to interpret it.

This means that the API I'm creating will necessarily expand, slightly, as I add support for each new platform. And there will be redundancy - for example if one API specifies sampling in the shader, and one does it in C++, we must specify both, and they should match.

What a graphics API needs to be able to do

  • Create and initialize device objects
    • Textures
    • Shaders
    • Buffers
    • State objects
    • Queries
  • Perform GPU rendering or calculation commands using device objects
    • State changes
    • Draw calls
    • Compute dispatches
  • Free the device objects when no longer needed
The list above covers pretty much all the low-level constructs that we need to create. You can subdivide them - textures can be 3D, 2D or 1D. They can be arrays of 2D textures. They can be rendertargets, they can be compute targets. Buffers can be vertex buffers, index buffers, constant buffers (for shaders), or various other kinds.

Not all API's use state objects, e.g. Blend States. In OpenGL you mostly just set the individual parts of the state with single commands.

Draw calls and compute dispatches are really the same thing with different outputs - in both cases, you're going to run a shader with textures and buffers as inputs. But while a draw call puts its output to a render target, a compute dispatch writes random-access output to textures or buffers.

What a high-level graphics API should do

Because we don't want to reinvent the wheel, it would be nice if our API could have some higher-level constructs.
  • Create and initialize
    • Materials
    • Meshes
    • Lights
    • Framebuffers/Rendertargets
    • Effects - grouped shaders
  • Perform higher-level draw functions
    • Print text
    • Draw full-screen quads
    • Draw screen-space quads - great for debugging
    • Draw geometric shapes - lines, circles and so on
    • Push and pop state - great to make sure render code doesn't leave a mess
And where possible, we want this API to be light and clean - to use as few commands as possible to perform common tasks.

Next, I'll describe the RenderPlatform class, the core of the new API.

@err,hr

https://twitter.com/grahamsellers/status/460840588456128512

If you're debugging Win32 code, stick this in your watch window:

"@err,hr"

The @err pseudo-register shows the thread's last error code, and the ,hr format specifier displays it as a human-readable error message.

Tuesday 7 October 2014

How to use "Move" on the Windows Menu

http://www.howtogeek.com/howto/windows/bring-misplaced-off-screen-windows-back-to-your-desktop-keyboard-trick/