Advanced HLSL II: Shader compound parameters

This blog post was originally posted on my previous blog code4k
A very short post in the sequel of my previous post "Advanced HLSL using closures and function pointers", there is again a little neat trick by using the "class" keyword in HLSL: It is possible to use a class to regroup a set of parameters (shader resources as well as constant buffers) and their associate methods, into what is called a compound parameter. This feature of the language is absolutely not documented, I discovered the name "compound parameter" while trying to hack this technique, as the HLSL compiler was complaining about a restriction about this "compound parameter". So at least, It seems to be implemented up to the point that it is quite usable. Let's see how we can use this...

Group of input parameters in shaders, the usual way

Suppose the following code (not really useful):

// Shader Resources
SamplerState PointClamp;

// First set of parameters
// -----------------------
Texture2D<float> DepthBuffer;
float2 TexelSize;

// Associated methods with these parameters
float SampleDepthBuffer(float2 texCoord, int2 offsets = 0)
{
  return DepthBuffer.SampleLevel(PointClamp, texCoord + offsets * TexelSize, 0.0);
}

// Second set of parameters
// ------------------------
Texture2D<float> DepthBuffer1;
float2 TexelSize1;

// Associated methods with these parameters
float SampleDepthBuffer1(float2 texCoord, int2 offsets = 0)
{
  return DepthBuffer1.SampleLevel(PointClamp, texCoord + offsets * TexelSize1, 0.0);
}

float4 PSMain(float2 texCoord: TEXCOORD) : SV_TARGET
{
   return float4(SampleDepthBuffer(texCoord, int2(1, 0)), SampleDepthBuffer1(texCoord, int2(1, 0)), 0, 1);
}

What we have is some parameters that are grouped, for example
  • A resource DepthBuffer
  • A TexelSize that gives the size of a texel in uv coordinates for the previous textures (float2(1/width, 1/height))
  • A method "SampleDepthBuffer" that will sample the depth buffer.
And this set of parameters is duplicate with another set with just the postfix number "1". We need to duplicate the code here. Though of course, as usual there are some workaround
  • Either by using the preprocessor and token pasting: this approach is often used, but It means that you have a code that is sometimes less readable, especially if you have to embed a function in a #define.
  • For the methods SampleDepthBuffer, It could be possible to rewrite the signature to accept a Texture2D as well as a TexelSize as a parameter. Of course, if this function was using more textures, more parameters, we would have to pass them all by parameters...
The generated code produced by fxc.exe HLSL compiler is like this:
//
// Generated by Microsoft (R) HLSL Shader Compiler 9.29.952.3111
//
//
//   fxc /Tps_5_0 /EPSMain test.fx
//
//
// Buffer Definitions:
//
// cbuffer $Globals
// {
//
//   float2 TexelSize;                  // Offset:    0 Size:     8
//   float2 TexelSize1;                 // Offset:    8 Size:     8
//
// }
//
//
// Resource Bindings:
//
// Name                                 Type  Format         Dim Slot Elements
// ------------------------------ ---------- ------- ----------- ---- --------
// PointClamp                        sampler      NA          NA    0        1
// DepthBuffer                       texture   float          2d    0        1
// DepthBuffer1                      texture   float          2d    1        1
// $Globals                          cbuffer      NA          NA    0        1
//
//
//
// Input signature:
//
// Name                 Index   Mask Register SysValue Format   Used
// -------------------- ----- ------ -------- -------- ------ ------
// TEXCOORD                 0   xy          0     NONE  float   xy
//
//
// Output signature:
//
// Name                 Index   Mask Register SysValue Format   Used
// -------------------- ----- ------ -------- -------- ------ ------
// SV_TARGET                0   xyzw        0   TARGET  float   xyzw
//
ps_5_0
dcl_globalFlags refactoringAllowed
dcl_constantbuffer cb0[1], immediateIndexed
dcl_sampler s0, mode_default
dcl_resource_texture2d (float,float,float,float) t0
dcl_resource_texture2d (float,float,float,float) t1
dcl_input_ps linear v0.xy
dcl_output o0.xyzw
dcl_temps 1
mad r0.xyzw, cb0[0].xyzw, l(1.000000, 0.000000, 1.000000, 0.000000), v0.xyxy
sample_l_indexable(texture2d)(float,float,float,float) r0.x, r0.xyxx, t0.xyzw, s0, l(0.000000)
sample_l_indexable(texture2d)(float,float,float,float) r0.y, r0.zwzz, t1.yxzw, s0, l(0.000000)
mov o0.xy, r0.xyxx
mov o0.zw, l(0,0,0,1.000000)
ret
// Approximately 6 instruction slots used

When we have to deal with lots of parameters that are grouped, and these groups need to be duplicated with their associated methods, It becomes almost impossible to maintain a clean and reusable HLSL code. Fortunately, the "class" keyword is here to the rescue!

Shader input compound parameters container, the neat way

Let's rewrite the previous code using the keyword "class":

SamplerState PointClamp;

// Declare a container for our set of parameters
class TextureSet
{
    Texture2D<float> DepthBuffer;
    float2 TexelSize;

    float SampleDepthBuffer(float2 texCoord, int2 offsets = 0)
    {
        return DepthBuffer.SampleLevel(PointClamp, texCoord + offsets * TexelSize, 0.0);
    }
};

// Define two instance of compound parameters
TextureSet Texture1;
TextureSet Texture2;

float4 PSMain2(float2 texCoord: TEXCOORD) : SV_TARGET
{
    return float4(Texture1.SampleDepthBuffer(texCoord, int2(1, 0)), Texture2.SampleDepthBuffer(texCoord, int2(1, 0)), 0, 1);
}

And the resulting compiled HLSL is slightly equivalent:

//
// Generated by Microsoft (R) HLSL Shader Compiler 9.29.952.3111
//
//
//   fxc /Tps_5_0 /EPSMain2 test.fx
//
//
// Buffer Definitions:
//
// cbuffer $Globals
// {
//
//   struct TextureSet
//   {
//
//       float2 TexelSize;              // Offset:    0
//
//   } Texture1;                        // Offset:    0 Size:     8
                                        // Texture:   t0
//
//   struct TextureSet
//   {
//
//       float2 TexelSize;              // Offset:   16
//
//   } Texture2;                        // Offset:   16 Size:     8
                                        // Texture:   t1
//
// }
//
//
// Resource Bindings:
//
// Name                                 Type  Format         Dim Slot Elements
// ------------------------------ ---------- ------- ----------- ---- --------
// PointClamp                        sampler      NA          NA    0        1
// Texture1.DepthBuffer              texture   float          2d    0        1
// Texture2.DepthBuffer              texture   float          2d    1        1
// $Globals                          cbuffer      NA          NA    0        1
//
//
//
// Input signature:
//
// Name                 Index   Mask Register SysValue Format   Used
// -------------------- ----- ------ -------- -------- ------ ------
// TEXCOORD                 0   xy          0     NONE  float   xy
//
//
// Output signature:
//
// Name                 Index   Mask Register SysValue Format   Used
// -------------------- ----- ------ -------- -------- ------ ------
// SV_TARGET                0   xyzw        0   TARGET  float   xyzw
//
ps_5_0
dcl_globalFlags refactoringAllowed
dcl_constantbuffer cb0[2], immediateIndexed
dcl_sampler s0, mode_default
dcl_resource_texture2d (float,float,float,float) t0
dcl_resource_texture2d (float,float,float,float) t1
dcl_input_ps linear v0.xy
dcl_output o0.xyzw
dcl_temps 1
mad r0.xy, cb0[0].xyxx, l(1.000000, 0.000000, 0.000000, 0.000000), v0.xyxx
sample_l_indexable(texture2d)(float,float,float,float) r0.x, r0.xyxx, t0.xyzw, s0, l(0.000000)
mov o0.x, r0.x
mad r0.xy, cb0[1].xyxx, l(1.000000, 0.000000, 0.000000, 0.000000), v0.xyxx
sample_l_indexable(texture2d)(float,float,float,float) r0.x, r0.xyxx, t1.xyzw, s0, l(0.000000)
mov o0.y, r0.x
mov o0.zw, l(0,0,0,1.000000)
ret
// Approximately 8 instruction slots used

There are a couple of things to highlight:
  • The main difference is the packing of constant buffer variable is done separately as they will be packed together - as a struct and aligned on a float4 boundary. So in this specific case, the two floats TexelSize cannot be swizzled/merged (if they were float4, the code would be strictly equivalent). So we need to be aware and careful about this behavior.
  • Input resources are nicely prefixed by their compound parameter name, like "Texture1.DepthBuffer" or "Texture2.DepthBuffer", so it is also really easy to access them when using named resource bindings in an effect. Note that a resource declared but unused inside a compound parameter will occupy a slot register without using it (This is not a big deal, as there is almost the same kind of behavior when using array of resources)
  • We can still enclose "TextureSet Texture1" into a constant buffer declaration, the variable defined inside TextureSet for the Texture1 instance will correctly end-up in the corresponding constant buffer.
  • Global variable are accessible from methods defined in a compound parameter (for example PointClamp SamplerState used by the SampleDepthBuffer method)
  • Compound parameters can only be compiled using SM5.0 (unlike the previous post about the closures).
This is really a handy feature that could help to better organize some of our shaders. It's always surprising to still discover this kind of syntax constructions accessible from the current HLSL compiler. Let me know if you find any issues using this trick!

Comments