SSAO: Implementing Screen-Space Ambient Occlusion

Source code:

https://github.com/Li-Kira/CodeLib/tree/main/Unity/URP/PostProcess/SSAO

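Overview

Before writing any code, we need to understand the basic principle behind screen-space ambient occlusion. LearnOpenGL explains SSAO as follows:

For each fragment on the screen, we compute an occlusion factor from the depth values around that fragment. The occlusion factor is then used to reduce the fragment's ambient lighting component (lighting is usually split into ambient, diffuse, and specular components, and only the ambient one is affected). The occlusion factor is obtained by taking samples in a sphere around the fragment's position and comparing each sample with the current fragment's depth value. The more samples that turn out to be occluded, the less ambient light the fragment finally receives.

The quality of the effect depends on the number of samples: with too few samples, banding artifacts appear; with too many, performance suffers. Randomly rotating the sample kernel gives good results with fewer samples, but the randomness introduces noise, which we then reduce with a blur. The figure below shows the result of a low sample count with a randomly rotated kernel, and the result after adding the blur.

In short, to get a fast SSAO effect we use a randomly rotated sample kernel, and to hide the noise it introduces we blur the result.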

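Technical details

SSAO is a screen-space technique in which occlusion is usually computed in view space, so the position and normal provided by the vertex shader of the geometry pass are transformed into view space (by multiplying with the view matrix). The implementation in this article reconstructs positions and normals in world space instead; the method is the same as long as every quantity is expressed in the same space.

We need sample kernels shaped like a hemisphere and oriented along the surface normal. Generating a separate kernel for every normal direction is impractical, so we generate the kernel in tangent space, with the normal pointing along the z axis.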
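As a rough CPU-side illustration (Python, not shader code), a tangent-space hemisphere kernel could be built as below. The function name `make_hemisphere_kernel` and the fixed seed are assumptions made for this sketch; the `lerp(0.01, 1.0, t * t)` scaling mirrors the one used in the sampling loop later in this article.

```python
import math
import random

def make_hemisphere_kernel(count, seed=0):
    """Return `count` offsets inside the unit hemisphere around +z (tangent space)."""
    rng = random.Random(seed)  # fixed seed only to keep the sketch reproducible
    kernel = []
    for i in range(count):
        # x and y span [-1, 1]; z stays in [0, 1] so no sample points below the surface.
        v = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0), rng.uniform(0.0, 1.0))
        length = math.sqrt(sum(c * c for c in v)) or 1.0
        # Later samples reach further out; early ones stay close to the center,
        # so occluders near the fragment are sampled more densely.
        t = i / count
        scale = 0.01 + (1.0 - 0.01) * t * t  # lerp(0.01, 1.0, t * t)
        kernel.append(tuple(c / length * scale for c in v))
    return kernel

kernel = make_hemisphere_kernel(16)
```

Because every offset has a non-negative z, rotating the kernel so that z follows the surface normal keeps all samples on the outer side of the surface.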

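Using a hemispherical kernel avoids the grayish look seen in Crysis, where a full-sphere kernel makes even flat walls appear partly occluded.

More material on sample kernels can be found here: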

http://frederikaalund.com/a-comparative-study-of-screen-space-ambient-occlusion-methods/

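This implementation is based on a URP Renderer Feature and applies ambient occlusion to the rendered image as a screen-space post-process. For questions about URP Renderer Features, see my previous article on Custom URP Renderer Feature.

In the fragment shader, SSAO needs the following data:

The world-space normal and position, reconstructed from the depth buffer and the normal buffer
The depth value at the current screen position, read from the depth buffer
A random vector used to rotate the sample kernel

The figure below shows the SSAO pipeline. First, the position and normal of each pixel are reconstructed from the depth and normal buffers. Next, random points in the hemisphere around the normal are sampled. Each sample is projected back to the screen, and the scene depth stored there is compared with the sample's own depth: if the stored depth is smaller, some geometry lies in front of the sample, so the sample counts as occluded and contributes to the AO term; otherwise it is discarded. Finally, the AO result is blurred and blended with the original image.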

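Getting the depth buffer and the normal buffer

Getting the depth buffer in Unity URP is straightforward. It requires the following texture declaration:

TEXTURE2D(_CameraDepthTexture);
SAMPLER(sampler_CameraDepthTexture);

Given a screen uv and the raw depth sampled at that uv, we build a position in normalized device coordinates (NDC), multiply it by the inverse view-projection matrix, and divide by w to obtain the world-space position.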

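float4 GetWorldPos(float2 uv)
{
    // Raw, nonlinear depth as stored in the depth buffer.
    float rawDepth = SAMPLE_TEXTURE2D_X(_CameraDepthTexture, sampler_CameraDepthTexture, UnityStereoTransformScreenSpaceTex(uv)).r;
#if defined(UNITY_REVERSED_Z)
    // Reversed-Z platforms store 1 at the near plane; flip back to 0 (near) .. 1 (far).
    rawDepth = 1 - rawDepth;
#endif
    // Remap uv and depth from [0, 1] to the [-1, 1] NDC cube used by camera.projectionMatrix (OpenGL convention).
    float4 ndc = float4(uv.xy * 2 - 1, rawDepth * 2 - 1, 1);
    // Unproject with the inverse view-projection matrix, then apply the perspective divide.
    float4 wPos = mul(_VPMatrix_invers, ndc);
    wPos /= wPos.w;
    return wPos;
}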
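To check the reconstruction math outside Unity, the round trip can be sketched on the CPU in Python: project a known world point to (uv, raw depth), then recover it with the same steps as GetWorldPos. The helper names (`project`, `get_world_pos`, `mat_inverse`), the camera placement, and the projection parameters are all assumptions for this sketch; the projection uses the OpenGL convention, like `camera.projectionMatrix`.

```python
import math

def mat_mul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)] for r in range(4)]

def mat_vec(m, v):
    return [sum(m[r][k] * v[k] for k in range(4)) for r in range(4)]

def mat_inverse(m):
    """Gauss-Jordan inverse of a 4x4 matrix with partial pivoting."""
    aug = [[float(x) for x in m[r]] + [1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
    for col in range(4):
        pivot = max(range(col, 4), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(4):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[4:] for row in aug]

def perspective(fov_y_deg, aspect, near, far):
    """OpenGL-style projection: NDC z in [-1, 1], camera looks down -z."""
    f = 1.0 / math.tan(math.radians(fov_y_deg) / 2.0)
    return [[f / aspect, 0.0, 0.0, 0.0],
            [0.0, f, 0.0, 0.0],
            [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
            [0.0, 0.0, -1.0, 0.0]]

def project(world, vp, reversed_z):
    """World position -> (uv, raw depth), i.e. what the depth buffer would hold."""
    clip = mat_vec(vp, [world[0], world[1], world[2], 1.0])
    ndc = [c / clip[3] for c in clip[:3]]
    uv = (ndc[0] * 0.5 + 0.5, ndc[1] * 0.5 + 0.5)
    depth01 = ndc[2] * 0.5 + 0.5
    return uv, (1.0 - depth01 if reversed_z else depth01)

def get_world_pos(uv, raw_depth, inv_vp, reversed_z):
    """Same steps as the shader: undo reversed-Z, build NDC, unproject, divide by w."""
    if reversed_z:
        raw_depth = 1.0 - raw_depth
    ndc = [uv[0] * 2 - 1, uv[1] * 2 - 1, raw_depth * 2 - 1, 1.0]
    w = mat_vec(inv_vp, ndc)
    return tuple(c / w[3] for c in w[:3])

# Assumed camera: positioned at (0, 1, 5), looking down -z, no rotation.
view = [[1, 0, 0, 0], [0, 1, 0, -1], [0, 0, 1, -5], [0, 0, 0, 1]]
vp = mat_mul(perspective(60.0, 16 / 9, 0.3, 100.0), view)
inv_vp = mat_inverse(vp)

point = (1.0, 2.0, -3.0)
uv, raw = project(point, vp, reversed_z=True)
restored = get_world_pos(uv, raw, inv_vp, reversed_z=True)
```

If `restored` matches `point` up to floating-point error, the uv remap, the reversed-Z flip, and the perspective divide are consistent with each other.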

The inverse view-projection matrix is supplied by the Renderer Feature:

Matrix4x4 vp_Matrix = renderingData.cameraData.camera.projectionMatrix * renderingData.cameraData.camera.worldToCameraMatrix;
m_Material.SetMatrix("_VPMatrix_invers", vp_Matrix.inverse);

In addition, the linear eye-space depth at a screen position can be obtained with the following function:

float GetEyeDepth(float2 uv)
{
    float rawDepth = SAMPLE_TEXTURE2D_X(_CameraDepthTexture, sampler_CameraDepthTexture, UnityStereoTransformScreenSpaceTex(uv)).r;
    return LinearEyeDepth(rawDepth, _ZBufferParams);
}
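For intuition about what LinearEyeDepth does, here is a small Python sketch of the formula `1 / (z * d + w)` together with the `_ZBufferParams` layout as documented by Unity. The helper `raw_depth_from_eye` is hypothetical and exists only to produce test inputs; the near and far values are arbitrary.

```python
def z_buffer_params(near, far, reversed_z):
    """(x, y, z, w) following Unity's documented _ZBufferParams layout."""
    if reversed_z:
        x = -1.0 + far / near
        y = 1.0
    else:
        x = 1.0 - far / near
        y = far / near
    return (x, y, x / far, y / far)

def linear_eye_depth(raw_depth, zparams):
    # Raw depth is nonlinear in distance; this maps it back to eye-space units.
    return 1.0 / (zparams[2] * raw_depth + zparams[3])

def raw_depth_from_eye(eye_z, near, far, reversed_z):
    """Hypothetical helper: the [0, 1] depth-buffer value for a given eye distance."""
    d = (1.0 / eye_z - 1.0 / near) / (1.0 / far - 1.0 / near)
    return 1.0 - d if reversed_z else d

near, far = 0.3, 100.0
params = z_buffer_params(near, far, reversed_z=True)
raw = raw_depth_from_eye(10.0, near, far, reversed_z=True)
eye = linear_eye_depth(raw, params)  # recovers the 10.0 units we started from
```

The same eye depth comes out for both depth conventions, which is why the shader can compare `sampleDepth` and `sampleZ` directly in eye-space units.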

half4 SSAO_Frag(Varyings i) : SV_Target
{
    ...
    float depth = GetEyeDepth(i.texcoord);
    ...
}

Getting the normal buffer requires adding a Renderer Feature:

[DisallowMultipleRendererFeature]
[Tooltip("The Scene Normals pass enables rendering to the CameraNormalsTexture if no other pass does it already.")]
internal class DepthNormals : ScriptableRendererFeature
{
    private SceneNormalsPass m_SceneNormalsPass = null;

    public override void Create()
    {
        if (m_SceneNormalsPass == null)
        {
            m_SceneNormalsPass = new SceneNormalsPass();
        }
    }

    public override void AddRenderPasses(ScriptableRenderer renderer, ref RenderingData renderingData)
    {
        m_SceneNormalsPass.Setup();
        renderer.EnqueuePass(m_SceneNormalsPass);
    }


    // The Scene Normals Pass
    private class SceneNormalsPass : ScriptableRenderPass
    {
        public void Setup()
        {
            ConfigureInput(ScriptableRenderPassInput.Normal); // all of this to just call this one line
            return;
        }

        public override void Execute(ScriptableRenderContext context, ref RenderingData renderingData) { }
    }
}

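The normal buffer can then be declared in the shader with the following macros:

TEXTURE2D(_CameraNormalsTexture);
SAMPLER(sampler_CameraNormalsTexture);

The following function then samples the world-space normal at a fragment's uv.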

float3 GetWorldNormal(float2 uv)
{
    float3 wNor = SAMPLE_TEXTURE2D(_CameraNormalsTexture, sampler_CameraNormalsTexture, uv).xyz; //world normal
    return wNor;
}

Randomly rotating the kernel

Random noise can be generated with the following functions:

float Hash(float2 p)
{
    return frac(sin(dot(p, float2(12.9898, 78.233))) * 43758.5453);
}

float3 GetRandomVec(float2 p)
{
    float3 vec = float3(0, 0, 0);
    vec.x = Hash(p) * 2 - 1;
    vec.y = Hash(p * p) * 2 - 1;
    vec.z = Hash(p * p * p) * 2 - 1;
    return normalize(vec);
}

float3 GetRandomVecHalf(float2 p)
{
    float3 vec = float3(0, 0, 0);
    vec.x = Hash(p) * 2 - 1;
    vec.y = Hash(p * p) * 2 - 1;
    vec.z = saturate(Hash(p * p * p) + 0.2);
    return normalize(vec);
}
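A Python port of the hash can help when experimenting with the noise offline. This is a sketch only: `hash21` and `random_vec` are names chosen here, and because Python uses double precision while the GPU typically uses 32-bit floats, the exact values will differ from the shader's.

```python
import math

def frac(x):
    return x - math.floor(x)

def hash21(px, py):
    """Port of Hash(float2): a cheap, deterministic pseudo-random value per input."""
    return frac(math.sin(px * 12.9898 + py * 78.233) * 43758.5453)

def random_vec(px, py):
    """Port of GetRandomVec: p * p is component-wise, as in HLSL."""
    v = (hash21(px, py) * 2 - 1,
         hash21(px * px, py * py) * 2 - 1,
         hash21(px ** 3, py ** 3) * 2 - 1)
    length = math.sqrt(sum(c * c for c in v)) or 1.0
    return tuple(c / length for c in v)

# The same uv always yields the same vector; neighbouring uvs yield unrelated ones.
a = random_vec(0.25, 0.75)
b = random_vec(0.25, 0.75)
c = random_vec(0.26, 0.75)
```

The hash is deterministic, so the noise pattern is fixed in screen space; it only looks random because nearby inputs produce very different outputs.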

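From the random vector we can then build a matrix that rotates the kernel:

float3 randomVec = GetRandomVec(i.texcoord);

// Gram-Schmidt: project randomVec onto the plane perpendicular to the normal.
float3 tangent = normalize(randomVec - worldNormal * dot(randomVec, worldNormal));
// Cross with the unit tangent (not randomVec) so that the bitangent is unit length as well.
float3 bitangent = cross(worldNormal, tangent);
float3x3 TBN = float3x3(tangent, bitangent, worldNormal);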
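The construction is a Gram-Schmidt step followed by a cross product. The Python sketch below (helper names are assumptions) builds the basis with `cross(normal, tangent)` for the bitangent and lets you verify that the three axes are unit length and mutually perpendicular, which is what a pure rotation requires.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(a):
    length = math.sqrt(dot(a, a))
    return tuple(x / length for x in a)

def build_tbn(normal, random_vec):
    n = normalize(normal)
    # Remove the part of random_vec that points along n, keep the rest as the tangent.
    proj = dot(random_vec, n)
    tangent = normalize(tuple(r - proj * c for r, c in zip(random_vec, n)))
    bitangent = cross(n, tangent)
    return tangent, bitangent, n

def tangent_to_world(v, tbn):
    # Equivalent to mul(v, TBN) when the rows of TBN are tangent, bitangent, normal.
    t, b, n = tbn
    return tuple(v[0] * t[i] + v[1] * b[i] + v[2] * n[i] for i in range(3))

tbn = build_tbn((0.0, 1.0, 0.0), (0.3, 0.5, -0.8))
up = tangent_to_world((0.0, 0.0, 1.0), tbn)  # tangent-space +z maps onto the normal
```

Changing the random vector rotates the tangent and bitangent around the normal while the normal itself stays fixed, which is exactly the per-pixel kernel rotation SSAO needs.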

Visualization

At this point we have all the data we need. Before sampling, it is worth visualizing each input to check for errors.

float3 worldPos = GetWorldPos(i.texcoord).xyz;
float3 worldNormal = GetWorldNormal(i.texcoord);
float3 randomVec = GetRandomVec(i.texcoord);
float depth = GetEyeDepth(i.texcoord);

float3 tangent = normalize(randomVec - worldNormal * dot(randomVec, worldNormal));
float3 bitangent = cross(worldNormal, tangent);
float3x3 TBN = float3x3(tangent, bitangent, worldNormal);

The following statements can be used for checking:

// Debug views: keep only one of these returns active at a time.
return float4(worldPos, 1.0);
return float4(worldNormal, 1.0);
return float4(randomVec, 1.0);

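If the data is correct, the results look like this:

Depth buffer

Normal buffer

Random vector

The random-vector view is simply noise. If the AO output is blank after sampling, check that the normal buffer is actually available: the normal-buffer Renderer Feature must be added and enabled, otherwise the second image above (the normal buffer) cannot be produced.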

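Sampling

With the data in place we can start sampling. First we define the sample count; _SampleCount controls the number of samples.

[unroll(x)] asks the compiler to unroll the loop, with x as the maximum number of iterations.

First, GetRandomVecHalf generates a sample inside the hemisphere kernel
Then the TBN matrix rotates it randomly
Then the view matrix and projection matrix, VMatrix and PMatrix, transform the sample from world space to clip space, and a remap gives its screen-space coordinates
sampleDepth stores the scene depth at the sample's screen position, for the comparison that follows
rangeCheck and selfCheck decide whether the sample counts as a valid occluder: rangeCheck fades out occluders whose depth is far from the sample, and selfCheck rejects surfaces that are not clearly in front of the current pixel

The full code is as follows: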

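float ao = 0;
int sampleCount = (int)_SampleCount;

// Sample kernel
[unroll(128)]
for (int s = 0; s < sampleCount; s++)
{
    // Random direction in the tangent-space hemisphere (z > 0).
    float3 sample = GetRandomVecHalf(s * i.texcoord);
    // Early samples stay close to the center so that nearby occluders weigh more.
    float scale = s / _SampleCount;
    scale = lerp(0.01f, 1.0f, scale * scale);

    sample *= scale * _Radius;
    // Fade out samples that lie very close to the center.
    float weight = smoothstep(0, 0.2, length(sample));
    // Tangent space -> world space, then offset from the pixel's world position.
    sample = mul(sample, TBN);
    sample += worldPos;

    // World space -> clip space -> screen uv in [0, 1].
    float4 offset = float4(sample, 1.0);
    offset = mul(_VMatrix, offset);
    offset = mul(_PMatrix, offset);
    offset.xy /= offset.w;
    offset.xy = offset.xy * 0.5 + 0.5;

    // Scene depth stored at the sample's screen position, as linear eye depth.
    float sampleDepth = SampleSceneDepth(offset.xy);
    sampleDepth = LinearEyeDepth(sampleDepth, _ZBufferParams);

    // Eye depth of the sample itself (clip-space w).
    float sampleZ = offset.w;
    // Attenuate occluders far from the sample; note that _RangeCheck = 0 zeroes every contribution.
    float rangeCheck = smoothstep(0, 1.0, _Radius / abs(sampleZ - sampleDepth) * _RangeCheck * 0.1);
    // Only count surfaces at least 0.08 units in front of the current pixel, to avoid self-occlusion.
    float selfCheck = (sampleDepth < depth - 0.08) ? 1 : 0;
    // The sample is occluded when the scene surface is closer to the camera than the sample.
    ao += (sampleDepth < sampleZ) ? 1 * rangeCheck * selfCheck * _AOInt * weight : 0;
}

// Average over the samples and invert: 1 = unoccluded, 0 = fully occluded.
ao = 1 - saturate((ao / sampleCount));
return ao;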
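The per-sample test is easier to reason about with concrete numbers. The Python sketch below mirrors the rangeCheck, selfCheck, and final averaging from the loop above; the function names and the example depths are assumptions chosen for illustration.

```python
def smoothstep(edge0, edge1, x):
    t = min(max((x - edge0) / (edge1 - edge0), 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)

def sample_occlusion(sample_z, scene_depth, center_depth,
                     radius, range_check, ao_int, weight):
    """Contribution of one sample; all depths are linear eye depths."""
    if scene_depth >= sample_z:
        return 0.0  # nothing in front of the sample: not occluded
    # Occluders far away in depth relative to the radius are faded out.
    range_factor = smoothstep(0.0, 1.0, radius / (sample_z - scene_depth) * range_check * 0.1)
    # The occluding surface must be clearly in front of the center pixel.
    self_check = 1.0 if scene_depth < center_depth - 0.08 else 0.0
    return range_factor * self_check * ao_int * weight

def resolve_ao(contributions):
    average = sum(contributions) / len(contributions)
    return 1.0 - min(max(average, 0.0), 1.0)  # 1 = unoccluded, 0 = fully occluded

near = sample_occlusion(10.0, 9.9, 10.05, radius=0.5, range_check=5.0, ao_int=1.0, weight=1.0)
far = sample_occlusion(10.0, 2.0, 10.05, radius=0.5, range_check=5.0, ao_int=1.0, weight=1.0)
off = sample_occlusion(10.0, 9.9, 10.05, radius=0.5, range_check=0.0, ao_int=1.0, weight=1.0)
ao = resolve_ao([near, far, 0.0, 0.0])
```

Here `near` is a close occluder and contributes fully, `far` is a distant foreground object and contributes almost nothing, and `off` shows that a range-check parameter of 0 removes the contribution entirely.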

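This gives us the first AO pass. The blur passes and the color-blending pass come next. The result so far looks like this:

Multi-pass processing

Shader

Because we still need to implement the blur passes, we first organize the shader code into the structure below, which makes multiple passes easier to manage.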

Shader "Hidden/SSAO"
{
    Properties
    {
        _MainTex ("Main Texture", 2D) = "white" {}
        _aoColor("aoColor", Color) = (1,1,1,1)
    }
    HLSLINCLUDE
    #include "Packages/com.unity.render-pipelines.universal/ShaderLibrary/Core.hlsl"
    #include "Packages/com.unity.render-pipelines.universal/ShaderLibrary/DeclareDepthTexture.hlsl"
    ENDHLSL
    
    SubShader
    {
        Tags { "RenderType"="Opaque" "RenderPipeline" = "UniversalPipeline" }
        Cull Off ZWrite Off ZTest Always
        
        Pass
        {
            Name "SSAO"
            HLSLPROGRAM
            #pragma vertex SSAO_Vert
            #pragma fragment SSAO_Frag
            #include "SSAOPass.hlsl"
            ENDHLSL
        }

        Pass
        {
            Name "Vertical Blur"
            HLSLPROGRAM
            #pragma vertex vertBlurVertical
            #pragma fragment fragBlur
            #include "BlurPass.hlsl"
            ENDHLSL
        }
        
        Pass
        {
            Name "Horizontal Blur"
            HLSLPROGRAM
            #pragma vertex vertBlurHorizontal
            #pragma fragment fragBlur
            #include "BlurPass.hlsl"
            ENDHLSL
        }
        
        Pass
        {
            Name "Final SSAO"
            HLSLPROGRAM
            #pragma vertex SSAO_Vert
            #pragma fragment Final_Frag
            #include "SSAOPass.hlsl"
            ENDHLSL
        }
        
    }
}

Render Feature

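For questions about URP Renderer Features, see my previous article on Custom URP Renderer Feature.

To make the post-processing effect work, the Render Feature needs some further changes.

First, modify the EffectComponent and add the parameters we want to control. Note that rangeCheck defaults to 0 here; with that value the range-check factor in the AO pass evaluates to 0, so it has to be raised before any occlusion becomes visible.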

[Serializable]
[VolumeComponentMenuForRenderPipeline("Custom/SSAO", typeof(UniversalRenderPipeline))]
public class SSAOEffectComponent : VolumeComponent, IPostProcessComponent
{
    public ClampedFloatParameter radius = new ClampedFloatParameter(0.5f, 0f, 0.8f,  true);
    public NoInterpColorParameter color = new NoInterpColorParameter(Color.black);
    public ClampedIntParameter sampleCount = new ClampedIntParameter(22, 1, 128);
    public ClampedFloatParameter rangeCheck = new ClampedFloatParameter(0f, 0f, 10f);
    public ClampedFloatParameter aoInt  = new ClampedFloatParameter(1f, 0f, 3f);
    
    public ClampedFloatParameter blurSize = new ClampedFloatParameter(1f, 0f, 10f);
    
    public bool IsActive() => true;
    public bool IsTileCompatible() => false;
}

Then modify the pass: add two more render textures and allocate them.

private readonly int temporaryRTId_1 = Shader.PropertyToID("_BlurRT1");
private readonly int temporaryRTId_2 = Shader.PropertyToID("_BlurRT2");


public override void OnCameraSetup(CommandBuffer cmd, ref RenderingData renderingData)
{
    RenderTextureDescriptor descriptor = renderingData.cameraData.cameraTargetDescriptor;
    descriptor.depthBufferBits = 0;

    m_Source = renderingData.cameraData.renderer.cameraColorTarget;

    cmd.GetTemporaryRT(temporaryRTId_0, descriptor, FilterMode.Bilinear);
    cmd.GetTemporaryRT(temporaryRTId_1, descriptor, FilterMode.Bilinear);
    cmd.GetTemporaryRT(temporaryRTId_2, descriptor, FilterMode.Bilinear);

    m_Destination = new RenderTargetIdentifier(temporaryRTId_0);
    m_BlurBuffer1 = new RenderTargetIdentifier(temporaryRTId_1);
    m_BlurBuffer2 = new RenderTargetIdentifier(temporaryRTId_2);
}

Next, modify the Render function of the pass:

public void Render(CommandBuffer cmd, ref RenderingData renderingData)
{
    m_Material.SetColor("_aoColor", m_Effect.color.value);

    Matrix4x4 vp_Matrix = renderingData.cameraData.camera.projectionMatrix * renderingData.cameraData.camera.worldToCameraMatrix;
    m_Material.SetMatrix("_VPMatrix_invers", vp_Matrix.inverse);
    Matrix4x4 v_Matrix = renderingData.cameraData.camera.worldToCameraMatrix;
    m_Material.SetMatrix("_VMatrix", v_Matrix);
    Matrix4x4 p_Matrix = renderingData.cameraData.camera.projectionMatrix;
    m_Material.SetMatrix("_PMatrix", p_Matrix);

    m_Material.SetFloat("_SampleCount", m_Effect.sampleCount.value);
    m_Material.SetFloat("_Radius", m_Effect.radius.value);
    m_Material.SetFloat("_RangeCheck", m_Effect.rangeCheck.value);
    m_Material.SetFloat("_AOInt", m_Effect.aoInt.value);

    m_Material.SetFloat("_BlurSize", m_Effect.blurSize.value);


    Blit(cmd, m_Source, m_Destination);

    // Test SSAO
    //Blit(cmd, m_Destination, m_Source, m_Material, 0);

    // Test Blur
    // Blit(cmd, m_Destination, m_BlurBuffer1, m_Material, 0);
    // cmd.SetGlobalTexture("_AOTex", m_BlurBuffer1);  
    // Blit(cmd, m_BlurBuffer1, m_BlurBuffer2, m_Material, 1);
    // cmd.SetGlobalTexture("_AOTex", m_BlurBuffer2);  
    // Blit(cmd, m_Destination, m_Source, m_Material, 2);

    // Final
    Blit(cmd, m_Destination, m_BlurBuffer1, m_Material, 0);
    cmd.SetGlobalTexture("_AOTex", m_BlurBuffer1);  
    Blit(cmd, m_BlurBuffer1, m_BlurBuffer2, m_Material, 1);
    cmd.SetGlobalTexture("_AOTex", m_BlurBuffer2);  
    Blit(cmd, m_BlurBuffer2, m_BlurBuffer1, m_Material, 2);
    cmd.SetGlobalTexture("_AOTex", m_BlurBuffer1);  
    Blit(cmd, m_Destination, m_Source, m_Material, 3);

}

Finally, release the temporary textures we requested in OnCameraCleanup.

public override void OnCameraCleanup(CommandBuffer cmd)
{
    cmd.ReleaseTemporaryRT(temporaryRTId_0);
    cmd.ReleaseTemporaryRT(temporaryRTId_1);
    cmd.ReleaseTemporaryRT(temporaryRTId_2);
}

Blur

The blur used here is a Gaussian blur. It is applied once vertically and once horizontally, so it needs two passes. The code is as follows:

#ifndef BLUR_INCLUDED
#define BLUR_INCLUDED

    #include "Packages/com.unity.render-pipelines.universal/ShaderLibrary/Core.hlsl"

    CBUFFER_START(UnityPerMaterial)
    float4 _MainTex_ST;
    float4  _AOTex_TexelSize;
    float _BlurSize;
    CBUFFER_END

    TEXTURE2D(_MainTex);
    SAMPLER(sampler_MainTex);

    TEXTURE2D(_AOTex);
    SAMPLER(sampler_AOTex);

    struct Attribute
    {
        float4 position : POSITION;
        float2 uv : TEXCOORD0;
    };

    struct Varyings
    {
        float4 position : SV_POSITION;
        float2 uv[5]  : TEXCOORD0;
    };

    Varyings vertBlurHorizontal(Attribute v)
    {
        Varyings o;
        VertexPositionInputs  PositionInputs = GetVertexPositionInputs(v.position.xyz);
        o.position = PositionInputs.positionCS;
        
        float2 uv = v.uv;
        
        o.uv[0] = uv;
        o.uv[1] = uv + float2(_AOTex_TexelSize.x * 1.0, 0.0) * _BlurSize;
        o.uv[2] = uv - float2(_AOTex_TexelSize.x * 1.0, 0.0) * _BlurSize;
        o.uv[3] = uv + float2(_AOTex_TexelSize.x * 2.0, 0.0) * _BlurSize;
        o.uv[4] = uv - float2(_AOTex_TexelSize.x * 2.0, 0.0) * _BlurSize;
					 
        return o;     
    }

    Varyings vertBlurVertical(Attribute v)
    {
        Varyings o;
        VertexPositionInputs  PositionInputs = GetVertexPositionInputs(v.position.xyz);
        o.position = PositionInputs.positionCS;

        float2 uv = v.uv;
			
        o.uv[0] = uv;
        o.uv[1] = uv + float2(0.0, _AOTex_TexelSize.y * 1.0) * _BlurSize;
        o.uv[2] = uv - float2(0.0, _AOTex_TexelSize.y * 1.0) * _BlurSize;
        o.uv[3] = uv + float2(0.0, _AOTex_TexelSize.y * 2.0) * _BlurSize;
        o.uv[4] = uv - float2(0.0, _AOTex_TexelSize.y * 2.0) * _BlurSize;
					 
        return o;
    }


    half4 fragBlur(Varyings i) : SV_Target {
        // 5-tap Gaussian weights: center, +/-1 texel, +/-2 texels (they sum to 1).
        float weight[3] = {0.4026, 0.2442, 0.0545};

        // Take .rgb explicitly to avoid an implicit float4 -> float3 truncation.
        float3 sum = SAMPLE_TEXTURE2D(_AOTex, sampler_AOTex, i.uv[0]).rgb * weight[0];

        for (int it = 1; it < 3; it++) {
            sum += SAMPLE_TEXTURE2D(_AOTex, sampler_AOTex, i.uv[it*2-1]).rgb * weight[it];
            sum += SAMPLE_TEXTURE2D(_AOTex, sampler_AOTex, i.uv[it*2]).rgb * weight[it];
        }

        return float4(sum, 1.0);
    }


    

#endif //BLUR_INCLUDED
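The two-pass structure can be checked on a tiny image in Python. This sketch uses the same five weights, clamps at the borders, and steps by whole texels; the integer `step` is a simplification of `_BlurSize`, which in the shader scales the uv offset and relies on bilinear filtering. Function names are assumptions.

```python
WEIGHTS = (0.4026, 0.2442, 0.0545)  # center, +/-1 texel, +/-2 texels

def blur_axis(image, horizontal, step=1):
    """One 5-tap pass along a single axis, clamping reads at the image border."""
    h, w = len(image), len(image[0])

    def texel(x, y):
        return image[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = texel(x, y) * WEIGHTS[0]
            for k in (1, 2):
                dx, dy = (k * step, 0) if horizontal else (0, k * step)
                total += (texel(x + dx, y + dy) + texel(x - dx, y - dy)) * WEIGHTS[k]
            out[y][x] = total
    return out

def gaussian_blur(image):
    # Vertical pass first, then horizontal, matching the pass order in the shader.
    return blur_axis(blur_axis(image, horizontal=False), horizontal=True)

# A white AO buffer with a single fully occluded pixel in the middle.
ao = [[1.0] * 7 for _ in range(7)]
ao[3][3] = 0.0
blurred = gaussian_blur(ao)
```

Two 5-tap passes cost 10 texture reads per pixel, whereas the equivalent 5x5 kernel in a single pass would need 25, which is the reason for splitting the blur.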

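In addition, the Render Feature needs to set the parameter used above:

m_Material.SetFloat("_BlurSize", m_Effect.blurSize.value);

The final result is shown below:

Blending

Finally, we write a fragment shader that blends the original scene color with the AO. The code is as follows: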

half4 Final_Frag(Varyings i) : SV_Target
{
    half4 scrTex = SAMPLE_TEXTURE2D(_MainTex, sampler_MainTex, i.texcoord);
    half4 aoTex = SAMPLE_TEXTURE2D(_AOTex, sampler_AOTex, i.texcoord);

    // _aoColor is the tint set from the Render Feature; ao = 1 keeps the scene color, ao = 0 applies the full tint.
    half4 finalCol = lerp(scrTex * _aoColor, scrTex, aoTex.r);
    return finalCol;
}
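The blend is a plain linear interpolation driven by the AO value. A minimal Python sketch (names and colors are assumptions) shows the two extremes and a midpoint:

```python
def lerp(a, b, t):
    return a + (b - a) * t

def blend_ao(scene_rgb, ao_color, ao):
    """ao = 1 leaves the scene untouched; ao = 0 multiplies it by the AO tint."""
    return tuple(lerp(s * c, s, ao) for s, c in zip(scene_rgb, ao_color))

scene = (0.8, 0.6, 0.4)
black = (0.0, 0.0, 0.0)

open_area = blend_ao(scene, black, 1.0)  # unoccluded: scene color unchanged
crevice = blend_ao(scene, black, 0.0)    # fully occluded: tinted to black
half = blend_ao(scene, black, 0.5)       # halfway: half brightness with a black tint
```

With a black tint the result is simply the scene color scaled by the AO value; a colored tint shifts occluded areas toward that color instead.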

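The two images below show the scene with SSAO disabled and with SSAO enabled. With SSAO enabled, shadows appear in the occluded areas of objects, and the image has a stronger sense of depth than the one without SSAO.

Further reading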

SSAO Tutorial