If you're in my situation, with an overall good graphics card that just lacks floating-point render target support (which is what makes HDR possible), and you want to both achieve and be able to see HDR-ish effects, then you must get creative.
The first step in faking HDR is understanding how HDR works.
Table of contents
- True HDR
- Faking it
- Example implementation
- Shader code dump
- Fragment program for pre-bloom processing
- Fragment program for final composition stage
- Vertex/Fragment program for rendered objects
- Fragment programs for first separable convolution stage
- Convolution matrix for second separable convolution stage
- Convolution matrix for third separable convolution stage
- Convolution matrix for fourth separable convolution stage
- Screenshots
True HDR
In short, HDR works by rendering to an offscreen buffer in a higher-precision, higher-range pixel format, such as a floating-point format. The objective is to avoid having to saturate brighter-than-white pixels. Instead, precise lighting information is retained for post-processing.
Thus, objects can be as dark or as bright as they need to be, without having to "look all black" or "look all white". Why don't they? Because after the whole frame is rendered, a post-processing phase computes an appropriate exposure level and adjusts the lighting accordingly.
After exposure control, though, things may look all black, or all white. With high exposure levels, this excessive contrast may look unnatural. How does one fix that problem? Well...
Emulation of lens aberrations
When taking a picture, lens aberrations produce a very subtle but noticeable bloom effect on the result. If you apply such an effect to a saturated picture, you get the classic oneiric, dreamlike look. Now go one step deeper, and apply the bloom effect before saturating the image, while there are still whiter-than-white pixels. What do you get? A realistic "glow". This is HDR, when used with taste.
So... the process
Simple.
- Render to a floating point framebuffer, with a mixture of ultra-bright and dark colors, as realistic as possible.
- Select an appropriate exposure level, and adjust the framebuffer accordingly (multiply by an appropriate scalar).
- Apply bloom.
- Transfer the floating point framebuffer to the main framebuffer (the backbuffer), converting formats and saturating on the way (usually done automatically by the hardware).
You're done.
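The four steps above can be sketched on the CPU. What follows is a minimal Python sketch, not the real thing: it works on a single row of luminance values, uses a 3-pixel box blur as a stand-in for the bloom filter, and every function name in it is made up for illustration. It also runs the same bloom after saturation, to show why the order matters.

```python
def box_blur(row, radius=1):
    """Tiny stand-in for a bloom filter: average each pixel with its neighbours."""
    out = []
    for i in range(len(row)):
        lo, hi = max(0, i - radius), min(len(row), i + radius + 1)
        window = row[lo:hi]
        out.append(sum(window) / len(window))
    return out

def saturate(value):
    """Clamp to the displayable [0, 1] range, as the backbuffer would."""
    return min(max(value, 0.0), 1.0)

def hdr_pipeline(row, exposure):
    exposed = [v * exposure for v in row]   # exposure control on unsaturated data
    bloomed = box_blur(exposed)             # bloom while brighter-than-white survives
    return [saturate(v) for v in bloomed]   # saturate only at the very end

def ldr_pipeline(row, exposure):
    saturated = [saturate(v * exposure) for v in row]  # range is lost here
    return box_blur(saturated)                         # bloom has nothing left to spread

scene = [0.0, 0.0, 8.0, 0.0, 0.0]   # one very bright pixel on black
hdr_row = hdr_pipeline(scene, 1.0)
ldr_row = ldr_pipeline(scene, 1.0)
```

With the bright pixel at 8.0, blooming before saturation pushes the neighbours to full white, while blooming after saturation leaves them at a dull third of white.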
Faking it
Let's see...
- Render to a floating point framebuffer, with a mixture of ultra-bright and dark colors, as realistic as possible.
Oops... I can't use floating-point framebuffers.
So... No HDR? Wrong!
The previous line should be read, actually, as:
- Render to a temporary framebuffer, with extended range, and with a mixture of ultra-bright and dark colors, as realistic as possible.
There are ways to accomplish that without floating point framebuffers. The way I decided to use is a mockup of the sRGB standard, tweaked for my own requirements.
Basically, instead of storing alpha, the alpha channel now stores a scaling factor s. The tuple (r,g,b,s) then translates to the color (r,g,b)*(1+4*s^2). This gives us the ability to represent color components in the range [0,5], with a maximum contrast ratio of 1280:1, which is not that bad. Of course, lots of precision issues arise in reality, so those 1280:1 are of academic interest only. But still, in practice, the extra range does make a difference.
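To make the arithmetic concrete, here is a small Python sketch of the decoding step. It mirrors what `expand_Hdr` does in the shader dump further down, except that it returns only the color part; the function name `expand_hdr` and the 8-bit assumption behind the contrast figure are mine.

```python
def expand_hdr(r, g, b, s):
    """Decode an (r, g, b, s) tuple: scale the color by 1 + 4*s^2."""
    scale = 1.0 + 4.0 * s * s
    return (r * scale, g * scale, b * scale)

# s = 0 leaves the color alone, s = 1 stretches it five-fold.
plain = expand_hdr(0.5, 0.25, 0.0, 0.0)
brightest = expand_hdr(1.0, 1.0, 1.0, 1.0)

# Assuming 8-bit channels, the smallest nonzero component is 1/255, so the
# ratio between brightest and darkest nonzero value lands near the quoted 1280:1.
contrast = brightest[0] / (1.0 / 255.0)
```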
All shaders in materials used to render to such a buffer will have to be modified, so that:
- They do not saturate the specular component - the one most likely to end up in the extended range
- They perform a final translation pass on the output, prior to any kind of saturation, that encodes the r,g,b,s tuple. Ideally, they would select the lowest s component that prevents saturation of the rgb part, but since that is difficult in pixel shaders (without table lookups, which are possible; have fun, this is left as an exercise to the reader), approximate solutions will have to suffice.
- They take equally encoded textures and colors (both diffuse and specular) - this is crucial to be able to model any kind of visually appealing scene.
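Outside a pixel shader, picking the lowest s is easy enough. The following Python sketch shows one way it could be done on the CPU, for instance to pre-encode texture or material colors; it is not what the example shaders do (they use a fixed s), and the names `encode_hdr` and `decode_hdr` are hypothetical.

```python
import math

def encode_hdr(r, g, b):
    """Pick the lowest s in [0, 1] that keeps every component within [0, 1]."""
    peak = max(r, g, b)
    if peak <= 1.0:
        s = 0.0                                       # fits already, no scaling needed
    else:
        s = min(1.0, math.sqrt((peak - 1.0) / 4.0))   # solve 1 + 4*s^2 = peak
    scale = 1.0 + 4.0 * s * s
    # Anything brighter than 5.0 cannot be represented and saturates here.
    stored = [min(max(c / scale, 0.0), 1.0) for c in (r, g, b)]
    return (stored[0], stored[1], stored[2], s)

def decode_hdr(r, g, b, s):
    """Inverse of encode_hdr for colors within the representable range."""
    scale = 1.0 + 4.0 * s * s
    return (r * scale, g * scale, b * scale)

dim = encode_hdr(0.5, 0.2, 0.1)      # stays as it is, with s = 0
bright = encode_hdr(2.0, 1.0, 0.0)   # needs s = 0.5, which doubles the range
too_bright = encode_hdr(10.0, 0.0, 0.0)
```

Note that an 8-bit alpha channel would quantize s, so a real encoder would probably round s up to the next representable value before dividing.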
After this, the rest of the HDR process goes on as usual, although the bloom stage calls for some interesting implementations, since it must process r,g,b,s-encoded data.
Avoiding excessive bloom
However, due to the reduced range of the data, it is very useful to do some preprocessing of the picture before applying the bloom. Thus, the bloom is performed in a strange way: additively. Instead of blooming the picture and using the bloomed picture as a replacement, a copy of the picture gets its low-intensity data removed (to avoid excessive blooming) and is bloom-filtered, and the result is then added to the original picture. This effectively avoids excessive blooming and produces utterly good-looking results ;-)
See the example implementation for more details on this part.
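As a rough illustration of the additive scheme, here is a Python sketch on a single row of already-decoded intensities. The threshold matches `gloomStart` in the shader dump; the 3-tap blur, the function names, and the explicit `max(..., 0.0)` (standing in for the clamp a fixed-point render target would apply) are assumptions of the sketch.

```python
GLOOM_START = 0.95   # same threshold as gloomStart in the pre-bloom shader

def blur3(row):
    """Small 1-2-1 blur with clamp-to-edge addressing, standing in for the bloom filter."""
    last = len(row) - 1
    return [0.25 * row[max(i - 1, 0)] + 0.5 * row[i] + 0.25 * row[min(i + 1, last)]
            for i in range(len(row))]

def additive_bloom(row, intensity=1.0):
    bright = [max(v - GLOOM_START, 0.0) for v in row]   # remove low-intensity data
    glow = blur3(bright)                                # bloom only what is left
    # Add the glow on top of the untouched original, then saturate for display.
    return [min(v + intensity * g, 1.0) for v, g in zip(row, glow)]

normal_row = [0.2, 0.5, 0.9, 0.5, 0.2]    # nothing above the threshold
hot_row = [0.2, 0.5, 1.95, 0.5, 0.2]      # one brighter-than-white pixel
normal_out = additive_bloom(normal_row)
hot_out = additive_bloom(hot_row)
```

The row with normal exposure comes out untouched, which is the whole point: only the over-bright pixel leaks light onto its neighbours.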
Back to floating-point?
Now... keep all things equal, and go back to using floating-point textures (if you can), removing, of course, everything related to the sRGB-ish conversion. You'll see a remarkable improvement over standard HDR. Why? Basically, because you retained the measures taken to avoid excessive bloom, and are indeed avoiding it. Although it is possible with floating-point textures to use standard HDR techniques and achieve a similar, well-balanced effect, it is much harder. By applying the transfer procedure mentioned earlier, which removes low-intensity content, any image content within a normal level of exposure will look sharp and clean.
Without floating-point, though, I'm having lots of precision issues
Inevitable... but you may want to try using higher-precision fixed-point formats like A2R10G10B10 for the temporary textures used in the bloom filter stages, to reduce those.
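To get a feel for how much two extra bits per channel buy, here is a small Python sketch that measures the worst rounding error when storing a value in [0, 1] with 8 or 10 bits. It only models simple round-to-nearest quantization, which is an assumption, and it assumes the data was already decoded before the bloom stages (A2R10G10B10 leaves only 2 bits for alpha, so there is no room for a useful s there).

```python
def quantize(value, bits):
    """Store a [0, 1] value in an unsigned fixed-point channel and read it back."""
    levels = (1 << bits) - 1
    clamped = min(max(value, 0.0), 1.0)
    return round(clamped * levels) / levels

def worst_error(bits, samples=10000):
    """Largest round-trip error over an evenly spaced set of test values."""
    return max(abs(quantize(i / samples, bits) - i / samples)
               for i in range(samples + 1))

error_8 = worst_error(8)     # bounded by half a step of 1/255
error_10 = worst_error(10)   # bounded by half a step of 1/1023
```

Every blur stage rounds its output again, so errors pile up across the four stages; a smaller per-stage error is what the 10-bit targets would be buying.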
Example implementation
Here's a RenderMonkey project implementing a very dumbed-down version of the approach outlined above. Even though it is simple, it still achieves a good, realistic look. Simplifications include a very simple colorspace translation phase in the pixel shader of rendered objects (a fixed s component that allows only a mild range extension, and only for the specular highlight), sRGB-ish decoding prior to the bloom effects (so that the bloom filter can be applied as normal, instead of having to work natively with sRGB-ish data; this introduces some precision issues, but nothing too worrisome), and who knows what else (I sure don't ;-) ).
NOTE: It uses standard resources that come with RenderMonkey itself.
Shader code dump
Notes:
- If you don't understand how those are used, see the RenderMonkey project for enlightenment.
- Don't forget to pay close attention to minification/magnification settings - they do matter a lot.
- Also pay attention to intermediate texture resolution - you can play with it, but not much.
- Only the first convolution stage will be fully dumped... the rest involve only replacing the convolution matrix, and updating the texture dimension constant.
- Just in case you don't notice, they're all GLSL shaders.
Fragment program for pre-bloom processing
    uniform float Exposure;
    uniform sampler2D SrcColor;
    uniform sampler2D SrcHDR;

    varying vec2 texCoord;

    // Intensity that gets subtracted before blooming (removes low-intensity data)
    const vec4 gloomStart = vec4(0.95,0.95,0.95,0.95);

    float sqr(float x) { return x*x; }
    vec4 sqr(vec4 x) { return x*x; }

    // Decode r,g,b,s: scale by (1 + 4*s^2), with s stored in alpha
    vec4 expand_Hdr(vec4 color)
    {
        return color*(sqr(color.a*2.0)+1.0);
    }

    void main(void)
    {
        vec4 color = texture2D(SrcColor,texCoord);
        gl_FragColor = expand_Hdr(color*Exposure)-gloomStart;
    }
Fragment program for final composition stage
    uniform float Exposure;
    uniform sampler2D SrcColor;
    uniform sampler2D SrcHDR1;
    uniform sampler2D SrcHDR2;
    uniform sampler2D SrcHDR3;
    uniform sampler2D SrcHDR4;
    uniform sampler2D Measure;
    uniform vec4 MipMix;

    float gloomIntensity=1.0;

    varying vec2 texCoord;

    float sqr(float x) { return x*x; }
    vec4 sqr(vec4 x) { return x*x; }

    // Decode r,g,b,s: scale by (1 + 4*s^2), with s stored in alpha
    vec4 expand_Hdr(vec4 color)
    {
        return color*(sqr(color.a*2.0)+1.0);
    }

    void main(void)
    {
        vec4 color = texture2D(SrcColor,texCoord);

        // Each blur stage is one matrix column; multiplying by MipMix
        // yields the weighted sum of the four stages
        vec4 gloom = mat4(
            texture2D(SrcHDR1,texCoord),
            texture2D(SrcHDR2,texCoord),
            texture2D(SrcHDR3,texCoord),
            texture2D(SrcHDR4,texCoord) ) * MipMix;

        // Additive bloom: original picture plus the blurred highlights
        gl_FragColor = (expand_Hdr(color*Exposure)+gloom*16.0*gloomIntensity)*Exposure;
    }
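In case the `mat4(...) * MipMix` trick in the composition shader looks cryptic: the four blur-stage samples become the columns of a matrix, and multiplying by a vector gives the sum of the columns weighted by its components. A plain Python equivalent, with a made-up function name, would be something like this:

```python
def mix_gloom(samples, mip_mix):
    """Weighted sum of four RGBA samples, one weight per blur stage.

    samples: four (r, g, b, a) tuples, in the order SrcHDR1..SrcHDR4
    mip_mix: four weights, playing the role of the MipMix uniform
    """
    return tuple(sum(weight * sample[channel]
                     for sample, weight in zip(samples, mip_mix))
                 for channel in range(4))

stage_samples = [
    (0.8, 0.8, 0.8, 1.0),   # sharpest blur stage
    (0.4, 0.4, 0.4, 1.0),
    (0.2, 0.2, 0.2, 1.0),
    (0.1, 0.1, 0.1, 1.0),   # widest blur stage
]
only_first = mix_gloom(stage_samples, (1.0, 0.0, 0.0, 0.0))
even_mix = mix_gloom(stage_samples, (0.25, 0.25, 0.25, 0.25))
```

So `MipMix` is just a way to balance tight glows against wide ones without touching the shader.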
Vertex/Fragment program for rendered objects
Vertex program
    uniform vec3 LightDir;
    uniform vec4 vViewPosition;
    uniform mat4 matViewProjection;

    varying vec2 texCoord;
    varying vec3 normal;
    varying vec3 lightDirInTangent;
    varying vec3 viewDirInTangent;

    attribute vec3 rm_Tangent;
    attribute vec3 rm_Binormal;

    void main(void)
    {
        texCoord = gl_MultiTexCoord0.xy;

        // vector * matrix projects the vector onto tangent, binormal and normal
        mat3 tangentMat = mat3(rm_Tangent, rm_Binormal, gl_Normal);
        lightDirInTangent = normalize(LightDir) * tangentMat;

        // Uses gl_Vertex: gl_Position is undefined until it is written below.
        // Assumes vViewPosition is given in the same space as gl_Vertex.
        viewDirInTangent = normalize(vViewPosition-gl_Vertex).xyz * tangentMat;

        gl_Position = ftransform();
    }
Fragment program
    uniform sampler2D BumpMap;
    uniform sampler2D ObjectMap;
    uniform sampler2D SpecMap;
    uniform float Shininess;
    uniform float SpecularIntensity;

    varying vec2 texCoord;
    varying vec3 lightDirInTangent;
    varying vec3 viewDirInTangent;

    void main(void)
    {
        vec3 n_lightDirInTangent = -normalize(lightDirInTangent);
        vec3 n_viewDirInTangent = normalize(viewDirInTangent);
        vec3 bump = normalize(texture2D(BumpMap,texCoord).xyz * 2.0 - 1.0);

        float lighting = dot(bump,n_lightDirInTangent);
        float blighting= n_lightDirInTangent.z;
        float specular = dot(-reflect(n_lightDirInTangent,bump),n_viewDirInTangent);

        vec4 texColor = texture2D(ObjectMap,texCoord);
        vec4 specColor = texture2D(SpecMap, texCoord);

        // Specular is left unsaturated on purpose; max() only guards pow(),
        // which is undefined for a negative base
        vec4 color = texColor * lighting
            + float(lighting>0.0)*float(blighting>0.0)
            * SpecularIntensity*pow(max(specular,0.0),Shininess)*specColor;

        // Fixed s component: halve the color, store a constant scaling factor in alpha
        gl_FragColor = vec4(color.xyz*0.5,0.414);
    }
Fragment programs for first separable convolution stage
Horizontal
    uniform sampler2D Src;

    varying vec2 texCoord;

    const float texDimension = 512.0;
    const float texScaler = 1.0/texDimension;
    const float texOffset = -0.5/texDimension;

    void main(void)
    {
        vec4 color = vec4(0.0,0.0,0.0,0.0);

        // Filter weights; they add up to 2.0, compensated by the final *0.5
        const float gauss0 = 1.0/32.0;
        const float gauss1 = 5.0/32.0;
        const float gauss2 =15.0/32.0;
        const float gauss3 =22.0/32.0;
        const float gauss4 =15.0/32.0;
        const float gauss5 = 5.0/32.0;
        const float gauss6 = 1.0/32.0;

        // xy = offset of each tap in texture coordinates, w = its weight
        vec4 gaussFilter[7];
        gaussFilter[0] = vec4( -3.0*texScaler , 0.0, 0.0, gauss0);
        gaussFilter[1] = vec4( -2.0*texScaler , 0.0, 0.0, gauss1);
        gaussFilter[2] = vec4( -1.0*texScaler , 0.0, 0.0, gauss2);
        gaussFilter[3] = vec4(  0.0*texScaler , 0.0, 0.0, gauss3);
        gaussFilter[4] = vec4( +1.0*texScaler , 0.0, 0.0, gauss4);
        gaussFilter[5] = vec4( +2.0*texScaler , 0.0, 0.0, gauss5);
        gaussFilter[6] = vec4( +3.0*texScaler , 0.0, 0.0, gauss6);

        int i;
        for (i=0;i<7;i++)
            color += texture2D(Src, texCoord + gaussFilter[i].xy) * gaussFilter[i].w;

        gl_FragColor = color*0.5;
    }
Vertical
    uniform sampler2D Src;

    varying vec2 texCoord;

    const float texDimension = 512.0;
    const float texScaler = 1.0/texDimension;
    const float texOffset = -0.5/texDimension;

    void main(void)
    {
        vec4 color = vec4(0.0,0.0,0.0,0.0);

        // Filter weights; they add up to 2.0, compensated by the final *0.5
        const float gauss0 = 1.0/32.0;
        const float gauss1 = 5.0/32.0;
        const float gauss2 =15.0/32.0;
        const float gauss3 =22.0/32.0;
        const float gauss4 =15.0/32.0;
        const float gauss5 = 5.0/32.0;
        const float gauss6 = 1.0/32.0;

        // Same taps as the horizontal pass; .yxzw moves the offset from x to y
        vec4 gaussFilter[7];
        gaussFilter[0] = vec4( -3.0*texScaler , 0.0, 0.0, gauss0).yxzw;
        gaussFilter[1] = vec4( -2.0*texScaler , 0.0, 0.0, gauss1).yxzw;
        gaussFilter[2] = vec4( -1.0*texScaler , 0.0, 0.0, gauss2).yxzw;
        gaussFilter[3] = vec4(  0.0*texScaler , 0.0, 0.0, gauss3).yxzw;
        gaussFilter[4] = vec4( +1.0*texScaler , 0.0, 0.0, gauss4).yxzw;
        gaussFilter[5] = vec4( +2.0*texScaler , 0.0, 0.0, gauss5).yxzw;
        gaussFilter[6] = vec4( +3.0*texScaler , 0.0, 0.0, gauss6).yxzw;

        for (int i=0;i<7;i++)
            color += texture2D(Src, texCoord + gaussFilter[i].xy) * gaussFilter[i].w;

        gl_FragColor = color*0.5;
    }
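If you wonder why the blur is split into a horizontal and a vertical program: running the two 7-tap passes one after the other gives the same result as a full 7x7 filter, at 14 texture reads per pixel instead of 49. The Python sketch below checks that on a single white pixel, using the first-stage weights (already folded together with the final `*0.5`). Clamp-to-edge addressing and all function names are assumptions of the sketch.

```python
# First-stage taps: n/32 in the shader, times the final 0.5, so they sum to 1.
WEIGHTS = [n / 64.0 for n in (1, 5, 15, 22, 15, 5, 1)]

def blur_rows(image):
    """One horizontal pass: 7 taps per pixel, clamp-to-edge addressing."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for k, weight in enumerate(WEIGHTS):
                sx = min(max(x + k - 3, 0), width - 1)
                out[y][x] += image[y][sx] * weight
    return out

def transpose(image):
    return [list(column) for column in zip(*image)]

def blur_separable(image):
    """Horizontal pass, then the same pass applied along the columns."""
    return transpose(blur_rows(transpose(blur_rows(image))))

impulse = [[0.0] * 7 for _ in range(7)]
impulse[3][3] = 1.0                      # a single white pixel in the middle
spread = blur_separable(impulse)
total = sum(sum(row) for row in spread)  # energy should be preserved
```

Each output pixel ends up as the product of its row weight and its column weight, which is exactly what the equivalent 7x7 kernel would contain.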
Convolution matrix for second separable convolution stage
    const float gauss0 = 1.0/32.0;
    const float gauss1 = 5.0/32.0;
    const float gauss2 =15.0/32.0;
    const float gauss3 =22.0/32.0;
    const float gauss4 =15.0/32.0;
    const float gauss5 = 5.0/32.0;
    const float gauss6 = 1.0/32.0;
Convolution matrix for third separable convolution stage
    const float gauss0 = 1.0/32.0;
    const float gauss1 = 5.0/32.0;
    const float gauss2 =15.0/32.0;
    const float gauss3 =22.0/32.0;
    const float gauss4 =15.0/32.0;
    const float gauss5 = 5.0/32.0;
    const float gauss6 = 1.0/32.0;
Convolution matrix for fourth separable convolution stage
    const float gauss0 = 1.0/18.0;
    const float gauss1 = 2.0/18.0;
    const float gauss2 = 4.0/18.0;
    const float gauss3 = 4.0/18.0;
    const float gauss4 = 4.0/18.0;
    const float gauss5 = 2.0/18.0;
    const float gauss6 = 1.0/18.0;
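One thing worth checking when swapping matrices is the overall gain of each pass. The Python sketch below adds up the taps and applies the final `color*0.5` from the first-stage programs, assuming (as the notes above imply) that this line stays the same in every stage. Under that assumption the first three stages preserve brightness, while the fourth comes out at half gain per pass; whether that is intended, or compensated through `MipMix`, is not something the shader dump tells us.

```python
OUTPUT_SCALE = 0.5   # the final "color*0.5" in the first-stage programs

# (integer taps, divisor) exactly as they appear in the matrices above
STAGE_KERNELS = {
    "stages 1 to 3": ((1, 5, 15, 22, 15, 5, 1), 32.0),
    "stage 4": ((1, 2, 4, 4, 4, 2, 1), 18.0),
}

def pass_gain(taps, divisor):
    """Brightness multiplier of one blur pass over a uniformly lit area."""
    return sum(taps) / divisor * OUTPUT_SCALE

gains = {name: pass_gain(taps, divisor)
         for name, (taps, divisor) in STAGE_KERNELS.items()}
```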
Screenshots
These were taken with an exposure level of 1. See how the sun, part of the cubic texture actually, creates an intense glow much like a lens flare. Also see how the specular highlights glow slightly, in a very pleasant way.
These were taken with an increased exposure. See how the specular highlights now glow in the final picture, due to being much more intense than what meets the eye.