This is the start of the Non-Photorealistic Rendering series of articles.
Non-photorealistic rendering (NPR), as presented here, is about creating/rendering content without photo-realism as the goal. It's about controlling your style and creating content with a specific look in mind. The techniques and approaches listed here focus, so far, on reproducing artistic styles, a common goal of NPR artists.
There is some disagreement over what to call NPR. Suggestions I like are "expressive graphics" or "artistic rendering," but as I first encountered this topic under the title of NPR, I've decided to keep that phrase for now.
When you look at a model and think of representing it without trying to mimic the real appearance, usually the task falls into one of two broad categories of techniques:
- Edge Differentiation
You want the edges of the model to be distinct so that objects on screen are visually separate entities. Edges can be used in any number of ways to add flavor to the representation, but usually the goal is to help the viewer's eyes tell objects apart.
- Internal Shading
If edges are the outlines, internal shading is using crayons to color in between the lines. Fills don't have to work around edges, but they usually need to avoid them: internal techniques are rarely designed for drawing near the edge of a surface, especially fill techniques that have no awareness of whether they are near an edge.
These two processes can be aware of each other or interact on many levels and don't necessarily have to be separate. However, because the challenges are very distinct, they are usually tackled with different approaches that end up separate by design.
There are two broad categories of approaches that entail completely different ways of using the information from the 3D scene. Some prefer to distinguish the two as 2D or 3D techniques, but as I will propose later in this article, it's possible to use data with 3D significance in a 2D process.
- Post-Processing will typically use a series of images created from rendering passes and then execute algorithms that use these values as a basis to generate the final rendered image. Post-processing can be used as a total solution or as a supplement. It's usually more straightforward since you're working with data that exists in the form of a 2D image, but it is not always able to act with awareness of how the data exists in the 3D scene.
- Direct techniques mean you're using shaders that are written to produce output for the final image. Although the hardware is producing the output during the process of rendering the 3D scene, the result will usually look as though it was done with awareness of how the shapes would exist in 2D. Direct processes have the advantage of control. You can assign a different technique/material to every object and thus gain different behavior where desired. In addition, because direct techniques work directly with 3D data, it's more straightforward to use 3D relationships in producing the final rendered image.
Which is more powerful? Notice that both approaches strive to operate in a way that will produce an image reflecting the information that would be readily available to the other. Direct processes don't always know what their neighbor pixels are doing. Post-processing can no longer work with the raw 3D data. The most powerful technique will use them both: each does what the other does poorly, and the 3D stage can hand information to the 2D stage in a well-packaged format. Using render layers in Blender 3D's compositor is the best example I can provide.
Humans have been striving to portray things since time immemorial. The arts have had innumerable schools in so many cultures since recorded history began. When looking for a good technique to go for, if you create one that can reproduce a traditional artistic style, you can create a render that doesn't just get recognized by our eyes but speaks to our hands. Not only can we see what it is, but we can feel how it's drawn. For a program to have that kind of impact is amazing. That is what I consider the most powerful aspect of expressive rendering. In the context of this discussion, I have to use the phrase expressive rendering, because that's what it's really about.
So you have a model. Now, a model describes many valuable things that a human knows through spatial intelligence. To mimic a technique for transforming that into a 2D image, you have to follow yourself step by step through the process of how you would do it with your hands. What information are you looking at? What part of the information is describing the thing you want your algorithm to know? How can you either extract this data from the 3D data or provide it separately? What aspect of the object would your hand be trying to recreate? How can you describe this relationship mathematically or how can you measure this relationship on the model? These are the questions that will take you from knowing what your hand knows to being able to tell a computer how to mimic it.
Here are some tricks that I have successfully implemented in Ogre and how they work:
This is a very basic technique that is very helpful to start building more complex techniques. The idea is to use your knowledge of where a pixel is on screen to do a texture fetch from a simple 2D texture that stores how to draw "strokes." The texture is sized/shaped to match the screen so that the point on the screen corresponds to the correct texel. I call these textures "stroke textures," and they are incredibly valuable in that you can have one stroke texture and use it to create a very good shade that doesn't require the shader code to be aware of what's going on in neighboring pixels. The relationship to neighboring pixels is implicitly correct in that the pixels are fetching from the same texture, which already contains this information of how to look on screen.
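The screen-space fetch at the heart of this can be sketched in a few lines. The following Python sketch uses a small procedural hatching pattern as a stand-in for an artist-painted stroke texture; the names, texture size, and stripe pattern are all illustrative, not Ogre code.

```python
import numpy as np

# Hypothetical stand-in for an artist-painted stroke texture:
# diagonal hatching, with a dark stroke every 8 texels.
TEX_SIZE = 64
stroke_tex = np.zeros((TEX_SIZE, TEX_SIZE))
for y in range(TEX_SIZE):
    for x in range(TEX_SIZE):
        stroke_tex[y, x] = 1.0 if (x + y) % 8 < 2 else 0.0

def sample_stroke(frag_x, frag_y, screen_w, screen_h):
    """Fetch the stroke texel for a fragment at (frag_x, frag_y).

    The texture is mapped over the screen, so neighboring fragments
    automatically pick up coherent strokes without the shader knowing
    anything about its neighbors.
    """
    u = frag_x / screen_w
    v = frag_y / screen_h
    tx = int(u * TEX_SIZE) % TEX_SIZE
    ty = int(v * TEX_SIZE) % TEX_SIZE
    return stroke_tex[ty, tx]
```

In a real fragment shader the same idea is one texture fetch using the fragment's screen position as the coordinate; the coherence between neighboring pixels lives entirely in the texture.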
In order to either draw the edges or just have a readily available surface for creating edges, the most common technique is having several copies of the mesh displaced slightly but with reversed normals. The displaced mesh will protrude on one side of the regular mesh. Because the faces are back-facing, most of them will be culled from the render, but the ones that protrude and are not occluded by the front of the regular mesh will be rendered as unlit polygons. You can apply any material you want to them, but for black lines, leaving them unlit is the most common solution.
A much more streamlined version of this technique (and I'm fairly certain this is what's behind the beautiful sumi-e inspired edges in Okami) is to scale the original mesh along its normals (inflating it slightly) and then reverse the normals. This requires only one set of extra vertices to achieve roughly the same geometric description as the previous technique.
Because the inflated mesh has back-facing normals, only the half of the mesh where you are looking into its interior will be rendered. Because the original mesh renders its front half, it will occlude the normal-flipped mesh everywhere except at the edges, where the normal crosses from facing toward the camera to facing away (the mathematical description of a silhouette). If you render just the inflated, normal-flipped mesh, you will be looking into its concave interior, and this is why I call these "edge envelopes." You render an envelope that sticks out at the edges of the front-facing mesh, creating very convenient surfaces to draw edges on.
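The mesh-side preparation is simple enough to sketch. This is a minimal Python illustration of the idea, assuming per-vertex unit normals; the function name and the thickness value are made up for the example, and the normal flip is done by reversing triangle winding order (which is how renderers typically decide front from back face).

```python
import numpy as np

def make_edge_envelope(vertices, normals, triangles, thickness=0.02):
    """Build an inflated, normal-flipped copy of a mesh (a sketch of
    the edge-envelope idea, not engine code).

    vertices: (N, 3) array of positions
    normals:  (N, 3) array of unit vertex normals
    triangles: list of (i, j, k) index triples
    """
    # Push each vertex outward along its normal to inflate the mesh.
    inflated = vertices + thickness * normals
    # Reverse winding order so every face becomes back-facing; with
    # standard back-face culling, the renderer then shows only the
    # far (interior) side of the envelope.
    flipped = [(i, k, j) for (i, j, k) in triangles]
    return inflated, flipped
```

Rendered behind the original mesh with an unlit black material, this envelope produces the silhouette lines described above without any edge-detection pass.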
The beauty of these two techniques is that they don't actually perform edge detection. The edge display is implicit in culling and z-buffering, so there is no shader required. This is how PS2 hardware, with no shaders, was able to draw the very beautiful edges in Okami.
If you are sketching something by hand and start shading the object very diligently, notice how you attempt to follow the surface contours so that the anisotropy (the directional quality of the lines you use to shade) describes the direction of the surface as much as the amount of shading describes the lighting on the surface.
Ever looked at a contour map that uses tangent lines to express topology? If you drop the z-component of the surface normal (or don't; I've made techniques that work either way), you are left with a vector that points away from the surface in 2D. If you shade perpendicular to that vector, you create lines that flow along the topology in the same step as the lighting, making for a very distinct 3D appearance.
The Ogre head at the top of the page is done with an algorithm that gets the 2D normal and uses that to drive the application of a stroke texture. The correct anisotropy is implicit in how the texture is fetched. Thus the shader, even with no knowledge of what adjacent pixels are doing, is able to apply the right amount of shade in the correct direction (ironic that the shader has no knowledge of direction since it's only shading a single pixel) and produce a very nice 3D shade.
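The core of that fetch can be sketched as a small coordinate rotation. This Python illustration (my own sketch, not the actual shader) takes a fragment's screen position and its view-space normal, drops the normal's z-component, and returns coordinates aligned with the stroke direction, i.e. perpendicular to the 2D normal; those coordinates would then index the stroke texture.

```python
import numpy as np

def stroke_coords(screen_pos, view_normal):
    """Return stroke-aligned texture coordinates for one fragment.

    screen_pos:  (x, y) position on screen
    view_normal: (x, y, z) view-space surface normal
    """
    # Drop the z component to get the normal's 2D screen direction.
    n2d = np.asarray(view_normal[:2], dtype=float)
    length = np.linalg.norm(n2d)
    if length < 1e-6:
        # Normal points straight at the camera; no preferred stroke
        # direction, so fall back to plain screen coordinates.
        return np.asarray(screen_pos, dtype=float)
    n2d /= length
    # Strokes run perpendicular to the projected normal.
    t2d = np.array([-n2d[1], n2d[0]])
    pos = np.asarray(screen_pos, dtype=float)
    # Project the screen position into the (stroke, normal) frame;
    # use the result as (u, v) into the stroke texture.
    return np.array([pos @ t2d, pos @ n2d])
```

Because the rotation depends only on this fragment's own normal, every pixel independently fetches from the texture in a consistent direction, which is exactly how the anisotropy stays correct with no neighbor information.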