Dev Diary — Skeletal Animation and GPU Skinning
frustrating start, orgasmic result
I got bored of working on my toy renderer. So I decided to switch it up a bit and work on some other typical systems of a game engine. And then out of nowhere, skeletal animation hit my mind. “yeah… why not?! How hard can it be? You got a mesh and some bones. You accumulate bone transforms. You tick and interpolate the animation keyframes…” Before I knew it, I was 30 hours deep in the rabbit hole, about to pull my hair out staring at a deformed humanoid mesh flapping its leg around all over the place.
The Skeleton
The first couple nights of this adventure, I had some random TV shows playing in the background. Everything was still chill. Then I realized that, days in, I still couldn’t wrap my head around getting all the data I needed to set up the bone structure correctly. To be fair, the node graph and bone data relationship in Assimp is confusing as fuck. Anyways, here are the things we need:
- Transform and Inverse Bind Pose (OffsetMatrix) of each bone (passed as uniform)
- Bound bones (IDs/indices and weights) of each vertex (stored as vertex attribute)
The Math
The goal of a skeletal animation system is to move the “skin” (a mesh made of vertices) along with the bones it is bound to, deforming it accordingly. What’s stored in the animation keyframes is the local transformation of each bone, relative to its parent bone. So in order to get the model-space transform of a child bone, we need to recursively multiply the local transformations up the chain of its parent bones. That gives us the model-space transformation of any bone.
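Here is a minimal sketch of that parent-chain accumulation. It is not the engine's actual code: `Mat4`, `Bone`, and `computeModelTransforms` are hypothetical names made up for illustration, `Mat4` is a tiny column-major stand-in for something like glm::mat4, and the bones are assumed to be stored so that every parent comes before its children.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal column-major 4x4 matrix (hypothetical stand-in for glm::mat4).
struct Mat4 {
    std::array<float, 16> m{};  // element (row, col) lives at m[col * 4 + row]

    static Mat4 identity() {
        Mat4 r;
        r.m[0] = r.m[5] = r.m[10] = r.m[15] = 1.0f;
        return r;
    }
    static Mat4 translation(float x, float y, float z) {
        Mat4 r = identity();
        r.m[12] = x; r.m[13] = y; r.m[14] = z;
        return r;
    }
};

// Standard matrix product a * b (b is applied first to a column vector).
Mat4 operator*(const Mat4& a, const Mat4& b) {
    Mat4 r;
    for (int col = 0; col < 4; ++col)
        for (int row = 0; row < 4; ++row) {
            float sum = 0.0f;
            for (int k = 0; k < 4; ++k)
                sum += a.m[k * 4 + row] * b.m[col * 4 + k];
            r.m[col * 4 + row] = sum;
        }
    return r;
}

// Hypothetical bone record: parent index (-1 for the root) and the
// local transform relative to that parent, as sampled from the keyframes.
struct Bone {
    int parent;
    Mat4 local;
};

// Assumption: bones are ordered so that a parent always precedes its children,
// which lets a single forward pass replace the recursion.
std::vector<Mat4> computeModelTransforms(const std::vector<Bone>& bones) {
    std::vector<Mat4> model(bones.size());
    for (std::size_t i = 0; i < bones.size(); ++i) {
        const Bone& b = bones[i];
        assert(b.parent < static_cast<int>(i));
        // model = parentModel * local; the root has no parent to multiply by
        model[i] = (b.parent < 0) ? b.local : model[b.parent] * b.local;
    }
    return model;
}
```

With a three-bone chain whose local transforms are translations along x, y, and z, the last bone ends up with all three offsets combined, which is the "chain multiply" in action.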
But the bones by default are all positioned in the default bind pose of the skeleton (usually a T-Pose). How do we calculate the transformation needed to transform a bone from default bind pose to animated pose?
(I remember I was asked this exact question during an interview for Frostbite)
Imagine two nodes represent an arm. We want to move it from Pose 1 to Pose 2. If these were vectors, the delta would be (p2-p1). But we’re messing with matrices here. So it’s a little different. The correct way to calculate the delta is: Pose2 Transform * Inverse of Pose1 Transform
We can get the Pose2 transform from the animation frames. Inverse of Pose1, or the Inverse Bindpose, is actually conveniently provided by Assimp. It’s called OffsetMatrix.
// the model transform is calculated each animation frame
// the offsetMatrix, or the inverse bind pose, can be cached at initialization
// the final transform of a bone:
boneTransforms[boneID] = bone.modelTransform * bone.offsetMatrix;
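To sanity-check the "Pose2 * inverse of Pose1" idea, here is a small worked sketch. Everything in it is an assumption for illustration: `Mat4`, `rotationZ90`, `inverseRigid`, `transformPoint`, and `skinningMatrix` are hypothetical helpers, matrices are column-major, and the inverse only handles rigid transforms (rotation plus translation, no scale). In practice the inverse bind pose would come from Assimp's OffsetMatrix rather than being computed like this.

```cpp
#include <array>
#include <cassert>

// Minimal column-major 4x4 matrix (hypothetical stand-in for glm::mat4).
struct Mat4 {
    std::array<float, 16> m{};  // element (row, col) lives at m[col * 4 + row]

    static Mat4 identity() {
        Mat4 r;
        r.m[0] = r.m[5] = r.m[10] = r.m[15] = 1.0f;
        return r;
    }
    static Mat4 translation(float x, float y, float z) {
        Mat4 r = identity();
        r.m[12] = x; r.m[13] = y; r.m[14] = z;
        return r;
    }
};

Mat4 operator*(const Mat4& a, const Mat4& b) {
    Mat4 r;
    for (int col = 0; col < 4; ++col)
        for (int row = 0; row < 4; ++row) {
            float sum = 0.0f;
            for (int k = 0; k < 4; ++k)
                sum += a.m[k * 4 + row] * b.m[col * 4 + k];
            r.m[col * 4 + row] = sum;
        }
    return r;
}

// Exact 90-degree rotation about Z: x axis maps to +y, y axis maps to -x.
Mat4 rotationZ90() {
    Mat4 r;
    r.m[1] = 1.0f;    // first column  = (0, 1, 0)
    r.m[4] = -1.0f;   // second column = (-1, 0, 0)
    r.m[10] = 1.0f;
    r.m[15] = 1.0f;
    return r;
}

// Inverse of a rigid transform only (assumption: no scale or shear).
Mat4 inverseRigid(const Mat4& t) {
    Mat4 r = Mat4::identity();
    for (int col = 0; col < 3; ++col)
        for (int row = 0; row < 3; ++row)
            r.m[col * 4 + row] = t.m[row * 4 + col];  // transpose the rotation
    for (int row = 0; row < 3; ++row) {
        float sum = 0.0f;
        for (int k = 0; k < 3; ++k)
            sum += r.m[k * 4 + row] * t.m[12 + k];
        r.m[12 + row] = -sum;  // translation becomes -R^T * t
    }
    return r;
}

std::array<float, 3> transformPoint(const Mat4& t, const std::array<float, 3>& p) {
    std::array<float, 3> out{};
    for (int row = 0; row < 3; ++row)
        out[row] = t.m[row] * p[0] + t.m[4 + row] * p[1] + t.m[8 + row] * p[2] + t.m[12 + row];
    return out;
}

// Delta from bind pose to animated pose: animated model transform * inverse bind pose.
Mat4 skinningMatrix(const Mat4& modelTransform, const Mat4& offsetMatrix) {
    return modelTransform * offsetMatrix;
}
```

For example, take a bone sitting at (2, 0, 0) in the bind pose and rotate it 90 degrees about Z in the animated pose. A vertex at (3, 0, 0), one unit along the bone, should swing up to (2, 1, 0): the offset matrix pulls it into bone space as (1, 0, 0), and the animated model transform rotates and pushes it back out.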
The Skinning
// vertex shader
in vec3 pos; // bind-pose vertex position
in ivec4 boneIDs;
in vec4 boneWeights;

const int MAX_BONES = 100;
uniform mat4 gBones[MAX_BONES];
uniform mat4 u_mvp;

void main() {
    vec4 localPos = vec4(0.0);
    // each vertex can be influenced by 4 bones max
    for (int i = 0; i < 4; i++) {
        mat4 boneTransform = gBones[boneIDs[i]];
        vec4 posePosition = boneTransform * vec4(pos, 1.0);
        localPos += posePosition * boneWeights[i];
        // todo: need to apply the bone transformation
        // to normal as well
    }
    gl_Position = u_mvp * localPos;
}
and voila
Yo, when I saw this little shit walking smoothly after a week of struggling, it felt so fucking good. And I guess this is what we live for.
All the little Gotchas:
- All the matrices in Assimp are row-major. So if you’re using OpenGL or glm, either transpose everything to column-major as you load it, or transpose at the end when setting the uniform.
- The bone ID/index vertex attributes are obviously integers, since they are indices into the bone transformation array. As a result, glVertexAttribPointer can’t be used, since it will convert everything to floats. So either use glVertexAttribIPointer, or make sure to cast to int when indexing into the bone transformation array in the vertex shader.
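The row-major gotcha above can be sketched like this. `RowMajorMat4` is a hypothetical stand-in that mirrors the a1..d4 field layout of Assimp's aiMatrix4x4 (so the snippet compiles without Assimp), and `toColumnMajor` is an illustrative helper, not an Assimp or glm function.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in with the same field naming as Assimp's aiMatrix4x4:
// the letter is the row (a..d), the digit is the column (1..4).
struct RowMajorMat4 {
    float a1, a2, a3, a4;
    float b1, b2, b3, b4;
    float c1, c2, c3, c4;
    float d1, d2, d3, d4;
};

// Produces the 16 floats in column-major order, i.e. the layout OpenGL
// expects when a mat4 uniform is uploaded without the transpose flag.
std::array<float, 16> toColumnMajor(const RowMajorMat4& m) {
    return {{
        m.a1, m.b1, m.c1, m.d1,  // column 0
        m.a2, m.b2, m.c2, m.d2,  // column 1
        m.a3, m.b3, m.c3, m.d3,  // column 2
        m.a4, m.b4, m.c4, m.d4,  // column 3 (translation)
    }};
}
```

The other option from the list, transposing at upload time, would mean passing GL_TRUE as the transpose argument of glUniformMatrix4fv on desktop OpenGL instead of converting on load.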
Thoughts for future:
- Optimize the bone transforms uniform data flow. We don’t need a full 4x4 matrix, for example
- Experiment with doing the skinning on the CPU. How does that perform in comparison, especially if we can offload to worker threads?
- Experiment with instancing
- Experiment with different interpolation methods. And wtf are the dual quaternions the pros keep talking about
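As a possible starting point for the CPU skinning experiment, here is a sketch of the same weighted blend the vertex shader does, written per vertex in C++. The names (`Mat4`, `Vec3`, `SkinnedVertex`, `transformPoint`, `skinVertex`) are hypothetical, matrices are assumed column-major, and nothing here is measured or optimized.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Minimal column-major 4x4 matrix (hypothetical stand-in for glm::mat4).
struct Mat4 {
    std::array<float, 16> m{};  // element (row, col) lives at m[col * 4 + row]

    static Mat4 identity() {
        Mat4 r;
        r.m[0] = r.m[5] = r.m[10] = r.m[15] = 1.0f;
        return r;
    }
    static Mat4 translation(float x, float y, float z) {
        Mat4 r = identity();
        r.m[12] = x; r.m[13] = y; r.m[14] = z;
        return r;
    }
};

using Vec3 = std::array<float, 3>;

// Hypothetical vertex layout: up to 4 bone influences, like the shader.
struct SkinnedVertex {
    Vec3 position;                 // bind-pose position
    std::array<int, 4> boneIDs;    // indices into the bone transform array
    std::array<float, 4> weights;  // assumed to sum to 1
};

Vec3 transformPoint(const Mat4& t, const Vec3& p) {
    Vec3 out{};
    for (int row = 0; row < 3; ++row)
        out[row] = t.m[row] * p[0] + t.m[4 + row] * p[1] + t.m[8 + row] * p[2] + t.m[12 + row];
    return out;
}

// Same blend as the vertex shader loop: sum of (boneTransform * pos) * weight.
Vec3 skinVertex(const SkinnedVertex& v, const std::vector<Mat4>& boneTransforms) {
    Vec3 result{};
    for (int i = 0; i < 4; ++i) {
        const int id = v.boneIDs[i];
        assert(id >= 0 && id < static_cast<int>(boneTransforms.size()));
        const Vec3 posed = transformPoint(boneTransforms[id], v.position);
        for (int axis = 0; axis < 3; ++axis)
            result[axis] += posed[axis] * v.weights[i];
    }
    return result;
}
```

Since each vertex only reads the shared bone array and writes its own result, splitting the vertex range across worker threads should be straightforward, though whether it beats the GPU path is exactly what the experiment would have to show.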