tardy renderer minor 1
Rasterization is one of the typical techniques for rendering 3D models. By following the tinyrenderer course, this project aims to explore and understand the rasterization pipeline in low-level 3D graphics libraries. Let’s start with a first basic implementation.
Important
All models (Wavefront .obj file format) are from the tinyrenderer course. See it here.
Topic of Lessons
Bresenham’s line drawing




Note
All measurements in Release mode with optimizations enabled (averaged over 5 runs)
Here are the performance measurements on Windows.
cd ./x64/Release
Measure-Command { .\tardy_renderer_minor_1.exe }



Note
In the past, floating-point operations were significantly more expensive than integer ones (or even entirely inaccessible). This is why Jack Elton Bresenham developed his all-integer rasterization algorithm in the 1960s.

Thank you, Professor Bresenham. I see you.
Triangle rasterization

Note
Since the three vertices need to be sorted, I adopted the insertion sort from C++ standard library’s introsort.




After sorting, the edges are colored as follows:
- edge between the 1st and 2nd vertices: blue
- edge between the 2nd and 3rd vertices: green
- edge between the 3rd and 1st vertices: red

Warning
While the top-left rule is standard practice for rasterization, it’s unnecessary here as the three triangles are non-overlapping.






AABB means Axis-Aligned Bounding Box, a type of bounding volume.

I think it is more efficient to use float instead of double.
Because it provides sufficient precision for optimization with no visually perceivable loss in quality.


The output of scanline rendering and modern rasterization are NOT same. This discrepancy is driven by three technical factors:
- quantization errors from integer arithmetic,
- tie-breaking rules (like the Top-Left rule) for shared edges,
- and the shift from incremental to pixel-independent interpolation.
Ultimately, the modern rasterization approach provides superior precision and architectural consistency.
Barycentric coordinates
The second attempt code is a variation of the digital differential analyzer (DDA) algorithm.
represents the sampling position. It is not the target being interpolated, but the input that drives interpolation. indicates how far along it is from the start point () to the end point (). If is at , then . If is at , then .
Based on the equation of a line, estimate where should be for the value of . To do this, apply the ratio computed earlier. This is called linear interpolation, also known as lerp.

The detailed discussion of the barycentric coordinate system will be covered separately.

Hidden faces removal
Each {Model} - add a simple trick corresponds to “Triangle rasterization with a simple trick” homework.
The difference is in color:
- solid (random) color := modern rasterization approach + back-face culling
- gradient color := full-colored triangles by barycentric coordinates
At a glance, it is difficult to distinguish the first image (add a simple trick ver.) from the second one (painter's algorithm ver.).
That’s why I visualized the comparison of two images.
The (C, M, Y) colors indicate region where pixels do not exactly match, by computing pixel-wise difference between two images.
Note
The image comparison was implemented using OpenCV in a separate project, and I plan to extend it further.
I just wanted to paint all triangles in the rear-to-front order, as presented in the lecture. The volume rendering-like effect was made by reconstructing 96 frames into 4 seconds GIF with FFmpeg. In other words, it was encoded at 24 FPS to produce cinematic feeling.












The initial, yet incomplete depth buffer appears. The depth buffer colors represent:
- black: depth value close to (far)
- white: depth value close to or (near)
This is because the depth buffer is initialized to 0, which corresponds to black.






Once updated to the nearest depth value per pixel, the result (Z-buffer) is finally complete. Compared to the previous incomplete depth buffer, grayscale gradation becomes even smoother. The volume rendering-like effect is made in a similar manner.
Warning
In Painter’s algorithm, the effect appears at the triangle level, whereas here it appears at the pixel level. This still differs from actual volume rendering, as all but a single depth value are discarded by the Z-buffer.





















Naive camera handling
The rotation matrix that rotates vectors by an angle (= ) about the -axis in three dimensions, following the right-hand rule, is given by:
There are two ways to apply this rotation.
- Keep the camera fixed and rotate the scene (including all models). ✅
- Keep the scene fixed and rotate the camera.
In this chapter, the first approach is used. These two methods are equivalent and differ only by an inverse transformation. So the result is indistinguishable to the viewer—much like in Einstein’s elevator experiment.






To express perspective—one of the key elements of visual realism—a simplified form of perspective projection is used. (Ref. 3D projection) This makes the fundamental geometry easier to understand. The camera is placed on the -axis at (0, 0, ), and the projection plane is the -plane at . This setup differs from the standard rendering pipeline used in OpenGL, where the camera is fixed at the origin and the projection is performed onto a near plane in front of it.
Important
In the standard rendering pipeline, projection is tightly coupled with clipping. In particular, it determines the last visible point along the viewing frustum by projecting from an out-of-view (invisible) point after all necessary transformations have been applied. As a result, the projection matrix must map the entire frustum into a canonical view volume, which leads to a significantly more complex matrix. (Ref. The Perspective Projection Matrix)






Better camera
Shading
More data!
Tangent space
Shadow mapping
Ambient occlusion
Bonus: toon shading
Homework
Bresenham’s line drawing (wireframe rendering)

You can see the actual appearance in this video. (Ref. Diablo Ⅲ Cinematics)



The resolution in the Visual Studio Model editor is 500 x 500, while the others (my renderer) are 1000 x 1000.











Triangle rasterization with a simple trick
Note
All renders using random colors are shown from a fixed front view.





A simple trick includes the following:
- backface culling,
- avoiding division by zero,
- discarding triangles (< a pixel).





Barycentric coordinates (full-colored vs wireframe effect)
After sorting, the vertices are colored as follows:
- the top vertex: blue
- the middle vertex: green
- the bottom vertex: red
Let’s lerp!

Then, let’s find the magic number. Here’s the method I found after a few attempts.




It looks like I can subtract the white triangle area from the whole triangle. Professor, would be the magic number?

Note
All renders using specific colors are shown from a fixed front view.
For we are bound by the (R, G, B) Color. The sacred union of our every thought and emotion. :)
“For we are bound by the Khala…the sacred union of our every thought and emotion."
— Hierarch Artanis (StarCraft Wiki)
In full-colored models, I can observe consistent patterns. In contrast, in wireframe models, the thickness of wireframe effect is proportional to the size of each triangle, so it is NOT uniform.






Graphics mathematics
Graphics mathematics templates are based on three design choices:
- Adopt a column-major layout to align with the convention used by GLM, OpenGL, and Vulkan.
- Apply modern C++ features such as
constexprand[[nodiscard]]to make the code more robust and expressive. - Take care to document each function with mathematical details so that its intent is clear.
vector
template<int N> struct vec
{
double data[N] = {};
// i-th element
constexpr double& operator[](int i) {
assert(i >= 0 && i < N);
return data[i];
}
constexpr double operator[](int i) const {
assert(i >= 0 && i < N);
return data[i];
}
};
// addition (component-wise)
template<int N> constexpr vec<N> operator+(vec<N> a, const vec<N>& b) {
for (int i = N; i--; a[i] += b[i])
;
return a;
}
// subtraction (component-wise)
template<int N> constexpr vec<N> operator-(vec<N> a, const vec<N>& b) {
for (int i = N; i--; a[i] -= b[i])
;
return a;
}
// multiplication: vector x scalar (component-wise)
template<int N> constexpr vec<N> operator*(vec<N> v, double s) {
for (int i = N; i--; v[i] *= s)
;
return v;
}
// multiplication: scalar x vector (component-wise)
template<int N> constexpr vec<N> operator*(double s, vec<N> v) { return v * s; }
// division: vector / scalar (component-wise)
template<int N> constexpr vec<N> operator/(vec<N> v, double s) {
for (int i = N; i--; v[i] /= s)
;
return v;
}
// dot product (sum of component-wise multiplication)
template<int N> constexpr double operator*(const vec<N>& a, const vec<N>& b) {
double sum = 0;
for (int i = N; i--; sum += a[i] * b[i])
;
return sum;
}
// stream output
template<int N> std::ostream& operator<<(std::ostream& out, const vec<N>& v) {
for (int i = 0; i < N; ++i)
out << v[i] << " ";
return out;
}
// L2 norm: ||v||
template<int N> double norm(const vec<N>& v) { return std::sqrt(v * v); }
// normalization: v / ||v||
template<int N> vec<N> normalized(const vec<N>& v) { return v / norm(v); }
template<> struct vec<2>
{
double x = 0, y = 0;
constexpr double& operator[](int i) {
assert(i >= 0 && i < 2);
return i ? y : x;
}
constexpr double operator[](int i) const {
assert(i >= 0 && i < 2);
return i ? y : x;
}
};
template<> struct vec<3>
{
double x = 0, y = 0, z = 0;
constexpr double& operator[](int i) {
assert(i >= 0 && i < 3);
return i ? (1 == i ? y : z) : x;
}
constexpr double operator[](int i) const {
assert(i >= 0 && i < 3);
return i ? (1 == i ? y : z) : x;
}
};
// cross product: a × b (for vec3)
// perpendicular to both a and b
// magnitude: |a||b|sin(theta)
inline vec<3> cross(const vec<3>& a, const vec<3>& b) {
return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
template<> struct vec<4>
{
double x = 0, y = 0, z = 0, w = 0;
constexpr double& operator[](int i) {
assert(i >= 0 && i < 4);
return i < 2 ? (i ? y : x) : (2 == i ? z : w);
}
constexpr double operator[](int i) const {
assert(i >= 0 && i < 4);
return i < 2 ? (i ? y : x) : (2 == i ? z : w);
}
constexpr vec<2> xy() const { return {x, y}; }
constexpr vec<3> xyz() const { return {x, y, z}; }
};
// type alias
using vec2 = vec<2>;
using vec3 = vec<3>;
using vec4 = vec<4>;matrix (column-major)
// forward declaration
template<int N> struct recursive_det;
template<int R, int C> struct mat
{
vec<R> col[C] = {};
// j-th column
constexpr vec<R>& operator[](int j) {
assert(j >= 0 && j < C);
return col[j];
}
constexpr const vec<R>& operator[](int j) const {
assert(j >= 0 && j < C);
return col[j];
}
// i-th row -> vec<C>
[[nodiscard]] constexpr vec<C> row(int i) const {
vec<C> row_vec;
for (int j = C; j--; row_vec[j] = col[j][i])
;
return row_vec;
}
// determinant (only square matrix)
// Ref. recursive_det<C>::expand
[[nodiscard]] double det() const {
static_assert(R == C);
return recursive_det<C>::expand(*this);
}
// cofactor matrix (only square matrix)
// (1) minor (T: 1 / F: 0)
// (2) det(minor) * (-1)^{r + c}
[[nodiscard]] double cofactor(int r, int c) const {
assert(r >= 0 && r < R && c >= 0 && c < C);
mat<R - 1, C - 1> minor;
for (int i = R - 1; i--;)
for (int j = C - 1; j--; minor[j][i] = col[j + (j >= c)][i + (i >= r)])
;
return minor.det() * ((r + c) % 2 ? -1 : 1);
}
// transpose of the inverse matrix
// (1) inverse matrix := 1 / det(M) * C^T (C: cofactor matrix)
// (2) 1 / det(M) * C
// Note. Laplace expansion
// det(M) := sum of products of the entries in that row and their respective cofactors
[[nodiscard]] mat inverse_transpose() const {
mat cof_mat;
for (int i = R; i--;)
for (int j = C; j--; cof_mat[j][i] = cofactor(i, j))
;
return cof_mat / (col[0] * cof_mat[0]);
}
// inverse matrix
[[nodiscard]] mat inverse() const { return inverse_transpose().transpose(); }
// transpose matrix
[[nodiscard]] mat<C, R> transpose() const {
mat<C, R> tp_mat;
for (int i = R; i--;)
for (int j = C; j--; tp_mat[i][j] = col[j][i])
;
return tp_mat;
}
};
// addition
template<int R, int C> mat<R, C> operator+(mat<R, C> a, const mat<R, C>& b) {
for (int j = C; j--; a[j] = a[j] + b[j])
;
return a;
}
// subtraction
template<int R, int C> mat<R, C> operator-(mat<R, C> a, const mat<R, C>& b) {
for (int j = C; j--; a[j] = a[j] - b[j])
;
return a;
}
// multiplication: matrix x scalar
template<int R, int C> mat<R, C> operator*(mat<R, C> m, double s) {
for (int j = C; j--; m[j] = m[j] * s)
;
return m;
}
// multiplication: matrix x vector
template<int R, int C> vec<R> operator*(const mat<R, C>& m, const vec<C>& v) {
vec<R> col_vec;
for (int i = R; i--; col_vec[i] = m.row(i) * v)
;
return col_vec;
}
// multiplication: vector x matrix
template<int R, int C> vec<C> operator*(const vec<R>& v, const mat<R, C>& m) {
vec<C> row_vec;
for (int j = C; j--; row_vec[j] = v * m[j])
;
return row_vec;
}
// multiplication: matrix x matrix
template<int R, int K, int C> mat<R, C> operator*(const mat<R, K>& a, const mat<K, C>& b) {
mat<R, C> m;
for (int j = C; j--; m[j] = a * b[j])
;
return m;
}
// division: matrix / scalar
template<int R, int C> mat<R, C> operator/(mat<R, C> m, double s) {
for (int j = C; j--; m[j] = m[j] / s)
;
return m;
}
// stream output (row-major)
template<int R, int C> std::ostream& operator<<(std::ostream& out, const mat<R, C>& m) {
for (int i = 0; i < R; ++i)
out << m.row(i) << "\n";
return out;
}
// expand -> cofactor -> det -> -> expand -> ...
template<int N> struct recursive_det
{
static double expand(const mat<N, N>& m) {
double sum = 0;
for (int i = N; i--; sum += m[0][i] * m.cofactor(i, 0))
;
return sum;
}
};
template<> struct recursive_det<1>
{
static double expand(const mat<1, 1>& m) { return m[0][0]; }
};Custom Features
- Add PNG image output support using
stb_image_write.h. - Add
byte_per_pixel()(accessor function) to support generic color interpolation. - Apply
clang-formatbased on Blender style. - Separate render modes (wireframe, Lerp, Z-buffer, etc.).
- Add
Meshconstructor overloading to support multi-mesh loading. - Visualize volume rendering-like effect (triangle-level / pixel-level).
Development Environment
Hardware specifications
- CPU: Intel Core i7-8750H (6 cores, 12 threads @ 2.20GHz)
- GPU: NVIDIA GeForce GTX 1060 (6GB VRAM)
- RAM: 16GB
Software requirements
- OS: Windows 11 Home (Build 26200.7462)
- IDE: Visual Studio 2026
- Compiler: MSVC (included with Visual Studio 2026)
- C++ Standard: C++20
- Build System: Visual Studio Solution (.sln)
- Graphics API: None - Pure software rasterizer (CPU-based rendering)
Dependencies
- stb_image_write.h - For PNG image export (header-only library)
- Source: stb/stb_image_write.h
- Already included in project