
overview

Zhou edited this page Oct 24, 2018 · 6 revisions

THE OVERVIEW DIAGRAM

Puresoft3D’s system structure is very straightforward, so a system diagram or class diagram would not add much for this project. How about a pipeline diagram instead?

This is a very simple graphics pipeline, much simpler than those of modern graphics APIs and GPUs, so I will skip the common knowledge and only explain what is special about this project.

INTERPOLATION PROCESSOR

The first special thing is the interpolation processor (again, ‘processor’ here means shader). OpenGL has vertex, geometry and fragment shaders, but it has no interpolation shader, because its GLSL compiler can automatically generate interpolation code from the ‘out’ declarations in the vertex shader. Writing a shader language compiler does not make sense for my project, at least for now, so I have to let processor developers write part of the interpolation code themselves. If it is not clear what I mean, let me give you an example.

Suppose we have a vertex processor that takes a vertex as input and produces two outputs: a clip space vertex position, and vertex colour data, usually 4 floats. We call the clip space vertex position ‘standard output’, because every vertex processor outputs it. The vertex colour data, however, has to be called ‘user data’, meaning that the pipeline does not know the purpose, size, or structure of the data; only the processor’s author knows. The pipeline also cannot assume that a vertex processor outputs colour data at all: a vertex processor could output anything, and the pipeline has no way to guess what. So, to interpolate everything a vertex processor outputs, the pipeline has to leave some of the work to the processor’s author. This is what the interpolation processor does. More details about how to write Puresoft3D processor programmes can be found in the API readme document.
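To make the colour example more concrete, here is a minimal sketch of what such an interpolation step could look like. It is not the actual Puresoft3D interface: `ColourData`, `interpolateColour`, and the barycentric-weight parameter are names I am assuming for illustration, and perspective correction is left out for brevity. The point is only that the pipeline hands over untyped pointers, and the author, who knows the layout, does the blending.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical user data: to the pipeline this is just an opaque blob.
struct ColourData {
    float rgba[4];
};

// Hypothetical interpolation processor. The pipeline passes the user data of
// the three triangle vertices as untyped pointers, plus barycentric weights
// (assumed to sum to 1). Only the author knows the real type behind them.
void interpolateColour(const void* const vertexData[3],
                       const float weights[3],
                       void* out)
{
    assert(vertexData != nullptr && weights != nullptr && out != nullptr);

    const ColourData* a = static_cast<const ColourData*>(vertexData[0]);
    const ColourData* b = static_cast<const ColourData*>(vertexData[1]);
    const ColourData* c = static_cast<const ColourData*>(vertexData[2]);
    ColourData* result = static_cast<ColourData*>(out);

    // Weighted sum per channel; no perspective correction in this sketch.
    for (std::size_t i = 0; i < 4; ++i) {
        result->rgba[i] = a->rgba[i] * weights[0]
                        + b->rgba[i] * weights[1]
                        + c->rgba[i] * weights[2];
    }
}
```

With pure red, green and blue vertices and weights of 0.5, 0.25 and 0.25, the result would be the colour (0.5, 0.25, 0.25, 1.0), which is then handed to the fragment processor as its user data input.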

THREAD MODEL

Another thing worth mentioning here is the threading. As illustrated in the above diagram, there are multiple fragment processing threads but only one vertex processing thread. In fact, vertex processing runs in the caller’s thread. The reason is probably much simpler than you would guess: my poor development environment has only 4 CPU cores. One obvious choice would be to give 2 threads to vertex processing and 2 to fragment processing. But this fair split is not as fair as it looks, because in most cases fragment processing is much heavier than vertex processing. One reason is that memory throughput is much lower on a CPU than on a GPU, which is specially designed for graphics work, that is, intensive buffer reading and writing. Unfortunately, both texture sampling (reading) and frame buffer writing happen in the fragment processing stage, which makes fragment processing almost always slower than vertex processing. So my final design gives only one thread to vertex processing and leaves as many of the remaining threads as possible for fragment processing.
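The split described above can be sketched roughly as follows. This is an illustration under my own assumptions, not the project's real scheduler: `fragmentThreadCount` and `shadeRows` are hypothetical names, the interleaved-row work distribution is just one simple choice, and the "shading" here only counts fragments.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: keep one thread for the caller (vertex processing) and
// give the rest to fragment processing, but never fewer than one worker.
// Works with std::thread::hardware_concurrency(), which may return 0.
unsigned fragmentThreadCount(unsigned hardwareThreads)
{
    return hardwareThreads > 1 ? hardwareThreads - 1 : 1;
}

// Hypothetical stand-in for fragment processing: worker w takes rows
// w, w + workers, w + 2 * workers, ... and counts the fragments it touched.
int shadeRows(int width, int height, unsigned workers)
{
    assert(width >= 0 && height >= 0 && workers > 0);

    std::atomic<int> shaded{0};
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&shaded, width, height, workers, w] {
            int local = 0;
            for (int y = static_cast<int>(w); y < height;
                 y += static_cast<int>(workers)) {
                local += width;  // a real worker would shade each pixel here
            }
            shaded += local;
        });
    }
    // The caller's thread would do vertex work here before joining.
    for (std::thread& t : pool) {
        t.join();
    }
    return shaded.load();
}
```

On a 4-core machine, `fragmentThreadCount(4)` yields 3 fragment workers, which matches the one-plus-the-rest split argued for above.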

