It was apparent early on in the project that the GS plugin was going to be
a big bottleneck during 3D scenes. It isn't that the GS plugin itself
does a lot of computation on the CPU, but the fact that it needs to
communicate with the graphics card means that unnecessary stalls will
occur in the graphics driver as the GPU and CPU are synchronized. During
these stalls, the CPU basically goes to lunch until the GPU is ready.
Graphics drivers and libraries are aware of this and try to communicate
with the graphics card as little as possible. They usually cache
render state changes, shader changes, and texture changes up until
actual geometry is rendered. They also take advantage of FIFOs
(first-in, first-out buffers). The CPU just writes to the FIFO and the
GPU just reads from it. This makes all the difference in terms of keeping
the GPU active while the CPU isn't, and vice versa.
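
To make the caching idea concrete, here is a minimal sketch in Python. It is
not the GS plugin's or any driver's actual code: the names CachedRenderer,
set_state, and draw are hypothetical, and a plain deque stands in for the
hardware FIFO. State changes are only recorded until a draw call arrives, at
which point the changes that actually differ from what was last sent are
written to the FIFO ahead of the draw command.

```python
from collections import deque


class CachedRenderer:
    """Hypothetical renderer that defers state changes until a draw call."""

    def __init__(self, fifo):
        self.fifo = fifo      # command queue the "GPU" side would read from
        self.pending = {}     # state changes recorded but not yet sent
        self.committed = {}   # state that has already been written to the FIFO

    def set_state(self, name, value):
        # Only record the change; nothing is sent to the FIFO yet.
        self.pending[name] = value

    def draw(self, vertex_count):
        # Flush just the state that differs from what was last sent.
        for name, value in self.pending.items():
            if name not in self.committed or self.committed[name] != value:
                self.fifo.append(("set_state", name, value))
                self.committed[name] = value
        self.pending.clear()
        self.fifo.append(("draw", vertex_count))


# A real FIFO is bounded, and the writer has to wait when it is full;
# an unbounded deque keeps this sketch short.
fifo = deque()
renderer = CachedRenderer(fifo)

renderer.set_state("blend", "off")
renderer.set_state("blend", "alpha")   # overwrites the earlier change
renderer.set_state("texture", 7)
renderer.draw(3)

renderer.set_state("blend", "alpha")   # redundant, already committed
renderer.draw(6)

commands = list(fifo)
```

Five state calls and two draws end up as four commands in the queue. The
consumer side would take commands off the other end with fifo.popleft() at
its own pace, which is what lets the two sides work without waiting on each
other for every call.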