Use software's BSP node traversal in OpenGL
I'm honestly surprised this has gone unnoticed for so long. It turns out the software renderer has had an optimization in its BSP tree traversal that avoids unnecessary recursion for quite a while, but it was never implemented in the OpenGL renderer, so I'm doing that now. I haven't run any performance tests, though, since I assume that whoever made the optimization in the software renderer already did.
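
For anyone unfamiliar with the idea, here is a minimal sketch of this kind of traversal. It is not the renderer's actual code. It assumes the optimization amounts to recursing only into the child on the viewer's side and then looping into the far child instead of making a second recursive call. Every name here (`bsp_node`, `traversal`, `LEAF_FLAG`, `child_visible`, `demo_nodes`) is hypothetical, and `child_visible` is only a stand-in for a real bounding-box check.

```c
#include <assert.h>
#include <stddef.h>

#define LEAF_FLAG 0x8000u  /* hypothetical: marks a child index as a leaf */
#define MAX_LEAVES 16

typedef struct {
    int split_x;           /* simplified partition: the vertical line x = split_x */
    unsigned children[2];  /* [0]: x < split_x side, [1]: x >= split_x side */
    int child_visible[2];  /* stand-in for a bounding-box visibility check */
} bsp_node;

typedef struct {
    const bsp_node *nodes;
    int view_x;                  /* viewer position (one axis only, for brevity) */
    unsigned order[MAX_LEAVES];  /* leaves in the order they were visited */
    size_t leaf_count;
    size_t call_count;           /* number of function calls made */
} traversal;

static void visit_leaf(traversal *t, unsigned leaf)
{
    assert(t->leaf_count < MAX_LEAVES);
    t->order[t->leaf_count++] = leaf;
}

/* Naive form: one recursive call per child. */
static void traverse_recursive(traversal *t, unsigned num)
{
    t->call_count++;
    if (num & LEAF_FLAG) {
        visit_leaf(t, num & ~LEAF_FLAG);
        return;
    }
    const bsp_node *n = &t->nodes[num];
    int side = t->view_x >= n->split_x;           /* side the viewer is on */
    traverse_recursive(t, n->children[side]);     /* near side first */
    if (n->child_visible[side ^ 1])
        traverse_recursive(t, n->children[side ^ 1]);
}

/* Looped form: recurse into the near child only, then continue the
   loop with the far child instead of calling the function again. */
static void traverse_looped(traversal *t, unsigned num)
{
    t->call_count++;
    while (!(num & LEAF_FLAG)) {
        const bsp_node *n = &t->nodes[num];
        int side = t->view_x >= n->split_x;
        traverse_looped(t, n->children[side]);
        if (!n->child_visible[side ^ 1])
            return;                               /* far side culled */
        num = n->children[side ^ 1];
    }
    visit_leaf(t, num & ~LEAF_FLAG);
}

/* Tiny hypothetical tree: a root with two inner nodes and four leaves. */
static const bsp_node demo_nodes[3] = {
    {   0, { 1, 2 },                         { 1, 1 } },
    { -10, { LEAF_FLAG | 0, LEAF_FLAG | 1 }, { 1, 1 } },
    {  10, { LEAF_FLAG | 2, LEAF_FLAG | 3 }, { 1, 0 } },
};
```

Running both functions over `demo_nodes` from node 0 with `view_x = 5` visits the same leaves in the same order, but the looped form makes fewer function calls. Whether that shows up as a measurable difference in the OpenGL renderer is untested.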