- No Multi-Threading
- Do-It-Yourself (DIY), with all its disadvantages (using a multi-threaded task manager), which is what I already have
- Intel TBB
OpenMP seems pretty well suited for the task. For now I have simply parallelized the render loop, allowing each horizontal line to be processed by a different thread.
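A minimal sketch of such a per-line loop, assuming a hypothetical `shade(x, y)` function standing in for the real per-pixel ray tracing and a flat row-major framebuffer (neither comes from the actual renderer). If the compiler is not told to enable OpenMP, the pragma is ignored and the loop simply runs serially.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for tracing one primary ray; the real renderer
// would do its intersection and shading work here.
float shade(int x, int y) {
    return static_cast<float>(x + y);
}

// Render into a row-major framebuffer. Each iteration of the outer loop
// (one horizontal line) can be handed to a different thread.
std::vector<float> render(int width, int height) {
    assert(width > 0 && height > 0);
    std::vector<float> image(static_cast<std::size_t>(width) * height);
    #pragma omp parallel for
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            // Every pixel is written by exactly one thread, so no locking is needed.
            image[static_cast<std::size_t>(y) * width + x] = shade(x, y);
        }
    }
    return image;
}
```

The loop variables are declared inside the `for` statements, so each thread gets its own copies and only the framebuffer is shared.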
I played around with tuning it to my machine and test scenes. Logically, I got the best results by splitting the scene in 2: I have 2 processors, and the load is pretty well balanced in these scenes.
This of course does not scale well. It would be nice to make the code adapt better to both the number of processors and the nature of the scene (load balance).
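One way to avoid hard-coding the split to 2 might be to size it from the thread count at run time. The sketch below assumes a hypothetical `split_rows` helper (not part of my code) that cuts the image into contiguous row bands. `default_band_count` asks OpenMP for its thread limit via `omp_get_max_threads()` when OpenMP is enabled and falls back to 1 otherwise.

```cpp
#include <cassert>
#include <utility>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Number of bands to use: one per available OpenMP thread, or 1 when the
// code is built without OpenMP support.
int default_band_count() {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

// Hypothetical helper: split rows [0, height) into `bands` contiguous
// half-open ranges whose sizes differ by at most one row.
std::vector<std::pair<int, int>> split_rows(int height, int bands) {
    assert(height >= 0 && bands > 0);
    std::vector<std::pair<int, int>> ranges;
    int begin = 0;
    for (int b = 0; b < bands; ++b) {
        // The first (height % bands) bands get one extra row.
        int size = height / bands + (b < height % bands ? 1 : 0);
        ranges.emplace_back(begin, begin + size);
        begin += size;
    }
    return ranges;
}
```

This only addresses the processor count. Bands of equal height still balance poorly when one part of the scene is much more expensive than another.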
Simply making the tasks finer-grained (for example, splitting the scene into more chunks, or even going pixel by pixel) comes at the cost of more scheduling overhead, which is not negligible at all. Additionally, the OpenMP schedule clause is normally fixed in the source at compile time, unless schedule(runtime) is used to defer the choice to the OMP_SCHEDULE environment variable. Yes, the 'free lunch is over'...
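To experiment with this trade-off without rebuilding, something along these lines may help. It is only a sketch: `row_cost` is a hypothetical workload whose cost grows with the line number to mimic an unbalanced scene. The loop uses `schedule(runtime)`, which takes the schedule kind and chunk size from the `OMP_SCHEDULE` environment variable (for example `OMP_SCHEDULE="dynamic,8"`).

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-line workload: higher line numbers are made
// artificially more expensive to mimic an unbalanced scene.
long row_cost(int y) {
    long sum = 0;
    for (int i = 0; i < (y + 1) * 1000; ++i) sum += i % 7;
    return sum;
}

// Process every line. How lines are assigned to threads is decided at
// run time from OMP_SCHEDULE rather than fixed in the source.
std::vector<long> process_rows(int height) {
    assert(height >= 0);
    std::vector<long> result(height);
    #pragma omp parallel for schedule(runtime)
    for (int y = 0; y < height; ++y) {
        result[y] = row_cost(y);
    }
    return result;
}
```

Timing the same binary with `OMP_SCHEDULE="static"` and then with `OMP_SCHEDULE="dynamic,1"` should make the cost of finer scheduling versus the gain in load balance directly measurable on a given machine.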
I thought this would be better in OpenMP than in the DIY solution, but for now I have the feeling that both have to be fine-tuned to a specific type of machine and processor count.
OpenMP is 'standard' and takes care of all the argument marshalling headaches. It also surely beats any experimental DIY solution in terms of robustness and stability, so unless really necessary it is of course the better choice. What counts as 'really necessary' depends on the specific needs: OpenMP can be too simple and too limiting, making it unusable in many cases, but when it fits, it fits!
That said, I will still be trying to improve the OpenMP solution a bit before coding a basic TBB version.