The term "time to first pixel" refers to the amount of time it takes RfK and Katana to traverse the scene graph and generate the RIB necessary for RenderMan to start rendering pixels. Contrary to some popular beliefs, it is possible to send RIB to PRMan from Katana in parallel using RenderMan RiProcedurals. This has the potential to greatly reduce the time it takes to spin up a render; however, some care must be taken to set up the scene so that it takes best advantage of this feature. In a nutshell:
The multiThreaded setting is used to inform PRMan that the procedurals generated by RfK are reentrant and eligible to be placed in a new thread. With forceExpand enabled, no procedurals will be generated in the first place; likewise, RfK cannot generate procedurals for sections of the scene that have no bounds.
Parallel generation of RIB can only happen through the use of RiProcedurals. (Note: this is not about PRMan rendering itself, but about the parallel execution of the Ri calls that tell the renderer what to render and how to render it.) As RfK traverses the scene graph, it will generate a procedural when it hits a bounded group or instance (locations of type assembly and component are considered to be groups in this context). Specifically, a procedural will be generated and a new thread spawned only if all of the requirements below are satisfied:
If the setting forceExpand is enabled at any location then that location and all of its children will be evaluated immediately without the use of procedurals. This will effectively disable parallel RIB gen on that portion of the scene graph. If forceExpand is set true at "/root" then the entire scene graph will be traversed in a single thread.
PRMan procedurals are not reentrant by default. Each procedural generated within RfK needs to be explicitly marked as reentrant. The global setting multiThreaded controls whether RfK marks all procedurals as reentrant. Currently this is not available as a per-object setting.
Below are details regarding the settings used in determining the multi-threaded RIB gen behavior.
Attribute | Default | Effect | Details |
---|---|---|---|
prmanGlobalStatements.plugin.multiThreaded | 0 | Turning this on enables multi-threaded procedurals in PRMan. | The value of this setting is directly translated to the value of RiOption "procedural" "reentrant", which is set before calling RiProcedural. |
forceExpand | 0 | Turning this on disables procedurals entirely for the location and all of its children. | This setting is not exposed in PrmanGlobalStatements or PrmanObjectStatements. It must be set directly in the Node Graph. |
type | n/a | Only locations of type group, assembly, component, and instance source are eligible for processing via procedural. | |
bound | none | Locations without a valid bound attribute cannot be processed via procedural and therefore cannot be evaluated in parallel with other portions of the scene graph. | PRMan requires bounds in order to evaluate when a procedural can be "cracked". The evaluation is based on shader rays; when a ray intersects the bounds of a procedural, PRMan knows that the contents must be evaluated. |
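
The rules in the table can be modeled with a few lines of Python. This is only a sketch of the decision logic as described here, not RfK's implementation: the `Location` class, the `procedural_paths` and `rib_gen_mode` helpers, and the toy scene are all hypothetical names invented for illustration, and none of them are Katana or RfK APIs.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Location types the table lists as eligible for a procedural.
ELIGIBLE_TYPES = {"group", "assembly", "component", "instance source"}


@dataclass
class Location:
    """Toy stand-in for a scene graph location (hypothetical, not a Katana type)."""
    name: str
    type: str = "group"
    bound: Optional[Tuple[float, ...]] = None  # None models a missing bound attribute
    force_expand: bool = False
    children: List["Location"] = field(default_factory=list)


def procedural_paths(loc, parent_path="", inherited_force_expand=False):
    """Return the paths of locations that would be emitted as procedurals."""
    path = f"{parent_path}/{loc.name}"
    # forceExpand applies to the location where it is set and to all descendants.
    force_expand = inherited_force_expand or loc.force_expand
    eligible = (
        not force_expand
        and loc.type in ELIGIBLE_TYPES
        and loc.bound is not None
    )
    paths = [path] if eligible else []
    for child in loc.children:
        paths += procedural_paths(child, path, force_expand)
    return paths


def rib_gen_mode(paths, multi_threaded):
    """Parallel RIB gen needs multiThreaded on and at least one procedural."""
    return "parallel" if multi_threaded and paths else "serial"


BOX = (-1.0, 1.0, -1.0, 1.0, -1.0, 1.0)  # placeholder bound values

# Made-up scene: only "geo" and "house" are bounded and free of forceExpand.
scene = Location("root", children=[
    Location("world", children=[
        Location("geo", bound=BOX, children=[
            Location("house", type="assembly", bound=BOX),
            Location("trees", bound=BOX, force_expand=True, children=[
                Location("oak", type="component", bound=BOX),
            ]),
        ]),
        Location("lights"),
    ]),
])

paths = procedural_paths(scene)
print(paths)
print(rib_gen_mode(paths, multi_threaded=True))
```

In this model, setting `force_expand=True` on the root location empties the list of procedurals, and turning `multi_threaded` off yields serial RIB gen no matter how many bounded groups exist.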
Using the scene graph from the Katana "houseScene" demo data ($KATANA_HOME/demos/houseScene_data/houseScene.xml), we can demonstrate the interactions between the various attributes and how they affect the parallel RIB gen behavior within RfK:
If multiThreaded is set to false then none of the other settings matter; RIB gen will be serial.
Some amount of experimentation and thoughtful design needs to go into the scene organization for optimal time-to-first-pixel. There is no hard and fast rule; however, we do have some hints:
If forceExpand is set true at /root then RIB gen will be serial regardless of which locations have bounds. If bounds are set up throughout the scene (option 5 above) then forceExpand can be used to disable the bounds at specific groups. Once forceExpand is set at a location it will be passed down through to that location's children, effectively disabling any parallel RIB gen for that scene graph sub-hierarchy.
So the gist of the story is "yes, you can do this in parallel but it takes a bit of effort to do right". So then, how does one tune their scene? By evaluation and experimentation.
[RfKProfile]: time-to-first-pixel (approx) = 4.093869s
Now you've got a baseline. Run it again with a different number of threads. Increase or decrease to get a feel for where it starts falling over. Ideally your time continues to decrease as your threads increase, but in reality you may see it get worse with more threads. When that happens, evaluate whether there is enough work in each procedural to warrant a thread, or whether the threads are spending all their time spinning, waiting for more work.
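
When comparing several runs, it can help to pull the profile figure out of the render log automatically. The sketch below assumes the log contains a line in exactly the `[RfKProfile]` format shown above; the helper names `time_to_first_pixel` and `best_thread_count` are hypothetical, and how you capture the log text for each run is left to your own setup.

```python
import re

# Matches the profile line format shown above. Anything else about the
# surrounding log layout is an assumption.
PROFILE_RE = re.compile(
    r"\[RfKProfile\]:\s*time-to-first-pixel\s+\(approx\)\s*=\s*([0-9]*\.?[0-9]+)s"
)


def time_to_first_pixel(log_text):
    """Return the last reported time-to-first-pixel in seconds, or None."""
    matches = PROFILE_RE.findall(log_text)
    return float(matches[-1]) if matches else None


def best_thread_count(timings):
    """Given {thread_count: seconds}, return the thread count with the lowest time."""
    return min(timings, key=timings.get)


sample = "[RfKProfile]: time-to-first-pixel (approx) = 4.093869s"
baseline = time_to_first_pixel(sample)
print(baseline)
```

Collecting one value per thread count into a dictionary and passing it to `best_thread_count` gives a quick way to see where adding threads stops paying off.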
Next start experimenting with placement of bounds to give the renderer more or less work depending on how you see the threads scale.
Lather. Rinse. Repeat.
Details of performance profiling are beyond our scope; here are some links to information elsewhere: