...

Checkpoints are snapshots that save the state of the image as the renderer works on it. While viewable as ordinary images, they are slightly larger than usual because they embed extra state that the renderer needs in order to recover the render. If the render is interrupted or fails for some reason, the renderer can resume from the last checkpoint image. If the render finishes instead, the extra state is removed when the final version of the image is written. There are two main ways to produce checkpoints:

...

Alternatively, you could set the interval to 300s and the renderer will update the image approximately every five minutes after it starts work on the frame. This time includes the renderer startup time, such as parsing RIB, cracking procedurals, and building ray tracing acceleration structures. As a result, there may be fewer increments before the first checkpoint than between later checkpoints. For convenience, the time-based interval can also be specified with suffixes of s, m, h, or d for seconds, minutes, hours, and days respectively, and these may be combined. For example, intervals of 360s, 6m, 5m60s, and 0.1h are all equivalent. Instead of a suffix, you can also specify a bare positive number, in which case time in seconds is assumed, while a negative number is interpreted as a number of increments.
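The suffix rules above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not prman's own parser: the function name interval_to_seconds is hypothetical, and the sketch handles only the time-based forms, so negative numbers (increment counts) are rejected.

```python
import re

# Seconds per suffix, as described in the text (s, m, h, d).
_UNIT_SECONDS = {"s": 1.0, "m": 60.0, "h": 3600.0, "d": 86400.0}
_NUMBER = r"\d+(?:\.\d+)?"
_TERM = re.compile(rf"({_NUMBER})([smhd])")


def interval_to_seconds(text):
    """Convert a time-based interval such as '5m60s' to seconds.

    Hypothetical helper for illustration only. A bare positive number is
    taken as seconds; anything else must be one or more <number><suffix>
    terms written back to back.
    """
    text = text.strip()
    if re.fullmatch(_NUMBER, text):
        return float(text)
    if not re.fullmatch(rf"(?:{_NUMBER}[smhd])+", text):
        raise ValueError(f"not a time-based interval: {text!r}")
    # Sum every <number><suffix> term, e.g. 5m + 60s.
    return sum(float(n) * _UNIT_SECONDS[u] for n, u in _TERM.findall(text))


if __name__ == "__main__":
    # The four equivalent spellings from the text.
    for example in ("360s", "6m", "5m60s", "0.1h"):
        print(example, interval_to_seconds(example))
```

All four spellings evaluate to six minutes, up to floating-point rounding for the fractional form.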

...

Note that checkpointing is designed for batch rendering to images on disk. Renders to a live framebuffer such as "it" are already updated on the fly as the render proceeds. Of the built-in display drivers, currently only the TIFF and OpenEXR display drivers support checkpointing. You may notice that the files produced include a channel labeled ~w. This channel holds the data stored to resume a render from the file.

How to Specify

In RIB, checkpoints can be specified via an Option or by passing the optional -checkpoint argument to prman.

Code Block
prman -checkpoint t[,t]

The argument consists of the checkpoint interval followed by an optional exit time (shown in square brackets).

Several types of interval are available, and they may be combined:

  • 60s - this is 60 seconds
  • 60m - this is 60 minutes
  • 60h - this is 60 hours
  • 60d - this is 60 days
  • 60i - this is 60 increments (incremental frame rendering, this is ignored if you're rendering to tiles/buckets)

You can combine these as needed. For example, -checkpoint 1h30m creates a checkpoint every hour and 30 minutes.

The options for exitat are the same, and an exit time can even be provided without asking for any checkpoints to be written. In that case you have simply told the renderer to stop the render and write out the result at a certain time. Below we write out a final result and exit at 30 minutes:

Code Block
prman -checkpoint ,30m
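As a small sketch of the t[,t] syntax, the Python helper below assembles the value passed to -checkpoint from an optional interval and an optional exit time. The function name checkpoint_argument and its parameter names are hypothetical; the helper only joins strings and does not validate the units.

```python
def checkpoint_argument(interval="", exit_at=""):
    """Build the value for prman's -checkpoint flag.

    Hypothetical helper for illustration. 'interval' and 'exit_at' use the
    units listed above (for example '1h30m'); either may be left empty,
    but not both.
    """
    if not interval and not exit_at:
        raise ValueError("give an interval, an exit time, or both")
    # The exit time, when present, follows a comma: t[,t]
    return interval + ("," + exit_at if exit_at else "")


if __name__ == "__main__":
    print("prman -checkpoint", checkpoint_argument("1h30m"))
    print("prman -checkpoint", checkpoint_argument(exit_at="30m"))
    print("prman -checkpoint", checkpoint_argument("5m", "30m"))
```

Leaving the interval empty produces the leading comma used in the exit-only example.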


It is possible to maintain a checkpointed "final" image using either an .ini setting or a RIB option, with the latter overriding the former. If neither is present, it defaults to off. When set, this prevents the extra channels and the checkpoint tag from being removed when the final image for the render is written. The final image will essentially be just another checkpoint, rather than a slimmed-down image. This means that once your image has reached the quality you've set and the render completes, it can still be restarted by the user later:

Code Block
/prman/checkpoint/asfinal [bool]

Option "checkpoint" "int asfinal" [0|1]

Temporary Files While Writing


When writing checkpoint files, we now write them to temporary files with a .part extension added. E.g., foo.exr will be written to foo.exr.part and kittens.ddc will be written to kittens.ddc.part during the checkpointing process.

Only after all of the files to write for a checkpoint have been written will these files be renamed to strip the .part extension and overwrite the previous checkpoint. This step happens immediately before the postcheckpoint command is run, if any.

Though not strictly atomic, this renaming is done in as small a window of time as the OS permits in order to avoid mixing new checkpoint files with old. If the renderer is killed or dies for some reason while checkpointing, there may be some .part files left over. Deleting these should safely leave the previous checkpoint intact. Note that looking for these .part files is one way of detecting whether the renderer was killed during a checkpoint. Note too that this new behavior means that the peak disk space used by checkpointing is now doubled.
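Checking for leftover .part files can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes all checkpoint outputs live in a single directory. It should only be run when no renderer is still writing to that directory, since .part files also exist briefly during a normal checkpoint.

```python
from pathlib import Path


def find_partial_checkpoints(directory):
    """Return leftover '*.part' files in 'directory', sorted by name."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.name.endswith(".part")
    )


def remove_partial_checkpoints(directory):
    """Delete leftover '*.part' files and return the paths that were removed.

    Files without the .part extension (the previous checkpoint) are left
    untouched.
    """
    removed = []
    for part in find_partial_checkpoints(directory):
        part.unlink()
        removed.append(part)
    return removed
```

A non-empty result from find_partial_checkpoints suggests the renderer was interrupted while writing a checkpoint.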

Deep Data

Warning

When checkpointing a deep EXR with adaptive sampling enabled, all of the AOVs considered by the adaptive sampler (e.g., normally the beauty) must also be written to a shallow EXR. Otherwise, the checkpoint will not be recoverable.

We support Deep Data Checkpointing (DDC files) when rendering to Deep EXR formats. When using this feature, each checkpoint writes not only a "shallow" EXR to disk but also a DDC file, which uses lossless compression, for recovering the deep data. Since checkpointing often leaves multiple files on disk and deep data can be expensive to store, there is an option to compress the resulting deep data.

Compression can be set via a rendermn.ini option:

Code Block
/prman/checkpoint/ddccomplvl  7

The default level is 7. The typical range is 1 to 20 where higher numbers increase the compression at the cost of speed.

Note

The default level of 7 is chosen as a balance between size and performance. Values above 7 will decrease the performance of writing the files with diminishing returns on compression as values go higher. For this reason we do not recommend going above 7 unless your storage capacity is severely limited. A value of 1 will result in large files but the best performance and is ideal when storage for DDC files is plentiful.

To reiterate, note that this is lossless compression, unlike the legacy deepshadowerror rendering option. The choice of the ddccomplvl has no effect on image quality after recovery and is purely a question of checkpoint time performance versus disk space.

Skipping Deep EXRs or DDC files


When recovering checkpoints involving deep data, any deep EXRs are ignored and only the DDC files are used. If you do not need to view the deep EXRs from a checkpoint, you can now save both disk space and the time spent on processing and I/O to generate these deep EXRs by disabling them. To do this, set the following in your rendermn.ini to anything "truthy" (e.g., true, yes, on, or 1):


Code Block
/prman/checkpoint/skipdeepexr true


Even with this option set, we still write the deep EXRs when the render finishes normally (i.e., either hitting max samples or stopped by the adaptive sampler) since you cannot directly view or composite with the DDC files.

Also note that this option only applies to deep EXR files. All shallow EXRs are still written out with checkpoints as normal, albeit with the usual checkpointing channels and metadata.

The counterpart to all this is that the renderer now also normally saves time by skipping the writing of DDC checkpoint files when the render finishes and it is writing the final deep EXR image. You can use the existing asfinal option (either Option "checkpoint" "int asfinal" in the RIB or /prman/checkpoint/asfinal 1 in the rendermn.ini) to prevent this and have it write the DDC files anyway.

Finally, note that we continue the prior behavior of not deleting the last DDC files, even when the renderer finishes normally and asfinal is not set. At that point, the DDC files correspond to the checkpoint just before finishing and become stale data. This is something we may change in the future, depending on feedback, but remains the behavior for now.


Elapsed Time


This doesn’t strictly pertain to deep checkpoints, but shallow EXR checkpoint images now contain a new EXR attribute, checkpointElapsed. This represents the elapsed time in seconds since the renderer started until checkpointing the current image begins. It does not include prior rendering time if this process was started from a recovered checkpoint.

Recovery

Recovery of an interrupted render is enabled by passing the -recover 1 option to prman when starting a render. RenderMan will then load the scene as normal, but rather than starting from scratch and overwriting the existing images, it will examine them to determine where it was interrupted. If successful, it will continue from close to the point where it left off. If instead the images were finished, are missing, or don't match the current scene or each other for some reason, it will silently start from scratch.
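Because the renderer falls back to starting from scratch when recovery is not possible, a job wrapper can pass -recover 1 on every launch. The Python sketch below only assembles such a command line. The function name recover_command and the file name scene.rib are hypothetical, and the sketch assumes prman is on the PATH if the command is actually run.

```python
import shlex


def recover_command(rib_path):
    """Return the argv list for launching prman with recovery enabled.

    Hypothetical helper for illustration; it uses only the -recover 1
    option described above and places the RIB file last.
    """
    return ["prman", "-recover", "1", rib_path]


if __name__ == "__main__":
    argv = recover_command("scene.rib")  # hypothetical RIB file name
    # shlex.join quotes arguments so the line can be pasted into a shell.
    print(shlex.join(argv))
```

The argv list can be handed to subprocess.run as is, which avoids shell quoting issues with paths that contain spaces.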

...