Tractor 2.x Features
The great new features in the Tractor 2.x releases extend the core Tractor system established in the 1.x family of releases. There is a broad range of new additions and improvements, from productive new command line and scripting interfaces for wranglers to simple user interface changes. Internal upgrades range from a new high-performance, high-capability job database to new studio-wide resource sharing and allocation controls.
Here are some highlights:
Tractor Product Layout -- Single Release Directory, Single Download per platform, Bundled Subsystem Updates -- The Tractor 2.x packaging and installation layout includes a matched set of Tractor components all in one download: engine, blade, spooler, user interfaces, and scripting tools. They are all installed together in one versioned area, along with only one copy of matched shared resources including pre-built versions of several third-party subsystems.
Tractor Query Tools -- Introducing tq, the Tractor query command line tool and modules. Based on proven Pixar studio tools, tq is the best way to query live or historical Tractor data from your terminal shell, from your Python scripts, or from a new tab in the Dashboard.
Adaptive Farm Allocations -- A way to dynamically allocate abstract resources between people or projects using Tractor's flexible Limits system. If two films are in production, 60% of the farm can be allocated to one of them, 30% to the other, leaving the remaining 10% for other projects. If one show is idle the others can temporarily expand their shares, then shrink back to the nominal levels when all projects are active.
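The arithmetic behind the 60/30/10 example can be sketched in a few lines of Python. This is only an illustration: the function name, the project names, and the rule that an idle project's share is redistributed in proportion to the nominal shares of the active projects are all assumptions made for the example, not a description of how tractor-engine computes allocations.

```python
def effective_shares(nominal_percent, active):
    """Return each project's share of the farm as a fraction of 1.0.

    Assumption for this sketch: idle projects receive nothing, and their
    nominal share is split among the active projects in proportion to
    those projects' own nominal shares.
    """
    active_total = sum(
        pct for name, pct in nominal_percent.items() if name in active
    )
    if active_total == 0:
        # Nothing is active, so nothing is allocated.
        return {name: 0.0 for name in nominal_percent}
    return {
        name: (pct / active_total if name in active else 0.0)
        for name, pct in nominal_percent.items()
    }


# Hypothetical project names; the percentages come from the example above.
nominal = {"filmA": 60, "filmB": 30, "other": 10}

# All projects busy: everyone sits at their nominal level.
all_active = effective_shares(nominal, {"filmA", "filmB", "other"})

# filmB goes idle: filmA and "other" temporarily expand.
film_b_idle = effective_shares(nominal, {"filmA", "other"})
```

When filmB becomes active again, calling the function with all three names returns the shares to their nominal levels.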
Dispatching Tiers -- A simple way to organize broad sets of jobs into a descending set of site-defined priority groups. The default tiers are named: admin, rush, default, batch. Create your own!
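As a rough illustration of what a descending set of priority groups means for ordering, the sketch below sorts a handful of jobs by tier first. The job records and their field names are hypothetical, and breaking ties within a tier by a numeric priority is an assumption of the example rather than a statement about the engine's dispatching rules.

```python
# Default tier names from the text above, highest precedence first.
TIER_ORDER = ["admin", "rush", "default", "batch"]


def dispatch_order(jobs, tier_order=TIER_ORDER):
    """Sort jobs by tier rank, then by descending priority within a tier.

    The within-tier priority rule is an assumption made for this sketch.
    """
    rank = {tier: index for index, tier in enumerate(tier_order)}
    return sorted(jobs, key=lambda job: (rank[job["tier"]], -job["priority"]))


# Hypothetical job records, not real Tractor job database rows.
jobs = [
    {"jid": 1, "tier": "batch", "priority": 500},
    {"jid": 2, "tier": "rush", "priority": 10},
    {"jid": 3, "tier": "default", "priority": 100},
    {"jid": 4, "tier": "rush", "priority": 50},
    {"jid": 5, "tier": "admin", "priority": 1},
]

ordered_jids = [job["jid"] for job in dispatch_order(jobs)]
```

Note that the batch job sorts last despite having the largest priority value, because the tier is compared before anything else.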
Custom Menu Actions -- Add site-defined Dashboard menu items that can invoke your own centralized scripts, parameterized by the user's current list selection.
Job Authoring API -- A new tractor.api.author module allows your Python scripts to easily create Job, Task, and Command objects linked together according to your dependency requirements. The Job.spool() method then sends the resulting job to the tractor-engine job queue.
Simple Engine Discovery -- A simple "zero-config" announcement capability for small studios allows tractor-blades and other Tractor tools to find tractor-engine on the local network without requiring manual nameserver (DNS/LDAP) configuration changes. Tractor-engine will automatically disable this SSDP-style traffic at studios where the hostname alias "tractor-engine" has already been created by an administrator in the site nameserver database.
Checkpoint-Resume Support -- Extensions to job scripting, dispatching, and the Dashboard add interesting new capabilities related to incremental computation. Tractor also supports a general "retry from last checkpoint" scheme. Both features integrate with the new RenderMan 19 RIS checkpoint and incremental rendering features.
Blade Auto-Update -- A simple tractor-blade patch management system allows administrators to "pin" the farm to a particular blade patch version, and automatically push out a new version to the entire farm. Out of date blades restart themselves using the new module version.
Pluggable Authentication Module (PAM) support -- The engine's optional new built-in PAM support delegates password validation directly to the operating system on the engine host. This alternative makes it simple to enable password support at studios where the LAN already provides adequate credential transport security.
Privilege Insulation -- The EngineOwner setting in tractor.config specifies the login identity under which tractor-engine should operate. This setting is important because it allows the engine to drop root privileges immediately after it has acquired any protected network ports that it may need. The engine's normal day-after-day operations will then occur under the restrictions associated with the specified login name.
Dynamic Service Key Advertisement -- Several blade profile "Provides" modes have been added to support some advanced service key use cases. For example, blades can dynamically advertise a different set of service key capabilities depending on which keys have already been "consumed" by previously launched commands.
Resource Usage Tracking -- The operating system rusage process metrics CPU, RSS, and VSZ are now recorded into the job database for each launched command. Currently supported on Linux and OSX tractor-blades.
Command Retry History -- A unique tracking record is now created in the job database for every command launch attempt. The history of retries on a given task can then be reviewed, for example with the tq tool.
Configuration File Loading -- A streamlined override system can help to reduce clutter and improve clarity about which files have been modified from their original "factory settings" at your studio.
Task Concurrency Throttle -- Each job can specify a "maxActive" attribute to constrain the number of concurrently running tasks from that job. This quick wrangling control over a job's "footprint" size on the farm can be useful when changing the full site-wide Limits settings is not appropriate.
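The effect of a per-job cap can be pictured with a small sketch. The function and task names here are hypothetical, and using None to mean "no cap" is a convention of this example only; it does not describe how the engine represents an unset maxActive value.

```python
def tasks_to_launch(ready_tasks, running_count, max_active=None):
    """Pick the ready tasks that may start without exceeding the cap.

    max_active=None means the job is unconstrained (a convention of this
    sketch, not of Tractor itself).
    """
    if max_active is None:
        return list(ready_tasks)
    free_slots = max(0, max_active - running_count)
    return list(ready_tasks[:free_slots])


# Hypothetical task names: three tasks are ready, three are already running.
ready = ["frame.0004", "frame.0005", "frame.0006"]
picked = tasks_to_launch(ready, running_count=3, max_active=4)
```

With a cap of 4 and 3 tasks already running, only one more task is picked; the rest wait until a running task finishes.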
Automatic Blade Error Throttle -- This blade profile setting will prevent blades from picking up new work if they encounter too many errors within a given time interval.
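One common way to implement "too many errors within a given time interval" is a sliding window of error timestamps. The sketch below shows that general technique only; the class, its parameter names, and the exact threshold comparison are assumptions for illustration and are not taken from the tractor-blade implementation or its profile settings.

```python
from collections import deque


class ErrorThrottle:
    """Sliding-window error counter (illustrative, not the blade's code)."""

    def __init__(self, max_errors, window_seconds):
        self.max_errors = max_errors
        self.window_seconds = window_seconds
        self._errors = deque()  # timestamps of recent errors, oldest first

    def record_error(self, now):
        """Remember that a command failed at time `now` (in seconds)."""
        self._errors.append(now)

    def accepts_work(self, now):
        """Return True if fewer than max_errors occurred in the window."""
        # Forget errors that have aged out of the window.
        while self._errors and now - self._errors[0] >= self.window_seconds:
            self._errors.popleft()
        return len(self._errors) < self.max_errors


# Hypothetical settings: stop taking work after 3 errors within 60 seconds.
throttle = ErrorThrottle(max_errors=3, window_seconds=60)
for timestamp in (0, 10, 20):
    throttle.record_error(timestamp)

blocked_at_25 = throttle.accepts_work(now=25)
resumed_at_61 = throttle.accepts_work(now=61)
```

At 25 seconds all three errors are still inside the window, so the blade in this sketch declines new work. At 61 seconds the oldest error has aged out and work is accepted again.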
Job Spooling Improvements -- Job processing upgrades include faster processing, better error checking, and bundling of required subsystems. A parallelized job intake and database staging scheme can dramatically reduce backlogs when many jobs are spooled simultaneously, or when many "expand" tasks are running in parallel. A self-contained Tcl interpreter bundled with the spooler simplifies site install requirements and can perform client-side error checking prior to job delivery to the engine. A new JSON job spooling format is also supported (but not available prior to beta-1 pending changes).
- A fairly invasive refactoring effort, simplification, and mutex contention changes to address several bottlenecks, especially with regard to frequent password checks and identity management.
- A more robust job-global system for sorting newly ready commands produced by "expand" tasks. This addresses the "Cmd not Ready?" error problem - which was due to sorting key collisions (precision) on large recursively expanded jobs.
- Optimization of some "inner loop" limit checking operations, especially around repeated tokenization, trace message construction, and handling of limits that are currently preventing new dispatches because they are at their max value.
- Addition of a new "connecting" entry in the queue/backlog diagnostic ("status&qlen=1" query). This count gives distinct visibility to the number of inbound RESTful requests that are incomplete, still awaiting network I/O for their complete HTTP request body contents. They are "connected but incomplete, not yet dispatched into the engine modules for processing."
- Fix for a string handling bug that could sometimes result in intermittent loss of access to task log output.
- Fix for a tractor-engine crash during error message construction in response to unparseable characters in "expand task" job graph extensions.
- Added some additional execution handler threads for running custom menu item backend scripts. The number of threads now scales up along with other thread pools based on configuration settings.
- Updated environment handlers and paths to accommodate batch processing of RenderMan 22 scenes from Maya and Katana.
- Addressed a Dashboard selection and copy issue related to a change of Firefox webkit css settings.
- Updated Dashboard links to the newest Tractor documentation.
- Addressed a path separator issue on Windows in the Python Job Authoring module.
- Addressed a task state transition race condition in some "expand chunk" use cases.
- Fixed full job restart pruning of previously expanded tasks that were created by the "expand chunk" mechanism.
- Addressed potential security issues related to malicious interface use by on-site Tractor users.
- NOTE: Upgrading to Tractor 2.2 is "permanent" in the sense that you cannot revert to an older tractor-engine while also retaining your old jobs once the 2.2 job database upgrade has been performed. If you BACK UP your current job database before installing 2.2, then it is possible to revert to the older engine version along with jobs restored to their state at the time of the backup. Please refer to the guidelines described in Upgrading.
- Upgrading to 2.2
- Upgrading to 2.1
- Upgrading from 1.x