Tractor 2.x Features

New features in Tractor 2.x releases extend the core Tractor system established in the 1.x family of releases. There is a broad range of additions and improvements, from productive new command line and scripting interfaces for wranglers to simple user interface changes. Internal upgrades range from a new high-performance, high-capability job database to new studio-wide resource sharing and allocation controls. Please refer to the guidelines described in Upgrading.

Here are some highlights:

  • Tractor Product Layout -- Single Release Directory, Single Download per platform, Bundled Subsystem Updates -- The Tractor 2.x packaging and installation layout includes a matched set of Tractor components all in one download: engine, blade, spooler, user interfaces, and scripting tools. They are all installed together in one versioned area, along with only one copy of matched shared resources including pre-built versions of several third-party subsystems.

  • Tractor Query Tools -- Introducing tq, the Tractor query command line tool and modules. Based on proven Pixar studio tools, tq is the best way to query live or historical Tractor data from your terminal shell, from your Python scripts, or from a new tab in the Dashboard.

  • Adaptive Farm Allocations -- A way to dynamically allocate abstract resources between people or projects using Tractor's flexible Limits system. If two films are in production, 60% of the farm can be allocated to one of them, 30% to the other, leaving the remaining 10% for other projects. If one show is idle the others can temporarily expand their shares, then shrink back to the nominal levels when all projects are active.

  • Dispatching Tiers -- A simple way to organize broad sets of jobs into a descending set of site-defined priority groups. The default tiers are named: admin, rush, default, batch. Create your own!

  • Custom Menu Actions -- Add site-defined Dashboard menu items that can invoke your own centralized scripts, parameterized by the user's current list selection.

  • Job Authoring API -- A new tractor.api.author module allows your Python scripts to easily create Job, Task, and Command objects linked together according to your dependency requirements. The Job.spool() method then sends the resulting job to the tractor-engine job queue.

  • Simple Engine Discovery -- A simple "zero-config" announcement capability for small studios allows tractor-blades and other Tractor tools to find tractor-engine on the local network without requiring manual nameserver (DNS/LDAP) configuration changes. Tractor-engine will automatically disable this SSDP-style traffic at studios where the hostname alias "tractor-engine" has already been created by an administrator in the site nameserver database.

  • Checkpoint-Resume Support -- Extensions to job scripting, dispatching, and the Dashboard add interesting new capabilities related to incremental computation. Tractor also supports a general "retry from last checkpoint" scheme. Both features integrate with the new RenderMan 19 RIS checkpoint and incremental rendering features.

  • Blade Auto-Update -- A simple tractor-blade patch management system allows administrators to "pin" the farm to a particular blade patch version, and automatically push out a new version to the entire farm. Out of date blades restart themselves using the new module version.

  • Pluggable Authentication Module (PAM) support -- The engine's optional new built-in PAM support delegates password validation directly to the operating system on the engine host. This alternative makes it simple to enable password support at studios where the LAN already provides adequate credential transport security.

  • Privilege Insulation -- The EngineOwner setting in tractor.config specifies the login identity under which tractor-engine should operate. This setting is important because it allows the engine to drop root privileges immediately after it has acquired any protected network ports that it may need. The engine's normal day-after-day operations will then occur under the restrictions associated with the specified login name.

  • Dynamic Service Key Advertisement -- Several blade profile "Provides" modes have been added to support some advanced service key use cases. For example, blades can dynamically advertise a different set of service key capabilities depending on which keys have already been "consumed" by previously launched commands.

  • Resource Usage Tracking -- The operating system rusage process metrics CPU, RSS, and VSZ are now recorded into the job database for each launched command. Currently supported on Linux and OSX tractor-blades.

  • Command Retry History -- A unique tracking record is now created in the job database for every command launch attempt. So the history of retries on a given task can be reviewed using the tq tool, for example.

  • Configuration File Loading -- A streamlined override system can help to reduce clutter and improve clarity about which files have been modified from their original "factory settings" at your studio.

  • Task Concurrency Throttle -- Each job can specify a "maxActive" attribute to constrain the number of concurrently running tasks from that job. This quick wrangling control over a job's "footprint" size on the farm can be useful when changing the full site-wide Limits settings is not appropriate.

  • Automatic Blade Error Throttle -- This blade profile setting will prevent blades from picking up new work if they encounter too many errors within a given time interval.

  • Job Spooling Improvements -- Job processing upgrades include faster processing, better error checking, and bundling of required subsystems. A parallelized job intake and database staging scheme can dramatically reduce backlogs when many jobs are spooled simultaneously, or when many "expand" tasks are running in parallel. A self-contained Tcl interpreter bundled with the spooler simplifies site install requirements and can perform client-side error checking prior to job delivery to the engine. A new JSON job spooling format is also supported (but not available prior to beta-1 pending changes).
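The Adaptive Farm Allocations highlight above describes nominal shares (60%, 30%, 10%) that temporarily expand when another project is idle. The following is a minimal sketch of that idea in plain Python, not Tractor's actual Limits implementation. It assumes idle capacity is lent out in proportion to each active project's nominal share; the function name and project names are hypothetical.

```python
def effective_shares(nominal, active):
    """Lend the shares of idle projects to the active ones, pro rata.

    nominal: dict mapping project name -> nominal fraction of the farm.
    active:  set of project names that currently have work to run.
    Returns a dict mapping each active project to its temporary share.
    """
    total = sum(nominal.values())
    active_total = sum(s for name, s in nominal.items() if name in active)
    if active_total == 0:
        return {}  # nothing is active, so nothing is allocated
    # Scale the active shares up so that together they cover the whole farm.
    # This proportional rule is an assumption made for illustration.
    return {name: s * total / active_total
            for name, s in nominal.items() if name in active}


# Nominal split taken from the example in the text.
nominal = {"filmA": 0.60, "filmB": 0.30, "other": 0.10}

print(effective_shares(nominal, {"filmA", "filmB", "other"}))  # all active
print(effective_shares(nominal, {"filmA", "other"}))           # filmB idle
```

When every project is active the nominal levels are returned unchanged; when one is idle the remaining projects grow, and they shrink back as soon as the idle project becomes active again.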

     

...
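The Dispatching Tiers highlight describes jobs organized into a descending set of priority groups, with default tiers named admin, rush, default, and batch. As a rough illustration only, the sketch below orders a list of jobs by tier first. The job dictionaries, the within-tier ordering by a numeric priority, and the placement of unknown tiers last are all assumptions, not documented engine behavior.

```python
# Default tier names from the text, highest precedence first.
TIER_ORDER = ["admin", "rush", "default", "batch"]


def dispatch_order(jobs, tiers=TIER_ORDER):
    """Return jobs sorted by tier precedence, then by descending priority.

    jobs: list of dicts with hypothetical keys "jid", "tier", "priority".
    Jobs naming a tier that is not in `tiers` sort after all known tiers
    (an assumption made for this sketch).
    """
    rank = {name: i for i, name in enumerate(tiers)}
    return sorted(
        jobs,
        key=lambda job: (rank.get(job["tier"], len(tiers)), -job["priority"]),
    )


jobs = [
    {"jid": 1, "tier": "batch", "priority": 500},
    {"jid": 2, "tier": "rush", "priority": 10},
    {"jid": 3, "tier": "default", "priority": 50},
    {"jid": 4, "tier": "rush", "priority": 90},
    {"jid": 5, "tier": "admin", "priority": 1},
]

for job in dispatch_order(jobs):
    print(job["jid"], job["tier"], job["priority"])
```

The point of the sketch is that tier membership dominates: a low-priority job in the admin tier is still considered before a high-priority job in the batch tier.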

Changes in 2.

Upgrading to 2.2
  • NOTE: Upgrading to Tractor 2.2 is "permanent" in the sense that you cannot revert to an older tractor-engine while also retaining your old jobs once the 2.2 job database upgrade has been performed. If you BACKUP your current job database before installing 2.2, then it is possible to revert to the older engine version along with jobs restored to their state at the time of the backup. Please refer to the guidelines described in Upgrading.
  • Upgrading to 2.2
  • Upgrading to 2.1
  • 2091325

    Bug fixes and improvements

    • Fixed a regression in the prior 2.4 release that meant service keys might be reported incorrectly
    • The launchd plist for running the macOS services had several outdated keys, which caused errors to be reported and, under certain conditions, prevented the service from being restarted
    • On Windows, AMD CPUs like the 3990x would report 64 instead of the expected 128 cores

    Changes in 2.4 2069290

    Tractor 2.4 is an update focused on overall performance, and handling large scale farm size, job size, and number of concurrently connected user sessions. Changes include:

    • A significant code refactoring effort addressed several internal thread contention bottlenecks, including those related to frequent password checks and identity management. The new logic results in improved throughput, especially on very large farms (5000+ blades) where many dashboard sessions and automated scripts (1000+) are accessing job status.

    • A more robust job-global system for sorting newly ready commands produced by "expand" tasks. This change addresses the "Cmd not Ready?" error problem - which was due to sorting key collisions (precision) on large recursively expanded jobs.
    • Optimized limit checking operations, especially around repeated tokenization, trace message construction, and efficient handling of limits at their maximum capacity that temporarily throttle new dispatching.
    • Addition of a new "connecting" entry in the queue/backlog diagnostic (the "status&qlen=1" query). This count gives distinct visibility to the number of connected but incomplete inbound requests, those still awaiting network i/o for their complete http request body contents.
    • The number of handler threads for running custom menu item backend scripts now scales up along with other thread pools based on tractor.config settings.

    • Fix for a string handling issue that could result in intermittent loss of access to task log output.

    • Fix for a tractor-engine crash during error message construction in response to unparseable (control) characters in "expand task" job graph extensions.
    • Fixed Dashboard custom job filter matching on array types such as job Project names. Previously only exact matches were working for these entries, but not "starts with" or "contains" comparisons.
    • Updated command-line options for the tractor-spool job submission utility. Several new formatting options such as zero padding are now supported on "range" values used to construct new jobs directly from command line parameters (as distinct from submitting previously constructed complete job files).
    • Fixed a problem with "tractor-spool --ribs *.rib" wildcard expansion handling on Windows.
    • Fixed mismatched version strings in some of the system service start up scripts.

    Changes in 2.3 1923604

    • Updated environment handlers and paths to accommodate batch processing of RenderMan 22 scenes from Maya and Katana.
    • Addressed a Dashboard selection and copy issue related to a change of Firefox webkit css settings.
    • Updated Dashboard links to the newest Tractor documentation.
    • Addressed a path separator issue on Windows in the Python Job Authoring module.
    • Addressed a task state transition race condition in some "expand chunk" use cases.
    • Fixed full job restart pruning of previously expanded tasks that were created by the "expand chunk" mechanism.
    • Addressed potential security issues related to malicious interface use by on-site Tractor users.


    ...

    Changes in 2.2 1715407

    •  RenderMan (prman) progress messages are now detected correctly by tractor-blade when Katana (renderboot) wraps them in additional logging text.
    • Jobs submitted to tractor-engine from clients using the Python job authoring API method EngineClient.spool() now correctly abide by the current tractor.config setting AllowJobOwnerOverride. Note that this fix requires clients to re-import the new tractor.api.author module.
    • Addressed a tractor-engine threading issue that could cause large "config" backlogs and slow dispatching in the unusual case where tq scripting clients make requests to the engine as usual, but site network routing issues prevent engine reply buffers from being delivered back to a few of those clients. Now those stalled deliveries time out without affecting other transactions.
    • Additional internal handling of the "assigner Cmd not Ready" state inconsistency condition that can arise in some cases of simultaneous task retry and job restart.
    • Fix negative elapsed times that were sometimes displayed in the Dashboard as tasks finished, prior to a display refresh.

    Changes in 2.2 1677499

    • Custom menu items will cause a new window to be opened if there is any script output, even if the menu item is configured to normally suppress a new window. This enables problematic scripts to be more easily detected and debugged.
    • Custom menu items support "login" as a special entry in the "values" list, which will cause the Dashboard user name to be a part of the payload sent to the menu item's script.
    • Custom menu items will now observe user names and "@owner" in the crews attribute for job and task custom menu items.
    • Fixed bug in which selecting the next task by state in the Dashboard task list was not automatically scrolling to the task.
    • Fixed the calculation of the elapsed time of a skipped task in the rollover of the Dashboard job graph.
    • Sorting of tasks in the Dashboard task list has been corrected.
    • Dashboard preview commands can now be displayed and run for archived jobs.
    • The kill operation is now supported in tq and the query API. It is used to kill a running command and leaves the task in an error state.
    • tq has a new reloadconfig command to enable the triggering of configuration file reloading from the command line.
    • The query API will only return non-registered blades when the archive flag is set to True.
    • tractor-dbctl --exec-sql now emits error messages.
    • The blade caches user database entries in order to be more resilient against transient LDAP server outages.
    • The systemd configuration directory defaults to the Tractor installation config/ directory, consistent with the sysvinit setting.
    • systemd now starts tractor-blade as root by default, consistent with the sysvinit setting.

    ...

    Changes in 2.2 1625934

    • Added job service key expression support for blade selection based on "total physical RAM". For example the expression

      RemoteCmd {prman my.rib} -service "PixarRender && @.totalmem > 24"
      

      selects blades that provide the "PixarRender" service (blade.config) and which have at least 24 gigabytes of RAM installed. The previously supported "@.mem" key for "available free RAM" is also still available.

    • Enabled access to BladeUse attributes in 'tq blades' queries: taskcount, slotsinuse, and owners.

    • Added a --user option to the logcleaner utility script so that jobs can be queried as a different user than the process owner that performs the file removal.

    • Fixed the --add and --remove operations in the tq jattr and cattr commands for making relative changes to job and command attributes that are lists.

    • Addressed a tractor-engine socket exception handling issue on Linux for cases where a tractor-blade host (operating system) has become unresponsive, such as in cases of GPU driver or OOM issues or a kernel panic. The tractor-engine process would sometimes exhibit high cpu load in these cases, spinning in the socket handler.

    • Fixed the access-denied advisory text in JSON responses to retry, skip, and job interrupt URL requests.

    • Suggested workaround for RHEL6 PAM-related file descriptor leak:

      On Linux RHEL 6.x era releases, the pam_fprintd.so module
      contains a bug causing it to leak file descriptors on every call from
      tractor-engine.  Since PAM modules are loaded into the tractor-engine
      process, and it performs many authentications over time, the unclosed
      "pipe" descriptors will accumulate, unknown to the main tractor-engine
      code and will eventually exhaust the available file descriptor limit
      for that engine process.  While many studios do not depend on
      fingerprint validation, especially for scripted API access to a system
      service, the "fprint" module is called indirectly from many common
      RHEL6 PAM policies, including "login" and "su".  It has been removed
      from the common policies in RHEL 7 era distributions.  A workaround
      for RHEL6 is to create your own "tractor" policy that doesn't include
      system-auth, or perhaps to specify a less general policy in crews.config,
      such as password-auth.
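The service key expression shown earlier in this list, "PixarRender && @.totalmem > 24", combines a required service name with a numeric test on a blade attribute. The sketch below is a deliberately simplified evaluator for expressions of exactly that shape (terms joined by "&&"), written only to illustrate the matching idea. It is not Tractor's parser, and the blade dictionary layout is a hypothetical stand-in for real blade state.

```python
import operator
import re

# Comparison operators accepted by this simplified sketch.
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}

# Matches terms such as "@.totalmem > 24" or "@.mem>=8.5".
_ATTR_TERM = re.compile(r"^@\.(\w+)\s*(>=|<=|==|>|<)\s*([\d.]+)$")


def blade_matches(expr, blade):
    """Return True if every '&&'-joined term of expr holds for the blade.

    blade: dict with hypothetical keys "provides" (set of service names)
    and "attrs" (dict of numeric attributes such as totalmem, in GB).
    """
    for term in (t.strip() for t in expr.split("&&")):
        match = _ATTR_TERM.match(term)
        if match:
            attr, op, value = match.groups()
            if attr not in blade["attrs"]:
                return False
            if not _OPS[op](blade["attrs"][attr], float(value)):
                return False
        elif term not in blade["provides"]:
            # A bare word is treated as a required service name.
            return False
    return True


expr = "PixarRender && @.totalmem > 24"
big = {"provides": {"PixarRender"}, "attrs": {"totalmem": 32.0, "mem": 20.0}}
small = {"provides": {"PixarRender"}, "attrs": {"totalmem": 16.0, "mem": 9.0}}

print(blade_matches(expr, big))
print(blade_matches(expr, small))
```

Real service key expressions support more than this sketch handles, so treat it as a reading aid for the example rather than a reference for the expression grammar.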
      

    ...

    Changes in 2.2 1614082

    • Added better user attributions in log messages related to job deletes and task actions such as skip, kill, recall, and retry.
    • Improved tractor-engine self recovery from unexpected command signature changes that could result in dispatching stalls in some jobs, and an "assigner Cmd not Ready?" warning in the logs.
    • Fixed an engine problem that would sometimes update task records too early on active tasks that were in the midst of being swept from active blades during a manual retry of a predecessor task. The failed update could lead to an incorrect report of the number of active slots on the given blade.
    • Fixed enforcement of job attribute edit policies such that dashboard edits cannot be applied to a job that is being moved to the archives (aka deleted).
    • Fixed a tractor-blade state checkpoint problem that could sometimes cause state reporting delays when a blade was rebooted.
    • Fixed an engine crash related to handling a deliberately malformed URL query.
    • Added additional exception handler protection in tractor-blade to guard against errors in user-provided custom TractorSiteStatusFilter extensions.
    • Fixed a problem handling very long limit tag names.
    • Improved the dashboard efficiency related to automatic refreshes of job and blade lists, reducing load on the engine and database when many users are connected.
    • Fixed an RPM specification issue that could result in RPM install error messages like "Transaction check error: file /opt from install ..."
    • Fixed tractor init.d scripts to return a non-zero status code (3) when the "status" query is used to check whether the service is running or not.
    • Worked around an init.d built-in function issue that could sometimes result in a "dirname" error in blade service start up.
    • Introduced optional new systemd start up scripts for tractor-engine and tractor-blade, for use on RHEL7 systems for example. The scripts are shipped in /opt/pixar/Tractor-2.2/lib/SystemServices/systemd/. See the documentation section "Linux - with systemd".

    ...

    Changes in 2.2 1593580

    • Fixed an issue where a task retry after job pause could allow that task and its successor to both become runnable concurrently after the job was unpaused.
    • Fixed engine logic that could generate spurious "retry successor task loop detected" log messages in some complex tractor Instance use cases.
    • The "rfm--maya-" environment block in the new stock shared.*.envkeys configuration files now includes additions that allow xgen procedurals to load correctly when batch rendering from Tractor. Copy the "shared.*.envkeys" files to your tractor config directory, or integrate similar changes into your "rfm" handler block if you have customized files.
    • Changed the shipped example task custom menu item to avoid confusion with a similar example in documentation.

    ...

    Changes in 2.2 1557505

    Features

    • The Python job authoring API now supports setting the job-level serialsubtasks and spoolcwd attributes.
    • The "Clear earlier blade data" operation in the Dashboard's blade list context menu causes the selected blades to be hidden from view until they register again. This operation now also causes an internal cache to recalculate the active task count and number of slots in use for the selected blades. This is useful should sites observe invalid values in the blade list.
    • Pressing 'Z' in the dashboard causes the view to scroll to the currently selected item.
    • A new command line operation, tq queuestats, displays internal queue information from the engine. This is useful in debugging engine backlogs under unusual load.
    • A new command line operation, tq dbreconnect, causes the engine to reestablish its database connections. This administrative operation may be useful in several unusual situations. For example, dbreconnect can reclaim accumulated system memory consumed by a bug in PostgreSQL when new large jobs are submitted.

    Fixes

    • Fixed bug in which Dashboard would display incorrect task counts in job list.
    • Fixed bug in which the stoptime and process metrics of a command invocation may not be updated if the engine was restarted while the command was running.
    • Fixed bug in which a command invocation's current flag was not getting cleared if its task was retried while the command was running. This addresses multiple problems reported in the Dashboard, such as multiple blades reported for a command and unnecessary vertical spaces appearing in the job graph.
    • Fixed bug in tractor-spool in which using the --engine option with the default engine, namely tractor-engine:80, was not being observed if the TRACTOR_ENGINE environment variable was set.
    • Fixed a bug causing "linked.joblist" messages to appear in the engine log.
    • Fixed blade list and blade activity views so that selecting a blade in one view will cause the selected blade to become visible in the other view.
    • Fixed item lists so that when an out-of-viewport item is selected with the up or down arrow keys, the selected item will automatically be scrolled into view.
    • Fixed broken client-side search box by adding a check for null values.

    Optimizations

    • Improved tq responsiveness through additional threads to handle query execution.
    • Optimized task skip operation, reducing database I/O and message payload to Dashboard.

    ...

    Changes in 2.1 1496411

    • Dashboard Job Notes -- A new Notes field has been added to the Dashboard job details pane, allowing text annotations to be added to any job. Notes are visible to other users, and the presence of a note is indicated with a small "chat bubble" icon in the job list. These notes can be used to describe a problem to wranglers, or to explain why a job needs, or is getting, special handling. The engine will automatically add a note to a job when an attribute is changed through some user action, such as altering priority, so the notes become a history of changes to the job.

    • Dashboard Blade Notes -- A new Notes field has been added to the Dashboard blade details pane, allowing text annotations to be attached to a blade entry. These notes can be used by system administrators to describe known issues or to discuss ongoing admin work on a machine.

    • Dashboard Job Pins -- Individual jobs in each user's job list can now be "pinned" to the top of the list, independent of the global list sorting mode. Jobs might be pinned because they are important to track or just because they represent a current "working set" of jobs. The group of pinned jobs float at the top of the list, and they are sorted according to the overall list sorting mode, within the pinned group.

    • Dashboard Job Locks -- A single user can now "lock" a job from the Dashboard. A locked job can only be modified by the user who locked it. Locks are typically only used by wranglers who are investigating a problem and who want to prevent other users from changing, restarting, or deleting a job while the investigation is proceeding. The lock owner can unlock the job when done. Permission to apply a lock is controlled by the JobEditAccessPolicies "lock" attribute in crews.config.

    • Task Logs 'L' Hotkey -- When navigating the tasks within a job, the logs for the currently selected task can be displayed by pressing the 'L' key. The key is a toggle, so pressing 'L' again will close the currently open log.

    • User-centric Job Shuffle -- Individual users can re-order their own jobs on the queue without disrupting global priority settings. The dashboard job list option "Shuffle Job To Top" essentially exchanges the "place in line" of the selected job with a job submitted earlier by the same user, causing the selected job to run sooner than it would in the default submission order. This swap does not affect the ordering of other jobs on the queue, relative to the submission slots already held by that user. This slightly unusual feature is a simplified re-implementation of the old per-user dispatching order controls in Alfred, as requested by several customers. Permission to perform this kind of reordering is controlled by the JobEditAccessPolicies "jshuffle" attribute in crews.config.

    • The "project" affiliations for each job are now displayed in the job list view.

    • "Delete Job" action is now called "Archive Job" -- The former "Delete Job" menu item has been changed to "Archive Job" to better reflect its actual function: when the db.config setting "DBArchiving" is enabled, jobs that are removed from the active queue are transferred to an archive database where they can still be inspected and searched in tq queries. If DBArchiving is False, then "deleted" jobs are actually deleted and their database entries are removed -- in this case the dashboard menu item still says "Delete Job".

    • Archived Jobs View -- A Dashboard view of previously "deleted" (archived) jobs is now available. This view is analogous to a "trash can view" in some file browsers or e-mail clients. Jobs listed in the archive view can be browsed, and can also be restored to the main job queue where they can again be considered for dispatching. Note that jobs can sometimes contain "clean-up" commands that execute when they finish executing. These clean-ups may remove important temporary files that can make it impossible to re-execute that job.

    • Task progress bars for Nuke renders -- Tractor-blade now triggers a Dashboard progress bar update when it encounters a multi-frame progress message from Nuke, of the form "Frame 42 (7 of 9)".

    • Task Elapsed Time Bounds -- Job authors can now specify an acceptable elapsed time range for a given launched command. Commands whose elapsed time is outside the acceptable range will be marked as an error. Commands that run past the maximum time boundary will be killed. Example job script syntax:

      RemoteCmd {sleep 15} -service PixarRender -minrunsecs 5 -maxrunsecs 20
      
    • Per-Tier Scheduling -- A new extension to the DispatchTiers specification in tractor.config allows each defined tier to have its own scheduling mode. For example, the "rush" tier might be scheduled in a strict FIFO order, whereas the default mode might be one of the modes that favors shared access (like P+ATCL+RR). Tiers can be assigned the new "P+CHKPT" mode to take advantage of the partial-graph looping feature in Tractor 2.0; tiers using that mode should be placed before tiers receiving "classic" non-checkpoint jobs.

    • Site-defined Task Log Filters -- A new FilterSubprocessOutputLine() method is now available as an advanced customization feature in the TractorSiteStatusFilter module. This method provides Python access to every line of task output. The site-written code can perform arbitrary actions in response to task output, and built-in Tractor-specific actions are also available. These include marking the task as an error, generating percent-done progress updates, initiating a task graph "expand" action, and stripping the output line from the logs.

    • GPU Detection -- On start-up, tractor-blade now makes an attempt to enumerate any GPU devices installed on the blade host. The device model and vendor name "labels" are made available during the profile selection process so that groups of blades can be categorized by the presence or type of GPU, if desired. The "Hosts" dictionary in a blade.config profile definition defines the matching criteria for that profile. Two new optional keys are now available: the "MinNGPU" entry specifies the minimum number of GPU devices required for a match; and "GPU.label" specifies a wildcard-style matching string for a particular vendor/model. This label string also now appears in the Dashboard blade list, if a GPU device is found.

    • The new tractor.config setting "CmdAutoRetryStopCodes" specifies a list of exit codes that will be considered "terminal" -- automatic retries will NOT be considered for commands that exit with these codes, unless the -retryrc list for a specific command requests it. Negative numbers represent unix signal values, and the codes 10110 and 10111 are generated when a command's elapsed time falls outside the new run-time bounds options, when given. The default no-retry stop codes are the values for SIGTERM, SIGKILL, and the two time-bounds codes:

      "CmdAutoRetryStopCodes": [-9, -15, 10110, 10111],
      
    • Engine statistics query -- A new URL request (Tractor/monitor?q=statistics) has been added to help integrate tractor-engine performance metrics with other site-wide monitoring systems. The returned JSON object contains the most recent sample of several statistics that the engine collects about itself. This data might be used, for example, to populate an external site monitoring system. Some monitoring systems are able to make this URL request for data directly, while others may require a small data source script to be written that requests the JSON statistics report and then forwards each value of interest to the monitoring system separately.

    • Concurrent Expand Chunks -- This advanced expand task variant provides one approach to avoiding serial delays in jobs containing long-running single commands that produce a sequence of results needed by other tasks in the job. This new extension enables pipeline integrators to construct jobs that launch a long-running command, such as a fluid simulation, and then concurrently launch another command, such as a render, when each sequential output file is generated by the first command. Thus rendering can proceed without waiting for all of the simulation steps to complete. This particular approach is well suited to cases where the simulation app is creating output files whose filenames are not known ahead of time, and thus the subsequent render command line arguments must be generated dynamically. The simulation, or a wrapper script, detects when the next step is complete, writes the appropriate rendering Task description into a temporary file, and then notifies tractor-blade by emitting the new 'TR_EXPAND_CHUNK "filename"\n' line on stdout. Tractor-blade will detect that directive in the application stdout stream and deliver the file contents to the engine. The new render task is inserted into the running job and can be dispatched immediately elsewhere on the farm. The blade will automatically remove the temporary file once it has been delivered.

    • TR_EXIT_STATUS auto-terminate policy change -- the default behavior for the TR_EXIT_STATUS handler has now reverted to the 1.x and earlier 2.x behavior in which the status value is simply recorded and then reported later when the command actually exits. The more recent behavior in which the blade actively kills the app upon receipt of TR_EXIT_STATUS is still available, but it must be explicitly enabled in blade.config using the profile setting:

      "TR_EXIT_STATUS_terminate": 1,
      
    • Blade record visibility flag -- The Dashboard blade list display is created from database records describing each tractor-blade instance that has connected to the engine in the past. These records are retained, even when a blade host is no longer deployed, in order to correlate previously executed commands with the machine they ran on. The dashboard blade list menu item "Clear prior blade data" no longer removes the actual database record for the given blade. Instead it simply sets a flag that hides the record from display in the dashboard. The record (and its new unique id field) are now retained for correlation with old task records. The blade data items can be completely removed manually if they are truly unneeded.

    • Cookie-based Dashboard relogin -- A new policy allows auto-relogin to new Dashboard windows based on a saved session cookie, even when site passwords are enabled. The cookie contains only a session ID that is validated by the engine, it does not contain any password data itself. The older policy that denied auto-login when passwords are required can be restored by adding a "_nocookie" modifier to the crews.config SitePasswordValidator setting.

    • Added a new tractor-dbctl --set-job-counter option that sets the initial job ID value in a new job database. Job IDs start at 1 by default, so this ability to specify a different starting value can be helpful when starting from a fresh Tractor install in order to prevent overlaps between the job IDs from the new install and older jobs. Tractor upgrade installs that reuse the prior job database will continue to see job ID continuity.

    • Several internal improvements have been made to the job database upgrade procedure. Many code-related changes in new releases can now be applied without a significant database alteration, needing only an engine restart. Changes involving new database schema definitions are now applied with a system that better handles upgrades across multiple versions.

    • Overall throughput optimizations -- Various performance improvements have been made in this release, especially with regard to handling large numbers of simultaneous updates as many jobs complete or are deleted at the same time.
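The Task Elapsed Time Bounds and CmdAutoRetryStopCodes entries above interact: a command that finishes outside its -minrunsecs / -maxrunsecs window produces one of the time-bounds codes, and those codes are in the default no-retry list. The sketch below models that decision in plain Python as a reading aid. It is not engine code; the function names are hypothetical, the bounds are assumed to be inclusive, and whether an eligible command is actually retried still depends on other site retry settings.

```python
# Default list quoted in the text: SIGKILL (-9), SIGTERM (-15), and the
# two codes generated when elapsed time falls outside the run-time bounds.
DEFAULT_STOP_CODES = [-9, -15, 10110, 10111]


def elapsed_in_bounds(elapsed_secs, minrunsecs=None, maxrunsecs=None):
    """Return True if elapsed_secs lies within the optional bounds.

    Treating the bounds as inclusive is an assumption of this sketch.
    """
    if minrunsecs is not None and elapsed_secs < minrunsecs:
        return False
    if maxrunsecs is not None and elapsed_secs > maxrunsecs:
        return False
    return True


def retry_eligible(exit_code, stop_codes=DEFAULT_STOP_CODES, retryrc=()):
    """Return True if an automatic retry may be considered for exit_code.

    retryrc models the per-command -retryrc list, which overrides the
    site-wide stop codes when it names the exit code explicitly.
    """
    if exit_code == 0:
        return False  # success, nothing to retry
    if exit_code in retryrc:
        return True   # the command explicitly asked for a retry on this code
    return exit_code not in stop_codes


# "sleep 15" with -minrunsecs 5 -maxrunsecs 20, as in the example above.
print(elapsed_in_bounds(15, minrunsecs=5, maxrunsecs=20))
print(retry_eligible(-9))
print(retry_eligible(10111, retryrc=[10111]))
```

Negative codes follow the same convention that Python's subprocess module uses for a child terminated by a signal, so a killed command reports -9 and a terminated one reports -15.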

    ...
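The Concurrent Expand Chunks entry in the 2.1 notes describes a wrapper that writes a Task description to a temporary file and then prints a TR_EXPAND_CHUNK directive naming that file. A minimal sketch of such a wrapper helper follows. The helper names, the temporary file prefix and suffix, and the exact Task text are illustrative assumptions; the Task snippet is modeled on the RemoteCmd job script examples in these notes and should be checked against your own job scripts.

```python
import os
import sys
import tempfile


def render_task_for(frame_file):
    """Build a job-script Task description for one simulation output.

    The Task/RemoteCmd text is a hypothetical example, not a template
    shipped with Tractor.
    """
    return (
        "Task {render %s} -cmds {\n"
        "    RemoteCmd {prman %s} -service PixarRender\n"
        "}\n" % (frame_file, frame_file)
    )


def emit_expand_chunk(task_script, out=sys.stdout):
    """Write task_script to a temp file and announce it on `out`.

    The file is intentionally left in place: per the notes above,
    tractor-blade removes it after delivering its contents to the engine.
    Returns the path that was announced.
    """
    fd, path = tempfile.mkstemp(prefix="expand_chunk_", suffix=".alf", text=True)
    with os.fdopen(fd, "w") as handle:
        handle.write(task_script)
    # The directive must be a complete line on stdout, so end it with a
    # newline and flush immediately so the blade sees it without delay.
    out.write('TR_EXPAND_CHUNK "%s"\n' % path)
    out.flush()
    return path
```

A simulation wrapper would call emit_expand_chunk(render_task_for(name)) each time it detects that a new output file is complete. Flushing matters here because a buffered stdout could otherwise hold the directive back until the long-running command exits, which would defeat the purpose of the feature.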

    Changes in 2.0 1393388

    • Added Dashboard task graph visualization of Instance nodes.
    • Add a "quick job syntax check" option: tractor-spool --parse-debug (job.file)
    • Supplement the tractor-blade TR_EXIT_STATUS handler such that it will now actively kill running applications that emit TR_EXIT_STATUS directives if they do not exit on their own in a timely manner. This behavior can be useful for simple wrapper scripts that cannot implement the full process-group shutdown that tractor-blade already provides. The new behavior can be disabled in blade.config by adding "TR_EXIT_STATUS_terminate": 0,
    • Fix storage of afterJids attribute edits on previously spooled jobs.
    • Fix a dispatching problem that resulted in "no dispatchable tasks" in some cases after task retries and "afterJids" delays.
    • Address unicode handling issues for non-ascii characters in RemoteCmd application parameter lists (aka command-line argv).
    • Fixed Dashboard display of elapsed time for still-active tasks to avoid issues caused by clock differences between the engine and user hosts.
    • Fix job ready-task counts reported by the Dashboard and tq in some cases following retries.
    • Removed engine start-up usage of a platform-dependent external python module, psycopg2.
    • Updated the Tractor Query API to improve its python module conformity.
    • Changed the way that tractor-blade reloads site-defined TractorSiteStatusFilter modules on profile refresh to improve predictability.
    • Extended the engine's expand-node output handling to tolerate some older Alfred-compatible no-op constructs.
    • Allow TractorSiteStatusFilter to set an advisory status message (aka "excuse") that is visible in the Dashboard blade list, even when the site-specific callback is allowing a request for work to proceed.

    Release Notes from Prior Versions

    See the Tractor 1.x Release Notes for details about earlier releases.