
Process Launch and Tracking

 


Remote Servers, Launch Expressions, and Runtime Substitution

The launch expression argument to the RemoteCmd operators specifies the actual executables that Tractor should launch. As in the simple example above, these can be straightforward command strings that simply specify an executable and its arguments. Many times the named executable is just a locally written shell script.

Tractor, like Alfred, also provides other key dispatching features beyond hierarchical ordering of launches, most notably locating available remote servers. Consider this command that might be typed in a terminal:

prman -Progress /net/my_ribs/a.rib

If we would like Tractor to execute that same command on an available machine in our render farm, we can create a Tractor job like this:

Job -title {a render test} -subtasks {
  Task -title {render a.rib} -cmds {
    RemoteCmd {prman -Progress /net/my_ribs/a.rib} -service {PixarRender}
  }
}

Service Keys

The RemoteCmd option -service {PixarRender} in the example above is called a service key expression, and it is used to describe which type of blade is acceptable to run that command. In this example, the prman executable should be launched on blades that advertise the service key "PixarRender". Tractor simply treats these keys as "filters" when matching commands to blades. Matching ignores capitalization when comparing command keys to blade capabilities.

Having created a job script that specifies a particular service key, you then need to "attach" a matching service key to the appropriate blades on your farm, namely those on which you want commands of that type to run. For example, it might be important to select hosts on which "prman" is actually installed! For Tractor, this means going to the blade.config file and adding the appropriate key to the "Provides" entries for each type of blade profile. There are defaults shared by all profiles at the top of the file.

The particular choice of keywords to use in your jobs is arbitrary; you just need to adopt conventions for job scripts and blade profiles that agree on what an advertised keyword implies in terms of a particular blade's capabilities. By convention, Pixar applications like RenderMan for Maya generate Tractor jobs that use certain service keywords, like PixarRender. These keywords are intended to express the type of sibling Pixar software required on the blades in order to run certain commands. Here is a brief list of some common service keys from RfM-generated jobs, and the requirements they are intended to express:

PixarRender   cmd requires RPS (prman) to be installed on the blades
PixarNRM      cmd requires RPS to be installed, and a running netrender server (tractor-blade.py is also a netrender server, so in practice this is usually equivalent to PixarRender, but it helps distinguish netrender commands)
RfMRender     cmd requires an appropriate installed Maya on the blade, plus RfM or RMS must also be installed; the cmd consumes a render license
RfMRibGen     cmd requires Maya and RfM/RMS, no render license needed

Note: in addition to the keywords specified in blade.config, each server blade always "provides" three additional implicit keywords: the blade's host name, the blade's IP address (dotted quad), and the name of the blade.config profile that it is using. This feature is seldom used, but by specifying a profile or host name as the -service key for a RemoteCmd, scripts can restrict execution of a particular command to a specific host, or to a class of hosts matching a specific profile by name. Usually it is best to refer to the more abstract type of service keys.

There is also an "advanced" mode in Tractor in which a site-defined Python plug-in for the blade can dynamically change the services advertised by that blade on a request-by-request basis. The site plug-in might refer to an external production database, for example, when making these sorts of dynamic decisions.

Service Key Expressions

Service keywords can be combined using a simple syntax to further restrict the blades that match a given command. For example:

RemoteCmd {prman -Progress vast.rib} -service {PixarRender,BigIron}

In this example the command will only match blades that provide PixarRender AND BigIron as keywords in their profile.

Individual key names can also be preceded with an exclamation point, like "!xyz", to indicate a negation, meaning that the given key must NOT be present in order for the blade to be acceptable. For example, to avoid blades that are providing a key named "Irix" you might have a job script containing "RemoteCmd -service {PixarRender,!Irix}". The same effect can be applied to every command in a job by adding that key to the top-level Job specification or in the Dashboard job attributes editor, or it can be given at spool time on the tractor-spool command line:

tractor-spool --service="\!Irix" myjob.alf

Service keys are usually site-defined abstract keywords such as those given in the "Provides" specification of blade profiles in blade.config. They can also be a specific hostname or the name of an entire profile. So you could avoid rendering on a host named "darth" by adding the service key "!darth" to the Job's service keys.
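
As a rough illustration of the comma (AND) and "!" (NOT) forms described above, here is a minimal Python sketch. It is not Tractor's matcher; the function name blade_matches and the sample blade are hypothetical, and only these two forms are handled.

```python
def blade_matches(service_expr, provides):
    """Return True if a blade advertising `provides` satisfies `service_expr`.

    Illustrative only: handles the comma (AND) and '!' (NOT) forms.
    """
    # Matching ignores capitalization, so compare lowercase forms.
    advertised = {key.lower() for key in provides}
    for term in service_expr.split(","):
        term = term.strip().lower()
        if not term:
            continue
        if term.startswith("!"):
            # Negated key: the blade must NOT advertise it.
            if term[1:].strip() in advertised:
                return False
        elif term not in advertised:
            # Plain key: the blade must advertise it.
            return False
    return True


if __name__ == "__main__":
    # Hypothetical blade: keys from its profile plus its implicit host name.
    blade = ["PixarRender", "Linux", "darth"]
    print(blade_matches("PixarRender,!Irix", blade))    # True
    print(blade_matches("pixarrender,BigIron", blade))  # False
    print(blade_matches("PixarRender,!darth", blade))   # False
```

The last call shows how the implicit host name key lets "!darth" exclude a single machine.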

Service Key Expression Operators

Operator: keyname
  Use: keynames are replaced with 1 if they match a name in the blade's "Provides" list, or 0 if they do not
  Example: PixarRender

Operator: "keynamePattern"
  Use: like keynames, above, with additional support for pattern matching using the wildcard characters * and ?
  Example: "rack-15?" || '192.168.0.*' (note the required double or single quotation marks)

Operator: && or , (comma)
  Use: boolean AND
  Example: PixarRender && Linux
  Example: PixarRender,NorthAnnex

Operator: ||
  Use: boolean OR
  Example: Linux || macOS

Operator: !
  Use: boolean NOT
  Example: PixarRender && !Desktops

Operator: ( subexpr )
  Use: parenthetical sub-expressions
  Example: PixarRender && (Linux || macOS)

Operator: @.blade_metric
  Use: numeric blade metric value
  Example: see table below

Operator: + - * /
  Use: numeric add, subtract, multiply, divide
  Example: for blade metrics, see table below

Operator: < <= == != >= >
  Use: numeric comparison, less, greater, equal, etc
  Example: for blade metrics, see table below

Service Key Expression Blade Metrics

Metric    Meaning                                    Example
@.disk    available disk space, in gigabytes         PixarRender && @.disk > 5
@.mem     available physical RAM, in gigabytes       PixarRender && ((1024 * @.mem) > 2048)
@.nCPUs   number of CPU cores reported by the OS     Windows7_32bit && (@.nCPUs >= 4)
@.cpu     current CPU usage, normalized by nCPUs     @.cpu < .75
@.sa      number of abstract "slots" available       (@.sa > 2) && (PixarRender || PixarNRM)

(Note: boolean OR, parenthetical subexpressions and blade metrics expressions were first introduced in tractor-engine 1.5)
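
To make the pattern and metric forms more concrete, the following Python sketch evaluates the equivalent of "rack-15?" && ((1024 * @.mem) > 2048) against a made-up blade. It is an illustration under assumptions, not Tractor's evaluator: the helper pattern_key_matches, the blade keys, and the metric values are all hypothetical, and applying case-insensitive comparison to patterns is assumed from the general matching rule.

```python
from fnmatch import fnmatchcase


def pattern_key_matches(pattern, provides):
    """True if any advertised key matches a wildcard pattern such as 'rack-15?'."""
    # Lowercase both sides so the comparison ignores capitalization.
    # Note: fnmatch also understands [seq] character classes, which go
    # beyond the * and ? wildcards described for service key patterns.
    wanted = pattern.lower()
    return any(fnmatchcase(key.lower(), wanted) for key in provides)


# Hypothetical blade state; the names and numbers are made up for illustration.
provides = ["PixarRender", "Linux", "rack-157", "192.168.0.42"]
metrics = {"disk": 120.0, "mem": 3.5, "nCPUs": 8}

# Equivalent of:  "rack-15?" && ((1024 * @.mem) > 2048)
# @.mem is in gigabytes, so multiplying by 1024 compares in megabytes.
acceptable = pattern_key_matches("rack-15?", provides) and (1024 * metrics["mem"]) > 2048
print(acceptable)
```

With 3.5 GB of available RAM the metric term becomes 3584 > 2048, so this blade would be acceptable.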

Unusual netrender-style client / server commands

Some commands, such as netrender, are client applications that execute on the local spooling host but which also require the names of one or more remote servers to be provided as part of their command-line options. Applications such as scp or ftp are similarly clients of remote servers. For example, consider a typical invocation of netrender from a terminal:

netrender -h antigone -h percival /net/my_ribs/a.rib

In this case, the single RIB file is parceled out to two rendering servers that are assumed to be available and already running a netrender server (i.e. each has a running tractor-blade, since that has built-in netrender support). The local netrender client application contacts each of the hosts listed on its command line and initiates a rendering there, using a custom netrender protocol. It is often desirable to have Tractor find available netrender servers from a pool, rather than typing in specific hostnames (such as antigone and percival above). An example job script for requesting this collection of servers might be:

Cmd {netrender %H /net/my_ribs/a.rib} -service {PixarNRM} -atleast 2

Note: It is important to remember that the Cmd directive does not actually send work to remote hosts selected in this manner. It launches the local application, such as netrender or ssh, which then manages the interactions with its remote servers itself. RemoteCmd should be used to send commands directly to a remote host with no local client.
 

Substitutions

The following symbols are expanded at launch time when they occur within launch or message expressions:

~               home directory expansion, as in csh(1)
%h              substitute the hostnames bound to the current Cmd via the dynamic -service mechanism. This is a simple blank-delimited list of hostnames (useful for rsh, etc).
%H              like %h, but formatted as -h hostname pairs (as required by netrender).
%n              converted to the count of bound slots; an integer indicating how many slots were bound to this command.
%j              expands to the internal dispatcher Job identifier (aka "jid") for the current job. Tractor engine creates unique ids for jobs in its queue.
%t              expands to the Task identifier (aka "tid") for the current task; it is unique only within each job.
%c              expands to the Command identifier (aka "cid") for the current command; it is unique only within each job.
%r              the recover mode, expands to 0 when a task is beginning a fresh start, and to an integer greater than zero when the user or system is requesting a recover from checkpoint
%R              expands to the "loop count" for commands that are being restarted due to the -resumewhile construct
%q              the quality hint, expands to the value 1.0 when a task is being executed due to final quality runs from the subtasks it depends upon; it will be less than 1.0 if some subtask only reached a checkpoint during the current resumewhile loop
%idref(host)    like %h but using the hostnames from the command whose -id value is idref.
%idref(-host)   as above, formatted as -h hostname pairs.
%D(path)        DirMaps, apply per-architecture remapping of paths using a site-defined mapping table.
%%              a single percent-sign is substituted.
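
A small Python sketch of how a few of these symbols might expand is shown here. It is a simplified model, not Tractor's substitution code: the function expand is hypothetical, it covers only %h, %H, %n and %%, and it assumes one bound slot per host when computing %n.

```python
import re


def expand(launch_expr, hosts):
    """Expand %h, %H, %n and %% in a launch expression (illustrative only)."""
    table = {
        "h": " ".join(hosts),                      # blank-delimited host list
        "H": " ".join("-h " + h for h in hosts),   # "-h hostname" pairs
        "n": str(len(hosts)),                      # assumes one slot per host
        "%": "%",                                  # literal percent sign
    }

    def replace(match):
        # Symbols this sketch does not model are left untouched.
        return table.get(match.group(1), match.group(0))

    return re.sub(r"%(.)", replace, launch_expr)


if __name__ == "__main__":
    bound = ["antigone", "percival"]
    print(expand("netrender %H /net/my_ribs/a.rib", bound))
    # netrender -h antigone -h percival /net/my_ribs/a.rib
    print(expand("rsh %h uptime", ["antigone"]))
    # rsh antigone uptime
```

The first call reproduces the hand-typed netrender invocation shown earlier, with the two hostnames supplied by the caller instead of a real server check-out.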
 


Job Context Environment Variables

The following environment variables are set by tractor-blade just before it launches each command. They may be useful in some cases where the application itself needs to know some Tractor-related context information. See the Custom Environment Handlers discussion for details on controlling custom environment variables globally or on a per-project or per-command basis.

TR_ENV_JID            The job ID number for the job from which the command was dispatched.
TR_ENV_TID            The task ID number, within the job.
TR_ENV_CID            The command ID number, within the job.
TR_ENV_JOB_PROJECT    The project affiliation for the job. The name (or names) are specified at job submission time, usually by the job creation application.
TR_ENV_TRACTOR_ROOT   The path to the top of the Tractor install tree for the running blade.
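
A launched application written in Python could pick these values up as sketched below. The helper name tractor_context is hypothetical; the sketch only reads the variables listed above and treats each one as optional, so the same script still runs when started by hand outside of tractor-blade.

```python
import os

# The variable names listed above.
TRACTOR_VARS = (
    "TR_ENV_JID",
    "TR_ENV_TID",
    "TR_ENV_CID",
    "TR_ENV_JOB_PROJECT",
    "TR_ENV_TRACTOR_ROOT",
)


def tractor_context(environ=None):
    """Collect the Tractor context variables; missing ones map to None."""
    env = os.environ if environ is None else environ
    return {name: env.get(name) for name in TRACTOR_VARS}


if __name__ == "__main__":
    ctx = tractor_context()
    if ctx["TR_ENV_JID"] is None:
        print("TR_ENV_JID is unset; probably not launched by tractor-blade")
    else:
        print("job %s, task %s, command %s"
              % (ctx["TR_ENV_JID"], ctx["TR_ENV_TID"], ctx["TR_ENV_CID"]))
```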


Sharing One Server Check-Out Among Several Commands

Sometimes it can be useful to acquire a remote server and run several commands in sequence on that server. This is called a shared server scenario. The mechanism for accomplishing this goal is to add a server check-out specification to the enclosing Task and then reference the task-id on each Cmd or RemoteCmd that needs the server slot.

For example:

Task {SharedServer example} -id {bob} -service {pixarRender} -cmds {
      Cmd {rsh %h mkdir /tmp/somedir} -refersto {bob}
      Cmd {rsh %h render -Progress some.rib} -refersto {bob}
} -cleanup {
      # Clean-up commands go here.
      # This example uses RemoteCmd just to illustrate its use as an
      # alternative to rsh above.  Note that there is an implicit '%h'
      # used to determine where the command should be run.
      #
      RemoteCmd {/bin/rm -rf /tmp/somedir} -refersto bob
} -subtasks {
      # the description of any nested dependent tasks go here
      [...]
}

The lifetime of the check-out is governed by two factors. First, the initial task-level check-out is done lazily: it occurs only when one of the commands that references it actually becomes the next command to be executed. Second, a reference count is used to determine when it is safe to check the slot back in; the slot is freed when the last command to reference it has completed or errored out.
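
The lazy check-out and reference-counted check-in can be pictured with a toy Python model. This is only a sketch of the idea described above, not Tractor's scheduler; the class SharedCheckout and the acquire_host callback are hypothetical names.

```python
class SharedCheckout:
    """Toy model of a task-level server check-out shared by several commands."""

    def __init__(self, referring_cmds, acquire_host):
        self.refcount = len(referring_cmds)  # commands that still need the slot
        self.acquire_host = acquire_host     # hypothetical callback that picks a host
        self.host = None                     # nothing is checked out yet (lazy)
        self.released = False

    def begin(self):
        """Called when a referring command becomes the next one to run."""
        if self.host is None and not self.released:
            self.host = self.acquire_host()  # first use triggers the check-out
        return self.host

    def end(self):
        """Called when a referring command completes or errors out."""
        self.refcount -= 1
        if self.refcount == 0:
            self.host = None                 # last reference gone: check the slot in
            self.released = True


if __name__ == "__main__":
    checkout = SharedCheckout(["mkdir", "render", "cleanup"], lambda: "percival")
    for cmd in ("mkdir", "render", "cleanup"):
        print(cmd, "runs on", checkout.begin())
        checkout.end()
    print("slot released:", checkout.released)
```

All three commands see the same host, and the slot is returned only after the last of them has finished.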

Syntax Note: the characters allowed in the idref names are letters, numbers, period (.), and underscore (_). For all of the % substitutions above, braces may be used to delimit names, as in csh or Tcl; for example, if the current schedule has a service of type "renderserver" defined for the host named "percival", then a Task containing this command:

RemoteCmd {renderer -use %{h}.settings} -service {renderserver}

might cause the following command to be launched:

renderer -use percival.settings

The braces are required in this case because the simpler string "%h.settings" would cause the substitution mechanism to look for a Task or Cmd with "-id h.settings" and use its current values. 

Special Characters and Escapes in Tractor Scripts

...

Despite the restrictions just described on Cmd launch expressions, there are occasions when shell constructs, such as command pipelines or run-time filename expansion, can be very useful. In these situations, a simple solution is to launch an appropriate shell as the Cmd, passing the pipeline expression to the shell via command-line arguments:

Cmd {/bin/sh -c "cd /tmp/frames; ls -1 | xargs -I+ cp + /DDR"}

Script authors should read the documentation for their shell of choice to understand the implications of various invocation options. For example, you must decide whether the user's .cshrc or .profile should be executed when the dispatcher launches a command like the above.

Another approach is to use the Cmd -msg option to pass arbitrary expressions to a launched shell. This is essentially equivalent to the above approach, but not as compact; however, it does allow for persistent reuse of the shell, if that's desired. Consider the following examples, which send mail; this can sometimes be handy at the end of a job. Recall that the -s option to Mail takes the next "word" as the subject, so a subject containing spaces needs to be quoted or escaped (using csh(1) syntax this time):

Cmd {/bin/csh -fc "/usr/sbin/Mail -s 'job done' jean < ~/errlog"}
Cmd {/bin/csh -fet} -msg {/usr/sbin/Mail -s 'job done' jean < ~/errlog}

Often the simplest solution for complex expressions is to write a short shell script in your favorite language and launch the script from Tractor:

Cmd {myscript}

If a remote server is required, the run-time host selection can be passed to the script as an argument:

Cmd {myscript %h} -service {someServerType}

...


Progress or Percent-done Indication

Tractor also scans the stdout of its launched apps for strings of this form:

TR_PROGRESS nnn%
ALF_PROGRESS nnn%

The integer nnn, in the range 0-100, is sent to the UI and used to control the percent-done bars drawn on active task nodes.
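
A launched Python script might emit such lines as sketched below. The function report_progress is a hypothetical helper. Zero-padding the integer to three digits is an assumption made to mirror the "nnn" template; the text above does not say whether padding is required.

```python
import sys


def report_progress(done, total, stream=None):
    """Write a TR_PROGRESS line for `done` out of `total` units of work."""
    if total <= 0:
        raise ValueError("total must be positive")
    out = sys.stdout if stream is None else stream
    # Clamp to the documented 0-100 range.
    percent = max(0, min(100, int(100 * done / total)))
    # Three-digit zero padding is an assumption based on the "nnn" template.
    out.write("TR_PROGRESS %03d%%\n" % percent)
    out.flush()  # flush so the line is not held back by output buffering
    return percent


if __name__ == "__main__":
    frames = 4
    for frame in range(1, frames + 1):
        # ... the real work for this frame would go here ...
        report_progress(frame, frames)
```

Flushing after each line matters because stdout is typically block-buffered when it is not attached to a terminal, which would otherwise delay the updates.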

...