FBSNG Resources

Introduction

FBSNG implements the concept of abstract resources. The idea is that each individual node of the farm, and the farm as a whole, hosts a certain amount of resources, which are allocated to batch processes while they are running. The resources are called abstract because FBSNG has no knowledge of what they represent. Each resource has a name and an integer capacity. Each process (or section) is allocated a certain amount of zero or more resources when it starts and releases the same amount when it finishes. For example, one can introduce resources named “cpu” and “disk”. Even though they mean “CPU power utilization in percent on the node” and “amount of local disk in gigabytes”, which are quite different from the user’s point of view, FBSNG treats them in exactly the same way.
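The allocate-on-start, release-on-finish behaviour described above can be illustrated with a minimal Python sketch. This is not FBSNG code: the class name `ResourcePool`, its methods, and the capacities (100 units of “cpu”, 20 units of “disk”) are assumptions chosen only for illustration.

```python
class ResourcePool:
    """Hypothetical model: named resources with integer capacities.

    The pool attaches no meaning to the names; "cpu" and "disk"
    are handled in exactly the same way.
    """

    def __init__(self, capacities):
        self.capacity = dict(capacities)
        self.in_use = {name: 0 for name in self.capacity}

    def can_allocate(self, request):
        # Every requested resource must exist and have enough free capacity.
        return all(
            name in self.capacity
            and self.in_use[name] + amount <= self.capacity[name]
            for name, amount in request.items()
        )

    def allocate(self, request):
        # Allocate all requested amounts, or nothing at all.
        if not self.can_allocate(request):
            return False
        for name, amount in request.items():
            self.in_use[name] += amount
        return True

    def release(self, request):
        # Give back exactly the amounts taken at start.
        for name, amount in request.items():
            self.in_use[name] -= amount


# Assumed capacities for one node.
node = ResourcePool({"cpu": 100, "disk": 20})
job = {"cpu": 85, "disk": 5}

started = node.allocate(job)   # the process starts and takes its resources
second = node.allocate(job)    # a second identical process does not fit
node.release(job)              # the first process finishes
```

In this sketch the second request is refused because 85 + 85 exceeds the assumed “cpu” capacity of 100, and after the release the node is back to zero usage.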

 

FBSNG resources are classified as global resources, local resources, and attributes.

This document describes the concept of FBSNG resource management. Specific details on how to manage resources through the FBSNG User Interface are given in the FBSNG User’s Guide.

Resource Classification

Global Resources

A global resource is hosted by the farm as a whole and cannot be associated with any particular node. Processes share the same global resource regardless of which node they run on. Two examples of global resources are

Local Resources

Local resources are attached to individual nodes. Processes running on different farm nodes do not share or compete for the same local resources. Examples of local resources are

Attributes

An attribute is a special kind of local resource. One can think of an attribute as a local resource with unlimited capacity. Processes do not allocate any amount of such a resource while they are running; they only require that the node they run on has the attribute. Examples of attributes are:

Attributes can be used to logically partition a farm into sub-farms. One can define attributes such as “sub-farm-1” and “sub-farm-2” and split the farm into two (possibly overlapping) parts by attaching one, both, or neither of these attributes to each node of the farm.
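The sub-farm partitioning described above can be sketched in Python as a simple set-membership test. The node names and the helper `eligible_nodes` are hypothetical and are not part of FBSNG; only the attribute names “sub-farm-1” and “sub-farm-2” come from the text.

```python
# Hypothetical farm: each node carries a set of attributes.
nodes = {
    "node1": {"sub-farm-1"},
    "node2": {"sub-farm-1", "sub-farm-2"},  # belongs to both sub-farms
    "node3": {"sub-farm-2"},
    "node4": set(),                          # belongs to neither
}


def eligible_nodes(nodes, required):
    """Return the sorted names of nodes that have every required attribute.

    An attribute is only checked for presence; nothing is consumed,
    so any number of processes can require it at the same time.
    """
    required = set(required)
    return sorted(name for name, attrs in nodes.items() if required <= attrs)


sub_farm_1 = eligible_nodes(nodes, {"sub-farm-1"})
sub_farm_2 = eligible_nodes(nodes, {"sub-farm-2"})
both = eligible_nodes(nodes, {"sub-farm-1", "sub-farm-2"})
```

Here “node2” appears in both sub-farms, which corresponds to the overlapping case, and “node4” appears in neither.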

Resource Allocation Levels

A user can allocate the resources needed by a batch job at two levels: the process level and the section level.

Process Level Allocation

A user can request that certain resources be allocated to each individual process of a job section. In this case, each process consumes the same amount of the specified resources. For example, the user can specify that each process consumes 85% of the CPU power on the node where it runs (85 units of the abstract resource named “cpu”). All types of resources (local, global, and attributes) can be allocated at this level. The total resource allocation for the section is therefore proportional to the number of processes in the section.
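The proportionality between the per-process request and the section total can be shown with a short Python sketch. The function name `section_total` and the process count of 4 are assumptions made for illustration only.

```python
def section_total(per_process, n_processes):
    """Total allocation for a section whose processes each take `per_process`."""
    return {name: amount * n_processes for name, amount in per_process.items()}


# Each process requests 85 units of "cpu"; assume the section has 4 processes.
total = section_total({"cpu": 85}, 4)
```

For a local resource such as “cpu”, this total is spread over the nodes on which the processes run; no single node has to provide all of it.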

Section Level Allocation

A user can request that resources be allocated to the section as a whole. Since the section itself does not run on any particular node, only global resources can be allocated at this level. For example, suppose that no matter how many processes form the section, the section processes one tape of data and requires a fixed amount of NFS-shared disk for that. In this case, one would request a certain amount of the global resource associated with disk space on the NFS server (e.g. “nfs_disk”) to be allocated to the section itself, not to individual nodes. The resource then remains allocated until the last process of the section finishes.
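The lifetime of a section-level allocation can be sketched in Python as follows. The class `SectionAllocation`, the free amount of 50 units of “nfs_disk”, and the process count of 3 are assumptions for illustration and do not reflect how FBSNG is implemented.

```python
class SectionAllocation:
    """Hypothetical model of a section-level allocation of global resources.

    The amount is taken once for the whole section, independent of the
    number of processes, and is returned when the last process finishes.
    """

    def __init__(self, free, request, n_processes):
        # Check everything first so a failed request changes nothing.
        for name, amount in request.items():
            if free.get(name, 0) < amount:
                raise ValueError(f"not enough {name!r} available")
        for name, amount in request.items():
            free[name] -= amount
        self.free = free
        self.request = request
        self.running = n_processes

    def process_finished(self):
        self.running -= 1
        if self.running == 0:
            # Last process of the section: release the global resources.
            for name, amount in self.request.items():
                self.free[name] += amount


free_global = {"nfs_disk": 50}  # assumed free amount of the global resource
section = SectionAllocation(free_global, {"nfs_disk": 5}, n_processes=3)
after_start = free_global["nfs_disk"]

section.process_finished()
section.process_finished()
after_two = free_global["nfs_disk"]   # still held: one process is running

section.process_finished()
after_all = free_global["nfs_disk"]   # released with the last process
```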

 

Resource Requirements Specification

FBSNG determines process and section resource requirements from the FBSNG configuration and the Job Description File (JDF). One of the required fields the user must specify for each section of the job is the queue name. In the FBSNG configuration, each queue is associated with a default process type. A process type should be used to define common resource requirements for recurring batch processes of the same project or task. The minimal common resource requirements for processes of the same type are defined in the FBSNG configuration as default resource requirements. In the JDF, the user may use the SECT_RESOURCES and PROC_RESOURCES fields to request resources in addition to these defaults. When a resource is mentioned both in the default requirements of the process type and in the JDF, the greater of the two amounts is used. The PROC_TYPE field of the JDF can be used to override the default process type associated with the queue.

For example, let us assume that the FBSNG configuration defines two queues, “LongQ” and “ShortQ”, both with the default process type “Worker”:

%set queue LongQ
default_proc_type = Worker

%set queue ShortQ
default_proc_type = Worker

Two process types are defined in the configuration:

%set proc_type Worker
default_proc_resources = cpu:100

%set proc_type IO
default_proc_resources = cpu:50 disk:1 IO_node

The JDF consists of two sections, “Prestage” and “Run”:

SECTION Prestage
QUEUE = ShortQ
PROC_TYPE = IO
PROC_RESOURCES = disk:10

SECTION Run
QUEUE = LongQ
PROC_RESOURCES = nfs_disk:5 cpu:1 encp

As you can see, the JDF overrides the default process type for section “Prestage” from “Worker” (the default for the “ShortQ” queue) to “IO”. Section “Run” consists of processes of type “Worker”, the default process type for queue “LongQ”.

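Each process of the “Prestage” section will consume cpu:50 disk:10 IO_node. The “cpu” amount and “IO_node” come from the defaults of the “IO” process type, and the JDF request disk:10 is used because it is greater than the default disk:1.

Processes of section “Run” will consume cpu:100 nfs_disk:5 encp. The “Worker” default cpu:100 is used because it is greater than the JDF request cpu:1, while nfs_disk:5 and encp come from the JDF.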
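The rule for combining process type defaults with the PROC_RESOURCES field of the JDF (the greater amount wins when a resource appears in both) can be sketched in Python using the values from the example above. The functions `parse_resources` and `effective` are hypothetical helpers, not FBSNG functions. As a further assumption, an entry written without an amount, such as “IO_node” or “encp”, is stored with `None` as its amount and is simply carried through.

```python
def parse_resources(spec):
    """Parse a string like 'cpu:50 disk:1 IO_node' into a dict.

    Entries without ':amount' get None as their amount (assumption).
    """
    result = {}
    for item in spec.split():
        name, sep, amount = item.partition(":")
        result[name] = int(amount) if sep else None
    return result


def effective(defaults, requested):
    """Merge defaults with a JDF request; the greater amount wins."""
    merged = dict(defaults)
    for name, amount in requested.items():
        if merged.get(name) is not None and amount is not None:
            # Resource named in both places: use the greater amount.
            merged[name] = max(merged[name], amount)
        elif name not in merged or amount is not None:
            # New entry, or an amount where only a bare name existed.
            merged[name] = amount
    return merged


# Process type defaults from the example configuration.
proc_types = {
    "Worker": parse_resources("cpu:100"),
    "IO": parse_resources("cpu:50 disk:1 IO_node"),
}

# "Prestage" overrides the process type to IO; "Run" keeps Worker.
prestage = effective(proc_types["IO"], parse_resources("disk:10"))
run = effective(proc_types["Worker"], parse_resources("nfs_disk:5 cpu:1 encp"))
```

Under these assumptions, `prestage` holds 50 units of “cpu”, 10 units of “disk”, and “IO_node”, while `run` holds 100 units of “cpu”, 5 units of “nfs_disk”, and “encp”.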