bpctl

Name

bpctl -- Control the operational state and ownership of compute nodes.

Synopsis

bpctl [-h, --help] [-v, --version] [-M, --master] [-S num, --slave num] [-r dir, --chroot dir]
[-s state, --state state] [-m mode, --mode mode] [-u user, --user user] [-g group, --group group] [-f] [-R, --reboot] [-H, --halt]
[-P, --pwroff] [--cache-purge-fail] [--cache-purge] [--reconnect master[:port[,local[:port]]]]

Description

This utility is part of the BProc package and is installed by default. It allows the root user to modify the state of the compute nodes. Compute nodes may be in one of eight states: down, boot, up, unavailable, error, reboot, halt, pwroff. The states are described as follows:

down

No communication with compute node and prior node state is unknown.

boot

Node has initialized communication and has started, but not yet completed, the nodeup script. This state is not commandable; it is status information only.

up

Node is communicating and has completed the nodeup script without errors.

error

Node is communicating and encountered an error while running the nodeup script.

unavailable

Node is communicating and the cluster administrator has marked the node as unavailable to non-root users. While in this state the node operates normally, but will not accept user commands.

halt

Node has been commanded to halt. This command causes the node's CPU(s) to execute the halt machine instruction. Once halted, the node must be reset by external means to resume normal operation.

pwroff

Node will power off. This command is valid for nodes that meet the ATX specification. This command requires BIOS support. Non-ATX machines may reboot on this command.

reboot

Node will do a software reboot. The node status will show reboot from the start of machine shutdown until the nodeup script has begun.

Normally the node will transition from down, to boot, to up, and stay in up until commanded otherwise. up is the operational state for user programs. User BProc commands will be rejected if the node is not up.
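Because boot is status-only, a script that sets node states may want to reject it (and any misspelled state) before calling bpctl. The following is a minimal sketch, assuming the commandable states are exactly those listed for the -s option; the function name is hypothetical and is not part of BProc.

```shell
#!/bin/sh
# Sketch: check a state name before passing it to "bpctl -s".
# The list mirrors the states this page documents as valid for -s;
# "boot" is status-only and is therefore absent.
is_commandable_state() {
    case "$1" in
        down|unavailable|error|up|reboot|halt|pwroff) return 0 ;;
        *) return 1 ;;
    esac
}
```

A caller could then guard the real command, for example: is_commandable_state "$state" && bpctl -S 4 -s "$state"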

BProc supports a simplified user and group compute node access scheme. Before any action is taken on a node, BProc checks whether the user or group matches. If either matches, the user action is processed. Note that normal file permissions are still in effect on each node; BProc permissions simply allow users to execute a program on a node. Root bypasses the check and always has access.
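The access rule described above can be sketched as a small shell function. This illustrates the documented rule only and is not BProc's actual code: the function name and argument order are hypothetical, each user is assumed to have a single group id, and the permission bits set with -m are ignored.

```shell
#!/bin/sh
# Sketch of the documented access rule (not BProc's implementation).
# Arguments (hypothetical): requesting uid, requesting gid,
# then the uid and gid assigned to the node.
node_access_allowed() {
    req_uid=$1
    req_gid=$2
    node_uid=$3
    node_gid=$4
    # Root bypasses the check and always has access.
    [ "$req_uid" -eq 0 ] && return 0
    # Either a user match or a group match is sufficient.
    [ "$req_uid" -eq "$node_uid" ] && return 0
    [ "$req_gid" -eq "$node_gid" ] && return 0
    return 1
}
```

For instance, node_access_allowed 501 200 500 200 succeeds on the group match alone, while node_access_allowed 501 100 500 200 fails because neither id matches.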

User and group changes made with bpctl remain in effect until the node or the beowulf daemons are restarted. After a restart, the user and group information is read from the /etc/beowulf/config file. For persistent changes, edit the config file. Note that Beosetup has a convenient way to do this. The new file takes effect when you send SIGHUP to the daemons or reboot the nodes. With SIGHUP, running jobs are not affected unless they start a new process and are denied node access because of the file changes.

Any time the beowulf daemons are restarted, all nodes are initialized to the down state and any node history is lost. When this occurs, previously communicating nodes will reboot and attempt to re-establish communication after a 30 second timeout.

Options

The following options are available to the bpctl program.

-h, --help

Print the command usage message and exit. If -h is the first option, all other options will be ignored. If -h is not the first option, the other options will be parsed up to the -h option, but no action will be taken.

-v, --version

Print the command version number and exit. If -v is the first option, all other options will be ignored. If -v is not the first option, the other options will be parsed up to the -v option, but no action will be taken.

-M, --master

Specifies that the remaining options apply to the master node.

-S num, --slave num

Specifies that the remaining options apply to the specified compute node. The num can range from 0 to (number of nodes - 1).

-r dir, --chroot dir

Command the compute daemon to chroot to the indicated directory. After doing this, all processes started on a node via BProc will see the directory dir as their root directory. This command is only usable on compute nodes.

-s state, --state state

Set the node to the indicated state. Valid states are down, unavailable, error, up, reboot, halt, pwroff. Setting a node to down will cause the node to reboot due to a communications timeout after 30 seconds.

-m mode, --mode mode

Set the permission bits for the indicated node. Not valid after -M.

-u user, --user user

Set the user id for the indicated node. Will reject invalid users. Numbers or strings may be used, but numbers will be converted to names if in the machine's database. Not valid after -M.

-g group, --group group

Set the group id for the indicated node. Will reject invalid groups. Numbers or strings may be used, but numbers will be converted to names if in the machine's database. Not valid after -M.

-f

Fast mode. Whenever possible, do not wait for acknowledgement from remote nodes.

-R, --reboot

Reboot the indicated node.

-H, --halt

Halt the indicated node.

-P, --pwroff

Power off the indicated node.

--cache-purge-fail

Purge the library cache fail list.

--cache-purge

Purge the library cache.

--reconnect master[:port[,local[:port]]]

Reconnect to the front-end.
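The --reconnect argument nests its optional parts: a local address may only follow a master port. The sketch below splits such an argument into its four fields so a wrapper script can validate it before calling bpctl. It is an illustration of the syntax shown in the synopsis only; the function name and the "|" separated output format are hypothetical, and bracketed IPv6 addresses are not handled.

```shell
#!/bin/sh
# Sketch: split "master[:port[,local[:port]]]" into its parts.
# Prints "master|master_port|local|local_port" (empty fields allowed)
# and returns non-zero if the nesting in the synopsis is violated.
parse_reconnect() {
    spec=$1
    master=
    mport=
    local_addr=
    lport=
    # Separate the optional ",local[:port]" part from the master part.
    case "$spec" in
        *,*) remote=${spec%%,*}; local_part=${spec#*,} ;;
        *)   remote=$spec; local_part= ;;
    esac
    case "$remote" in
        *:*) master=${remote%%:*}; mport=${remote#*:} ;;
        *)   master=$remote ;;
    esac
    case "$local_part" in
        *:*) local_addr=${local_part%%:*}; lport=${local_part#*:} ;;
        *)   local_addr=$local_part ;;
    esac
    # A master name is required, and a local part needs a master port.
    [ -n "$master" ] || return 1
    if [ -n "$local_part" ] && [ -z "$mport" ]; then
        return 1
    fi
    printf '%s|%s|%s|%s\n' "$master" "$mport" "$local_addr" "$lport"
}
```

With made-up host names and port numbers, parse_reconnect head:2223,node:4000 prints head|2223|node|4000, while parse_reconnect head,node is rejected because the local part appears without a master port.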

Examples

This command will cause all nodes to reboot.

[root@cluster /root]# bpctl -S all -s reboot

This command fails because boot is not a commandable state.

[root@cluster /root]# bpctl -S 4 -s boot
Non-commandable node state: boot

This command sets the user for node 3 to "foo", which must be a valid user.

[root@cluster /root]# bpctl -S 3 -u foo

Return Values

bpctl will return 0 for success. On failure, an error message is printed to stderr and the process exits with 1.
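Scripts can use the exit status to detect failures when changing several nodes. The following is a minimal sketch that uses only the -S and -u options documented above; the function name is hypothetical, and the BPCTL variable is an assumption added so the function can be exercised without a cluster.

```shell
#!/bin/sh
# BPCTL is a hypothetical override for testing; it defaults to the real command.
BPCTL=${BPCTL:-bpctl}

# Set the user on each listed node and report any node where bpctl failed.
# Usage: set_node_user USER NODE...
set_node_user() {
    user=$1
    shift
    failed=0
    for node in "$@"; do
        # bpctl exits with 0 on success and 1 on failure.
        if ! "$BPCTL" -S "$node" -u "$user"; then
            echo "bpctl failed for node $node" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

For example, set_node_user foo 0 1 2 attempts all three nodes and returns non-zero if any of the calls failed, rather than stopping at the first error.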