auto.net 7 LOCAL automount Structure and Usage

NAME

auto.net - description of local automount configuration

Managing and troubleshooting automount (the Unix idea of Apollo //)

Introduction
==============

Automount is a daemon provided as part of the NFS package, and
is fully described in "Managing NFS and NIS", published by
O'Reilly & Associates. It is supplied with both Apollo NFS 4.1 and
most Unix system vendors' NFS software.

It provides a host-independent naming structure for accessing
files anywhere within our (MGSL, PSL, LSB, Biophysics) shared disk
resources. The volumes are not actually mounted until
referenced in some way, and are dismounted after 5 minutes
of no activity (this default timeout can be changed).
The result is an efficient and consistent method of accessing
files on remote volumes, regardless of manufacturer or OS.

Current Structure, & How to Use it
=======================================

Currently, 6 pseudo-directories on each host are managed by automount:

/v            entry point for access to remote volumes
/eo/jbx750    entry point for optical jukebox on par8; N=32
/eo/jbxlg     entry point for the large optical jukebox on par9; N=144
/eo/apjbx     entry point for the Apollo optical jukebox on //jbx; N=32
/eo/fdajbx    entry point for optical jukebox on deimos; N=16
/eo/fdajbx2   entry point for optical jukebox on phobos; N=32

The structure is determined by "map" files in the /etc directory
on each host; the master files are on par10, and the others are replicas
updated via the /etc/update.auto.maps script on par10. On the Apollos,
duplicates are on //dcc, //gate and //hal; most of the rest are links.
All map file names begin with 'auto.', followed by the directory name. The
exception is auto.master, which is simply a list of the directories and
their corresponding map files (auto.v, auto.jbx750, etc.).
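
The naming rule can be sketched as a small shell function. This is only an
illustration: the function name map_for_dir is hypothetical, and it assumes
the map name is 'auto.' plus the last component of the directory path, as
the examples auto.v and auto.jbx750 suggest.

```shell
# Hypothetical helper: print the map file expected for an automount directory.
# Assumption: the map is /etc/auto.<last path component>,
# e.g. /eo/jbx750 -> /etc/auto.jbx750.
map_for_dir() {
    dir=${1%/}                              # drop one trailing slash, if any
    printf '/etc/auto.%s\n' "${dir##*/}"    # keep only the last path component
}
```

For example, 'map_for_dir /eo/jbxlg' prints the name of the map to inspect
for the large jukebox.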

On client hosts, these directories (/v, /eo/jbxlg, ...) are empty until a
"leaf" is referenced in some way; doing 'ls /v' only shows volumes in use, so
the file /etc/auto.v must be consulted to get the full list of *potential*
remote volumes available. The names to use in a /v reference are those
listed on the left margin; in general, volume names in the auto.v map contain
the simple hostname. If the hostnames end in a numeral, the letters a,b,c
etc. indicate additional volumes; hostnames ending in a letter use the
numerals 1,2,3 etc. to indicate them. At least 20% of the names don't follow
this convention, however. To mount a volume in the map, simply reference
it on the command line, in a script, or (except on the Terra) in a CHARMM
input file, e.g.

cd /v/par10a
cd /eo/jbxlg/ac122a
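
Since 'ls /v' shows only the volumes in use, the potential volume names have
to be read from the map itself. A minimal sketch follows, assuming the usual
automount map layout (name in the first column, '#' comment lines, long
entries continued with a trailing backslash); the function name list_volumes
is hypothetical.

```shell
# Hypothetical helper: list the names in the left-hand column of a map file.
# Assumptions: one entry per line, '#' starts a comment line, and a trailing
# backslash continues an entry onto the next line.
list_volumes() {
    map=${1:-/etc/auto.v}
    awk '
        cont           { cont = ($0 ~ /\\$/); next }   # skip continuation lines
        /^[ \t]*(#|$)/ { next }                        # skip comments and blanks
                       { print $1; cont = ($0 ~ /\\$/) }
    ' "$map"
}
```

Running 'list_volumes' with no argument reads /etc/auto.v; any other map
file can be given as the first argument.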

The jukebox platters are systematically named acNA, where N is the
platter number (see the N value given above for each jukebox) and A is
either a or b, indicating the A or B side of the platter. The only
variants are the jukebox name from the above table, and the platter
number and side.
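
The naming scheme can be expressed as a short shell function. This is a
sketch only; the name platter_path is hypothetical, and the platter number
is checked for being numeric but not against the N value of the jukebox.

```shell
# Hypothetical helper: build the path for one side of a jukebox platter.
# Usage: platter_path JUKEBOX NUMBER SIDE    e.g. platter_path jbxlg 122 a
platter_path() {
    case $3 in
        a|b) ;;                                        # only sides a and b exist
        *) echo "side must be a or b" >&2; return 1 ;;
    esac
    case $2 in
        ''|*[!0-9]*) echo "platter number must be numeric" >&2; return 1 ;;
    esac
    printf '/eo/%s/ac%s%s\n' "$1" "$2" "$3"
}
```

With the arguments 'jbxlg 122 a' this yields the path used in the 'cd'
example above.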

DO NOT ATTEMPT TO USE BOTH SIDES OF THE *SAME* PLATTER SIMULTANEOUSLY.

Finally, it should be noted that the remote volumes are mounted under the
/tmp_mnt directory on client machines, but a reference to /tmp_mnt/v/par10a
WILL NOT WORK UNLESS /v/par10a HAS ALREADY BEEN REFERENCED. Some public
domain contributed shells such as tcsh don't handle this properly, and will
erroneously include the /tmp_mnt prefix for shell variable $cwd (current
working directory). The use of vendor-supplied command shells is therefore
encouraged.
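
Where a path carrying the /tmp_mnt prefix has already leaked into a script
or variable, it can be mapped back to the automount name before use. A
minimal sketch, with the hypothetical function name strip_tmp_mnt:

```shell
# Hypothetical helper: turn a physical /tmp_mnt path back into the
# automount path that should be referenced instead.
strip_tmp_mnt() {
    case $1 in
        /tmp_mnt/*) printf '%s\n' "${1#/tmp_mnt}" ;;   # drop the prefix
        *)          printf '%s\n' "$1" ;;              # leave other paths alone
    esac
}
```

Referencing the stripped path (for example /v/par10a) is what triggers the
mount; the /tmp_mnt form does not.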

How to Change and Re-distribute the Maps
========================================

For the following, be very careful when editing the master map files.
Find and consult the O'Reilly book if in doubt. Make changes only on
par10; changes made on any other host will be nuked (wiped out) the next
time the proper procedure is followed.

PROCEDURE 1: UPDATING THE MAP FILES
a) log in to par10 as root
b) edit the appropriate map file(s), e.g. /etc/auto.v
c) execute /etc/update.auto.maps

When Things Go Wrong...
=========================

Automount should be started at boot time (/etc/rc.nfs on the Apollos), but
startup does not always succeed, depending on whether or not 'automount' was
stopped cleanly. Commands such as 'kill -9' or 'sigp -b', or a node
crash, can corrupt directory objects created and managed by the automount
daemon. The daemon is started in /etc/netnfsrc2 under HP-UX 9; under IRIX,
the files /etc/config/automount and /etc/config/automount.options are used.

The following assumes (1) you are logged in as 'root', (2) access via either
/v or /eo is not working, and/or (3) automount is not running. If automount
is running, and one host or jukebox fails to respond, try several others; the
problem may be the remote host, and not the local automount. If all
references through /v or /eo fail, then automount should be shut down
gracefully, using

kill -HUP automount_pid OR kill automount_pid

where 'automount_pid' is the process ID (PID) from a 'ps -ef' listing.
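
Picking the PID out of the listing can be sketched with awk. The function
name automount_pids is hypothetical, and the sketch assumes the PID is the
second column, as in System V style 'ps -ef' output.

```shell
# Hypothetical helper: read 'ps -ef' output on stdin and print the PID of
# every automount process. Assumption: the PID is the second column.
automount_pids() {
    awk '$2 ~ /^[0-9]+$/ && /automount/ && !/grep/ { print $2 }'
}
```

It would be used as 'ps -ef | automount_pids'; each PID printed can then be
given to kill as shown above.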

Once automount is stopped, or when kill spawns another copy of automount,
use the mount command (no args) to determine if any volumes are mounted
in the /tmp_mnt directory. Volumes listed in /tmp_mnt should be removed
using umount, e.g.

umount /tmp_mnt/v/par10c
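
The mount listing can be filtered for /tmp_mnt entries with a short awk
sketch. The function name tmp_mnt_mounts is hypothetical; because the column
holding the mount point differs between systems, every field on each line is
scanned.

```shell
# Hypothetical helper: read 'mount' output on stdin and print the first
# /tmp_mnt path found on each line.
tmp_mnt_mounts() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^\/tmp_mnt\//) { print $i; break }
    }'
}
```

It would be used as 'mount | tmp_mnt_mounts'; an empty result means nothing
is mounted under /tmp_mnt, and otherwise each path listed is a candidate for
umount.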

Note that batch queues and any user processes (CHARMM jobs, console logins,
network logins, everything which references /v in any way) must all be
stopped to unmount the volumes. After the volumes are unmounted (verify with
mount), it may be possible to kill off a cloned automount and its parent; it
is risky, but kill -9 can work under these circumstances (a clone was spawned,
no volumes mounted under /tmp_mnt).

If automount cannot be stopped, or if volumes listed under /tmp_mnt cannot
be removed with umount (or both), reboot the machine.

Once automount is no longer running and all its managed volumes are
dismounted (mount shows nothing in /tmp_mnt), proceed to

PROCEDURE 2: CLEANING UP THE MESS
a) check /etc/mtab for /v, /eo/apjbx, /eo/fdajbx, or /eo/jbx750 entries;
   change 'ignore' to 'nfs' for each occurrence
b) use 'umount' to remove the objects, e.g. umount /v OR umount /eo/jbx750
c) use 'rmdir' to remove any directories under /tmp_mnt/eo or
   /tmp_mnt/v; remove /tmp_mnt/eo and /tmp_mnt/v
d) remove /v and any subdirectories; remove any directories in the /eo
   directory (NOTE: do not remove /eo, it is required for automount)

If you are positively, absolutely, 100%, guaranteed certain that nothing
is mounted under /tmp_mnt, you may use 'rm -r' in place of 'rmdir'; the
consequences of being wrong (and using 'rm -r') can be disastrous.
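
For step (a) of Procedure 2, the entries that still need editing can be
listed without changing anything. This is a read-only sketch: the function
name mtab_ignore_entries is hypothetical, and it assumes the common mtab
layout in which the mount point is the second field and the type is the
third. Check a few lines of the local /etc/mtab by eye before relying on it.

```shell
# Hypothetical helper: print mtab lines for /v and /eo/* that are still
# typed 'ignore'. Assumption: fields are device, mount point, type, options.
mtab_ignore_entries() {
    awk '$3 == "ignore" && ($2 == "/v" || $2 ~ /^\/eo\//)' "${1:-/etc/mtab}"
}
```

No output means there is nothing left to change in step (a).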

Once /etc/mtab contains only local volumes and explicit NFS mounts (e.g.
/par10), there are no subdirs in /tmp_mnt, /v does not exist, and /eo
is an empty directory, then (and only then) automount can be restarted:

/etc/server -p /usr/etc/automount -m -av -f /etc/auto.master [Apollo]
/usr/etc/automount -m -f /etc/auto.master [HP,SG,IBM]

Automount should start and run cleanly at this point, provided all the above
conditions were met and the network is healthy. If it doesn't, the final
solution (as always) is to reboot the machine.
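
The directory preconditions for a restart (no subdirs in /tmp_mnt, no /v, an
empty /eo) can be verified with a sketch like the one below. The function
name ready_for_restart and its optional root-prefix argument are
hypothetical; the prefix exists only so the check can be tried on a scratch
tree. The /etc/mtab condition is not covered and must still be checked
separately.

```shell
# Hypothetical helper: check the directory preconditions for restarting
# automount. Optional argument: a root prefix for a dry run (default: none).
# Returns 0 if all checks pass, 1 otherwise. /etc/mtab is NOT examined.
ready_for_restart() {
    root=${1:-}
    status=0
    # there must be no subdirectories in /tmp_mnt
    for d in "$root"/tmp_mnt/*/; do
        if [ -d "$d" ]; then
            echo "subdirectory left in /tmp_mnt: $d" >&2
            status=1
        fi
    done
    # /v must not exist
    if [ -e "$root/v" ]; then
        echo "/v still exists" >&2
        status=1
    fi
    # /eo must exist and be empty
    if [ ! -d "$root/eo" ]; then
        echo "/eo is missing" >&2
        status=1
    elif [ -n "$(ls -A "$root/eo")" ]; then
        echo "/eo is not empty" >&2
        status=1
    fi
    return $status
}
```

A possible use is 'ready_for_restart && echo ok' before issuing the restart
command for the platform.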

Contributed by: rvenable

Information and HTML Formatting Courtesy of:

NHLBI/LBC Computational Biophysics Section
FDA/CBER/OVRR Biophysics Laboratory