Velocity Reviews


Ara.T.Howard 11-04-2004 06:06 PM

[ANN] dirwatch-0.0.6


dirwatch v0.0.6

dirwatch [options]+ [directory = ./] [mode = watch] [dbdir = 0]

dirwatch is a tool used to design file system based event driven systems.

dirwatch manages an sqlite database that mirrors the state of a directory
and then triggers user definable event handlers for certain filesystem
activities such as file creation, modification, deletion, etc. dirwatch
normally runs as a daemon process, synchronizing the database inventory with
that of the directory and then firing the appropriate triggers. dirwatch is
designed such that more than one 'watch' may be placed on a given directory
and it is nfs clean.

the following actions may have triggers configured for them

created -> a file was detected that was not already in the database
modified -> a file in the database was detected as being modified
updated -> a file was created or modified (union of these actions)
deleted -> a file in the database is no longer in the directory
existing -> a file in the database still exists in the directory and has not
been modified
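
as a rough illustration of how these actions relate, the following ruby sketch classifies entries by comparing two inventories. it is not taken from the dirwatch source: the method name 'classify', the use of plain hashes of path => mtime, and detecting modification by mtime comparison alone are all assumptions made for the example.

```ruby
# Hypothetical helper, not part of dirwatch: compare the inventory stored in
# the database (previous) with a fresh scan of the directory (current).
# Both arguments are Hashes mapping path => mtime.
def classify(previous, current)
  actions = { created: [], modified: [], deleted: [], existing: [] }

  current.each do |path, mtime|
    if !previous.key?(path)
      actions[:created] << path      # not yet in the database
    elsif previous[path] != mtime
      actions[:modified] << path     # in the database, but mtime changed
    else
      actions[:existing] << path     # in the database and unchanged
    end
  end

  # in the database but no longer in the directory
  actions[:deleted] = previous.keys - current.keys

  # 'updated' is the union of 'created' and 'modified'
  actions[:updated] = actions[:created] + actions[:modified]
  actions
end

previous = { "a.txt" => 100, "b.txt" => 200, "c.txt" => 300 }
current  = { "a.txt" => 100, "b.txt" => 250, "d.txt" => 400 }
result   = classify(previous, current)
```

here 'a.txt' ends up under existing, 'b.txt' under modified, 'c.txt' under deleted and 'd.txt' under created, with updated holding both 'd.txt' and 'b.txt'.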

the command line 'mode' must be one of the following

create (c) -> initialize the database and supporting files
watch (w) -> maintain database inventory and trigger actions
list (l) -> dump database to stdout in silky smooth yaml format
template (t) -> generate a template config file

for all modes except 'template' the first command line argument must be the
name of the directory to which the operation applies

mode: create (c)

initializes a storage directory, known from here on as 'dbdir', with all
required database files, logs, command directories, sample configuration,
sample programs, etc.

by default the dbdir will be stored in a numbered subdirectory such as

directory/.dirwatch/n/

where 'directory' is the directory named on the command line and 'n' is the
watch number.

multiple dirwatches may be placed upon a directory - these 'watches' will be
automagically numbered starting from 0 as they are created. for instance
the command

dirwatch ./ create

followed by another

dirwatch ./ create

would initialize both the dbdirs './.dirwatch/0' AND './.dirwatch/1' to
allow two 'watches' (0 and 1) to later be placed upon the directory. see
the watch section below.

dbdir may be specified at creation (or watch) time as either the last
command line argument, or by using the '--dbdir' option, as the full path to
the storage directory. as a special case dbdir may be specified as a number
only (matching /[0-9]+/) in which case the dbdir is assumed to be a numbered
subdirectory of directory/.dirwatch/.

for example

dirwatch ./ create 42

or

dirwatch --dbdir=42 ./ create

would use the directory ./.dirwatch/42/ as dbdir, and

dirwatch ./ create /full/path/to/dbdir

would use /full/path/to/dbdir as a dbdir
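
the dbdir rules above can be summarized in a few lines of ruby. this is only a sketch under stated assumptions: 'resolve_dbdir' is a hypothetical name, and the pattern is anchored here so that only a purely numeric argument is treated as a watch number, which is how the "number only" wording is read for this example.

```ruby
# Hypothetical helper, not part of dirwatch: turn the dbdir argument into a
# storage directory path.
def resolve_dbdir(directory, dbdir = "0")
  dbdir = dbdir.to_s
  if dbdir =~ /\A[0-9]+\z/
    # a bare number names a numbered subdirectory of directory/.dirwatch/
    File.join(directory, ".dirwatch", dbdir)
  else
    # anything else is taken as the full path to the storage directory
    dbdir
  end
end

numbered = resolve_dbdir("./", "42")
explicit = resolve_dbdir("./", "/full/path/to/dbdir")
default  = resolve_dbdir("dir")
```

with these inputs 'numbered' is './.dirwatch/42', 'explicit' is '/full/path/to/dbdir' and 'default' is 'dir/.dirwatch/0'.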

when a dirwatch directory is created a hierarchy is created for storing
commands (programs) to be triggered for the various actions. the hierarchy
is :


the idea being that the actual trigger commands (programs) will be stored
in either the commands/ subdirectory or in an action specific subdirectory
(commands/created/, commands/deleted/, etc.). it is not required to store
programs here, but these locations are automatically checked based on
trigger type.
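
the lookup order this document gives for trigger commands (the action specific subdirectory, then commands/, then $PATH) might be sketched in ruby as follows. the method names 'trigger_search_path' and 'find_command' are hypothetical, and how dirwatch itself resolves commands may differ.

```ruby
# Hypothetical helpers, not part of dirwatch.

# Build the list of directories searched for a trigger command:
# dbdir/commands/<action>/ + dbdir/commands/ + $PATH
def trigger_search_path(dbdir, action, env_path = ENV["PATH"].to_s)
  [
    File.join(dbdir, "commands", action),
    File.join(dbdir, "commands"),
    *env_path.split(File::PATH_SEPARATOR)
  ]
end

# Return the first executable file called 'name' on the search path, or nil.
def find_command(name, search_path)
  search_path
    .map { |dir| File.join(dir, name) }
    .find { |candidate| File.file?(candidate) && File.executable?(candidate) }
end

search_path = trigger_search_path("dir/.dirwatch/0", "modified", "/usr/bin:/bin")
```

a program stored in dir/.dirwatch/0/commands/modified/ would therefore shadow one of the same name in commands/ or on $PATH.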

a default config file will be auto-generated and placed in the 'dbdir' with
the name 'dirwatch.conf'. this config will automatically be used, iff
found, when watching. use the '--config' option to override this.

mode: watch (w)

dirwatch is designed to run as a daemon, updating the database inventory
at the interval specified by the '--interval' option (5 minutes by default)
and firing appropriate trigger commands. two watchers may not watch the
same dbdir simultaneously and attempting to start a second watcher will
fail when the second watcher is unable to obtain the pid lockfile. it is a
non-fatal error to attempt to start another watcher when one is running and
this failure can be made silent by using the '--quiet' option. the reason
for this is to allow a crontab entry to be used to make the daemon
'immortal'. for example, the following crontab entry

*/15 * * * * dirwatch directory --daemon --dbdir=0 \
--files_only --flat \
--interval=10minutes --quiet

or (same but shorter)

*/15 * * * * dirwatch directory -D -d0 -f -F -i10m -q

will __attempt__ to start a daemon watching 'directory' every fifteen
minutes. if the daemon is not already running one will be started, otherwise
dirwatch will simply fail silently (no cron email sent due to stderr).

this feature allows a normal user to setup daemon processes that not only
will run after machine reboot, but which will continue to run after other
terminal program behaviour.
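
the "second watcher fails quietly" behaviour can be illustrated with a small ruby sketch. this is not dirwatch's locking subsystem (which is described as nfs safe). it is a local-filesystem stand-in that uses exclusive file creation, and the names 'acquire_pidfile' and 'pidfile' are assumptions. it also does not deal with stale pidfiles left behind by a crashed watcher.

```ruby
require "tmpdir"

# Hypothetical helper, not part of dirwatch: returns true if this process
# created the pidfile, false if it already existed (another watcher running).
def acquire_pidfile(path, quiet: false)
  # File::EXCL makes the open fail with EEXIST if the file is already there
  File.open(path, File::WRONLY | File::CREAT | File::EXCL) do |f|
    f.puts Process.pid
  end
  true
rescue Errno::EEXIST
  # a non-fatal condition: report it unless asked to be quiet (as with --quiet)
  warn "another watcher already holds #{path}" unless quiet
  false
end

Dir.mktmpdir do |dbdir|
  pidfile = File.join(dbdir, "pidfile")   # file name is an assumption
  first   = acquire_pidfile(pidfile, quiet: true)
  second  = acquire_pidfile(pidfile, quiet: true)
  puts "first=#{first} second=#{second}"
end
```

a cron job that calls something like this every fifteen minutes either becomes the watcher or exits without output, which is what keeps cron from sending mail.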

the meanings of the options in the above crontab entry are as follows

--daemon -> become a child of init and run forever
--dbdir -> the storage directory, here the default is specified
--files_only -> inventory files only (default is files and directories)
--flat -> do not recurse into subdirectories (default recurses)
--interval -> generate inventory, at minimum, every 10 minutes
--quiet -> be quiet when failing due to another daemon already watching

as the watcher runs and maintains the inventory it is noted when
files/directories (entries) have been created, modified, updated, deleted,
or are existing. these entries are then handled by user definable triggers
as specified in the config file. the config file is of the format

actions :
  created :
    commands :
  updated :
    commands :

where the commands to be run for each trigger type are enumerated. each
command entry is of the following format:

  command : command to run
  type : calling convention
  pattern : filter files further by this pattern
  timing : synchronous or asynchronous execution

the meaning of each field is as follows :

command: this is the program to run. the search path for the program is
determined dynamically by the action run. for instance, when a
file is discovered to be 'modified' the search path for the
command will be

dbdir/commands/modified/ + dbdir/commands/ + $PATH

this dynamic path setting simply allows for short pathnames if
commands are stored in the dbdir/commands/* subdirectories.

type: there are four types of commands. the type merely indicates the
calling convention of the program. when commands are run there
are two pieces of information which must be passed to the
program, the file in question and the mtime of that file. the
mtime is less important but programs may use it to know if the file
has been changed since they were spawned. mtime will probably be
ignored for most commands. the four types of commands fall into
two categories: those commands called once for each file and those
types of commands called once with __all__ files

each file:

simple: the command will be called with three arguments: the file
in question, the mtime date, and the mtime time. eg:

command foobar.txt 2002-11-04 01:01:01.1234

expanded: the command will have the strings '@file' and
'@mtime' replaced with appropriate values. eg:

command '@file' '@mtime'

expands to (and is called as)

command 'foobar.txt' '2002-11-04 01:01:01.1234'

all at once:

filter: the stdin of the program will be given a list where each
line contains three items, the file, the mtime date, and
the mtime time.

yaml: the stdin of the program will be given a list where each
entry contains two items, the file and the mtime. the
format of the list is valid yaml and the schema is an
array of hashes with the keys 'path' and 'mtime'.

pattern: all the files for a given action are filtered by this pattern,
and only those files matching pattern will have triggers fired.

timing: if timing is asynchronous the command will be run and not waited
for before starting the next command. asynchronous commands may
yield better performance but may also result in many commands
being run at once. asynchronous commands should not load the
system heavily unless one is looking to freeze a machine.
synchronous commands are spawned and waited for before the next
command is started. a side effect of synchronous commands is
that the time spent waiting may sum to an amount of time greater
than the interval ('--interval' option) specified - if the amount
of time running commands exceeds the interval the next inventory
simply begins immediately with no pause. because of this one
should think of the interval used as a minimum bound only,
especially when synchronous commands are used.
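
to make the "minimum bound" point concrete, here is a small ruby sketch of a watch loop. it is an illustration only: 'pause_after' and 'watch_loop' are hypothetical names, the block stands in for one inventory pass plus its synchronous commands, and dirwatch's own loop may be structured differently.

```ruby
# Hypothetical helpers, not part of dirwatch.

# Seconds to pause after a pass that took 'elapsed' seconds. If the pass ran
# longer than the interval, the next pass begins immediately (pause of 0).
def pause_after(interval, elapsed)
  [interval - elapsed, 0].max
end

# Run the given block 'nloops' times, at least 'interval' seconds apart.
def watch_loop(interval, nloops)
  nloops.times do
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield                                   # inventory + synchronous triggers
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    sleep(pause_after(interval, elapsed))
  end
end

short_pass = pause_after(600, 45)    # pass finished well inside 10 minutes
long_pass  = pause_after(600, 900)   # pass overran the interval
```

'short_pass' is 555 seconds and 'long_pass' is 0, so a slow set of synchronous commands stretches the effective period rather than causing overlapping passes.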

note that sample commands of each type are auto-generated in the
dbdir/commands directory. reading these should answer any questions regarding
the calling conventions of any of the four types. for other questions consult
the sample config, which is also auto-generated.
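
for readers without the generated samples at hand, the following ruby sketch shows one way the 'expanded', 'filter' and 'yaml' conventions could be handled. it is not the generated sample code. the method names are hypothetical, the filter format is assumed to be whitespace separated with the date and time as the last two fields, and the yaml input is assumed to look like the small example in the code.

```ruby
require "yaml"
require "stringio"

# 'expanded': substitute '@file' and '@mtime' in the command string.
# Block form of gsub is used so backslashes in file names are left alone.
def expand_command(template, file, mtime)
  template.gsub("@file") { file }.gsub("@mtime") { mtime }
end

# 'filter': one entry per line on stdin: file, mtime date, mtime time.
# The separator is assumed to be whitespace; lines that do not fit are skipped.
def parse_filter(io)
  entries = []
  io.each_line do |line|
    m = line.chomp.match(/\A(.+?)\s+(\S+)\s+(\S+)\z/)
    next unless m
    entries << { "path" => m[1], "mtime" => "#{m[2]} #{m[3]}" }
  end
  entries
end

# 'yaml': stdin is an array of hashes with the keys 'path' and 'mtime'.
# Time is permitted in case mtime is written as an unquoted timestamp.
def parse_yaml(io)
  YAML.safe_load(io.read, permitted_classes: [Time]) || []
end

expanded = expand_command("command '@file' '@mtime'",
                          "foobar.txt", "2002-11-04 01:01:01.1234")
filtered = parse_filter(StringIO.new("foobar.txt 2002-11-04 01:01:01.1234\n"))
loaded   = parse_yaml(StringIO.new("- path: foobar.txt\n  mtime: '2002-11-04 01:01:01.1234'\n"))
```

a real trigger would pass $stdin instead of a StringIO. the 'simple' convention needs no parsing at all, since the file, date and time arrive as three ordinary arguments.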

mode: list (l)

dump the contents of the database in yaml format for easy viewing/parsing

mode: template (t)

generate a template config. the first directory argument is ignored so one
may type

dirwatch directory template [template file]

or

dirwatch template [template file]


for dirwatch:

export SQLDEBUG=1 -> cause sql debugging info to be logged
export LOCKFILE_DEBUG=1 -> cause lockfile debugging info to be logged

for triggers run under dirwatch:

DIRWATCH_DIR -> directory being watched
DIRWATCH_ACTION -> trigger type
DIRWATCH_TYPE -> command type
DIRWATCH_N_PATHS -> total number of paths for this trigger
DIRWATCH_PATH_IDX -> for simple|expanded path number
for filter|yaml set to DIRWATCH_N_PATHS
DIRWATCH_PATH -> for simple|expanded path
for filter|yaml nil
DIRWATCH_MTIME -> for simple|expanded mtime of path
for filter|yaml nil
DIRWATCH_PID -> pid of dirwatch watcher
DIRWATCH_ID -> trigger unique identifier
PATH -> .dirwatch/(0...n)/commands/action + ENV['PATH']
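
a trigger written in ruby could read these variables roughly as follows. this is a sketch only: 'describe_trigger' is a hypothetical name, the variables are assumed to be simply unset where the list above says nil, and the message format is invented for the example.

```ruby
# Hypothetical helper, not part of dirwatch: summarize how a trigger was
# invoked, based on the DIRWATCH_* variables. 'env' defaults to the real
# environment but any Hash-like object works, which keeps it easy to test.
def describe_trigger(env = ENV)
  action = env["DIRWATCH_ACTION"]
  type   = env["DIRWATCH_TYPE"]
  total  = env["DIRWATCH_N_PATHS"]

  if %w[simple expanded].include?(type)
    # called once per file: path, index and mtime describe that one file
    "#{action}: #{env['DIRWATCH_PATH']} (index #{env['DIRWATCH_PATH_IDX']}, #{total} paths)"
  else
    # filter|yaml: called once, all paths arrive on stdin
    "#{action}: #{total} paths on stdin (#{type})"
  end
end

per_file = describe_trigger(
  "DIRWATCH_ACTION" => "created", "DIRWATCH_TYPE" => "simple",
  "DIRWATCH_N_PATHS" => "3", "DIRWATCH_PATH_IDX" => "1",
  "DIRWATCH_PATH" => "foobar.txt"
)
all_at_once = describe_trigger(
  "DIRWATCH_ACTION" => "updated", "DIRWATCH_TYPE" => "yaml",
  "DIRWATCH_N_PATHS" => "3"
)
```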

directory/.dirwatch/n/ -> dirwatch data files
directory/.dirwatch/n/dirwatch.conf -> default configuration file
directory/.dirwatch/n/commands/* -> default location for triggers
directory/.dirwatch/n/db -> sqlite database file
directory/.dirwatch/n/db.schema -> sqlite database schema
directory/.dirwatch/n/lock -> sentinel lock file used for nfs safe access
directory/.dirwatch/n/ -> default pidfile
directory/.dirwatch/n/dirwatch.log -> default log file
directory/.dirwatch/n/* -> misc files used by locking subsystem

success -> $? == 0
failure -> $? != 0


1 < bugno && bugno < 42

--lockfile=[lockfile], -L
coordinate inventory on lockfile - (default directory/.lock)
--dbdir=dbdir, -d
specify dbdir used - (default directory/.dirwatch/0)
--interval=interval, -i
specify polling interval - (default 5 minutes)
--nloops=nloops, -N
specify the number of watch loops - (default infinite)
--daemon, -D
run as a daemon
--quiet, -q
fail quietly if pidfile cannot be generated
--pattern=pattern, -p
watch only files matching pattern (__not__ shell glob)
--files_only, -f
ignore everything but files - (default directories and files)
--flat, -F
do not recurse into subdirectories - (default recurse)
--pidfile=pidfile, -P
specify pidfile used - (default @dbdir/
--verbosity=verbosity, -v
0|fatal < 1|error < 2|warn < 3|info < 4|debug - (default info)
--log=path, -l
set log file - (default stderr or, iff existing, @dbdir/dirwatch.log)
daily | weekly | monthly - what age will cause log rolling (default
size in bytes - what size will cause log rolling (default 1mb)
--config=path, -c
valid path - specify config file (default @dbdir/dirwatch.conf)
valid path - generate a template config file in path (default stdout)
--help, -h
this message


0) initialize a directory for watching (dbdir = directory/.dirwatch/0/)

~ > dirwatch dir create

1) initialize another watch (the '1' is optional)

~ > dirwatch dir create 1

2) create a config (to edit afterwards)

~ > dirwatch template config
~ > vi config

3) watch a directory using all defaults, logging to stderr

~ > dirwatch dir watch

4) start daemon to watch a directory using all defaults, daemons log to
dbdir/dirwatch.log by default

~ > dirwatch dir watch -D

5) same as above but use dbdir .dirwatch/2/

~ > dirwatch dir watch 2 -D

6) dump contents of database (dbdir = .dirwatch/0/) in yaml format

~ > dirwatch dir list

7) same as above but use dbdir .dirwatch/2/

~ > dirwatch dir list 2

8) crontab entry to keep alive a watcher for a directory using default dbdir,
watching files only and not recursing into subdirectories

*/15 * * * * /full/path/to/dirwatch /full/path/to/directory w -D -f -F -q

9) another watch on that same directory using different dbdir (7). this one
watches all entries and recurses into subdirectories

*/15 * * * * /full/path/to/dirwatch /full/path/to/directory w 7 -D -q


===============================================================================
| EMAIL :: Ara [dot] T [dot] Howard [at] noaa [dot] gov
| PHONE :: 303.497.6469
| When you do something, you should burn yourself completely, like a good
| bonfire, leaving no trace of yourself. --Shunryu Suzuki
===============================================================================
