A collate/broadcast virtual filesystem

SEH 2006-10-31: This vfs has been rewritten and updated, now built upon the latest version of a template virtual filesystem.

An important new feature has been added: a "catchup" function. The vfs (as before) allows the user to specify several write locations to which all file changes are distributed. But now, when the catchup option is activated, one or more of the specified write locations need not exist at the time of the file write. If a write can't be done because the write directory is unavailable, the action is recorded in a "catchup" file. At each subsequent write action, the catchup file is examined to see if any of the missing directories has become available; if one has, the recorded write is executed there, allowing a newly-mounted location to "catch up."
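A minimal sketch of the catchup idea, for illustration only and not taken from the actual vfs code. The proc names (writeFile, catchupWrite) are hypothetical, and the deferred writes are kept in a Tcl list rather than in a catchup file, which is an assumption made to keep the example short.

```tcl
# Hypothetical helper: write data to <dir>/<relative>, creating
# any missing subdirectories below an existing write location.
proc writeFile {dir relative data} {
    set target [file join $dir $relative]
    file mkdir [file dirname $target]
    set chan [open $target w]
    puts -nonewline $chan $data
    close $chan
}

# Hypothetical sketch of a catchup-aware broadcast write.
# catchupVar names a list of {dir relative data} entries that
# stands in for the catchup file described above.
proc catchupWrite {writeDirs catchupVar relative data} {
    upvar 1 $catchupVar pending
    if {![info exists pending]} {
        set pending {}
    }

    # Replay deferred writes whose directory has become available.
    set stillPending {}
    foreach entry $pending {
        lassign $entry dir rel oldData
        if {[file isdirectory $dir]} {
            writeFile $dir $rel $oldData
        } else {
            lappend stillPending $entry
        }
    }
    set pending $stillPending

    # Apply the new write, deferring it for unavailable locations.
    foreach dir $writeDirs {
        if {[file isdirectory $dir]} {
            writeFile $dir $relative $data
        } else {
            lappend pending [list $dir $relative $data]
        }
    }
}
```

Replaying before applying the new write keeps the deferred actions in their original order for a location that has just appeared.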

This allows functionality similar to IMAP directories or the Andrew VFS; one can work offline on local files, then sync with a main repository automatically when reconnecting to a network.

The collate vfs and the other template virtual filesystems have been released in a zip file as a subsidiary project on the SourceForge project page of the FILTR [L1].


SEH 2006-01-10: An update of this code has been posted to a SourceForge project site [L2]. All future updates will be placed there as well.


SEH: A collate/broadcast/collect virtual filesystem.

Usage: ::vfs::template::collate::Mount <list of read directories> <list of write directories> <list of collect directories> <virtual mount point>

Collate: reads from multiple specified directories and presents the results as one at the mount location.

Broadcast: applies all writes in the mount location to multiple specified directories.

Collect: copies any file read from or written to any of the above locations to specified directories.

The lists of specified read, write and collect locations are independent; they can overlap or not as desired.

For file read access, each respective location in the read list is searched for the requested file, and the first instance of the file found is read.
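The read rule above can be sketched in a few lines of plain Tcl. This is an illustration under assumptions, not the vfs's own code; the proc name firstMatch is hypothetical.

```tcl
# Hypothetical helper: search the read locations in list order and
# return the path of the first one that holds the requested file.
proc firstMatch {readDirs relative} {
    foreach dir $readDirs {
        set candidate [file join $dir $relative]
        if {[file exists $candidate]} {
            return $candidate
        }
    }
    return -code error "no such file in any read location: $relative"
}
```

Because the search stops at the first hit, the order of the read list decides which copy wins when a file exists in several locations.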

Write and create commands are applied to each respective write location.
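As a rough illustration of the broadcast rule, the following sketch writes the same data under every write location. The proc name broadcastWrite is hypothetical, and creating missing subdirectories with [file mkdir] is an assumption of this example rather than documented behavior of the vfs.

```tcl
# Hypothetical helper: apply one write to every write location.
proc broadcastWrite {writeDirs relative data} {
    foreach dir $writeDirs {
        set target [file join $dir $relative]
        # Assumption: create intermediate directories as needed.
        file mkdir [file dirname $target]
        set chan [open $target w]
        puts -nonewline $chan $data
        close $chan
    }
}
```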

Directory listings are aggregates of all respective directory contents in all read locations.
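The aggregate listing can be pictured as a union of per-location listings. The sketch below is hypothetical (the proc name collateListing is not from the vfs code) and, as written, it skips dotfiles because a plain * glob pattern does not match them.

```tcl
# Hypothetical helper: merge the contents of <relative> across all
# read locations into one sorted list without duplicates.
proc collateListing {readDirs relative} {
    set names {}
    foreach dir $readDirs {
        set sub [file join $dir $relative]
        if {![file isdirectory $sub]} {
            continue
        }
        foreach entry [glob -nocomplain -directory $sub -tails -- *] {
            lappend names $entry
        }
    }
    return [lsort -unique $names]
}
```

A name that occurs in several read locations appears once in the result, which matches the idea that the first instance found is the one actually read.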

Collect locations are not included in file or directory listings, and are not searched for read access.

Any of the read, write or collect lists can be an empty string.

Example use: specify parallel locations on a hard drive, on a CD-ROM mount and on an ftp vfs as the read list. Files will be read first from the hard drive; if a file is not found there, the CD-ROM and the ftp site will be searched in turn. The hard drive can be specified as the single write location, and no writes to the CD-ROM or ftp site will ever be attempted.

Example collect location use: specify a single hard drive location as a read, write and collect directory. Specify an ftp vfs as a secondary read directory. As ftp files are downloaded they are copied to the collect directory; the local copies are accessed first on subsequent reads and writes: hence the collect specification produces a self-generating local cache.
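The self-generating cache can be sketched as a read that copies what it finds into the collect locations. This is only an illustration: the proc name readWithCollect is hypothetical, and ordinary directories stand in for the hard drive and ftp locations.

```tcl
# Hypothetical helper: read the first instance of a file and copy it
# into every collect location that does not already hold that copy.
proc readWithCollect {readDirs collectDirs relative} {
    foreach dir $readDirs {
        set source [file join $dir $relative]
        if {![file isfile $source]} {
            continue
        }
        foreach cdir $collectDirs {
            set copy [file join $cdir $relative]
            # Skip the copy when source and target are the same file.
            if {[file normalize $copy] eq [file normalize $source]} {
                continue
            }
            file mkdir [file dirname $copy]
            file copy -force -- $source $copy
        }
        set chan [open $source r]
        set data [read $chan]
        close $chan
        return $data
    }
    return -code error "no such file in any read location: $relative"
}
```

With the local directory listed first among the read locations and also as the collect location, the first read is served from the remote location and later reads are served from the local copy.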

Based on a template virtual filesystem

SEH 2004-12-07: Posted my latest code, which takes care of some issues and adds a feature: now you can use variables in the pathnames of the directories in the lists specified on the command line; so, for example, you could have "$::env(HOME)/files" as one of the read directories in a startup script containing the mount command, and anyone using the script would have their own home directory referenced.
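One way such variable references could be expanded is with [subst], sketched below. Whether the vfs itself uses [subst] is an assumption, and the proc name expandPath is hypothetical.

```tcl
# Hypothetical helper: expand variable references in a pathname.
# Only variable substitution is requested; -nobackslashes leaves
# backslashes in Windows-style paths untouched.
# The substitution runs at global level so that names such as
# $::env(HOME) or $HOME resolve against global variables.
proc expandPath {path} {
    return [uplevel #0 [list subst -nocommands -nobackslashes $path]]
}
```

For instance, with ::env(HOME) set to /home/demo, the call expandPath {$::env(HOME)/files} returns /home/demo/files. Note that the path must be passed in braces (or otherwise unsubstituted) so that the expansion happens at lookup time rather than when the list is first written.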


NEM: Interesting idea. The description of file reads reminded me somewhat of inheritance in object-oriented languages, with the directories representing classes and files representing instance data.


SEH 2004-07-14: I added a "collect" option to the filesystem. See above.

2004-07-19: Fixed problems pointed out by CMcC


2004-11-03: Brian Theado - I found that [cd] to any virtual directory didn't work, and that [file stat] only worked for files and not for directories. See the change to AcquireFile below (please review my change). I also found an extra set of square brackets in the else clause of the Open command. See the fix below.

SEH 2004-12-07: Fixes incorporated into latest code below. Thanks!


CMcC 2004-12-04: It would be useful to be able to specify an additive directory, which is overlaid on every directory within a vfs, such that a file present in the additive directory would be present in every directory of the collate vfs. It would be like an additional read set, extending each element of the read set.

SEH 2004-12-07: If you were to take the template virtual filesystem and make "set relative ." the first line of the handler procedure, then you should have a virtual filesystem where the files in the root directory of the virtual mount appear to exist in every subdirectory queried. Then you could put that virtual location into the read list of this virtual filesystem, and you should have the behavior you're after.


# collatevfs.tcl --
#
#        A collate/broadcast/collect virtual filesystem.
#
# Written by Stephen Huntley ([email protected])
#
# Install: This code requires that the template vfs (https://wiki.tcl-lang.org/11938) procedures have already
# been sourced into the interpreter.
#
# Usage: ::vfs::template::collate::Mount <list of read directories> <list of write directories> <list of collect directories> <virtual mount point>
#
# Collate: reads from multiple specified directories and presents the results as one at the mount location.
#
# Broadcast: applies all writes in the mount location to multiple specified directories.
#
# Collect: copies any file read from or written to any of the above locations to specified directories. 
#
# The lists of specified read, write and collect locations are independent; they can overlap or not as desired.
#
# For file read access, each respective location in the read list is searched for the requested file, 
# and the first instance of the file found is read.
#
# Write and create commands are applied to each respective write location.
#
# Directory listings are aggregates of all respective directory contents in all read locations.
#
# Collect locations are not included in file or directory listings, and are not searched for read access.
#
# Any of the read, write or collect lists can be an empty string.
#
# Example use: specify parallel locations on a hard drive, on a CD-ROM mount and an ftp vfs as the read list.
# Files will be read first from the hard drive; if not found there, the CD-ROM and ftp site will be searched in turn.
# The hard drive can be specified as the single write location, and no writes to the CD-ROM or 
# ftp site will ever be attempted.
#
# Example collect location use: specify a single hard drive location as a read, write and collect directory.  
# Specify an ftp vfs as a secondary read directory.  As ftp files are downloaded they are copied to the 
# collect directory; the local copies are accessed first on subsequent reads and writes: hence the collect
# specification produces a self-generating local cache.

CMcC: This is a really important VFS, because it allows very flexible stacking of file systems, which gives you the ability to functionally compose file systems in a flexible manner.

jcw: Came late to the show, but I agree! SEH's VFS collection is starting to become a phenomenal toolbox, though I suspect that the implications will only become clear in actual use (or demos of such). It does bring up another issue, which IMO deserves a page of its own: Array vs. VFS.