Updated 2015-04-15 00:31:32 by RVM

Text files are flexible, and are used to store all sorts of information these days. The upcoming standard for document exchange, XML, is based on text documents and on their inherent portability and ease of use. But text files have a few not-so-attractive properties:

  • They are linear: inserting and deleting information is inefficient
  • They are not safe: if a change is interrupted, information may be lost
  • They are not very efficient for storing numeric and binary data

Database systems include additional logic and complexity to address these issues, but they can bring other consequences with them:

  • Complexity, with regard to installation as well as during use
  • Being "too big a solution", ending up slow and resource-consuming
  • Forcing a rigid design, which hinders evolution and ad-hoc scripting

Entirely different aspects of data storage come into play in multi-user contexts: resolving contention and preventing data corruption when multiple clients access and modify a common dataset, finding ways to present a consistent view of the data to all clients while changes are made, and finding ways to evolve the data model as well as the software used on each of the client workstations (deployment of updates).

A simple application, implemented with Tcl and MetaKit as a Scripted Document, has been created to address each of the above issues. At this stage, it is being incorporated in a workflow-type application, with some two dozen clients viewing and initiating new "tasks". The application offers different views on a common but varying base of information, and these views are occasionally adjusted by one or two administrators. In user terms, this is a single application; in terms of technology, it is in fact an application whose user interface is adjusted occasionally, at run time.

It's called the "MetaKit Consistency Server".

The server has the following properties:

  • The server and database are platform-independent
  • There is no design mode; the database will adapt to store any data
  • Multi-user access through standard TCP/IP
  • The Tcl client interface uses arrays and tracing to track data
  • Changes on the server are automatically propagated to all clients
  • Client scripts can be stored and propagated through the server
  • No server down time to alter data structures or client software
  • Being a scripted document, it runs "out of the box", without installation

This solution is still young. It requires Tcl on both the server and the client side. It has not yet been tuned for performance; the first goal has been to create a simple and robust functional system.

On the client side

The client side of this mechanism is very simple. It requires a standard "client.tcl" script, and there is very little setup. For example, to share data stored in the global "myArray" consistently between a group of identical clients, only the following lines are needed:
        source client.tcl
        global myArray
        McOpen 12345
        McAttach myArray

This example assumes the server is running on the same machine, and listens on port 12345.

Once these lines have been executed, "myArray" essentially becomes a replicated (and persistent) entity. It will be loaded with the current contents on start-up, changes to it will be propagated to all clients (both "set" and "unset" commands are handled), and remote changes will be propagated back to this client whenever requests come in from the server. This all happens in the background, in idle time, with the magic of Tcl's variable traces and fileevents.
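
The change tracking described above can be illustrated with plain Tcl variable traces. This is only a minimal sketch, not the code in "client.tcl": the names "recordChange" and "changeLog" are made up for this example, and a real client would presumably send each change over its socket rather than append it to a list.

```tcl
# Hypothetical stand-in for "send this change to the server":
# every change is appended to a global list instead.
set changeLog {}

# Trace callback. Tcl appends three arguments: the array name,
# the element name, and the operation ("write" or "unset").
proc recordChange {arrayName elem op} {
    global changeLog
    # The callback runs one level below the code that touched the variable.
    upvar 1 $arrayName arr
    if {$op eq "write"} {
        lappend changeLog [list set $elem $arr($elem)]
    } else {
        # For an unset of the whole array, elem is the empty string.
        lappend changeLog [list unset $elem]
    }
}

# A trace on the whole array fires for every element.
array set myArray {}
trace add variable myArray {write unset} recordChange

set myArray(task1) "review draft"
set myArray(task2) "send invoice"
unset myArray(task1)

# changeLog now holds:
#   {set task1 {review draft}} {set task2 {send invoice}} {unset task1}
```

Because the trace is attached to the array itself, application code keeps using ordinary "set" and "unset" commands and needs no knowledge of the replication layer.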

Keeping more data consistent is a matter of adding more McAttach calls.

On the server side

The server is based on Tcl and MetaKit. Due to the dynamic properties of MetaKit, it is able to add, alter, and remove data structures on the fly, while running. As a consequence, there is no setup. Just start an "empty" generic server once, and forget about it. In normal use, the clients will connect and request specific arrays to be shared. If the requested array is already present on the server, its current contents are returned to initialize the client's copy. If it is new, a corresponding view is defined in the server, and from then on the data becomes shared.
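
The "return if present, define if new" behaviour can be sketched in a few lines of plain Tcl. This is an illustration under stated assumptions, not the actual server code: an in-memory dict stands in for the MetaKit views, and the procedure names "attach", "applySet" and "applyUnset" are hypothetical.

```tcl
# Hypothetical in-memory store: maps a shared array name to a dict
# of its elements. The real server keeps this in MetaKit views.
set store [dict create]

# Handle a client's request to share an array: create an empty entry
# the first time a name is seen, then return the current contents.
proc attach {name} {
    global store
    if {![dict exists $store $name]} {
        dict set store $name [dict create]
    }
    return [dict get $store $name]
}

# Apply a "set" received from a client.
proc applySet {name elem value} {
    global store
    attach $name
    dict set store $name $elem $value
}

# Apply an "unset" received from a client.
proc applyUnset {name elem} {
    global store
    attach $name
    dict unset store $name $elem
}

# First client attaches a new array and receives an empty result.
set initial [attach tasks]
applySet tasks t1 "review draft"
applySet tasks t2 "send invoice"
applyUnset tasks t1

# A second client attaching later receives the current contents.
set later [attach tasks]
```

No schema is declared anywhere: the first request for a name is what brings the structure into existence, which is the point of having no design mode.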

The server has two configuration parameters: a port number to listen on (which the clients need to know to connect to the server), and a timer value which defines when and how often commits are forced on the server. If this value is not defined, the server will only commit its current state to file when one of the clients issues a "McCommit" call. A more convenient approach, however, is to set this timer to the number of milliseconds until changes are committed to file. Setting this value too low bogs down the server with continuous commits during heavy load, while setting it very high leaves a longer window during which a catastrophic failure on the server will cause the most recent changes to be lost. But since the server uses transaction processing, failures will always leave the server in a consistent state: it will merely be the state as of the previous commit.
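
One way such a commit timer could work is a self-rescheduling "after" callback that only commits when something has changed. The sketch below is an assumption about the mechanism, not the server's actual implementation: "commitToFile", "commitTick" and "noteChange" are hypothetical names, and a counter stands in for the real commit to file.

```tcl
# Hypothetical state: a dirty flag, and a counter standing in for real commits.
set dirty 0
set commitCount 0

# Stand-in for writing the database to file.
proc commitToFile {} {
    global dirty commitCount
    incr commitCount
    set dirty 0
}

# Commit if anything changed since the last tick, then re-arm the timer.
proc commitTick {interval} {
    global dirty
    if {$dirty} {
        commitToFile
    }
    after $interval [list commitTick $interval]
}

# Called whenever a client modifies shared data.
proc noteChange {} {
    global dirty
    set dirty 1
}

# A short interval keeps this demonstration quick.
commitTick 50
noteChange

# Run the event loop briefly so that the timer gets a chance to fire.
after 120 {set done 1}
vwait done
```

With this arrangement an idle server performs no commits at all, and the interval bounds how much work can be lost after a crash.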

To set up the server port, use the following command:
        ./server --config-port=<port>

The default port is (arbitrarily) set to 20458.

The following command will set the commit timer:
        ./server --config-timer=<milliseconds>

The default timer value is set to 5000, i.e. 5 seconds.

Once these two values have been set, you can start the server and forget about it:
        ./server &

To have it display what it is doing, there is a third setting, which can be adjusted before starting the server. The following command turns messages on:
        ./server --config-verbose=1

See Tequila