tar

tar, or "tape archive", is a format for storing file hierarchies in a single file, known as a tarball.

See Also

Tcllib tar
reads and writes tarballs
Jimlib
writes tarballs from Jim Tcl

Description

SDX

jcw: There's code to decode and unpack a tar file in the SDX starkit utility: unwrap it and look at lib/app-sdx/tgz2kit.tcl. It's about 25 lines.

The sdx untar does not respect file permissions, owners, or special files: it does not set the owner, group, or permissions of the files it extracts, and it does not create special files such as hard and soft links.


LV: Something I recently learned about the tar package in tcllib: until February 2007, the tar files it created occasionally didn't include all the files they should have. This has been fixed, but the fix hasn't yet been included in a formal tcllib release.

Anyway, while testing that, I discovered that the GNU tar program creates larger tar files than either the non-GNU tar program that comes with Sun Solaris or the tar package in tcllib. For that matter, Solaris tar and the Tcl tar package create different tar files for the same data, although the resulting files are the same size. Just wanted to warn people to expect the unexpected...

AF: GNU tar is a format mostly incompatible with POSIX tar. The tcllib tar module documents the fact that it only supports POSIX tar. The main difference between GNU and POSIX is that GNU supports paths longer than 100 characters (the POSIX header is a fixed length), hence the larger size. I think the difference between the tarballs from Solaris tar and from the module is due to the order of the files in the tarball, which should be insignificant.
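
To make the fixed-length header concrete, here is a minimal sketch in Python (not Tcl, and not the tcllib module) that writes a one-file ustar archive in memory with the standard tarfile module and decodes the first 512-byte header block by hand. The offsets used (name in bytes 0-99, size as octal text in bytes 124-135, magic in bytes 257-262) are those of the ustar header; the helper names build_archive and parse_header are made up for illustration.

```python
import io
import tarfile


def build_archive(name, payload, fmt):
    """Return the bytes of an in-memory tar archive holding one file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tf:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tf.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def parse_header(block):
    """Decode a few fixed-width fields from a 512-byte ustar header block."""
    name = block[0:100].rstrip(b"\0").decode("ascii")  # NUL-padded name field
    size = int(block[124:136].rstrip(b"\0 ").decode("ascii"), 8)  # octal text
    magic = block[257:263]
    return name, size, magic


data = build_archive("hello.txt", b"hello", tarfile.USTAR_FORMAT)
name, size, magic = parse_header(data[:512])
print(name, size, magic)
```

Because the name field is only 100 bytes wide, anything longer has to be stored some other way, which is where the formats diverge.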

MAKR: Correction: There are quite a number of tar formats available: gnu, oldgnu, v7, star, ustar (POSIX.1-1988), posix (POSIX.1-2001, see pax there), etc. Recent GNU tar versions support at least extraction for all of those listed. Running it with --help will show which format your particular GNU tar executable creates by default. The format can be changed with --format. The best choice WRT portability would IMHO be posix. It no longer imposes restrictions on path name length, maximum file size, device node numbers, UID/GID numbers, encodings, etc., while trying to remain as backward compatible with ustar as possible.
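
The practical difference between the formats shows up with long names. A rough sketch, using Python's standard tarfile module rather than GNU tar itself: a 150-character file name with no directory part cannot be stored in a ustar header, while the gnu and posix (pax) formats store it by writing an extra header member first. The helper first_typeflag is hypothetical; type flag L marks the GNU long-name member and x the pax extended header.

```python
import io
import tarfile

LONG_NAME = "x" * 150  # one path component, longer than the 100-byte name field


def first_typeflag(fmt):
    """Write LONG_NAME as an empty file in the given format and return the
    type flag (header byte 156) of the first header block in the archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tf:
        tf.addfile(tarfile.TarInfo(name=LONG_NAME))
    return buf.getvalue()[156:157]


# ustar has nowhere to put the name, so tarfile refuses to write it.
try:
    first_typeflag(tarfile.USTAR_FORMAT)
    ustar_ok = True
except ValueError:
    ustar_ok = False

gnu_flag = first_typeflag(tarfile.GNU_FORMAT)  # extra long-name member
pax_flag = first_typeflag(tarfile.PAX_FORMAT)  # extra pax extended header
print(ustar_ok, gnu_flag, pax_flag)
```

The extra member is one reason archives of the same files can differ in size from one format to another.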

GNU tar supports a number of operation extensions, e.g. instead of creating a tar archive and then piping it through a compressor, one could just issue -Z for compress, -z for gzip, -j for bzip2, or even --lzma for LZMA (as of version 1.20).
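
The same one-step idea exists outside GNU tar. As an illustration only (this is Python's standard tarfile module, not the GNU tar command line), the mode string "w:gz" writes a gzip-compressed tarball directly, with no separate compressor step. The helper make_tgz is made up for this sketch.

```python
import io
import tarfile


def make_tgz(files):
    """Return gzip-compressed tar bytes for a {name: bytes} mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tf:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tf.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


blob = make_tgz({"a.txt": b"alpha", "b.txt": b"beta"})

# Read the archive back, decompressing on the fly.
with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tf:
    names = tf.getnames()
print(names)
```

The modes "w:bz2" and "w:xz" work the same way for bzip2 and LZMA-based xz compression.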


DKF: Tar is also, and coincidentally, a black sticky substance that traps stuff that falls into it (e.g. the La Brea Tar Pits in LA [L1 ]). In computing, this gives rise to the term "a tar-pit project", one which sucks in all effort that comes even close and swallows it all without a trace. Avoid tar pits by using Tcl/Tk!

LV: I've tended to refer to this type of effort as a black hole. I've also heard it referred to as a tar baby (which refers back to the old children's tales of Brer Rabbit and company).

ECS: The first reference I know of to "a tar-pit project" is from Fred Brooks' The Mythical Man-Month: Essays on Software Engineering.

GWM: I've also called these 'bucket of worms' projects: every time you move one layer, another one slithers into view. Similar: 'a box of frogs' (originally referring to the way 5-10 year old boys leap about; imagine opening a shoe box with frogs in it, and out they all come).

escargo 2007-02-13: Another tar-pit is the Turing tar-pit, where everything is possible, but nothing is easy. [L2 ] [L3 ]

AMG: Avoid Turing tar pits by using Tcl/Tk!