filesystem benchmarking

Some aspects of Tcl's filesystem handling (primarily glob, file, pwd and cd) became somewhat slower with the VFS layer introduced in Tcl 8.4.0. Later Tcl 8.4.x releases and Tcl 8.5.x have gradually addressed this. The current timings (tclbench output, microseconds per iteration, lower is better) are as follows:

    000 VERSIONS:                                   1:8.5a0 2:8.4.5 3:8.3.5
    001 FILE dirname absolute tmpfile (obj)               8      10       0
    002 FILE dirname absolute tmpfile (str)              11      14      10
    003 FILE dirname relative tmpfile (obj)               8      11       0
    004 FILE dirname relative tmpfile (str)              12      15      10
    005 FILE dirname ~                                    6       7      30
    006 FILE exec interp                              12652   12436   11820
    007 FILE exec interp: exit                        13463   12323   10920
    008 FILE exec interp: pkg require                 38329   67561   29940
    009 FILE exec interp: pkg require+                49142   93147   46400
    010 FILE exec interp: pkg require+auto_path       53856   78622   43400
    011 FILE exists tmpfile (obj)                        20      20      20
    012 FILE exists ~                                    11      12      40
    013 FILE exists! absolute tmpfile (obj)              17      18      20
    014 FILE exists! absolute tmpfile (str)              23     199      20
    015 FILE exists! relative tmpfile (obj)              14      18      20
    016 FILE exists! relative tmpfile (str)              17     406      10
    017 FILE exists! tmpfile (obj)                       20      21      30
    018 FILE exists! tmpfile (str)                       26     351      20
    019 FILE glob  tmpdir (30 entries) / -dir          4380    5055    4600
    020 FILE glob  tmpdir (30 entries) / cd            5146    5458    4710
    021 FILE glob  tmpdir (subset of 30 entries)       2148    8242    4410
    022 FILE glob / all subcommands                   76189   60837  102150
    023 FILE glob / atime                              6771    7358   11210
    024 FILE glob / attributes                        45904   46978   43270
    025 FILE glob / dirname                            4398    5456    4800
    026 FILE glob / executable                         8607    6796    5810
    027 FILE glob / exists                             6277    6804    5810
    028 FILE glob / extension                          4338    5140    4610
    029 FILE glob / isdirectory                        6703    8396   11810
    030 FILE glob / isfile                             6959    7721   11420
    031 FILE glob / mtime                              6780    7310   11520
    032 FILE glob / owned                              6804    7635   11510
    033 FILE glob / readable                          11928    6794    5610
    034 FILE glob / rootname                           4411    5337    4710
    035 FILE glob / size                               7427    7106   11620
    036 FILE glob / tail                               4388    4973    4600
    037 FILE glob / writable                          10904    6807    5710
    038 FILE glob deep dirs (30 entries+)             40638   53213   43570
    039 FILE glob deep dirs (subset of 30 entries+)   22169   54320   42660
    040 FILE glob dirs (30 entries)                    4430    4582   11210
    041 FILE recurse / -dir                           94584  102725  128390
    042 FILE recurse / cd                            101832  163048  127880
    043 FILE recurse+stat / -dir                     116454  137213  232730
    044 FILE recurse+stat / cd                       138270  202646  244150
    045 FILE tail absolute tmpfile (obj)                  4       6      11
    046 FILE tail absolute tmpfile (str)                  7      11       0
    047 FILE tail relative tmpfile (obj)                  5       6       0
    048 FILE tail relative tmpfile (str)                  8      52      10
    049 FILE tail ~                                       4       7      20
    049 BENCHMARKS                                  1:8.5a0 2:8.4.5 3:8.3.5

(These timings were taken on WinXP, using a slightly modified version of file.bench and a slightly modified version of Tcl 8.5a0 that is undergoing performance testing. Normalized versions appear further down the page.)

If you have particular pieces of code using file or glob that are slower in Tcl 8.4/8.5, please add them to this page so that they can be added to the benchmark suite.
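A candidate snippet is easier to add to the suite if it comes with a rough timing. The following is a minimal sketch using only the built-in time command rather than tclbench itself; the helper name usPerIter and the file name bench_tmpfile.txt are made up for illustration.

```tcl
# Return the microseconds-per-iteration figure reported by 'time'.
# (usPerIter is a hypothetical helper, not part of tclbench.)
proc usPerIter {script {iters 1000}} {
    # 'time' returns a string such as "12 microseconds per iteration"
    return [lindex [time $script $iters] 0]
}

# Create a scratch file in the current directory (hypothetical name).
set tmpfile [file join [pwd] bench_tmpfile.txt]
close [open $tmpfile w]

# Time a few of the commands discussed on this page.
puts [format "%-16s %10.1f" "file exists"  [usPerIter [list file exists $tmpfile]]]
puts [format "%-16s %10.1f" "file dirname" [usPerIter [list file dirname $tmpfile]]]
puts [format "%-16s %10.1f" "glob (cwd)"   [usPerIter {glob -nocomplain -directory [pwd] *}]]

file delete $tmpfile
```

Running the same script under each Tcl version of interest gives numbers that can be compared directly, subject to the run-to-run variation discussed in Note 3.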

Note: reading from and writing to files (channel I/O) is a separate topic. It has nothing to do with the VFS changes made in Tcl 8.4.0 and is therefore irrelevant to the benchmarking exercise on this page.

Note 2: the 'readable' and 'writable' benchmarks are slower in 8.5a0 because they now return the correct values on Windows 2000/XP (Tcl 8.3/8.4 don't correctly check the more complex NTFS user permissions).

Note 3: even with a new (hacked) version of tclbench that lets me multiply the number of timed iterations by a factor of 10, I still see very large variation in relative timings from one run to the next. For example, see the following two sets of normalized data, which should ostensibly be the same. Each benchmark is run between 300 and 10000 times to produce these.


Same data, normalized, run 1:

    000 VERSIONS:                                   1:8.5a0 2:8.4.5 3:8.3.5
    001 FILE dirname absolute tmpfile (obj)            1.75    2.50    1.00
    002 FILE dirname absolute tmpfile (str)            2.20    2.80    1.00
    003 FILE dirname relative tmpfile (obj)            1.60    2.20    1.00
    004 FILE dirname relative tmpfile (str)            2.40    3.40    1.00
    005 FILE dirname ~                                 0.21    0.29    1.00
    006 FILE exec interp                               1.04    1.10    1.00
    007 FILE exec interp: exit                         1.09    1.16    1.00
    008 FILE exec interp: pkg require                  1.37    2.39    1.00
    009 FILE exec interp: pkg require+                 1.02    1.99    1.00
    010 FILE exec interp: pkg require+auto_path        1.19    2.04    1.00
    011 FILE exists tmpfile (obj)                      0.91    0.95    1.00
    012 FILE exists ~                                  0.33    0.36    1.00
    013 FILE exists! absolute tmpfile (obj)            0.85    0.90    1.00
    014 FILE exists! absolute tmpfile (str)            1.14    9.00    1.00
    015 FILE exists! relative tmpfile (obj)            1.00    1.29    1.00
    016 FILE exists! relative tmpfile (str)            1.06   20.81    1.00
    017 FILE exists! tmpfile (obj)                     0.83    0.88    1.00
    018 FILE exists! tmpfile (str)                     1.12   11.08    1.00
    019 FILE glob  tmpdir (30 entries) / -dir          0.96    1.01    1.00
    020 FILE glob  tmpdir (30 entries) / cd            1.06    1.15    1.00
    021 FILE glob  tmpdir (subset of 30 entries)       0.46    1.05    1.00
    022 FILE glob / all subcommands                    0.70    0.62    1.00
    023 FILE glob / atime                              0.58    0.63    1.00
    024 FILE glob / attributes                         1.06    1.10    1.00
    025 FILE glob / dirname                            0.90    1.12    1.00
    026 FILE glob / executable                         1.47    1.22    1.00
    027 FILE glob / exists                             1.12    1.22    1.00
    028 FILE glob / extension                          0.94    1.05    1.00
    029 FILE glob / isdirectory                        0.58    0.63    1.00
    030 FILE glob / isfile                             0.58    0.63    1.00
    031 FILE glob / mtime                              0.58    0.63    1.00
    032 FILE glob / owned                              0.58    0.63    1.00
    033 FILE glob / readable                           1.83    1.22    1.00
    034 FILE glob / rootname                           0.95    1.05    1.00
    035 FILE glob / size                               0.58    0.63    1.00
    036 FILE glob / tail                               0.92    1.07    1.00
    037 FILE glob / writable                           1.82    1.22    1.00
    038 FILE glob deep dirs (30 entries+)              0.91    1.19    1.00
    039 FILE glob deep dirs (subset of 30 entries+)    0.52    1.23    1.00
    040 FILE glob dirs (30 entries)                    0.38    0.41    1.00
    041 FILE recurse / -dir                            0.66    0.80    1.00
    042 FILE recurse / cd                              0.65    1.29    1.00
    043 FILE recurse+stat / -dir                       0.50    0.60    1.00
    044 FILE recurse+stat / cd                         0.49    0.91    1.00
    045 FILE standard directory 'package require'      1.04    2.04    1.00
    046 FILE tail absolute tmpfile (obj)               1.33    1.67    1.00
    047 FILE tail absolute tmpfile (str)               2.00    2.25    1.00
    048 FILE tail relative tmpfile (obj)               1.25    1.50    1.00
    049 FILE tail relative tmpfile (str)               2.00    2.50    1.00
    050 FILE tail ~                                    0.17    0.22    1.00
    050 BENCHMARKS                                  1:8.5a0 2:8.4.5 3:8.3.5
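The third column is 1.00 in every row, so the normalized figures appear to be each timing divided by the Tcl 8.3.5 timing for the same benchmark. The sketch below shows that calculation on a few rows copied from the raw WinXP table. The proc name normalizeRow is hypothetical, and reporting n/a for a zero baseline is an assumption about how to handle rows where 8.3.5 measured 0.

```tcl
# Normalize one row of timings against a baseline column (0-based index).
# Returns a list of ratios, or "n/a" entries when the baseline is zero.
# (normalizeRow is a hypothetical helper, not part of tclbench.)
proc normalizeRow {timings baseIdx} {
    set base [lindex $timings $baseIdx]
    set out {}
    foreach t $timings {
        if {$base == 0} {
            lappend out n/a
        } else {
            lappend out [format %.2f [expr {double($t) / $base}]]
        }
    }
    return $out
}

# Rows copied from the raw table; columns are 8.5a0, 8.4.5, 8.3.5.
foreach {name row} {
    "glob / atime"                   {6771 7358 11210}
    "exists! relative tmpfile (str)" {17 406 10}
    "dirname absolute tmpfile (obj)" {8 10 0}
} {
    puts [format "%-32s %s" $name [normalizeRow $row 2]]
}
```

The ratios this produces will not match the normalized tables exactly, since those came from separate runs.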

Same data, normalized, run 2:

    000 VERSIONS:                                   1:8.5a0 2:8.4.5 3:8.3.5
    001 FILE dirname absolute tmpfile (obj)            1.40    2.00    1.00
    002 FILE dirname absolute tmpfile (str)            2.20    2.80    1.00
    003 FILE dirname relative tmpfile (obj)            1.60    2.20    1.00
    004 FILE dirname relative tmpfile (str)            2.40    3.60    1.00
    005 FILE dirname ~                                 0.26    0.30    1.00
    006 FILE exec interp                               1.06    1.11    1.00
    007 FILE exec interp: exit                         1.09    1.16    1.00
    008 FILE exec interp: pkg require                  1.38    2.39    1.00
    009 FILE exec interp: pkg require+                 1.02    1.99    1.00
    010 FILE exec interp: pkg require+auto_path        1.19    2.04    1.00
    011 FILE exists tmpfile (obj)                      0.95    1.00    1.00
    012 FILE exists ~                                  0.33    0.36    1.00
    013 FILE exists! absolute tmpfile (obj)            0.90    0.90    1.00
    014 FILE exists! absolute tmpfile (str)            1.05    8.59    1.00
    015 FILE exists! relative tmpfile (obj)            0.93    1.29    1.00
    016 FILE exists! relative tmpfile (str)            1.06   20.75    1.00
    017 FILE exists! tmpfile (obj)                     0.87    0.91    1.00
    018 FILE exists! tmpfile (str)                     1.13   11.54    1.00
    019 FILE glob  tmpdir (30 entries) / -dir          0.98    1.03    1.00
    020 FILE glob  tmpdir (30 entries) / cd            1.07    1.17    1.00
    021 FILE glob  tmpdir (subset of 30 entries)       0.46    1.05    1.00
    022 FILE glob / all subcommands                    0.70    0.63    1.00
    023 FILE glob / atime                              0.58    0.63    1.00
    024 FILE glob / attributes                         1.06    1.17    1.00
    025 FILE glob / dirname                            0.90    1.12    1.00
    026 FILE glob / executable                         1.48    1.23    1.00
    027 FILE glob / exists                             1.12    1.22    1.00
    028 FILE glob / extension                          0.95    1.06    1.00
    029 FILE glob / isdirectory                        0.58    0.63    1.00
    030 FILE glob / isfile                             0.58    0.63    1.00
    031 FILE glob / mtime                              0.58    0.63    1.00
    032 FILE glob / owned                              0.58    0.63    1.00
    033 FILE glob / readable                           1.83    1.22    1.00
    034 FILE glob / rootname                           0.95    1.05    1.00
    035 FILE glob / size                               0.58    0.63    1.00
    036 FILE glob / tail                               0.92    1.07    1.00
    037 FILE glob / writable                           1.83    1.22    1.00
    038 FILE glob deep dirs (30 entries+)              0.92    1.20    1.00
    039 FILE glob deep dirs (subset of 30 entries+)    0.52    1.23    1.00
    040 FILE glob dirs (30 entries)                    0.38    0.41    1.00
    041 FILE recurse / -dir                            0.66    0.80    1.00
    042 FILE recurse / cd                              0.66    1.29    1.00
    043 FILE recurse+stat / -dir                       0.50    0.60    1.00
    044 FILE recurse+stat / cd                         0.49    0.89    1.00
    045 FILE standard directory 'package require'      1.05    2.06    1.00
    046 FILE tail absolute tmpfile (obj)               1.33    2.33    1.00
    047 FILE tail absolute tmpfile (str)               1.40    1.80    1.00
    048 FILE tail relative tmpfile (obj)               1.67    2.00    1.00
    049 FILE tail relative tmpfile (str)               1.60    2.00    1.00
    050 FILE tail ~                                    0.17    0.22    1.00
    050 BENCHMARKS                                  1:8.5a0 2:8.4.5 3:8.3.5
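To put a number on the variation between the two runs, the relative difference can be computed per benchmark. This is a minimal sketch using a handful of 8.4.5 ratios copied from the two tables above; the proc name pctDiff is hypothetical.

```tcl
# Relative difference between two measurements, as a percentage of the first.
# (pctDiff is a hypothetical helper.)
proc pctDiff {a b} {
    return [expr {100.0 * abs($b - $a) / $a}]
}

# A few 8.4.5 ratios copied from run 1 and run 2: {name run1 run2}
set rows {
    {"dirname absolute tmpfile (obj)" 2.50 2.00}
    {"exists! absolute tmpfile (str)" 9.00 8.59}
    {"tail absolute tmpfile (obj)"    1.67 2.33}
    {"glob / atime"                   0.63 0.63}
}
foreach r $rows {
    foreach {name r1 r2} $r break
    puts [format "%-32s %5.2f %5.2f %6.1f%%" $name $r1 $r2 [pctDiff $r1 $r2]]
}
```

In these samples the largest swings are in the dirname and tail benchmarks, whose raw timings in the first table are only a few microseconds, so a difference of one or two microseconds is a large fraction of the measurement.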

DGP 2004-01-22:

Working with the file.bench attached to Tcl Patch 871583, I can report that the news is not so good on Unix systems. A large commit yesterday did improve filesystem performance compared to before the patch, but comparing to 8.4 and 8.3 as above, the VFS still hurts:

 000 VERSIONS:                                   1:8.5a0 2:8.5a0 3:8.4.5 4:8.3.5
 001 FILE dirname absolute tmpfile (obj)            2.71    2.71    2.71    1.00
 002 FILE dirname absolute tmpfile (str)            3.00    3.12    3.25    1.00
 003 FILE dirname relative tmpfile (obj)            2.60    2.60    2.40    1.00
 004 FILE dirname relative tmpfile (str)            2.83    3.17    3.17    1.00
 005 FILE dirname ~                                 0.11    0.11    0.10    1.00
 006 FILE exec interp                               1.15    1.14    1.09    1.00
 007 FILE exec interp: exit                         1.15    1.14    1.10    1.00
 008 FILE exec interp: pkg require                  1.40    1.39    1.36    1.00
 009 FILE exec interp: pkg require+                 1.00    0.98    1.44    1.00
 010 FILE exec interp: pkg require+auto_path        1.15    1.52    1.11    1.00
 011 FILE exists tmpfile (obj)                      1.14    1.14    1.29    1.00
 012 FILE exists ~                                  0.04    0.05    0.05    1.00
 013 FILE exists! absolute tmpfile (obj)            0.54    0.62    0.62    1.00
 014 FILE exists! absolute tmpfile (str)            1.50    3.17    4.83    1.00
 015 FILE exists! relative tmpfile (obj)            0.83    1.33    1.33    1.00
 016 FILE exists! relative tmpfile (str)            1.62    7.62   12.00    1.00
 017 FILE exists! tmpfile (obj)                     0.75    0.88    0.88    1.00
 018 FILE exists! tmpfile (str)                     1.70    3.00    4.20    1.00
 019 FILE glob  tmpdir (30 entries) / -dir          1.04    1.13    1.19    1.00
 020 FILE glob  tmpdir (30 entries) / cd            1.16    1.44    1.57    1.00
 021 FILE glob  tmpdir (subset of 30 entries)       0.97    1.00    1.03    1.00
 022 FILE glob / all subcommands                    1.00    1.02    1.00    1.00
 023 FILE glob / atime                              2.66    2.50    2.49    1.00
 024 FILE glob / attributes                         1.12    1.11    1.11    1.00
 025 FILE glob / dirname                            0.55    2.14    2.08    1.00
 026 FILE glob / executable                         2.60    2.50    2.46    1.00
 027 FILE glob / exists                             2.63    2.53    2.48    1.00
 028 FILE glob / extension                          0.99    1.63    1.59    1.00
 029 FILE glob / isdirectory                        2.58    2.54    2.49    1.00
 030 FILE glob / isfile                             2.57    2.46    2.45    1.00
 031 FILE glob / mtime                              2.59    2.47    2.43    1.00
 032 FILE glob / owned                              2.47    2.38    2.36    1.00
 033 FILE glob / readable                           2.61    2.53    2.45    1.00
 034 FILE glob / rootname                           1.18    1.71    1.64    1.00
 035 FILE glob / size                               2.59    2.50    2.49    1.00
 036 FILE glob / tail                               0.65    2.04    1.79    1.00
 037 FILE glob / writable                           2.66    2.54    2.49    1.00
 038 FILE glob deep dirs (30 entries+)              1.12    1.20    1.34    1.00
 039 FILE glob deep dirs (subset of 30 entries+)    0.98    1.08    1.18    1.00
 040 FILE glob dirs (30 entries)                    0.75    0.77    0.82    1.00
 041 FILE recurse / -dir                            2.12    2.17    2.16    1.00
 042 FILE recurse / cd                              1.17    2.79    2.99    1.00
 043 FILE recurse+stat / -dir                       1.34    1.35    1.37    1.00
 044 FILE recurse+stat / cd                         1.05    1.65    1.74    1.00
 045 FILE standard directory 'package require'      1.42    1.33    1.60    1.00
 046 FILE tail absolute tmpfile (obj)               2.00    2.20    2.00    1.00
 047 FILE tail absolute tmpfile (str)               2.14    2.57    2.57    1.00
 048 FILE tail relative tmpfile (obj)               2.25    2.25    2.25    1.00
 049 FILE tail relative tmpfile (str)               2.80    3.20    3.40    1.00
 050 FILE tail ~                                    0.07    0.08    0.09    1.00
 050 BENCHMARKS                                  1:8.5a0 2:8.5a0 3:8.4.5 4:8.3.5

Note that benchmarks 025 and 036 do show a speed improvement even over Tcl 8.3.5. Note also, from the comment history of Tcl Patch 871583, that those are the only benchmarks that were specifically analyzed for improvements on Unix. I suspect there are further improvements to be had here. It just needs someone on Unix to put in the effort the way Vince has done for Windows.

Vince: I've attached the file.bench with a few extra tests to the patch. I agree it would be great to do more work on Unix. There are still too many things that are twice as slow. The interesting thing I noticed on Windows is that most of the time (about 60% in most benchmarks) is spent in native normalization: checking whether directories exist, then finding their unique case-sensitive name and checking whether they are links. Certainly one would hope that Unix systems with 'realpath' should be much faster (assuming the realpath implementation is any good).
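The normalization step mentioned above is exposed at script level as file normalize (Tcl 8.4 and later), so its cost can be examined directly. The sketch below is illustrative only. The path name is hypothetical, and building the path by string interpolation on every iteration is an assumption about how to stop Tcl from reusing a path representation cached on an existing value, which is roughly what the (obj) and (str) benchmark variants seem to distinguish.

```tcl
# 'file normalize' returns an absolute path with "." and ".." removed.
# The directory name below is hypothetical and need not exist.
set messy [file join [pwd] no_such_dir_xyz .. sub]
set clean [file normalize $messy]
puts "$messy -> $clean"

# Case 1: the same value is passed every time, so any path
# representation cached on it can be reused between iterations.
set p "$messy/leaf"
puts "reused value: [time {file normalize $p} 1000]"

# Case 2: the path is rebuilt from a string on each iteration
# (assumed to defeat that caching).
puts "fresh string: [time {file normalize "$messy/leaf"} 1000]"
```

Comparing the two timings on Windows and on Unix would give a rough idea of how much of a benchmark's cost is normalization rather than the filesystem call itself.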