Updated 2015-09-02 17:53:17 by pooryorick

There are a lot of arguments in favor of keeping as much backward compatibility as possible, and together they have trumped a number of discussions the Tcl newsgroup has had over the years. This keeps the language stable but it also stultifies, and makes it much harder to make progress in new directions.

When this has happened in other languages, the situation was resolved with a fork of a kind - a second version standard. Modula/2, Oberon/2, and so on provided a relief valve that permitted the language to change and grow while avoiding the need to drag previous users, screaming and kicking, into an incompatible area.

This page is to discuss what we'd like to do to Tcl if backward compatibility was not a factor. The primary objective is to reduce the overall conceptual load of the language (the stuff you need to remember to code efficiently) while increasing its overall power. The bottom line goal: to make Tcl the most powerful and simplest language it can be. I'd like to focus on several areas:

  • Regularizing the grammar (e.g., "list length" and "list create" instead of "llength" and "list").
  • Regularizing object concepts: cfile rather than "file", to make the usage of object-like commands match Tk.
  • Removing hideous warts: the :: is the worst offender in this respect although . from Tk is another example -- since Tcl already has the basic idea of a list built-in, the idea of listing a series of nesting things in order from some common ancestor (global namespace, top window, whatever) should use that same syntax rather than inventing another. Rather than set ::a::b::c foo, set {a b c} foo. Rather than .top.menu.file, it should be { top menu file }. It has been my experience that this item alone confuses more new users than any of the basic Tcl concepts. ::a::b::c looks like C++, but .a.b, which also looks like C++ or C has a completely different meaning. The need to parse and maintain these constructs also adds needless bloat to the language.
  • Conform arrays and dicts: the distinction between arrays and dicts makes sense to programmers who have used Tcl for ages, but it is a landmine for new users. Ideally, these should not be separate, since they are the same concept in two different syntaxes. It might be argued that "dict" is the right way to go and that "array" as such should simply be dropped. The reason for this will become obvious in the "expr" section following.
  • Simplify and generalize: right now we have a complex primitive proc in the language that is really not necessary. Rather than:
 proc foo { arg1 arg2 args } {
   puts $arg1
   fiddle-with $arg2
   something-else-with $args
 }

...where the proc itself is required to parse the args, simply allow vars to contain procedures:
 set foo {
   init arg1 arg2
   puts $arg1
   fiddle-with $arg2
   something-else-with $args
 }

"init" is a built-in providing parsing. With it, every proc has keyword parsing for its argument lists:
  foo arg1=this arg2=that and the other thing

...where "and the other thing" becomes "args" and the others are set as expected.
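The keyword parsing described here can be approximated in today's Tcl. What follows is only a rough sketch, assuming Tcl 8.5 or later: the two-argument form of init (a list of names plus the raw words) is invented for illustration and is not the built-in proposed above, and a keyword that is never supplied is simply left unset.

```tcl
# Hypothetical helper: split "name=value" words from the remaining words.
# Recognised names become variables in the caller; everything else
# is collected into the caller's "args".
proc init {names argv} {
    set rest {}
    foreach word $argv {
        if {[regexp {^([^=]+)=(.*)$} $word -> key value] && $key in $names} {
            uplevel 1 [list set $key $value]
        } else {
            lappend rest $word
        }
    }
    uplevel 1 [list set args $rest]
    return
}

# An ordinary proc stands in for the proposed "variable containing a procedure".
proc foo {args} {
    init {arg1 arg2} $args
    return [list $arg1 $arg2 $args]
}

puts [foo arg1=this arg2=that and the other thing]
```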

This makes object-orientation like TOOT "native".

  • expr is a pain in the butt. It was painful enough in its first incarnation as an expression evaluator since it is way verbose and it and its associated [ ] obfuscate code. However, since bytecode compiling was added to the language, the need for yet another layer of { } not only obfuscates further, it introduces a new landmine for newcomers: a { } expression that manifestly does do $-replacement! Furthermore, whenever expr does encounter an unknown identifier, it doesn't need the $ to know it needs to look it up. Similarly, the fact that an identifier is suffixed with "(" should be clue enough for expr to look up a tcl proc if the identifier is not in its internal function table. If we get rid of arrays, it becomes simple to use ( ) to delineate things to be evaluated with expr. Thus, something like: "string range $foo [expr $index-1] [expr $index+1]" becomes just "string range $foo (index-1) (index+1)" - much easier to read. I would also remove floating point and transcendental functions from expr. All it really needs to do is index computations, and integer or double ints should be plenty. More powerful functions that deal with floating point or other higher math requirements would move to their own procs.
 [expr sin(x)], [mpexpr 2534598746534**872874562487], [complex 4+2i + -8-3i ] etc.

  • Patterns are a similar pain - the overloaded use of both ( ) and [ ] make it very difficult to write patterns properly and to read them once you have written them. Ideally some method of delineating patterns - perhaps < > - and the use of @ rather than $ (or some similar choice) for eol, would enable us to embed patterns much more gracefully than we now do.
  • Macros. It should be possible to define a "macro" - similar to a "proc" - that gets a shot at a command before it is byte-compiled. These would provide ways to, for example, allow a command kicked off to "unknown" to be resolved and byte-compiled without further performance penalty.
  • Make the language friendlier. The "set" command is a particular offender here, it is more verbose and different from the "normal" way of doing assignments without offering any great improvement in utility. I offer two possible improvements here: let or ->. -> works just like ";" except it notifies the parser that the value of the command it just evaluated should be copied into the variable name that follows. Thus, [string range $foo (i-1) (i+1)] -> substring.
  • And add real comments - I'm tired of the old "why can't I have unbalanced [ ] in comments?" questions. It really isn't that hard to have the parser eliminate "//.*$".
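As a rough illustration of the first item in the list above ("list length" and "list create"), a regularized command set can already be prototyped with a namespace ensemble. This is a sketch only, assuming Tcl 8.5 or later; the name lst is made up so that the existing list command stays untouched.

```tcl
# Hypothetical ensemble "lst": one command, regular subcommands.
namespace eval ::lst {
    proc create {args}     { return $args }
    proc length {l}        { return [llength $l] }
    proc index  {l args}   { return [lindex $l {*}$args] }
    namespace export create length index
    namespace ensemble create
}

set items [lst create alpha beta gamma]
puts [lst length $items]    ;# prints 3
puts [lst index $items 1]   ;# prints beta
```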

I'm sure more ideas in this same vein have occurred to others. I'd like to explore these ideas with an eye toward producing a successor to Tcl that would vastly increase its power, readability, friendliness, and utility.

What are your ideas?

XX From the Marketing Division of the TCL Branch of the Sirius Cybernetics Corporation:

The case for both sides is so compelling that the *ideal* answer for TCL/2 or TCL9 or whatever is to grab the best of both worlds and make it a feature of the language. Some rough possible ideas from the perspective of good marketing possibilities follow. Keep in mind that while I've tried to leave out the outlandish, I do not intend to make any suggestions on the feasibility of implementation or even technological superiority (yet, anyway). Just trying to get the ball rolling...

So if I had to "sell" TCL/2 or TCL9, I'd want to be able to say that the next generation of TCL is unconstrained by previous code and/or design decisions (mistakes?) yet compatibility is close to 100%. How can this be accomplished, bearing in mind some of the strengths of TCL to begin with:

1. Compatibility engines as loadable extensions - the idea being that a programmer can add one line to an old program and have it work with TCL/2 or TCL9. Of course, this compatibility can only be guaranteed for the core and for TCL code and not binary extensions. "Emulation libraries" could be provided that would allow TCL9 to fall back to the language implementation represented by that particular library.

One attractive feature about this model is that there is no core bloat for compatibility. One downside is that (presumably) binary extensions would need to be distributed with the core. Key versions would need support in the core. What would be really neat would be a "generator" that could take an existing TCL or TK library and synthesize a compatibility module "on-the-fly" (using Critcl-type technology, perhaps). Then we could say: "TCL9 supports itself + any version of TCL and TK you have installed on your system or deployment"

2. Compatibility "switch" as part of invoking syntax for an interp or new command. The difference here is that the functionality of previous versions would be "encapsulated" in the core. One could activate such support by invoking a command during execution. This might bloat the core sizewise (though I have NO idea how much) but shouldn't cost much performancewise, and should cost nothing if not invoked.

I encourage those that know the mechanics of these things to show me up and come up with better implementation details. My point is that while there are downsides to both breaking old code *and* keeping compatibility syntax, TCL should innovate and do neither. Make any syntax changes necessary in the new language definition, and offer optional compatibility to older software and site users.

PWQ 6 Jan 05, I too wonder why the TCT have never discussed adding compatibility options to the interp. Most other languages have this feature. It would not be much of an ask to type:
        tclsh -compat 8.0 my_old_script

LV I don't recall anyone ever submitting a TIP for such a feature, with the intent of implementing it, to the TCT for discussion.

It would as always require changes to the core, including internal dispatching through the stubs table, but it allows changes to proceed unimpeded.

RS I think it would be more robust to specify such dependencies in the script itself:
 package require -exact Tcl 8.0
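For comparison, a script can also test a version range rather than one exact release. This is a small sketch using package vsatisfies; the helper name runs_on and the 8.5 requirement are picked purely for illustration.

```tcl
# Hypothetical helper: does an interpreter version satisfy a requirement?
# A requirement such as 8.5 means "8.5 or later, but the same major version".
proc runs_on {version requirement} {
    return [package vsatisfies $version $requirement]
}

puts [runs_on 8.6.13 8.5]   ;# prints 1: later release, same major version
puts [runs_on 8.0 8.5]      ;# prints 0: too old
puts [runs_on 9.0 8.5]      ;# prints 0: different major version

# Checking the interpreter that is actually running the script:
puts [runs_on [info patchlevel] 8.5]
```

The last two cases are exactly where a compatibility switch or emulation library, as discussed above, would have to step in.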

DKF: It's been discussed, though perhaps not recently. DGP is the person to chat to about this; he has a pretty deep understanding of versioning complexities.

One other quick suggestion would be the deployment with the new core of a sort of Upgrade Assistant for old code. Since things like this *never* work 100% and very often not without some tweaking of the new code, calling it a translator would be foolish. The "Assistant" would take an older TCL script, and make several levels of suggestions for code changes:

1. TCL9 Syntax Compliance
2. Obsolete command replacement
3. TCL9 Technology advances

Thus, the programmer can obtain suggestions for at the very least running his script in a TCL9 interpreter without compatibility options. More experimental suggestions could also be made, encouraging the programmer to use new TCL9 commands and constructs. Either way, a potential win for both the developer and TCL9.

This page is to discuss what we'd like to do to Tcl if full backward compatibility was not a factor.

XX I wonder, then, if I should move my suggestions above to another page, possibly the TCL9 wishlist. I entered them here because it was the original comments about the situation being resolved with a "fork of a kind." I agree with the principle of keeping the best, fixing the worst, and refining the implementation of the language into an elegant implementation of the "TCL idea" without being bound by design decisions of the previous generation. I view my suggestions as a way to make possible what is actually being discussed on this page without compromising the smooth transition path and reliable operation that *needs* to be in effect. So in my opinion, it is relevant to this page, though I am probably biased so I'll leave this bit of housekeeping to the judgement of others.

DKF: I find it fascinating that someone was asking for a regularized grammar in one breath, and asking for an infix assignment operator in the next. Fascinating, but not compatible.

Larry Smith: -> is not an infix operator, it is a command separator - exactly like ";" as far as the parser goes. When it is encountered, the interpreter does its "thing" for that command the way it would for ";" but when it is done it scans past the -> to find a (or several?) variable names and assigns the result it is holding in hand to it (them).

Going over some of the points in more detail:

  • "lindex" vs "list index": I agree and wish it could be fixed without massive incompatibilities. :^(
  • cfile: Are you aware of TIP#208? Larry Smith I am now. I like it.
  • The widget pathname "wart": This wart is useful to me ;^) so I expect it to be useful to many others too. Namespace paths are more warty though.
  • Arrays vs dicts: I know, and I'd love to do something on this. No guarantee for 8.5 though.
  • Expr and REs: You're way wrong here. These are embedded little languages, and hence building on one of Tcl's key strengths. And $ as EOL-constraint is holy for REs.

Larry Smith I understand the concept of embedded little languages, it is the means for accessing the one for computation that is irritating. It is far too common to make people do it the way it is done now. And as for the holy $-sign - "$" means "substitute". Using it to also mean "eol" - even in a little embedded language - makes it a wart. A painful wart.

DKF: Yeah, but when every other RE engine uses $ to mean EOL, it's time to not be too precious.

  • Macros: You can do it now. It's a touch complex, but the code to do it (written by SS) is on the wiki somewhere.
  • Infix -> for assignment: How is the Tcl interpreter to know that a particular -> argument is an assignment operator and not some random string? Larry Smith The same way it knows ";" is a command terminator. (And let is a very complex command indeed; it does all sorts of things in different circumstances. Larry Smith let's exact syntax is open for discussion - yes, the example implementation is complex, but can still compile down to bytecodes. If any of the above ideas is accepted then let can become simpler - using ( ) for expressions means let really doesn't need the "arithmetic assign", and the "evaluation assign" := is a wart of a sort (cohort, cohort =) all by itself. The basic objective here is to make assignments more concise.)
  • "Real" comments: This can be done right now using a customized source command, and the problem with } characters is simply because they are in a quoted context. You might as well ask why you can't have comments in C strings...

Peter Newman 6 January 2005: Interesting DKF. I also take the view that for Tcl to move forward, we've got to start again. My basic view on how Tcl should be structured is on If I were to complain - and your response to that was bang on the money. The interconnections are the problem.

But they're not a problem if we start from the assumption that backwards compatibility is not required. Take set, which you complained about above. It's surely a good example of (the evil) interconnectivity. Because one can set global variables, local variables, lists, arrays and god knows what else. So obviously set has to know about all these different data types.

But in my vision, every data type, and every command (or set of commands) should be a completely separate, stand-alone entity - which the script level programmer can include or exclude from their particular version of the language, as they see fit.

But if we throw away the requirement for backward compatibility - we can easily eliminate set's interconnection problems.

Taking a hint from Tk:-
 frame .myFrame
 .myFrame configure
 .myFrame cget
 etc etc

we could change set to a method. In other words:-
 Global myGlobalVarName
 Local  myLocalVarName
 List   myListVarName
 Array  myArrayVarName

(to define the variables required) - and then:-
 myGlobalVarName set   ;# To get the variable's existing value
 myLocalVarName  set
 myListVarName   set
 myArrayVarName  set

 myGlobalVarName set newValue       ;# To set the new value
 myLocalVarName  set newValue
 myListVarName   set newValue
 myArrayVarName  set newValue

In other words, we treat data types exactly the same as Tk treats widgets. And set becomes just a method supported by any data type that chooses to support it. There's no more interconnectivity. It's also a technique that Tcl/Tk programmers are familiar with (not to forget those familiar with the more modern object-oriented languages). And it hopefully eliminates the clumsiness of set, that you mentioned above.

Another great thing about it, is that we could easily implement the syntax - and thus the new Global, Local, List and Array etc data types that might appear in our new Tcl/2 - using current high-level Tcl.

Let's code it up in high-level Tcl (to test and refine the syntax). And make each data type a separate stand-alone package. And IMHO, we could very quickly and easily sort out lists, strings, files, and the other basic core components of Tcl, in similar fashion.
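A first cut at such a prototype might look like the sketch below. It is not a finished design: only the Global constructor and its set method are shown, and the ::varobj namespace and dispatch proc are names invented here for illustration.

```tcl
# Private storage, so the prototype cannot clash with real global variables.
namespace eval ::varobj {
    variable store
    array set store {}
}

# Hypothetical constructor: creates a command named after the variable.
proc Global {name} {
    set ::varobj::store($name) {}
    interp alias {} ::$name {} ::varobj::dispatch $name
    return $name
}

# Method dispatcher shared by every variable created with Global.
proc ::varobj::dispatch {name method args} {
    variable store
    switch -- $method {
        set {
            if {[llength $args] > 1} {
                error "wrong # args: should be \"$name set ?newValue?\""
            }
            if {[llength $args] == 1} {
                set store($name) [lindex $args 0]
            }
            return $store($name)
        }
        default {
            error "unknown method \"$method\": must be set"
        }
    }
}

Global myGlobalVarName
myGlobalVarName set newValue
puts [myGlobalVarName set]    ;# prints newValue
```

Local, List and Array would each need their own storage rules; the dispatcher pattern stays the same.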

Then it's just a case of:- 1) Sorting out the interpreter; 2) Re-coding the above in C, and then; 3) Ditching everything else (the traditional Tcl bits we don't want, and don't convert over). Then we have Tcl/2. I like the name too. Cheers.

Note also that the syntax suggested above is compatible with the current Tcl syntax. More specifically, both could exist in the same (existing) interpreter with no conflict, making it easier for people to migrate from Tcl to Tcl/2.

Larry Smith It took me a bit to understand the direction this idea was proposing, but I think I like it. It does, in fact, regularize the language, which I think is important here. See the 3rd paragraph above for more on why I am pushing this.

RS: Note however that global or local variables can be arrays or lists, and lists can be reinterpreted as strings and vice versa... My hopes for a newer Tcl would be that it's more systematic, and more powerful, but not more complicated than what we have now :)

Larry Smith And that is precisely the objective. And I'm prepared to turn any sacred cows (even my own) to ground chuck to accomplish it. ;)

DKF: Oh well, be sure to let us know once you've got the first prototype going...

Zarutian 06. jan 2005: I like your idea, Peter, regarding treating variables as objects/procs as it eliminates the need for the "trace * variable" command. And arrays can be easily dropped by using dictionaries instead. But what about using the Lisp way of doing arithmetic, that would make implementations of bignum, matrices and other such easier, no?

Larry Smith It is certainly more consistent, the problem is it is too alien to new users, especially those without prior programming experience. I would personally find rpn to be a better choice, too, but I rejected it out of hand for the same reason. We are, more or less, stuck with "normal"-looking arithmetic expressions. We must make the best of it without letting the seams show too much.

Zarutian 06. jan 2005: hmm, what about providing both expr and Polish notation arithmetic? Implementing the expr in the Polish notation one would solve a problem I currently don't have any name for.

Lars H: For Polish notation arithmetic, see TIP #174 [1].
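For readers on Tcl 8.5 or later, prefix arithmetic can be tried out with the operator commands in the ::tcl::mathop namespace. A short sketch, reusing the string range example from the top of the page (the sample string and index are made up):

```tcl
# Make the operator commands (+, -, *, ...) visible without qualification.
namespace path {::tcl::mathop}

set foo "abcdefgh"
set index 4

# Same effect as: string range $foo [expr {$index-1}] [expr {$index+1}]
puts [string range $foo [- $index 1] [+ $index 1]]   ;# prints def

# Nesting reads like Lisp.
puts [+ 1 [* 2 3]]                                    ;# prints 7
```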

Peter Newman 8 January 2005: See Unified Programming Language.

escargo 4 Jan 2006 - See Parsing Polish notation.

LES: Eh, my opinion shouldn't matter since I am not a professional programmer, but I still feel compelled to butt in because the whole discussion strikes me as quite an odd one. It all sounds to me like words coming from the mouth of someone who dislikes Tcl and is forced to work with it for some reason. I started programming with PHP and Perl. I didn't like the set command when I first saw it, but that was because I didn't understand it. Now I do and I really like it. Do you want to assign values with = or ->? Why, I have it right here, available every day with unknown. All my programs source a Tcl customization file (with unknown and many procs), including Tkcon. I can use that hack at any time I use Tcl. But I don't really use that kind of assignment instead of set that much. I've learned to appreciate set. And if you hate expr, why, I also have plenty of interesting expr tweaks in my custom Tcl file. Replace expr with (expressions in parentheses) altogether? Eh! I do that every day. Tcl is so generous and lets us do all that and lots more.

Besides, IMVHO, complaints about syntax are very common, natural and acceptable from people who come from other languages. But if you really use Tcl often and get used to it, like I have, geez, changes are annoyances, not relief. Maybe I am blind and missing something very big here, but I find it hard to avoid the impression of a bunch of people trying to impose on everyone specific styles or preferences that Tcl already allows (with customizations), but these people don't find them good enough simply because the way they are they can't be called "a standard" and imposed onto all and sundry.
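For anyone curious what such an unknown customization might look like, here is a minimal sketch of "name = value" assignment. It is not LES's actual file; it assumes a stock tclsh in which init.tcl has already defined unknown, and _original_unknown is a name chosen here for illustration.

```tcl
# Keep the stock handler so auto-loading and everything else still works.
rename unknown _original_unknown

proc unknown {args} {
    # Recognise exactly three words of the form: <name> = <value>
    if {[llength $args] == 3 && [lindex $args 1] eq "="} {
        lassign $args name _ value
        return [uplevel 1 [list set $name $value]]
    }
    # Anything else is passed on unchanged, in the caller's context.
    uplevel 1 [list _original_unknown {*}$args]
}

total = 42
puts $total    ;# prints 42
```

Every such assignment takes the slow path through unknown, and it only works while no command happens to share the variable's name.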

Larry Smith I have used Tcl for years, and I continue to use it, warts and all, because it is a very, very useful tool. So useful, in fact, that I'd like to see it adopted by others - lots of others. There are probably a lot of ways to do that, but my point of view is to try to regularize the language and to deal with the issues most likely to trip up others coming to the language.

Using unknown to add features works very well, but it slows the interpreter down because it cannot be byte-compiled.

I see Tcl as very close to perfection, just a few steps away from being a tool for both novice and professional, as powerful as Lisp. Ousterhout did not have such a thing in mind when he coded up his scripting language, but this is where we ended up. Now, I can deal with set rather than = or ->, but I think it is inelegant and verbose. Expr also adds verbiage. Many existing constructs show up over and over again in the newsgroup: Why is the interpreter barfing on 08? Why can't I pass an array to a proc? Why do we have both dicts and arrays?

I frankly think we really have only two fundamental problems here: firstly, we have worshipped too often at the shibboleth of backward compatibility, and secondly, we are so used to our cherry kool-aid we won't even try the raspberry. Yes, it is quite possible to internalize all the warts and write huge amounts of code with less effort - and I do. But that doesn't blind me to the fact that the warts are still there.

This is certainly not a case of "words coming from the mouth of someone who dislikes Tcl" - quite the contrary. If I didn't like Tcl I wouldn't bother.

ds: my thoughts about "Grand Unified TCL."

I'd focus on regularizing the language, and eliminating special cases. "::" is particularly warty. All paths (namespace, filesystem, etc) should be lists rather than the assorted hodge podge that exists now. This would simplify path manipulation code as well as having other benefits. In particular, things can be made much more "lisp like" without losing that which makes TCL useful. (everything is a string.)

Upgrading set and unset would be easy, the first argument is no longer a namespace path, instead it is a "var path". A var path is the TCL list formatted equivalent to what is currently a namespace path, eg "::foo::bar" becomes {foo bar}. This means that the first argument to set must always be a list. To set the empty var you would use:
        % set {{}} "empty"

instead of:
        % set {} "empty"
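To experiment with list-style var paths in current Tcl, the path can simply be translated back into a namespace-qualified name. The sketch below does that; pathname and pathset are hypothetical helper names, and the empty-named variable case shown above is deliberately not handled.

```tcl
# Hypothetical helper: {foo bar} -> ::foo::bar
proc pathname {path} {
    return ::[join $path ::]
}

# Hypothetical path-aware set: creates enclosing namespaces on demand,
# then reads or writes the variable exactly as set would.
proc pathset {path args} {
    set ns [pathname [lrange $path 0 end-1]]
    namespace eval $ns {}
    return [set [pathname $path] {*}$args]
}

pathset {a b c} foo
puts [pathset {a b c}]    ;# prints foo
puts $::a::b::c           ;# prints foo
```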

To make the new namespace scheme work fully, "$" magic would have to be reworked significantly to allow list formatted paths. I'd favour:
        "${?elem? ?elem...?}" - path reference without substitution. equivalent to "set {elems...}"
        "$(?elem? ?elem...?)" - substitutes path elements. equivalent to "set [list elems...]"

I'm also in favour of:
        "(?elem? ?elem...?)" - as a shortcut for what is currently [list]

The "unquoted" form of "$", eg, "$name" would have changed behaviour in a GUT. "name" must not start with "{" or "(" or it will be considered a path reference, and the form can only refer to a var path consisting of a single element. The current behaviour for a trailing "(subscript)" would be removed as a Grand Unified TCL does not have an equivalent to array variables with their special syntax, leaving "(" and ")" free to have the "list substitute" meaning.

Being able to write "($foo bar)" instead of "[list $foo bar]" earns us some Lisp brownie points and makes things more readable. This approach leaves list free for an orthogonal list manipulation ensemble! :^)

        % set baz "world"
        % eval [list some_func $a b [list hello $baz]]

        % set baz "world"
        % eval (some_func $a b (hello $baz))

The meaning of "(" and ")" would also correlate nicely with the meaning of "$(...)". In both cases the "(" and ")" mean "substitute as a list".

The "everything is a string" mantra should be taken to its logical conclusion: procs and namespaces having a string representation is a Good Thing. If done correctly we could do away with namespace, array and dict in one go and replace them with a general purpose map/tree structure that has a string representation.

This could be achieved by extending eval to accept a var path as its first argument. Equivalent functionality to the current eval would be accessed via "eval {} ?code...?" rather than "eval ?code...?". The var path {} means "the current call frame." This makes the namespace command entirely redundant. Namespaces would be ordinary vars that you can "unset."

currently we have:
        % namespace eval foo {
                proc bar {x y} {return "$x$y"}
        }
        % puts $foo
        ... no such variable, etc

vs. the Grand Unified TCL way:
        % eval foo {
                proc bar {x y} {return "$x$y"}
                set x 0
                incr x
                eval qux {
                        set y 5
                        set z 2
                }
        }
        % chan puts $foo
        bar {{x y} {return "$x$y"}} x 1 qux {y 5 z 2}

dict becomes somewhat superfluous because you can do:
        % chan puts ${foo qux y}
        % set x bar
        % chan puts $(foo $x)
        {x y} {return "$x$y"}

        % for each {key value} ${foo qux} {puts ($key $value)}
        y 5
        z 2
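The closest thing in today's Tcl (8.5 or later) is a nested dict, which already has the string representation shown above for $foo. A small sketch with the same data, minus the proc entry:

```tcl
# Build the tree by hand instead of via the proposed "eval foo {...}".
set foo [dict create x 1 qux [dict create y 5 z 2]]

# Counterpart of ${foo qux y}
puts [dict get $foo qux y]          ;# prints 5

# Counterpart of iterating over ${foo qux}
dict for {key value} [dict get $foo qux] {
    puts "$key $value"
}

# The whole tree is just a string.
puts $foo                           ;# prints x 1 qux {y 5 z 2}
```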

a shortened form of proc (with the name omitted) would be useful, it would simply return a proc list representation:
        % set (foo $x) [proc {x y} {return "$x . $y"}]

which is ultimately just sugar for:
        % set (foo $x) {{x y} {return "$x . $y"}}
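Tcl 8.5's apply command already treats a two-element list of arguments and body as a callable value, which comes close to this idea. A brief sketch, with a dict standing in for the proposed var path (the table variable is made up for illustration):

```tcl
# Store the "proc list representation" as plain data.
set table [dict create]
dict set table foo bar {{x y} {return "$x . $y"}}

# Call it later without ever creating a named command.
puts [apply [dict get $table foo bar] hello world]   ;# prints hello . world
```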

This earns us extra lisp-points.

For the lunatic fringe, local variables can be used for procs and namespaces:
        % proc foo {} {
                eval foo {
                        proc bar {} {return "yikes"}
                }
                return [{foo bar}]
        }
        % chan puts [foo]

The empty {} var path would refer to the current call frame (which is just a normal var, albeit with no name) and like all vars has a string representation. This is a little like "this" from Javascript:
        % chan puts ${}
        {} {empty} baz world x bar foo {bar {{x y} {return "$x . $y"}} x 1 qux {y 5 z 2}} key y value 5

This would allow for all sorts of upvar abuse:
        % proc foo {} {
                upvar {} caller
                set {caller qux} "howdy"
        }
        % foo
        % chan puts $qux

        % proc fanout {vars value} {
                upvar {} caller
                for each var $vars {
                        set (caller $var) $value
                }
        }
        % fanout {x y z} "42!"
        % chan puts $x
        % chan puts $z

I agree that functionality equivalent to trace would be difficult to implement. If someone feels the need to do so, it should be named "come from" in honour of Intercal >:)