Summary
A word is a string (of characters). In Tcl, at the script level, all values are "words". The phrase "Everything is a string" is used to refer to the implications of this fact. "Everything is a word" would probably be more precise.
See Also
Description
"Everything is a string" refers to the fact that the only "type" Tcl syntax specifies is word. Other languages have baked-in syntax for types such as number, identifier, keyword, list, dictionary, etc. Tcl only provides word, i.e., a string. The syntax specifies three kinds of substitutions that happen to word values at runtime: variable substitution, command substitution, and backslash substitution. Since variable values, procedure results, and interpreted backslash sequences can be substituted into words, all these things must be strings. This design provides a great deal of flexibility and power, and gives Tcl a unique flavour that can take some time to understand and appreciate. Those coming from other language styles might at first feel constrained by things "missing" from Tcl, but climbing the Tcl learning curve often leads to the realization that these "limitations" are actually strengths.
This page lays out the implications of the "everything is a string" paradigm, as well as the details of working with this paradigm when using the C API, where both the string representation and the structured representation of values are exposed.
Everything is a string, but not just a string. Each command is free to interpret its arguments (which are words) in arbitrary ways. Here are some examples:
- [expr]: interprets words as numbers, operators, or strings.
- [lassign], [lindex], [linsert], [lmap], [llength], [lrange], [lreplace], [lsort]: interpret some or all of their arguments as lists.
- [lappend], [lset]: interpret some of their arguments as the names of variables whose values are lists.
- [lassign]: interprets some of its arguments as the names of variables to create and assign values to.
- [eval]: interprets its argument(s) as a script.
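A minimal sketch of the point above, that the interpretation belongs to the command and not to the value. The variable names are made up for illustration, and expr is deliberately called without braces here so that the value itself is read as an expression (not something to do with untrusted input).

```tcl
# One value, three interpretations, depending on the command that receives it.
set v "1 + 2"

set asString [string length $v]   ;# string treats it as five characters
set asList   [llength $v]         ;# llength treats it as a three-element list
set asExpr   [expr $v]            ;# expr treats it as an arithmetic expression

# A value can also be handed to eval, which treats it as a script.
set script {set total 0; foreach n {10 20 30} {incr total $n}; set total}
set asScript [eval $script]

puts "as string: $asString characters"
puts "as list:   $asList elements"
puts "as expr:   $asExpr"
puts "as script: $asScript"
```

Nothing about $v changes between the calls; only the command doing the interpreting differs.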
Tcl chatroom 2013-04-30:
DGP: I find it useful to think of "types" in Tcl as being subsets of the value universe. So it doesn't make sense to ask what type a value is. Instead, you can identify those types where a value is a member, and where the value is not a member.
CMcC: Right, subsets, not partitions.
DGP: "Everything is a String" is just the trivial observation that all values are in the same value universe.
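The "subsets, not partitions" view can be tried out with [string is]. This is only a sketch: the helper name memberships is invented for this example, and it assumes Tcl 8.5 or later, where [string is list] is available.

```tcl
# Report which of a few "types" (subsets of the value universe) a value belongs to.
# memberships is a hypothetical helper, not a built-in command.
proc memberships {value} {
    set result {}
    foreach class {integer double boolean list} {
        if {[string is $class -strict $value]} {
            lappend result $class
        }
    }
    return $result
}

puts "1     -> [memberships 1]"
puts "1.5   -> [memberships 1.5]"
puts "a b   -> [memberships "a b"]"
puts "\{     -> [memberships "\{"]"
```

The value 1 is a member of all four subsets at once, while a lone open brace is a perfectly good string that belongs to none of them.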
Implementation
For performance, Tcl internally tracks how the value was most recently used, and stores the relevant internal format (often a C data structure) alongside the string representation. Tcl keeps the string representation and the internal format synchronized. Thus, e.g., a list can be modified either by changing its string representation or by using a command like lappend, which works directly with the internal format of the value. For performance, Tcl regenerates a representation only when that representation is needed and the other representation is newer.
The internal format of a value is not exposed at the script level, does not have any semantic impact on the language, and is just an implementation detail. The internal format simply has no purpose at script level. Tcl handles all the messy details of tracking and synchronizing the script-level values and the internal format(s) of those values so that the user can work in the "seamless" world of words. A user of Tcl's C API will gain an appreciation for the way Tcl values are handled at the C level, each one having both a string interface and a structured interface.
The Magic of EIAS
Every kind of data is readily accessible: When some new data type is introduced in a language like C or Java, it usually has to come with its own library for printing values, doing I/O, initialising variables, and often even for copying values. In Tcl all that is immediately available, since it can be done with strings and the new data type is represented using strings. Everything just works!
Strings are general: The standard computing models are all readily expressible in terms of strings. What is currently on the tape of a Turing machine is a finite string of symbols. Lambda calculus is manipulation of strings. Post [1] production systems model computability by replacing parts of strings with other strings.
EIAS the Misunderstood
When Java/Perl/Python/etc. programmers say everything's a string, they are not singing the praises of Tcl. Instead, they are criticizing the language because they believe that complex algorithms requiring data structures are not going to be possible in Tcl. Those who level this criticism, however, usually make the mistake of assuming that the lack of some functionality available in their language of choice is a disadvantage. Perhaps their language has made the mistake of exposing too many possibilities. Witness the problem of aliasing in C. EIAS makes aliasing a foreign concept in Tcl. The data structures they think Tcl is missing are simply expressed in another way, but that is difficult to see at the outset.
However, LV would like to point out that the true philosophy of Tcl says: do all that you can in Tcl, but then do the rest in C/assembly/whatever and create glue and handles to it for Tcl.
Note: this is unlike the vast majority of dynamic languages, where there are at least two data types that can be distinguished. Hence Tcl is the [Totally untyped language].
DKF: But also see tcl::unsupported::representation, which can peek behind this veil. If you use this, feel dirty!
AMG: In testing and debugging high-performance applications I use this to confirm that I'm avoiding shimmering.
Strings and Handles
In Tcl, there are some strings that are used as tokens that identify state or other information being kept by the interpreter. File descriptors are an example of this. The file descriptor itself is a string, just like any other Tcl value. It is used by the interpreter to access the information about some file instance. Tk window info is another example.
Subtleties
shimmering
Sarnold: There are a few (IMO) cases where this philosophy does surprise, notably:
- [list [list $x]] need not differ from [list $x], due to the string representation of lists.
- the body of a procedure cannot distinguish types of its arguments... getting type information is hard*
- [add other corner cases we can think of...]
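The first two corner cases can be reproduced in a few lines. This is a sketch with made-up variable and procedure names; sameArgs is a hypothetical helper, not a built-in.

```tcl
# Case 1: for a value with no special characters, wrapping it in a list
# (once or twice) yields the very same string.
set x abc
set once  [list $x]
set twice [list [list $x]]
puts "once=$once twice=$twice same=[expr {$once eq $twice}]"

# With whitespace in the value, the nesting becomes visible again.
set y "a b"
set onceY  [list $y]
set twiceY [list [list $y]]
puts "once=$onceY twice=$twiceY same=[expr {$onceY eq $twiceY}]"

# Case 2: a procedure body only sees values, not how the caller produced them.
proc sameArgs {a b} {
    return [expr {$a eq $b}]
}
puts [sameArgs 10 [expr {5 + 5}]]
```

The first comparison reports the two forms as identical, the second does not, and sameArgs cannot tell a literal 10 from a computed one.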
Misc
Donald Porter remarked in the Tcl chatroom: More precisely, every value has a string representation. Tcl arrays are not values; they are special types of variables.
lvirden: I guess there are other things that fit into the same category as arrays - created items like procs, and in tk all sorts of widgets, etc.
aku: But most have a way to serialize them into a value, and back (array set|get, proc|info body|arg|default)
kennykb: And the ones that don't have natural serialization generally are managing external resources (channel handles are the most obvious example)
Shin The Gin: If everything was a string, then one could easily save the whole runtime environment to a file and restore it later.
RS: likes the ditty "I'm not afraid of anything, if everything is a string". In fact, the Tcl mantra often relieves fears of complexity: anything that can be brought to the prototype "string in, string out", can be nicely done in Tcl. Arabic, Korean? Of course, everything is a Unicode string! Geographic mapping? Just give me a string with the latitudes, longitudes, and whatever other data, and presto - Tclworld. Images can in many ways be rendered as strings (XBM, PNM...); one pretty intuitive way is in strimj - string image routines.
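A small sketch of the serialization aku mentions, using only [array get], [array set], [info args] and [info body]. The array and procedure names are invented for the example, and the proc round trip is simplified: it ignores default argument values, which would need [info default] as well.

```tcl
# An array is a variable, not a value, but it can be turned into a value and back.
array set colour {red 1 green 2}
set snapshot [array get colour]     ;# a plain list of key/value pairs
array set copy $snapshot            ;# rebuild a second array from that value

# A proc can likewise be captured as a value (a script that recreates it).
proc double {n} {expr {$n * 2}}
set definition [list proc double [info args double] [info body double]]
rename double {}                    ;# delete the original command
eval $definition                    ;# recreate it from the saved value

puts "copy(red) = $copy(red)"
puts "double 21 = [double 21]"
```

Both $snapshot and $definition are ordinary strings, so they could just as well be written to a file or sent over a channel before being restored.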
Todd Coram: Data typing is an illusion. Everything is a sequence of bytes. Call 'em ints, floats, symbols, strings, whatever. Tcl exposes both code and data to the user as sequences of bytes (called strings). This is Tcl's choice of abstraction. And it's quite a powerful choice IMHO.
BR: Hm, isn't it actually the case that a string is a sequence of characters, and bytes (in Tcl) are just characters with the values 0 - 255? I think that's the model of binary data in Tcl. IOW bytes are not fundamental in Tcl, but characters and strings are.
Except that characters could be Unicode instead of ASCII.
2003-05-13: Recently, Bruce Eckel in Strong Typing vs. Strong Testing, and Robert C. Martin in Are Dynamic Languages Going to Replace Static Languages? talk about weak typing and dynamic languages.
CL thinks these two make mistakes, but hasn't time now to explain more. In any case, yes, these are good noteworthy references.
escargo 2003-05-13: Another way that everything is a string can be an issue is where a string representation can only be an approximation of what is being represented. The main instances of this that come to mind are floating point numbers (for which there are already some existing wiki pages). There may be other examples as well.
What? There is no reason a string can't fully represent a floating point number. And Kevin Kenny has a TIP in the works to ensure that Tcl does always indeed achieve exactness in this case - Roy Terry. But really, it seems a waste of time to make fine points about "everything is a string" which is merely a programmer's cliche and doesn't begin to express the power of Tcl.
escargo 2003-05-14: Sorry; there was a slip of the finger there. I said "floating point" and what I meant was "real".
Lars H 2003-05-15: Real numbers are beyond what is computable. The number of possible outputs from a Turing machine (and thus the set of real numbers which one can specify in any way whatsoever) is merely countable, whereas the set of all real numbers is uncountable. But this view does provide an answer to why "Everything is a string" is such a powerful idea. Many languages (most notably C) take the approach that "Everything is a number (with native machine representation) or some fixed aggregation of such numbers", but all such representations are limited. In order to support general strings, it is necessary to venture into some scheme of dynamic memory allocation and pointers to allocated objects. The string, on the other hand, achieves the maximal generality of a Turing machine (the tape always has an obvious representation as a string) and thus if something wouldn't be representable as a string, it wouldn't be computable either.
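On the floating point question raised earlier in this exchange, here is a small check one can run. It assumes Tcl 8.5 or later with tcl_precision left at its default of 0, in which case the string generated for a double is documented to convert back to the same double. The variable names are made up, and the string range step is only there to make sure a fresh string, with no cached numeric form, is what gets parsed.

```tcl
# Compute a double that has no short exact decimal form.
set third [expr {1.0 / 3.0}]

# Build a brand-new string from its string representation.
set text [string range "x$third" 1 end]

# Parse that string back into a double and compare with the original.
set back [expr {$text + 0.0}]

puts "string form: $text"
puts "round trip equal: [expr {$back == $third}]"
```

Under those assumptions the comparison is expected to report 1; with a non-default tcl_precision (or an older Tcl) digits may be lost and the result can differ.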
escargo 2003-05-16: What would it take to make Tk widgets serializable? I was thinking about xml2gui and wondering what it would take to make a widget produce an XML description of itself. Further, what would it take to have widgets that contain other widgets produce XML of themselves? This would seem to me to be one useful goal.
Another goal would be the converse, what XML would need to be used to create all the Tk widgets (and pack them the right way, etc.)? (This would be a suitable storage format for GUI Building Tools.)
jcw 2003-05-17: There already is a serialized form of Tk, able to cope with any complexity of widget hierarchies: the Tcl script that creates them.
jmn: Yes, but is there a canonical form for it?
escargo: I am reminded of one-way hashes. You can have a function that given an input can produce a hash value that cannot be used to derive the original input. Just because I have a widget does not make it clear to me that I can derive in an algorithmic way Tk and Tcl code to recreate the widget. Perhaps this is something for the Tk 9.0 WishList, but I would certainly like to see whatever changes would be necessary to allow this (if it's practical at all).
jcw 2003-05-17: While EIAS is indeed a wonderfully powerful and flexible abstraction, I'd like to point out that LISP'ers and Scheme'rs have a very similar set of self-contained mechanisms at their disposal, based on "everything is made up of cons cells" (it's more of a mouthful, though...). IMO, "strings" as convention to represent data in a certain way is not inherently different from other representation choices - one could even use neurons and synapses if that were practical. What EIAS does imply is "code is data" and "data can be used as code", which is why one can play so many tricks in Tcl (and in LISP).
NEM 2005-07-25: replying to this a couple of years too late... The difference with Lisp is that cons cells aren't universal; as I understand it, some basic data types like numbers are not represented as cons cells. You could build up everything from cons cells, in a similar way to building everything from set theory, but Lisp doesn't, and so you can't treat an integer as a list. In Tcl, though, the string is the universal medium of representation, so I can treat an integer as a list (of one element).
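Both remarks are easy to try at a Tcl prompt. The following is just a sketch with made-up variable names: the first half treats an integer as a one-element list, the second half builds a command as data and then runs it as code.

```tcl
# An integer is also a valid list of one element.
set n 42
set count [llength $n]
set first [lindex $n 0]
set next  [expr {[lindex $n 0] + 1}]
puts "elements=$count first=$first next=$next"

# "Code is data": build a command as a list, inspect it, then evaluate it.
set cmd [list string toupper hello]
set words [llength $cmd]
set result [eval $cmd]
puts "words=$words result=$result"
```

Building the command with [list] rather than by string concatenation keeps each word intact when the value is later evaluated.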
FW: Come to think of it, what are some other typeless languages in the "everything is a string" sense - RS has already submitted and documented thoroughly the antique TRAC in Playing TRAC.
JE: MUMPS (aka "M")[2] is another EIAS language. Forth and BCPL are also typeless, but there the fundamental type is a "cell" or native machine word instead of strings. (BCPL seems to be extinct, but Forth and MUMPS are still around.)
Is everything a list?
MS 2004-12-04: Relevant comments moved here from How Tcl is special
Everything is a string
Wrong!
Tcl has the data type, array, and in 8.5 the dictionary data type was introduced. Both of these are definitely not strings. And, what about procs and namespaces? These are not strings either. People that say everything in Tcl is a string do not know what they're talking about.
JE: Actually, dictionaries are strings, just as much as lists are. But you're right: there are plenty of important entities in Tcl that are not strings -- for example, open file channels, Tk widgets, variables (both scalar and array), encodings, and interpreters. But in Tcl, all of those things are accessed by name, and the name is a string. That's why, for example, you can't pass an array to a procedure (the array is not a string), you have to pass the array name instead (the array name is a string).
MS: Please refer to the precise statement and discussion at the top of this page.
If "everything is a string," then how can you tell what's an object?
escargo 2005-07-23: That's what I woke up to this morning. I was thinking that Tcl lacks what I have seen called a "meta-object protocol," something that allows some object-oriented languages (like Smalltalk) to do some useful operations on objects and classes. I like Snit because of what it allows me to do to compose objects using delegation. However, if I'm operating in Tcl (or in Tk) and I have an identifier, how can I tell if its value represents an object from an object system like Snit (or any of the other object systems added onto Tcl)? And if it is an object, how can I tell which object system it is an object in, so that I can guess what behavior it has (which functions it understands or implements)?
The only way I can see something like this working is if there were some agreed-upon standard for names (or references) such that a classifier (say [string is object ...]) could return a yes or no answer.
Even better would be one that could tell which object system implemented the object (say [string is objectsystem ...]).
This might be possible in a system like Jim if the Jim References encoded the object system and whether something was an object.
Even without add-on object systems, it would be nice to be able to determine if there could be [string is command ...], but that in some respects defeats the purpose of unknown. (I'm still fuzzy from sleep, so maybe there is something that does this already, otherwise how would unknown get called?)
Lars H 2005-07-24: I think the best way of pointing out how your analysis here is wrong is to point out that
- Tcl has no whattype command;
- What design error did you make that made you ask that question in the first place? Where did you (or someone else) throw away the information that you now find you need?
type: value can be partially evaluated (or partially applied), to yield a new function specialised for that interpretation of that value. This can be optimised and can enforce a type abstraction.
Lars H: Well put. The part about late "commitment" puts a name on something I think is very important in understanding the strengths of Tcl. I'll see if I can find a good place to put this idea for easy access.
DKF: Actually, in 8.6 there is tcl::unsupported::representation, which includes type cache information in its result. Don't use it for anything other than debugging. Or if you do, feel very naughty. It is very bad style to write code that depends on types (albeit inevitable for solving certain types of problem in the support of Java and JSON correctly, alas).
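For the curious, a debugging-only sketch of the command DKF mentions. It assumes Tcl 8.6 or later; the command is unsupported, so the exact wording of its output is not guaranteed and is printed here rather than parsed. The variable names are made up.

```tcl
# Build a value through a list command, so it starts life with a list
# internal representation.
set v [list a b c]
set before [tcl::unsupported::representation $v]

# Ask for the value as a string; Tcl produces the string form on demand.
set n [string length $v]
set after [tcl::unsupported::representation $v]

puts "before: $before"
puts "after:  $after"
puts "string length: $n"
```

Comparing the two printed descriptions is the kind of check AMG describes for spotting shimmering; the script-level value is "a b c" throughout.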
SYStems 2005-07-23: These are not very complete thoughts, but I think that to really answer and understand the idiom "everything is a string", we need to identify the context, or perspective.
A Tcl script is a series, a sequence of statements. Each statement receives input:
- A string.
- An event.
- produce output.
- cause a side effect.
- produce output and cause side effect
- Raise an error
[LV]: Uh - maybe that is how you _want_ it to work. But since I can say
set abc 123
then set doesn't just store another tcl command's output...
So we can say that everything inline (a Tcl script, anything that can be passed around, a Tcl script's memory, a Tcl script's internal environment) must be a string. Or in other words, we can say that Tcl introduces a new in-tcl context, where everything must have a textual representation.
Anything outside a Tcl script, outside the in-tcl context, for example, a command side effect, or an external environment, need not be a string.
NEM 2010-12-15: One aspect of EIAS that is worth consideration is how it has kept Tcl "pure" in some sense. Part of EIAS that is little mentioned is that Tcl's strings are immutable. This means that Tcl's value space is purely functional, in the Haskell sense. All side-effects are confined to the shadow world of commands and variables and other second-class entities. What this means is that Tcl now possesses some very powerful purely functional data-structures that are somewhat better than those available in other languages. For instance, I cannot think of another popular language that supplies O(1) purely functional dictionaries and lists (arrays) out of the box (or even in the library). Not to mention efficient unicode and binary strings.
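The immutability NEM describes shows up as plain value semantics at the script level. A minimal sketch, assuming Tcl 8.5 or later for [dict]; the names are invented for the example and addKey is a hypothetical helper.

```tcl
# "Modifying" a copy of a list never affects the original value.
set a {1 2 3}
set b $a
lappend b 4
puts "a = $a"
puts "b = $b"

# The same holds for dictionaries passed to procedures: the callee gets a value,
# and any change it makes is visible only through what it returns.
proc addKey {d} {
    dict set d extra yes
    return $d
}
set d1 [dict create name Tcl]
set d2 [addKey $d1]
puts "d1 has extra: [dict exists $d1 extra]"
puts "d2 has extra: [dict exists $d2 extra]"
```

Side effects are confined to variables (here b and the local d); the values themselves are never changed in place as far as a script can observe.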
