[Discuss here why this philosophy makes so many things easy]
- Every kind of data is readily accessible
- When some new data type is introduced in a language like C or Java, it usually has to come with its own library for printing values, doing I/O, initialising variables, and often even for copying values. In Tcl all that is immediately available, since it can be done with strings and the new data type is represented using strings. Everything just works!
- Strings are general
- The standard computing models are all readily expressible in terms of strings. What is currently on the tape of a Turing machine is a finite string of symbols. Lambda calculus is manipulation of strings. Post [1] production systems model computability by replacing parts of strings with other strings.
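As a rough illustration of the "everything just works" point, here is a minimal sketch of a made-up "point" type. The names `point` and `point_add` are hypothetical, and the sketch assumes Tcl 8.5 or later (for `lassign`). Because a point is just a two-element list, and hence a string, printing, copying and literal initialisation need no supporting library.

```tcl
# A hypothetical "point" type, represented as the two-element list "x y".
proc point {x y} {
    return [list $x $y]
}

# Add two points component-wise; any string of the form "x y" is accepted.
proc point_add {p q} {
    lassign $p px py
    lassign $q qx qy
    return [point [expr {$px + $qx}] [expr {$py + $qy}]]
}

set p [point 1 2]
set q $p                      ;# copying is plain assignment
set r [point_add $p {10 20}]  ;# a literal string is also a valid point

# Printing needs nothing beyond ordinary string substitution.
puts "p = $p, q = $q, r = $r"
```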
[Discuss here why this philosophy sounds like a curse word in many programmers' mouths]
When Java/Perl/Python/etc. programmers say "everything's a string", they are not singing the praises of Tcl. Instead, they are criticizing the language because they believe that complex algorithms requiring data structures are not going to be possible in Tcl.
However, LV would like to point out that the true philosophy of Tcl says: do all that you can in Tcl, but then do the rest in C/assembly/whatever and create glue and handles to it for Tcl.
[Discuss here why this philosophy makes so many things hard - shimmering, etc.]
Sarnold - There are a few (IMO) cases where this philosophy does surprise, notably:
- [list [list $x]] need not differ from [list $x], due to the string representation of lists.
- the body of a procedure cannot distinguish types of its arguments... getting type information is hard*
- [add other corner cases we can think of...]
- [DKF: But also see tcl::unsupported::representation, which can peek behind this veil. If you use this, feel dirty!]
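The first two corner cases listed above can be tried directly. This is only a sketch: the helper name `describe` is made up for illustration, and `string is list` assumes Tcl 8.5 or later. It shows that nesting `list` is invisible for a simple word, and that a procedure body can only ask what its argument could be interpreted as, not what it "is".

```tcl
# Nesting [list] makes no visible difference for a simple word...
set x a
set same [expr {[list [list $x]] eq [list $x]}]

# ...but it does once the value contains whitespace.
set y "a b"
set alsoSame [expr {[list [list $y]] eq [list $y]}]

puts "simple word: $same, word with a space: $alsoSame"

# A proc cannot tell how its argument was produced; it can only test
# which interpretations the string admits.
proc describe {value} {
    set kinds {}
    if {[string is integer -strict $value]} { lappend kinds integer }
    if {[string is double -strict $value]}  { lappend kinds double }
    if {[string is list $value]}            { lappend kinds list }
    return $kinds
}

puts [describe 42]
puts [describe [expr {40 + 2}]]
puts [describe "an atom"]
```

A literal `42` and a computed `42` are described identically, since both are the same string.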
RS likes the ditty "I'm not afraid of anything, if everything is a string". In fact, the Tcl mantra often relieves fears of complexity: anything that can be brought to the prototype "string in, string out", can be nicely done in Tcl. Arabic, Korean? Of course, everything is a Unicode string! Geographic mapping? Just give me a string with the latitudes, longitudes, and whatever other data, and presto - Tclworld. Images can in many ways be rendered as strings (XBM, PNM...); one pretty intuitive way is in strimj - string image routines.
Data typing is an illusion. Everything is a sequence of bytes. Call 'em ints, floats, symbols, strings, whatever. Tcl exposes both code and data to the user as sequences of bytes (called strings). This is Tcl's choice of abstraction. And it's quite a powerful choice IMHO. -- Todd Coram
Hm, isn't it actually the case that a string is a sequence of characters, and bytes (in Tcl) are just characters with the values 0 - 255? I think that's the model of binary data in Tcl. IOW bytes are not fundamental in Tcl, but characters and strings are. -- BR
Except that characters could be Unicode instead of ASCII.
13-May-2003: Recently, Bruce Eckel in [2] and Robert C. Martin in [3] talk about weak typing and dynamic languages. CL thinks these two make mistakes, but hasn't time now to explain more. In any case, yes, these are good noteworthy references.
escargo 13 May 2003 - Another way that everything is a string can be an issue is where a string representation can only be an approximation of what is being represented. The main instances of this that come to mind are floating point numbers (for which there are already some existing wiki pages). There may be other examples as well.
What? There is no reason a string can't fully represent a floating point number. And Kevin Kenny has a TIP in the works to ensure that Tcl does always indeed achieve exactness in this case - Roy Terry, But really, it seems a waste of time to make fine points about "everything is a string" which is merely a programmer's cliche and doesn't begin to express the power of Tcl.
escargo 14 May 2003 - Sorry; there was a slip of the finger there. I said "floating point" and what I meant was "real".
Lars H (15 May 2003): Real numbers are beyond what is computable. The number of possible outputs from a Turing machine (and thus the set of real numbers which one can specify in any way whatsoever) is merely countable, whereas the set of all real numbers is uncountable. But this view does provide an answer to why "Everything is a string" is such a powerful idea. Many languages (most notably C) take the approach that "Everything is a number (with native machine representation) or some fixed aggregation of such numbers", but all such representations are limited. In order to support general strings, it is necessary to venture into some scheme of dynamic memory allocation and pointers to allocated objects. The string, on the other hand, achieves the maximal generality of a Turing machine (the tape always has an obvious representation as a string) and thus if something wouldn't be representable as a string, it wouldn't be computable either.
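The claim that a string can fully represent a floating point number can be checked with a short round trip. This sketch assumes Tcl 8.5 or later with the default `tcl_precision` of 0, where the string form of a double is intended to convert back to exactly the same value. Older versions with a fixed precision may behave differently.

```tcl
# Compute a double that has no short exact decimal form.
set x [expr {1.0 / 3.0}]

# Force a pure string copy of the value (no cached double attached).
set s [format %s $x]

# Parse the string back as a number and compare with the original.
set roundtrip [expr {$s == $x}]

puts "string form: $s"
puts "round trip preserved the value: $roundtrip"
```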
escargo 16 May 2003 - What would it take to make Tk widgets serializable? I was thinking about xml2gui and wondering what it would take to make a widget produce an XML description of itself. Further, what would it take to have widgets that contain other widgets produce XML of themselves? This would seem to me to be one useful goal.
Another goal would be the converse, what XML would need to be used to create all the Tk widgets (and pack them the right way, etc.)? (This would be a suitable storage format for GUI Building Tools.)
17may03 jcw - There already is a serialized form of Tk, able to cope with any complexity of widget hierarchies: the Tcl script that creates them.
jmn - Yes, but is there a canonical form for it?
escargo - I am reminded of one-way hashes. You can have a function that given an input can produce a hash value that cannot be used to derive the original input. Just because I have a widget does not make it clear to me that I can derive in an algorithmic way Tk and Tcl code to recreate the widget. Perhaps this is something for the Tk 9.0 WishList, but I would certainly like to see whatever changes would be necessary to allow this (if it's practical at all).
17may03 jcw - While EIAS is indeed a wonderfully powerful and flexible abstraction, I'd like to point out that LISP'ers and Scheme'rs have a very similar set of self-contained mechanisms at their disposal, based on "everything is made up of cons cells" (it's more of a mouthful, though...). IMO, "strings" as convention to represent data in a certain way is not inherently different from other representation choices - one could even use neurons and synapses if that were practical. What EIAS does imply is "code is data" and "data can be used as code", which is why one can play so many tricks in Tcl (and in LISP).
25July05 NEM - replying to this a couple of years too late... The difference with Lisp is that cons cells aren't universal; as I understand it, some basic data types like numbers are not represented as cons cells. You could build up everything from cons cells, in a similar way to building everything from set theory, but Lisp doesn't, and so you can't treat an integer as a list. In Tcl, though, the string is the universal medium of representation, so I can treat an integer as a list (of one element).
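Both points in this exchange are easy to try out. The following is a small sketch using only core commands; the variable names are arbitrary. An integer is used as a one-element list, and a list built as data is then run as code.

```tcl
# An integer is also a list of one element, and still a number.
set n 42
set len   [llength $n]
set first [lindex $n 0]
set next  [expr {$n + 1}]
puts "length $len, first element $first, successor $next"

# "Code is data": build a command as a list, inspect it, then run it.
set script [list set greeting "hello world"]
set words  [llength $script]
eval $script
puts "the script has $words words and set greeting to: $greeting"
```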
FW: Come to think of it, what are some other typeless languages in the "everything is a string" sense - RS has already submitted and documented thoroughly the antique TRAC in Playing TRAC.
JE: MUMPS (aka "M") [4] is another EIAS language. Forth and BCPL are also typeless, but there the fundamental type is a "cell" or native machine word instead of strings. (BCPL seems to be extinct, but Forth and MUMPS are still around.)
Is everything a list?
LES on April 27, 2004: I am a newbie so what do I know. But the more I use Tcl, the more I am convinced that everything is a list. And that's one of the best features in Tcl.
FW: Wrong ;) "{", for example, is not a list. Your real point is probably that, say, numeric or other simple values can be treated as one-item lists: [lindex 123 0] == 123. Which, when you think about it, is really because of everything being a string!
Lars H: Actually, "{" is a list (whose only element is a left brace), but { is not a list. Although when you write them in a command you need to quote them, and "{" is one way to quote {, so llength "{" rightfully produces the error "unmatched open brace in list". The most transparent way to quote "{" is probably \"\{\"
FW: Right, I meant "{" as in [llength "{"], not that the quotes would be part of the value.
LES: I don't think it is "transparent"...
% set x \"\{\"
"{"
% llength $x
1
% puts $x
"{"
Hmm... I was expecting to get { instead of "{"
Lars H: Time to reread the endekalogue? Of course the above sets x to the three character string quote, left brace, quote (set said so itself). And of course it doesn't matter that you've treated the string as a list when you later ask puts to print it; everything is a string, and merely using that string doesn't change it. You do however get
% puts [lindex $x 0]
{
% puts [lrange $x 0 end]
\{
LES: So right. But I still say that it is not "transparent". The brace is a special character in Tcl and will often boggle minds if it's part of that string. But now we're digressing.
Is everything a list (2)?
KJN 2004-11-04: The endekalogue does not define lists - it leaves it to commands to decide whether to interpret an argument as a list (and, implicitly, to define what is meant by a list). It appears that the list commands will automatically interpret any string as a list (unless the string has unmatched braces, see above).
I would like to express an arbitrary (nested) proper list in Tcl; but how do I distinguish
"an atom"
from
[list "an" "atom"]
It is true that I can write
{"an atom"}
but it appears that list commands cannot distinguish this from
[list [list "an" "atom"]]
In short, whenever you use list commands to look at the element "an atom", it appears that they will always regard this as a list of length 2, not length 1.
Is there any way to fix this, rather than to work around it by:
(a) substituting spaces in "atomic" strings;
(b) recording the type in the expression, either only in cases where list commands would get it "wrong" (cf the 4 examples above):
[list ATOM "an atom"]
[list "an" "atom"]
[list [list ATOM "an atom"]]
[list [list "an" "atom"]]
or always (which by warning me not to use list commands on the "atom", also saves me the job of escaping any braces inside it):
[list ATOM "an atom"]
[list [list ATOM "an"] [list ATOM "atom"]]
[list [list ATOM "an atom"]]
[list [list [list ATOM "an"] [list ATOM "atom"]]]
(c) since "ATOM" is starting to look like a tag, give up on "simple" lists and use XML instead;
(d) give up on Tcl, and use Lisp instead
RHS 04Nov2004: There is no difference between
"an atom"
and
list "an" "atom"
I think what you're looking for is more of:
% lindex [list "an" "atom"] 0
an
% lindex [list "an atom"] 0
an atom
KJN 2004-11-05: Tcl does:
% lindex [lindex [list "an"] 0] 0
an
So what I'd like is
% lindex [lindex [list "an atom"] 0] 0
an atom
but what I get is
% lindex [lindex [list "an atom"] 0] 0
an
As you say: there is no difference between
"an atom"
and
list "an" "atom"
Both are lists of length 2.
RHS 05Nov2004: Ok, let me go about this another way then... Why is it you want this behavior? I think the problem is that you want to be able to say "this is a list" and "this is not a list". In Tcl, there's no such concept. You can only say "I want to treat this like a list" and "I want to treat this like some-other-thing-that-isn't-a-list".
What are you trying to do that this isn't sufficient for?
KJN: I'd like to store tree-structured data, with the possibility that a leaf can be an arbitrary string. If I build up the tree using list, and then inspect it with llength, lindex and so on, I hit the problem that my intended leaf is itself treated as a list, and may have length not equal to 1. There are obvious workarounds, like the ones I've mentioned above. I wondered whether it is possible to instruct the list commands that a string such as "an atom" is to be treated as length 1, so that I could use the native Tcl commands with no workarounds. A fictitious way to do this might be
list -norecurse -- "an atom"
RS: As lists can always go via string rep and back, this hidden feature would not be stable. But how about separating the domains of "text" (any string) and "child nodes" (a list)? E.g. to model a parent node with two children,
set tree {"a parent" {{"a child" {}} {"another child" {}}}}
KJN: yes, this is the same in spirit as my
[list [list [list ATOM "an"] [list ATOM "atom"]]]
workaround. This kind of solution is the best I can think of, because it makes maximum use of the native list commands and data structures, and avoids the need to process the leaf strings to make them both list-safe and length 1. The result
% llength "an atom"
2
surprised me, but I suppose it is a consequence of the encoding that Tcl chooses when it expresses a list as a string, plus the requirement (which you mention) of invariance when a value is converted from list to string and back again.
So, to summarise Is everything a list? (I hope I understand this now, please edit any errors)
- nearly every string is also a valid list - the exceptions are strings that contain unmatched braces or quotes. If such a string is passed to a list-processing command when a list is expected, the command will throw an error.
- if a string that is not a valid list is an argument of the list command, the return value of the command is a valid list, that has escapes and braces inserted to ensure its validity. These escapes and braces are visible in the string representation of the list, but are removed if the list item is extracted (e.g. with lindex), so that the original item is restored.
- a string is not always a list of length 1, but is parsed into list elements, according to its whitespace, quotes and braces. The string representation of a valid list will be parsed into the original list, so the parsing rules can be understood by inspecting lists printed out with puts.
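The three summary points, together with RS's node layout of a text followed by a list of children, can be sketched as follows. The proc name `leaves` is made up for illustration, and the `{*}` expansion assumes Tcl 8.5 or later.

```tcl
# A string is parsed into elements by whitespace, quotes and braces.
set atomLength [llength "an atom"]

# [list] adds whatever quoting is needed to keep the string one element.
set wrapped   [list "an atom"]
set wrapLen   [llength $wrapped]
set recovered [lindex $wrapped 0]
puts "bare: $atomLength elements; wrapped: $wrapLen element ($recovered)"

# A string with an unmatched brace is not a valid list...
set odd "unmatched \{ brace"
set failed [catch {llength $odd} message]
puts "llength failed: $failed ($message)"

# ...but it is a perfectly good list element, recovered unchanged.
set restored [lindex [list $odd] 0]

# RS's layout: each node is {text children}, so a leaf text is never
# itself inspected with list commands.
proc leaves {node} {
    lassign $node text children
    if {[llength $children] == 0} {
        return [list $text]
    }
    set result {}
    foreach child $children {
        lappend result {*}[leaves $child]
    }
    return $result
}

set tree {"a parent" {{"a child" {}} {"another child" {}}}}
set found [leaves $tree]
puts "leaves: $found"
```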
MS 2004-12-04 Relevant comments moved here from How Tcl is special
Everything is a string
Wrong! Tcl has the data type, array, and in 8.5 the dictionary data type was introduced. Both of these are definitely not strings. And, what about procs and namespaces? These are not strings either. People that say everything in Tcl is a string do not know what they're talking about.
JE: Actually, dictionaries are strings, just as much as lists are. But you're right: there are plenty of important entities in Tcl that are not strings -- for example, open file channels, Tk widgets, variables (both scalar and array), encodings, and interpreters. But in Tcl, all of those things are accessed by name, and the name is a string. That's why, for example, you can't pass an array to a procedure (the array is not a string), you have to pass the array name instead (the array name is a string).
MS: Please refer to the precise statement and discussion at the top of this page.
If "everything is a string," then how can you tell what's an object?
escargo 23 Jul 2005 - That's what I woke up to this morning. I was thinking that Tcl lacks what I have seen called a "meta-object protocol," something that allows some object-oriented languages (like Smalltalk) to do some useful operations on objects and classes. I like Snit because of what it allows me to do to compose objects using delegation. However, if I'm operating in Tcl (or in Tk) and I have an identifier, how can I tell if its value represents an object from an object system like Snit (or any of the other object systems added onto Tcl)? And if it is an object, how can I tell which object system it is an object in, so that I can guess what behavior it has (which functions it understands or implements)?
The only way I can see something like this working is if there were some agreed-upon standard for names (or references) such that a classifier (say [string is object ...]) could return a yes or no answer.
Even better would be one that could tell which object system implemented the object (say [string is objectsystem ...]).
This might be possible in a system like Jim if the Jim References encoded the object system and whether something was an object.
Even without add-on object systems, it would be nice to be able to determine if there could be [string is command ...], but that in some respects defeats the purpose of unknown. (I'm still fuzzy from sleep, so maybe there is something that does this already, otherwise how would unknown get called?)
Lars H, 24 July 2005: I think the best way of pointing out how your analysis here is wrong is to point out that
- Tcl has no whattype command;
- What design error did you make that made you ask that question in the first place? Where did you (or someone else) throw away the information that you now find you need?
type: value
can be partially evaluated (or partially applied), to yield a new function specialised for that interpretation of that value. This can be optimised and can enforce a type abstraction.
Lars H: Well put. The part about late "commitment" puts a name on something I think is very important in understanding the strengths of Tcl. I'll see if I can find a good place to put this idea for easy access.
- DKF: Actually, in 8.6 there is tcl::unsupported::representation, which includes type cache information in its result. Don't use it for anything other than debugging. Or if you do, feel very naughty. It is very bad style to write code that depends on types (albeit inevitable for solving certain types of problem in the support of Java and JSON correctly, alas).
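JE's point above, that arrays and commands are not strings but are reached through names that are, can be sketched like this. The proc names `count_keys` and `is_command` are made up for illustration. Note that `info commands` treats its argument as a glob pattern and does not see commands that would only be loaded on demand, so this is an approximation of the wished-for [string is command ...], not an equivalent.

```tcl
# An array cannot be passed by value; pass its name and link to it.
proc count_keys {arrayName} {
    upvar 1 $arrayName arr
    return [array size arr]
}

array set colours {red #f00 green #0f0}
set keyCount [count_keys colours]
puts "colours has $keyCount keys"

# A rough stand-in for asking whether a string currently names a command.
proc is_command {name} {
    return [expr {[llength [info commands $name]] > 0}]
}

set knowsPuts  [is_command puts]
set knowsBogus [is_command no_such_command_here]
puts "puts: $knowsPuts, no_such_command_here: $knowsBogus"
```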
SYStems, 23 Jul 2005: These are not very complete thoughts, but I think that to really answer and understand the idiom "everything is a string", we need to identify the context, or perspective.
A Tcl script is a series, a sequence of statements; each statement receives input
- A string.
- An event.
- produce output.
- cause a side effect.
- produce output and cause side effect
- Raise an error
set abc "123"
then set doesn't just store another tcl command's output...]
So we can say that everything inline- a tcl script, anything that can be passed around, a tcl script memory, a tcl script internal environment, must be a string. Or in other words, we can say, that Tcl introduces a new in-tcl context, where everything must have a textual representation.
Anything outside a Tcl script, outside the in-tcl context, for example, a command side effect, or an external environment, can be not a string.
NEM 2010-12-15: One aspect of EIAS that is worth consideration is how it has kept Tcl "pure" in some sense. Part of EIAS that is little mentioned is that Tcl's strings are immutable. This means that Tcl's value space is purely functional, in the Haskell sense. All side-effects are confined to the shadow world of commands and variables and other second-class entities. What this means is that Tcl now possesses some very powerful purely functional data-structures that are somewhat better than those available in other languages. For instance, I cannot think of another popular language that supplies O(1) purely functional dictionaries and lists (arrays) out of the box (or even in the library). Not to mention efficient Unicode and binary strings.
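The value semantics NEM describes can be observed directly. This is a small sketch assuming Tcl 8.5 or later for `dict`; the variable names are arbitrary. "Modifying" a copy of a list or dictionary leaves the original value untouched, because only variables change, never the values themselves.

```tcl
# Lists: appending to a copy does not affect the original value.
set a {1 2 3}
set b $a
lappend b 4
puts "a = $a, b = $b"

# Dictionaries: [dict replace] returns a new value; the old one is intact.
set d1 [dict create colour red size large]
set d2 [dict replace $d1 colour blue]
puts "d1 colour: [dict get $d1 colour], d2 colour: [dict get $d2 colour]"
```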
homoiconic
