Updated 2015-10-09 05:28:14 by pooryorick

HE 2015-10-08:

lsearch determines which values in a list match a pattern. I want to determine which patterns in a list match a value.

Here is my solution:
Syntax
searchPatList mode value patList

mode specifies the type of pattern, and is one of:
-exact
Exact string matching.
-glob
glob-style matching as provided by string match.
-regexp
regexp matching.

value is a string to apply each pattern to.

patList is a list of patterns.

searchPatList returns the index of the first pattern in patList that matches value, or -1 if no pattern matches.
proc searchPatList {mode value patList} {
    set n 0
    switch -exact -- $mode {
        -exact {
            foreach el $patList {
                if {$el eq $value} {
                    return $n
                }
                incr n
            }
            return -1
        }
        -glob {
            foreach el $patList {
                if {[string match $el $value]} {
                    return $n
                }
                incr n
            }
            return -1
        }
        -regexp {
            foreach el $patList {
                if {[regexp -- $el $value]} {
                    return $n
                }
                incr n
            }
            return -1
        }
        default {
            error "Unknown mode '$mode'!"
        }
    }
    return
}

And the examples:
set patList [list \
        {test 3.4.1} \
        {dummy} \
        {^test *[0-9.]*$} \
        {test *} \
]

searchPatList -exact {test 3.4.1} $patList
0
searchPatList -exact {test 3.4.2} $patList
-1
searchPatList -exact {dummy} $patList
1
searchPatList -glob {test 3.4.1} $patList
0
searchPatList -glob {test 3.4.2} $patList
3
searchPatList -glob {dummy} $patList
1
searchPatList -regexp {test 3.4.1} $patList
0
searchPatList -regexp {test 3.4.2} $patList
2
searchPatList -regexp {dummy} $patList
1
searchPatList -default {dummy} $patList
Unknown mode '-default'!

PYK 2015-10-08:

$patList mixes patterns which are clearly intended to be used with some particular command, yet searchPatList only operates in one mode at a time, making the overall design a bit over-engineered. This type of situation just might be the right place to use if without bracing the expressions:
set matchers {
    {{test 3.4.1} eq [lindex @[email protected]]}
    {0 && [lindex @[email protected]]}
    {[string match {^test *[0-9.]*$} @[email protected]]}
    {[string match {test *} @[email protected]]}
}
foreach matcher $matchers {
    if [string map [list @[email protected] [list $val]] $matcher] {
        puts [list matched $matcher]
        break
    }
}

The advantages of this approach are that patterns and their associated commands appear together, and arbitrary matching algorithms can be introduced as needed. The [lindex @[email protected]] bit serves merely as an identity function, which is one way to properly quote a value being inserted into an expr template. Note also the use of [list $val] which is the right way to escape a value being interpolated into a Tcl script template.

HE 2015-10-08: Hello PYK, thank you for the suggestion, even if your approach doesn't match the request written in the first line. Perhaps I didn't explain well enough what I want to achieve. I want a procedure which, like lsearch, returns the index of the matching list item, or -1 if there is no match. But instead of a list of values that are checked against a single pattern, I have a single value and a list of patterns. As with lsearch, the kind of pattern is selected by an option.

With this information you will also understand that patList is not a list of different kinds of patterns, even though I use the same pattern list for all three modes in the test cases.

The result of my procedure is a number which can be used with if, just as with lsearch. The results of the examples show exactly what I want to get. Certainly someone could extend the procedure with options that deliver additional kinds of result sets, as lsearch does. But that is nothing I needed when I wrote searchPatList.

There is also a security issue when the pattern list comes from outside the Tcl script, for example from a configuration file or the Tk GUI. I use a win64 system and added the pattern '{[exec more deda]}' to matchers: the external program 'more' is executed. If I add this pattern to patList and use it with searchPatList, the external program 'more' is not executed. Think about what could happen with other programs, which could format your HD or do other harm.
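The difference described here can be reproduced without running any external program. The following is a minimal sketch, assuming a hypothetical proc sideEffect that stands in for exec and only records that it was called:

```tcl
# Hypothetical stand-in for a dangerous command such as exec.
set called 0
proc sideEffect {} {
    incr ::called
    return 1
}

# A "pattern" as it might arrive from a configuration file.
set untrusted {[sideEffect]}

# Used as an unbraced if expression, the text is evaluated as code.
if $untrusted {
    puts "template approach: sideEffect ran $called time(s)"
}

# Passed to string match or regexp, the same text is only data:
# the brackets form a character set, and no command is run.
set before $called
string match $untrusted value
regexp -- $untrusted value
puts "pattern approach: sideEffect ran [expr {$called - $before}] more time(s)"
```

In the second half the counter does not change, because string match and regexp never hand their pattern argument to the Tcl evaluator.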

PYK 2015-10-08: Doesn't my approach do what is written in the first line since it identifies the first pattern in the list that the value matches against? It would be trivial to wrap it into a procedure for the purpose of using it in conjunction with if. If the patterns are coming from untrusted sources and injection attacks are a concern, the patterns could take the form of commands, and the procedure could be run in a safe interpreter where only certain commands are exposed. Something like the following would be amenable to that:
proc matchpat {value patterns {token @[email protected]}} {
    set idx 0
    foreach pattern $patterns {
        if {[{*}[string map [list $token [list $value]] $pattern]]} {
            return $idx
        }
        incr idx
    }
    return -1
}

# In the user interface, document that a pattern should be a complete command
# and that a token like @[email protected] is a placeholder.
if {[matchpat {test 3.4.1} {
    {string match {test 3.4.1} @[email protected]}
    {expr {0 && @[email protected]}}
    {string match {^test *[0-9.]*$} @[email protected]}
    {string match {test *} @[email protected]}
}] >= 0} {
    puts [list {some pattern matched}]
}
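The safe-interpreter idea mentioned above might look roughly like the following. This is only a sketch under stated assumptions: the proc name matchInSafeInterp, the whitelist of allowed commands, and the %VALUE% token are all hypothetical choices made for this example, and it has not been hardened against resource-exhaustion attacks.

```tcl
# Sketch only: matchInSafeInterp, the whitelist, and the %VALUE% token
# are hypothetical names chosen for this example.
proc matchInSafeInterp {value patterns {token %VALUE%}} {
    set child [interp create -safe]
    # Hide every global command in the child except a small whitelist.
    set allowed {string regexp expr}
    foreach name [interp eval $child {info commands}] {
        if {$name ni $allowed} {
            interp hide $child $name
        }
    }
    set idx 0
    set result -1
    foreach pattern $patterns {
        set cmd [string map [list $token [list $value]] $pattern]
        # Rebuild the command as a canonical list so that its words are
        # passed literally. Errors (for example an unknown command) and
        # results that "string is true" rejects count as "no match".
        if {![catch {interp eval $child [list {*}$cmd]} ok]
                && [string is true -strict $ok]} {
            set result $idx
            break
        }
        incr idx
    }
    interp delete $child
    return $result
}

puts [matchInSafeInterp {test 3.4.2} {
    {exec more deda}
    {string match {test 3.4.1} %VALUE%}
    {string match {test *} %VALUE%}
}]
```

Because exec is not available in the child interpreter, the first pattern only raises an error there, which the sketch treats as a non-match before moving on to the next pattern.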

HE 2015-10-08: I tried your example and changed 'value' slightly and got an error:
(tclkit-8.6.1) 5 % if {[matchpat {test 3.4.2} {
    {string match {test 3.4.1} @[email protected]}
    {expr 0 && @[email protected]}
    {string match {^test *[0-9.]*$} @[email protected]}
    {string match {test *} @[email protected]}
}] >= 0} {
    puts [list {some pattern matched}]
}
invalid bareword "test"
in expression "0 && test 3.4.2";
should be "$test" or "{test}" or "test(...)" or ...

Reason is the pattern:
{expr 0 && @[email protected]}

It should be:
{expr {0 && @[email protected]}}

With this I get:
(tclkit-8.5.10) 13 % matchpat {test 3.4.2} {
    {string match {test 3.4.1} @[email protected]}
    {expr {0 && @[email protected]}}
    {string match {^test *[0-9.]*$} @[email protected]}
    {string match {test *} @[email protected]}
    {exec more deda}
}
3

which is the result I expected with this pattern.

I know that I can prevent injection with a safe interpreter. But if there is an easier way, I like to use it. It increases readability for me. So it's a question of personal style.

I agree that your code is more flexible. I wonder which one would be faster ;-) Could be another exercise.
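One way to approach that exercise is the built-in time command. The sketch below uses condensed stand-ins for the two procedures (searchGlob is a glob-only reduction of searchPatList, matchCmd mirrors matchpat with a hypothetical %VALUE% token); the iteration count is arbitrary and the numbers will vary by machine and Tcl version.

```tcl
# Condensed glob-only variant of searchPatList, for timing purposes.
proc searchGlob {value patList} {
    set n 0
    foreach el $patList {
        if {[string match $el $value]} { return $n }
        incr n
    }
    return -1
}

# Command-template variant in the style of matchpat;
# %VALUE% is a hypothetical placeholder token.
proc matchCmd {value patterns {token %VALUE%}} {
    set idx 0
    foreach pattern $patterns {
        if {[{*}[string map [list $token [list $value]] $pattern]]} {
            return $idx
        }
        incr idx
    }
    return -1
}

set globs {{test 3.4.1} dummy {test *}}
set cmds {
    {string match {test 3.4.1} %VALUE%}
    {string match dummy %VALUE%}
    {string match {test *} %VALUE%}
}

# Run each variant many times and print the average time per call.
puts "pattern list: [time {searchGlob {test 3.4.2} $globs} 10000]"
puts "command list: [time {matchCmd {test 3.4.2} $cmds} 10000]"
```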

PYK 2015-10-08: Ah, right. That was left over from the first approach. I've made that change in the example now. People seem hesitant to use multiple interpreters, but they are often a great tool in the arsenal.

HE 2015-10-08: Hello pooryorick, you removed the option -- after the switch command. There is nothing wrong with --, and there are a lot of people who like to use it to prevent possible future errors, even if it is not needed in a particular case.

Please accept that other people have different opinions than you about how to write code. Therefore, I changed it back and ask you not to change the code again. Let this wiki be a place where diversity in writing Tcl is possible.

Thanks.

PYK 2015-10-08: Even without your request, I wouldn't have removed it after you added it back, since that action is a clear communication that you want it there. The documentation for switch takes pains to mention that -- is unnecessary when the statements are grouped together into a single argument. I removed it because I prefer to avoid non-functional syntax in learning environments.
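For readers who want to check the point about -- themselves, here is a small sketch, assuming Tcl 8.5 or later and a hypothetical helper named classify. It passes a string beginning with - to switch without using --, with the patterns and bodies grouped into a single argument:

```tcl
# Hypothetical helper that reports which branch a mode string selects.
proc classify {mode} {
    # No "--" here: with the patterns and bodies grouped into one
    # argument, the argument before that block is taken as the string,
    # even when it starts with "-".
    switch -exact $mode {
        -glob   { return glob }
        -regexp { return regexp }
        default { return other }
    }
}

puts [classify -glob]
puts [classify -regexp]
puts [classify something]
```

When patterns and bodies are written as separate arguments instead, switch cannot tell a leading - in the string from an option, so -- is the safer choice in that form.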