Rough sets a la Relations

if {0} {

A Relational Formulation of Rough Sets

Andrew Mangogna -- 12 July 2009

I noticed an update on Arjen Markus' page on Rough sets and the comment near the bottom that the content should be formulated in terms of relational algebra. I can't resist the temptation to express set oriented operations in relational terms, so here goes. AM's original suggestion was Ratcl, but I'll use TclRAL.

AM (13 july 2009) I was just thinking the same thing :) - I found a more recent article on rough sets and realised that TcLRAL (or Ratcl as I apparently suggested at the time) would be a nice vehicle to implement the idea. (My main line of work is to do with number crunching, so this is "merely" a fun topic ...) }

 package require ral 0.9
 package require ralutil 0.9

 namespace import ::ral::*
 namespace import ::ralutil::*

 if {0} {

First, we need to populate the data base. This is a simple example so we can hold the data as a single relation in an ordinary Tcl variable.

 }

 set props [relation table {
    Name string     Education string    Age string      ChanceForJob boolean} {
    Joe             HighSchool          Old             No                  } {
    Mary            HighSchool          Young           Yes                 } {
    Peter           Elementary          Young           No                  } {
    Paul            University          Young           Yes                 } {
    Cathy           Doctorate           Old             Yes
 }]
 puts [relformat $props "Starting Database"]
 if {0} {
 +------+----------+------+------------+
 |Name  |Education |Age   |ChanceForJob|
 |string|string    |string|boolean     |
 +------+----------+------+------------+
 |Joe   |HighSchool|Old   |No          |
 |Mary  |HighSchool|Young |Yes         |
 |Peter |Elementary|Young |No          |
 |Paul  |University|Young |Yes         |
 |Cathy |Doctorate |Old   |Yes         |
 +------+----------+------+------------+
 Starting Database
 -----------------
 }

 if {0} {

The equivalence classes were calculated to determine what set of people had the same value for a given class. One way to do that is to project out the "Name" attribute along with the classes of interest and then group together the names.

 }

 proc equivClasses {which_props} {
     # ralutil provides convenient control structure that allows us to pass the
     # result of one computation on to a subsequent one with a syntax
     # reminiscent of the UNIX shell pipeline.  We use it here to make clearer
     # the sequence of algebraic operations that might be more difficult
     # to see with ordinary Tcl command nesting.

     # Here, the result of the "relation project" is passed along to the
     # "relation group" command where the "~" argument is located.
     return [pipe {
         relation project $::props {*}$which_props Name |
         relation group ~ Who Name
     }]
 }

 set edu [equivClasses Education]
 puts [relformat $edu "Education Only"]
 if {0} {
 +----------+--------+
 |Education |Who     |
 |string    |Relation|
 +----------+--------+
 |University|+------+|
 |          ||Name  ||
 |          ||string||
 |          |+------+|
 |          ||Paul  ||
 |          |+------+|
 |Elementary|+------+|
 |          ||Name  ||
 |          ||string||
 |          |+------+|
 |          ||Peter ||
 |          |+------+|
 |HighSchool|+------+|
 |          ||Name  ||
 |          ||string||
 |          |+------+|
 |          ||Joe   ||
 |          ||Mary  ||
 |          |+------+|
 |Doctorate |+------+|
 |          ||Name  ||
 |          ||string||
 |          |+------+|
 |          |+------+|
 |          ||Cathy ||
 |          |+------+|
 +----------+--------+
 Education Only
 --------------
 }
 set age [equivClasses Age]
 puts [relformat $age "Age Only"]
 if {0} {
 +------+--------+
 |Age   |Who     |
 |string|Relation|
 +------+--------+
 |Young |+------+|
 |      ||Name  ||
 |      ||string||
 |      |+------+|
 |      ||Mary  ||
 |      ||Peter ||
 |      ||Paul  ||
 |      |+------+|
 |Old   |+------+|
 |      ||Name  ||
 |      ||string||
 |      |+------+|
 |      ||Joe   ||
 |      ||Cathy ||
 |      |+------+|
 +------+--------+
 Age Only
 --------
 }

 set edu_age [equivClasses {Education Age}]
 puts [relformat $edu_age "Education and Age"]
 if {0} {
 +----------+------+--------+
 |Education |Age   |Who     |
 |string    |string|Relation|
 +----------+------+--------+
 |HighSchool|Young |+------+|
 |          |      ||Name  ||
 |          |      ||string||
 |          |      |+------+|
 |          |      ||Mary  ||
 |          |      |+------+|
 |Elementary|Young |+------+|
 |          |      ||Name  ||
 |          |      ||string||
 |          |      |+------+|
 |          |      ||Peter ||
 |          |      |+------+|
 |HighSchool|Old   |+------+|
 |          |      ||Name  ||
 |          |      ||string||
 |          |      |+------+|
 |          |      ||Joe   ||
 |          |      |+------+|
 |Doctorate |Old   |+------+|
 |          |      ||Name  ||
 |          |      ||string||
 |          |      |+------+|
 |          |      ||Cathy ||
 |          |      |+------+|
 |University|Young |+------+|
 |          |      ||Name  ||
 |          |      ||string||
 |          |      |+------+|
 |          |      ||Paul  ||
 |          |      |+------+|
 +----------+------+--------+
 Education and Age
 -----------------
 }

 if {0} {

Now we can look at the same data, but this time we are interested in those classes with at least one person of interest. Following along the same way as the original page:

 }

 proc subsetOnlyClasses {classes who} {
    set who_rel [pipe {
        relation restrictwith $::props {$Name in $who} |
        relation project ~ Name
    }]

    return [relation restrictwith $classes {
        [relation is $Who subsetof $who_rel]
    }]
 }

 set success_edu [subsetOnlyClasses $edu {Mary Paul Cathy}]
 puts [relformat $success_edu "Education Only, Mary Paul Cathy"]
 if {0} {
 +----------+--------+
 |Education |Who     |
 |string    |Relation|
 +----------+--------+
 |University|+------+|
 |          ||Name  ||
 |          ||string||
 |          |+------+|
 |          ||Paul  ||
 |          |+------+|
 |Doctorate |+------+|
 |          ||Name  ||
 |          ||string||
 |          |+------+|
 |          ||Cathy ||
 |          |+------+|
 +----------+--------+
 Education Only, Mary Paul Cathy
 -------------------------------
 }

 set success_age [subsetOnlyClasses $age {Mary Paul Cathy}]
 puts [relformat $success_age "Age Only, Mary Paul Cathy"]
 if {0} {
 +------+--------+
 |Age   |Who     |
 |string|Relation|
 +------+--------+
 +------+--------+
 Age Only, Mary Paul Cathy
 -------------------------
 }

 set success_edu_age [subsetOnlyClasses $edu_age {Mary Paul Cathy}]
 puts [relformat $success_edu_age "Education and Age, Mary Paul Cathy"]
 if {0} {
 +----------+------+--------+
 |Education |Age   |Who     |
 |string    |string|Relation|
 +----------+------+--------+
 |HighSchool|Young |+------+|
 |          |      ||Name  ||
 |          |      ||string||
 |          |      |+------+|
 |          |      ||Mary  ||
 |          |      |+------+|
 |Doctorate |Old   |+------+|
 |          |      ||Name  ||
 |          |      ||string||
 |          |      |+------+|
 |          |      ||Cathy ||
 |          |      |+------+|
 |University|Young |+------+|
 |          |      ||Name  ||
 |          |      ||string||
 |          |      |+------+|
 |          |      ||Paul  ||
 |          |      |+------+|
 +----------+------+--------+
 Education and Age, Mary Paul Cathy
 ----------------------------------
 }

 if {0} {

And finally, we formulate those classes that have at least one successful person.

 }

 proc possibleClasses {classes who} {
    set who_rel [pipe {
        relation restrictwith $::props {$Name in $who} |
        relation project ~ Name
    }]

    return [relation restrictwith $classes {
        [relation isnotempty [relation intersect $Who $who_rel]]
    }]
 }

 set poss_edu [possibleClasses $edu {Mary Paul Cathy}]
 puts [relformat $poss_edu "Education that is successful, Mary Paul Cathy"]
 if {0} {
 +----------+--------+
 |Education |Who     |
 |string    |Relation|
 +----------+--------+
 |University|+------+|
 |          ||Name  ||
 |          ||string||
 |          |+------+|
 |          ||Paul  ||
 |          |+------+|
 |HighSchool|+------+|
 |          ||Name  ||
 |          ||string||
 |          |+------+|
 |          ||Joe   ||
 |          ||Mary  ||
 |          |+------+|
 |Doctorate |+------+|
 |          ||Name  ||
 |          ||string||
 |          |+------+|
 |          ||Cathy ||
 |          |+------+|
 +----------+--------+
 Education that is successful, Mary Paul Cathy
 ---------------------------------------------
 }

 set poss_age [possibleClasses $age {Mary Paul Cathy}]
 puts [relformat $poss_age "Age that is successful, Mary Paul Cathy"]
 if {0} {
 +------+--------+
 |Age   |Who     |
 |string|Relation|
 +------+--------+
 |Young |+------+|
 |      ||Name  ||
 |      ||string||
 |      |+------+|
 |      ||Mary  ||
 |      ||Peter ||
 |      ||Paul  ||
 |      |+------+|
 |Old   |+------+|
 |      ||Name  ||
 |      ||string||
 |      |+------+|
 |      ||Joe   ||
 |      ||Cathy ||
 |      |+------+|
 +------+--------+
 Age that is successful, Mary Paul Cathy
 ---------------------------------------
 }

 set poss_edu_age [possibleClasses $edu_age {Mary Paul Cathy}]
 puts [relformat $poss_edu_age\
        "Education and Age that are successful, Mary Paul Cathy"]
 if {0} {
 +----------+------+--------+
 |Education |Age   |Who     |
 |string    |string|Relation|
 +----------+------+--------+
 |HighSchool|Young |+------+|
 |          |      ||Name  ||
 |          |      ||string||
 |          |      |+------+|
 |          |      ||Mary  ||
 |          |      |+------+|
 |Doctorate |Old   |+------+|
 |          |      ||Name  ||
 |          |      ||string||
 |          |      |+------+|
 |          |      ||Cathy ||
 |          |      |+------+|
 |University|Young |+------+|
 |          |      ||Name  ||
 |          |      ||string||
 |          |      |+------+|
 |          |      ||Paul  ||
 |          |      |+------+|
 +----------+------+--------+
 Education and Age that are successful, Mary Paul Cathy
 ------------------------------------------------------
 }

 if {0} {

Observations

There are many way in which this example could have been formulated in relational terms. I have tried to choose simple, straight-forward means. One thing to notice in the code is the complete lack of any looping constructs. This is what makes relational algebra so powerful, namely, the operators are "set-at-a-time" in the way they function. Since relation are a "universal" (i.e. you can formulate any data structure with a relation) often relational approaches to problems result in much less looping code. Of course, there is a tradeoff in terms of the learning curve. TclRAL has a lot of commands, but like Tcl itself, these can be approached from many different levels. You can start. as on this page, with relatively simple operations and work your way up to complex queries, relvars, relvar constraints and more.

 }

GAM -- 14 July 2009 Minor correction to the print out for the starting database.