Disjunctive Normal Form

Richard Suchenwirth 2005-04-24 - In Integers as Boolean functions it was shown how from a logical expression f, a non-negative integer n(f) can be computed that represents the expression. Now how about the opposite direction - given an integer, which expression does it represent? A simple answer is to provide the "disjunctive normal form" [L1 ], where the truth table is constructed, and all rows that are true are joined with "or". For example, decimal 11 is binary 1011. From its logarithm to base 4 we see that it has an "arity" of 2, i.e. it matches a truth table of two variables, into which it is filled, least significant bit first:

 b a  f(11)
 0 0  1
 0 1  1
 1 0  0
 1 1  1

By specifying the cases where f(11)=1, we get the full disjunctive normal form (disjunction is just a fancy word for "or" i.e. "|"):

 (!a & !b) | (a & !b) | (a & b)

However, the first two elements of this disjunction are a bit redundant, as obviously the value of a doesn't really matter in them - they cancel themselves out:

  (!a & !b) | (a & !b) == !b

So f(11) reduces to

 !b | (a & b)

And another simplification is possible: if !b doesn't hold, b must be true, so the second element boils down to

 !b | a

Here's the code:

 proc disjunctiveNormalForm n {
    if {$n <= 0} {return 0} ;# never true
    set res {}
    set rows [1-bits $n]
    foreach row $rows {
        set element [conjunction [int2bits $row [arity $n]] $rows]
        if {$element ne {}} {lappend res ($element)} 
    }
    if [llength $res] {
        join [lsort -u $res] " | "
    } else {return 1} ;# always true - tautology
 }
#-- Determine the minimal arity for an integer
 proc arity n {
    foreach {min res} {4 1  16 2  256 3 65536 4} {
        if {$n<$min} {return $res}
    }
    error "arity too high"
 }
 
 proc 1-bits n {
    #-- set bits in an integer, e.g. 1-bits 11 -> {0 1 3}
    set res {}
    set row 0
    while {$n} {
        if {$n%2} {lappend res $row}
        set n [expr {$n/2}]
        incr row
    }
    set res
 }
 proc int2bits {n width} {
    set bit 1
    for {set i 0} {$i<$width} {incr i} {
        lappend res [expr {!!($n & $bit)}]
        incr bit $bit
    }
    set res
 }
 proc bits2int bits {
    set res 0
    set value 1
    foreach bit $bits {
        set res [expr {$res+$bit*$value}]
        incr value $value
    }
    set res
 }

Building up the conjunctive element for a row, we test each bit if the contrary case is also contained in the rows, and only add its form if it isn't:

 proc conjunction {bits rows} {
    set res {}
    set i 0
    foreach bit $bits {
        set contrary [lreplace $bits $i $i [not $bit]]
        if ![in $rows [bits2int $contrary]] {
            lappend res [expr {$bit? "": "!"}]$[letter $i]
        }
        incr i
    }
    join $res " & "
 }
 proc not bit {expr {!$bit}}
#-- A simple map generates variable names for 0..25:
 interp alias {} letter {} \
    lindex {a b c d e f g h i j k l m n o p q r s t u v w x y z}
#-- Handy shortcut for element inclusion in a list
 proc in {list element} {expr {[lsearch -exact $list $element]>=0}}
#--Testing:
 for {set i 0} {$i<16} {incr i} {puts $i:[disjunctiveNormalForm $i]}

results in:

 0:0
 1:(!$a)
 2:($a)
 3:1
 4:(!$a & $b)
 5:(!$a)
 6:(!$a & $b) | ($a & !$b)
 7:(!$a) | (!$b)
 8:($a & $b)
 9:(!$a & !$b) | ($a & $b)
 10:($a)
 11:(!$b) | ($a)
 12:($b)
 13:(!$a) | ($b)
 14:($a) | ($b)
 15:1

One thing that puzzles me:

 % disjunctiveNormalForm 168
 ($a & $b) | ($a & $c) | ($a)

Expected result is:

 ($a & $b) | ($a & $c)

Where does the silly third term come from? The truth table is

 c b a  f(168), 168=128+32+8
 0 0 0  0
 0 0 1  0
 0 1 0  0
 0 1 1  1   a & b & !c  (3)
 1 0 0  0
 1 0 1  1   a & !b & c  (5)
 1 1 0  0
 1 1 1  1   a & b & c   (7)

Hmm - in (7), the "b" is canceled by (5), and the "c" by (3)... But then row (1) would be true too, which it clearly isn't... Maybe the simplification algorithm needs more work. Hints welcome! :)

Lars H: There's a traditional method for simplifying DNF expressions in which one writes out the function as an array and then tries to cover all the 1s with as few terms as possible. In the case of a function of a, b, and c it might look as follows

  c0110
  b0011
 a     
 0 0000   ($a && $b) || ($a && $c)
 1 0111

A term which depends on all variables cover one entry in this array, a term which depends on all but one variable covers a 2x1 or 1x2 rectangle in this array, a term which depends on all but two variables covers a 4x1, 2x2, or 1x4 rectangle in this array, and so on. In the example above a minimum of two terms is required to cover the three ones, and the shortest expression is obtained if one lets the a=b=c=1 entry be covered twice.

Formally a function of n variables should be considered to "live" on an n-dimensional hypercube, but that quickly gets beyond what can conveniently be pictured on paper, so a trick is employed: a two-dimensional hypercube (a square) is in this context (it's only the corners that count) isomorphic to a 4-cycle, which can conveniently be cut open. That way one can half the number of dimensions of the array, at the price of a somewhat unexpected order of rows and columns (it's 00, 01, 11, 10). But this trick only helps one to do four variables using pen and paper.

As for automation, I suspect the problem is really rather difficult (but ought to be well studied). You basically need to find a minimal cost cover of a given graph (a subgraph of a hypercube) by a given set of graphs (lower dimensional hypercubes), while allowing repetitions (this last part might simplify the problem). Problems of that kind have a tendency to be NP-hard, but you might be lucky. Anyway there should be extensive literature on the subject (most of which was probably written by electrical engineers, though).