Rewriting as a Computational Paradigm
December 20, 2015
There are all sorts of ways of getting a computer to compute. For many programmers, the most natural way is using an imperative-style language (e.g., C, Java, Python), where you lay out the exact steps of how computation should take place—do this, then this, then this. In contrast, the declarative languages generally focus less on how computation should take place and instead focus more on what the computation should do. In writing declarative programs, one describes the goals of computation without describing the exact steps—a function that computes so-and-so should have such and such properties.
The most popular kinds of declarative languages these days are the functional languages (e.g., OCaml, Haskell, Lisp), which make it easy to build expressions describing facts (as opposed to statements describing actions). However, there are other kinds of declarative paradigms focusing on logical deduction (e.g., Prolog) or data query (e.g., SQL, XQuery, XSLT). One of the fringe arms in the constellation of declarative languages is the family of rewriting languages.
In rewriting languages, the state of a running program is generally a thing (a string, a tree, an expression) that gets transformed during each step of computation to another thing. There are string rewriting languages, term rewriting languages, graph rewriting languages, etc. I personally find rewriting to be expressive and intuitive, and so I want to share the basic ideas with people who might be interested in learning a new paradigm. We will take a brief look at string rewriting languages, followed by term rewriting languages.
String Rewriting (A Warmup!)
The simplest kind of Turing-complete rewriting languages are the string rewriting languages. In a string rewriting language, the input (and state) of a running program is a string. The programmer writes a set of rules that describe how substrings should be rewritten into other strings. Each replacement transforms the string, which may enable other rules to apply, and so on.
String rewriting is not something that’s really used for general purpose programming. We are looking at it here to get a feel for the basics of using rewriting as a computational device.
An Example String Rewriting Language
Let’s make up our own string-rewriting language (basically a sublanguage of Thue) with rules arranged like (LHS ⇒ RHS). The left-hand side (LHS) of a rule is a pattern (substring) to be matched; if a match is found, it can be replaced with the right-hand side (RHS) of the rule. Each step of computation thus takes the following shape: match the LHS of a rule as a substring in the current state and replace it with the RHS; repeat until no rules apply. In our string rewriting language, there will be no special characters—all characters are meant to be literal characters.
Now that we have a string-rewriting language, let’s write a program. Let’s say you want to increment a binary number that is given to you as a string delimited with underscores. E.g., you’d like a program that takes in “_1011_” as input and returns “_1100” as output. Here is a program to do so:
# get started
1_ ⇒ 1++     [Rule1]
0_ ⇒ 1       [Rule2]

# eliminate ++
01++ ⇒ 10    [Rule3]
11++ ⇒ 1++0  [Rule4]
_1++ ⇒ _10   [Rule5]
In this program, we use ++ as a marker for “increment what’s to the left of this marker”.
Computation on our example input proceeds through the following program states:

_1011_   ⇒  _1011++   by applying Rule1
_1011++  ⇒  _101++0   by applying Rule4
_101++0  ⇒  _1100     by applying Rule3
Try extending the above program to eliminate the left-most underscore. As a harder exercise, try writing a program that adds two binary numbers (e.g., “101+11” gets transformed to “1000”). It should be pretty clear how writing programs for a string rewriting language is similar to writing programs for a Turing machine.
This is a pretty inexpressive language, though we can imagine extending this language with variable capture (like regex capture groups), input/output, etc. Generally speaking, rewriting can be nondeterministic, so you can have rules that compete:
0 ⇒ a
0 ⇒ b
Altogether, a pretty interesting regex-style language could be made from these components.
At first glance, rewriting doesn’t appear to be any different from pattern matching in functional languages. In some sense, this is true—you get to match things and replace them with other things. However, because rewriting rules match anywhere they can, in whatever order they can, it frees the programmer from having to worry about traversing objects or ordering lists of rules or how rules might be applied concurrently. Of course, actual implementations of rewriting languages will have some strategies (simple or complicated) for applying rules, but as long as the programmer focuses on making individual steps that make progress, everything works out.
To really hammer the idea home, let’s go ahead and write a concrete implementation of the above rules in a functional language. It will be pretty obvious how to generate this code based on a set of rules. We have to pick a traversal mechanism to search a string for matches (let’s choose left-to-right), and an ordering for rules to apply (let’s choose top-to-bottom, starting at the top every time a match is applied). Here’s how it might look in OCaml:
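A sketch of what that generated code could look like; the helper functions (explode, implode) and the exact scanning strategy here (leftmost position first, earlier rules preferred at each position) are one reasonable choice among many:

```ocaml
(* Helpers for converting between strings and char lists, since OCaml
   can't pattern-match on strings directly. *)
let explode (s : string) : char list =
  List.init (String.length s) (String.get s)

let implode (cs : char list) : string =
  String.concat "" (List.map (String.make 1) cs)

(* Scan left-to-right; at each position, try the rules top-to-bottom.
   Returns None when no rule matches anywhere in the string. *)
let rec rewriteOnce (cs : char list) : char list option =
  match cs with
  | '1' :: '_' :: rest -> Some ('1' :: '+' :: '+' :: rest)                       (* Rule1 *)
  | '0' :: '_' :: rest -> Some ('1' :: rest)                                     (* Rule2 *)
  | '0' :: '1' :: '+' :: '+' :: rest -> Some ('1' :: '0' :: rest)                (* Rule3 *)
  | '1' :: '1' :: '+' :: '+' :: rest -> Some ('1' :: '+' :: '+' :: '0' :: rest)  (* Rule4 *)
  | '_' :: '1' :: '+' :: '+' :: rest -> Some ('_' :: '1' :: '0' :: rest)         (* Rule5 *)
  | c :: rest ->
    (match rewriteOnce rest with
     | Some rest' -> Some (c :: rest')
     | None -> None)
  | [] -> None

(* Keep rewriting until no rule applies. *)
let rec rewrite (cs : char list) : char list =
  match rewriteOnce cs with
  | Some cs' -> rewrite cs'
  | None -> cs

let increment (s : string) : string = implode (rewrite (explode s))
```

A rule-major scan (try Rule1 everywhere before Rule2) would be an equally valid reading of the strategy; on this program both reach the same results.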
Implementations of the helper functions explode and implode can be found here.
In OCaml, strings can’t be pattern matched against directly—they must be converted to char lists first.
Again, this is just one of many possible implementations of a string rewriting language. It might be that applying the rules in different orders results in different outcomes (as in the nondeterminism example above), or more efficient chains of computation. As written, this would be a terribly inefficient implementation. One improvement would be to keep indexes to track which rules are enabled after each step. Leaving such decisions up to the implementation frees the programmer from having to worry about them; one is left with deciding how states should be transformed into other states.
Now that we’ve covered the basics in a toy language, let’s move on to a real rewriting language.
Term Rewriting
Term rewriting has been used as the basis of a number of general purpose programming languages such as Clean, Maude, Pure, Rascal, Stratego/XT, and Tom. We will be doing all of our examples in Maude, because it is the language I’m most familiar with. It is also, arguably, one of the most expressive and efficient languages of the bunch.
In term rewriting languages, terms (syntax trees) are the basic unit of state. A program consists of rules that match subterms and replace them with new subterms. Unlike our simple string rewriting example, term rewriting languages generally have complex pattern matching with variable binding, types, and even rewriting in the presence of equational theories. Each of these will be touched on in the examples below.
To get us started with a simple example, let’s see how we might encode numbers and perform arithmetic in a term rewriting language. Of course, numbers and arithmetic are built into most languages; we’re only using them as a convenient example.
Because we’re talking about term-rewriting, we need a way to define what terms are allowed. This is like describing a syntax that we’re going to allow ourselves to talk about. Natural numbers are built recursively using the following signature:
sort Nat .
op 0 : -> Nat .
op s(_) : Nat -> Nat .
This says that we’re defining a new sort (thing, type) “Nat”. We can build up Nats in two ways: 0 is a Nat (an operator taking no arguments), and s (successor) is an operator taking a Nat argument and resulting in a Nat.
In BNF, the above would be written as:

<Nat> ::= "0" | "s(" <Nat> ")"
This grammar lets us construct different Nats. Intuitively, we want the term 0 to correspond to the number 0, s(0) to correspond to 1, s(s(0)) to correspond to 2, etc.
As a reminder, Maude has actual built-in numbers; we’re just using this as a simple example.
Let’s add some syntax that lets us write basic arithmetic expressions:
op _+_ : Nat Nat -> Nat .
op _*_ : Nat Nat -> Nat .
Now we can construct terms like s(s(s(0))) * (s(s(0)) + s(0)), intuitively representing 3 * (2 + 1).1
Okay, so far all we’ve done is declare some syntax. How do we actually do computation with the terms we can express? Let’s write some rewrite rules to define addition:
eq N:Nat + 0 = N:Nat .                      [Rule1]
eq N:Nat + s(M:Nat) = s(N:Nat + M:Nat) .    [Rule2]
Rule1 is pretty straightforward—if we’re adding two numbers and the second number is a 0, then the result of the addition is just the first number.
Rule2 moves the s symbol from the second argument of an addition to surround the addition itself. This way, the right argument of an addition eventually becomes 0, at which point Rule1 applies.
To see how this works, here’s how the previous two rules would apply to s(s(s(0))) + s(s(0)):

s(s(s(0))) + s(s(0))
  ⇒  s( s(s(s(0))) + s(0) )    Match Rule2 with N:Nat = s(s(s(0))), M:Nat = s(0)
  ⇒  s(s( s(s(s(0))) + 0 ))    Match Rule2 with N:Nat = s(s(s(0))), M:Nat = 0
  ⇒  s(s( s(s(s(0))) ))        Match Rule1 with N:Nat = s(s(s(0)))
As in string rewriting, in each step we look for a place where the left-hand side of a rule matches a subterm. In step 1, the matched term is the entire term. In step 2, the matched term is a proper subterm—it is the child of an outer s. In step 3, the matched term is a proper subterm nested under two outer s operators. Once we find a match, we transform it into the right-hand side of the rule.
We can get Maude to reduce this term for us by saying:
red s(s(s(0))) + s(s(0)) .
result Nat: s(s(s(s(s(0)))))
Try to convince yourself that these are the only rules needed to define addition over natural numbers.
Multiplication can be computed as follows:
eq N:Nat * 0 = 0 .
eq N:Nat * s(M:Nat) = (N:Nat * M:Nat) + N:Nat .
We could keep adding operators like this all day, but let’s stop after one more:
op _<_ : Nat Nat -> Bool .
eq 0 < s(N:Nat) = true .
eq N:Nat < 0 = false .
eq s(N:Nat) < s(M:Nat) = N:Nat < M:Nat .
Maude has conditional rules, which give you an alternate way to express the first equation:
ceq 0 < N:Nat = true if N:Nat =/= 0 .
In the case that the side condition holds, a conditional rule applies as a normal rule would.
Comparison with Functional
How would we define the above encoding of natural numbers in a functional language? The most obvious way (done here in OCaml) would be as follows:
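A minimal sketch of that encoding; the names Zero, Succ, and plus are illustrative:

```ocaml
(* The obvious OCaml encoding of naturals; the names Zero, Succ, and
   plus are illustrative choices. *)
type nat =
  | Zero
  | Succ of nat

(* Addition as a recursive function, mirroring Rule1 and Rule2
   from the rewriting version. *)
let rec plus (n : nat) (m : nat) : nat =
  match m with
  | Zero -> n                    (* N + 0 = N *)
  | Succ m' -> Succ (plus n m')  (* N + s(M) = s(N + M) *)
```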
Here, constructors like Succ make up a new datatype nat, and we define a recursive function to compute addition. However, in rewriting, there are no functions like this—the _+_ operator is just another way to build terms.
This might seem like a pedantic distinction, but one can take advantage of it when writing programs.
Imagine a new operator for pretty-printing terms:
op describe(_) : Nat -> String .
eq describe(0) = "zero" .
eq describe(s(N:Nat)) = "s(" + describe(N:Nat) + ")" .
eq describe(N:Nat + M:Nat) = "(" + describe(N:Nat) + " + " + describe(M:Nat) + ")" .
In most languages, being able to grab the addition operator in this manner would require some kind of reflection or meta-programming capabilities.
In term rewriting, _+_ is just another way of building terms, no different from 0 or s.
Of course, the evolution of a term like describe(0 + 0) is nondeterministic—it could either turn into "zero" (if the addition reduces first) or "(zero + zero)" (if the describe equation matches first). For describe to work unambiguously, you’d need to disable the normal rules for _+_ or set rule priorities.
In order to allow the possibility of “grabbing” the addition operator in the functional world, you’d need to add a Plus variant to the type. In order to actually evaluate addition, we’d now have to add a new function, perhaps called “simplify”, that would reduce a term to its simplest form. The simplify function would have to implement a traversal over the term, which we alluded to in a previous section.
We can see what this might look like by taking it one step:
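A sketch of what this might look like, with illustrative names (Zero, Succ, Plus, simplifyOnce):

```ocaml
(* Extending the datatype with a Plus variant, and a one-step
   simplifier; the names are illustrative. *)
type nat =
  | Zero
  | Succ of nat
  | Plus of nat * nat

(* Try to apply one rewrite somewhere in the term;
   None means the term is fully reduced. *)
let rec simplifyOnce (t : nat) : nat option =
  match t with
  | Plus (n, Zero) -> Some n                       (* Rule1: N + 0 = N *)
  | Plus (n, Succ m) -> Some (Succ (Plus (n, m)))  (* Rule2: N + s(M) = s(N + M) *)
  | Zero -> None
  | Succ n ->
    (match simplifyOnce n with
     | Some n' -> Some (Succ n')
     | None -> None)
  | Plus (n, m) ->
    (* no rule matched at the root; traverse the children *)
    (match simplifyOnce n with
     | Some n' -> Some (Plus (n', m))
     | None ->
       (match simplifyOnce m with
        | Some m' -> Some (Plus (n, m'))
        | None -> None))

(* Apply simplifyOnce until a fixpoint is reached. *)
let rec simplify (t : nat) : nat =
  match simplifyOnce t with
  | Some t' -> simplify t'
  | None -> t
```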
The simplifyOnce function plays the same role that the rewriteOnce function did for string rewriting.
Every time you added a new way to produce naturals (e.g., multiplication), you’d have to add cases in the traversal.
Not only that, but to make this kind of execution fast is tricky—luckily Maude handles it all for us.
Let’s graduate to more traditional example programs—the list data type and associated operations.
Let’s start with cons lists, the kinds of lists that are typically associated with functional languages.
sort ConsList .
op nil : -> ConsList .
op _::_ : Int ConsList -> ConsList .

op head(_) : ConsList -> Int .
eq head(X:Int :: L:ConsList) = X:Int .

op tail(_) : ConsList -> ConsList .
eq tail(X:Int :: L:ConsList) = L:ConsList .

op reverse(_) : ConsList -> ConsList .
op reverseAux(_,_) : ConsList ConsList -> ConsList .
eq reverse(L:ConsList) = reverseAux(L:ConsList, nil) .
eq reverseAux(nil, L:ConsList) = L:ConsList .
eq reverseAux(X:Int :: L:ConsList, L':ConsList) = reverseAux(L:ConsList, X:Int :: L':ConsList) .
Not a whole lot of surprises here; these definitions would feel at home in a typical functional language.
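For comparison, here is a rough OCaml rendering of the same operations (a sketch; Maude simply leaves head and tail of nil undefined, while OCaml forces us to pick a behavior, here an exception):

```ocaml
(* Cons lists in OCaml, mirroring the Maude definitions above. *)
type consList =
  | Nil
  | Cons of int * consList

let head (l : consList) : int =
  match l with
  | Cons (x, _) -> x
  | Nil -> failwith "head of empty list"

let tail (l : consList) : consList =
  match l with
  | Cons (_, rest) -> rest
  | Nil -> failwith "tail of empty list"

(* Reverse via an accumulator, just like reverseAux above. *)
let rec reverseAux (l : consList) (acc : consList) : consList =
  match l with
  | Nil -> acc
  | Cons (x, rest) -> reverseAux rest (Cons (x, acc))

let reverse (l : consList) : consList = reverseAux l Nil
```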
Although the lists above work, Maude allows the programmer to specify a richer kind of list:
sort List .
subsort Int < List .
op nil : -> List .
op _;_ : List List -> List [assoc id: nil] .
The [assoc] annotation tells Maude that the operator (_;_) is associative, so that the parse tree (2 ; (3 ; 5)) should be considered equivalent to ((2 ; 3) ; 5).
Without telling it otherwise, Maude doesn’t know how to parse (2 ; 3 ; 5) unambiguously, and will simply choose a parsing.
The [id: nil] annotation tells Maude that the nil operator is the identity element for the _;_ operator.
These annotations are essentially equivalent to the following equations:
--- associativity
eq L1:List ; (L2:List ; L3:List) = (L1:List ; L2:List) ; L3:List .
eq (L1:List ; L2:List) ; L3:List = L1:List ; (L2:List ; L3:List) .

--- identity
eq L:List ; nil = L:List .
eq nil ; L:List = L:List .
eq L:List = L:List ; nil .
eq L:List = nil ; L:List .
However, such equations would result in nontermination if actually used. With the annotations, Maude just knows that they’re true and applies them when doing so enables other rules to apply. This is what is meant by rewriting in the presence of equational theories, or rewriting “modulo” theories.
Suddenly, writing rules involving lists becomes a lot easier:
eq reverse(nil) = nil .
eq reverse(I:Int ; L:List) = reverse(L:List) ; I:Int .
Because Maude knows the identity element is nil, it allows the above definition to reduce even a singleton list like reverse(7). This is the same as reverse(7 ; nil), after which the second and then the first rule applies.
Generally speaking, such lists are a lot more natural than cons-lists.2
Sorting such lists can be enabled with a single, simple rule:
ceq I1:Int ; I2:Int = I2:Int ; I1:Int if I2:Int < I1:Int .
Were it not for the [assoc] annotation, this rule would be unable to sort a list like 3 ; (2 ; 5), because the rule would not be able to match.
In most functional languages, it would be difficult to define sorting so succinctly.
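To see why, here is a sketch of the same swap-until-sorted idea in OCaml: the explicit traversal and fixpoint loop are exactly what the single Maude rule avoids (names are illustrative).

```ocaml
(* Find the leftmost adjacent out-of-order pair and swap it;
   None means the list is already sorted. *)
let rec swapOnce (xs : int list) : int list option =
  match xs with
  | x1 :: x2 :: rest when x2 < x1 -> Some (x2 :: x1 :: rest)
  | x :: rest ->
    (match swapOnce rest with
     | Some rest' -> Some (x :: rest')
     | None -> None)
  | [] -> None

(* Keep swapping until no out-of-order pair remains (a bubble sort). *)
let rec sort (xs : int list) : int list =
  match swapOnce xs with
  | Some xs' -> sort xs'
  | None -> xs
```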
Sets are where Maude really starts to shine.
sort Set .
subsort Int < Set .
op empty : -> Set .
op _,_ : Set Set -> Set [assoc comm id: empty] .
eq I:Int , I:Int = I:Int .
The [comm] annotation tells Maude that the operator is commutative, so that its elements can be rearranged at will.
The equation eliminates duplicates (otherwise, we’d be defining a multi-set).
Let’s define an operator for checking if an item exists in a set:
op _in_ : Int Set -> Bool .
eq I:Int in empty = false .
eq I:Int in (S:Set , I:Int) = true .
ceq I:Int in (I':Int , S:Set) = I:Int in S:Set if I:Int =/= I':Int .
The possibility of lists, sets, and multisets (associative and/or commutative operators), coupled with the underlying “let rules match wherever they can” idea, makes for a powerful combination. It means in a potentially large program state, only the relevant parts need to be matched and manipulated.
To round out the examples, let’s take a look at how one might describe and write algorithms for tree structures.
sort BinTree .
op empty : -> BinTree .
op _[_]_ : BinTree Int BinTree -> BinTree .
Traversals can be defined in a straightforward way:
op inorder(_) : BinTree -> List .
eq inorder(empty) = nil .
eq inorder(L:BinTree [I:Int] R:BinTree) = inorder(L:BinTree) ; I:Int ; inorder(R:BinTree) .

op preorder(_) : BinTree -> List .
eq preorder(empty) = nil .
eq preorder(L:BinTree [I:Int] R:BinTree) = I:Int ; preorder(L:BinTree) ; preorder(R:BinTree) .
Even keeping the trees sorted (turning them into binary search trees) can be done with two simple, declarative rules:
ceq (T1:BinTree [I1:Int] T2:BinTree) [I2:Int] T3:BinTree = (T1:BinTree [I2:Int] T2:BinTree) [I1:Int] T3:BinTree if I1:Int > I2:Int .
ceq T1:BinTree [I1:Int] (T2:BinTree [I2:Int] T3:BinTree) = T1:BinTree [I2:Int] (T2:BinTree [I1:Int] T3:BinTree) if I1:Int > I2:Int .
The above two rules keep trees sorted by swapping elements when necessary.
Rewriting is a special kind of declarative style used in programming languages. It lets the programmer focus on making small transformations to the program state. Programs written in a rewriting style are simply collections of such transformations, and execution is the continued application of such rules.
In the interest of conveying the big picture, I’ve left out many details. If you’re interested in learning more about Maude, there are a few tutorials (here, here), though they probably need to be supplemented with information from the manual. With that said, Maude has plenty of shortcomings. Although it is relatively fast, considering its level of expressivity, it lacks many things necessary for a general purpose programming language (like proper I/O support). Many of the problems of Maude have been addressed in the K Framework, a domain specific language (based on rewriting) for defining programming languages. Although first and foremost a language for defining programming languages, it can also be used for general computational problems. I have only limited experience with other term rewriting languages, but my understanding is that Rascal is probably the most practical, and also the most multi-paradigm.
Pattern matching can be very powerful, and even the rich kinds we saw here are fairly straightforward. My hope is that language features based on rewriting might find their way into more and more languages or domains. For general purpose languages, I think all that’s needed is to round off some of the rough edges found in the currently existing languages—many started as academic or experimental languages. For special purpose languages, I’d love to see a shell or sysadmin tool based on rewriting for working with lines, files, and directories. Let me know if you can think of other kinds of uses for rewriting!
The astute reader will notice we never added a (_) production for grouping; Maude allows parens for disambiguating ambiguous parsings. ↩
Though there are some gotchas. For example, the rule eq reverse(L:List ; L':List) = reverse(L':List) ; reverse(L:List) would cause nontermination. Given a term like reverse(nil), Maude can always apply this rule by first expanding the term to reverse(nil ; nil). After applying the rule, and another two expansions, it’s now ready to apply the rule again twice! ↩