Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

Design

Cat: A Functional Stack-Based Little Language


Christopher is a freelance programmer and consultant, with a particular interest in the design and implementation of programming languages. He can be contacted at [email protected].


I've always been fascinated by stack-oriented languages because of their simplicity and elegance. Instructions take input off of the stack, do something with it, and push the output back onto the stack. Thus, there are no named variables or arguments, and the order of execution is left-to-right without any need for parentheses to denote precedence. As a result, it doesn't take long for people to learn the basics of programming in a stack language.

Another appeal of stack languages is that it is relatively easy to create reasonably efficient implementations for them, especially where memory considerations are at a premium. Because of this, we find stack languages in multicore processors (SeaForth, www.dspdesignline.com/188701424), virtual machines (Java Virtual Machine language), and embedded devices.

In general, when we think of stack languages, we often think of imperative languages: Each instruction does something to a shared stack. An alternative and equally accurate view of stack languages is that each instruction is a function that takes a stack as input and returns a stack as output. This is the principle upon which Manfred von Thun based the Joy language (www.latrobe.edu.au/philosophy/phimvt/joy.html).

The language that Joy most closely resembles is PostScript in that explicit control structures (branches, gotos, jumps) are replaced with higher order instructions (instructions that execute other instructions). Joy, however, introduced explicit function literals (PostScript uses a delayed execution operator) and eliminated the concept of the definition dictionary; in other words, you can't dynamically define or redefine new operations. This yielded a new breed of stack language that shares the advantages of pure functional languages (for example, it is expressive and easy to reason about and manipulate formally). It is interesting to note that despite its similarities to other stack-based languages, Joy evolved independently from Schoenfinkel and Curry's combinatory logic and the FP language by John Backus (www.vector.org.uk/archive/v203/vonthun203.htm).

My interest in Joy was primarily motivated by my search for an intermediate language that could be easily targeted by imperative and functional languages, could be easily optimized, and could be statically verified. Joy relies heavily on dynamic checking, so I created a more restricted, statically typed language based on Joy and called it "Cat."

Introducing Cat

In the Cat specification, instructions are referred to as "functions," regardless of whether they have side effects. New functions are defined using the define keyword, and have global scope. Functions cannot be redefined, and are only visible after they are defined. Example 1 presents some simple examples.


define myFirstCatProgram
{ "what is your name?" writeln readln "Hello " swap strcat writeln }
define addone : (int -> int)
{ 1 + }
define swapd : ('a 'b 'c -> 'b 'a 'c)
{ [swap] dip }
define dupd : ('a 'b -> 'a 'a 'b)
{ [dup] dip }
define bury : ('a 'b 'c -> 'c 'a 'b)
{ swap swapd }
define fib // compute fibonnacci
{ dup 1 <= [dup dec fib swap fib +] [] if }
define fact // compute factorial
{ dup <= 1 [dup dec fact *] [pop 1] if }

Example 1: Sample Cat functions.

At the core of the Cat language are primitive instructions for:

  • Mathematical operations: add, mul, sub, div, mod, rem, inc, dec.
  • Logical operations: and, or, not, xor.
  • Function construction: compose, papply, quote.
  • Executing functions: apply, dip, if, while.
  • Manipulating the stack: swap, dup, pop.
  • Dealing with lists: list, cons, uncons, head, tail, count, nth.
  • Comparing values: eq, lt, gt, lteq, gteq, neq.

In Cat, a literal (for example, 42, 'q', "Hello Christopher\n", 3.14) pushes a value onto the stack. Additionally, you can also write literal functions, by enclosing an expression in square braces ([1 +]). This has the effect of pushing a function onto the stack without executing it. In Joy parlance, this is called a "quotation," but I think of it as an anonymous function. Anonymous functions are first class values: They can be constructed dynamically and treated like any other primitive value (you can dup them, swap them, pop them, and so on). You can execute anonymous functions using higher order functions such as apply, if, and while.

Closures (functions bound to values in the local environment) can be constructed in Cat using the papply primitive instruction. This has the effect of binding the top value on the stack to the function, a process known as "partial application" (partial application is frequently mislabeled as "currying"). An example of partial application is that if you wrote 1 [<=] papply, it has the same effect as if you wrote [1 <=].

The quicksort algorithm in Example 2 is a more sophisticated example of Cat. The algorithm relies on a binary recursion instruction (bin_rec) that provides a general implementation of a binary recursive process (also called "tree recursion"). Example 3 is a possible definition of bin_rec.


define qsort : (list -> list)
{{
  desc:
    This is a simple implementation of a quick-sort algorithm.
  test:
    in: [5 4 2 1 3 2 4] list qsort
    out: [5 4 4 3 2 2 1] list
}}
{
  // Does list have 0 or 1 elements?
  [small]
  // Base case do nothing
  []
  // Argument-relation
  // Split the list using the head as a pivot
  // storing the pivot under for later use
  [uncons under [lt] papply split]
  // Append the pivot to the first list
  // then concatenate the two lists.
  [[swap cons] dip cat]
  bin_rec
}

Example 2: A simple quick-sort algorithm.


define binrec(input, cond, term, unfold, fold) {
  input cond
    [term] // termination transformation
    [input unfold // split input
      [cond term unfold fold binrec] // self call with same arguments
      app2 // call function twice:
        // once with second value removed
        // once with top value removed
  ]
  if
}

Example 3: Example implementation of the bin_rec function.

The bin_rec function is an example of a hylomorphism (see citeseer.ist.psu.edu/meijer91functional.html)—the composition of an anamorphism (an unfolding function) and a catamorphism (a folding function). Hylomorphisms are interesting because they can be used to eliminate the construction of intermediate data structures (citeseer.ist.psu.edu/launchbury95warm.html).

The quicksort algorithm in Example 2 also demonstrates an extended feature of Cat called "metadata"—a form of structured comment that can associate additional data with a Cat function that can be used by tools. For example, metadata can be used to document functions and perform automatic unit tests. The format is based on YAML (Yet Another Markup Language) and uses significant whitespace to denote hierarchical structure.

To demonstrate how compact Cat can be, Example 4 includes an implementation of the Google MapReduce algorithm (labs.google.com/papers/mapreduce.html). The general idea of the Google MapReduce algorithm is to define a task in terms of subtasks that can be executed (in this example, counting instances of words), and a function to combine the results of the subtasks (called the "reduce function"). While my implementations of Cat do not execute MapReduce concurrently, you can easily develop an implementation of Cat that automatically executes map and self_join to take advantage of available parallelism in the executing environment. This leads to an important point about Cat: As an implementor, you decide how to implement the primitive instructions and whether to implement standard library functions as library functions or built-in functions. This opens lots of opportunities for high-performance implementations.


define map_reduce
{ [map flatten self_join] dip map }
define test_strings
{
  (
    (("The", "quick", "brown", "fox", "jumped", "over", "the", "lazy", "dog"), 1),
    (("I", "am", "very", "lazy"), 2),
    (("I", "hope", "this", "is", "over", "quick"), 3),
    (("I", "have", "high", "hopes", "for", "the", "lazy", "dog"), 4)
  )
}
define test_map_fxn
{ unpair pop [1 swap pair] map }
define test_reduce_fxn
{ unpair [vec_sum] dip pair }
define test_map_reduce
{ test_strings [test_map_fxn] [test_reduce_fxn] map_reduce print_list }

Example 4: An implementation of the Google MapReduce algorithm.


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.