FREE Subscription to Dr. Dobb’s Digest: Same Great Content, New Digital Edition
Site Archive (Complete)
Architecture & Design
Email
Print
Reprint

add to:
Del.icio.us
Digg
Google
Furl
Slashdot
Y! MyWeb
Blink
April 15, 2008
Cat: A Functional Stack-Based Little Language

(Page 1 of 4)
Christopher Diggins
Cat is an intermediate language for program verification, optimization, and more!
Christopher is a freelance programmer and consultant, with a particular interest in the design and implementation of programming languages. He can be contacted at cdiggins@gmail.com.


I've always been fascinated by stack-oriented languages because of their simplicity and elegance. Instructions take input off of the stack, do something with it, and push the output back onto the stack. Thus, there are no named variables or arguments, and the order of execution is left-to-right without any need for parentheses to denote precedence. As a result, it doesn't take long for people to learn the basics of programming in a stack language.

Another appeal of stack languages is that it is relatively easy to create reasonably efficient implementations for them, especially where memory considerations are at a premium. Because of this, we find stack languages in multicore processors (SeaForth, www.dspdesignline.com/188701424), virtual machines (Java Virtual Machine language), and embedded devices.

In general, when we think of stack languages, we often think of imperative languages: Each instruction does something to a shared stack. An alternative and equally accurate view of stack languages is that each instruction is a function that takes a stack as input and returns a stack as output. This is the principle upon which Manfred von Thun based the Joy language (www.latrobe.edu.au/philosophy/phimvt/joy.html).

The language that Joy most closely resembles is PostScript in that explicit control structures (branches, gotos, jumps) are replaced with higher order instructions (instructions that execute other instructions). Joy, however, introduced explicit function literals (PostScript uses a delayed execution operator) and eliminated the concept of the definition dictionary; in other words, you can't dynamically define or redefine new operations. This yielded a new breed of stack language that shares the advantages of pure functional languages (for example, it is expressive and easy to reason about and manipulate formally). It is interesting to note that despite its similarities to other stack-based languages, Joy evolved independently from Schoenfinkel and Curry's combinatory logic and the FP language by John Backus (www.vector.org.uk/archive/v203/vonthun203.htm).

My interest in Joy was primarily motivated by my search for an intermediate language that could be easily targeted by imperative and functional languages, could be easily optimized, and could be statically verified. Joy relies heavily on dynamic checking, so I created a more restricted, statically typed language based on Joy and called it "Cat."

Introducing Cat

In the Cat specification, instructions are referred to as "functions," regardless of whether they have side effects. New functions are defined using the define keyword, and have global scope. Functions cannot be redefined, and are only visible after they are defined. Example 1 presents some simple examples.


define myFirstCatProgram
{ "what is your name?" writeln readln "Hello " swap strcat writeln }
define addone : (int -> int)
{ 1 + }
define swapd : ('a 'b 'c -> 'b 'a 'c)
{ [swap] dip }
define dupd : ('a 'b -> 'a 'a 'b)
{ [dup] dip }
define bury : ('a 'b 'c -> 'c 'a 'b)
{ swap swapd }
define fib // compute fibonnacci
{ dup 1 <= [dup dec fib swap fib +] [] if }
define fact // compute factorial
{ dup <= 1 [dup dec fact *] [pop 1] if }

Example 1: Sample Cat functions.

At the core of the Cat language are primitive instructions for:

  • Mathematical operations: add, mul, sub, div, mod, rem, inc, dec.
  • Logical operations: and, or, not, xor.
  • Function construction: compose, papply, quote.
  • Executing functions: apply, dip, if, while.
  • Manipulating the stack: swap, dup, pop.
  • Dealing with lists: list, cons, uncons, head, tail, count, nth.
  • Comparing values: eq, lt, gt, lteq, gteq, neq.

In Cat, a literal (for example, 42, 'q', "Hello Christopher\n", 3.14) pushes a value onto the stack. Additionally, you can also write literal functions, by enclosing an expression in square braces ([1 +]). This has the effect of pushing a function onto the stack without executing it. In Joy parlance, this is called a "quotation," but I think of it as an anonymous function. Anonymous functions are first class values: They can be constructed dynamically and treated like any other primitive value (you can dup them, swap them, pop them, and so on). You can execute anonymous functions using higher order functions such as apply, if, and while.

Closures (functions bound to values in the local environment) can be constructed in Cat using the papply primitive instruction. This has the effect of binding the top value on the stack to the function, a process known as "partial application" (partial application is frequently mislabeled as "currying"). An example of partial application is that if you wrote 1 [<=] papply, it has the same effect as if you wrote [1 <=].

The quicksort algorithm in Example 2 is a more sophisticated example of Cat. The algorithm relies on a binary recursion instruction (bin_rec) that provides a general implementation of a binary recursive process (also called "tree recursion"). Example 3 is a possible definition of bin_rec.


define qsort : (list -> list)
{{
  desc:
    This is a simple implementation of a quick-sort algorithm.
  test:
    in: [5 4 2 1 3 2 4] list qsort
    out: [5 4 4 3 2 2 1] list
}}
{
  // Does list have 0 or 1 elements?
  [small]
  // Base case do nothing
  []
  // Argument-relation
  // Split the list using the head as a pivot
  // storing the pivot under for later use
  [uncons under [lt] papply split]
  // Append the pivot to the first list
  // then concatenate the two lists.
  [[swap cons] dip cat]
  bin_rec
}

Example 2: A simple quick-sort algorithm.


define binrec(input, cond, term, unfold, fold) {
  input cond
    [term] // termination transformation
    [input unfold // split input
      [cond term unfold fold binrec] // self call with same arguments
      app2 // call function twice:
        // once with second value removed
        // once with top value removed
  ]
  if
}

Example 3: Example implementation of the bin_rec function.

The bin_rec function is an example of a hylomorphism (see citeseer.ist.psu.edu/meijer91functional.html)—the composition of an anamorphism (an unfolding function) and a catamorphism (a folding function). Hylomorphisms are interesting because they can be used to eliminate the construction of intermediate data structures (citeseer.ist.psu.edu/launchbury95warm.html).

The quicksort algorithm in Example 2 also demonstrates an extended feature of Cat called "metadata"—a form of structured comment that can associate additional data with a Cat function that can be used by tools. For example, metadata can be used to document functions and perform automatic unit tests. The format is based on YAML (Yet Another Markup Language) and uses significant whitespace to denote hierarchical structure.

To demonstrate how compact Cat can be, Example 4 includes an implementation of the Google MapReduce algorithm (labs.google.com/papers/mapreduce.html). The general idea of the Google MapReduce algorithm is to define a task in terms of subtasks that can be executed (in this example, counting instances of words), and a function to combine the results of the subtasks (called the "reduce function"). While my implementations of Cat do not execute MapReduce concurrently, you can easily develop an implementation of Cat that automatically executes map and self_join to take advantage of available parallelism in the executing environment. This leads to an important point about Cat: As an implementor, you decide how to implement the primitive instructions and whether to implement standard library functions as library functions or built-in functions. This opens lots of opportunities for high-performance implementations.


define map_reduce
{ [map flatten self_join] dip map }
define test_strings
{
  (
    (("The", "quick", "brown", "fox", "jumped", "over", "the", "lazy", "dog"), 1),
    (("I", "am", "very", "lazy"), 2),
    (("I", "hope", "this", "is", "over", "quick"), 3),
    (("I", "have", "high", "hopes", "for", "the", "lazy", "dog"), 4)
  )
}
define test_map_fxn
{ unpair pop [1 swap pair] map }
define test_reduce_fxn
{ unpair [vec_sum] dip pair }
define test_map_reduce
{ test_strings [test_map_fxn] [test_reduce_fxn] map_reduce print_list }

Example 4: An implementation of the Google MapReduce algorithm.

1 Introducing Cat | 2 The Type System | 3 Cat for Application Development | 4 Cat Implementation Next Page
TOP 5 ARTICLES
No Top Articles.
DR. DOBB'S CAREER CENTER
Looking for a new job? open | close
Search jobs on Dr. Dobb's TechCareers
Function:

Keyword(s):

State:  
  • Post Your Resume
  • Employers Area
  • News & Features
  • Blogs & Forums
  • Career Resources

    Browse By:
    Location | Employer | City
  • Most Recent Posts:



    MICROSITES
    FEATURED TOPIC

    ADDITIONAL TOPICS

    INFO-LINK



     




    Techweb
    Informationweek Business Technology Network
    InformationweekInformationweek 500Informationweek 500 ConferenceInformationweek AnalyticsInformationweek Events
    Informationweek MagazineGlobal CIOIWK Government ITbMightyByte and SwitchDark Reading
    Digital LibraryIntelligent EnterpriseInternet EvolutionNetwork ComputingPlug Into The CloudDr. DobbsContentinople
    space
    TechWeb Events Network
    InteropVoiceConWeb 2.0 ExpoWeb 2.0 SummitEnterprise 2.0Mobile Business ExpoNoJitter
    Black HatGTECEnergy CampCloud ConnectGov 2.0 ExpoGov 2.0 Summit
    space
    Light Reading Communications Network
    Light ReadingLight Reading AsiaUnstrungCable Digital NewsInternet EvolutionPyramid Research
    Heavy ReadingLight Reading LiveLight Reading InsiderEthrnet ExpoTelco TVTower Technology Summit
    space
    Financial Technology Network
    Advanced TradingBank Systems and TechnologyInsurance and TechnologyWall Street and TechnologyAccelerating WallstreetBST SummitBuyside Trading SummitIT Summit
    space
    Microsoft Technology Network
    MSDNTechNetTotal IT ProTotal Dev ProNET Total Dev Pro CommunitySQL Total Dev Pro Community
    space