.Dd February 25, 2026
.Dt MINIMUNGER 1
.Os
.Sh NAME
.Nm MiniMunger
.Nd Language for writing text-processing filters
.Sh SYNOPSIS
.Nm minimunger Ao source-file Ac
.Sh DESCRIPTION
MiniMunger (MM) is a compiler that translates a small variant of Munger into C.
The compiler is written in Munger, and the language is limited to writing
filters.  An interface to SQLite is provided.
.Pp
This manual page describes only the differences between Munger and MM.  For
more information, see the Munger manual page.
.Pp
Example programs and helper modules are installed in
%%PREFIX%%/share/minimunger.
.Bl -tag -width "stringstack.mm"
.It grep.mm
is an egrep-like filter.
.It fmt.mm
is a fmt-like filter.
.It tables.mm
is a hash table demo and performance test.
.It sqlite.mm
is a SQLite demo.
.It options.mm
aids the processing of command-line arguments.
.It stacks.mm
aids working with stacks.
.It stringstack.mm
aids in concatenating stacks of strings.
.It tsml2sqlite.mm
converts a TSML document into a SQLite database.
.It tsmlquery.mm
provides functions to query a converted TSML document.
.It tsmltest.mm
illustrates the use of tsml2sqlite.mm and tsmlquery.mm.
.It y.mm
uses the Y combinator to calculate the factorial of 19.
.El
.Pp
The example programs can be compiled in the source directory with:
.Bd -literal -offset left
make tests
.Ed
.Sh IMPLEMENTATION NOTES
The MM compiler is a whole-program compiler, reading one input file, and
producing two output files of C code, which may then be compiled by the C
compiler with the MM runtime to produce an executable.  Instructions for
using the compiler follow this section.
.Bl -bullet
.It
MM supports neither lists nor list-related functions.  Programs are
written as lists, but programs may not create lists.
.It
The first-class data types are:  stacks, tables, closures, continuations,
compiled regular expressions, strings, and fixnums.
.It
Global variables must be declared before they are recognized in subsequent
code. If necessary, "declare" an initial dummy value as a forward reference and
then subsequently "bind" the correct value.
.It
Side effects are permissible only on globals.
.It
MM has no looping constructs.  Iteration is performed by recursion.  The
compiler performs CPS conversion, which converts all calls to tail calls.
.It
First-class continuations are captured with the "call_cc" intrinsic which
behaves like "call/cc" does in Scheme.
.It
User-defined functions have fixed-size argument lists.
.It
All equality comparisons use "eq".
.It
User-defined macros are not supported in minimunger code, but you can hack
the compiler to define your own compiler macros.  See the function
"make_initial_cte".
.It
The compiler does not detect parameter/argument count mismatches in the
application of user-defined functions. Such mismatches crash the runtime.
.It
For maximum performance, the MM runtime DOES NOT PERFORM type-checking.  If
your code calls an intrinsic function with arguments of the wrong type, the
runtime will crash.
.It
Although lambda-expressions can be bound to variables in "let" and "letn"
forms, there is no "labels" nor "letf" to allow the functions to see their own
bindings.  Any function that calls itself must have a toplevel binding.
.El
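.Pp
The notes above on forward references and recursion can be illustrated with
a minimal sketch.  The name "countdown" is hypothetical, and the sketch
assumes that "lambda" is written as it is in Munger.  The global is first
declared with a dummy value, then bound to a function that iterates by
calling itself:
.Bd -literal -offset left
# Hypothetical example: print 3, 2, 1 by recursion.
(declare countdown 0)
(bind countdown
   (lambda (n)
      (when (> n 0)
         (println (stringify n))
         (countdown (- n 1)))))
(countdown 3)
.Ed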
.Sh COMPILING MM PROGRAMS
MM depends upon the SQLite database library, which must be installed
before you can compile MM programs.  Invoke "pkg install sqlite3".
.Pp
The MM compiler is written in Munger, and compiles MM code to an
intermediate language, defined as a set of macros in the source of the C
runtime.  Most of the macros expand to in-line code for speed, resulting in
larger executables than might be expected from the size of the original MM
programs.
.Pp
To compile an MM program, invoke the compiler on the main source file:
.Bd -literal -offset left
% minimunger grep.mm
.Ed
.Pp
The compiler performs source-to-source conversions before it begins to emit
code, printing status messages as it does so.  When it has finished, two files
are created, one named "functions.c" and one named "functions.h".  To create an
executable from these files, invoke the C compiler on the MM runtime source,
which includes the other two files.  The command line below will be the same
when building any program compiled by MM, except for the argument to the -o
option.
.Bd -literal -offset left
% cc -o grep /usr/local/share/minimunger/runtime.c \\
-I./ -I/usr/local/share/minimunger -I/usr/local/include \\
-L/usr/local/lib -lsqlite3 -lcurses
.Ed
.Pp
The main source file of a program may include other source files with the
"include" directive.  The "include" directive resembles its similarly-named
C preprocessor counterpart, and consists of the word "include" preceded by
an octothorpe (#), and followed by a filename delimited by double quotes.  For
example:
.Bd -literal -offset left
#include "options.mm"
.Ed
.Pp
If the filename itself contains double quotes, they do not need to be
escaped.  Include directives must start in column zero to be recognized.
Otherwise, they will be treated as comments.  Included files themselves may
also "include" other source files.
.Sh THE INTRINSICS
The MM intrinsics bear a strong resemblance to their similarly-named Munger
counterparts.  Some behave differently.  This summary does not completely
document the operation of the intrinsic functions, but merely lists which are
available and how they differ from their Munger counterparts.  For complete
documentation of an intrinsic, see the Munger(1) manual page.
.Ss Control Flow / Side-Effects
The empty string and 0 are boolean false values.  All other objects are
considered boolean true values.  The forms below function identically to
their Munger counterparts, with the exception of the conditionals.  Note
that "bind" is the only means of accomplishing side effects on variables,
and that side effects are permissible only upon globals.
.Pp
When "if" is invoked with only a "true" subsequent clause, and the test
condition evaluates to a false value, 0 is returned, and not the value of
the failed test condition.  Similarly, if all test clauses of an invocation
of "cond" fail, then 0 is returned, rather than the value of the last
failed test condition.  Both "when" and "unless" also return 0 if their
test conditions fail.
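.Pp
As a minimal sketch of the rule above, the following binds the result of an
"if" that has no alternative clause and whose test fails.  The variable name
"result" is hypothetical.  According to the description above, "result"
receives 0 rather than the value of the failed test:
.Bd -literal -offset left
# Hypothetical example: the test fails and there is no else clause.
(declare result (if (eq 1 2) "yes"))
(println (stringify result))
.Ed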
.Pp
If you want an expression to return a value rather than the result of
evaluating an expression, you must use "eval". "eval" effectively does nothing.
.Bl -column -offset left "unless" "(letn ((symbol expr)+) expr+)"
.It Sy Form Ta Sy Use
.It Li declare Ta  (declare symbol expr)
.It Li bind   Ta   (bind symbol expr)
.It Li if     Ta   (if test expr1 expr2 ...)
.It Li cond   Ta   (cond (test_expr subsequent ...)+ )
.It Li when   Ta   (when test expr ...)
.It Li unless Ta   (unless test expr ...)
.It Li progn  Ta   (progn expr ...)
.It Li eq     Ta   (eq expr1 expr2)
.It Li or     Ta   (or expr ...)
.It Li and    Ta   (and expr ...)
.It Li not    Ta   (not expr)
.It Li let    Ta   (let ((symbol expr)+) expr+)
.It Li letn   Ta   (letn ((symbol expr)+) expr+)
.It Li exit   Ta   (exit expr)
.It Li quit   Ta   (quit)
.It Li die    Ta   (die ...)
.It Li eval   Ta   (eval expr)
.El
.Pp
The "call_cc" intrinsic captures the current continuation.  It functions
exactly as call/cc does in Scheme:
.Bl -column -offset left "call_cc" "(call_cc monadic_function)"
.It Sy Intrinsic Ta Sy Use
.It Li call_cc Ta (call_cc monadic_function)
.El
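.Pp
The following is a small sketch of an early exit through a captured
continuation.  It assumes that a continuation is invoked with a single
argument, as in Scheme, and the parameter name "k" is hypothetical.  Under
Scheme semantics the addition is never completed, and the "call_cc"
expression evaluates to 42:
.Bd -literal -offset left
# Hypothetical example: escape from the middle of an expression.
(println (stringify (call_cc (lambda (k) (+ 1 (k 42))))))
.Ed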
.Pp
.Ss Regular Expressions
.Bl -column -offset left "substitute" "(substitute rx rep str count)" "0 or stack of 2 fixnums"
.It Sy Intrinsic Ta Sy Use Ta Sy Return Value
.It Li regcomp    Ta (regcomp str)               Ta compiled rx
.It Li match      Ta (match rx str)              Ta 0 or stack of 2 fixnums
.It Li matches    Ta (matches rx str)            Ta stack of 20 strings
.It Li substitute Ta (substitute rx rep str cnt) Ta string
.It Li regexpp     Ta (regexpp expr)             Ta 0 or 1
.El
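.Pp
A minimal sketch of compiling and applying a regular expression follows.
The names "rx" and "m" are hypothetical.  The meaning of the two fixnums
returned by "match" is not described here, so the sketch merely prints them;
see the Munger manual page for their interpretation:
.Bd -literal -offset left
# Hypothetical example: match one or more digits.
(declare rx (regcomp "[0-9]+"))
(declare m (match rx "abc123"))
(when m
   (println (stringify (index m 0)) " " (stringify (index m 1))))
.Ed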
.Ss Tables
.Bl -column -offset left "unhash" "(hash table expr1 expr2)" "associated expr"
.It Sy Intrinsic Ta Sy Use Ta Sy Return Value
.It Li table      Ta (table)                       Ta new table
.It Li tablep     Ta (tablep expr)                 Ta 0 or 1
.It Li hash       Ta (hash table expr1 expr2)      Ta table
.It Li unhash     Ta (unhash table expr1)          Ta table
.It Li items      Ta (items table)                 Ta number of items
.It Li lookup     Ta (lookup table expr)           Ta associated expr
.It Li keys       Ta (keys table)                  Ta stack of keys
.It Li values     Ta (values table)                Ta stack of values
.El
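.Pp
The table intrinsics might be combined as in the following sketch.  The name
"tbl" and the keys are hypothetical, and the sketch assumes that string keys
behave as they do in Munger:
.Bd -literal -offset left
# Hypothetical example: store two associations, then query the table.
(declare tbl (table))
(hash tbl "apple" 1)
(hash tbl "pear" 2)
(println (stringify (lookup tbl "apple")))
(println (stringify (items tbl)))
.Ed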
.Ss Stacks
Note that the "unshift", "push", and "store" intrinsics all return the
affected stack instead of their second arguments.
.Pp
.Bl -column -offset left "sort_numbers" "(substack stack expr expr)" "item at index expr"
.It Sy Intrinsic Ta Sy Use Ta Sy Return Value
.It Li stack   Ta (stack)                    Ta new stack
.It Li shift   Ta (shift stack)              Ta item at index 0
.It Li unshift Ta (unshift stack expr)       Ta stack
.It Li push    Ta (push stack expr)          Ta stack
.It Li pop     Ta (pop stack)                Ta item at top of stack
.It Li index   Ta (index stack expr)         Ta item at index expr
.It Li store   Ta (store stack fixnum expr)  Ta stack
.It Li used    Ta (used stack)               Ta stored item count
.It Li sort_numbers Ta (sort_numbers stack)  Ta stack (sorted in situ)
.It Li sort_strings Ta (sort_strings stack)  Ta stack (sorted in situ)
.It Li stackp       Ta (stackp expr)         Ta 0 or 1
.El
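.Pp
A brief sketch of basic stack use follows.  The name "stk" is hypothetical.
Because "push" returns the affected stack, its return value is simply
discarded here:
.Bd -literal -offset left
# Hypothetical example: push two strings, then inspect the stack.
(declare stk (stack))
(push stk "a")
(push stk "b")
(println (index stk 0))
(println (stringify (used stk)))
.Ed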
.Ss Fixnums
The comparison and arithmetic functions each accept exactly two arguments,
unlike their Munger counterparts.
.Bl -column -offset left "Intrinsic" "(>= expr1 expr2)" "absolute value"
.It Sy Intrinsic Ta Sy Use Ta Sy Return Value
.It Li eq        Ta (eq expr1 expr2)  Ta 0 or 1
.It Li <         Ta (< expr1 expr2)   Ta 0 or 1
.It Li <=        Ta (<= expr1 expr2)  Ta 0 or 1
.It Li >         Ta (> expr1 expr2)   Ta 0 or 1
.It Li >=        Ta (>= expr1 expr2)  Ta 0 or 1
.It Li +         Ta (+ expr1 expr2)   Ta sum
.It Li -         Ta (- expr1 expr2)   Ta difference
.It Li *         Ta (* expr1 expr2)   Ta product
.It Li %         Ta (% expr1 expr2)   Ta remainder
.It Li /         Ta (/ expr1 expr2)   Ta quotient
.It Li abs       Ta (abs expr)        Ta absolute value
.It Li minnum    Ta (minnum)          Ta Lowest fixnum value
.It Li maxnum    Ta (maxnum)          Ta Highest fixnum value
.El
.Pp
Note that "stringify" accepts only one argument, which must evaluate to a
fixnum.
.Bl -column -offset left "stringify" "(stringify expr)" "string representation of expr"
.It Sy Intrinsic Ta Sy Use Ta Sy Return Value
.It Li stringify    Ta (stringify expr)  Ta string representation of expr
.It Li numberp      Ta (numberp expr)    Ta 0 or 1
.It Li char         Ta (char expr)       Ta one-character string
.El
.Pp
.Ss I/O
These are the general I/O functions.  Note that both "getline" and
"readchars" return 0 upon encountering EOF, and the empty string on error.
"flush" does what "flush_stdout" does in Munger.
.Bl -column -offset left "file2string" "(println expr ...)" "does not return"
.It Sy Intrinsic Ta Sy Use Ta Sy Return Value
.It Li print         Ta (print expr ...)        Ta 1
.It Li println       Ta (println expr ...)      Ta 1
.It Li flush         Ta (flush)                 Ta fixnum
.It Li die           Ta (die ...)               Ta does not return
.It Li warn          Ta (warn expr ...)         Ta 1
.It Li getline       Ta (getline)               Ta string or 0
.It Li readchars     Ta (readchars expr)        Ta string or 0
.It Li file2string   Ta (file2string expr)      Ta string or 0
.El
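.Pp
Because both 0 and the empty string are false values, a filter can stop
reading on either EOF or error with a single test.  The following sketch
copies standard input to standard output.  The name "copy" is hypothetical,
and the sketch assumes that, as in Munger, a line returned by "getline"
keeps its terminating newline:
.Bd -literal -offset left
# Hypothetical example: echo each input line until EOF or error.
(declare copy 0)
(bind copy
   (lambda (line)
      (when line
         (print line)
         (copy (getline)))))
(copy (getline))
.Ed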
.Pp
These are the intrinsics redirecting the standard descriptors onto files
and processes.  These functions return 1 upon success, or a string describing
an error condition.
.Bl -column -offset left "with_output_file_append" "(with_output_file_append file expr ...)"
.It Sy Intrinsic Ta Sy Use
.It Li pipe                       Ta (pipe desc program)
.It Li with_input_process         Ta (with_input_process program expr ...)
.It Li with_output_process        Ta (with_output_process program expr ...)
.It Li redirect                   Ta (redirect desc file append)
.It Li with_input_file            Ta (with_input_file file expr ...)
.It Li with_output_file           Ta (with_output_file file expr ...)
.It Li with_output_file_append    Ta (with_output_file_append file expr ...)
.It Li resume                     Ta (resume desc)
.El
.Pp
.Ss System-Related
"random" returns a fixnum in the range of 0 to one less than its argument.
The "time" intrinsic returns a string representation of the UNIX time value,
padded with leading zeros to sixteen characters, so that time values may be
compared with each other using "strcmp".  The "stat" intrinsic returns a
five-element stack of strings:  owner name or uid, group name or gid,
time of last access, time of last modification, and size, with the time values
formatted similarly to those returned by "time".  The "date" intrinsic returns a
textual representation of the current date and time.
.Bl -column -offset left "directory" "(getenv str str)" "stack of filenames"
.It Sy Intrinsic Ta Sy Use Ta Sy Return Value
.It Li basename      Ta (basename path)      Ta string
.It Li dirname       Ta (dirname path)       Ta string
.It Li rootname      Ta (rootname path)      Ta string
.It Li suffix        Ta (suffix path)        Ta string
.It Li directory     Ta (directory expr)     Ta stack of filenames
.It Li rename        Ta (rename from to)     Ta 0 or error string
.It Li remove        Ta (remove expr)        Ta 0 or error string
.It Li rmdir         Ta (rmdir expr)         Ta 0 or error string
.It Li stat          Ta (stat expr)          Ta stack or error string
.It Li getenv        Ta (getenv string)      Ta string or 0
.It Li random        Ta (random expr)        Ta fixnum
.It Li time          Ta (time)               Ta string
.It Li date          Ta (date)               Ta string
.El
.Ss Command-Line Args
These function identically to their Munger counterparts.
.Bl -column -offset left "previous" "(previous)" "0 or string"
.It Sy Intrinsic Ta Sy Use Ta Sy Return Value
.It Li next      Ta (next)      Ta 0 or string
.It Li previous  Ta (previous)  Ta 0 or string
.It Li current   Ta (current)   Ta string
.It Li rewind    Ta (rewind)    Ta string
.El
.Ss Strings
The MM "split" cannot explode a string into one-character strings, as the
Munger "split" can.  The "explode" intrinsic does this instead.  A version
of "join" that works on stacks of strings is in "stringstack.mm".
.Bl -column -offset left "expand_tabs" "(substring string expr1 expr2)" "stack of strings"
.It Sy Intrinsic Ta Sy Use Ta Sy Return Value
.It Li chop      Ta (chop expr)                    Ta string
.It Li chomp     Ta (chomp expr)                   Ta string
.It Li length    Ta (length expr)                  Ta fixnum
.It Li digitize  Ta (digitize expr)                Ta fixnum
.It Li code      Ta (code expr)                    Ta fixnum
.It Li explode   Ta (explode expr)                 Ta stack of strings
.It Li stringp   Ta (stringp expr)                 Ta 0 or 1
.It Li join      Ta (join delim expr ...)          Ta string
.It Li split     Ta (split delims string [limit])  Ta stack of strings
.It Li concat    Ta (concat expr1 expr2 ...)       Ta string
.It Li substring Ta (substring string expr1 expr2) Ta string
.It Li strcmp    Ta (strcmp expr1 expr2)           Ta fixnum
.It Li expand_tabs Ta (expand_tabs expr1 string)   Ta string
.El
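.Pp
A short sketch of splitting a string and reassembling parts of it follows.
The name "fields" and the sample data are hypothetical:
.Bd -literal -offset left
# Hypothetical example: split on colons, then join two of the fields.
(declare fields (split ":" "root:x:0"))
(println (concat (index fields 0) "-" (index fields 2)))
.Ed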
.Ss SQLite
These functions provide the interface to the SQLite library.  Only one
database file may be open at any one time.  The database handle is managed
internally by the runtime engine.  Column data is returned as a stack of
strings.
.Bl -column -offset left "sqlite_finalize" "(sqlite_bind expr expr expr)" "sql object or string"
.It Sy Intrinsic       Ta Sy Use                       Ta Sy Return Value
.It Li sqlite_open     Ta (sqlite_open expr)           Ta error string or 1
.It Li sqlite_close    Ta (sqlite_close)               Ta 0 or 1
.It Li sqlite_exec     Ta (sqlite_exec expr)           Ta stack or string
.It Li sqlite_prepare  Ta (sqlite_prepare expr)        Ta sql object or string
.It Li sqlp            Ta (sqlp expr)                  Ta 0 or 1
.It Li sqlite_bind     Ta (sqlite_bind expr expr expr) Ta 1 or string
.It Li sqlite_step     Ta (sqlite_step expr)           Ta 0, 1 or string
.It Li sqlite_row      Ta (sqlite_row expr)            Ta stack or string
.It Li sqlite_reset    Ta (sqlite_reset expr)          Ta 1 or string
.It Li sqlite_finalize Ta (sqlite_finalize expr)       Ta 1 or string
.El
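.Pp
The following sketch opens a database, runs a statement, and closes the
database, checking for the error strings listed in the table above.  The
file name "demo.db" and the variable names are hypothetical.  The layout of
the stack returned by "sqlite_exec" is not described here, so the sketch
only reports how many items it holds:
.Bd -literal -offset left
# Hypothetical example: open, query, close.
(declare status (sqlite_open "demo.db"))
(when (stringp status) (die status))
(declare rows (sqlite_exec "SELECT 1"))
(if (stringp rows)
   (warn rows)
   (println (stringify (used rows))))
(sqlite_close)
.Ed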
.Ss Terminal Colors
The foreground and background colors can be modified with the following
functions.
.Bl -column -offset left "bg_magenta" "(bg_magenta)" "1"
.It Sy Intrinsic Ta Sy Use Ta Sy Return Value
.It Li black      Ta (black)      Ta 1
.It Li white      Ta (white)      Ta 1
.It Li red        Ta (red)        Ta 1
.It Li green      Ta (green)      Ta 1
.It Li yellow     Ta (yellow)     Ta 1
.It Li blue       Ta (blue)       Ta 1
.It Li magenta    Ta (magenta)    Ta 1
.It Li cyan       Ta (cyan)       Ta 1
.It Li bg_black   Ta (bg_black)   Ta 1
.It Li bg_white   Ta (bg_white)   Ta 1
.It Li bg_red     Ta (bg_red)     Ta 1
.It Li bg_green   Ta (bg_green)   Ta 1
.It Li bg_yellow  Ta (bg_yellow)  Ta 1
.It Li bg_blue    Ta (bg_blue)    Ta 1
.It Li bg_magenta Ta (bg_magenta) Ta 1
.It Li bg_cyan    Ta (bg_cyan)    Ta 1
.El
.Sh AUTHORS
.An James Bailie Aq jimmy@mammothcheese.ca
.br
http://www.mammothcheese.ca
