SNOBOL (StriNg Oriented and symBOlic Language) is a generic name for a family of programming languages designed for the manipulation of strings. Among its features are symbolic naming of strings and pattern matching, making it especially suitable for text analysis. SNOBOL was developed in the 1960s at AT&T Bell Laboratories by David Farber, Ralph Griswold, and Ivan Polonsky.
The implementation of SNOBOL available on TWENEX.ORG is SITBOL20, which was developed at Stevens Institute of Technology. Documentation for SITBOL20 is not readily available. Nonetheless, SITBOL20 appears to be largely consistent with generic SNOBOL4; the most notable differences concern I/O. This tutorial focuses on those aspects of SITBOL20 that appear particular to writing and running SNOBOL programs on TWENEX.ORG, especially I/O. Those interested in learning more about SNOBOL should consider one of the works listed at the bottom of this document.
To start SNOBOL, type “snobol” at the EXEC prompt.
At this point, SNOBOL is expecting a file-name or device-name for input. For example,
SITBOL>hello_world.sno
or
SITBOL>TTY:
In the first case, SNOBOL will attempt to read and run the code in “hello_world.sno”. In the second case, SNOBOL will read from the terminal device “TTY:” (available devices are listed below). SNOBOL will continue to take input from “TTY:” until the reserved word “END” is found starting in the first column, at which point it will attempt to run what you have entered.
Here is an example of an interactive session in SITBOL20 at TWENEX.ORG.
@snobol
SITBOL>TTY:
* This is a comment.
        OUTPUT = "Hello, World!"
END
Hello, World!
SITBOL>/EXIT
EXIT
@
Note that comments begin in the first column and are prefixed with an asterisk. Although not shown, line continuation is indicated by a plus-sign in the first column. Likewise, labels begin in the first column. All other instructions should begin after the first column. By convention, instructions start at column 9 and GOTOs start at column 56 (it is not a rule; just try to be consistent).
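The layout rules above can be seen together in a short sketch. It assumes standard SNOBOL4 behavior (the LT() predicate, success GOTOs, and integer-to-string conversion on concatenation) and has not been verified against SITBOL20; the label LOOP and the variable N are only illustrative names.

```snobol
* Comment lines start with an asterisk in column 1.
* LOOP below is a label, so it also starts in column 1.
        N = 0
LOOP    N = N + 1
* The plus-sign in column 1 continues the previous statement.
        OUTPUT = "Pass number "
+                N
* The GOTO field sits far to the right; :S() branches on success.
        LT(N, 3)                                       :S(LOOP)
END
```

Under standard SNOBOL4 semantics this should print three lines, "Pass number 1" through "Pass number 3", and then stop when LT(N, 3) fails and control falls through to END.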
To leave SNOBOL, type “/EXIT” at the SITBOL> prompt, e.g.,
SITBOL>/EXIT
EXIT
@
You can also have SNOBOL load your program and begin running it by including the file-name of your program on the command line; e.g.,
SNOBOL has a few predefined variables reserved for I/O. For example, assignment from INPUT or to OUTPUT will read or write, respectively, to the default I/O device (e.g., TTY:).
* Simple SNOBOL I/O example.
        OUTPUT = "What is your name?"
        LINE_IN = INPUT
        OUTPUT = "Hello, " LINE_IN "."
END
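Reading more than one line follows the same idea. The sketch below relies on a general SNOBOL4 convention that is assumed, not confirmed, to hold in SITBOL20: assignment from INPUT fails at end of file, and that failure can drive a GOTO. The labels LOOP and DONE are illustrative.

```snobol
* Echo each line of input until end of file.
* Assumes INPUT fails at end of file, as in generic SNOBOL4.
LOOP    LINE_IN = INPUT                                :F(DONE)
        OUTPUT = "You typed: " LINE_IN                 :(LOOP)
DONE    OUTPUT = "End of input."
END
```

How end of file is signaled depends on the device in use, so on TTY: this loop may need an explicit end-of-file keystroke to finish.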
SNOBOL provides for greater control over I/O with the INPUT() and OUTPUT() functions. Parameters to these functions allow for association of names to input and output devices and files, as well as specification of I/O mode and format. Unfortunately, requirements for these parameters are not consistent across the various SNOBOL implementations, and, as noted, documentation for the version on TWENEX.ORG does not appear to be available at this time. The descriptions that follow are based largely on various SNOBOL texts, and trial and error.
Historically, I/O in SNOBOL was handled by FORTRAN IV I/O routines. This is apparent in the parameters to the INPUT() and OUTPUT() functions of some implementations of SNOBOL (not necessarily that at TWENEX.ORG). For example, in some implementations, the OUTPUT() function takes the form
OUTPUT('name', number, '(format)')
which associates a name with a device reference number, and specifies the output format. The format option is a string in FORTRAN IV style. For example,
OUTPUT('OUTPUT', 6, '(1X,27A5)')
Likewise, the INPUT() function takes the form
INPUT('name', number, length)
which associates a name with a device reference number, and specifies that the resulting string is to have a given length. For example,
INPUT('INPUT', 5, 80)
These descriptions and examples of the INPUT() and OUTPUT() functions are from the “PDP-10 SNOBOL4 User's Guide.” However, the version of SNOBOL at TWENEX.ORG seems to use a more contemporary specification, albeit with PDP-10 device names. The SITBOL20 INPUT() and OUTPUT() functions appear to take the forms:
INPUT('name', 'device/file', 'format option')
and
OUTPUT('name', 'device/file', 'format option')
The following example shows how to associate a name with an I/O device (or file) and specify format options in SITBOL20 at TWENEX.ORG.
* Example of INPUT() and OUTPUT() parameters.
* The 'T' format option will prevent a line-feed from
* being appended to output.
        OUTPUT('Screen', 'TTY:', 'T')
        INPUT('Keyboard', 'TTY:')
        Screen = "What is your name? "
        LINE_IN = Keyboard
        Screen = "Hello, " LINE_IN "."
END
Notice that in the above example, output was explicitly routed to the TTY: device and line-feeds were suppressed with the format option. Likewise, input was explicitly taken from the TTY: device. Device names, as listed in the “PDP-10 SNOBOL4 User's Guide,” are:
No definitive reference has been identified that describes format options for the SITBOL20 INPUT() and OUTPUT() functions; however, they seem to correspond generally to those of CSNOBOL4 (snobol4 on SDF), but limited to one character. These are:
Griswold, Ralph E., J. F. Poage, and I. P. Polonsky. The Snobol4 Programming Language. Second Edition. Englewood Cliffs, NJ: Prentice Hall, c1971. (AKA, The Green Book).
Hockey, Susan M. Snobol Programming for the Humanities. New York: Clarendon Press; Oxford: Oxford University Press, 1985.
SNOBOL4.ORG. SNOBOL4 Resources.
Wade, L. P. PDP-10 SNOBOL4 User's Guide. October 17, 1970.