Crate assembler

Assembler for the TX-2.

Accepts symbolic assembly source code for the TX-2. Follows the syntax and behaviour of the TX-2’s M4 assembler written by Larry Roberts.

There are a number of differences between this assembler and M4. These are:

  • This assembler’s input is a Unicode text file, while M4’s input was a tape (or magnetic tape input) containing Lincoln Writer codes. This affects how the input is represented, partly because the Lincoln Writer supported some superscript and subscript characters which are not in Unicode.
  • M4’s input includes editing commands; these are not needed on modern systems, since they support files and editors which are independent of the assembler.
  • M4 produced output directly on a paper tape; this assembler produces a tape image file (containing the same data).
  • Some features, notably support for local tag scope inside macro bodies, are not yet implemented.
  • Tab characters are not supported in the input.
  • This assembler produces diagnostics for some kinds of erroneous input that M4 accepted (see below).

§Diagnostics

This assembler produces diagnostics giving the line number (and often, the approximate column number) of errors.
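As an illustration of how a position in the input can be turned into a line and column for a diagnostic, here is a minimal sketch. The function `line_and_column` is hypothetical and is not part of this crate's API; it assumes positions are tracked as byte offsets into the source text, which may not match how the `span` module actually works.

```rust
/// Convert a byte offset into a 1-based (line, column) pair.
///
/// Hypothetical helper for illustration only; not this crate's API.
/// The column counts Unicode scalar values, not display width, so it
/// is only approximate for combining or wide characters.
fn line_and_column(source: &str, offset: usize) -> (usize, usize) {
    // Clamp so that an offset past the end reports the end of input.
    let offset = offset.min(source.len());
    let mut line = 1;
    let mut column = 1;
    for (pos, ch) in source.char_indices() {
        if pos >= offset {
            break;
        }
        if ch == '\n' {
            // A newline ends the current line and resets the column.
            line += 1;
            column = 1;
        } else {
            column += 1;
        }
    }
    (line, column)
}
```

For example, `line_and_column("AA\nBB C\n", 5)` yields `(2, 3)`: the offset points at the third character of the second line.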

The M4 assembler accepted circular symbol definitions (for origins for example, see XXX), but this assembler detects them and generates an error message.
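One common way to detect a circular definition is a depth-first search over the "refers to" relation between symbols. The sketch below shows that general technique only; it is not taken from this crate. The `Definitions` type and the functions `find_cycle` and `visit` are hypothetical names, and the model (each symbol maps to the names its definition mentions) is an assumption made for the example.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical model: each symbol name maps to the names that its
/// definition refers to. Not this crate's actual symbol table.
type Definitions = BTreeMap<String, Vec<String>>;

/// Return the chain of names forming a cycle, if any definition is circular.
fn find_cycle(defs: &Definitions) -> Option<Vec<String>> {
    // Symbols already fully explored and known not to lead to a cycle.
    let mut finished: BTreeSet<String> = BTreeSet::new();
    for start in defs.keys() {
        let mut path: Vec<String> = Vec::new();
        if let Some(cycle) = visit(start, defs, &mut path, &mut finished) {
            return Some(cycle);
        }
    }
    None
}

/// Depth-first search; `path` holds the symbols currently being expanded.
fn visit(
    name: &str,
    defs: &Definitions,
    path: &mut Vec<String>,
    finished: &mut BTreeSet<String>,
) -> Option<Vec<String>> {
    if finished.contains(name) {
        return None;
    }
    // Meeting a name that is already on the path means we have looped.
    if let Some(pos) = path.iter().position(|n| n.as_str() == name) {
        let mut cycle = path[pos..].to_vec();
        cycle.push(name.to_string());
        return Some(cycle);
    }
    path.push(name.to_string());
    // Names with no entry in `defs` are treated as having no dependencies.
    if let Some(deps) = defs.get(name) {
        for dep in deps {
            if let Some(cycle) = visit(dep, defs, path, finished) {
                return Some(cycle);
            }
        }
    }
    path.pop();
    finished.insert(name.to_string());
    None
}
```

Returning the whole chain, rather than a boolean, lets an error message name every symbol involved in the loop.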

There may be some cases in which M4 generated an error but where we either do not, or lack a test ensuring that the error message is correctly generated; https://github.com/TX-2/TX-2-simulator/issues/159 enumerates these.

Modules§

ast 🔒
Abstract syntax representation. It’s mostly not actually a tree.
collections 🔒
Containers, including OneOrMore.
directive 🔒
Intermediate representation of the assembly language input.
driver 🔒
Invoke the various passes of the assembler.
eval 🔒
Turn the symbolic program into a sequence of words in binary.
glyph 🔒
Implement the @...@ constructs in the source code.
lexer 🔒
Turn text input into a sequence of tokens.
listing 🔒
Emit assembly language listing, with a symbol table.
manuscript 🔒
The immediate output of the parsing process.
memorymap 🔒
Decide final position of blocks of code and allocate RC-words.
parser 🔒
Turn a sequence of input tokens into a data structure representing it.
readerleader 🔒
The first part of the binary program output.
source 🔒
Representation of the original input.
span 🔒
Representation of locations within the original source file.
state 🔒
Represents the state of the source code parser.
symbol 🔒
Symbol names and information about how these are used in the program.
symtab 🔒
Explicit and implicit symbol definitions.
types 🔒
Fundamental types and errors.

Structs§

Binary
The assembled program; a sequence of BinaryChunk instances with an optional entry point.
BinaryChunk
A contiguous sequence of words at some starting address.
OutputOptions
Indicates what kind of output the user wants.

Enums§

AssemblerFailure
A failure to read the source and emit a binary.
DirectiveMetaCommand
Represents the meta commands which are still relevant in the directive. Excludes things like the PUNCH meta command.

Functions§

assemble_file
Assemble input file, producing a tape image.
reader_leader
Returns the standard reader leader.
write_user_program
Write the user’s program as a tape image file.