Assembler for the TX-2.
Accepts symbolic assembly source code for the TX-2. Follows the syntax and behaviour of the TX-2’s M4 assembler written by Larry Roberts.
There are a number of differences between this assembler and M4. These are:
- This assembler’s input is a Unicode text file, while M4’s input was a tape (or magnetic tape input) containing Lincoln Writer codes. This affects how the input is represented, partly because the Lincoln Writer supported some superscript and subscript characters which are not in Unicode.
- M4’s input includes editing commands; these are not needed on modern systems, since they support files and editors which are independent of the assembler.
- M4 produced output directly on a paper tape; this assembler produces a tape image file (containing the same data).
- Some features, notably support for local tag scope inside macro bodies, are not yet implemented.
- Tab characters are not supported in the input.
- This assembler produces diagnostics for some kinds of erroneous input that M4 accepted (see below).
§Diagnostics
This assembler produces diagnostics giving the line number (and often, the approximate column number) of errors.
The M4 assembler accepted circular symbol definitions (for origins, for example; see XXX), but this assembler detects them and generates an error message.
There may be some cases in which M4 generated an error but where we either do not, or lack a test ensuring that the error message is correctly generated; https://github.com/TX-2/TX-2-simulator/issues/159 enumerates these.
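The circular-definition check mentioned above can be illustrated with a small sketch. This is not the assembler's actual implementation: the `Definitions` type and the `find_cycle` function are hypothetical names introduced here, and the sketch assumes each symbol's definition can be reduced to the list of symbols it refers to. It uses a plain depth-first search and reports the chain of names that forms the cycle.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical model (an assumption, not the assembler's real data
/// structure): each symbol maps to the symbols its definition refers to.
type Definitions = BTreeMap<String, Vec<String>>;

/// Depth-first search from `start`. Returns the chain of symbol names that
/// forms a cycle (first name repeated at the end), or `None` if every
/// definition reachable from `start` is free of cycles.
fn find_cycle(defs: &Definitions, start: &str) -> Option<Vec<String>> {
    fn visit(
        defs: &Definitions,
        name: &str,
        path: &mut Vec<String>,
        done: &mut BTreeSet<String>,
    ) -> Option<Vec<String>> {
        // Seeing a name that is already on the current path means a cycle.
        if let Some(pos) = path.iter().position(|n| n.as_str() == name) {
            let mut cycle = path[pos..].to_vec();
            cycle.push(name.to_string());
            return Some(cycle);
        }
        // Names that were fully explored earlier cannot lead to a cycle.
        if done.contains(name) {
            return None;
        }
        path.push(name.to_string());
        if let Some(deps) = defs.get(name) {
            for dep in deps {
                if let Some(cycle) = visit(defs, dep, path, done) {
                    return Some(cycle);
                }
            }
        }
        path.pop();
        done.insert(name.to_string());
        None
    }
    visit(defs, start, &mut Vec::new(), &mut BTreeSet::new())
}
```

Reporting the whole chain rather than a bare yes/no answer lets a diagnostic name every symbol involved, which fits the goal of giving the user a precise error message.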
Modules§
- ast 🔒
- Abstract syntax representation. It’s mostly not actually a tree.
- collections 🔒
- Containers, including OneOrMore.
- directive 🔒
- Intermediate representation of the assembly language input.
- driver 🔒
- Invoke the various passes of the assembler.
- eval 🔒
- Turn the symbolic program into a sequence of words in binary.
- glyph 🔒
- Implement the @...@ constructs in the source code.
- lexer 🔒
- Turn text input into a sequence of tokens.
- listing 🔒
- Emit assembly language listing, with a symbol table.
- manuscript 🔒
- The immediate output of the parsing process.
- memorymap 🔒
- Decide final position of blocks of code and allocate RC-words.
- parser 🔒
- Turn a sequence of input tokens into a data structure representing it.
- readerleader 🔒
- The first part of the binary program output.
- source 🔒
- Representation of the original input.
- span 🔒
- Representation of locations within the original source file.
- state 🔒
- Represents the state of the source code parser.
- symbol 🔒
- Symbol names and information about how these are used in the program.
- symtab 🔒
- Explicit and implicit symbol definitions.
- types 🔒
- Fundamental types and errors.
Structs§
- Binary
- The assembled program; a sequence of BinaryChunk instances with an optional entry point.
- BinaryChunk
- A contiguous sequence of words at some starting address.
- OutputOptions
- Indicates what kind of output the user wants.
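As a rough illustration of the shape described for Binary and BinaryChunk, the sketch below models a program as a list of chunks plus an optional entry point. The type names `BinarySketch` and `ChunkSketch`, all field names, the helper methods, and the integer types are assumptions made for this example; they are not the crate's real definitions. A `u64` is used for each word simply because it is wide enough for a 36-bit TX-2 word.

```rust
/// Hypothetical stand-in for BinaryChunk: a contiguous run of words that
/// starts at `address`. Field names and types are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ChunkSketch {
    address: u32,
    words: Vec<u64>,
}

/// Hypothetical stand-in for Binary: a sequence of chunks with an optional
/// entry point.
#[derive(Debug, Clone, PartialEq, Eq, Default)]
struct BinarySketch {
    entry_point: Option<u32>,
    chunks: Vec<ChunkSketch>,
}

impl BinarySketch {
    /// Total number of words across all chunks.
    fn total_words(&self) -> usize {
        self.chunks.iter().map(|c| c.words.len()).sum()
    }

    /// Look up the word stored at `address`, if any chunk covers it.
    fn word_at(&self, address: u32) -> Option<u64> {
        self.chunks.iter().find_map(|c| {
            // Addresses below the chunk's start cannot be inside it.
            let offset = address.checked_sub(c.address)? as usize;
            c.words.get(offset).copied()
        })
    }
}
```

Keeping chunks separate, instead of one flat memory image, means gaps between blocks of code cost nothing to represent.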
Enums§
- AssemblerFailure
- A failure to read the source and emit a binary.
- DirectiveMetaCommand
- Represents the meta commands which are still relevant in the directive. Excludes things like the PUNCH meta command.
Functions§
- assemble_file
- Assemble input file, producing a tape image.
- reader_leader
- Returns the standard reader leader.
- write_user_program
- Write the user’s program as a tape image file.