Hardware: Context
Wouldn't it be nice to be able to ship a run-anywhere stack-based VM like Java?
Well, yes and no. When Inferno migrated the Plan 9 codebase to a stack-based VM, it lost most of its community, who stayed behind on the C-based version. In the modern world, intermediate VMs are more often regarded as necessary evils and shortcuts than something desirable. (See also: the steady migration of LLVM away from its namesake.)
Loper OS, the archetypal hobbyist daydream project that never gets anywhere concrete, proposed a return to the high-level architectures used by Lisp machines. The last time Stanislav posted about that project, he was working on learning how to implement his vision in FPGA, without even furnishing his audience with a specification for a CPU architecture. It's hard not to see such a drastic change of plans as an excuse to avoid actually producing any code. There are several excellent papers that describe very elegant and compact Lisp machines, including one literally called SIMPLE, and the detailed manuals necessary to emulate the Symbolics machines are available. Perhaps this way lies madness?
What about real-world CPU architectures? We're currently waking up to an ARM-dominated world, and the next revolution after that will probably be RISC-V... but the majority of desktop machines are still very much rooted in x86-64. There is an obvious 'out' here: maybe it's better to hedge our bets for now and just develop a userland that compiles to C until the project can invest in serious assembly generation backends.
Still, there is an inescapable desire to truly pry up the floorboards and make something self-hosting. I've spent some time working through OSDev tutorials for Intel CPUs lately, and my main takeaway was that many of the recommended resources for development are horribly organized, outdated, or—worse—closed. So increasingly I've become interested in building an emulator for an idealized, clean architecture, on the premise that it could reasonably be ported to other machines later. Staring at Previous for many hours may have contributed to this.
The CHASM Workstation Architecture
The current design for a target platform for LETHE is called CHASM. The exact meaning of the acronym changes on a regular basis, but one might parse it as a 'container for a hybrid-architecture simulated microkernel.'
The mostly-complete spec, covering the emulation-relevant portions of CPU and hardware functionality, can be found here. Enough of it is documented already to support a basic networked workstation.
Basic attributes:
- Memory Model: Flat with paging
- Addressing: All memory addresses indicate the position of a 16-bit word; 64-bit address space
- Page Size: Chosen as part of page table setup; anywhere from 256 words to 32768 words (512 bytes to 65536 bytes)
- Registers: 8 data + 8 cdr (address) + 7 broadly-accessible system registers + 1 status register (all 64-bit)
- Endianness: Strictly big-endian, with swap instructions for fixing values generated by those who have strayed from the path
- Opcode mnemonics: 644 (mostly arithmetic operations with varying addressing modes and datatypes)
- Instruction Size: 1 word (16 bits), plus 1, 2, or 4-word 'trailing literal' values
- Datatypes:
  - Unsigned 64-bit integer
  - Signed 64-bit integer (two's complement)
  - Byte vector (8x unsigned 8-bit integer)
  - Double-precision IEEE 754 floating point (binary64)
  - Length-prefixed strings (64-bit length, 8-bit characters)
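To make the addressing rules above concrete, here is a minimal sketch of how an emulator might translate CHASM word addresses into offsets in a host byte array. Every name in it (the chasm_* functions, the page_shift parameter) is hypothetical and does not come from the spec. It also assumes page sizes are powers of two, which the 256-to-32768-word range suggests but which is not stated above.

```c
#include <stdint.h>

/* Hypothetical helpers; none of these names come from the CHASM spec. */

/* A CHASM address counts 16-bit words, so the host byte offset is doubled. */
static inline uint64_t chasm_word_to_byte_offset(uint64_t word_addr) {
    return word_addr * 2u;
}

/* Read one big-endian 16-bit word from a host byte array. */
static inline uint16_t chasm_read_word(const uint8_t *mem, uint64_t word_addr) {
    uint64_t off = chasm_word_to_byte_offset(word_addr);
    return (uint16_t)((mem[off] << 8) | mem[off + 1]);
}

/* Read a 64-bit value stored as four consecutive big-endian words. */
static inline uint64_t chasm_read_u64(const uint8_t *mem, uint64_t word_addr) {
    uint64_t value = 0;
    for (unsigned i = 0; i < 4; i++) {
        value = (value << 16) | chasm_read_word(mem, word_addr + i);
    }
    return value;
}

/* Assumption: page sizes are powers of two, so a page of 256..32768 words
   corresponds to a shift of 8..15 bits. */
static inline uint64_t chasm_page_number(uint64_t word_addr, unsigned page_shift) {
    return word_addr >> page_shift;
}

static inline uint64_t chasm_page_offset(uint64_t word_addr, unsigned page_shift) {
    return word_addr & ((UINT64_C(1) << page_shift) - 1);
}

/* Page size in bytes: words per page times two bytes per word. */
static inline uint64_t chasm_page_size_bytes(unsigned page_shift) {
    return (UINT64_C(1) << page_shift) * 2u;
}
```

With a shift of 8 the page is 256 words (512 bytes), and with a shift of 15 it is 32768 words (65536 bytes), matching the range quoted in the list.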
CHASM is a RISC-adjacent CPU. It violates the principle of minimality by providing several domain-specific datatypes, including floating-point and string support. Note that the trailing literal paradigm (typical of most CPUs) uses the same number of clock cycles per word as using LUI and LW to load a 32-bit value in MIPS. It's also easier to optimize: the current emulator loads these trailers and advances the program counter immediately, rather than waiting for the instruction bus window to advance as real hardware would.
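The "load the trailer and advance the program counter immediately" behaviour might look roughly like the sketch below. This is an illustration under stated assumptions, not the emulator's actual code: the chasm_cpu struct and both function names are hypothetical, memory is modelled as a host array of already-decoded 16-bit words, and the way an opcode encodes its literal length is not modelled (the caller simply passes in 1, 2, or 4).

```c
#include <stdint.h>

/* Hypothetical emulator state: memory as an array of 16-bit words and a
   program counter that counts words, as CHASM addresses do. */
typedef struct {
    const uint16_t *words;
    uint64_t pc;
} chasm_cpu;

/* Fetch the one-word instruction and step past it. */
static inline uint16_t chasm_fetch_opcode(chasm_cpu *cpu) {
    return cpu->words[cpu->pc++];
}

/* Fetch a trailing literal of n_words (1, 2, or 4) words, most significant
   word first, and advance the PC past the whole literal in one step. */
static inline uint64_t chasm_fetch_literal(chasm_cpu *cpu, unsigned n_words) {
    uint64_t value = 0;
    for (unsigned i = 0; i < n_words; i++) {
        value = (value << 16) | cpu->words[cpu->pc + i];
    }
    cpu->pc += n_words;
    return value;
}
```

Because the literal is consumed in a single call, the decode loop never has to treat trailer words as instructions in their own right.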
Unusual features:
- Interrupts are used for two-way communication with devices
- Most operations use x86-style semantics
- Compare-and-swap instruction for multi-CPU concurrency
- Direct CPU support for switching into, out of, or between page tables during interrupts, allowing non-kernel tasks to truly own devices
- Memory-mapped devices table with plug-and-play device detection
- Block-copy and block-fill operations
- Opcodes for repeating the previous instruction with auto-incrementing registers
- Support for 64-bit 'cdr' registers that hold address values, for optimization of Lisp-family languages
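For readers unfamiliar with compare-and-swap, the sketch below shows the general semantics using the host's C11 atomics. It says nothing about how the CHASM instruction is encoded or what its operands are; the function names are hypothetical, and the code only illustrates the primitive that such an instruction typically provides.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Compare-and-swap: if *target equals expected, store desired and return
   true; otherwise leave *target untouched and return false. */
static inline bool cas_u64(_Atomic uint64_t *target, uint64_t expected,
                           uint64_t desired) {
    return atomic_compare_exchange_strong(target, &expected, desired);
}

/* A lock-free add built from CAS: retry until no other CPU has changed the
   value between our read and our write. Returns the new value. */
static inline uint64_t atomic_add_via_cas(_Atomic uint64_t *target,
                                          uint64_t delta) {
    uint64_t old = atomic_load(target);
    while (!atomic_compare_exchange_weak(target, &old, old + delta)) {
        /* On failure, old has been refreshed with the current value. */
    }
    return old + delta;
}
```

The retry loop is the usual pattern for building higher-level multi-CPU primitives (counters, locks, queue links) on top of a single CAS operation.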
As with other LETHE support projects, the CHASM emulator is being written in C and C++, targeting MinGW64. It uses SDL3 for graphics and input. A source code release is not yet available.
The CHAS Assembler
The CHASM spec includes a complete description of the assembly language used for the CHASM CPU, called CHAS (CHasm ASsembler). The source code for a rudimentary implementation of CHAS can be obtained at https://github.com/rhetorica/chas.
Background: Register machines, stack machines, and high-level architectures
There are three major paradigms for how a CPU operates:
- A register machine has arbitrary access to several variables as working storage, called registers. Such machines descend from the earliest digital computers, which used similar designs prior to the (re)invention of stored programs and addressable memory. The vast majority of physical CPU architectures work this way.
- A stack machine operates like the FORTH programming language or a Reverse Polish Notation calculator: CPU instructions pop as many arguments as they need off the stack, and push back return values. Stack machines are very difficult to directly program for complex tasks as the programmer must think about how to organize variables to be available when needed. They have the advantage of being easier to design and simulate. However, they are also intrinsically at odds with how cache hierarchies work in register machines—which also have stacks but only use them when necessary. Most language-specific virtual machines, like the JVM, are stack machines. A number of physical stack machine CPUs existed in the 1970s and 1980s, which were usually also high-level architectures.
- A high-level architecture is a CPU designed to mimic the semantics of a specific programming language. These machines often have tagged memory (extra bits attached to every word of RAM that describe what that word contains) and very large opcode tables. They may or may not have programmer-accessible registers. HLAs sometimes expose explicit prefetch predicates as part of their instruction set, allowing the low-level programmer to explicitly direct the CPU when to attempt speculative caching. If the architecture includes memory tagging, this can also be used very effectively to identify what data is a pointer and what simply looks like a pointer, saving the CPU the trouble of guessing (and potentially being wrong). There are only a handful of physical HLA systems: the ill-fated iAPX 432 was designed for Ada; the MIT Lisp machines had half a dozen descendants across 3+ companies (Symbolics, Lisp Machines Inc., and Texas Instruments); there were experiments with Prolog CPUs; and Burroughs shipped several mainframes designed to handle COBOL and ALGOL workloads.
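The difference between the first two paradigms is easiest to see by running the same computation, (2 + 3) * 4, on a toy version of each. Both interpreters below are hypothetical teaching sketches with invented opcodes; neither reflects CHASM, the JVM, or any real instruction set.

```c
#include <stdint.h>

/* Toy stack machine: operands are implicit, taken from the top of the stack. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

static inline int64_t run_stack(const int64_t *code) {
    int64_t stack[16];
    int sp = 0;
    for (int pc = 0;;) {
        switch (code[pc++]) {
        case OP_PUSH: stack[sp++] = code[pc++]; break;           /* push literal */
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; break;    /* pop 2, push sum */
        case OP_MUL:  sp--; stack[sp - 1] *= stack[sp]; break;    /* pop 2, push product */
        default:      return stack[sp - 1];                       /* OP_HALT */
        }
    }
}

/* Toy register machine: every instruction names the registers it uses. */
enum { R_LOADI, R_ADD, R_MUL, R_HALT };

typedef struct {
    int op;       /* one of the R_* opcodes */
    int dst;      /* destination register */
    int src;      /* source register (ignored by R_LOADI and R_HALT) */
    int64_t imm;  /* immediate value (used only by R_LOADI) */
} reg_insn;

static inline int64_t run_reg(const reg_insn *code) {
    int64_t r[8] = {0};
    for (int pc = 0;; pc++) {
        const reg_insn *i = &code[pc];
        switch (i->op) {
        case R_LOADI: r[i->dst] = i->imm; break;
        case R_ADD:   r[i->dst] += r[i->src]; break;
        case R_MUL:   r[i->dst] *= r[i->src]; break;
        default:      return r[i->dst];                           /* R_HALT */
        }
    }
}
```

The stack program is PUSH 2, PUSH 3, ADD, PUSH 4, MUL, HALT, and its instructions never name a storage location. The register program has to say where every value lives (LOADI r0, 2; LOADI r1, 3; ADD r0, r1; and so on), which is more to encode but gives the programmer arbitrary access to working storage.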
On the whole, HLAs have failed because of the enduring popularity of C, and there has never really been a 'general-purpose' HLA, excepting perhaps workstations like the Xerox and Symbolics machines, which supported certain degrees of reprogramming CPU macrocode for domain-specific tasks. CPUs have been forced to work around the minimalist and chaotic nature of C, which considers type punning to be a feature, not a bug, compromising the architect's ability to make assumptions about the workload being processed. For a hot minute in 1982 it looked like the iAPX 432 might change this trajectory, but its reach exceeded its grasp, and the product that shipped was intolerably slow and buggy. (Intel would later repeat some of these mistakes with the i860 and Itanium architectures, which had some of the same designers involved.)
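The cost of untyped words can be illustrated with a small sketch that contrasts a tagged heap with a raw one. Everything here is hypothetical: the tag set, the tagged_word layout, and the idea of counting pointers as a stand-in for what a garbage collector or prefetcher would need to know. Real tagged architectures pack tags into spare bits in hardware rather than using a separate field.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tag set; real machines define their own. */
typedef enum { TAG_FIXNUM, TAG_POINTER, TAG_FLOAT } word_tag;

typedef struct {
    word_tag tag;   /* says what the payload is */
    uint64_t bits;  /* the 64-bit payload itself */
} tagged_word;

/* With tags, a scanner knows exactly which words are pointers. */
static inline size_t count_pointers_tagged(const tagged_word *heap, size_t n) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (heap[i].tag == TAG_POINTER) count++;
    }
    return count;
}

/* Without tags, a scanner can only guess: any word whose value falls inside
   the heap's address range [lo, hi) has to be treated as a possible pointer. */
static inline size_t count_pointers_conservative(const uint64_t *heap, size_t n,
                                                 uint64_t lo, uint64_t hi) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (heap[i] >= lo && heap[i] < hi) count++;
    }
    return count;
}
```

Given a heap holding the integer 0x1000, a pointer to 0x1008, and a float, the tagged scan finds one pointer, while the conservative scan over the range 0x1000 to 0x2000 reports two, because the integer merely looks like a pointer.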