RTS4 Data File Structure

There haven’t been any gameplay changes recently, instead I’ve been working on adding features for modding. The current proto data format automatically supports a lot of scenarios, but many common concepts can be tedious to write. The latest build includes some of those changes (and a lot of bugs).
RTS4 Release 9
Webplayer (Unity)

I’m working towards supporting advanced scripts like this RPG-style script:

Player.default {
  Level = 1;
  Experience = 0;
  NextLevelXP = $Level * 200;
  Health = 200 + $Level * 20;
  RequireComponent<AttackMelee> {
    Attack = 20 + $Level * 2;
    OnKill = (other) {
      Experience += other.$XPGain;
  Event(Experience >= NextLevelXP) {
    Level += 1;
Grunt.default {
  XPGain = 1

Several fields (NextLevelXP, Health, Attack) are augmented by other fields. By default, these fields only include the augmentations that appear prior, adding a $ causes all augmentations to be used. Using this feature does introduce the possibility of cross-dependent fields which would crash the game; I’ll need to add some protections for that later (probably just not run those augmentations recursively). An example of something that would break it is:

Health += $BonusHealth
BonusHealth = Health * 0.5

The OnKill event handles giving experience when they kill a grunt. Any augmentations that appear in these scripts are somewhat special; a single augmentation is allocated at compile time, which starts off as the identity value; whenever the script is run, the augmentation value is changed by whatever the script says. This has the benefit of allowing any other augmentations later in the file to still be applied correctly (ie. if Experience *= 2 appears in a later technology, the experience will correctly be doubled until that tech is removed).

Finally, the script creates a level up event to be run when the players experience passes the required amount. Because of the $ in earlier dependent fields, changing the level later in the file still correctly affects Health/Attack/NextLevelXP.

Currently fields referencing other fields, and simple scripts are supported. Its also possible to define “APIs” (RequireComponent and RequireTechnology are now external APIs) to provide more functionality, without bloating the code.

RTS4 XScript Engine

This blog post is going to be pretty boring, no pretty pictures, and its late, and I’m very tired while typing it, but it underpins a lot of the features and functionality available in AoM.

XScript powers a significant part of the data-driven systems in the BANG! engine. As mentioned earlier, random maps are created in this language, but the engine also uses XScripts to describe other routines, such as the start-up sequence, victory conditions, and artificial intelligence. The language and frameworks available are very object-oriented c-style, using functions to create and manipulate objects by passing their ID as the first parameter.

Three annoying issues I’ve found so far with the language is the lack of support for classes, the strong typing, and the strange scoping. Objects in the BANG! engine are created by calling the appropriate createX() function, and storing the ID returned. The object can be interacted with by passing this ID as the first parameter of other functions. Dynamic, or even non-primitive types are not supported, and blocks do not indicate scope; a new scope is only created inside functions.


The simplicity of this language makes parsing relatively easily, though there were some strange bits of behavior that had to be accounted for.

  • for loops must iterate one integer variable, this variable must not be declared and is implicitly declared by the for statement
  • it uses the strange syntax where “void” can be specified in the arguments of a function declaration when there are no arguments.

Originally I tokenised the code, then built an abstract syntax tree, ready to be passed to the compiler to select instructions, however after building this system, I realised there was a lot of data duplication; the set of valid tokens which may appear after a given token was required both in the tokenisation pass and in the AST generation. Removing data duplication is one programming rule I always stick with, as I find it significantly improves maintainability and reduces bugs in code, so the system was rewritten to do this tokenisation/AST generation in a single pass.

To achieve this in a flexible and reliable way, a list of token types is fed into the compiler (things like number, string, literal, controlFlow, type), and set of grammar rules (statement -> if | for | flow | funcDeclr…, if -> “if” & “(” & condition & “)”…, condition -> expression…). These grammar rules contain in the possible tokens which may appear after a given token, and are able to take into consideration the context of the parse (ie. ‘-‘ can be treated as a unary operator or subtraction, “if” can be treated as a control flow instruction, or a variable). The parser starts by attempting to parse the given source code as a block. This method is then called recursively to process statements in the source code bit-by-bit. Each rule used is then pushed to a list before processing child-rules, and popped off when parsing of the given rule is complete.

Generating pseudo-instructions

For each rule, the system calls a special piece of code to gather required input from the source code and add instructions to another list representing the required behavior. The instructions generated by sub-rules is passed to parent rules when they begin running their special routine, so that they can organise sub instructions (ie. push both literals before running the addition instruction) and extract required meta-information (ie. where the condition of an if statement ends, and the code begins)

Generating VM-instructions

After a list of instructions has been generated, their types must be resolved, and the appropriate VM instruction selected. This is done by creating a virtual stack, and pushing or popping types required by instructions as iterating over each instruction. Doing this is not flexible and most would consider ugly and unreliable, though for the given cases, it works well, and most of the time, offers the expected types on the stack, if the code were to be run, by abusing the fact that code is typically sequential. A set of VM instructions is looked up by name based on the current instructions name, and of the matches, the best VM instruction to match the types expected on the stack is selected.

As this process treats the instructions as if they were always executed sequentially, it has problems past forks and jump instructions, however fortunately, almost all of the time, the stack will be balanced before and after a conditional jump section.