Tuesday, August 18, 2009

Rogue 3.6 Investigation Part I: Data Structures

Table of Contents:

Roadmap

We can find the source code to Rogue 3.6 here (or more precisely here), and we will study it first. It is the earliest example of a roguelike that I could find, after all!

This is part I of a several part series investigating the structure and code behind the Rogue 3.6 program. We will consider the data structures used in the game.

License

I think I need to follow the License and reproduce it here

Rogue: Exploring the Dungeons of Doom
Copyright (C) 1980, 1981 Michael Toy, Ken Arnold and Glenn Wichman
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.
3. Neither the name(s) of the author(s) nor the names of other contributors
   may be used to endorse or promote products derived from this software
   without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) AND CONTRIBUTORS ``AS IS'' AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR(S) OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
SUCH DAMAGE.

===========================================================================

Portions of this software (save/restore game state) are based on the work 
of Nicholas J. Kisseberth. Used under license:

Copyright (C) 1999, 2000, 2006 Nicholas J. Kisseberth

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.
3. Neither the name(s) of the author(s) nor the names of other contributors
   may be used to endorse or promote products derived from this software
   without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) AND CONTRIBUTORS ``AS IS'' AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR(S) OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
SUCH DAMAGE.

===========================================================================

Portions of this software (encryption) are based on the work 
of David Burren. Used under license:

FreeSec: libcrypt

Copyright (C) 1994 David Burren

All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.
3. Neither the name(s) of the author(s) nor the names of other contributors
   may be used to endorse or promote products derived from this software
   without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) AND CONTRIBUTORS ``AS IS'' AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR(S) OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
SUCH DAMAGE.
Source: /rogue3.6.3-src/LICENSE.TXT

Investigating Data Structures

We see that there are the following files in the source code:

armor.c    LICENSE.TXT  move.c         rip.c           rogue.r    weapons.c
chase.c    list.c       newlevel.c     rogue36.cat     rooms.c    wizard.c
command.c  machdep.h    options.c      rogue36.doc     save.c     xcrypt.c
daemon.c   main.c       pack.c         rogue36.html    scrolls.c
daemons.c  Makefile     passages.c     rogue36.sln     state.c
fight.c    mdport.c     potions.c      rogue36.vcproj  sticks.c
init.c     misc.c       readme36.html  rogue.6         things.c
io.c       monsters.c   rings.c        rogue.h         vers.c

As per usual in C, the header files usually contain the data structures. Fortunately, we only have a few header files to consider: machdep.h and rogue.h. We will investigate rogue.h, since machdep.h is all the boring machine dependent code.

Unfortunately, the rogue.h file is some 530 lines long. So we will investigate the data structures first (the structs), then in another post investigate the relevant methods.

The data structures start on line 268 of rogue.h (the big hint is the comment "Now we define the structures and types"). It's some 133 lines of code defining all the data structures, so lets keep a close eye on each data structure!

Help List

/*
 * Help list
 */

struct h_list {
    char h_ch;
    char *h_desc;
};

extern struct h_list helstr[];
lines 272-281 /rogue3.6.3-src/rogue.h

This defines some data structure, which has an index variable char h_ch and a corresponding string char *h_desc which describes (presumably) the key stroke.

This is stored into a global variable extern struct h_list helstr[] which consists of all the help strings. It's appropriately the help list.

Coordinate Data Type

/*
 * Coordinate data type
 */

typedef struct {
    int x;
    int y;
} coord;
lines 283-289 /rogue3.6.3-src/rogue.h

This is a simple Cartesian coordinate pair (x,y). When we work the the terminal (the console, command line, whatever you want to call it), we have a discrete sort of geometry: usually 80 characters wide and 30 characters tall.

To specify a character on the console, we need a coord to specify its location.

Magic Items

/*
 * Stuff about magic items
 */

struct magic_item {
    char mi_name[30];
    int mi_prob;
    int mi_worth;
};
lines 305-313 /rogue3.6.3-src/rogue.h

We skipped a few (e.g. the struct linked_list that are straightforward to understand) to get to something exciting: magic items!

They consist of char mi_name[30] a string of 30 characters' length, which is the name of the item.

The int mi_prob is, I think, the probability of finding the magic item. Remember in stdlib.h in C, the maximum random number is specified to be RAND_MAX=32767. I would suspect that if the random number is greater than mi_prob, then the item is created.

The last field, int mi_worth is the value (in gold pieces) of the item.

Dungeon Data Structures

Room Structure

/*
 * Room structure
 */
struct room {
    coord r_pos;                        /* Upper left corner */
    coord r_max;                        /* Size of room */
    coord r_gold;                       /* Where the gold is */
    int r_goldval;                      /* How much the gold is worth */
    int r_flags;                        /* Info about the room */
    int r_nexits;                       /* Number of exits */
    coord r_exit[4];                    /* Where the exits are */
};
lines 315-326 /rogue3.6.3-src/rogue.h

Now we are starting to define the dungeon's data structures. The first logical place to begin when building a dungeon is to build rooms.

We specify the upper left corner of the room (and the lower right corner of the room) by coord r_pos (and coord r_max respectively). This tells us the maximum size the room can be, the actual size is something randomly determined by these parameters.

We are then concerned about the contents of the room, specifically about gold and other things. The gold is specified by its position coord r_gold and the quantity int r_gold (since it's not unsigned, could there be negative gold?!).

The "other stuff" is specified by the int r_flags field.

The doorways into (and out of) the room are specified by the next (and last) two fields int r_nexits which give the number of exits, and coord r_exit[4] which gives the coordinates of the exits. We see that there can only be, at most, 4 exits.

Traps

/*
 * Array of all traps on this level
 */
struct trap {
    coord tr_pos;                       /* Where trap is */
    char tr_type;                       /* What kind of trap */
    int tr_flags;                       /* Info about trap (i.e. ISFOUND) */
};

extern struct trap  traps[MAXTRAPS];
lines 328-337 /rogue3.6.3-src/rogue.h

We begin getting to something interesting: traps! What do we need to know about a trap? Well, where do we put it, what kind is it, and has it been detected yet?

This corresponds to coord tr_pos (the position of the trap), char tr_type (the identifier for the type of trap it is), and int tr_flags (the flags giving more information about the trap).

We then collect all the struct trap into one giant list, a global variable which is the last line of code: extern struct trap traps[MAXTRAPS]. It contains the information about all of the traps on this level.

Monster and Item Data Structures

Stats

/*
 * Structure describing a fighting being
 */
struct stats {
    str_t s_str;                        /* Strength */
    long s_exp;                         /* Experience */
    int s_lvl;                         /* Level of mastery */
    int s_arm;                         /* Armor class */
    int s_hpt;                         /* Hit points */
    char s_dmg[30];                     /* String describing damage done */
};
lines 339-349 /rogue3.6.3-src/rogue.h

In most role playing games, the monsters have some set of statistics (a list of numbers indicating its strength, dexterity, etc.). Rogue is no different!

Here we have only a certain subset of the rich structure from, e.g., Dungeons and Dragons. We have str_t s_str for the strength of the creature. But wait...what is a str_t data type? It's not ANSI C!

We go back and look, and the data structure we skipped after introducing coord is precisely the culprit! We produce it here:

typedef struct {
    short st_str;
    short st_add;
} str_t;
lines 291-294 /rogue3.6.3-src/rogue.h
which consists of two shorts. Interesting that the "strength" statistic would depend on a "composite" data type!

Returning to our data structure at hand, line 344 specifies the long s_exp field. It's precisely what one expects, it specifies the "experience points" the creature has (indicating how "experienced" it is at...killing stuff...).

When the creature has accumulated certain amounts of experience, it then increments its int s_lvl field indicating that the creature has "leveled up".

The armor class, or how well protected the creature is, is another statistic of interest. It's precisely contained in the int s_arm field.

When armor fails, the creature has some stamina...some "hit points". This is the next entry int s_hpt which is the maximum hit points the creature has (not its current hit points!).

The type of damage the creature can do is specified by a string, char s_dmg[30]. I guess that it doesn't make any real difference what type it is, it's just frosting.

That kind of bothers me that the type of damage is a facade, but that is easier to program.

Structure for Monsters (and the Player)

/*
 * Structure for monsters and player
 */
struct thing {
    coord t_pos;                        /* Position */
    bool t_turn;                        /* If slowed, is it a turn to move */
    char t_type;                        /* What it is */
    char t_disguise;                    /* What mimic looks like */
    char t_oldch;                       /* Character that was where it was */
    coord *t_dest;                      /* Where it is running to */
    short t_flags;                      /* State word */
    struct stats t_stats;               /* Physical description */
    struct linked_list *t_pack;         /* What the thing is carrying */
        int t_reserved;         /* reserved for save/restore code */
};
lines 354-365 /rogue3.6.3-src/rogue.h

We now get to the data structure representing the player! We need to specify a few things first about the monster, specifically its stats and its coord position. This is precisely done on lines 355 and 362.

We can also specify if the monster is slowed. Perhaps a spell was cast on it, it's drunk, caught in a spider web, something (anything!) to make it slower. If it is slowed down, then bool t_turn is true to indicate it takes 1 entire turn to move.

We also should worry about what sort of monster is it?! Is it the player (yes, the player is a monster!)? Is it an orc? Or something else? This is specified by char t_type.

Since a roguelike is played on the command line, each monster is represented by a character. I can't say which field specifies it. This is specified either by char t_oldch or char t_disguise, I can't really say.

But monsters go places too. This is kept track by the thing's data structure itself, specified by the line coord *t_dest.

The monster is sort of like a finite state machine. Consequently, it has a state. This state is specified by the line short t_flags which gives it various characteristics.

Monsters (well, the player at least) have inventories. They have stuff, like armor or weapons. The field struct linked_list *t_pack specifies the monster's inventory.

In case we want to do anything else, we reserve some space int t_reserved for saving the monster or restoring it.

Information on the Taxonomy of Monsters

/*
 * Array containing information on all the various types of monsters
 */
struct monster {
    char m_name[20];                    /* What to call the monster */
    short m_carry;                      /* Probability of carrying something */
    short m_flags;                      /* Things about the monster */
    struct stats m_stats;               /* Initial stats */
};
lines 367-375 /rogue3.6.3-src/rogue.h

The taxonomy of monsters specifies, like the biological taxonomy, what kind of creature we are dealing with. Orcs are different from dragons, which are different from coyotes, which are different from...

We keep track of all this with this particular data structure. The char m_name[20] lets us name the monster type (e.g. "orc", "dragon", "coyote", etc.).

The short m_carry is the odds that it's carrying stuff, e.g. an orc is more likely to have stuff than a coyote.

The short m_flags specifies the flags telling us miscellaneous facts about the monster.

Like all monsters, it has struct stats m_stats specifying its statistical attributes.

Data Structure for the Inventory of a Monster

/*
 * Structure for a thing that the rogue can carry
 */

struct object {
    int o_type;                         /* What kind of object it is */
    coord o_pos;                        /* Where it lives on the screen */
    char o_launch;                      /* What you need to launch it */
    char o_damage[8];                   /* Damage if used like sword */
    char o_hurldmg[8];                  /* Damage if thrown */
    int o_count;                        /* Count for plural objects */
    int o_which;                        /* Which object of a type it is */
    int o_hplus;                        /* Plusses to hit */
    int o_dplus;                        /* Plusses to damage */
    int o_ac;                           /* Armor class */
    int o_flags;                        /* Information about objects */
    int o_group;                        /* Group number for this object */
};
lines 377-394 /rogue3.6.3-src/rogue.h

This specifies the items that the user (or some monster) can carry. This is a bit against the spirit of modern Roguelikes since it seems that everything "is-an" item (in the object oriented sense of the word). But bear in mind this is an early version of the original game!

The first two fields are self-explanatory. The third field char o_launch is foreign, however, and the comment doesn't help much. I suspect that it tells us information about whether the item is ammunition for e.g. a bow or a sling.

The next two fields tells us how the item damages a monster if we use it "like a sword" or if we hurl it at the monster. These are strings informing us HOW it would harm a monster, but not how much it would hurt.

The next field int o_count allows us to "stack" multiple items of the same type together, saving memory.

We (similar to the taxonomy of monsters) have a "taxonomy of items", which we use to specify the field int o_which which sort of item we have.

Any modifications to hitting or damaging a monster are specified by the int o_hplus (and int o_dplus) fields (respectively).

If the monster wears the item like it were armor, it modifies the armor class. This modification is specified by the field int o_ac.

As per usual, the miscellaneous features are lumped into int o_flags some "flags" field of the data structure.

The "group number" (whatever that means) is the last field entry in the data structure int o_group.

Actions As Data Structure

struct delayed_action {
    int d_type;
    int (*d_func)();
    int d_arg;
    int d_time;
};
lines 396-401 /rogue3.6.3-src/rogue.h

This is rather interesting, an action is a data structure! But what exactly does the data structure specify?

Well, the first field int d_type specifies what sort of event it is, and int (*d_func)() is a function pointer which points to the function that specifies what happens.

The int d_arg field, I suspect, specifies the argument to the function, but I can't say for sure at this time.

The int d_time may specify how frequently it happens, or what turn it will happen. It's presumably the latter.

Morals from the UNIX Philosophy

This "post" really just lists a bunch of morals we should learn from the UNIX philosophy.

  1. You cannot tell where a program is going to spend its time. Bottlenecks occur in surprising places, so do not try to second guess and put in a speed hack until you've proven that's where the bottleneck is.
  2. Measure. Do not tune for speed until your performance analysis tool tells you which part of the code overwhelms the rest.
  3. Fancy algorithms tend to run more slowly on small data sets than simple algorithms. They tend to have a large constant factor in O(n) analysis, and n is usually small. So don't get fancy unless Rule 2 indicates that n is big enough.
  4. Simplify your algorithms and data structures wherever it makes sense because fancy algorithms are more difficult to implement without defects. The data structures in most programs can be built from array lists, linked lists, hash tables, and binary trees.
  5. Data dominates. If you have chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming.

Additionally, Mike Gancarz mentions some that are overlooked:

  1. Small is beautiful.
  2. Make each program do one thing well.
  3. Build a prototype as soon as possible.
  4. Choose portability over efficiency.
  5. Store data in flat text files.

Eric S. Raymond also gives a few worth noting:

Rule of Modularity:
Write simple parts connected by clean interfaces.
Rule of Clarity:
Clarity is better than cleverness.
Rule of Composition:
Design programs to be connected to other programs.
Rule of Separation:
Separate policy from mechanism; separate interfaces from engines.
Rule of Simplicity:
Design for simplicity; add complexity only where you must.
Rule of Parsimony:
Write a big program only when it is clear by demonstration that nothing else will do.
Rule of Transparency:
Design for visibility to make inspection and debugging easier.
Rule of Robustness:
Robustness is the child of transparency and simplicity.
Rule of Representation:
Fold knowledge into data so program logic can be stupid and robust.
Rule of Least Surprise:
In interface design, always do the least surprising thing.
Rule of Silence:
When a program has nothing surprising to say, it should say nothing.
Rule of Repair:
When you must fail, fail noisily and as soon as possible.
Rule of Optimization:
Prototype before polishing. Get it working before you optimize it.

So What Have We Learned?

Well, each data structure should do one thing and only one thing. We should also start out by writing the data structures in header files, and the various "accessor/mutator" methods for the data structures should be prototyped in the corresponding header files.

I think that I will try to use a more "data-driven" approach than a "purist object oriented" approach, composing data structures instead of using kind of bloated inheritance.

In fact, I may use regular, old fashioned C and use Phil's Object Oriented ANSI C approach. To reproduce his example (with some modifications), we have

 /* FooObj.h */
#ifndef FOO_OBJ_H_
#define FOO_OBJ_H_

struct fooobj {
    int privateint;
    char *privateString;
    /* Depending on your preferences, you
     * may prefer privateString to be a char buffer[],
     * OR malloc it and free on delete.
     */
};

typedef struct fooobj * FooOBJ;
FooOBJ newFooOBJ();
void setFooNumber(FooOBJ,int);
void setFooString(FooOBJ,char *); /* make comments about copy or not here */
void dumpFooState(FooOBJ);      /* dumps debug contents of FooOBJ to stdout */
void deleteFooOBJ(FooOBJ);
#endif /* def FOO_OBJ_H_ */

See the beauty here? And if we have, say, Bar extend fooobj, we can do the following:

 /* bar.h */
#include "FooObj.h"

#ifndef BAR_H_
#define BAR_H_
struct bar{
    fooobj *super;
    /* rest not shown */
};

typedef struct bar* Bar;

Bar newBar();
void deleteBar(Bar);
FooOBJ toFooOBJ(Bar); /* simply "return Bar->super;" */
#if __STDC_VERSION__ >= 199901L /* if C99 compliant */
inline void setBarNumber(Bar B, int i) { setFooNumber(B->super,i); }
inline void setBarString(Bar B, char *s) { setFooString(B->super, s); }
#else /* not C99 compliant, use macros instead of inline functions */
#define setBarNumber(B,i)       setFooNumber(B->super,i)
#define setBarString(B,s)       setFooString(B->super,s)
#endif /* if C99 compliant*/
/* rest not shown */
#endif /* def BAR_H_ */

This is precisely what happens with C++ when it does its inheritance voodoo. At least, in memory this is how it looks like.

If one is using a C compiler that does not allow explicit declaration of inline functions, one could easily use macros instead for setBarNumber() and setBarString. Or one could use a full blown function.

Introduction

Okay, so starting to program a roguelike in C++, what resources should we consider? Well, after a bit of browsing, I found the following:

I'll follow the fifteen step programme, chronicling what I found and did here.

I'm just an average Joe programmer, so I am going to divide the blog into three parts really.

The first part will be case studies, examining code from other roguelikes.

In the second part we'll start writing the code in the spirit of the fifteen step approach. This might get a bit hairy.

Finally, the third part, we'll also consider the theoretical aspects of programming a roguelike. That is, answering questions like "How should we implement magic?", "How can we do Artificial intelligence?", and so forth.

I expect things will get sloppy, so I am using a table of contents widget in addition to an archive widget.

Frequently (and Infrequently) Asked Questions

Why not use language X?
I'm trying to write an object oriented roguelike because, well, I'm accustomed to object oriented programming. C++ lets me do that. I'll get to more about that in a future post.

I see a new post that's not on the table of contents, but it comes and goes every few minutes, what's up with that?
Must be witches. Oh no, wait, what I mean is that I'm editing the post, and checking that it looks alright, so I'm publishing, then saving it as a draft and editing.

Bread crumbs and notes on designing a roguelike from scratch.

Table of Contents