Wednesday, October 28, 2020

Programming Exercise: Dice

A short exercise for the reader: come up with a domain specific language for rolling dice. This is a recurring mechanic in Rogue-likes.

Exercise 1. The conventional notation in Dungeons and Dragons is to write "mdn + k" (where "+ k" is optional) to roll m separate n-sided dice, sum them together, then add k to the result and return it. Write a function to accept a string of this form.

What happens when m is not a positive integer? When n is not a positive integer?
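One possible starting point for Exercise 1, as a minimal sketch rather than a definitive answer. The names `parse_dice` and `roll`, the regular expression, and the decision to reject non-positive m or n are all my own assumptions; other answers to the question above are equally defensible.

```python
import random
import re

# Assumed grammar: "<m>d<n>" optionally followed by "+ k" or "- k".
DICE_PATTERN = re.compile(r"^\s*(\d+)\s*d\s*(\d+)\s*(?:([+-])\s*(\d+))?\s*$")


def parse_dice(expr):
    """Return (m, n, k) for a string like '3d6 + 2'; raise ValueError otherwise."""
    match = DICE_PATTERN.match(expr)
    if match is None:
        raise ValueError(f"not a dice expression: {expr!r}")
    m, n = int(match.group(1)), int(match.group(2))
    k = int(match.group(4)) if match.group(4) else 0
    if match.group(3) == "-":
        k = -k
    # One possible policy (an assumption): reject non-positive m or n.
    if m < 1 or n < 1:
        raise ValueError("m and n must be positive integers")
    return m, n, k


def roll(expr, rng=random):
    """Roll m separate n-sided dice, sum them, then add k."""
    m, n, k = parse_dice(expr)
    return sum(rng.randint(1, n) for _ in range(m)) + k
```

Passing the random source as `rng` is a design choice that makes the function easier to test with a seeded or scripted generator.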

Exercise 2. In Dungeons and Dragons, when creating a character, you roll 4d6 but take the highest 3 values. How would you extend the domain language to include this situation?
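A hedged sketch of the underlying operation for Exercise 2. The function name `roll_keep_highest` and its parameters are hypothetical; how to spell this in the domain language (say, a suffix on "4d6") is left open, since that is the point of the exercise.

```python
import random


def roll_keep_highest(m, n, keep, rng=random):
    """Roll m n-sided dice and sum only the `keep` highest values."""
    if not 1 <= keep <= m:
        raise ValueError("keep must be between 1 and m")
    rolls = sorted((rng.randint(1, n) for _ in range(m)), reverse=True)
    return sum(rolls[:keep])
```

Character creation as described above would then be `roll_keep_highest(4, 6, 3)`.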

These exercises come from the original Rogue-likes, which had a function that rolled dice based on a domain specific language; this idea was later picked up by Troll.

Thursday, October 8, 2020

One scheme to world-building

What makes this game fun? That's the question I want to ask when I've finished a game (or as I'm playing it). I recently re-played Fallout 2, and I noticed there's a lot of small quests which piece together to form a tapestry. Individually, the elements are not complex (deliver this item to so-and-so, talk to [person] about [subject]), but they compose together like lego blocks.

This post will discuss one approach to world-building, which is necessary for coming up with quests and a game-plot. This is more a "review article" than anything innovative on my part.

Levine's "Stars" and "Passions"

Ken Levine has a great talk on "Narrative Legos". Basically, each location (village, fort, etc.) has around a half-dozen named characters ("Stars"), where each character has three "Passions". A "Passion" is what a character cares about relative to what the player can impact upon; it is transparent to the character, and responsive to the player's action. Effectively a "Passion" is a "bank account" for a Star (or a number between -10 and +10, starting at 0); helping a Star with their Passion results in that Star investing positive points into their "account", thereby improving the Star's opinion of the player.

When a Star's opinion of the player (formed by combining the three Passion scores together somehow, e.g., adding them together, taking their geometric or harmonic mean, or whatever) reaches a certain high point, it unlocks certain bonuses. Blacksmiths offer additional bonus gear, Clerics offer additional services, etc. Conversely, if a Star's opinion diminishes, services cost more or are refused outright.
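To make the "bank account" picture concrete, here is a minimal sketch. The class names `Passion` and `Star`, the `clamp` helper, and the choice to combine scores by simple addition are my assumptions for illustration, not anything prescribed in Levine's talk.

```python
from dataclasses import dataclass, field


def clamp(value, low=-10, high=10):
    """Keep a score within the -10..+10 range described above."""
    return max(low, min(high, value))


@dataclass
class Passion:
    name: str
    score: int = 0  # the "bank account", starting at 0

    def adjust(self, points):
        self.score = clamp(self.score + points)


@dataclass
class Star:
    name: str
    passions: dict = field(default_factory=dict)

    def add_passion(self, name):
        self.passions[name] = Passion(name)

    def opinion(self):
        # Assumption: combine the Passion scores by adding them together.
        return sum(p.score for p in self.passions.values())
```

Swapping in a different combining rule only requires changing `opinion()`, which is why it is kept separate from the individual accounts.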

But two characters could have conflicting passions. This is a goal for writing a plot, because conflicting Passions for Stars means the player is thrust into a zero-sum game...which is fun for the player, and encourages the author to say, "Yes, and I can work this into another story arc!"

Example. Consider a simple fantasy setting. The player is tasked to fetch a McGuffin for the elf village. Elves hate Orcs (and Orcs hate Elves). A "Passion" shared by Elves is "Hurt Orcs".

Now, one of the Stars — Romeo the Elf — has a Passion: he loves Juliet the Orc. (Romeo the Elf's "Hurt Orcs" has the caveat "...except Juliet the Orc".)

For the quest (fetch McGuffin for the Elves), the player needs to take along a character from the elf village. Along the way, the player and companion are ambushed by Orcs, including Juliet the Orc. If the player takes Romeo the Elf, then Romeo will help kill the Orcs except for Juliet; the player is forced to choose what to do: kill Juliet the Orc, or run away somehow? And if the player takes along anyone else, then how does the player avoid killing Juliet the Orc?

Such is one way to have two different Passions interact with each other in a surprising way. It forces the game author to consider how to reconcile conflicting passions with "no wrong answer".

The number of characters should be around 5 or 6 to avoid over-burdening the player, and the number of passions should be 3 to avoid accidentally conflicting with too many other character passions (rendering the game unplayable accidentally).

There's a lot to digest here, and how we implement it varies considerably. If we formalize a "passion" using a data structure (a few counters) and functions, we could [should] test that the rewards and punishments are triggered properly upon shifting an NPC's opinion of the player. But this is just the observer pattern.
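A rough sketch of what that observer pattern might look like. The `OpinionTracker` class, its single threshold, and the callback signature are hypothetical; a real game would likely track several thresholds per NPC.

```python
class OpinionTracker:
    """Notifies listeners when an NPC's opinion crosses a threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.opinion = 0
        self._listeners = []

    def subscribe(self, callback):
        """Register a callback taking one bool: True when the bonus unlocks."""
        self._listeners.append(callback)

    def shift(self, points):
        was_above = self.opinion >= self.threshold
        self.opinion += points
        is_above = self.opinion >= self.threshold
        # Only notify on an actual crossing, not on every change.
        if was_above != is_above:
            for callback in self._listeners:
                callback(is_above)
```

A test can subscribe a list's `append` method and then check that the recorded events match the expected unlocks and revocations.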

Exercise. Look at Fallout 1 and Fallout 2 (or any RPG you enjoy). For each town, write down who are the Stars and what are their Passions.

Begin with a Map and Needs

I've been playing a lot of post-apocalyptic games recently. I found it helps to begin with a map and ask myself, "How does society reproduce itself? Who produces what goods, and how are those exchanged among the cities producing them?" Looking at a real-world map, I can pick out a few cities that survived the apocalypse, think about trade routes, goods produced [at least food, water, armor, weapons, clothing], and this organizes and frames the thought-process about factions, their aims and beliefs.

These questions lead to more interesting power dynamics: multiple factions within a city disagree about how to distribute the goods, who to trade with, what to prioritize. How far are these factions willing to go to enact their policies?

Example. In one post-apocalyptic game, purified water is rare. Growing crops or raising livestock requires purified water. So for the survival of humans, purified water is absolutely critical (otherwise agriculture is impossible). But only one town has a prototype water purifier that can filter out radiation. As the only source of purified water, traders flock to the town to trade for the water. This makes the town a target for raiders, since traders regularly go to it. Fearing the raiders may damage the water purifier, the town has built sturdy fortifications around it...which has created an "inner city" within the fortifications, and an "outer city". One faction wants to keep outer city dwellers out as second-class citizens, another faction wants to keep the status quo. How far will the first faction go? Will they collaborate with raiders to drive out their rivals?

This "first draft" of an idea for quests sounds promising, but it has a problem: it's close to a "Jesus versus Hitler" morality choice. We need to make the status quo faction morally flawed somehow. There are many ways to do this — the status quo faction could support slavery, be violent imperialists refusing trade with cities that don't submit to their rule, etc.

Now that we have some idea of the factions and issues at hand in this town, we can apply Levine's "Stars" and "Passions" idea to further refine the idea. Each faction has a leader, but there's also the "head mechanic" in charge of maintaining the water purifier and its bottling, as well as the town doctor and the town trader. Thus we have our five stars for the town.

A town that produces nothing but coordinates trading (like the Hub in Fallout) has its own unique concerns. Just to rattle off a few:

If a single partner (call it McGuffinsburg) is its sole source of McGuffins, then the internal politics of McGuffinsburg is a concern of the trading hub.
Raiders are a perpetual problem.
Power dynamics among competing traders; when there's little or no law, it gets quite cut-throat.
Arms dealers in particular encourage secret agents to incite war between two neighboring powers, to profit from selling arms to both sides.
Any of these could be reversed (McGuffinsburg exerting power over the trade hub, raiders as sympathetic figures, lawkeepers trying to maintain order, etc.)
Any two or more of these could be combined.

One bit of advice: just as each Star has 3 Passions (for the sake of forgiving the player for acting unfavorably against one Passion without alienating the Star, thereby cutting off a potential quest-giver), we should take care to have several sources for each commodity. This is a rule-of-thumb, not a Law: sometimes, it's fun to have a single provider for a good, which causes tension and drives the plot ("We need to finish the [apparatus] to provide [goods] to save the world").

Remark (On Economy). At this point, it may be problematic determining the value of goods in the game. I'd advocate a Neo-Ricardian framework for determining the value of goods, if we are really concerned about a sensible economy. This works well for the settings I'm interested in. Each firm has its own methods-of-production and this determines the value of goods.

As for which goods to model, we could consider every conceivable item in the game. This is hard. Instead we could start with some basic goods (grain and iron suffice for a first pass) and successively add more as needed, like wood, weapons, coal, ammunition, etc. Traders add a markup to the value of the goods they transport in their caravan, which has been studied by Robert Vienneau.

Arguably, this doesn't need to be maintained. Prices could be determined as the sum of the cost of inputs, plus some markup, along the lines of Kaleckian economics...because if you aren't careful with your economy, your model may trigger hilarious/player-unfriendly results.
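As a toy illustration of "cost of inputs plus some markup", here is a minimal sketch. The function `price`, the recipe format, and the flat 25% markup are all assumptions for the example; it also assumes recipes never refer back to themselves.

```python
def price(good, recipes, base_costs, markup=0.25):
    """Price a good as the cost of its inputs plus a proportional markup.

    base_costs maps raw goods to a fixed price.
    recipes maps a produced good to {input_good: quantity}.
    Assumes the recipes contain no cycles.
    """
    if good in base_costs:
        return base_costs[good]
    input_cost = sum(
        quantity * price(inp, recipes, base_costs, markup)
        for inp, quantity in recipes[good].items()
    )
    return input_cost * (1 + markup)
```

With grain at 2, iron at 10, and a sword made from 3 iron and 5 grain, the inputs cost 40 and the sword is priced at 50. A good made from swords would compound the markup at each stage.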

Note: although I have been thinking about a game in a post-apocalyptic setting, there's nothing preventing us from applying it to any setting. It's just a little harder if we have no map. (Post-apocalyptic settings let me be lazy, and use existing maps.)

Recap. So far, we have considered using the economy as a way to organize towns and factions. Each town produces some goods, and the towns need to trade with each other to survive. This leads to identifying factions within towns, tensions between towns, and problems to be sorted out. It also leads naturally to using Levine's "Stars" and "Passions" to further refine the game. Both provide natural motivations for quests.

The map gives us a way to visually organize factions, consider how society sustains itself, the relationships between different towns and factions. Such considerations naturally give rise to Stars and their Passions. Altogether we have not formed a plot, but the fertile grounds for a plot as carried out through quests.

Thus the basic process of world-building naturally gives us ideas for quests and plot-lines. The "bottom up" approach with Levine's Stars and Passions combines well with the "top down" approach of drawing a map, determining the villages and towns, coming up with economies, inserting Stars and their Passions within each town, generating factions, and so on.

How real is the economy?

We need to decide how much detail the economy needs. This can serve a variety of purposes: just fluff, determine the actual value of goods, or as grounds for certain quests. Let's consider this last point in particular.

Missed Example 1. I was pleasantly surprised in Fallout 1 to learn the "gun runners" use a historically accurate method of sulphur-less gunpowder production (using animal dung and urine). Although just lore and backstory, it could have served as a quest-line for players with high enough Science to suggest more modern methods of gunpowder production (leading to superior ammunition produced for the player, from gun runner traders alone).

This could also have led to the quest of finding sulphur sources (or other chemical sources). If using pre-war stockpiles, then there could be a limited amount of "superior ammunition" produced and provided to the player. If trading for chemical sources, then this could serve as grounds for multiple ways to obtain it: negotiate for trade, kill to obtain it, etc. Perhaps there is some mistrust between the producer of the chemicals and the gun runners, and the player needs to negotiate this (either by using a middle man caravan, thereby increasing the price of the superior ammunition, or by helping the factions bury the hatchet).

All told, I'm glad Fallout 1 didn't do this, because there's a time limit to the game, and this would have been a huge time sink.

As for modeling the value of commodities, textbook economics already struggles with this in the real world. As far as world-building is concerned, value matters for trade routes and for player bartering with merchants. For trade routes, we only need them to feel "about right" (e.g., goods expensive in one town are imported from a town where those goods are cheap, goods aren't imported when "domestically produced" versions are cheaper than the imported version, etc.).

Suggestion: Update the world map to reflect trade, specifically roads and ports are built (and improved) over time to better facilitate trade between partners. The quality of roads and ports reflects the trade power between partners.

For the player, accuracy can conflict with fun. When this happens, always side with fun (unless we want to create a constraint for quest lines, e.g., iron shortage causes more expensive equipment...motivating the player to, y'know, fix that shortage). The real concern is that the player has access to equipment matching the challenge. We don't want to give the player overpowered gear too early, nor do we want to force the player to have access only to mediocre equipment.

Economy for lore, well, this provides the grounds for quests. Town A wants to open trade with town B since B produces silk cloth. This isn't reflected in the goods sold in either town, but provides a new quest-line. Alternatively, if B were instead the sole source of iron, then town A could supply only, say, bronze equipment. This combines lore and player experience (which is good and desired: the player should experience consequences of their choices).

In short, put yourself in the world, and ask yourself, "What goods would I have access to? How would I get food? Water? Who produces them? How would I get them? What about luxury goods? Or equipment?" Answering these questions requires us to consider the towns further, and increases the realism. It's not enough to have farms randomly placed around towns (Fallout 4 tried to do that): we must also consider the infrastructure, the shortages, needs, scarcities and abundances. This leads us to consider how towns interact, how factions within a town interact, and helps us build a world.

Generating History and Culture

History, Myth, State formation. If we consider how these Stars and villages interact, we can generate a history from conflict. Items of our Stars become revered artifacts. Music and art memorialize these events. Myths emerge from misunderstanding or deliberate lies. Rivalries build up, grudges between Stars and factions emerge over time. Villages band together, forming quasi-states, which dissolve under stress and strain.

Governments. It's worth noting that we could borrow liberally from history. For example, the Polish–Lithuanian Commonwealth had a unique form of government that is seldom discussed...it could easily generate problems to be remedied for plot-line. The Republic of Venice picked its leader, the doge, through a lottery (well, a lottery picked an electorate, who then picked another electorate, and so on; the convoluted process of lotteries and indirect elections resulted in a new doge).

Religion. I don't have much to comment on here. Post-apocalyptic games seldom discuss religion, and the fantasy games I play have similar pantheons. One thing worth considering: the original Rogue had "daemons" responsible for updating the health, etc., for players. I don't think anyone has made these daemons the Pantheon for the game, which could lead to interesting gameplay: praying to daemons leads to temporary buffs, destroying temples related to a daemon results in temporary negative buffs, etc. Or it could have the same effect as praying to the laws of physics (i.e., nothing noticeable).

Communication. And the most underappreciated point of consideration: it takes time to communicate these events. When an event occurs, news of it spreads through caravans and travelers. Spreading information takes time, and plans revolve around information. This is a challenge to code up, because we no longer have a simple observer pattern to update dialog and quest lines.
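One way to sketch delayed news, under heavy simplifying assumptions: the class `NewsNetwork`, the `travel_days` table, and the single-hop delivery (news does not get forwarded onward from the receiving town) are all hypothetical choices made to keep the example short.

```python
import heapq


class NewsNetwork:
    """Delivers news of events to towns after a travel delay."""

    def __init__(self, travel_days):
        # travel_days[(origin, destination)] -> days for a caravan to arrive
        self.travel_days = travel_days
        self._in_transit = []  # min-heap of (arrival_day, town, event)
        self.known = {}        # town -> set of events heard so far

    def report(self, event, origin, day):
        """The origin town learns of the event at once; others must wait."""
        self.known.setdefault(origin, set()).add(event)
        for (src, dst), days in self.travel_days.items():
            if src == origin:
                heapq.heappush(self._in_transit, (day + days, dst, event))

    def advance_to(self, day):
        """Deliver every piece of news that has arrived by the given day."""
        while self._in_transit and self._in_transit[0][0] <= day:
            _, town, event = heapq.heappop(self._in_transit)
            self.known.setdefault(town, set()).add(event)

    def knows(self, town, event):
        return event in self.known.get(town, set())
```

Dialog and quest conditions would then ask `knows(town, event)` instead of checking global game state, which is what separates this from a plain observer.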

Friday, October 2, 2020

Testing the Game

I've come to the opinion, when writing software, you should test it...usually unit testing suffices. But a game is a special kind of software. So special, one naturally faces the question, "Should we still unit test a game?"

To be clear, there are varying degrees and different notions of "testing" a game. We could test the software (make it "bug free", or at least have fewer bugs), test the enjoyability of the game (e.g., make sure it's fun, winnable, etc.), test the UI behaves as desired, "integration tests", end-to-end testing (some sort of "automated player" searching for specific bugs). I'll discuss a few of these notions.

Testing the Code

If we have adopted a model-view-controller architecture of some flavor, then we can unit test the models. I contend we should unit test the models, and use contracts to enforce the assumptions of the model methods. What would this look like?

We can encode assumptions like Actor::dies() should demand the actor's health is non-positive (i.e., either zero or negative). This could be encoded with an assertion ("precondition"). Then we could write a unit test to create an actor, give the actor some amount of health points, cause damage, then try to call actor.dies(). There are three test cases (when actor.health() is positive, zero, and negative) which should result in the death of the actor in two cases.
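Here is roughly what those three test cases could look like, translated into Python with the standard `unittest` module. The `Actor` class is a hypothetical stand-in; I raise `AssertionError` explicitly rather than using an `assert` statement so the precondition still fires when Python runs with optimizations enabled.

```python
import unittest


class Actor:
    """Hypothetical minimal model with a precondition on dies()."""

    def __init__(self, health):
        self._health = health
        self._alive = True

    def health(self):
        return self._health

    def is_alive(self):
        return self._alive

    def damage(self, points):
        self._health -= points

    def dies(self):
        # Precondition: health must be non-positive (zero or negative).
        if self._health > 0:
            raise AssertionError("dies() requires non-positive health")
        self._alive = False


class ActorTest(unittest.TestCase):
    def test_dies_with_positive_health_violates_precondition(self):
        actor = Actor(health=10)
        actor.damage(3)
        with self.assertRaises(AssertionError):
            actor.dies()
        self.assertTrue(actor.is_alive())

    def test_dies_with_zero_health_kills_actor(self):
        actor = Actor(health=10)
        actor.damage(10)
        actor.dies()
        self.assertFalse(actor.is_alive())

    def test_dies_with_negative_health_kills_actor(self):
        actor = Actor(health=10)
        actor.damage(25)
        actor.dies()
        self.assertFalse(actor.is_alive())
```

Running `python -m unittest` in the directory containing this file would execute all three cases.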

Organizing Unit Tests

For object oriented languages, I'm inclined to follow some kind of xUnit testing framework such as JUnit: each class we write (say class Thinger) should have some corresponding test class (e.g., class ThingerTest) where each method of the class is tested several times...so Object Thinger::methodOne() should have several corresponding methods void ThingerTest::methodOneShouldDoXTest(), and so on.

For Lisp, there's usually a framework given (Clojure has clojure.test, Common Lisp has several frameworks, etc.). The organization is analogous to that of object-oriented languages (functions in module.lisp like (defun thinger-method (...)) should have several test cases handled in module_tests.lisp).

Terminology: JUnit organizes test cases as methods using assertions on a Test class, which are organized into test suites (analogous to how files are organized into directories). A test runner then iterates through the test suites and executes each test case, recording results (both successes and failures) for later use. The exact terminology varies (some xUnit systems, e.g. in Smalltalk, have test case classes), but the intuition remains the same: test cases organized into test suites, and a test runner that executes the test cases and records the results.

We organize code by modules, which contain classes, which contain methods. These terms are used loosely: C programmers lack any module system, but use structs instead of classes, and functions instead of methods; Haskell programmers use modules, data types, and functions; etc. Whatever the terminology, we have some kind of ersatz class and ersatz method. Each "class" should have a corresponding test suite, each method should have several tests. Depending on how we organize classes, we should have a corresponding organization of tests: if we have one file per class, we should have an analogous file per test suite. The motivation for this scheme is to make it obvious where to place tests for code (all the tests for /game/src/models/my_module.code are placed in /game/tests/models/my_module_tests.code for example). It's more important to be consistent in whatever scheme you choose.

What to Unit Test

Test only code you have written. There's no need to write tests for third-party libraries. We trust ncurses works as expected, the GNU Scientific Library functions, and so on. It's only the code we wrote that we want to test.

We should unit test all public functions. While aiming for 100% coverage is ideal, if we get to "a lot" (I dunno, say, ~90% or whatever), we can say it's "good enough". The rationale here is that public functions are used to build our game, so if we have tested them thoroughly enough, then we can have greater confidence in their correctness (they do what we think they should).

Test all code paths. Has each statement in the method been tested? Has each edge in the control-flow graph been tested? Has each branch of every if-else statement been tested? (Or every case in a switch statement been tested?) Has every boolean subexpression of each conditional been tested? Complicated conditional tests could be refactored into predicate functions, which can be independently tested.

Each function should do one thing, and unit tests make sure the function does what we expect/hope. For example, the Actor::dies() method does one thing; once it is called, we should have Actor::isAlive() return false. This gives us two cases to consider: one where Actor::dies() fails (e.g., when Actor::health() is positive), the other when Actor::dies() succeeds. The former case should have Actor::isAlive() return the same result as Actor::health() > 0, the latter should have Actor::isAlive() == false.

Tests should be atomic. Each test case will test exactly one thing. If a test case is testing more than one thing, we should refactor it into multiple test cases.

Tests should be independent of each other. They should not rely on each other (in the sense that they don't call each other). A unit test should test exactly one thing.

Tests should be readable. Think of them as not just testing the behavior of the function, but also as an example of how to use the function. This gives us a name for the test (e.g., Actor::diesShouldNotBeAliveTest() or Actor::healthy_should_not_be_dead_test(), etc.).

Tests should be repeatable/deterministic. We should test the mechanical parts of the game (e.g., marking an Actor with zero health as "dead") where the same inputs produce the same outputs. If we are testing randomness ("rolling a die"), we should have some way to "mock out" that randomness with something deterministic ("load the die", "use a two-headed coin", etc.) to make sure the methods do what we expect.
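A small sketch of "loading the die". The names `Die`, `LoadedRng`, and `attack_hits` are made up for illustration; the idea is simply that the random source is passed in, so a test can substitute a scripted one.

```python
import random


class Die:
    """An n-sided die whose random source can be swapped out."""

    def __init__(self, sides, rng=None):
        self.sides = sides
        self.rng = rng if rng is not None else random.Random()

    def roll(self):
        return self.rng.randint(1, self.sides)


class LoadedRng:
    """Test double: returns scripted values instead of random ones."""

    def __init__(self, values):
        self._values = iter(values)

    def randint(self, low, high):
        return next(self._values)


def attack_hits(die, armor_class):
    """Hypothetical game rule: an attack hits when the roll meets the armor class."""
    return die.roll() >= armor_class
```

In a test, `attack_hits(Die(20, LoadedRng([15])), 12)` always hits and `attack_hits(Die(20, LoadedRng([5])), 12)` always misses, so the rule can be checked without any real randomness.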

Tests should be fast. Since each test case tests exactly one thing, we should make them small and fast.

Tests should be automated and tracked. We should be able to run the tests with a single command (e.g., "make test" or whatever), and we should include the test code in our git repository. Best practices suggest running the tests and making sure they pass before pushing code out to the repository's master branch ("don't break master").

Testing the Game

"Testing the game" has several distinct meanings: make sure the game is playable, make sure the game is fun, etc. In some sense, unit testing is like checking to make sure each square of the board is flat: but if we glue the edges badly, we could actually end up with a curved board. Unit testing checks locally each function does what we hope, but it doesn't check the game does what we hope. This motivates integration testing and end-to-end testing.

Integration testing can be useful. If we want to make sure dialog options trigger and complete quests, we need integration testing. This amounts to setting up a mock game, simulating dialog, then checking the game state matches what we expect. Since multiple "units" are being tested in conjunction (dialog, quests, etc.), we're really testing that they're "integrated" correctly. For people trying to create an old-school Fallout or Wasteland-type game, this can be very useful.

Testing from the Player's Perspective

I would suggest testing the game from the user's perspective. This is harder to do, and varies depending on the type of game you're making. In my mind, I assume you are programming something like a Baldur's Gate, Fallout, or Wasteland-type game: a mixture of quests, dialog, combat, and possibly more.

What do we hope to test? We'd like the game to be playable (quests "fit together" in a way that the player can get to the end of the game), but we'd also like the game to be fun. This is where design decisions are needed: how do we specify a game to be "fun"? Is there sufficient choice architecture?

Test the game is playable (quests sequence properly). For example: quest A occurs before quest B, wherein quest A requires killing actor X, but actor X issues quest B after completing quest A, rendering quest B un-triggerable. (More concretely: if the king gives us a quest after completing his minister's quest, and the minister asks us to assassinate the king, then there better be some mechanism for us to continue after killing the king...like the minister takes over. Otherwise, if we are waiting on the deceased king to give us our next quest, we'll be stuck.)

If we have a domain specific language for quests, actors, items (and if we store these in .info files), then we could have a simple helper program which runs through the quests, makes certain the quest items are fetchable, the quest-issuing actors are alive, and there are chains of dialog/quests which start with a specified initial quest and end at the specified final quest.

The helper script should record sequences of quests which are unwinnable, or when there are disconnected components (e.g., killing actor X early in the game prevents quests B, C, and D). At the end, it will print out to the screen a summary (along the lines of "M paths succeed, N paths tried with K character builds") and a more verbose explanation to a file ("Character build C played the quest chain Q₁, ..., Q_m then got stuck at quest X") possibly with the trajectory of events for reconstruction. We could automate this script to try all variations of skills and stats, too.
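A very rough sketch of the core of such a helper, ignoring character builds and quest items. The function `find_playthroughs` and the quest format (the keys "giver", "requires", and "kills") are hypothetical, and the search is exponential in the number of quests, so this is only meant to show the shape of the idea.

```python
def find_playthroughs(quests, start, final):
    """Depth-first search over quest orderings.

    quests maps a quest name to a dict with these assumed keys:
      "giver"    - actor who must be alive to issue the quest
      "requires" - quests that must already be completed
      "kills"    - actors who die when the quest is completed
    Returns (winning_paths, stuck_paths).
    """
    winning, stuck = [], []

    def available(name, done, dead):
        quest = quests[name]
        return (name not in done
                and quest["giver"] not in dead
                and all(req in done for req in quest["requires"]))

    def search(path, done, dead):
        if path and path[-1] == final:
            winning.append(path)
            return
        options = [q for q in quests if available(q, done, dead)]
        if not path:
            # Every playthrough must begin with the specified initial quest.
            options = [q for q in options if q == start]
        if not options:
            stuck.append(path)
            return
        for name in options:
            search(path + [name],
                   done | {name},
                   dead | set(quests[name]["kills"]))

    search([], frozenset(), frozenset())
    return winning, stuck
```

For the king-and-minister example above, a quest table where the minister's quest kills the king and the king issues the next quest yields no winning paths and one stuck path ending at the minister's quest.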

I've discussed this idea in passing a few times, I'll probably make it the subject of a future post...maybe have a minimal working example for people to play with, we'll see. It'll involve a variation of depth-first search along a few distinct play-styles...we'll see, friends, we'll see.

Test you aren't a jerk to the player. Suppose our game has factions and the player has a reputation (loved, liked, undecided, disliked, hated). If our game penalizes the player's reputation when the player kills a member of the faction, then we should beware of the situation when the player witnesses the death of a faction's member: will this tarnish the player's reputation or not? This is a prime candidate for sticking away into a unit test, for regression testing.
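As a sketch of what that regression test might pin down: the `Faction` class and its `member_died` method are hypothetical, and the rule "only the killer is blamed" is one possible design decision, not the only one.

```python
class Faction:
    """A faction that tracks the player's reputation as a single integer."""

    def __init__(self, name):
        self.name = name
        self.player_reputation = 0

    def member_died(self, killer, witnesses=()):
        # Assumed rule: only the killer is blamed; witnessing is not penalized.
        if killer == "player":
            self.player_reputation -= 1


def test_witnessing_a_death_does_not_hurt_reputation():
    guild = Faction("Guild")
    guild.member_died(killer="raider", witnesses=("player",))
    assert guild.player_reputation == 0


def test_killing_a_member_hurts_reputation():
    guild = Faction("Guild")
    guild.member_died(killer="player")
    assert guild.player_reputation == -1
```

Once the first test exists, any later change that accidentally blames witnesses will fail loudly instead of surprising the player.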

If this is part of the plot (the player, witnessing a murder, is then falsely charged with the murder), then it should be written into the game manually. The last thing a player wants is to find the police are after them for...apparently doing nothing. That may be amusing to the programmer (I certainly chuckled), but it's no fun to the player.

There are other similar cases which, when programming, do not immediately sound consequential. But for the player, it feels like the game is designed by a vindictive jerk. It may not be easy to discern when this happens, but once discovered we should try to create unit tests to ensure we aren't jerks.

Heuristic "Tests"

These are measurements of symptoms which boring games exhibit. Alas, there's no way to automate the underlying "boring-ness" away.

Test the game has choice and consequences. Does dialog change to reflect the player's actions? Are new quests opened up specific to the player's choices? Do new interactions [dialog, NPC encounters, quests offered] occur after the player chooses particular outcomes?

Is it possible to have a playthrough where the player kills everyone before talking to them? This forces us to design the quests with constraints that force the game to have consequences and the player to have freedom. Chris Sawyer noted this design decision in an interview with IGN as key to game reactivity and player choice. We can enforce this check with a particular playthrough in our helper script.

Test the game takes you to all the locations. What locations are visited by the play-throughs? We could dump the trajectory of locations visited for further analysis. Sometimes a location is visited more frequently than intended, other times a location is never visited. This is a symptom of possibly less fun games, which can't be automated to enforce: it's an aid to help revise quest considerations.
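A tiny sketch of the kind of summary this could produce from dumped trajectories. The function `location_report` and the trajectory format (a list of location names per playthrough) are assumptions for illustration.

```python
from collections import Counter


def location_report(all_locations, trajectories):
    """Count visits per location and list locations never visited.

    trajectories is a list of playthroughs, each a list of location names.
    """
    visits = Counter(loc for path in trajectories for loc in path)
    never_visited = sorted(set(all_locations) - set(visits))
    return visits, never_visited
```

The output is only a prompt for revising quests: a location that never shows up, or one that dominates the counts, is worth a second look.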

In general, test for symptoms of fun. Depending on what your game is trying to accomplish, the criteria for "having fun" vary. Each criterion has different symptoms, and we should figure out how to automate ways to check these symptoms are present in our game. On the flip side, there are certainly symptoms of anti-fun: fun-killing elements we want to avoid. We should also automate ways to check these anti-fun elements are not present in our game.

In some sense, this is the best we can do with automated testing: test for proxies of what we want, and regression-checks against what we dislike. There's no way to properly "test for fun", but we can test the game can be played in different playstyles and for the "kill everyone before you even interact with them" heuristic Chris Sawyer noted.

Concluding Remarks

Game developers tend not to test their games, at least not in the same way that software engineers test their programs. Unit testing is generally discouraged among game developers, for good reason (having unit tests gives a false sense of being "correct", whereas games seek fun, not correctness).

But we can test for an RPG "being playable". We can further make such testing automated. Insofar as we can make such testing scripts, I think we should...at least, I should. Such automated testing checks the quests are ordered correctly and unlockable, speakers are referenced properly, and so on. Again, this doesn't test gameplay, but it tests the game can be played.

As for what this looks like, I'm working on a minimal RPG I've decided to refer to as "project Delaware". (Why Delaware? Nothing special: I've just opted to use the names of states in the U.S. by order of admission to the union. And Delaware is the first state admitted to the union.) I hope to have something to share soon-ish.


Chronicling a Roguelike
