Monday, September 4, 2017
Your Random May Vary
At the beginning of the summer, we started hosting a semi-monthly Dungeons and Dragons game at our house. It's typically a lot of fun, involving much role-playing and dice rolling. We share Dungeon-Mastering duties; I think I've run most of the sessions, but at least two or three have been run by someone else. When I'm running the game, my character is typically an NPC. It's a bit of a shame, because my stats are pretty awesome.
I got to thinking this morning... I rolled my current character (as I do all my characters) using real dice and a "roll-4-keep-3" approach. I roll four 6-sided dice, discard the lowest, and sum the other three. I do that six times in a row, and I've got the stats block for a new character.
I never use anything but real dice unless I happen not to have a set (and FYI: I carry a couple of sets around in my backpack just in case a spontaneous or random one-shot happens). Garrett used to roll great stats very often using his iPhone dice, and tended to roll really high on challenges as well. I started thinking about the difference in our results and decided to mess around with a rolling app this morning.
It's been a while since I posted anything technical. Bear with me.
I started by creating a little app that would roll a six-sided die for me.
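A minimal sketch of that kind of roller, written in Python. The language, the `roll_die` name, and the optional `rng` parameter are all assumptions for illustration, not the original app:

```python
import random


def roll_die(sides=6, rng=random):
    """Roll one die and return a whole number from 1 to `sides`, inclusive.

    `rng` is a hypothetical hook so a seeded random.Random can be passed in
    for repeatable runs; by default the shared `random` module is used.
    """
    return rng.randint(1, sides)
```

Passing `rng=random.Random(42)` would make a run repeatable, which is handy once tests enter the picture.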
I quickly realized that I didn't quite know where to go. TDD to the rescue!
I started with a description of one of the end goals: a function that would roll four dice and keep the highest three. That... is probably too broad. Let's decompose for a moment. We know we'll need to roll four 6-sided dice and drop the lowest roll. Maybe that dropping function is easier to test and implement.
There's the test; we should stub a red (failing) function.
Done! Run the tests!
Failed as expected. What's a simple solution that will work?
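One simple candidate, sketched in Python (an assumption about the shape of the fix, not the original code): sort the rolls and slice off the first one.

```python
def drop_lowest(rolls):
    """Return the rolls in ascending order with one copy of the lowest removed."""
    # sorted() puts the smallest roll first; slicing from index 1 discards it.
    return sorted(rolls)[1:]
```

The result comes back sorted rather than in rolled order, which is fine here because the kept dice only ever get summed.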
Seems reasonable. Tests tell us...?
Yay! Green tests! Let's fill in some other assertions to make sure we're not fooling ourselves.
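The extra assertions might look something like this Python sketch. The cases are my own picks (assumptions), chosen to poke at ties and at the lowest die showing up in different positions:

```python
def drop_lowest(rolls):
    """Return the rolls in ascending order with one copy of the lowest removed."""
    return sorted(rolls)[1:]


def test_drop_lowest_more_cases():
    # Lowest die in the middle of the input.
    assert drop_lowest([4, 1, 3, 2]) == [2, 3, 4]
    # All dice tied: only one copy of the low value is dropped.
    assert drop_lowest([1, 1, 1, 1]) == [1, 1, 1]
    # Three sixes and a one: the stat-block dream roll.
    assert drop_lowest([6, 6, 1, 6]) == [6, 6, 6]
    # Two dice tied for lowest: one of them survives.
    assert drop_lowest([2, 5, 2, 3]) == [2, 3, 5]
```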
That's good enough. A question I'm often asked is "how much automated test coverage should I implement?" The pedantic answer is "100%," but the pragmatic answer is "as much as you can that covers stuff you would do by hand anyway." That's my rule of thumb, in any case. Sometimes I do more, sometimes less. But if I find bugs, I try to reproduce them with tests, and then they're covered in perpetuity. As such, the test coverage grows organically.
Alright, let's fill in some more of the decomposed functions and see where we wind up.
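Here is one way the decomposed pieces could fit together, as a Python sketch. Every function name below is an assumption; only the roll-4-keep-3 rule and the six-stat block come from the description above:

```python
import random


def roll_dice(count, sides=6, rng=random):
    """Roll `count` dice and return the individual results as a list."""
    return [rng.randint(1, sides) for _ in range(count)]


def drop_lowest(rolls):
    """Return the rolls in ascending order with one copy of the lowest removed."""
    return sorted(rolls)[1:]


def roll_4_keep_3(rng=random):
    """Roll four six-sided dice, drop the lowest, and sum the remaining three."""
    return sum(drop_lowest(roll_dice(4, rng=rng)))


def roll_stats(rng=random):
    """Build a six-stat block, one roll-4-keep-3 result per stat."""
    return [roll_4_keep_3(rng) for _ in range(6)]
```

Each stat lands between 3 and 18, so a quick range check makes a cheap sanity test even though the individual rolls are random.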
That seems reasonable. Test coverage is minimal, but enough for me to know that I'm not doing something overtly silly.
At this point, I'm ready to do the REAL implementation. We'll set up a little recursion to see if we can generate a set of six 18s. I'm not good at math, but the likelihood seems pretty low that I'll actually do it. Let's compromise and try 100 MILLION times. We'll also squawk if we get close.
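A sketch of that search in Python, with two swaps worth naming: it uses a plain loop instead of recursion (Python would hit its recursion limit long before 100 million tries), and it guesses that "close" means five of the six stats are 18. The function names and the `close_at` parameter are assumptions:

```python
import random


def roll_stats(rng=random):
    """Six roll-4-keep-3 stats: four d6 each, lowest die dropped, rest summed."""
    return [sum(sorted(rng.randint(1, 6) for _ in range(4))[1:]) for _ in range(6)]


def hunt_for_all_18s(roll_block, max_tries, close_at=5):
    """Call `roll_block()` until it yields six 18s or `max_tries` runs out.

    Returns the 1-based try number of the hit, or None if there was no hit.
    Prints a note whenever at least `close_at` of the six stats are 18.
    """
    for attempt in range(1, max_tries + 1):
        stats = roll_block()
        eighteens = stats.count(18)
        if eighteens == 6:
            print(f"all 18s on try {attempt}: {stats}")
            return attempt
        if eighteens >= close_at:
            print(f"close call on try {attempt}: {stats}")
    return None


# Example call (long-running at this size, so it is left commented out):
# hunt_for_all_18s(roll_stats, 100_000_000)
```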
Alrighty! We can run this in the REPL. The first try's results follow.
It took us 22 minutes to try 100 MILLION times, and in all those tries, we only got close once. The frequency lines up with my (admittedly bad-at-math-and-fuzzy) expectations.
So how in the world did Garrett's iPhone roll so well for him so often? I suspect the app wasn't using the same approach as I was (roll-4-keep-3). After a little more research, I found another approach that simply took a random integer between 3 and 18. This seemed like a pretty simple and probably more efficient implementation, but I suspected that the level of randomness in that picking was LOWER than that of the roll-4-keep-3 approach. Again, I don't have the math or computer science to tell you why -- it was just my instinct.
So, I implemented a couple more rolling functions...
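The range-based roller is the only extra one described here, so this Python sketch covers just that. The function names are assumptions, and whether any phone app really works this way is a guess:

```python
import random


def roll_range_3_to_18(rng=random):
    """Pick one stat as a uniformly random whole number from 3 to 18, inclusive."""
    return rng.randint(3, 18)


def roll_stats_range(rng=random):
    """Build a six-stat block using the range-3-to-18 shortcut for every stat."""
    return [roll_range_3_to_18(rng) for _ in range(6)]
```

Every value from 3 to 18 is equally likely with this roller, which is not how summed dice behave.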
...and refactored the roll-stats function into a multi-method that dispatches on the roll style you wanted to use.
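Python has no direct equivalent of a multi-method that dispatches on a value, so this sketch stands in with a dictionary lookup keyed on the roll style. The style names, `ROLL_STYLES`, and the helper names are all assumptions:

```python
import random


def _roll_4_keep_3(rng):
    """Four d6, lowest dropped, remaining three summed."""
    return sum(sorted(rng.randint(1, 6) for _ in range(4))[1:])


def _range_3_to_18(rng):
    """A single uniformly random whole number from 3 to 18."""
    return rng.randint(3, 18)


# Maps a style name to the function that rolls one stat in that style.
ROLL_STYLES = {
    "roll-4-keep-3": _roll_4_keep_3,
    "range-3-to-18": _range_3_to_18,
}


def roll_stats(style, rng=random):
    """Roll a six-stat block using the named style; unknown styles are rejected."""
    try:
        roll_one = ROLL_STYLES[style]
    except KeyError:
        raise ValueError(f"unknown roll style: {style!r}") from None
    return [roll_one(rng) for _ in range(6)]
```

Adding another strategy then means adding one function and one dictionary entry, without touching `roll_stats` itself.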
No tests were added. I wanted to get to the bottom of this silly thing. Let's do some more REPLing!
WHOA! There were a TON more close calls, and an actual hit about 14 million rolls in.
I decided to re-run the roll-4-keep-3 strategy to verify its performance.
A few more close calls were had, but roughly the same result. I re-ran the range-3-to-18 strategy:
Wowzers. SO MUCH MORE LIKELY to roll 18s using that approach. The second run only took a second and slightly more than 321,000 tries.
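For anyone who wants numbers behind that gap, the odds can be worked out exactly instead of by brute force. This Python sketch enumerates all 1,296 ways four six-sided dice can land and assumes the range approach picks each value from 3 to 18 with equal chance; the function names are my own:

```python
from fractions import Fraction
from itertools import product


def p_18_roll_4_keep_3():
    """Exact chance that four d6 with the lowest dropped sum to 18."""
    outcomes = list(product(range(1, 7), repeat=4))  # all 6**4 ordered rolls
    hits = sum(1 for dice in outcomes if sum(sorted(dice)[1:]) == 18)
    return Fraction(hits, len(outcomes))


def p_18_range_3_to_18():
    """Exact chance of 18 when all 16 values from 3 to 18 are equally likely."""
    return Fraction(1, 16)


def expected_tries_for_six_18s(p_single):
    """Average number of six-stat blocks rolled before one is all 18s."""
    return 1 / p_single ** 6
```

A single stat comes up 18 on 21 of the 1,296 dice outcomes (about 1.6%) with roll-4-keep-3, versus 1 in 16 (6.25%) with the flat range. Raise those to the sixth power and an all-18s block is expected about once in 16.8 million tries for the range approach, and roughly once in 55 billion tries for roll-4-keep-3.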
So, what's the moral of the story? I think it's that the amount of randomness you get from computers may vary, based on what the underlying strategy is. Random numbers aren't truly random in the machine. Always keep that in mind when they rise up and become our overlords.
Also: rolling 321,000 times by hand, at one set of rolls per second, would still take you almost 4 straight days of rolling to get that all-18s set. So for you DMs out there... if a player comes to you with a character they say they rolled, and it's got six 18s, you give them a knowing wink while handing them your dice and telling them "roll again..."