IF titles: the next generation of generation

Sunday, June 3, 2018


Tagged: python, procgen, interactive fiction, ifdb, titles, if, neural nets

Many years ago, Juhana Leinonen wrote an IF name generator which mixed and matched the titles of existing IF games:

  • Asteroid Synesthesia Factory
  • Ill The O Zone
  • Voices of Spoon Planet
  • Lethe Hobbit
  • The Quest Detective

This is "IF titles created by joining the beginning and end parts of random existing titles," to quote the author. The source code shows what's going on: it's taking a random number of words from the beginning of one title and a random number of words from the end of another, with some tweaks to avoid pulling just "The" or "A".
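
In Python terms, the trick is something like this (a sketch of the idea, not Juhana's actual code):

import random

def mix_titles(titles):
    # Take a random number of words from the start of one title...
    head = random.choice(titles).split()
    head = head[:random.randint(1, len(head))]
    # ...and a random number of words from the end of another.
    tail = random.choice(titles).split()
    tail = tail[-random.randint(1, len(tail)):]
    # If the front half is nothing but an article, try again.
    if len(head) == 1 and head[0].lower() in ('the', 'a', 'an'):
        return mix_titles(titles)
    return ' '.join(head + tail)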

The result is very convincing. But this is 2018! Not only do we have neural nets, we have plug-and-play neural nets that any bozo can install.

I looked through some of Janelle Shane's blog posts -- she's the one doing those lists of Pantone colors, D&D spell names, and so on that you might have seen. Obviously she knows what she's doing and gets excellent results out of her experiments. I do not know what I'm doing, so I probably got sub-par results. But they're still pretty great, so here's a list!

  • Hills of Paradise
  • Castle of the Impala
  • The School of Rock
  • The Door Drivers
  • The Volvil's Room
  • Guttersnipe: Sorcerors
  • Color the Demon Adventure
  • Vault of Survival
  • Il Das Etverra de Joie (Terror 1)
  • Playa Alley
  • The Dream Whore, Bubble Zefro
  • Smast of Imron
  • A Beginning of the Princess
  • Iramidic Text Adventure
  • Space Lust War Tale
  • El Sexter
  • Blackback
  • Friendly Doors
  • Shuce-Quest
  • Wolf: Spy to grind a codion
  • Gris e no pluu
  • The House of Zombrit
  • The Citch and the Dogs
  • The Heather Continences

This is pretty good stuff! I did a little bit of hand-selection, but this is most of one generator run, plus a few extras. (I couldn't resist The Heather Continences.) Most of my editing was to delete real titles like The Cube and All Things Devours.

Okay, so how did I do this? Content warning: the rest of this post is about Python code.

I chose textgenrnn, because Janelle Shane mentioned it and it's a Python module. I like Python.

Installing this on MacOS is easy. I recommend installing homebrew first, if you haven't, and then Python 3, if you don't have that:

brew install python3

Then you need TensorFlow, a machine learning library:

pip3 install tensorflow

This throws up a few warnings about "Your CPU supports instructions that this TensorFlow binary was not compiled to use" -- that's fine. It will run, just not at absolute top speed. Really, if you want absolute top speed, you should install TensorFlow on Windows or Linux, which can take advantage of an NVIDIA video card. The Mac version sadly does not support NVIDIA acceleration, no doubt because MacOS is a jerk or has high security or something. But I don't have an NVIDIA card to begin with, so whatever, I just let it use the CPU.
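
If the warning spam bothers you, TensorFlow reads the TF_CPP_MIN_LOG_LEVEL environment variable at startup; setting it to 2 hides the INFO and WARNING messages:

import os
# Must be set before tensorflow (or textgenrnn) is imported.
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'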

Anyhow, you can now install the textgenrnn module:

pip3 install textgenrnn

Finally, grab your list of IF titles, or whatever you want to neural-net the heck out of. Like Juhana, I used the title list from IFDB, which has over 9000 titles as of mid-2018. I chewed through the SQL dump, but you don't have to: here's the title list as a text file.

Okay! We are ready to begin. Launch Python, and type:

from textgenrnn import textgenrnn
textgen = textgenrnn()
textgen.train_from_file('ifdb-titles-2018.txt', num_epochs=50)
textgen.save('ifdb-titles-epoch50.hdf5')

The train_from_file() line is the training step, and it takes a while. On my MacBook Pro, it took just about four hours. The fan was spinning the whole time.

This goes through 50 epochs (training rounds). After each one, it prints some samples using different "temperature" settings. The temperature is roughly the wild-and-crazy parameter. Low values mean "stick to patterns which are very common in the training input"; high values mean "bounce all over the place."
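
Mechanically -- and this is the standard trick, not necessarily textgenrnn's exact code -- the temperature divides the network's output scores before they're turned into probabilities:

import numpy as np

def sample_with_temperature(logits, temperature):
    # Low temperature exaggerates the score differences (play it safe);
    # high temperature flattens them (bounce all over the place).
    scaled = np.array(logits) / temperature
    scaled -= scaled.max()  # for numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()
    return np.random.choice(len(probs), p=probs)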

(I think that the generator starts with a lot of English training information out of the box, and then adds the ifdb-titles-2018.txt info on top of that. The documentation of textgenrnn isn't all that great.)

The last line saves the trained model to a file, so that you don't have to go through the training regime every time you want output.

Really, you don't have to go through the training regime at all. You can download the trained model from my web site. (It's a 1.8 megabyte binary file.) With that, you just need to type:

from textgenrnn import textgenrnn
textgen = textgenrnn('ifdb-titles-epoch50.hdf5')
textgen.generate(20)

This spits out twenty lines using a default temperature value of 0.5. This is decent, although it's not very adventurous about the first word:

  • The Play of Feeding
  • The Underground City
  • The Life of a Family Cruise
  • The Forest BrimstUn
  • The Legend of the Old Miser
  • (...and so on, I'm not pasting all twenty here)
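
If you'd rather collect a big batch in a file than scroll past it in the terminal, the module also has a generate_to_file() method (the filename here is just an example):

textgen.generate_to_file('generated-titles.txt', n=100, temperature=0.5)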

If you turn the temperature down, you'll get very clichéd results, with a lot of names copied straight from the original list. (There's a quick filter for that after this sample.)

textgen.generate(20, temperature=0.2)
  • The Last Ship of Moon
  • The Secret of GranadiaR
  • The Crown of Lamentas
  • The Case of the Holy Grail
  • The Lost Treasures of Infocom
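
If you want to weed out the copies automatically, a set lookup against the original list does it. A sketch, assuming one title per line, with generated-titles.txt standing in for wherever you saved your output:

with open('ifdb-titles-2018.txt') as fl:
    real_titles = set(line.strip().lower() for line in fl)
with open('generated-titles.txt') as fl:
    for line in fl:
        # Keep only the titles that aren't already in IFDB.
        if line.strip().lower() not in real_titles:
            print(line.strip())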

If you turn the temperature up, things get more interesting. At 1.5 it can't stay word-like for more than a few letters at a time:

textgen.generate(20, temperature=1.5)
  • Lours Part 1 Helceonouse: Qajon Longe counter
  • Ilnerta World
  • Krajibe
  • Strimpo
  • Island

I recommend 0.8 for a good variety:

textgen.generate(20, temperature=0.8)
  • Harvess
  • Solaring Indus the Banacistion Bear
  • The Wide
  • Oscura
  • The Primrose
  • Avacaduare
  • Sherwood Forest Helped by a McK Princess Offings
  • In the Lophent of Wi Kolt Adventure
  • The Circular Tricture Prehis
  • Time Quest
  • Magic Mansion
  • Rhahe and the Santa's Und M
  • Kingdom of Violetral
  • Destiny
  • For A Spinutworkwoves

Note that there are a lot of Spanish IF games in the original list, so the generator jumps into Spanish fairly often.

I'm sure there are lots more parameters I could be tweaking, but as I said, the documentation isn't great.
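
From skimming the README, a few knobs look promising, though I haven't verified them: new_model=True trains from scratch instead of refining the bundled English model, gen_epochs spaces out the sample printouts, and word_level=True generates whole words instead of letters:

textgen.train_from_file('ifdb-titles-2018.txt',
    num_epochs=50,
    gen_epochs=10,     # print samples every 10 epochs, not every epoch
    new_model=True,    # start fresh rather than refining the stock weights
    word_level=True)   # generate word-by-word instead of letter-by-letter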

This has been my weekend experiment with neural nets! textgenrnn is very easy to get started with. If all you want to work with is lines of text, it's your monkey.

This brings up an interesting question. I've seen lots of neural-net projects that deal with text and image data. What about other formats?

3D models, for example. A voxel image (3D array of pixels) should be doable, but I don't know if the standard plug-and-play libraries can do it. Can you just slice your voxel array into 2D layers, paste them together into a rectangular image, and feed it into an image processor? Are the neural nets set up to assume a raster image layout, or do they not care? (See, I told you I don't know what I'm doing.)
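
The slicing step, at least, is easy; here's a numpy sketch that lays a voxel array's depth slices side by side as one 2D image. Whether an image-trained net would make sense of the result is the part I can't answer.

import numpy as np

def voxels_to_montage(vox):
    # vox has shape (depth, height, width). Lay the depth slices
    # side by side, producing one (height, depth * width) image.
    depth, height, width = vox.shape
    return vox.transpose(1, 0, 2).reshape(height, depth * width)

# A 16-slice 32x32 voxel grid becomes a 32x512 strip:
# voxels_to_montage(np.zeros((16, 32, 32))).shape == (32, 512)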

What about traditional 3D model files, which are meshes of 2D polygons in 3-space? Or what about simple graphs, treated purely as connectivity with no spatial positions? Could you feed every CYOA book network into a net and pop out generative book geometries? I have no idea. Googling around turns up this blog post by Thomas Kipf, which indicates that people are at least thinking about the problem.

