Creativity by Markov Chain, or Why Predictive Text Isn’t the Novel-Writing Shortcut You’re Looking For

Over the past year, I’ve played with Botnik’s predictive text generator to create everything from alternative histories of popular holidays to terrible Christmas carol lyrics to the median New Year’s resolutions. It’s fun, it’s silly, and it’s far more labor-intensive than most people imagine computer-generated text to be.

Most of the conversations I see around AI and text generation assume that writers are going to be put out of business shortly. They assume that AI can not only generate text but generate it well, without human intervention.

These assumptions are…a bit overdone.

Here’s why predictive-text novels won’t be the next big trend in literature.


What’s a Markov Chain?

Predictive text is typically powered by a Markov chain, a model that tracks a set of defined “states” and, given the state you’re currently in, determines the probability of jumping to each possible next state.

For instance, if you wanted to create a super-simple Markov chain model of a writer’s behavior, “writing” might be one state and “not-writing” might be another. (This list of possible states is called a “state space.”) At any given time, the writer is either “writing” or “not-writing.”

There are four possible transitions between “writing” and “not-writing”:

  1. writing to writing (“must finish paragraph!”),
  2. writing to not-writing (“what’s on Netflix?”),
  3. not-writing to writing (“once…upon…a…time”), and
  4. not-writing to not-writing (“why yes, I WILL binge all of The Witcher, thanks”).

Thus, the probability of making any one of these transitions from a given state is 0.5 (here’s a visual representation). At least at the beginning.
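If it helps to see that as code, here’s a minimal sketch in Python of the toy writer model above. The state names and the 0.5 probabilities come straight from the example; nothing here reflects how any real predictive keyboard works.

```python
import random

# The "state space": every state the writer can be in.
states = ["writing", "not-writing"]

# Transition probabilities before the chain has seen any data:
# from either state, each possible jump is equally likely (0.5).
transitions = {current: {nxt: 0.5 for nxt in states} for current in states}

def next_state(current):
    """Pick the next state using the probabilities attached to the current state."""
    options = transitions[current]
    return random.choices(list(options), weights=list(options.values()))[0]

# Simulate a week of writerly behavior, starting from "writing".
day = "writing"
for _ in range(7):
    day = next_state(day)
    print(day)
```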

Markov chains also have a limited ability to learn from data inputs. For instance, one could program a two-state Markov chain to predict whether you will write or not-write on any given day, based on last year’s calendar. (If you’re like me, your Markov chain will be more likely to predict that you’ll write tomorrow if you wrote today, and more likely to predict not-writing tomorrow if you didn’t write today.)
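Here’s what that limited learning might look like for the writer model: count last year’s day-to-day transitions and turn the counts into probabilities. The log below is invented for illustration.

```python
from collections import Counter

# An invented slice of last year's calendar: True = wrote that day.
log = [True, True, False, False, True, True, True, False, True, False]

def label(wrote):
    return "writing" if wrote else "not-writing"

# Count every day-to-next-day transition.
counts = Counter((label(today), label(tomorrow)) for today, tomorrow in zip(log, log[1:]))

# Convert counts into probabilities, one row per current state.
learned = {}
for state in ("writing", "not-writing"):
    total = sum(n for (current, _), n in counts.items() if current == state)
    learned[state] = {
        nxt: counts[(state, nxt)] / total for nxt in ("writing", "not-writing")
    }

print(learned)
# The chain now predicts tomorrow from today alone -- which is exactly
# the "limited" part: one day of history is all it ever considers.
```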

What Does This Have to Do With Predictive Text?

Predictive text algorithms are Markov chains. They analyze words you have input in the past (or in the case of Botnik, how often words appear in proximity to other words) in order to predict the probability of you jumping to a particular word from the state “the word you just wrote.”
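In code, that amounts to building a table of “which words followed this word, and how often” from some source text and treating the counts as transition probabilities. Here is a stripped-down sketch of the general idea; it is not Botnik’s actual implementation, and the corpus is just a placeholder.

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat and the cat slept on the mat".split()

# For every word, count the words that immediately followed it.
followers = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    followers[current][nxt] += 1

def suggest(word, k=3):
    """Return the k most likely next words -- like a predictive-text keyboard row."""
    return [w for w, _ in followers[word].most_common(k)]

print(suggest("the"))  # e.g. ['cat', 'mat'] -- the current word is all the model sees
```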

Why Writing With Predictive Text Is Hard

You don’t need to understand the nuances of Markov chains to grasp that a book written by one would be tough to produce – but that understanding does make it easier to explain why.

Markov Has a Short Memory

As mentioned above, Markov chains have a limited ability to adjust their predictions based on factors like how frequently a state appears or how often it appears relative to (as in, before or after) other states.

The key word in that sentence is limited.

Beyond the current state, Markov chains have no memory of the past. They can tell you which word is most likely to appear after this word, but they can’t tell you whether that prediction has already appeared 500 times or not at all.

In online predictive-text memes, this means that some results get stuck in an endless loop. For instance:


Predictive text meme Tweet that reads “Trans people are going to be a good time to get a chance to look at the time to get a chance to look at the time to get a chance to look at the time….” A response reads “Ok but did you get a chance to look at the time?”

This was a response to a predictive-text meme on Twitter that challenged people to type “Trans people are” into their phones and then hit the predictive-text suggestion to generate a result. This Twitterer’s predictive text got caught in a loop pretty quickly – it doesn’t recognize that it said “time to get a chance to look at the” already. It takes another human to save the joke here: “Ok but did you get a chance to look at the time?”
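You can reproduce that failure mode in a few lines: build the same kind of next-word table and always take the single most likely suggestion. Because the model only ever sees the current word, as soon as it revisits a word it has already passed through, it repeats the same path forever. The toy corpus below is invented for the demonstration; it isn’t the tweeter’s actual keyboard data.

```python
from collections import Counter, defaultdict

corpus = ("trans people are going to be a good time to get a chance "
          "to look at the time to get a chance to look at the time").split()

# Same next-word frequency table as before.
followers = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    followers[current][nxt] += 1

# Greedy generation: always take the single most likely next word.
word, output = "trans", ["trans"]
for _ in range(25):
    if word not in followers:
        break  # reached a word with no recorded successor
    word = followers[word].most_common(1)[0][0]
    output.append(word)

print(" ".join(output))
# With deterministic "top suggestion" choices and a finite vocabulary, the
# sequence must eventually revisit a word and then cycle -- the loop in the meme.
```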

What Does This Mean for a Predictive-Text Novel?

A Markov chain’s predictive limitations pose two problems for long-form creative text generation:

  1. The Markov chain can get stuck. The more common a word is, the more likely it is to get stuck. “A,” “and,” “the,” “of,” and similar function words can easily trap the chain.
  2. Novels depend on memory. Story development requires attention to what came before. Predictive text, however, can only predict which word is most likely to come next; it can’t do that in the context of prior theme, character, or plot development.

The results, therefore, are more likely to be incomprehensible than anything else – at least without careful editing. (I’ll get to that below.) For some examples of absurdist Markov chain results, see r/SubredditSimulator, which consists entirely of Reddit posts by Markov chains.

The Raw Material Blues

While generating last year’s various holiday posts on Botnik, I quickly discovered that the raw material fed to the predictive text generator makes a huge difference in the quality of the output.

If you’ve read the post series, you may have noticed a trend: In each one, I note that I fed “the first page of Google search results” or “the first twenty Google search results” to Botnik (those are the same number, by the way). Why so specific?

It appears that the minimum ratio of source text to output Botnik requires to produce text that is funny but not incomprehensible is about 20:1. In other words, if I wanted a blog-post-sized text, I needed to put in at least 20 texts of equal or greater length.

Twenty to one might even be undershooting it. Most of my predictive-text posts are around 500 words, while the top Google results from which they were generated tended to run 1,500 to 2,000 words each. Twenty of those is 30,000 to 40,000 words of source text, which puts the real ratio closer to 60:1 or 80:1.

What Does This Mean for a Predictive-Text Novel?

I haven’t tested this ratio on anything longer than a blog post. I do not, however, have any reason to believe that the ratio would be smaller for a novel. In fact, I predict the ratio would be larger for a coherent novel that looked sufficiently unlike its predecessor to survive a copyright challenge.

In every holiday blog post I generated via predictive text, the generator got “stuck” in a sentence of source text at least once. In other words, the Markov chain decided that the most likely word to follow the one on screen was the next word that already existed in a sentence somewhere in my source text.

When generating text from Google’s top twenty blog posts on the history of Thanksgiving, for instance, it was pretty easy to pick up on these sticking points. I didn’t have the entire source text memorized, but I knew my Thanksgiving history well enough to recognize when Botnik was being unfunnily accurate.

For a predictive-text novel of 70,000 words, one would need:

  1. Approximately 1.4 million words of source text (minimum), or about twenty 70,000-word novels, and
  2. A sufficient knowledge of that source text to recognize when the predictive text generator had gotten stuck on a single sentence or paragraph.

Point 2 has some creative opportunities. A predictive-text novella based on Moby-Dick, for instance, might benefit from repeating a large chunk of Moby-Dick verbatim. (Moby-Dick runs roughly 209,000 words, so at 20:1 said novella would need to stay under 10,455 words to fit within the source text limitations, if you’re wondering.) But the writer would still have to know Moby-Dick well enough to recognize when predictive text was simply reciting the book versus when it wasn’t:

 We, so artful and bold, hold the universe? No! When in one’s midst, that version of Narcissus who for now held somewhat aloof, looking up as pretty rainbows in which stood Moby-Dick. What name Dick? or five of Hobbes’ king? Why it is that all Merchant-seamen, and also all Pirates and Man-of-War’s men, and Slave-ship sailors, cherish such a scornful feeling towards Whale-ships; this is a question it would be hard to answer. Because, in the case of pirates, say, I should like to know whether that profession of theirs has any peculiar glory about it. Blackstone, soon to attack of Moby-Dick; for these extracts of whale answered; we compare with such. That famous old craft’s story of skrimshander storms upon this grand hooded phantom of honor!

A Future for Creative Writing?

I learned with the first predictive-text holiday post that I couldn’t accept the predictive-text generator’s first suggestion every time, nor could I click suggestions at random. I was still writing; it’s just that I was choosing the next word in each sentence from a predictive-text generator’s suggestions, not from my own much larger vocabulary.

Many conversations about predictive-text creative writing suggest or assume that predictive-text will eventually take over our own creative processes – that it will supplant writing rather than support it. Not in its current form, it won’t.

For me, some aspects of writing via predictive text are actually harder than writing on my own. The Markov chain frequently backs into function-word corners and has to be saved with the judicious application of new content words. Punctuation is typically absent. Because the algorithm has no idea what it wrote previously, it doesn’t know how to stay on topic, nor does it know how to build coherent ideas over time.

Everything it couldn’t do, I had to do – and I had to do it with my next word choice perpetually limited to one of eighteen options.

That said, I love the idea that predictive-text authoring could arise as an art form within writing itself. Predictive text generators challenge us to engage with the art and craft of writing differently: they impose new limitations, but they also suggest new possibilities. In so doing, they create an opportunity to engage with writing in new – and often hilarious – ways.

Anyway, here’s Wonderwall:

So maybe
Ya go to sadness baby
Cause when you tried
I have wasted dreams



How Insurance Might Just Save Us All

I’m not going to lie to you, Internets: I kind of hate insurance.

My first job out of law school was in insurance defense. We were the lawyers the insurance company calls when you sue them for refusing to cover your claims despite the years of cash you’ve tossed down their gaping maws just in case something catastrophic actually happened. My job was to explain, over and over again in increasingly tedious terms, why You No Can Has Payments No I Don’t Care If Your House No Longer Exists and Neither Does Your Leg.

Some of the cases I defended were legitimately nonsense, like the couple that wanted a brand-new guest house built entirely up to code even though they freely admitted they did not live in their guest house and their insurance policy said in bold all caps on every single page “THIS POLICY COVERS ONLY YOUR PRIMARY DWELLING.”

Others were absolutely heartbreaking, like the patient whose insurance was rescinded after they were diagnosed with terminal cancer, leaving their family on the hook for $500,000 in medical bills because the insurer claimed the patient had never told them about certain test results, which it turns out the doctor never even ordered.

As a consumer, I kind of hate it too. I only pay for it because I know what happens if I don’t have it. Like the case in which a drunk driver hit a teenager, lacerating the teen’s brain stem and causing them to burn through $1.5 million in medical bills in the first six weeks, none of which they’ll ever even remember because they spent all six of them in a coma.

(PS: The ACA has decimated the number of cases like those last two, in case you’re wondering whether ordinary Americans really need government-supported health insurance.)

It’s important, but I still kinda think it’s crap.

…Maybe. Today may have changed my mind.

How Insurance Might Just Save Us All

I don’t practice insurance law anymore, but I do still write content for insurance publications and, increasingly, for insurtech companies. Today I wrote an article that, as near as I can tell, is the first of its kind: a how-to guide for insurance companies that want to encourage smart home device use but don’t want to blow off their own legs in the privacy and security minefield those devices pose.

And fam? I think insurance companies might save us all. I really do.

Check out this creepy 150-page report from the Internet of Things Privacy Forum, arguing that smart devices are going to change the concept of privacy as we know it. The report argues that not only will we lose our sense of “private” and “public” spaces, we’ll even start losing the privacy of our own emotions, as devices get better at inferring emotional states from available data.

You didn’t want to sleep tonight, right?

Artificial intelligence, combined with omnipresent devices networked into a pool of information that expands exponentially and that only machines can currently parse, presents risks. Huge risks. Risks that, like the size of the problem itself, we can’t comprehend.

You know who absolutely loves contemplating risk? Insurance companies. And they’re good at it. Like, really, really good.

Governments have been slow to address data security concerns with smart devices, even though it’s increasingly common knowledge that any smart device is a giant open hole in your network security with a neon sign on it that screams “Free Mayhem Here!” California passed a bill that won’t take effect until next year. Worse, the UK’s standards for smart device security are voluntary.

But if insurance companies decide to protect their own behinds while still accessing that sweet, sweet customer data, we’ll see much higher demand for smart device security, and we’ll see it in a hurry. Sure, that security will probably only go one way, which is toward insurers’ interests. But that’s still better than what we have now.

Okay, insurance. I guess I’ll hate you slightly less now.