Tuesday, December 5, 2017

No Humans Required: Exploring the Possibility of Computer Generated Fiction

By Wil Forbis

“It’s not hard to generate a story. It’s not hard to tell a story. It’s hard to tell good stories. How do you get a computer to understand what good means?”

Mark Riedl, associate professor at the Georgia Institute of Technology



Lately, I’ve been playing around with self-editing apps such as AutoCrit, ProWritingAid and SlickWrite. These tools, all of which offer some functionality for free, scan a user’s text and flag common literary transgressions like poor word choice, word repetition, improper sentence length and passive verbs. The apps aren’t perfect and tend to be biased towards a modern, no-frills writing style (the work of Lovecraft would cause their circuits to overload) but they can be helpful. They have caught errors that I would have otherwise missed.

The robust feature set of these apps makes the point that writing is not a single process but many. Proper word choice, balanced sentences and good grammar are key to successful writing, as are larger concerns about structure, style and narrative. A good writer ties these skills together, though no one masters them all. Some writers are known for excellent pacing but banal vocabulary. Others earn acclaim for great plotting while being damned for wooden dialogue.

These editing apps do not, of course, make editing decisions for you; they simply offer suggestions. Still, I find myself wondering whether they could be used to automate the editing process. Then the question arises: will software eventually write? Will computers create works of fiction out of nothing, no humans required?

This question might seem premature. The apps I’m playing with are helping with the more mundane, technical aspects of writing but they aren’t anywhere near the creative side. They aren’t developing plots or characters, or exploring the emotional symbolism of colors or religious icons. And it’s hard to imagine they could.

Still, we recognize that creativity is not magic; it is a process that can be studied and deconstructed. Bookstores are filled with volumes about using the right side of the brain, or developing creative “flow”, or finding a step-by-step process to awaken the muse. And employing process-oriented steps is exactly what software is good at.

Additionally, creative computers are not science fiction. In the world of music, computers have been composing for some time. Programmer/musician David Cope has used software to generate thousands of hours of classical pieces. Several tech start-ups such as Amper and Jukedeck have been automating the creation of background music used in online videos and films. The quality of the music varies—nothing has yet appeared to make hit songwriters nervous—but it’s credible enough.

Computers are also getting into the writing game, specifically journalism. The “natural language generation” technology of a company called Automated Insights has been used by the Associated Press to write finance articles. A competing tech company, Chicago-based Narrative Science, has been generating sports and other statistics-heavy news stories for years. Jeff Bezos’s Washington Post is using an AI bot called Heliograf to massage raw data about politics into human-readable text.

Of course, journalism is not fiction (well, not all of it) and fiction is the kind of writing that requires the most creativity. Even there, progress is being made. A European academic project, the What-If-Machine (WHIM), constructs basic plot premises by analyzing data on the web. (The WHIM software teamed up with another program, PropperWryter, to write the plot structures for a musical that recently ran in London.) Another software tool, Scheherazade, developed at the Georgia Institute of Technology, writes original short fiction after analyzing human-penned stories. These tools haven’t produced anything that’s going to put authors out of work but we are at a point where speculation on how computers could create stories and novels is valid.

As mentioned previously, writing is really many different skills, and exploring every function software will need to obtain to pen fiction is beyond the scope of this article. I’m going to consider the possibility of software tackling three tasks inherent in narrative writing: plotting, pacing and word choice.

Plotting
Let’s define a plot as the “who, what, when, where, how and why” of a story. Its form can range from a one-paragraph synopsis to a ten-page story breakdown.

Automated plot development is not new; classic pulp fiction authors often used primitive plot generators. Erle Stanley Gardner employed a “plot wheel” to randomly combine story elements for his Perry Mason stories. Lester Dent swapped out elements in his “master fiction” plot to create stories for his Doc Savage novels.
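The plot-wheel idea of randomly combining story elements is simple enough to sketch in a few lines of Python. This is only an illustration of the general technique, not a reconstruction of Gardner’s or Dent’s actual tools; the categories, the story elements and the function names are all invented for the example.

```python
import random

# Hypothetical story elements, invented for illustration only.
PLOT_WHEEL = {
    "who": ["a disgraced lawyer", "a retired stunt pilot", "a small-town librarian"],
    "what": ["uncovers a forged will", "is framed for a robbery", "inherits a locked safe"],
    "where": ["in a desert mining town", "aboard a stranded ocean liner", "in a foggy harbor city"],
    "why": ["to clear a friend's name", "to repay an old debt", "to protect a family secret"],
}

def spin_plot_wheel(wheel, rng):
    """Pick one entry from each category, like spinning each ring of a wheel."""
    return {category: rng.choice(options) for category, options in wheel.items()}

def describe(plot):
    """Join the chosen elements into a one-sentence premise."""
    return f"{plot['who'].capitalize()} {plot['what']} {plot['where']} {plot['why']}."

rng = random.Random(7)  # fixed seed so repeated runs give the same premise
premise = spin_plot_wheel(PLOT_WHEEL, rng)
print(describe(premise))
```

Every spin yields a grammatical premise, but nothing here judges whether the premise is any good, which is exactly the gap discussed next.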

Those early efforts were crude compared to what today’s technology offers. We now have computers with incredible processing power and the ability to parse written material and learn to correlate meaning to words. Computers are starting to “understand*” that Sweden is a place and subject to the constraints that apply to places, or that dogs are living creatures and prone to various canine behaviors. As this capability expands in the future, software should have no problem filling in the “who, what, when, etc.” required for plot development.

*I realize I’m on tenuous philosophical ground when I imply computers can “understand” meaning, a feat that would presume they have some kind of consciousness. This is merely a writing shortcut: I make no claims about computers being able to think.

Early attempts will doubtless be underwhelming. (The WHIM software mentioned above already does this kind of plot development but only rarely creates gems.) Computers will need to not only understand what plots are, but what good plots are. How can this happen?

Two possibilities come to mind. One is that computers submit their plots to human reviewers. Via a crowdsourcing platform, plots could be ranked by engagement. As good plots are highlighted and bad plots down-voted, the data could then be fed back into machines that could analyze the differences. For example, computers might learn that lots of action or exotic locales are important to a good plot. (At least as defined by some readers.)

Another possibility hinges on a technique gaining ground in the world of artificial intelligence: deep learning. In this process, computers digest large amounts of data and observe trends and correlations in that data that might be missed by humans. Via deep learning, computers could analyze the text in every fiction book ever digitized*, as well as those books’ sales figures and critical reception. This could lead to numerous observations about what makes plots good or bad. That data could then be used to aid a WHIM-type tool in plot development.

*This kind of analysis is already occurring. Recently, scientists at the University of Vermont ran computer analysis on hundreds of stories and confirmed Kurt Vonnegut’s theory that most stories follow one of six plot outlines.

Pacing
Pacing can be thought of as the flow of a story, the speed with which it progresses. Action scenes (battles, break-ins, romantic encounters, etc.) speed up the pace while expository scenes (dialogue, ruminations, descriptions, etc.) slow things down. Good stories balance these two elements, though there’s no single, perfect formula.

Can computers automate the task of setting a story’s pace? To do so, they would need to be able to identify action scenes and expository scenes within text.

One way to define a scene’s nature is by identifying word types. Action scenes have a lot of action or emotion words like “scream,” “break,” “shoot,” “stab” and so on. Expository scenes have a lot of cerebral and calm words like “considered,” “wrote,” “says,” “mused” and so on.

Sentence length also indicates a scene’s character. Action scenes tend to have short, curt sentences that capture the frantic pace of what is being described. Expository scenes move more languidly and flesh things out over longer sentences.

These are two of many attributes that can be used to identify the pacing of text. With these tools in hand, computers could analyze stories and move scenes around to achieve a good balance between action and exposition.  
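The two signals just described, word types and sentence length, can be combined in a rough Python sketch. Everything here is an assumption made for illustration: the tiny word lists, the eight-word threshold and the function names are hypothetical, and a real tool would need far larger vocabularies and a better sentence splitter.

```python
import re

# Small hypothetical word lists; a real tool would need far larger vocabularies.
ACTION_WORDS = {"scream", "screamed", "break", "broke", "shoot", "shot",
                "stab", "stabbed", "ran", "grabbed"}
CALM_WORDS = {"considered", "wrote", "says", "said", "mused", "thought", "wondered"}

def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def scene_stats(text):
    """Count action words, calm words and the average sentence length in words."""
    words = re.findall(r"[a-z']+", text.lower())
    sentences = split_sentences(text)
    return {
        "action_hits": sum(w in ACTION_WORDS for w in words),
        "calm_hits": sum(w in CALM_WORDS for w in words),
        "avg_sentence_len": len(words) / max(len(sentences), 1),
    }

def classify_scene(text, short_sentence_limit=8):
    """Label a scene by word counts, using sentence length only to break ties."""
    stats = scene_stats(text)
    if stats["action_hits"] != stats["calm_hits"]:
        return "action" if stats["action_hits"] > stats["calm_hits"] else "exposition"
    return "action" if stats["avg_sentence_len"] <= short_sentence_limit else "exposition"

fight = "He grabbed the knife. She screamed. The window broke. They ran."
study = ("She considered the letter for a long while, and then she wrote a "
         "careful reply that she thought might settle the matter.")
print(classify_scene(fight), classify_scene(study))  # prints: action exposition
```

Once each scene carries a label like this, a program could at least measure the ratio of action to exposition in a draft. Deciding how to reorder scenes is a much harder problem that this sketch does not attempt.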

There’s much more to pacing than described here, but this provides a high-level view of how computers might tackle this writing challenge.

Word Choice
The need for variety drives good word choice. Readers don’t want to see the same word echoed over and over. All of the self-editing apps mentioned above already flag repeated words.
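Flagging repeated words is the easiest of these tasks to automate. The Python sketch below shows one plausible approach; it is not how AutoCrit, ProWritingAid or SlickWrite actually work, and the stop list, the 20-word window and the function name are assumptions chosen for the example.

```python
import re
from collections import defaultdict

# Hypothetical stop list: common words that are fine to repeat.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "it", "was",
              "he", "she", "his", "her"}

def find_echoes(text, window=20):
    """Return {word: [positions]} for words reused within `window` words of their previous use."""
    words = re.findall(r"[a-z']+", text.lower())
    last_seen = {}
    echoes = defaultdict(list)
    for position, word in enumerate(words):
        if word in STOP_WORDS:
            continue
        if word in last_seen and position - last_seen[word] <= window:
            echoes[word].append(position)
        last_seen[word] = position
    return dict(echoes)

sample = "The snake slid under the porch. He watched the snake and waited."
print(find_echoes(sample))  # prints: {'snake': [9]}
```

Positions are zero-based word indexes, so the second “snake” is the tenth word of the sample. Shrinking the window makes the check more forgiving of repeats that sit far apart.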

Of course, choosing word substitutes is not about blindly swapping out synonyms.  Several factors affect our choices. They include…

• Alliteration
We sometimes take advantage of the sound of language when finding a word. Say you’ve already used the word “snake” and now want to refer to it again, prefaced with the adjective “repulsive.” Instead of saying “the repulsive snake” you may choose “the repulsive reptile” to play off the alliterative properties.

• Syllable Count
Sometimes you want a word that has some beef to it. You may be referring to a “reprise” but instead choose “recapitulation” as a meatier substitute. In other situations you may seek shorter words to balance a sentence correctly.

• Genre/Style
The nature of the work will always have an effect on the words used. In a period detective story, a female character might be a “dame” or “moll,” while in a high society novel she may be a “lady” or “ingénue.”

• Intended Audience
Every author must write for his or her readers. Complex words should be avoided in kids’ novels but embraced in the fiction section of The New Yorker magazine.

There are many additional factors. Each of these could be thought of as a rule that could then be applied by software in the writing process. Via deep learning, computers could analyze existing stories and suss out the delicate ways these rules interoperate and influence each other. Computers may even develop new “styles” of word choice that humans find unique and engaging.
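To make the “rules applied by software” idea concrete, here is a minimal Python sketch that scores candidate words against two of the factors above, alliteration and syllable count. The scoring weights, the vowel-counting syllable estimate and the function names are all assumptions; a learned model would presumably discover its own weighting rather than use hand-picked numbers like these.

```python
import re

def count_syllables(word):
    """Rough estimate: count runs of vowels. English spelling makes this approximate."""
    return max(len(re.findall(r"[aeiouy]+", word.lower())), 1)

def alliterates(word, neighbor):
    """True when both words start with the same letter, a crude stand-in for matching sounds."""
    return word[0].lower() == neighbor[0].lower()

def pick_word(candidates, neighbor, target_syllables, banned=()):
    """Score each allowed candidate against the rules and return the best one."""
    def score(word):
        points = 0
        if alliterates(word, neighbor):
            points += 2  # alliteration rule (weight is arbitrary)
        # Syllable-count rule: penalize distance from the desired length.
        points -= abs(count_syllables(word) - target_syllables)
        return points
    allowed = [w for w in candidates if w not in banned]
    return max(allowed, key=score)

# "snake" has already been used, so it is banned; the adjective is "repulsive".
choice = pick_word(["snake", "serpent", "reptile", "viper"],
                   neighbor="repulsive", target_syllables=2, banned={"snake"})
print(choice)  # prints: reptile
```

Genre and audience could be added as further scoring terms in the same way, for instance by penalizing words missing from a hypothetical genre vocabulary. The hard part, as noted above, is how the rules interact.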

Summing it up
I don’t want to make any of this sound easy. Efforts to automate writing will likely evolve in fits and starts, and the road to progress will be littered with failures. I suspect much of the development will not be in the interest of replacing human authors but aiding them. Who wouldn’t want a “pacing recommendation engine” or an “automatic thesaurus”?

There’s also the possibility that some unique property, some tic of the human brain, grants a magic spark to the best human-created fiction. Computer authors may never replicate this. But it’s a mistake to think they have to. Computers don’t need to write like Shakespeare to be competitive in the marketplace since most published human authors don’t meet that standard. Sometimes “good enough” is fine.

When all this could happen is hard to say. According to the science fiction of yesteryear, we should all be flying around in jet packs right now. Predicting the future is a fool’s errand but I’m enough of a fool to claim that within 20 years we will have an automated writing tool capable of generating readable fiction. And in 50, 100, 500 years? Who knows?
