Friday 21 July 2017

Anti-Songs, Nonsense and Gibberish: Computer Generated Writing

We had a really exciting meetup including a talking, moving robot!

Céline and her NAO robot.

There were two talks, both having a common thread of text processing algorithms and bringing text to life with a robot or animation.

Céline's slides are here: https://www.slideshare.net/CelineBoudier/nonsense-and-gibberish-computer-generated-writing

A video of the talk is here - including the moving, talking robot: https://skillsmatter.com/skillscasts/10052-anti-songs-nonsense-and-gibberish-computer-generated-writing


Generated Writing

Ever since electronic machines were used to automate calculations, there have always been a few who have wanted to explore the creative possibilities of using such computers.

Computers can be additional tools for artists, with electronic displays and digital image effects to augment the traditional pencil and paintbrush. But another, more interesting, approach has been to see if computers can be more than just an artist's electronic brush ... to see if computers can be responsible for more of the creative process itself.

Could computers compose images, choose shapes, colours and arrangements themselves? The answer is yes, but a human still has to define the algorithms - the rules, the logic - by which a computer generates art. And that even more critical role is what an algorithmic artist does.

Most algorithmic art has been visual - colours, shapes, patterns, animations - on an electronic canvas. But art is broader than images on a surface - it includes words, literature, poems, songs, spoken word performance.

So it is natural to ask - can computers generate writing, poems, songs, literature even? Céline has been exploring this very question.


"Artificial Surrealism, Computational Poetry"


Natural Language Processing

Unlike numbers and other structured data, the language we humans speak and write is far from the precise, consistent and unambiguous language of mathematics or programming languages. Our human language has evolved organically, with imperfections, inconsistencies, and irregularities which are actually what make these languages interesting, beautiful and evocative.

That's great for literature - but makes an algorithmic artist's job more challenging!

Over the last few decades, the fields of computational linguistics, text mining and natural language processing (NLP) have really come along, with huge progress in the last few years.

Céline has based much of her work on methods from text analytics and NLP.

For example, she demonstrated how, by breaking text down into trigrams (sequences of three characters) and counting how often each one occurs, it becomes possible to determine whether the text is French or English. This works well because the distribution of trigrams differs between English and French. English, for example, has lots of the trigram "the", whereas French does not.
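To make the trigram idea concrete, here is a minimal Python sketch. It is not Céline's code: the two sample sentences, the function names and the overlap score are all assumptions made for illustration, and a real language detector would learn its trigram profiles from far more text.

```python
from collections import Counter

def trigram_profile(text):
    """Return the relative frequency of each character trigram in text."""
    text = " ".join(text.lower().split())  # normalise case and whitespace
    counts = Counter(text[i:i + 3] for i in range(len(text) - 2))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tri: n / total for tri, n in counts.items()}

def similarity(profile_a, profile_b):
    """Overlap score: for each shared trigram, add the smaller frequency."""
    return sum(min(freq, profile_b[tri])
               for tri, freq in profile_a.items() if tri in profile_b)

# Tiny made-up training samples (assumption: illustration only).
SAMPLES = {
    "english": "the cat sat on the mat and then the dog chased the cat through the garden",
    "french": "le chat est sur le tapis et ensuite le chien a suivi le chat dans le jardin",
}
PROFILES = {lang: trigram_profile(text) for lang, text in SAMPLES.items()}

def guess_language(text):
    """Pick the language whose trigram profile overlaps most with the text."""
    profile = trigram_profile(text)
    return max(PROFILES, key=lambda lang: similarity(profile, PROFILES[lang]))

print(guess_language("the bird is in the tree"))
print(guess_language("le chien est dans le jardin"))
```

The English sentence scores highly against the English profile largely because of "the" and its neighbours " th" and "he ", which is exactly the effect described above.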



Gibberish

Gibberish is not gibberish. It has a distinguished pedigree, from Lewis Carroll's Jabberwocky (published in 1871, though its first stanza dates from 1855),


Hey Diddle Diddle from as far back as the 16th century,



to more modern performance art:



Markov Chains

Céline's approach to generating her own gibberish makes use of another powerful and popular tool in text analytics, Markov Chains.

This approach learns how likely something (a word) is to follow another thing (another word). In a body of English text, you would expect the word "there" to be followed by "is" or "are", but being followed by the word "banana" is extremely unlikely - because it wouldn't be grammatically correct. You can imagine the words "there probably" might occur, but not very often.

The beauty of this approach is that it neatly captures the essence of a language's structure and grammar - because it is capturing language as it is and isn't used.

The following diagram shows how a reasonable sentence is generated by considering the numerical probabilities of words following the current word.


You can see that the word "park" has a much lower probability of following "I" than the word "walked". So we select "walked" to build our sentence. We repeat the process, and find that "on" and "in" are not unlikely after "walked" but here "in" wins out. It wins because the body of text from which these probabilities are learned just happened to have more "in" than "on". More sophisticated generation might pick "on" or "in" at random, in proportion to these probabilities, rather than always having the same winner (which can lead to text loops).
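The process just described can be sketched in a few lines of Python. This is an illustrative sketch rather than the code used in the talk: the tiny corpus, the function names and the `greedy` flag are all assumptions. It shows both strategies: always taking the most likely next word, and sampling the next word in proportion to how often it was seen.

```python
import random
from collections import Counter, defaultdict

def build_chain(text):
    """Count, for every word, which words follow it and how often."""
    words = text.split()
    chain = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        chain[current][following] += 1
    return chain

def next_word(chain, word, greedy=False, rng=random):
    """Pick a follower: the most frequent one, or a weighted random one."""
    followers = chain.get(word)
    if not followers:
        return None
    if greedy:
        return followers.most_common(1)[0][0]
    choices, weights = zip(*followers.items())
    return rng.choices(choices, weights=weights, k=1)[0]

def generate(chain, start, max_words=10, greedy=False, rng=random):
    """Build a sentence word by word until no follower is known."""
    sentence = [start]
    while len(sentence) < max_words:
        word = next_word(chain, sentence[-1], greedy, rng)
        if word is None:
            break
        sentence.append(word)
    return " ".join(sentence)

# Made-up corpus (assumption): "in" follows "walked" more often than "on".
corpus = ("I walked in the park . I walked in the rain . "
          "I walked on the beach . I sat in the park .")
chain = build_chain(corpus)

print(generate(chain, "I", greedy=True))  # I walked in the park . I walked in the
print(generate(chain, "I"))               # varies from run to run
```

The greedy output already shows the text loop mentioned above: the same winner is chosen every time, so the sentence repeats itself. Weighted sampling lets "on the beach" or "in the rain" appear now and then.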

Here is a nice visual and interactive demonstration of Markov Chains - link.

Céline uses a variation of this idea, where word pairs, instead of single words, are considered. This creates more plausible text.
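A rough sketch of the word-pair variation might look like the following. The details are assumptions, not taken from the talk: here the chain is keyed on the last two words, and followers are stored in a list with repeats so that a plain random choice is automatically weighted by frequency.

```python
import random
from collections import defaultdict

def build_pair_chain(words):
    """Map each pair of consecutive words to the words seen after it."""
    chain = defaultdict(list)
    for first, second, third in zip(words, words[1:], words[2:]):
        chain[(first, second)].append(third)
    return chain

def generate_from_pairs(chain, start_pair, max_words=12, rng=random):
    """Extend the text using the last two words as the current state."""
    output = list(start_pair)
    while len(output) < max_words:
        followers = chain.get((output[-2], output[-1]))
        if not followers:
            break
        output.append(rng.choice(followers))
    return " ".join(output)

# Made-up example text (assumption).
words = "the cat sat on the mat and the cat ran on the road".split()
pair_chain = build_pair_chain(words)

print(pair_chain[("the", "cat")])  # ['sat', 'ran']
print(generate_from_pairs(pair_chain, ("the", "cat")))
```

Because each choice depends on two words of context rather than one, the generated text stays closer to phrases that really occurred, which is why it tends to read as more plausible.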


She also tried a model using the bits of words (trigrams), not just whole words or pairs of words, to generate new words based on the likely connectedness of these bits ... the results are certainly interesting! Spectangular! Peacockney!

Having the NAO robot act out these generated texts is really fun, and adds a new dimension to the performance of these words, especially as the robot moves its body as it enunciates the words.
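The same chain idea works at the level of letters. The sketch below is a guess at how such a word inventor could be built, not Céline's implementation: the word list, the `^` and `$` start and end markers, and the function names are all assumptions. Each step looks at the last two letters and picks a third letter that followed them somewhere in the word list.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # hypothetical markers for word start and word end

def build_letter_chain(words):
    """Map each two-letter context to the letters seen right after it."""
    chain = defaultdict(list)
    for word in words:
        padded = START * 2 + word.lower() + END
        for i in range(len(padded) - 2):
            chain[padded[i:i + 2]].append(padded[i + 2])
    return chain

def invent_word(chain, max_length=14, rng=random):
    """Grow a new word letter by letter from overlapping trigrams."""
    context, letters = START * 2, []
    while len(letters) < max_length:
        nxt = rng.choice(chain[context])
        if nxt == END:
            break
        letters.append(nxt)
        context = context[1] + nxt
    return "".join(letters)

# Made-up word list (assumption), chosen so that blends are possible.
WORDS = ["spectacular", "rectangular", "triangular", "peacock", "cockney"]
letter_chain = build_letter_chain(WORDS)

for _ in range(5):
    print(invent_word(letter_chain))
```

With this word list, the context "ta" can be followed by "c" (as in "spectacular") or "n" (as in "rectangular"), so a blend such as "spectangular" is one of the words the chain can produce.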

"... lost in his shining eyes ..."


Anti-Texts

An interesting idea is to see if there exists an opposite or inverse of a text. This idea is appealing because we love and explore inverses in mathematics all the time, sometimes finding quite interesting objects.

How do we find the opposite of a piece of text? Well, there are many suggestions we can come up with, and a good one is to take descriptive words, like adjectives or some verbs, and replace them with their opposite. Simply putting a "not" before a sentence to find a logical negation isn't that interesting.


That's a great start, but very quickly we find words that don't have obvious opposites. What's the inverse of "stars"? So an augmented approach is to replace them with their dictionary definition, and then apply this simple word replacement to that extended text. You can see this applied next. Céline used the venerable NLTK Python natural language toolkit to help pick out candidate words.


You can keep applying this cycle up to a given depth. The results applied to William Blake's Tyger Tyger are quite surreal!
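Here is a small Python sketch of that cycle. To keep it self-contained it uses two tiny hand-written dictionaries, which are purely hypothetical stand-ins; they are not the data Céline used. NLTK's WordNet interface (lemma antonyms and synset definitions) could presumably fill the same role, although the write-up does not say exactly which NLTK features were involved.

```python
# Hypothetical stand-in data (assumption): a real version would look these up.
ANTONYMS = {
    "bright": "dim",
    "night": "day",
    "hot": "cold",
    "large": "small",
}
DEFINITIONS = {
    "stars": "large hot bodies in the night sky",
}

def anti_text(text, depth=1):
    """Replace words by opposites; expand words without one via definitions.

    depth limits how many times a definition may be expanded in turn.
    """
    result = []
    for word in text.lower().split():
        if word in ANTONYMS:
            result.append(ANTONYMS[word])
        elif word in DEFINITIONS and depth > 0:
            # No obvious opposite: invert the word's definition instead.
            result.append(anti_text(DEFINITIONS[word], depth - 1))
        else:
            result.append(word)
    return " ".join(result)

print(anti_text("bright stars", depth=0))  # dim stars
print(anti_text("bright stars", depth=1))  # dim small cold bodies in the day sky
```

Raising `depth` lets words inside a definition be expanded by their own definitions in turn, which is where the text starts to drift into the surreal.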

Taking things a step further, the output of this method applied to the Yeats poem Aedh Wishes for the Cloths of Heaven was recited by the NAO robot to music ... creating a genuine rap vibe. For me, it evoked Pigeon by Cannibal Ox (click to open):


You have to watch the robot to really appreciate the power of performance - video link.


Art As Therapy

Roger gave a short talk on his experience of art, and algorithmic art, as part of a therapeutic process.

He was unfortunately a victim of an attack which left his hand and arm physically damaged. He went through a process of healing, including physiotherapy, but of course it is not just the physical self that needs to be healed and rebuilt.

Roger is working on a project to portray the relationship between a hand therapist and her patient. Hand therapists use physiotherapy and occupational therapy to help patients to recover hand functions and improve their ability to carry out activities of daily living.

On his quest to gather tools to support his story-telling with speech and visual elements, Roger discovered what open-source software can provide, but he told us he also discovered that the promise of the internet can ebb and flow. He described technical challenges and artistic opportunities when translating audio wave-forms into animated mouth movements, providing a quick but informative summary of several approaches that can be taken to this non-trivial task.

His story reminded me that art is nothing if it isn't about people, and that art either reflects our journeys or helps shape them.


And Finally ... 

NAO performing to music.