For my assignment, I decided to analyze transcripts from the Loebner Prize in Artificial Intelligence’s homepage. The Loebner Prize is a competition held each year in which computer programs aiming to impersonate a human in chat are pitted against human judges, whose job is to decide whether the entity they are conversing with in an instant messenger interface is a human confederate or a computer contestant. A quick perusal of the transcripts from the 2005 contest at http://loebner.net/Prizef/2005_Contest/Transcripts.html reveals that most of them, well, suck. However, I wanted to find out whether we could quantify the results; the Monk piece suggests that co-referring expressions such as “it,” “that,” “they,” “he,” and “she” are evidence that two speakers believe they have achieved common ground. Although the paper describes them as generally infrequent in text-based communication, I wanted to see whether they would appear more often in dialogue with a human confederate or with Jabberwacky, the 2005 contest winner (its transcripts are at the bottom of the page linked above). My hypothesis was that because the computers would be unable to achieve common ground with the human judges and are not smart enough to stay on topic across a chain of messages, there would be fewer co-referring expressions in the judge-computer conversations than in the judge-confederate conversations.
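For concreteness, here is a rough sketch of how these counts could be automated in Python. It assumes each transcript has been saved as a local plain-text file (the file names below are placeholders, not the actual transcript files), and it skips contractions like “it’s,” so it will not necessarily reproduce the hand counts below exactly.

```python
# Rough sketch: tally the co-referring expressions in a transcript file.
# The file names used below are placeholders, not the actual transcript files.
import re
from collections import Counter

COREF_WORDS = {"it", "that", "they", "he", "she"}

def count_coref(path):
    """Return per-word counts and the total for one transcript."""
    with open(path, encoding="utf-8") as f:
        # Lowercase and normalize curly apostrophes so contractions stay intact.
        text = f.read().lower().replace("\u2019", "'")
    # Contractions ("it's", "that's") become single tokens and are not counted.
    tokens = re.findall(r"[a-z']+", text)
    counts = Counter(t for t in tokens if t in COREF_WORDS)
    return counts, sum(counts.values())

if __name__ == "__main__":
    for label, path in [("Confederate", "confederate_1.txt"),
                        ("Jabberwacky", "jabberwacky_1.txt")]:
        counts, total = count_coref(path)
        print(f"{label}: {dict(counts)} (total {total})")
```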
My hypothesis was actually wrong:
Conversation 1:

| Word  | Confederate | Jabberwacky |
|-------|-------------|-------------|
| it    | 2           | 7           |
| that  | 3           | 6           |
| they  | 0           | 4           |
| he    | 0           | 1           |
| she   | 2           | 0           |
| total | 7           | 18          |

Conversation 2:

| Word  | Confederate | Jabberwacky |
|-------|-------------|-------------|
| it    | 16          | 3           |
| that  | 6           | 6           |
| they  | 1           | 0           |
| he    | 0           | 2           |
| she   | 0           | 0           |
| total | 23          | 11          |

Conversation 3:

| Word  | Confederate | Jabberwacky |
|-------|-------------|-------------|
| it    | 5           | 4           |
| that  | 1           | 8           |
| they  | 0           | 0           |
| he    | 0           | 0           |
| she   | 0           | 0           |
| total | 6           | 12          |

Conversation 4:

| Word  | Confederate | Jabberwacky |
|-------|-------------|-------------|
| it    | 3           | 9           |
| that  | 5           | 8           |
| they  | 1           | 0           |
| he    | 0           | 0           |
| she   | 0           | 0           |
| total | 9           | 17          |

Averages:

|               | Confederate | Jabberwacky |
|---------------|-------------|-------------|
| Average total | 11.25       | 14.5        |
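The averages are simply the per-conversation totals divided by four: for the confederates, (7 + 23 + 6 + 9) / 4 = 45 / 4 = 11.25, and for Jabberwacky, (18 + 11 + 12 + 17) / 4 = 58 / 4 = 14.5.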
There are a few explanations for this result. For one, I suspect that Jabberwacky’s creator programmed it to use co-referring expressions for precisely the reason behind my hypothesis: in order for the computer to stand a chance of seeming human, it has to be able to refer back to previously discussed topics in a natural way, so it’s possible that Jabberwacky uses these expressions in an almost exaggerated fashion. Also, the human judges often seemed confused by Jabberwacky’s bizarre sentences and would ask it about what it had just said, and questions like that tend to contain words such as “that” and “it.” Finally, most of the conversations with the confederates are shorter; the judges seemed to realize they were speaking to a human and chose to spend more time conversing with the other entity in order to study it.
However, I found it interesting that there was a huge spike in the use of “it” in the second conversation between a judge and a confederate. Looking at the transcript, the two became interested in talking about the RIAA’s copyright policies, and that topic fueled the rest of the discussion. This is something that could not have happened with Jabberwacky; the frequency count of any single word in the Jabberwacky transcripts never rises above 9. Again, I suspect this is because Jabberwacky cannot stay on topic.
3 comments:
I found your analysis to be very interesting, especially since you were analyzing conversations that were generated from both humans as well as the artificially created "Jabberwacky". I agree that your hypothesis was probably not confirmed due to the fact that the programmer specifically designed Jabberwacky to use "it", "that", etc. This has interesting implications for the future: if we create artificial intelligence that will replace humans, how will language use factor into how "realistic" the conversation of artificial creatures is?
You never know what you can learn from automated beings when studying communication theory...
I thought this was a very creative avenue to take with regard to the assignment description. I must say I'm sort of surprised by the findings. My assumption would have been that expressions of coreference are much rarer in conversations with bots, because it would be more difficult for a bot to establish common ground. The fact that this is not the case is really fascinating. After examining some of the transcripts, though, I found that the results might be somewhat misleading, as the bot's usage of "it," "that," and "they" is sometimes a little ham-fisted. Still, this was a really interesting idea.