Can a computer program help us identify unknown writers? 2

The jury’s still out at present, but I am very grateful for the kind offers of help of various sorts! I now have text versions of episodes of:

  • “Slave of the Clock” and “The Secret of Angel Smith” (Jay Over)
  • “The Sentinels” (Malcolm Shaw)
  • “Fran of the Floods” (Alan Davidson)
  • “Concrete Surfer” (Pat Mills).

So I have enough to try to see if I can get the program to identify “Slave of the Clock” as being by Jay Over rather than any of the other writers. If anyone is able to send in any more texts, the following would be useful:

  • Some texts by female writers such as Anne Digby, Alison Christie, Benita Brown
  • Some more texts by the writers named above, so that I can offer the program a wider base of texts per writer (rather than just increasing the number of individual authors)

How far have I got? Not that far yet, I’m afraid. I have downloaded a copy of the program I chose (JGAAP) and I’ve got it to run (not bad in itself, as this is not a commercial piece of software with the latest user-friendly features). I’ve loaded up the known authors and the test text (Slave of the Clock). However, the checks that the program gives you as options are very academic, and hard for me to understand as it’s not an area I’ve ever studied. (Binned naming times, analysed by Mahalanobis distance? What the what??) Frankly, I am stabbing at options like a monkey and seeing what I get.

I can however already see that some of the kinds of checks that the program offers are plausibly going to work, so I am optimistic that we may get something useful out of this experiment. These more successful tests involve breaking down the texts into various smaller elements like individual words, or small groups of words, or the initial words of each sentence, or by tagging the text to indicate what parts of speech are used. The idea is that this should give the program some patterns to use and match the ‘test’ text against, and this does seem to be bearing fruit so far.
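To make that idea a bit more concrete, here is a minimal sketch in Python of the general approach described above. It is an illustration under my own assumptions, not how JGAAP itself works: the function names are hypothetical, the comparison method (cosine distance between frequency profiles) is simply one common choice, and the part-of-speech tagging option is left out because it needs an extra library. The two "writers" in the example are invented placeholder text, not real scripts.

```python
import math
import re
from collections import Counter


def words(text):
    """Individual lowercase words (letters and apostrophes only)."""
    return re.findall(r"[a-z'’]+", text.lower())


def word_pairs(text):
    """Small groups of words: here, each pair of adjacent words."""
    w = words(text)
    return list(zip(w, w[1:]))


def sentence_openers(text):
    """The initial word of each sentence (split on . ! ?)."""
    openers = []
    for sentence in re.split(r"[.!?]+", text):
        tokens = words(sentence)
        if tokens:
            openers.append(tokens[0])
    return openers


def profile(text, extract):
    """Relative frequency of each element produced by `extract`."""
    counts = Counter(extract(text))
    total = sum(counts.values())
    return {item: n / total for item, n in counts.items()} if total else {}


def cosine_distance(p, q):
    """0.0 for identical profiles, 1.0 for profiles with nothing in common."""
    dot = sum(value * q.get(item, 0.0) for item, value in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 1.0
    return 1.0 - dot / (norm_p * norm_q)


def closest_author(known_texts, test_text, extract=words):
    """Name of the known author whose profile is nearest the test text."""
    target = profile(test_text, extract)
    return min(
        known_texts,
        key=lambda name: cosine_distance(profile(known_texts[name], extract), target),
    )


# Invented placeholder text, purely to show the mechanics.
known = {
    "Writer A": "Oh no! Oh dear! Oh no, not again!",
    "Writer B": "Right, let us go. Right, off we go.",
}
print(closest_author(known, "Oh no! Oh no!", extract=words))
print(closest_author(known, "Oh no! Oh no!", extract=sentence_openers))
```

Swapping the `extract` argument between `words`, `word_pairs` and `sentence_openers` corresponds roughly to trying the different ways of breaking the text down, and with real episodes each known writer would need far more text than this for the patterns to mean anything.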

So, an interim progress report – nothing very definite yet but some positive hints. I will continue working through the options that the program offers, to see if I can narrow down the various analytical checks to a subset that look like they are successfully identifying the author as Jay Over. I will then run another series of tests with a new Jay Over file – I’ll type up an episode of “The Lonely Ballerina” to do that, unless anyone else has kindly done it before me 🙂 – scans from an episode are shown below, just in case! That will be a good test to see if the chosen analytical checks do the job that I hope they will…

Jay Over, Lonely Ballerina pg 1

Jay Over, Lonely Ballerina pg 2
Jay Over, Lonely Ballerina pg 3

6 thoughts on “Can a computer program help us identify unknown writers? 2”

  1. I’ve started on a few more Malcolm Shaw examples: episodes from The Four Faces of Eve, Lucky by Name… and Bella. Also I can do an episode of Bella by Primrose Cumming as contrast.

    The Sentinels document that I sent you also had the first episode of Tennis Star Tina by Anne Digby at the end.

    I imagine it’s a bit more difficult to capture the essence of the writing style with just the dialogue rather than the full script with dialogue and descriptions that must have been originally used. But I’m glad you’re making progress and are optimistic. 🙂

    1. So Malcolm Shaw wrote Four Faces of Eve? I didn’t know that.

      He did write The Robot Who Cried from Jinty.

      1. Yes, Pat Mills mentioned Malcolm wrote Four Faces of Eve in an article about Misty. He only worked with him for a little bit, but seems he would have liked to work more with him, pity he died so young.

        1. I am looking forward to seeing The Four Faces of Eve reprinted later this year, and I’m hoping that Malcolm Shaw and Brian Delaney get credited for their work. I would love it if The Sentinels were reprinted too; hopefully if they have success with the initial reprints we’ll see a lot more of them (maybe even with credits!)

          1. I think it is very likely that these two creators will get properly credited, as the initial announcement included information about them. Sadly I feel it is very unlikely that we will get credits on anything that is not already identified, because the records just aren’t there, as far as I am aware.
