Can a computer program help us identify unknown writers?

I don’t know yet, but I’m going to give it a go.

And I’ll need a little help from others, please.

I have been thinking about the problem of unknown writers and how we can try to identify them. In writing story posts here, Mistyfan and I sometimes raise the question of whether the writer of one story might also have written another, based on things like similar plot lines. But there is a whole area of research into using computers in the Humanities, and a specific technique designed to help attribute authorship of unattributed texts: it’s called Stylometry. I want to try one of the pieces of software that does this – JGAAP – to see if it can help us think about who might have written what, at least in some cases. (Edited to add: JGAAP was written by the chap who did the analysis that strongly suggested that J K Rowling was the author of “The Cuckoo’s Calling”.)

The way it works is that I need to feed the program a number of texts from Known Authors, because it compares the unknown writing with those known samples. (All it can ever do is say ‘this piece looks most likely to have been written by Author A out of the list of A – Z that you have given me’ – it’s just matching a sample to a known finite list, so it can never point to a writer who isn’t on that list.) That means I need some text files (as many as possible) which are typed-up versions of stories where we already know the authors, such as those below:

  • Jay Over, Slave of the Clock / The Secret of Angel Smith / The Lonely Ballerina from Tammy 1982 and 1983
    • I can do the first two but haven’t got any copies of The Lonely Ballerina
  • Alison Christie – see list on the interview post
  • Pat Mills, various stories including Moonchild in Misty and Concrete Surfer in Jinty
    • I am in the middle of typing up the episode of Concrete Surfer included in the post about this story
  • Alan Davidson, Fran of the Floods / The Valley of Shining Mist / Gwen’s Stolen Glory
  • Malcolm Shaw, The Robot Who Cried

Can anyone help by typing up one or more episodes from the stories mentioned, and sending them to me? I’m working out a standard format to use, because it’s going to be important to be consistent about things like how to indicate thought balloons or the text boxes at the beginning of each episode. We can work that out further together of course. Very many thanks in advance!

Once I have enough example files to start running them through the program, this is what I am intending to try (any comments or suggestions will be received with interest).

  1. Can I get the program to work at all?
    • If I load a credited Jay Over text as a Known Author, and a Pat Mills story likewise as a Known Author, will an episode of “Slave of the Clock” be successfully identified as a Jay Over story?
  2. What if I then compare a credited “Pam of Pond Hill” story – will the program identify this as a Jay Over story, or will the comedy style mean it is not as recognisable to the program?
  3. What if I then compare an uncredited “Pam” story with a credited “Pam” story? We think all the Pam stories were written by Jay Over but could this program show us any other views?
  4. What if I then add in more Known Authors and re-run the tests above – will the results still come out the same?
  5. And then excitingly I could try some further tests, like:
    • If I compare an episode of “Prisoner of the Bell” to “Slave of the Clock”, does the former look like the known Jay Over texts?
    • If I compare an episode of “E. T. Estate” by Jake Adams to the uncredited story “The Human Zoo”, what does the program indicate about any plausible attribution?
    • We think Benita Brown probably wrote “Spirit of the Lake” – is there any textual / stylistic similarity we can find between this and “Tomorrow Town” that we know she wrote?
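
To make the first test above concrete, here is a minimal sketch in Python of the kind of closed-set matching described earlier. It is not how JGAAP works internally: counting three-character sequences and comparing them with cosine similarity are stand-in choices for illustration, and the function names (`char_ngrams`, `cosine`, `rank_authors`) are made up for this sketch.

```python
from collections import Counter
from typing import Dict, List, Tuple
import math


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams, ignoring case and extra spaces."""
    cleaned = " ".join(text.lower().split())
    return Counter(cleaned[i:i + n] for i in range(len(cleaned) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two n-gram count profiles (0.0 to 1.0)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_authors(known: Dict[str, List[str]], unknown: str) -> List[Tuple[str, float]]:
    """Rank every known author by similarity to the unknown text.

    The answer is always drawn from the authors supplied, so the real
    writer may not be among them at all.
    """
    target = char_ngrams(unknown)
    scores = []
    for author, samples in known.items():
        profile = Counter()
        for sample in samples:
            profile.update(char_ngrams(sample))  # pool all samples per author
        scores.append((author, cosine(profile, target)))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_authors({"Jay Over": [...], "Pat Mills": [...]}, unknown_episode)` with typed-up episodes in place of the `...` returns the authors in order of similarity. Even the top name only means "closest of the ones supplied", and with a single episode per author the scores are likely to be shaky.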

Of course no stylistic attribution program is going to replace a statement from a creator or a source from the time. But we know these are thin on the ground and getting thinner, and people’s memories and records are getting more fragmentary as time goes by, so this seems worth trying. I don’t expect anything to happen very quickly, because it means quite a bit of typing to get a good body of texts. If anyone is able to help on the typing front I will be very grateful, and hopefully will be able to show results sooner rather than later.

Apologies, I had meant to say something about the format of the text. I have a sample document which hopefully can be viewed via this link. In case that doesn’t work, this is what I mean for it to look like:

[Image: text grab of the sample document]

But I can add in extra detail such as the description that the text appeared in a word balloon, if I have a scan of the pages in question.
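
Because the layout is meant to be consistent, the typed-up files can be split back into their parts by a short script before any analysis. The following is a rough sketch only, assuming the layout used in the episodes typed up in the comments on this post (a title line, `Episode:` / `Writer:` / `Introduction text:` lines, `Page N panel N:` markers, and a bracketed tag such as `(word balloon)` in front of each piece of text). The function names `parse_episode` and `story_text` are invented for illustration.

```python
import re
from typing import Dict

# Labels as they appear in the typed-up episodes; adjust if the format changes.
PANEL = re.compile(r"^Page \d+ panel \d+:$")
TAGGED = re.compile(r"^\((?P<kind>[^)]+)\)\s*(?P<words>.+)$")
HEADER = re.compile(r"^(?P<key>Episode|Writer|Introduction text):\s*(?P<value>.*)$")


def parse_episode(raw: str) -> Dict[str, object]:
    """Split a typed-up episode into a title, header fields and tagged story lines."""
    lines = [line.strip() for line in raw.splitlines() if line.strip()]
    episode = {"title": lines[0] if lines else "", "headers": {}, "lines": []}
    for line in lines[1:]:
        if PANEL.match(line):
            continue  # layout marker, not the writer's words
        header = HEADER.match(line)
        tagged = TAGGED.match(line)
        if header:
            episode["headers"][header["key"]] = header["value"]
        elif tagged:
            episode["lines"].append((tagged["kind"], tagged["words"]))
    return episode


def story_text(episode, kinds=None) -> str:
    """Join the story text only, optionally limited to certain balloon/box kinds."""
    return "\n".join(
        words for kind, words in episode["lines"] if kinds is None or kind in kinds
    )
```

Stripping the panel markers and tags matters because they are added by whoever types up the episode, not by the original writer, and would otherwise be counted as part of the writer's style. In this sketch the introduction text is kept with the headers rather than the story text, on the assumption that recaps may not be the writer's own words.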

35 thoughts on “Can a computer program help us identify unknown writers?”

  1. It’s quite a lot of work, and I’m already short of time as it is. But since it’s an interesting experiment, I would be willing to do one episode of a story. Do you need just the text, or also whether the text is in a balloon or in a box, or is that something you will do later? Have you got an example of how you would like us to write up a text?

      1. So glad you guys think it’s an interesting thing to try! I know it is time consuming to type stuff up and you may not be able to do it right now but if you can do any at any point that would be fantastic. I’ve put more detail in the post.

    1. I’ve got scans of lots of stories of course but so far have only typed in half of one story episode so I’m quite far from being able to start. I only need one episode from each story though.

      1. If you already scanned episodes, would it be possible to send me an episode you would like me to do? My comics are stored in a not very accessible way, so that would save me some time and effort!

        1. I’ve been using the scans that are uploaded on the story posts. Here’s the first scan for Fran of the Floods, for instance:

          If you click through you should see it as a good large size.

          Once again, thanks in advance! It will be great to have this if you can.

    2. If you could do an episode of Lonely Ballerina at some point that would be great. Or if you could send me a scan of an episode that would be a good starting point for me, alternatively.

      1. There should be an episode of Lonely Ballerina in your inbox now, along with several other scans of episodes from other stories.

        1. There are many scans in my inbox indeed! Wow, many thanks. I have saved them for now and will think how best to use or share them.

  2. A while ago I lost all my email addresses, including yours, so I will post the text here, if that’s okay. If you think the post is too large, please delete it.
    I’ve done it at work (…), where I have the advantage of having two screens. That way you have the comic on one screen and you can type the text on the other. That saves a lot of time.
    Please let me know if my text is okay like this, or if there are things you would like done differently. I couldn’t use the tab, like you did in your text. It went a lot faster than I thought, so I could do one more next week.

    1. Wow! thanks Marckie. I will check it out shortly and advise further. That should work fine, I expect, and how great to know that it went faster than we’d assumed!

      1. Thanks! 🙂 I noticed a typo I made in page 2 panel 4: recue instead of rescue!
        So when I do my next episode, I will do it the same way I did this one.
        Shall I do ‘The robot who cried’, since you have scans of the first episode already on this blog? But it will be next week before I can start with it.

        1. Oh, and one more thing, which you must know better than I do: I wrote Mr with a capital. Is that correct? Not that it will make a difference when you enter the text in the software, but it’s just for me, to deliver the text as correctly as possible.

        2. Instead, could you please do either Benita Brown’s “Tomorrow Town” (on the Casanovas entry) or an Alison Christie story? Then we will start to include female writers in the Known Authors texts.

  3. Fran of the Floods

    Episode: first

    Writer: Alan Davidson

    Introduction text: The sun is giving an extra flicker of warmth – and on earth there is sultry heat and non-stop rain. As England is gradually overwhelmed by floods, food supplies run low – and Fran Scott has some unexpected visitors!

    Page 1 panel 1:

    (word balloon) Out of the way, kid!

    (word balloon) You’ve got food here and we want it!

    Page 1 panel 2:

    (text box) Fran could scarcely believe the nightmare change that had come over the nice Mr. Jacobs.

    (word balloon) Search the place till we find it!

    (protagonist word balloon) Stop it! You’ve got no right!

    Page 1 panel 3:

    (text box) There was just one person in the street – Rod Pearson, who lived a few doors away.

    (protagonist word balloon) Help! Please help me! They’ve come to steal our food!

    Page 1 panel 4:

    (protagonist word balloon) Get out, all of you. This is our house!

    (word balloon) Here it is!

    Page 1 panel 5:

    (protagonist word balloon) You shan’t take it! You shan’t!

    Page 1 panel 6:

    (protagonist word balloon) Mum and dad! Oh, thank goodness!

    (word balloon) Mr. Jacobs! What are you doing? You’re our neighbour!

    (word balloon) I’m nobody’s neighbour any longer. Get out of the way!

    Page 1 panel 7:

    (protagonist word balloon) Dad!

    Page 2 panel 1:

    (protagonist word balloon) The – the world’s gone mad! People are becoming savages again – and all because of the rain! Is everybody going to be like this? I can’t bear to think about it…

    Page 2 panel 2:

    (word balloon) Put that stuff back, Mr. Jacobs, and then leave.

    Page 2 panel 3:

    (protagonist word balloon) Rod! Rod Pearson! You weren’t deserting me! You were just running off to get your friends!

    (word balloon) I could hardly handle this lot on my own.

    Page 2 panel 4:

    (word balloon) You lot ought to be ashamed of yourselves. You’ve panicked, haven’t you?

    (protagonist thought balloon) I’ve always disliked him, and now he’s come to our recue!

    Page 2 panel 5:

    (protagonist word balloon) I – I didn’t think you’d help, Rod. You were so unfriendly when I saw you earlier.

    (word balloon) Oh, that! I’d got something on my mind, Fran.

    Page 2 panel 6:

    (word balloon) He had some rather bad news this morning. His older brother was killed in London last night – washed away by the floods. A lot of people in Hazelford have got relatives missing.

    Page 2 panel 7:

    (word balloon) If you run short of food, Rod, make sure you share ours.

    (word balloon) We’ve got vegetables in the garden, as well.

    (protagonist thought balloon) Then everybody isn’t turning into a savage – not everybody.

    Page 2 panel 8:

    (word balloon) You’ve seen what it’s going to be like now, haven’t you, Fran? If this rain doesn’t stop – now – the world’s going to be a terrible place.

    (protagonist word balloon) But we’ll survive, dad – as long as there are enough people like Rod Pearson around.

    Page 2 panel 9:

    (text box) Or would they? On television…

    (word balloon) Here is an urgent announcement. The prime minister has ordered the evacuation of all low-lying areas of the country, including the whole of East Anglia. The army will try to set up tented camps on high land for refugees.

    Page 2 panel 10:

    (word balloon) But half the people will be refugees! Millions of them!

    (word balloon) As more and more power stations are flooded, electricity supplies will fail and the television service may soon be ended. News broadcasts will continue for as long as possible.

    Page 3 panel 1:

    (text box) The country was sliding into chaos. Next morning…

    (protagonist word balloon) There’s no paper, dad. That’s because Fleet Street’s flooded.

    (word balloon) No milk, either! Lucky we’ve got a lot of powdered in stock.

    Page 3 panel 2:

    (text box) There was still post – just.

    (word balloon) Maybe the last time I’ll call. They can hardly get the letters through any longer. You’re lucky to get this one.

    (protagonist word balloon) It’s from June!

    Page 3 panel 3:

    (text box) Fran’s elder sister had left home a few days earlier. Mum handed Fran the letter to read.

    (protagonist thought balloon) She’s staying in Scotland – says there’s hardly any flooding there.

    Page 3 panel 4:

    (protagonist word balloon) I – I think June may be regretting leaving home in such a hurry. She – she’s probably realized she can’t get back…

    (word balloon) It could be we’ll never see her again!

    Page 3 panel 5:

    (protagonist thought balloon) And it’s me who’s to blame. I could have stopped her from going, if only I’d been a bit more thoughtful. And now I can’t even say I’m sorry.

    Page 3 panel 6:

    (text box) Civilisation crumbled faster. In East Anglia…

    (word balloon) It’s all right saying make for the high ground. But we can’t get there! All the roads are blocked by floods!

    (word balloon) I’m out of fuel. There aren’t any garages open.

    Page 3 panel 7:

    (text box) The navy moved into London to pick up survivors.

    Page 3 panel 8:

    (text box) In Hazelford, the electricity went off for ever.

    Page 3 panel 9:

    (text box) But life still went on.

    (protagonist word balloon) You’ll be coming to hear me sing in the school concert tonight, won’t you mum?

    (word balloon) Course I will, Fran. We’re not under water yet.

    Page 3 panel 10:

    (text box) But in the hills above them, the reservoir was at bursting point. Hazelford’s turn was coming…

  4. There is also the problem of some strips being written by more than one person. Four people are known to have written on Bella, three on Button Box, and the first episode of Star Struck Sister was written by another person and Jenny McDade took over for the rest.

  5. I’d be happy to help when I have some time too. Malcolm Shaw wrote The Sentinels, so I could do an episode of that. I could also do Tennis Star Tessa; I’d have to check what other stories I have that have known writers.

  6. Where is the program going to appear once you set it up? Will there be a new entry for it and keywords that we can type in to search for who may have written, say, The Human Zoo?

    1. I’m still only slowly getting to grips with it – I’m not sure whether my initial choice of program will work out or not (though I think having texts written out in a standardised format will always be useful and interesting). I’m not expecting blog readers to be able to run the program through the blog itself, which seems to be what you are hoping for? I would expect to write some posts with the results / how I am getting on, and any conclusions. But it’s not looking like it will move that quickly, at least to start with, because the way the program works is a bit hard to fathom (for me, as I am no expert in using it).

      1. Ok, just wondering how it was going to look and how it was going to be used once set up.

  7. Just coming to this post. Happy to take on typing up an episode or two if of help as would be very interesting to know if this sort of software works on comics!

    1. Thanks Julia, yes, that would be great – see my latest post on this and a suggestion of a type up of “The Lonely Ballerina” if you are able…

  8. Here is, as promised, an episode of ‘Stefa’s heart of stone’.

    Stefa’s Heart of Stone

    Episode: ?

    Writer: Alison Christie

    Introduction text: Shattered by grief on the death of her beloved friend, Joy, Stefa Giles was making sure she’d never suffer such pain again by developing a heart of stone, like that of the statue in the garden. Now, on her birthday…

    Page 1 panel 1:

    Page 1 panel 2:

    (word balloon) All the girls in your class have sent cards, dear. Wasn’t it nice of them?

    (protagonist thought balloon) I couldn’t care less. There’s only one card I wanted…

    Page 1 panel 3:

    (protagonist thought balloon) …And that’s from you, Joy. But you’ll never send me a birthday card again…

    Page 1 panel 4:

    (word balloon) Happy birthday, Stefa! Here’s your present from us!

    (protagonist word balloon) You shouldn’t have bothered.

    Page 1 panel 5:

    (protagonist thought balloon) A new guitar! A lovely one, too. But they needn’t think I’m going to go all sloppy over it!

    (protagonist word balloon) Thanks. But didn’t I tell you? I’ve given up guitar playing.

    (word balloon) Oh, dear. I thought she’d be thrilled!

    (word balloon) Nothing thrills her these days, ungrateful brat!

    Page 1 panel 6:

    (protagonist word balloon) I’ve had to stop mum and dad, little statue! That’s why I cold-shouldered their present.

    Page 1 panel 7:

    (text box) At school…

    (protagonist thought balloon) I’ll just thank them icily for their cards – and that’s all!

    (word balloon) Here she comes, kids – get ready!

    Page 2 panel 1:

    (word balloon) Happy birthday to you! Happy birthday to you!

    (protagonist thought balloon) Oh, why won’t they just leave me alone? Presents now, from the whole crowd of them!

    Page 2 panel 2:

    (word balloon) It shows how much we all like you, Stefa!

    (protagonist thought balloon) They’re still trying to thaw me out, with their stupid “Melt Stefa” campaign, but it won’t work!

    Page 2 panel 3:

    (text box) At break…

    (word balloon) It’s pouring outside, Miss, so can we stay in and play the new cassette I gave Stefa?

    (word balloon) All right, for a birthday treat!

    (protagonist thought balloon) I wish the girls would stop being so chummy! It makes it harder to stay aloof!

    Page 2 panel 4:

    (text box) But, as the music began…

    (word balloon) …Just you and I, together… Birds of a feather…

    (protagonist word balloon) Put that off! Put it off at once, d’you hear?

    (protagonist thought balloon) I can’t bear it! That was our special song – Joy’s and mine!

    (word balloon) Okay, okay! Keep your shirt on!

    (thought balloon) What’s upset her so suddenly?

    Page 2 panel 5:

    (protagonist word balloon) Here! Take all your presents back! I don’t want your friendship – now or ever!

    (word balloon) Of all the cheek!

    (word balloon) I’m chucking this “Be nice to Stefa” lark – it’s useless!

    Page 2 panel 6:

    (word balloon) Don’t take offence, girls – please! That music upset Stefa, I’m sure of it!

    (protagonist word balloon) You’re right it did! And it reminded me how foolish it is to care about people…

    Page 2 panel 7:

    (protagonist word balloon) …particularly you, Ruth Graham! So just leave me alone in future!

    (thought balloon) I’m so like her dead chum, Joy, she keeps me at arm’s length even more than the others! But how I wish we could be friends.

    Page 2 panel 8:

    (text box) Some days later…

    (word balloon) Take these notices to your parents, girls. It’s an invitation to our “open evening” next Friday.

    (word balloon) Super! My parents love that!

    (protagonist thought balloon) So do mine – but I’ll make sure they don’t come.

    Page 2 panel 9:

    (protagonist word balloon) You might as well read this, in case someone else tells you about it. But I don’t want either of you to come!

    (word balloon) Stefa, why not?

    (word balloon) Because she doesn’t want us, dear – that’s why!

    Page 2 panel 10:

    (text box) On parent’s night

    (protagonist thought balloon) They’ve all got their parents with them, except me! But I don’t care!

    Page 3 panel 1:

    (protagonist thought balloon) All the same, it is lonely, on my own. Oh, stop feeling sorry for yourself – statues don’t!

    Page 3 panel 2:

    (word balloon) Hi, Stefa!

    (protagonist word balloon) You’ve come, after all! I told you I didn’t want you here, or did you forget?

    (word balloon) No, dear – we didn’t!

    Page 3 panel 3:

    (word balloon) But we haven’t come to see your work!

    (protagonist thought balloon) Then whose work have they come to see?

    Page 3 panel 4:

    (protagonist thought balloon) Ruth Graham’s of all the cheek! She isn’t their daughter!

    (word balloon) I love to see embroidery, my dear!

    (word balloon) My sewing’s here, Mr. and Mrs. Giles.

    Page 3 panel 5:

    (text box) Later…

    (word balloon) We’re going for a cuppa in the school canteen. Coming, Stefa?

    (protagonist word balloon) No, thanks! I wouldn’t dream of breaking up your cosy threesome!

    Page 3 panel 6:

    (protagonist thought balloon) I’ll walk home, on my own! Oh, why do I feel so miserable? Mum and dad are losing interest in me – and that’s what I want, isn’t it?

    Page 3 panel 7:

    (protagonist word balloon) I wish you could talk, little stony-face! You’d tell me not to feel jealous, ‘cos statues don’t!

    Page 3 panel 8:

    (word balloon) That was a great evening, Hugh!

    (word balloon) Sure was, Babs! Hi, Stefa, we ran Ruth home first.

    (protagonist word balloon) While your own daughter hoofed it! Why do you bother with goody-goody Ruth Graham, anyway?

    Page 3 panel 9:

    (protagonist word balloon) She’s got parents of her own to take an interest in her! Not that I care a straw what you do.

    (word balloon) I do believe that daughter of ours is jealous, Babs!

    (word balloon) She obviously doesn’t know about Ruth!
