Shared Tasks in the Digital Humanities

Phase 1: Systematic Analysis of Narrative Texts through Annotation (SANTA)

17th July 2017
Marcus Willand • Evelyn Gius • Nils Reiter

SANTA: Creation of Annotation Guidelines

In this post, we describe in detail how we envision the first phase of the shared task on systematic analysis of narrative texts through annotation (SANTA). We hope this gives potential participants a clearer picture of what is expected of them and how much time participation will take.

Phase 1 commences with its 1st Call for Participation (in August 2017) and will be completed in summer 2018 with a workshop. In between, there are three fixed dates that participants need to be aware of (summarized at the end).

The first phase consists of four consecutive steps:
(1) Guideline creation (until June 15, 2018)
(2) Application of own guidelines (until June 25, 2018)
(3) Application of guidelines of other participants (until July 6, 2018)
(4) Workshop: Evaluation of all guideline contributions (tba, in August/September 2018)

(1) Guideline creation

until June 15th 2018, run-time: several months

For the first step, all participants or participating groups (and we want to encourage you to participate in a team) will be supplied with 20 texts in German and English. These texts will contain different narrative level phenomena and cover a variety of genres, epochs etc. The texts (chosen according to the participants’ language preferences) can be used as a basis for developing general narrative level annotation guidelines. In our experience, actually annotating texts is the best way to test the guidelines one has developed. We therefore strongly suggest that you also use the texts for manual annotation, be it digital or non-digital. We will discuss the exact composition of this development corpus in a later blog post.

The guidelines are to be created with general applicability in mind: they should be applicable to any text of similar genre/epoch. This means that the guidelines have to be generic, not tied to the texts selected in the first step, and formulated so that they can be applied without additional knowledge.

Once completed (no later than June 15), the guidelines are submitted to the organizers (in English, regardless of the language of the texts a participant is working with).

(2) Application of own guidelines

until June 25th 2018, run-time: about one week

This step will follow the first immediately. After submitting the guidelines, participants will receive a set of six new, unknown texts selected by the organizers. These six texts then need to be annotated according to the developed guidelines from the first step. This time, the annotation needs to be done in a web-based annotation tool provided and maintained by the organizers.

(3) Application of guidelines of other participants

until July 6th 2018, run-time: between June 18th and July 6th 2018

In the third step, all participants annotate the six texts known from (2) on the basis of guidelines created by other participants. For this, the submitted guidelines will be anonymized and re-distributed among all participants in such a way that every participant/team will get the guidelines from one other participant/team.
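One simple way to obtain such a redistribution is to shuffle the teams and pass each team's guidelines along a cycle. The following is only a sketch of that idea, not a description of the procedure the organizers will use; the function name `redistribute`, the `seed` parameter, and the team labels are hypothetical.

```python
import random


def redistribute(teams, seed=None):
    """Map each team to the team whose guidelines it will apply.

    Hypothetical helper: shuffles the teams and assigns guidelines
    along a cycle, so no team receives its own guidelines and every
    set of guidelines is handed out exactly once.
    Assumes team names are unique.
    """
    if len(teams) < 2:
        raise ValueError("at least two teams are needed")
    rng = random.Random(seed)
    order = list(teams)
    rng.shuffle(order)
    # The team at position i gets the guidelines of the team at i + 1
    # (wrapping around at the end of the list).
    return {
        order[i]: order[(i + 1) % len(order)]
        for i in range(len(order))
    }


if __name__ == "__main__":
    for team, source in redistribute(["A", "B", "C", "D"], seed=1).items():
        print(f"Team {team} applies the guidelines of team {source}")
```

A cyclic assignment is chosen here because it guarantees both properties by construction, without having to retry random draws.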

In parallel, the student research assistants affiliated with the organizers annotate the same texts according to all submitted annotation guidelines.

(4) Workshop: Evaluation of all guideline contributions

tba, presumably in August/September 2018

At the final workshop of SANTA’s Phase 1, all participants will present their guidelines and discuss them in a plenary session. Both the presentations and the discussion will focus on the quality of the submitted guidelines. Presenters will be asked to answer the following questions: Can you capture just the simple phenomena, or also more complex ones? What is the relevance of the captured phenomena from a literary studies point of view?

Additionally, we will provide the so-called ‘inter-annotator agreement’ (IAA) for all submitted guidelines. IAA is a measure, used in the field of natural language processing, of how consistently different annotators apply the same guidelines. It will be calculated from the annotations made in steps 2 and 3.
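To illustrate what an agreement score of this kind captures, here is a minimal sketch of Cohen's kappa for two annotators. Which coefficient will actually be used for SANTA is not specified in this post; the choice of kappa, the simplification to one label per text unit, and the function name `cohen_kappa` are assumptions made purely for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same units.

    Illustrative assumption: each annotator assigns exactly one
    categorical label (e.g. a narrative level) to each unit.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of units with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    annotator_1 = ["level1", "level1", "level2", "level2"]
    annotator_2 = ["level1", "level1", "level2", "level1"]
    print(cohen_kappa(annotator_1, annotator_2))  # 0.5
```

In the example, the annotators agree on three of four units (0.75), chance agreement is 0.5, and kappa is therefore (0.75 - 0.5) / (1 - 0.5) = 0.5.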

The overall aim of the final workshop of Phase 1 is to identify the guidelines that will be used in Phase 2 of SANTA. This can be done by choosing (and adjusting, if necessary) the best guidelines or by assembling a new set of guidelines from the submitted ideas.

We will detail our plans for the workshop more concretely in a future blog post.

Important Dates

Date                     Action
August 2017              1st CfP
June 15, 2018            Submission of annotation guidelines
June 25, 2018            Submission of annotated texts using own guidelines
July 6, 2018             Submission of annotated texts using other guidelines
August/September 2018    Workshop

Questions?

As always, if you have comments, questions, or suggestions, feel free to post them to our mailing list.