Evaluation of Annotation Guidelines
The goal of the workshop (in Hamburg, September 17-19) is to evaluate the annotation guidelines and either to select winning guidelines or to create consensus guidelines. We would like to evaluate the guidelines along three dimensions: i) relation of guidelines and theory, ii) applicability, and iii) usefulness. To explicate these dimensions more clearly (and to guide our discussions during the workshop), we have created a list of more concrete questions for each dimension.
Conceptual coverage
- Do the guidelines give a clear intuition of “narrative level”?
- Is the narrative level concept explicitly defined?
- Is it based on existing definitions?
- How comprehensive are the guidelines with respect to aspects of the theory? Do they omit something?
- Do the guidelines extend concepts/aspects from the theory? Do they make this extension explicit?
- How complex is the theoretical concept implemented by these guidelines?
- Where would you locate the concept of narrative levels in terms of complexity?
- Are you aware of aspects of other narrative level definitions that the guidelines' understanding of narrative level(s) does not cover?
Applicability
- How easy is it to apply the guidelines?
  - for researchers not involved in the guidelines development
  - for laymen
- How high is the inter-annotator agreement? (The organizers will supply quantitative inter-annotator agreement for the workshop.) [1]
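To give a rough intuition of what a quantitative agreement score expresses, the following minimal sketch computes Cohen's kappa for two annotators. It is not the Gamma measure that the organizers are evaluating: kappa is a simpler, swapped-in measure that assumes both annotators labelled the same pre-segmented units, whereas Gamma also handles differing segment boundaries. The function name, the labels, and the example data are hypothetical and serve illustration only.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same units.

    Assumption: the units (e.g. sentences) are already aligned, so
    labels_a[i] and labels_b[i] refer to the same stretch of text.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of units")
    n = len(labels_a)

    # Observed agreement: share of units with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected chance agreement, from each annotator's label distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical data: the narrative level (1, 2, 3) that each annotator
# assigned to ten consecutive sentences of a text.
annotator_1 = [1, 1, 2, 2, 2, 1, 1, 3, 3, 1]
annotator_2 = [1, 1, 2, 2, 1, 1, 1, 3, 2, 1]
kappa = cohens_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

A value of 1 means perfect agreement, and a value near 0 means the annotators agree about as often as chance would predict.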
Usefulness
- Thought experiment: Assume that the narrative levels defined in the annotation guidelines can be detected automatically on a huge corpus. How helpful would these narrative levels be for an analysis you are interested in?
- How helpful are they as an input layer for subsequent analysis steps (that depend on narrative levels)?
- How helpful are the guidelines in getting a better understanding of textual details or the text as a whole?
- Do you gain new insights about narrative levels in texts by applying the guidelines, compared to the application of your own guidelines?
- Does the application of these guidelines influence your interpretation of a text?
(The above list was last updated on May 22, 2018.)
[1] We are currently evaluating the use of Gamma for this purpose. Gamma has been described in Mathet et al. (2015):
Yann Mathet, Antoine Widlöcher, and Jean-Philippe Métivier. The unified and holistic method gamma (γ) for inter-annotator agreement measure and alignment. Computational Linguistics, 41(3):437–479, 2015.