11 Representation of Primary Sources

This chapter defines a module intended for use in the representation of primary sources, such as manuscripts or other written materials. Section 11.1 Digital Facsimiles provides elements for the encoding of digital facsimiles or images of such materials, while the remainder of the chapter discusses ways of encoding detailed transcriptions of such materials. It is expected that this module will also be useful in the preparation of critical editions, but the module defined here is distinct from that defined in chapter 12 Critical Apparatus, and may be used independently of it. Detailed metadata relating to primary sources of any kind may be recorded using the elements defined by the manuscript description module discussed in chapter 10 Manuscript Description, but again the present module may be used independently if such data is not required.

It should be noted that, as elsewhere in these Guidelines, this chapter places more emphasis on the problems of representing the textual components of a document than on those relating to the description of the document's physical characteristics such as the carrier medium or physical construction. These aspects, of particular importance in codicology and the bibliographic study of incunables, are touched on in the chapter on Manuscript Description (10 Manuscript Description) and also form the subject of ongoing work in the TEI Physical Bibliography workgroup.

Although this chapter discusses manuscript materials more frequently than other forms of written text, most of the recommendations presented are equally applicable mutatis mutandis in the encoding of printed matter or indeed any form of written source, including monumental inscriptions. Similarly, where in the following descriptions terms such as ‘scribe’, ‘author’, ‘editor’, ‘annotator’ or ‘corrector’ are used, these may be re-interpreted in terms more appropriate to the medium being transcribed. In printed material, for example, the ‘compositor’ plays a role analogous to the ‘scribe’, while in an authorial manuscript, the author and the scribe are the same person.

11.1 Digital Facsimiles

These Guidelines are mostly concerned with the preparation of digital texts, in which a pre-existing text is transcribed or otherwise converted into character form, and marked up in XML. However, it is also very common practice to make a different form of ‘digital text’, which is instead composed of digital images of the original source, typically one per page, or other written surface. We call such a resource a digital facsimile. A digital facsimile may, in the simplest case, just consist of a collection of images, with some metadata to identify them and the source materials portrayed. It may sometimes contain a variety of images of the same source pages, for example of different resolutions, or of different kinds. Such a collection may form part of any kind of document, for example a commentary of a codicological or palaeographic nature, where there is a need to align explanatory text with image data. And it may also be complemented by a transcribed or encoded version of the original source, which may be linked to the page images. In this section we present elements designed to support these various possibilities and discuss the associated mechanisms provided by these Guidelines.

When this module is included in a schema, the class att.global is extended to include a new pointer attribute facs:
  • att.global.facs elements which can be associated with an image or a surface within a facsimile element.
    facs (facsimile) points directly to an image, or to a part of a facsimile element which corresponds with this element.
This attribute may be used to associate any element in a transcribed text with an image of it, by means of the usual URI pointing mechanism.
If a digital text contains one image per page or column (or similar unit), and no more complex mapping between text and image is envisaged, then the facs attribute may be used to point directly to a graphic resource:
<TEI>
 <teiHeader>
<!--...-->
 </teiHeader>
 <text>
  <pb facs="page1.png"/>
<!-- text contained on page 1 is encoded here -->
  <pb facs="page2.png"/>
<!-- text contained on page 2 is encoded here -->
 </text>
</TEI>
By convention, this encoding indicates that the image identified by the facs attribute represents the whole of the text following the pb (pagebreak) element, up to the next pb element. Any convenient milestone element (see further 3.10.3 Milestone Elements) could be used in the same way; for example if the images represent individual columns, the cb element might be used. Though simple, this method has some drawbacks. It does not scale well to more complex cases where, for example, the images do not correspond exactly with transcribed pages, or where the intention is to align specific marked up elements with detailed images, or parts of images. And it makes the management of the information about the images more difficult by scattering references to them through the file. Nevertheless, this solution may be adequate for many straightforward ‘digital library’ applications.
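The convention just described can be made concrete with a short processing sketch. The following Python fragment is illustrative only: it assumes a small, namespace-free sample document, and the function name text_by_page_image is hypothetical rather than part of any TEI tool. It groups each run of text under the image named by the preceding pb element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; real TEI documents are normally in the TEI namespace.
SAMPLE = """<text>
  <pb facs="page1.png"/>
  <p>First page text.</p>
  <pb facs="page2.png"/>
  <p>Second page text.</p>
  <p>More second page text.</p>
</text>"""


def text_by_page_image(root):
    """Group text chunks under the image named by the preceding pb."""
    pages = {}
    current = None

    def add(chunk):
        # Keep only non-blank text that follows some pb with a facs value.
        if current is not None and chunk and chunk.strip():
            pages[current].append(" ".join(chunk.split()))

    def walk(node):
        nonlocal current
        if node.tag == "pb":
            current = node.get("facs")
            if current is not None:
                pages.setdefault(current, [])
        else:
            add(node.text)
        for child in node:
            walk(child)
            add(child.tail)  # text following the child, in document order

    walk(root)
    return pages


pages = text_by_page_image(ET.fromstring(SAMPLE))
for image, chunks in pages.items():
    print(image, chunks)
```

Because the association is purely positional, any text preceding the first pb has no image at all in this sketch, which is one illustration of why the approach does not scale to more complex cases.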

The recommended approach to encoding facsimiles is instead to use the facs attribute in conjunction with the elements facsimile, surface, and zone, which are also provided by this module. These elements make it possible to accommodate multiple images of each page, as well as to record arbitrary planar coordinates of textual elements on any kind of written surface and to link such elements with digital facsimile images of them. Typical applications include the provision of full text search in ‘digital facsimile editions’, and ways of annotating graphics, for example so as to identify individuals appearing in a group portrait and link them to data about the person represented.

The following elements are used to represent components of a digital facsimile:
  • facsimile contains a representation of some written source in the form of a set of images rather than as transcribed or encoded text.
  • surface defines a written surface in terms of a rectangular coordinate space, optionally grouping one or more graphic representations of that space, and rectangular zones of interest within it.
    start points to an element which encodes the starting position of the text corresponding to the inscribed part of the surface.
  • zone defines a rectangular area contained within a surface element.
The facsimile element is used to represent a digital facsimile. It appears within a TEI document along with, or instead of, the text element introduced in section 4 Default Text Structure. When this module is selected, therefore, a legal TEI document may comprise any of the following:
  • a TEI Header and a text element
  • a TEI Header and a facsimile element
  • a TEI Header, a facsimile element, and a text element

Like the text element, a facsimile element may also contain an optional front or back element, used in the same way as described in sections 4.5 Front Matter and 4.7 Back Matter.

In the simplest case, a facsimile just contains a series of graphic elements, each of which identifies an image file:
<facsimile>
 <graphic url="page1.png"/>
 <graphic url="page2.png"/>
 <graphic url="page3.png"/>
 <graphic url="page4.png"/>
</facsimile>
If desired, the binaryObject element described in 3.9 Graphics and Other Non-textual Components (or any other element from the model.graphicLike class) can be used instead of a graphic element.
In this simple case, the four page images are understood to represent the complete facsimile, and are to be read in the sequence given. Suppose, however, that the second page of this particular work is available both as an ordinary photograph and as an infra-red image, or in two different resolutions. The surface element may be used to indicate that there are two image files corresponding with the same area of the work:
<facsimile>
 <graphic url="page1.png"/>
 <surface>
  <graphic url="page2-highRes.png"/>
  <graphic url="page2-lowRes.png"/>
 </surface>
 <graphic url="page3.png"/>
 <graphic url="page4.png"/>
</facsimile>

The surface element provides a way of indicating that the two images of page 2 represent the same physical surface within the source material. A surface might be a sheet of paper or parchment, a face of a monument, a billboard, a membrane of a scroll, or indeed any two-dimensional surface, of any size.

The actual dimensions of the object represented are not documented by the surface element; instead, the surface is located within an abstract coordinate space, which is defined by the following attributes, supplied by the att.coordinated class:
  • att.coordinated elements which can be positioned within a two dimensional coordinate system.
    ulx gives the x coordinate value for the upper left corner of a rectangular space.
    uly gives the y coordinate value for the upper left corner of a rectangular space.
    lrx gives the x coordinate value for the lower right corner of a rectangular space.
    lry gives the y coordinate value for the lower right corner of a rectangular space.

The same coordinate space is used for a surface and for all of its child elements. It may be most convenient to derive a coordinate space from a digital image of the surface in question such that each pixel in the image corresponds with a whole number of units (typically 1) in the coordinate space. In other cases it may be more convenient to use units such as millimetres; in neither case is any specific mapping to the physical dimensions of the object represented implied.

Each surface can contain one or more zone elements, each of which represents a rectangular region or bounding box defined in terms of the same coordinate space as that of its parent surface element. This provides a unit of analysis which may be used to define any rectangular region of interest, such as a detail or illustration, or some part of the surface which is to be aligned with a particular text element. The att.coordinated attributes listed above are also used to supply the coordinates of a zone.

As we have seen, a surface will usually correspond with the whole of a written surface. A zone, by contrast, defines any arbitrary rectangular area of interest using the same coordinate system. It might be bigger or smaller than its parent surface, or might overlap its boundaries. The only constraint is that it must be defined using the same coordinate system.

When an image of some kind is supplied within either a zone or a surface, the implication is that the whole of the image represents the zone or surface containing it. In the simple case therefore, we might imagine a surface defining a page, within which there is a graphic representing the whole of that page, and a number of zones defining parts of the page, each with its own graphic. If one of those images actually represents an area larger than the page (for example to include a binding or the surface of a desk on which the page rests), then it might be enclosed by a zone with coordinates smaller or larger than those of the parent surface.

Note that this mechanism does not provide any way of addressing a non-rectangular area, nor of coping with distortions introduced by perspective or parallax; if this is needed, the more powerful mechanisms provided by the Scalable Vector Graphics (SVG) language should be used to define an overlay, as further discussed in 16.4.3 A Three-way Alignment.

For example, consider the following figure:
Figure 2. Relation between page, surface, and zone
This is an image of a two page spread from a manuscript in the Badische Landesbibliothek, Karlsruhe. We have no information as to the dimensions of the original object, but the low resolution image displayed here contains 500 pixels horizontally and 321 pixels vertically. For convenience, we might map each pixel to one cell of the coordinate space.
The coordinates of the surface (that is, the area of the image which represents the written two page spread) can then be specified in terms of this coordinate space, simply by counting pixels in the image. The upper left corner of the two page spread appears 50 units from the left of the image and 20 units from the top, while the bottom right corner of the spread appears 400 units from the left of the image, and 280 units from the top. We therefore define the written surface within this image as follows:
<facsimile>
 <surface
   ulx="50"
   uly="20"
   lrx="400"
   lry="280">

<!-- ... -->
 </surface>
</facsimile>
To describe the whole image, we will also need to define a zone of interest which represents an area larger than this surface. Using the same coordinate system as that defined for the surface, its coordinates are 0,0,500,321. This zone of interest can be defined by a zone element, within which we can place the uncropped graphic:
<facsimile>
 <surface
   ulx="50"
   uly="20"
   lrx="400"
   lry="280">

  <zone
    ulx="0"
    uly="0"
    lrx="500"
    lry="321">

   <graphic
     url="http://upload.wikimedia.org/wikipedia/commons/5/50/Handschrift.karlsruhe.blb.jpg"/>

  </zone>
 </surface>
</facsimile>
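Because the surface and the zone holding the uncropped graphic share one coordinate space, software can work out which part of the image file depicts the surface. The following Python sketch shows the arithmetic for the example above. The names Box and to_pixels are hypothetical, and the 1000 by 642 pixel image used in the second call is an assumption introduced purely to show the effect of a different resolution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """A rectangle given by its upper left and lower right corners."""
    ulx: float
    uly: float
    lrx: float
    lry: float


def to_pixels(region, image_zone, image_width, image_height):
    """Map a region in surface coordinates to pixel coordinates.

    image_zone is the zone which the whole image is understood to
    represent; image_width and image_height are the image's pixel size.
    """
    scale_x = image_width / (image_zone.lrx - image_zone.ulx)
    scale_y = image_height / (image_zone.lry - image_zone.uly)
    return Box(
        (region.ulx - image_zone.ulx) * scale_x,
        (region.uly - image_zone.uly) * scale_y,
        (region.lrx - image_zone.ulx) * scale_x,
        (region.lry - image_zone.uly) * scale_y,
    )


surface = Box(50, 20, 400, 280)      # the two page spread
image_zone = Box(0, 0, 500, 321)     # the zone containing the graphic

# One coordinate unit per pixel, as in the example above.
crop = to_pixels(surface, image_zone, 500, 321)

# Hypothetical higher resolution image of the same zone.
crop_hi_res = to_pixels(surface, image_zone, 1000, 642)

print(crop)
print(crop_hi_res)
```

With one unit per pixel the crop box simply repeats the surface coordinates; with the hypothetical larger image every value is doubled, while the encoded coordinates themselves stay unchanged.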

If desired, the binaryObject element described in 3.9 Graphics and Other Non-textual Components (or any other element from the model.graphicLike class) may be used instead of a graphic element.

The desc element may also be used within either surface or zone to provide some further information about the area being defined. For example, since the image in this example contains two pages, it might be preferable to define two distinct surfaces, one for each page, including its illuminated margins. In this case, each surface must specify a bounding box which encloses the appropriate page, as well as defining the zone for the graphic itself:
<facsimile>
 <surface
   ulx="50"
   uly="20"
   lrx="210"
   lry="280">

  <desc>left hand page</desc>
  <zone
    ulx="0"
    uly="0"
    lrx="500"
    lry="321">

   <graphic
     url="http://upload.wikimedia.org/wikipedia/commons/5/50/Handschrift.karlsruhe.blb.jpg"/>

  </zone>
 </surface>
 <surface
   ulx="240"
   uly="25"
   lrx="400"
   lry="280">

  <desc>right hand page</desc>
  <zone
    ulx="0"
    uly="0"
    lrx="500"
    lry="321">

   <graphic
     url="http://upload.wikimedia.org/wikipedia/commons/5/50/Handschrift.karlsruhe.blb.jpg"/>

  </zone>
 </surface>
</facsimile>
In addition to acting as a container for graphic elements, zone elements may also be used to select parts of each surface for analytical purposes. For example, to define the written part of the left hand page:
<facsimile>
 <surface
   ulx="50"
   uly="20"
   lrx="210"
   lry="280">

  <desc>Left hand page</desc>
  <zone
    ulx="0"
    uly="0"
    lrx="500"
    lry="321">

   <graphic
     url="http://upload.wikimedia.org/wikipedia/commons/5/50/Handschrift.karlsruhe.blb.jpg"/>

  </zone>
  <zone
    ulx="90"
    uly="40"
    lrx="200"
    lry="225">

   <desc>Written part of left hand page</desc>
  </zone>
 </surface>
</facsimile>
In the following example, we discuss a hypothetical digital edition of an early 16th century French work, Charles de Bovelles' Géometrie Pratique. In this edition, each page has been digitized as a separate file: for example, recto page 49 is stored in a file called Bovelles-49r.png. In the facsimile element used to contain the whole set of pages, we define a surface element for this page, which we situate within a coordinate scale running from 0 to 200 in the x (horizontal) axis, and 0 to 300 in the y (vertical) axis. The surface element contains a graphic element which represents the whole of this surface:
<facsimile>
 <surface
   ulx="0"
   uly="0"
   lrx="200"
   lry="300">

  <graphic url="Bovelles-49r.png"/>
 </surface>
</facsimile>
We can now identify distinct zones within the page image using the coordinate scale defined for the surface. In Figure 3, Zones within a surface, we show the upper part of the page, with boxes indicating four such zones. Each of these will be represented by a zone element, given within the surface element already defined, and specified in terms of the same coordinate system.
Figure 3. Zones within a surface
The following encoding defines each of the four zones identified in the figure.
<facsimile>
 <surface
   ulx="0"
   uly="0"
   lrx="200"
   lry="300">

  <graphic url="Bovelles-49r.png"/>
  <zone
    ulx="25"
    uly="25"
    lrx="180"
    lry="60">

   <desc>contains the title</desc>
  </zone>
  <zone
    ulx="28"
    uly="75"
    lrx="175"
    lry="178"/>

<!-- contains the paragraph in italics -->
  <zone
    ulx="105"
    uly="76"
    lrx="175"
    lry="160"/>

<!-- contains the figure -->
  <zone
    ulx="45"
    uly="125"
    lrx="60"
    lry="130"/>

<!-- contains the word "pendans" -->
 </surface>
</facsimile>
Note that the location of each zone is defined independently but using the same coordinate system, so that they may overlap freely. Zones need not nest within each other; they must however be rectangular. As noted earlier, a zone may also fall outside the area of the surface which defines its coordinate space.
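Since every zone is just a rectangle in the shared coordinate space, relations such as overlap and enclosure can be computed rather than encoded. The following Python sketch applies this to the four zones defined above. The labels used as dictionary keys and the function names overlaps and within are hypothetical conveniences, not part of the encoding.

```python
from itertools import combinations

# (ulx, uly, lrx, lry) for the four zones in the example above.
ZONES = {
    "title": (25, 25, 180, 60),
    "paragraph": (28, 75, 175, 178),
    "figure": (105, 76, 175, 160),
    "word": (45, 125, 60, 130),
}
SURFACE = (0, 0, 200, 300)


def overlaps(a, b):
    """True if rectangles a and b share some area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def within(inner, outer):
    """True if inner lies entirely inside outer (edges may touch)."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


overlapping = [
    (name_a, name_b)
    for (name_a, box_a), (name_b, box_b) in combinations(ZONES.items(), 2)
    if overlaps(box_a, box_b)
]
inside_surface = [name for name, box in ZONES.items() if within(box, SURFACE)]

print(overlapping)
print(inside_surface)
```

In this example both the figure and the word fall inside the paragraph zone, although nothing in the markup nests them.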
In this example a single graphic element has been associated directly with the surface of the page rather than nesting it within a zone. However, it is also possible to include multiple zone elements which contain a graphic element, if for example a detailed image is available. Since all zone elements use the same coordinate system (that defined by their parent surface), there is no need to demonstrate enclosure of one zone within another by means of nesting. To continue the current example, suppose that we have an additional image file called Bovelles49r-detail.png showing the figure in the third zone above; we might then encode that zone as follows:
<zone
  ulx="105"
  uly="76"
  lrx="175"
  lry="160">

 <graphic url="Bovelles49r-detail.png"/>
</zone>
Now suppose that we wish to align a transcription of this page with the zones identified above. The first step is to give each relevant part of the facsimile an identifier:
<facsimile>
 <surface
   ulx="0"
   uly="0"
   lrx="200"
   lry="300">

  <zone
    xml:id="B49r"
    ulx="0"
    uly="0"
    lrx="200"
    lry="300">

   <graphic url="Bovelles-49r.png"/>
  </zone>
  <zone
    ulx="105"
    uly="76"
    lrx="175"
    lry="160">

   <graphic url="Bovelles49r-detail.png"/>
  </zone>
  <zone
    xml:id="B49rHead"
    ulx="25"
    uly="25"
    lrx="180"
    lry="60"/>

<!-- contains the title -->
  <zone
    xml:id="B49rPara2"
    ulx="28"
    uly="75"
    lrx="175"
    lry="178"/>

<!-- contains the paragraph in italics -->
  <zone
    xml:id="B49rFig1"
    ulx="105"
    uly="76"
    lrx="175"
    lry="160"/>

<!-- contains the figure -->
  <zone
    xml:id="B49rW457"
    ulx="45"
    uly="125"
    lrx="60"
    lry="130"/>

<!-- contains the word "pendans" -->
 </surface>
</facsimile>
The alignment between transcription and image is made, as usual, by means of the facs attribute:
<pb facs="#B49r"/>
<fw>De Geometrie 49</fw>
<head facs="#B49rHead">DU SON ET ACCORD DES CLOCHES ET <lb/> des alleures des chevaulx,
chariotz &amp; charges, des fontaines:&amp; <lb/> encyclie du monde,
&amp; de la dimension du corps humain.</head>
<head>Chapitre septiesme</head>
<div n="1">
 <p>Le son &amp; accord des cloches pendans en ung mesme <lb/> axe, est
   faict en contraires parties.</p>
 <p rend="it" facs="#B49rPara2">LEs cloches ont quasi fi<lb/>gures de rondes
   pyra<lb/>mides imperfaictes &amp; <lb/> irregulieres: &amp; leur
   accord se <lb/> fait par reigle geometrique. Com<lb/>me si les deux
   cloches C &amp; D <lb/> sont <w facs="#B49rW457">pendans</w> à ung
   mesme axe <lb/> ou essieu A B: je dis que leur ac<lb/>cord se fera en
   co<ex>n</ex>traires parties<lb/> co<ex>m</ex>me voyez icy
   figuré. Car qua<ex>n</ex>d <lb/> lune sera en hault, laultre
   declinera embas. Aultrement si elles decli<lb/>nent toutes deux
   ensembles en une mesme partie, elles seront discord, <lb/> &amp; sera
   leur sonnerie mal plaisante à oyr.<figure facs="#B49rFig1">
   <graphic url="Bovelles49r-detail.png"/>
  </figure>
 </p>
</div>

Further discussion of the encoding choices made in the above transcription is provided in the remainder of this chapter.
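To show how such alignments might be followed by software, the sketch below resolves each facs pointer in a transcription to the coordinates of the zone it identifies. It is a minimal illustration in Python, assuming a reduced, namespace-free sample, and assuming that each facs value is a single local pointer of the form #identifier. The function name resolve_facs is hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Reduced, namespace-free sample based on the example above.
SAMPLE = """<TEI>
 <facsimile>
  <surface ulx="0" uly="0" lrx="200" lry="300">
   <graphic url="Bovelles-49r.png"/>
   <zone xml:id="B49rHead" ulx="25" uly="25" lrx="180" lry="60"/>
   <zone xml:id="B49rW457" ulx="45" uly="125" lrx="60" lry="130"/>
  </surface>
 </facsimile>
 <text>
  <head facs="#B49rHead">DU SON ET ACCORD DES CLOCHES</head>
  <p>sont <w facs="#B49rW457">pendans</w></p>
 </text>
</TEI>"""


def resolve_facs(root):
    """Return (element name, text, zone box) for each resolvable facs."""
    zones = {
        zone.get(XML_ID): zone
        for zone in root.iter("zone")
        if zone.get(XML_ID)
    }
    links = []
    for element in root.iter():
        facs = element.get("facs")
        # Only single local pointers are handled in this sketch.
        if facs and facs.startswith("#") and facs[1:] in zones:
            zone = zones[facs[1:]]
            box = tuple(int(zone.get(k)) for k in ("ulx", "uly", "lrx", "lry"))
            text = " ".join("".join(element.itertext()).split())
            links.append((element.tag, text, box))
    return links


links = resolve_facs(ET.fromstring(SAMPLE))
for link in links:
    print(link)
```

A fuller implementation would also need to handle facs values which point directly at image files, or which point into other documents.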

It is also possible to point in the other direction, from a surface or zone to the corresponding text. This is the function of the start attribute, which supplies the identifier of the element containing the transcribed text found within the surface or zone concerned. Thus, another way of linking this page with its transcription would be simply:
<facsimile>
 <surface start="#PB49R">
  <graphic url="Bovelles-49r.png"/>
 </surface>
</facsimile>
<text>
<!-- ... -->
 <pb xml:id="PB49R"/>
 <fw>De Geometrie 49</fw>
<!-- ... -->
</text>

11.2 Scope of Transcriptions

When transcribing a primary source, scholars may wish to record information concerning individual readings of letters, words, or larger units, whether the object is simply a ‘neutral’ transcription or a critical edition. In either case they may also wish to include other editorial material, such as comments on the status or possible origin of particular readings, corrections, or text supplied to fill lacunae. Further, it is customary in transcriptions to register certain features of the source, such as ornamentation, underlining, deletion, areas of damage and lacunae. This chapter provides ways of encoding such information.

These recommendations are not intended to meet every transcriptional circumstance likely to be faced by any scholar. Rather, they should be regarded as a base which can be elaborated if necessary by different scholars in different disciplines.

As a rule, all elements which may be used in the course of a transcription of a single witness may also be used in a critical apparatus, i.e. within the elements proposed in chapter 12 Critical Apparatus. This can generally be achieved by nesting a particular reading containing tagged elements from a particular witness within the rdg element in an app structure.

Just as a critical apparatus may contain transcriptional elements within its record of variant readings in various witnesses, one may record variant readings in an individual witness by use of the apparatus mechanisms app and rdg. This is discussed in section 12.3 Using Apparatus Elements in Transcriptions.

11.3 Altered, Corrected, and Erroneous Texts

In the detailed transcription of any source, it may prove necessary to record various types of actual or potential alteration of the text: expansion of abbreviations, correction of the text (either by author, scribe, or later hand, or by previous or current editors or scholars), addition, deletion, or substitution of material, and the like. The sections below describe how such phenomena may be encoded using either elements defined in the core module (defined in chapter 3 Elements Available in All TEI Documents) or specialized elements available only when the module described in this chapter is available.

11.3.1 Core elements for Transcriptional Work

In transcribing individual sources of any type, encoders may record corrections, normalizations, expansions of abbreviations, additions, and omissions using the elements described in section 3.4 Simple Editorial Changes. Those particularly relevant to this chapter include:
  • abbr (abbreviation) contains an abbreviation of any sort.
  • add (addition) contains letters, words, or phrases inserted in the text by an author, scribe, annotator, or corrector.
  • choice groups a number of alternative encodings for the same point in a text.
  • corr (correction) contains the correct form of a passage apparently erroneous in the copy text.
  • del (deletion) contains a letter, word, or passage deleted, marked as deleted, or otherwise indicated as superfluous or spurious in the copy text by an author, scribe, annotator, or corrector.
  • expan (expansion) contains the expansion of an abbreviation.
  • gap indicates a point where material has been omitted in a transcription, whether for editorial reasons described in the TEI header, as part of sampling practice, or because the material is illegible or inaudible.
  • sic (Latin for thus or so) contains text reproduced although apparently incorrect or inaccurate.
Several of these elements bear additional attributes for specifying who is responsible for the interpretation represented by the markup, and the certainty associated with it. In addition, some of them bear an attribute allowing the markup to be categorised by type and source.
  • att.editLike provides attributes describing the nature of an encoded scholarly intervention or interpretation of any kind.
    cert (certainty) signifies the degree of certainty associated with the intervention or interpretation.
    resp (responsible party) indicates the agency responsible for the intervention or interpretation, for example an editor or transcriber.
    source contains a list of one or more pointers indicating the sources which support the given reading.
  • att.typed provides attributes which can be used to classify or subclassify elements in any way.
    type characterizes the element in some sense, using any convenient classification scheme or typology.
    subtype provides a sub-categorization of the element, if needed.
The specific aspect of the markup described by these attributes differs on different elements; for further discussion, see the relevant sections below, especially section 11.4.2 Hand, Responsibility, and Certainty Attributes.

The following sections describe how the core elements just named may be used in the transcription of primary source materials.

11.3.2 Abbreviation and Expansion

The writing of manuscripts by hand lends itself to the use of abbreviation to shorten scribal labour. Commonly occurring letters, groups of letters, words, or even whole phrases, may be represented by significant marks. This phenomenon of manuscript abbreviation is so widespread and so various that no taxonomy of it is here attempted. Instead, methods are shown which allow abbreviations to be encoded using the core elements mentioned above.

A manuscript abbreviation may be viewed in two ways. One may transcribe it as a particular sequence of letters or marks upon the page: thus, a ‘p with a bar through the descender’, a ‘superscript hook’, a ‘macron’. One may also interpret the abbreviation in terms of the letter or letters it is seen as standing for: thus, ‘per’, ‘re’, ‘n’. Both of these views are supported by these Guidelines.

In many cases the glyph found in the manuscript source also exists in the Unicode character set: for example the common Latin brevigraph ⁊, standing for et and often known as the ‘Tironian et’, can be directly represented in any XML document as the Unicode character with code point U+204A (see further Character References and vi.i Language identification). In cases where it does not, these Guidelines recommend use of the g element provided by the gaiji module described in chapter 5 Representation of Non-standard Characters and Glyphs. This module allows the encoder great flexibility both in processing and in documenting non-standard characters or glyphs, including the ability to provide detailed documentation and images for them.

These two methods of coding abbreviation may also be combined. An encoder may record, for any abbreviation, both the sequence of letters or marks which constitutes it, and its sense, that is, the letter or letters for which it is believed to stand. For example, in the following fragment the phrase euery persone is represented by a sequence of characters which may be transcribed directly, using the g element to indicate the two brevigraphs it contains as follows:
eu<g ref="#er">er</g>y <g ref="#per">per</g>sone that loketh after heuen hath a place in this
ladder
Note that in each case the g element may contain a suggested replacement for the referenced brevigraph; this is purely advisory, however, and may not be appropriate in all cases.
The transcriber may also wish to indicate that, because of the presence of these particular characters, the two words are actually abbreviations, by using the abbr element:
<abbr>eu<g ref="#er">er</g>y</abbr>
<abbr>
 <g ref="#per">per</g>sone
</abbr>
...
Alternatively, the transcriber may choose silently to expand these abbreviations, using the expan element:
<expan>euery</expan>
<expan>persone</expan> ...
And, of course, the choice element can be used to show that one encoding is an alternative for the other:
<choice>
 <abbr>eu<g ref="#er">er</g>y</abbr>
 <expan>euery</expan>
</choice>
When abbreviated forms such as these are expanded, two processes are carried out: some characters not present in the abbreviation are added (always), and some characters or glyphs present in the abbreviation are omitted or replaced (often). For example, when the abbreviation Dr. is expanded to Doctor, the dot in the abbreviation is removed, and the letters octo are added. Where detailed markup of abbreviated words is required, these two aspects may be marked up explicitly, using the following elements:
  • ex (editorial expansion) contains a sequence of letters added by an editor or transcriber when expanding an abbreviation.
  • am (abbreviation marker) contains a sequence of letters or signs present in an abbreviation which are omitted or replaced in the expanded form of the abbreviation.
Using these elements, a transcriber may indicate the status of the individual letters or signs within both the abbreviation and the expansion. The am element surrounds characters or signs such as tittles or tildes, used to indicate the presence of an abbreviation, which are typically removed or replaced by other characters in the expanded form of the abbreviation:
<abbr>eu<am>
  <g ref="#er"/>
 </am>y</abbr>
<abbr>
 <am>
  <g ref="#per"/>
 </am>sone
</abbr> ...
while the ex element may be used to indicate those characters within the expansion which are not present in the abbreviated form.
<expan>eu<ex>er</ex>y</expan>
<expan>
 <ex>per</ex>sone
</expan> ...
The content of the abbr element should usually include the whole of the abbreviated word, while the expan element should include the whole of its expansion. If this is not considered necessary, the am and ex elements may be used within a choice element, as in this example:
eu<choice>
 <am>
  <g ref="#er"/>
 </am>
 <ex>er</ex>
</choice>y
<choice>
 <am>
  <g ref="#per"/>
 </am>
 <ex>per</ex>
</choice>sone ...

As implied in the preceding discussion, making decisions about which of these various methods of representing abbreviation to use will form an important part of an encoder's practice. As a rule, the abbr and am elements should be preferred where it is wished to signify that the content of the element is an abbreviation, without necessarily indicating what the abbreviation may stand for. The ex and expan elements should be used where it is wished to signify that the content of the element is not present in the source but has been supplied by the transcriber, without necessarily indicating the abbreviation used in the original. The decision as to which course of action is appropriate may vary from abbreviation to abbreviation; there is no requirement that the one system be used throughout a transcription, although doing so will generally simplify processing. The choice is likely to be a matter of editorial policy. If the highest priority is to transcribe the text literatim, while indicating the presence of abbreviations, the choice will be to use abbr or am throughout. If the highest priority is to present a reading transcription, while indicating that some letters or words are not actually present in the original, the choice will be to use ex or expan throughout.
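Where both views are recorded together within choice elements, either a literatim or a reading text can be derived mechanically from the same encoding. The following Python sketch illustrates one way this might be done. It assumes a namespace-free sample in which each choice pairs an abbr with an expan; the wrapping seg element, the function name render, and the convention of displaying an empty g element as its reference in curly brackets are all assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample combining the abbr/am and expan/ex encodings above.
SAMPLE = (
    '<seg>'
    '<choice><abbr>eu<am><g ref="#er"/></am>y</abbr>'
    '<expan>eu<ex>er</ex>y</expan></choice> '
    '<choice><abbr><am><g ref="#per"/></am>sone</abbr>'
    '<expan><ex>per</ex>sone</expan></choice>'
    ' that loketh after heuen</seg>'
)


def render(node, keep):
    """Return the text of node, keeping one alternative of each choice.

    keep names the child of choice to retain: 'abbr' or 'expan'.
    """
    if node.tag == "choice":
        # Only the chosen alternative contributes; siblings are dropped.
        return "".join(render(c, keep) for c in node if c.tag == keep)
    if node.tag == "g":
        # Display convention assumed here: show the glyph reference.
        return "{" + node.get("ref", "").lstrip("#") + "}"
    parts = [node.text or ""]
    for child in node:
        parts.append(render(child, keep))
        parts.append(child.tail or "")
    return "".join(parts)


root = ET.fromstring(SAMPLE)
literatim = render(root, "abbr")
reading = render(root, "expan")
print(literatim)
print(reading)
```

A transcription which uses only abbr and am, or only ex and expan, supports just one of these two outputs, which is one practical consequence of the editorial policy chosen.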

Further information may be attached to instances of these elements by the note element, on which see section 3.8 Notes, Annotation, and Indexing, and by use of the resp and cert attributes. In this instance from the English Brut, a note is attached to an editorial expansion of the tail on the final d of good to goode:
For alle the while that I had
good<ex xml:id="exp01">e</ex>
I was welbeloued
Then the note:
<note target="#exp01">The stroke added to the final d could signify the
plural ending (-es, -is, -ys) but the singular <hi rend="it">good</hi> was used with the meaning <q>property</q>,
<q>wealth</q>, at this time (v. examples quoted in OED, sb. Good,
C. 7, b, c, d and 8 spec.)</note>
The editor might declare a degree of certainty for this expansion, based on the OED examples, and state the responsibility for the expansion:
For alle the while that I had
good<ex resp="#mp" cert="high">e</ex> I was welbeloued
The value supplied for the resp attribute should point to the name of the editor responsible for this and possibly other interventions; an appropriate element therefore might be a respStmt element in the header like the following:
<respStmt xml:id="mp">
 <resp>Editorial emendations</resp>
 <name>Malcolm Parkes</name>
</respStmt>
Observe that the cert and resp attributes are used with the ex element only to indicate confidence in the content of the element (i.e. the expansion), and responsibility for suggesting this expansion respectively.
The choice element may be used to indicate that the proposed expansion is one way of encoding what might equally well be represented as an abbreviation, represented by the hooked D, as follows:
For alle the while that I had
<choice>
 <sic>good<abbr>ɽ</abbr>
 </sic>
 <expan resp="#mp" cert="high">good<ex>e</ex>
 </expan>
</choice>
I was welbeloued
If it is desired to express aspects of certainty and responsibility for some other aspect of the use of these elements, then the mechanisms discussed in chapter 21 Certainty and Responsibility should be used. See also 11.4.2 Hand, Responsibility, and Certainty Attributes for discussion of the issues of certainty and responsibility in the context of transcription.

If more than one expansion for the same abbreviation is to be recorded, multiple notes may be supplied. It may also be appropriate to use the markup for critical apparatus; an example is given in section 12.3 Using Apparatus Elements in Transcriptions.

11.3.3 Correction and Conjecture

The sic, corr, and choice elements, defined in the core module, should be used to indicate passages deemed in need of correction, or actually corrected, during the transcription of a source. For example, in the manuscript of William James's A Pluralistic Universe, edited by Fredson Bowers (Cambridge: Harvard University Press, 1977), a sentence first written

One must have lived longer with this system, to appreciate its advantages.

has been modified by James to begin ‘But One must...’, without the initial capital O having been reduced to lowercase. This non-standard orthography could be recorded thus:
But <sic>One</sic>
must have lived ...
or corrected:
But <corr>one</corr> must
have lived ...
or the two possibilities might be represented as a choice:
But
<choice>
 <sic>One</sic>
 <corr>one</corr>
</choice> must have lived
...
Similarly, in this example from Albertus Magnus, both a manuscript error angues and its correction augens are registered within a choice element:
Nos autem iam ostendimus quod nutrimentum
et <choice>
 <sic>angues</sic>
 <corr>augens</corr>
</choice>.

Note that the corr element is used to provide a corrected form which is not present in the source; in the case of a correction made in the source itself, whether scribal, authorial, or by some other hand, the add, del, and subst elements described in 11.3.4 Additions and Deletions should be used.

The sic element is used to mark passages considered by the transcriber to be erroneous; in such cases, the corr element indicates the transcriber's correction of them. Where the transcriber considers that one or more words have been erroneously omitted in the original source and corrects this omission, the supplied element discussed in 11.3.7 Text Omitted from or Supplied in the Transcription should be used in preference to corr. Thus, in the following example, from George Moore's draft of additional materials for Memoirs of My Dead Life, the transcriber supplies the word we omitted by the author:
You see that I avoid the word create for we
create nothing <supplied>we</supplied> develope.

As with expan and abbr, the choice as to whether to record simply that there is an apparent error, or simply that a correction has been applied, or to record both possible readings within a choice element is left to the encoder. The decision is likely to be a matter of editorial policy, which might be applied consistently throughout or decided case by case. If the highest priority is to present an uncorrected transcription while noting perceived errors in the original, the choice will typically be to use only sic throughout. If the highest priority is to present a reading transcription, while indicating that perceived errors in the original have been corrected, the choice will be to use only corr throughout.

Further information may be attached to instances of these elements by the note element and resp and cert attributes. Instances of these elements may also be classified according to any convenient typology using the type attribute.

For example, consider the following encoding of an emendation in the Hengwrt manuscript proposed by E. Talbot Donaldson:
Telle me also, to what conclusioun
Were membres maad, of generacioun
And of so parfit wis a
<choice xml:id="corr117">
 <sic>wight</sic>
 <corr>wright</corr>
</choice>
ywroght?

<!-- ... -->
<note target="#corr117">This emendation of the Hengwrt copy text,
based on a Latin source and on the reading of three late
and usually unauthoritative manuscripts, was proposed
by E. Talbot Donaldson in <bibl>
  <title>Speculum</title> 40 (1965)
   626–33.</bibl>
</note>
The note element discussed in 3.8 Notes, Annotation, and Indexing may be used to give a more detailed discussion of the motivation for or scope of a correction. If linked by means of a pointer (as in this example) it may be located anywhere convenient within the transcription; typically all detailed notes will be collected together in a separate div element in the back. Alternatively, the pointer may be omitted, and the note placed immediately adjacent to the element being annotated. The advantage of the former solution is that it permits the same annotation to refer to several corrections.
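As a minimal sketch of the second option, the Hengwrt emendation could carry its note inline, with no identifier on the choice element and no target attribute on the note. The wording of the note is abbreviated here for illustration.

```xml
And of so parfit wis a
<choice>
 <sic>wight</sic>
 <corr>wright</corr>
</choice><note>Emendation proposed by E. Talbot Donaldson in
<bibl>
  <title>Speculum</title> 40 (1965) 626–33.</bibl>
</note>
ywroght?
```

An inline note of this kind annotates only the element it follows, so it would have to be repeated wherever the same remark applies to another correction.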
The attribute cert may be used to indicate the degree of confidence ascribed by the encoder to the proposed emendation on a broad scale: high, medium, or low. The attribute resp is used to indicate who is responsible for the proposed emendation. Its value is a pointer, which will typically indicate a respStmt or name element in the header of the transcribed document, but can point anywhere, for example to some online authority file. Using these two attributes, the corr element presented above might usefully be enhanced as follows:

<!-- somewhere in the header ... --><name xml:id="ETD">E Talbot Donaldson</name>
<!-- ... -->
And of so parfit wis a
<choice>
 <sic>wight</sic>
 <corr resp="#ETD" cert="medium">wright</corr>
</choice>
ywroght?
As remarked above, where the same annotation applies to several corrections, this may be represented by supplying multiple pointers on the note. Consider for example such corrections as the following, in Dudo of S. Quentin. Parkes cites two cases in this manuscript of the same phenomenon:
quamuis <choice xml:id="sic-1">
 <sic>mens</sic>
 <corr>iners</corr>
</choice> que nutu dei
gesta sunt ... unde esset uiriliter
<choice xml:id="sic-2">
 <corr>uegetata</corr>
 <sic>negata</sic>
</choice>
which may be described as follows:
<note target="#sic-1 #sic-2">Substitution of a more familiar word which resembles
graphically what the scribe should be copying but which does not make
sense in the context.</note>
The target attribute on the note element indicates the choice elements which exemplify this kind of scribal error. This necessitates the addition of an identifier to each choice element. However, if the number of corrections is large and the number of notes is small, it may well be both more practical and more appropriate to regard the collection of annotations as constituting a typology and then use the type attribute. Suppose that the note given above is one of half a dozen possible kinds of corrected phenomena identified in a given text; others might include, say, ‘repetition of a word from the preceding line’, etc. The type attribute on the corr element can be used to specify an arbitrary code for the particular kind of correction (or other editorial intervention) identified within it. This code can be chosen freely and is not treated as a pointer.
quamuis
<choice>
 <sic>mens</sic>
 <corr type="graphSubs">iners</corr>
</choice> que nutu dei
gesta sunt ... unde esset uiriliter
<choice>
 <corr type="graphSubs">uegetata</corr>
 <sic>negata</sic>
</choice>
Note that this encoding might be extended to include a range of possible corrections:
quamuis
<choice>
 <sic>mens</sic>
 <corr type="graphSubs">iners</corr>
 <corr type="reversal">inres</corr>
</choice> que nutu dei
gesta sunt ...
In addition, the conscientious encoder will provide documentation explaining the circumstances in which particular codes are judged appropriate. A suitable location for this might be within the correction element of the encodingDesc of the header, which might include a list such as the following:
<correction>
 <p>The following codes are used to categorise corrections identified
   in this transcription:
 <list type="gloss">
   <label>graphSubs</label>
   <item>Substitution of a more familiar word which resembles
       graphically what the scribe should be copying but which does not make
       sense in the context.</item>
<!-- ... -->
  </list>
 </p>
</correction>
A subtype attribute may be used in conjunction with the type attribute for subclassification purposes: the above examples might thus be represented as choice type="substitution" subtype="graphicResemblence" for example.
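Spelled out in full, the representation just mentioned might look like the following sketch for the first of the Dudo examples. The attribute values are the arbitrary codes suggested in the text; the fragment assumes a schema in which the choice element accepts the type and subtype attributes.

```xml
quamuis
<choice type="substitution" subtype="graphicResemblence">
 <sic>mens</sic>
 <corr>iners</corr>
</choice> que nutu dei
gesta sunt ...
```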

For a given project, it may well be desirable to limit the possible values for the type or subtype attributes automatically. This is easily done but requires customization of the TEI system using techniques described in 23.2 Personalization and Customization, in particular 23.2.1.4 Modification of Attribute and Attribute Value Lists, which should be consulted for further information on this topic.
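A rough sketch of such a customization, written in the TEI's ODD language, is given below. It closes the list of values permitted for the type attribute of corr. The two codes are those used in the examples above; the description given for reversal is invented for illustration, and the fragment is assumed to sit inside a schemaSpec that already includes the core module.

```xml
<elementSpec ident="corr" module="core" mode="change">
 <attList>
  <attDef ident="type" mode="change">
<!-- type="closed" means that only the values listed here are valid -->
   <valList type="closed" mode="replace">
    <valItem ident="graphSubs">
     <desc>Substitution of a graphically similar, more familiar word</desc>
    </valItem>
    <valItem ident="reversal">
     <desc>Hypothetical description: adjacent letters written in reverse order</desc>
    </valItem>
   </valList>
  </attDef>
 </attList>
</elementSpec>
```

A schema generated from such a specification would reject any corr element whose type is not one of the listed codes, which catches misspelled codes at validation time.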

When making a correction in a source which forms part of a textual tradition attested by many witnesses, a textual editor will sometimes use a reading from one witness to correct the reading of the source text. In the general case, such encoding is best achieved with the mechanisms provided by the module for textual criticism described in chapter 12 Critical Apparatus. However, for simple cases, the source attribute of the corr element may suffice. In the passage from Chaucer's Wife of Bath's Tale mentioned above, Parkes proposes to emend the problematic word wight to wyf, which is the reading found in the Cambridge manuscript Gg.1. 27. This may be simply represented as follows:
And of so parfit wis a
<choice>
 <sic>wight</sic>
 <corr resp="#mp" source="#Gg">wyf</corr>
</choice>
ywroght?
The value of the source attribute here is, like the value of the resp attribute, a pointer, in this case indicating the manuscript used as a witness. Elsewhere in the transcribed text, a list of witnesses used in this text will be given, one of which has an identifier Gg. Each witness will be represented either by a witness element (see 12.1 The Apparatus Entry, Readings, and Witnesses) or more fully by a msDesc element (see 10 Manuscript Description):
<msDesc xml:id="Gg">
 <msIdentifier>
  <settlement>Cambridge</settlement>
  <repository>University Library</repository>
  <idno>Gg.1. 27</idno>
 </msIdentifier>
<!-- further description of the manuscript here -->
</msDesc>
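For comparison, the briefer alternative might be sketched as follows, with the same identifier carried by a witness element inside a listWit. The prose content of the witness element is simply a condensed form of the identifier given in the msDesc; only one of the two declarations would appear in a given document, since the identifier Gg must be unique.

```xml
<listWit>
 <witness xml:id="Gg">Cambridge, University Library, Gg.1. 27</witness>
<!-- further witnesses here -->
</listWit>
```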
The app element described in chapter 12 Critical Apparatus provides a more powerful way of representing all three possible readings in parallel:
And of so
parfit wis a
<app>
 <rdg wit="#Hg">wight</rdg>
 <rdg wit="#Ln #Ry2 #Ld">wright</rdg>
 <rdg wit="#Gg">wyf</rdg>
</app>
This encoding simply records the three readings found in the various traditions, and gives (by means of the wit attribute) an indication of the witnesses supporting each. If the resp attribute were supplied on the rdg element, it would indicate the person responsible for asserting that the manuscript indicated has this reading, who is not necessarily the same as the person responsible for asserting that this reading should be used to correct the others. Editorial intervention elements such as corr can however be nested within a rdg to provide this additional information:
And of so
parfit wis a
<app>
 <rdg wit="#Hg">wight</rdg>
 <rdg wit="#Ln #Ry2 #Ld">
  <corr resp="#ETD">wright</corr>
 </rdg>
 <rdg wit="#Gg">
  <corr resp="#mp">wyf</corr>
 </rdg>
</app>
This encoding asserts that the reading wyf found in Gg is regarded as a correction by Parkes.

Like the resp attribute, the cert attribute may be used with both corr and rdg elements. When used on the rdg element, these attributes indicate confidence in and responsibility for identifying the reading within the sources specified; when used on the corr element they indicate confidence in and responsibility for the use of the reading to correct the base text. If no other source is indicated (either by the source attribute, or by the wit attribute of a parent rdg), the reading supplied within a corr has been provided by the person indicated by the resp attribute.
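The distinction can be illustrated with a small sketch based on the preceding example. The cert values are invented for illustration: the first says that Parkes is highly confident that Gg reads wyf, the second that he is less sure that this reading should be used to correct the base text.

```xml
<app>
 <rdg wit="#Hg">wight</rdg>
 <rdg wit="#Gg" resp="#mp" cert="high">
  <corr resp="#mp" cert="medium">wyf</corr>
 </rdg>
</app>
```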

If it is desired to express aspects of certainty and responsibility for some other aspect of the use of these elements, then the mechanisms discussed in chapter 21 Certainty and Responsibility may be found useful. See also 11.4.2 Hand, Responsibility, and Certainty Attributes for further discussion of the issues of certainty and responsibility in the context of transcription.

11.3.4 Additions and Deletions

Additions and deletions observed in a source text may be described using the following elements:
  • add (addition) contains letters, words, or phrases inserted in the text by an author, scribe, annotator, or corrector.
  • addSpan/ (added span of text) marks the beginning of a longer sequence of text added by an author, scribe, annotator or corrector (see also add).
  • del (deletion) contains a letter, word, or passage deleted, marked as deleted, or otherwise indicated as superfluous or spurious in the copy text by an author, scribe, annotator, or corrector.
  • delSpan/ (deleted span of text) marks the beginning of a longer sequence of text deleted, marked as deleted, or otherwise signaled as superfluous or spurious by an author, scribe, annotator, or corrector.
Of these, add and del are included in the core module, while addSpan and delSpan are available only when using the module defined in this chapter. The latter two elements are members of the att.spanning class, from which they inherit the following attribute:
  • att.spanning provides attributes for elements which delimit a span of text by pointing mechanisms rather than by enclosing it.
    spanTo indicates the end of a span initiated by the element bearing this attribute.
Further characteristics of each addition and deletion, such as the hand used, its effect (complete or incomplete, for example), or its position in a sequence of such operations may conveniently be recorded as attributes of these elements, all of which are members of the att.transcriptional class:
  • att.transcriptional provides attributes specific to elements encoding authorial or scribal intervention in a text when transcribing manuscript or similar sources.
    seq (sequence) assigns a sequence number related to the order in which the encoded features carrying this attribute are believed to have occurred.
    status indicates the effect of the intervention, for example in the case of a deletion, strikeouts which include too much or too little text, or in the case of an addition, an insertion which duplicates some of the text already present.
    hand signifies the hand of the agent which made the intervention.
As described in section 3.4 Simple Editorial Changes, the add element is used to record any manuscript addition observed in the text, whether it is considered to be authorial or scribal. In the autograph manuscript of Max Beerbohm's The Golden Drugget, the author's addition of do ever may be recorded as follows, with the hand attribute indicating that the addition was Beerbohm's by referencing a handNote element defined elsewhere in the document (see further 11.4.1 Document Hands):
Some things are best at first
sight. Others — and here is one of them — <add hand="#mb">do
ever</add> improve by recognition ....

<handNote xml:id="mb">Max Beerbohm holograph</handNote>
Similarly, the del element is used to record manuscript deletions. In the autograph manuscript of D. H. Lawrence's Eloi, Eloi, lama sabachthani, the author's deletion of my may be recorded as follows. In this case, the hand attribute indicating that the deletion was Lawrence's is complemented by a rend attribute indicating that the deletion was by strike-through:
For I hate this <del rend="strikethrough" hand="#dhl">my</del> body, which is so dear to me
...

<handNote xml:id="dhl">D H Lawrence holograph</handNote>

If deletions are classified systematically, the type attribute may be useful to indicate the classification; when they are classified by the manner in which they were effected, or by their appearance, however, this will lead to a certain arbitrariness in deciding whether to use the type or the rend attribute to hold the information. In general, it is recommended that the rend attribute be used for description of the appearance or method of deletion, and that the type attribute be reserved for higher level or more abstract classifications.
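A brief sketch of this division of labour, reusing the Lawrence example: rend records how the deletion looks on the page, while type carries an editorial classification. The value authorialRevision is a hypothetical code chosen for illustration, not one defined by these Guidelines.

```xml
For I hate this
<del rend="strikethrough" type="authorialRevision" hand="#dhl">my</del>
body, which is so dear to me ...
```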

The place attribute is also available to indicate the location of an addition. For example, consider the following passage from a draft letter by Robert Graves:
At the end of this extract, the writer inserts the word ‘cant,’ above the line, with a stroke to indicate insertion: this might be encoded as follows:
The O.E.D. is not a dictionary so much as a corpus of
precedents <del hand="#RG">in the</del>: current,
obsolete, <add hand="#RG" place="supralinear">cant,</add>
cataphretic and nonce-words are all included.
A little earlier in the same extract, Graves writes ‘for an abridgement’ above the line, and then deletes it. This may be encoded similarly:
As for 'significant artist.' You quote the O.E.D <add hand="#RG" place="supralinear">
 <del>for an abridgement</del>
</add>in
explanation...
Similarly, in the margin, the word ‘Norton’ has been added and then deleted:
You quote the <add hand="#RG" place="left">
 <del>Norton</del>
</add>
O.E.D...
The word ‘O.E.D.’ in this first sentence has also clearly been the result of some redrafting: it may be that Graves started to write ‘Oxford’, and then changed it; it may be that he inserted other punctuation marks between the letters before replacing them with the centre dots used elsewhere to represent this acronym. We do not deal with these possibilities here, and mention them only to indicate that any encoding of manuscript material of this complexity will need to make decisions about what is and is not worth mentioning.
An encoder may also wish to indicate that an addition replaces a specific deletion, that is to encode a substitution as a single intervention in the text. This may be achieved by grouping the addition and deletion together within a subst element. At the end of the passage illustrated above, Graves first writes ‘It is the expressed...’, then deletes ‘It is’, and substitutes an upper-case T at the start of ‘the’:
...
are all included. <del hand="#RG">It is</del>
<subst>
 <add>T</add>
 <del>t</del>
</subst>he expressed
The use of this element and of the seq attribute to indicate the order in which interventions such as deletions are believed to have occurred is further discussed in section 11.3.5 Substitutions below.

The add and del elements defined in the core module suffice only for the description of additions and deletions which fit within the structure of the text being transcribed, that is, where each deletion or addition is completely contained by the structural element (paragraph, line, division) within which it occurs. Where this is not the case, for example because an individual addition or deletion involves several distinct structural subdivisions, such as poems or prose items, or otherwise crosses a structural boundary in the text being encoded, special treatment is needed. The addSpan and delSpan elements are provided by this module for that purpose. (For a general discussion of the issue see further 20 Non-hierarchical Structures.)

In this example of the use of addSpan, the insertion by Helgi Ólafsson of a gathering containing four neo-Eddic poems into Lbs 1562 4to is recorded as follows. A handNote element is first declared, within the header of the document, to associate the identifier heol with Helgi. Each of the added poems is encoded as a distinct div element. In the body of the text, an addSpan element is placed to mark the beginning of the span of added text, and an anchor is used to mark its end. The hand attribute on the addSpan element ascribes responsibility for the addition to the manuscript to Helgi, and the spanTo attribute points to the end of the added text:
<handNote xml:id="heol" scribe="HelgiÓlafsson"/>
<!-- ... -->
<body>
 <div>
<!-- text here -->
 </div>
 <addSpan n="added gathering" hand="#heol" spanTo="#p025"/>
 <div>
<!-- text of first added poem here -->
 </div>
 <div>
<!-- text of second added poem here -->
 </div>
 <div>
<!-- text of third added poem here -->
 </div>
 <div>
<!-- text of fourth added poem here -->
 </div>
 <anchor xml:id="p025"/>
 <div>
<!-- more text here -->
 </div>
</body>
The delSpan element is used in the same way. An authorial manuscript will often contain passages in which sequences of whole lines are marked for deletion, either by boxes or by being struck out. If the encoder is marking up individual verse lines with the l element, such deletions are problematic: deletion of two consecutive lines should be regarded as a single deletion, but the del element must be properly nested within a single l element. The delSpan element solves this problem:
<l>Flowed up the hill and down King William Street,</l>
<delSpan spanTo="#EPdelEnd" resp="#EP" rend="strikethrough"/>
<l>To where Saint Mary Woolnoth kept the time,</l>
<l>With a dead sound on the final stroke of nine.</l>
<anchor xml:id="EPdelEnd"/>
<l>There I saw one I knew, and stopped him, crying "Stetson!</l>...
It is also often the case that deletions and additions may themselves contain other deletions and additions. For example, in Thomas Moore's autograph of the second version of Lalla Rookh two lines are marked for omission by vertical strike-through. Within the first of the two lines, the word upon has also been struck out, and the word over has been added:
<l>
 <delSpan rend="verticalStrike" spanTo="#delend01"/>
Tis moonlight <del>upon</del>
 <add>over</add> Oman's sky
</l>
<l>Her isles of pearl look lovelily<anchor xml:id="delend01"/>
</l>
In this case the anchor and delSpan have been placed within the structural elements (the l elements) rather than between them, as in the previous example. This is to indicate that placement of these empty elements is arbitrary.

The text deleted must be at least partially legible, in order for the encoder to be able to transcribe it. If it is not legible at all, the gap element should be used to signal that the text was not transcribed, because it could not be; the reason attribute can give the cause of the omission from the transcription as ‘deletion, illegible’. If the deleted text is partially legible, the unclear element described in section 11.5.1 Damage, Illegibility, and Supplied Text may be used to indicate areas of text which cannot be read with confidence. See further section 11.3.7 Text Omitted from or Supplied in the Transcription and section 11.5.1 Damage, Illegibility, and Supplied Text.
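The two cases might be sketched as follows. The sentence and the deleted words are invented for illustration, and the reason value is one of the sample values listed for the gap element in section 11.3.7; a project may prefer its own vocabulary.

```xml
<!-- Deleted text that cannot be read at all: nothing is transcribed -->
I remember <gap reason="cancelled and illegible" extent="1" unit="word"/> the house

<!-- Deleted text that is only partly legible: the doubtful word is marked -->
I remember <del rend="strikethrough">very <unclear>clearly</unclear>
</del> the house
```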

11.3.5 Substitutions

Substitution of one word or phrase for another is perhaps the most common of all phenomena requiring special treatment in transcription of primary textual sources. It may be simply one word overwriting another, or deletion of one word and its replacement by another written above it by the same hand at the one time; the deletion and replacement may be done by different hands at different times; there may be a long chain of substitutions on the one stretch of text, with uncertainty as to the order of substitution and as to which of many possible readings should be preferred.

As we have shown, the simplest method of recording a substitution is simply to record both the addition and the deletion. However, when the module defined by this chapter is in use, an additional element is available to indicate that the encoder believes the addition and the deletion to be part of the same intervention: a substitution.
  • subst (substitution) groups one or more deletions with one or more additions when the combination is to be regarded as a single intervention in the text.
Using this element, the example at the end of the last section might be encoded as follows:
<l>
 <delSpan rend="verticalStrike" spanTo="#delend02"/>
Tis moonlight <subst>
  <del>upon</del>
  <add>over</add>
 </subst> Oman's sky
</l>
<l>Her isles of pearl look lovelily<anchor xml:id="delend02"/>
</l>
Since the purpose of this element is solely to group its child elements together, the order in which they are presented is not significant. By convention, however, deletion precedes addition. This may be over-ridden by means of the seq attribute, which is of particular usefulness when a sequence of deletions and additions occurs.
For example, returning to the example from William James, in a passage first written out by James as ‘One must have lived longer with this system, to appreciate its advantages’ the word this is first replaced by such a and this is then replaced by a. This may be encoded as follows, representing the two changes as a sequence of additions and deletions:
One must have lived longer
with <subst>
 <del seq="1">this</del>
 <del seq="2">
  <add seq="1">such
     a</add>
 </del>
 <add seq="2">a</add>
</subst> system, to appreciate its
advantages.
Note the nesting of an add element within a del to record text first added, then deleted in the source. The numbers assigned by the seq attribute may be used to identify the order in which the various additions and deletions are believed by the encoder to have been carried out, and thus provide a simple method of supporting the kind of ‘genetic’ textual criticism typified by (for example) Hans Walter Gabler's work on the reconstruction of the ‘overlay’ levels implicit in the manuscripts of James Joyce's Ulysses.
As a more complex example, consider the following passage in one of the manuscripts of Wilfred Owen's Dulce et decorum est:
This passage might be encoded as follows:
<l>And towards our distant rest began to trudge,</l>
<l>
 <subst>
  <del>Helping the worst amongst us</del>
  <add>Dragging the
     worst amongt us</add>
 </subst>, who'd no boots
</l>
<l>But limped on, blood-shod. All went lame;
<subst>
  <del status="shortEnd">half-</del>
  <add>all</add>
 </subst> blind;</l>
<l>Drunk with fatigue ; deaf even to the hoots</l>
<l>Of tired, outstripped <del>fif</del> five-nines that dropped behind.</l>
In this representation,
  • the false start fif in the last line is simply marked as a deletion;
  • the other two authorial corrections are marked as substitutions, each combining a deletion and an addition;
  • the authorial slip (amongt for amongst) is retained without comment.
The app element presented in chapter 12 Critical Apparatus provides similar facilities, by treating each state of the text as a distinct reading. The rdg element has a varSeq attribute which may be used in the same way as the seq attribute to indicate the preferred sequence. The James example above might thus be represented as follows:
One must have lived longer with
<app>
 <rdg varSeq="1">
  <del>this</del>
 </rdg>
 <rdg varSeq="2">
  <del>
   <add>such a</add>
  </del>
 </rdg>
 <rdg varSeq="3">
  <add>a</add>
 </rdg>
</app>
system, to appreciate its advantages.

11.3.6 Cancellation of Deletions and Other Markings

An author or scribe may mark a word or phrase in some way, and then on reflection decide to cancel the marking. For example, text may be marked for deletion and the deletion then cancelled, thus restoring the deleted text. Such cancellation may be indicated by the restore element:
  • restore indicates restoration of text to an earlier state by cancellation of an editorial or authorial marking or instruction.

This element bears the same attributes as the other transcriptional elements. These may be used to supply further information such as the hand in which the restoration is carried out, the type of restoration, and the person responsible for identifying the restoration as such, in the same way as elsewhere.

Presume that Lawrence decided to restore my to the phrase of Eloi, Eloi, lama sabachthani first written ‘For I hate this my body’, with the my first deleted then restored by writing ‘stet’ in the margin. This may be encoded:
For I hate this
<restore hand="#dhl" type="marginalStetNote">
 <del>my</del>
</restore>
body

Another feature commonly encountered in manuscripts is the use of circles, lines, or arrows to indicate transposition of material from one point in the text to another. No specific markup for this phenomenon is proposed at this time. Such cases are most simply encoded as additions at the point of insertion and deletions at the point of encirclement or other marking.
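A minimal sketch of this approach follows. The sentence is invented for illustration, as are the rend value and the hand identifier #h1; the word ‘slowly’ is circled where it was first written and an arrow carries it to a later point in the line.

```xml
He <del rend="encircled" hand="#h1">slowly</del> walked
<add hand="#h1">slowly</add> towards the door.
```

Nothing in this encoding states that the addition and the deletion are the same material moved; if that link matters, it would have to be recorded separately, for example in a note.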

11.3.7 Text Omitted from or Supplied in the Transcription

Where text is not transcribed, whether because of damage to the original, or because it is illegible, or for some other reason such as editorial policy, the gap core element should be used to register the omission; where text not present in the source is supplied (whether conjecturally or from other witnesses) to fill an apparent gap in the text, it should be marked using the supplied element provided by the module defined in this chapter.
  • gap indicates a point where material has been omitted in a transcription, whether for editorial reasons described in the TEI header, as part of sampling practice, or because the material is illegible or inaudible.
    reason gives the reason for omission. Sample values include sampling, illegible, inaudible, irrelevant, cancelled, cancelled and illegible.
    hand in the case of text omitted from the transcription because of deliberate deletion by an identifiable hand, signifies the hand which made the deletion.
    agent in the case of text omitted because of damage, categorizes the cause of the damage, if it can be identified.
  • supplied signifies text supplied by the transcriber or editor for any reason, typically because the original cannot be read because of physical damage or loss to the original.
    reason indicates why the text has had to be supplied.
By its nature, the gap element has no content. It marks a point in the text where nothing at all can be read, whether because of authorial or scribal erasure, physical damage, or any other form of illegibility. Its attributes allow the encoder to specify the amount of text which is illegible in this way at this point, using any convenient units, where this can be determined. For example, in the Beerbohm manuscript of The Golden Drugget cited above, the author has erased a passage amounting to about 10 cm in length by inking over it completely:
Others <gap
  reason="cancelled"
  hand="#mb"
  extent="10"
  unit="cm"/>
—and
here is one of them...
In an autograph letter of Sydney Smith now in the Pierpont Morgan Library three words in the signature are quite illegible:
I am dr Sr yr <gap reason="illegible" extent="3" unit="word"/> Sydney Smith
The degree of precision attempted when measuring the size of a gap will vary with the purpose of the encoding and the nature of the material: no particular recommendation is made here.

As noted above, the gap element should only be used where text has not been transcribed; if partially legible text has been transcribed, one of the elements damage and unclear should be used instead. These elements are described in section 11.5.1 Damage, Illegibility, and Supplied Text.

If the source text is completely illegible or missing, an encoder may sometimes wish to supply new (conjectural) material to replace it. This conjectural reading is analogous to a correction in that it contains text provided by the encoder and not attested in the source. This is not however a correction, since no error is necessarily present in the original; for that reason a different element, supplied, should be used. If another (imaginary) copy of the letter above preserved the signature as reading ‘I am dear Sir your very humble Servt Sydney Smith’, the text illegible in the autograph might be supplied in the transcription:
I am dr Sr yr <supplied reason="illegible" resp="#msm" source="#AmCo">very humble Servt</supplied> Sydney Smith
Here the source and resp attributes are used, as elsewhere, to indicate respectively the sigil of a manuscript from which the supplied reading has been taken, and the identifier of the person responsible for deciding to supply the text. If the source attribute is not supplied, the implication is that the encoder (or whoever is indicated by the value of the resp attribute) has supplied the missing reading. Both gap and supplied may be used in combination with unclear, damage, and other elements; for discussion, see section 11.5.2 Use of the gap, del, damage, unclear, and supplied Elements in Combination.

11.4 Hands and Responsibility

This section discusses in more detail the representation of aspects of responsibility perceived or to be recorded for the writing of a primary source. These include points at which one scribe takes over from another, or at which ink, pen, or other characteristics of the writing change. A discussion of the usage of the hand, resp, and cert attributes is also included.

11.4.1 Document Hands

For many text-critical purposes it is important to signal the person responsible (the hand) for the writing of a whole document, a stretch of text within a document, or a particular feature within the document. A hand, as the name suggests, need not necessarily be identified with a particular known (or unknown) scribe or author; it may simply indicate a particular combination of writing features recognized within one or more documents. The examples given above of the use of the hand attribute with coding of additions and deletions illustrate this.

The handNote element is used to provide information about each hand distinguished within the encoded document.
  • handNote (note on hand) describes a particular style or hand distinguished within a manuscript.

A handNote element, with an identifier given by its xml:id attribute, may appear in either of two places in the TEI Header, depending on which modules are included in a schema. When the transcr module defined by the present chapter is used, the element handNotes is available, within the profileDesc element of the Header, to hold one or more handNote elements. When the msdescription module defined in chapter 10 Manuscript Description is included, the handDesc element described in 10.7.2 Writing, Decoration, and Other Notations also becomes available as part of a structured manuscript description. The encoder may choose to place handNote elements identifying individual hands in either location without affecting their accessibility, since the element is always addressed by means of its xml:id attribute. The handDesc element may be more appropriate when a full cataloguing of each manuscript is required; the handNotes element if only a brief characterization of each hand is needed. It is also possible to use the two elements together if, for example, the handDesc element contains a single summary describing all the hands discursively, while the handNotes element gives specific details of each. The choice will depend on individual encoders' priorities.

As shown above, the hand attribute is available on several elements to indicate the hand in which the content of the element (usually a deletion or addition) is carried out. The handShift element may also be used within the body of a transcription to indicate where a change of hand is detected for whatever reason.
  • handShift/ marks the beginning of a sequence of text written in a new hand, or the beginning of a scribal stint.
Both handShift and handNote are members of the att.handFeatures class, and thus share the following attributes:
  • att.handFeatures provides attributes describing aspects of the hand in which a manuscript is written.
    scribe gives a standard name or other identifier for the scribe believed to be responsible for this hand.
    script characterizes the particular script or writing style used by this hand, for example secretary, copperplate, Chancery, Italian, etc.
    medium describes the tint or type of ink, e.g. brown, or other writing medium, e.g. pencil.
    scope specifies how widely this hand is used in the manuscript.

A single hand may employ different writing styles and inks within a document, or may change character. For example, the writing style might shift from ‘anglicana’ to ‘secretary’, or the ink from blue to brown, or the character of the hand may change. Simple changes of this kind may be indicated by assigning a new value to the appropriate attribute within the handShift element. It is for the encoder to decide whether a change in these properties of the writing style is so marked as to require treatment as a distinct hand.

Where such a change is to be identified, the new attribute is used to indicate the hand applicable to the material following the handShift. This will ordinarily, but not necessarily, be the order in which the material was originally written.

As might be expected, one hand may employ different renditions within the one writing style; for example, medieval scribes often indicate a structural division by emboldening all the words within a line. These should be indicated by use of the rend attribute on an element, in the same manner as underlining, emboldening, font shifts, etc. are represented in transcription of a printed text, rather than by introducing a new handShift element.

In the following example there is a change of ink within the one hand. This is simply indicated by a new value for the medium attribute on the handShift element:
<l>When wolde the cat dwelle in his ynne</l>
<handShift medium="greenish-ink"/>
<l>And if the cattes skynne be slyk <handShift medium="black-ink"/> and gaye</l>
In the following example, the encoder has identified two distinct hands within the document and given them identifiers h1 and h2, by means of the following declarations included in the document's TEI Header:
<handNotes>
 <handNote xml:id="h1" script="copperplate" medium="brown-ink">Carefully written with regular descenders</handNote>
 <handNote xml:id="h2" script="print" medium="pencil">Unschooled scrawl</handNote>
</handNotes>
Then the change of hand is indicated in the text:
<handShift new="#h1" resp="#das"/>... and that good Order Decency and regular worship
may be once more introduced and Established in this
Parish according to the Rules and Ceremonies of the
Church of England and as under a good Consciencious
and sober Curate there would and ought to be
<handShift new="#h2" resp="#das"/>
and for that purpose the parishioners pray

11.4.2 Hand, Responsibility, and Certainty Attributes

The hand and resp attributes have similar, but not identical, meanings. Observe their distinctive uses in the following encoding of the William James passage mentioned above in section 11.3.3 Correction and Conjecture. In this example, the But inserted by James is tagged as an add, and the consequent editorial correction of One to one treated separately:
<add place="supralinear" resp="#FB" hand="#WJ">But</add>
<choice>
 <sic>One</sic>
 <corr resp="#FB">one</corr>
</choice> must have
lived ...

<!-- elsewhere -->
<respStmt xml:id="FB">
 <resp>editorial changes</resp>
 <name>Fredson Bowers</name>
</respStmt>
<respStmt xml:id="WJ">
 <resp>authorial changes</resp>
 <name>William James</name>
</respStmt>
As in this example, hand should be reserved for indicating the hand of any form of marking (here, addition, but also deletion, correction, annotation, underlining, etc.) within the primary text being transcribed. The scribal or authorial responsibility for this marking may be inferred from the value of the hand attribute. The value of the hand attribute should be one of the hand identifiers declared in the document header (see section 11.4.1 Document Hands).

The resp attribute, by contrast, indicates the person responsible for deciding to apply the element carrying it to this part of the text, and hence has a slightly different interpretation. In the case of the add element, for example, the resp attribute will indicate the responsibility for identifying that the addition is indeed an addition, and also (if the hand attribute is supplied) to which hand it should be attributed. In this case, Bowers is credited with identifying the hand as that of William James. In the case of the corr element, the resp attribute indicates who is responsible for supplying the intellectual content of the correction reported in the transcription: here, Bowers' correction of ‘One’ to ‘one’. In the case of a deletion, the resp attribute will similarly indicate who bears responsibility for identifying or categorising the deletion, while other attributes (hand most obviously) attribute responsibility for making the deletion itself.
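
A minimal sketch of the deletion case, reusing the identifiers WJ and FB from the example above. The transcribed words and the rend value are invented for illustration and are not taken from the James manuscript. Here hand records who struck the word out, while resp records who identified and categorised the deletion:
so it <del rend="strikethrough" hand="#WJ" resp="#FB">surely</del> must have been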

As these examples show, the field of application of the resp attribute varies from element to element. In some cases, it applies to the content of the element (corr, ex, and supplied); in others it applies to the value of a particular attribute (sic, abbr, del, etc.). In all cases where both the resp and cert attributes are defined for a particular element, the two attributes refer to the same aspect of the markup. The one indicates who is intellectually responsible for some item of information, the other indicates the degree of confidence in the information. Thus, for a correction, the resp attribute signifies the person responsible for supplying the correction, while the cert attribute signifies the degree of editorial confidence felt in that correction. For the expansion of an abbreviation, the resp attribute signifies the person responsible for supplying the expansion and the cert attribute signifies the degree of editorial confidence felt in the expansion.

This close definition of the use of the resp and cert attributes with each element is intended to provide for the most frequent circumstances in which encoders might wish to make unambiguous statements regarding the responsibility for and certainty of aspects of their encoding. The resp and cert attributes, as so defined, give a convenient mechanism for this. However, there will be cases where it is desired to state responsibility for and certainty concerning other aspects of the encoding. For example, one may wish in the case of an apparent addition to state the responsibility for the use of the add element, rather than the responsibility for identifying the hand of the addition. It may also be that one editor makes an electronic transcription of another editor's printed transcription of a manuscript text; here, one will wish to assign layers of responsibility, so as to allow the reader to determine exactly what in the final transcription was the responsibility of each editor. In these complex cases of divided editorial responsibility for and certainty concerning the content, attributes, and application of a particular element, the more general mechanisms for representing certainty and responsibility described in chapter 21 Certainty and Responsibility should be used.

It should be noted that the certainty and responsibility mechanisms described in chapter 21 Certainty and Responsibility replicate all the functions of the resp and cert attributes on particular elements. For example, the encoding of Donaldson's conjectured emendation of wight to wright in line 117 of Chaucer's Wife of Bath's Prologue (see 11.3.3 Correction and Conjecture) may be encoded as follows using the resp and cert attributes on the corr element:
<choice>
 <sic>wight</sic>
 <corr resp="#ETD" cert="medium">wright</corr>
</choice>
Exactly the same information could be conveyed using the certainty and responsibility mechanisms, as follows:
<choice>
 <corr xml:id="c117">wright</corr>
 <sic>wight</sic>
</choice>
<certainty target="#c117" locus="transcribedContent" degree="0.7"/>
<respons target="#c117" locus="transcribedContent" resp="#ETD"/>
The choice of which mechanism to use is left to the encoder. In transcriptions where only such statements of responsibility and certainty are made as can be accommodated within the resp and cert attributes of particular elements, it will be economical to use the resp and cert attributes of those elements. Where many statements of responsibility and certainty are made which cannot be so accommodated, it may be economical to use the respons and certainty elements throughout.

The above discussion supposes that in each case an encoder is able to specify exactly what it is that one wishes to state responsibility for and certainty about. Situations may arise when an encoder wishes to make a statement concerning certainty or responsibility but is unable or unwilling to specify so precisely the domain of the certainty or responsibility. In these cases, the note element may be used with the type attribute set to ‘cert’ or ‘resp’ and the content of the note giving a prose description of the state of affairs.
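
A minimal sketch of such a note, attached to the Chaucer emendation shown above. The wording of the note is an assumption made for illustration and does not report any editor's actual statement:
<choice>
 <sic>wight</sic>
 <corr>wright</corr>
</choice>
<note type="cert">The emendation seems probable, but it is not clear whether the doubt attaches to the reading itself or to the decision to emend at all.</note>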

11.5 Damage and Conjecture

The carrier medium of a primary source may often sustain physical damage which makes parts of it hard or impossible to read. In this section we discuss elements which may be used to represent such situations and give recommendations about how these should be used in conjunction with the other related elements introduced previously in this chapter.

11.5.1 Damage, Illegibility, and Supplied Text

The gap and supplied elements described above (section 11.3.7 Text Omitted from or Supplied in the Transcription) should be used with appropriate attributes where the degree of damage or illegibility in a text is such that nothing can be read and the text must be either omitted or supplied conjecturally or from one or more other sources. In many cases, however, despite damage or illegibility, the text may yet be read with reasonable confidence. In these cases, the following elements should be used:
  • damage contains an area of damage to the text witness.
  • damageSpan/ (damaged span of text) marks the beginning of a longer sequence of text which is damaged in some way but still legible.
As members of the class att.damaged, these elements bear the following attributes:
  • att.damaged provides attributes describing the nature of any physical damage affecting a reading.
    extent indicates approximately how much text is in the damaged area, in letters, minims, inches, or any appropriate unit, where this cannot be deduced from the contents of the tag.
    hand in the case of damage (deliberate defacement, etc.) assignable to an identifiable hand, signifies the hand responsible for the damage.
    agent categorizes the cause of the damage, if it can be identified.
    degree signifies the degree of damage according to a convenient scale. The damage tag with the degree attribute should only be used where the text may be read with some confidence; text supplied from other sources should be tagged as supplied.
    group assigns an arbitrary number to each stretch of damage regarded as forming part of the same physical phenomenon.
As a member of the att.spanning class, damageSpan inherits the following additional attribute:
  • att.spanning provides attributes for elements which delimit a span of text by pointing mechanisms rather than by enclosing it.
    spanTo indicates the end of a span initiated by the element bearing this attribute.

The following examples all refer to the recto of folio 5 of the unique manuscript of the Elder Edda. Here, the manuscript of Vóluspá has been damaged through irregular rubbing so that letters in various places are obscured and in some cases cannot be read at all.

In the first line of this leaf, the transcriber may believe that the last three letters of daga can be read clearly despite the damage:
um aldr
d<damage>aga</damage> yndisniota
If, as is often the case, the damage crosses structural divisions, so that the damage element cannot be nested properly within the containing div elements, the damageSpan element may be used, in the same way as the delSpan and addSpan elements discussed in section 11.3.4 Additions and Deletions:
<p>
<!-- ... -->
 <pb n="5r"/>
 <damageSpan agent="rubbing" extent="whole leaf" spanTo="#damageEnd"/>
</p>
<p> .... </p>
<p> ....
<pb n="5v" xml:id="damageEnd"/>
</p>
Note that in this example the spanTo attribute points to the next pb element rather than to an inserted anchor element, since the whole of the leaf (the text between the two pb elements) has sustained damage. For other techniques of handling non-nesting information, see chapter 20 Non-hierarchical Structures.
If, as is also likely, the damage affects several disjoint parts of the text, each such part must be marked with a separate damage or damageSpan element. To indicate that each of these is to be regarded as forming part of the same damaged area, the group attribute may be used as in the following example. In this (imaginary) text of Fitzgerald's translation from Omar Khayyam, water damage has affected an area covering parts of several lines:
<l>The Moving Finger wri<damage agent="water" group="1">tes; and</damage> having writ,</l>
<l>Moves <damage agent="water" group="1">on: nor all your</damage> Piety nor Wit</l>
<l>
 <damageSpan agent="water" group="1" spanTo="#washOut"/>Shall lure it back to cancel half a Line,
</l>
<l>Nor all your Tears wash <anchor xml:id="washOut"/> out a Word of it</l>

A more general solution to this problem is provided by the join element discussed in 16.7 Aggregation, which may be used to link together arbitrary elements of any kind in the transcription. Where, as here, several phenomena of illegibility and conjecture all result from the one cause (an area of damage, such as rubbing, which is not continuous but affects the text at irregular points), the join element may be used to indicate which tagged features are part of the same physical phenomenon.

If the damage has been so severe as to render parts of the text only imperfectly legible, the unclear element should be used to mark the fact. Returning to the Eddic example above, an encoder less confident in the daga reading may indicate this as follows:
um aldr d<unclear reason="damage">aga</unclear> yndisniota
If it is desired to supply more information about the kind of damage, it is also possible to nest an unclear element within the damage element:
um aldr d<damage agent="rubbing">
 <unclear>aga</unclear>
</damage> yndisniota
Alternatively, the transcriber may not feel able to read the last three letters of daga but may wish to supply them by conjecture. Note the use of the resp attribute to assign the conjecture to Finnur Jónsson:
um aldr d<supplied reason="rubbing" resp="#FJ">aga</supplied> yndisniota
The supplied element may, if desired, be enclosed within a damage element:
um aldr d<damage agent="rubbing">
 <supplied resp="#FJ">aga</supplied>
</damage> yndisniota
Contrast the use of gap in the next line, where the transcriber believes that four letters cannot be read at all because of the damage:
þar komr inn dimmi dreki fliugandi naþr frann
neþan <gap
  reason="illegible"
  agent="rubbing"
  extent="4"
  unit="letters"/>
As with supplied, this gap might be enclosed by a damage element.
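A sketch of that enclosure, reusing the values from the example above. Moving the agent value onto the enclosing damage element is an assumption made for illustration, not a recommendation stated in the text:
þar komr inn dimmi dreki fliugandi naþr frann
neþan <damage agent="rubbing">
 <gap reason="illegible" extent="4" unit="letters"/>
</damage>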
The above examples record imperfect legibility due to damage. When imperfect legibility is due to some other reason (typically because the handwriting is ill-formed), the unclear element should be used without any enclosing damage element. In Robert Southey's autograph of The Life of Cowper the final six letters of attention are difficult to read because of the haste of the writing, though reasonably certain from the context:
and from time to time invited in like manner
his att<unclear>ention</unclear>
The cert attribute on the unclear element may be used to indicate the level of editorial confidence in the reading contained within it.
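For instance, the Southey reading might be marked as follows. The value high is an assumption, chosen to match the statement that the reading is reasonably certain from the context:
and from time to time invited in like manner
his att<unclear cert="high">ention</unclear>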

11.5.2 Use of the gap, del, damage, unclear, and supplied Elements in Combination

The gap, damage, unclear, supplied, and del elements may be closely allied in their use. For example, an area of damage in a primary source might be encoded with any one of the first four of these elements, depending on how far the damage has affected the readability of the text. Further, certain of the elements may nest within one another. The examples given in the last sections illustrate something of how these elements are to be distinguished in use. This may be formulated as follows:
  • where the text has been rendered completely illegible by deletion or damage and no text is supplied by the editor in place of what is lost: place an empty gap element at the point of deletion or damage. Use the reason attribute to state the cause (damage, deletion, etc.) of the loss of text.
  • where the text has been rendered completely illegible by deletion or damage and text is supplied by the editor in place of what is lost: surround the text supplied at the point of deletion or damage with the supplied element. Use the reason attribute to state the cause (damage, deletion, etc.) of the loss of text leading to the need to supply the text.
  • where the text has been rendered partly illegible by deletion or damage so that the text can be read but without perfect confidence: transcribe the text and surround it with the unclear element. Use the reason attribute to state the cause (damage, deletion, etc.) of the uncertainty in transcription and the cert attribute to indicate the confidence in the transcription.
  • where there is deletion or damage but the text can be read with perfect confidence: transcribe the text and surround it with the del element (for deletion) or the damage element (for damage). Use appropriate attribute values to indicate the cause and type of deletion or damage. Observe that the degree attribute on the damage element permits the encoding to show that a letter, word, or phrase is not perfectly preserved, though it may be read with confidence.
  • where there is an area of deletion or damage and parts of the text within that area can be read with perfect confidence, other parts with less confidence, other parts not at all: in transcription, surround the whole area with the del element (for deletion; or the delSpan element where it crosses a structural boundary); or the damage element (for damage). Text within the damaged area which can be read with perfect confidence needs no further tagging. Text within the damaged area which cannot be read with perfect confidence may be surrounded with the unclear element. Places within the damaged area where the text has been rendered completely illegible and no text is supplied by the editor may be marked with the gap element. For each element, one may use appropriate attribute values to indicate the cause and type of deletion or damage and the certainty of the reading.
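The last case can be sketched as follows, using an invented sentence rather than a real source; the attribute values are likewise assumptions chosen for illustration. Text read with confidence is left untagged inside damage, doubtful text is wrapped in unclear, and wholly illegible text is replaced by an empty gap:
<damage agent="water">The first words are clear,
 <unclear reason="damage" cert="low">the next doubtful</unclear>,
 <gap reason="illegible" extent="2" unit="word"/> and the rest clear again.</damage>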
The rules for combinations of the add and del elements, and for the interpretation of such combinations, are similar:
  • if one add element (with identifier ADD1) contains another (with identifier ADD2), then the addition ADD1 was first made to the text, and later a second addition (ADD2) was made within that added text:
    This is the text
    <add xml:id="ADD1">with some added
    <add xml:id="ADD2">(interlinear!)</add>
    material</add>
    as written.
  • if one del element contains another, and the seq attribute does not indicate otherwise, it should be assumed that the inner deletion was made before the enclosing one. In the following example, the word redundant was deleted before a second deletion removed the entire passage:
    <del>This sentence contains
    some <del>redundant</del> unnecessary
    verbiage.</del>
  • if a del element contains an add element, the normal interpretation will be that an addition was made within a passage which was later deleted in its entirety:
    <del>This sentence was deleted
    <add>originally</add> from the text.</del>
  • if an add element contains a del element, the normal interpretation will be that a deletion was made from a passage which had earlier been added:
    <add>This sentence was added
    <del>eventually</del> to the text.</add>

11.6 Aspects of Layout

Finally in this chapter we present elements which may be used to capture aspects of the layout of material on a page where this is considered important. Methods for recording page breaks, column breaks, and line breaks in the source are described in section 3.10 Reference Systems.

11.6.1 Space

The author or scribe may have left space for a word, or for an initial capital, and for some reason the word or capital was never supplied and the space left empty. The presence of significant space in the text being transcribed may be indicated by the space element.
  • space/ indicates the location of a significant space in the copy text.
    resp (responsible party) indicates the individual responsible for identifying and measuring the space.
Note that this element should not be used to mark normal inter-word space or the like.
In line 694 of Chaucer's Wife of Bath's Prologue in the Holkham manuscript the scribe has left a space for a word where other manuscripts read preestes:
By god if wommen had writen storyes
As <space quantity="7" unit="char"/> han within her oratoryes
The supplied element discussed in the previous section may be used to supply the text presumed missing:
By god if wommen had writen storyes
As <supplied reason="space" resp="#ES" source="Hg">preestes</supplied>
han within her oratoryes
Here, the fact of the space within the manuscript is indicated by the value of the reason attribute. The source of the supplied text is shown by the value of the source attribute as the Hengwrt manuscript; the transcriber responsible for supplying the text is ES.

11.6.2 Lines

The most common form of marking of text in manuscripts is by lines written under, beside, or through the text. The lines themselves may be of various types: they may be solid, dashed or dotted, doubled or tripled, wavy or straight, or a combination of these and other renderings. The line may be used for emphasis, or to mark a foreign or technical term, or to signal a quotation or a title, etc.: the elements emph, foreign, term, mentioned, title may be used for these. Frequently, a scholar may judge that a line is used to delete text: the del element is available to indicate this. In all these cases, the rend attribute may be used on these or other elements to indicate that the text is marked by a line and the style of the line. Thus, Lawrence's deletion by strike-through of my in the autograph of Eloi, Eloi, lama sabachthani is noted:
For I hate this
<del rend="strikethrough" hand="#dhl">my</del> body,
which is so dear to me
There will be instances, however, where a scholar wishes only to register the occurrence of lines in the text, without making any judgement as to what the lines signify. In these the hi element may be used, with the rend attribute to mark the style of line. In the manuscript of a letter by Robert Browning to George Moulton-Barrett the underlining of the phrase had obtained all the letters to Mr Boyd may be marked up as follows:
I have once — by declaring I would prosecute
by law — hindered a man's proceedings who
<hi rend="underline">had obtained all the letters
to Mr Boyd</hi>
The above examples presume the common case where a single word or phrase is marked by a line, with no doubt as to where the marking begins or ends and with no overlapping of the area of text with other marked areas of text. Where there is doubt, the certainty element may be used to record the doubt. In the Browning example cited above the underlining actually begins half-way under who, and this uncertainty could be remarked as follows:
I have once — by declaring I would prosecute
by law — hindered a man's proceedings who
<hi xml:id="cstart1" rend="underline">had obtained all
the letters to Mr Boyd</hi>
<!-- ... -->
<certainty target="#cstart1" locus="startLoc" degree="0.70">
 <desc>may begin with previous word</desc>
</certainty>

Where the area of text marked overlaps other areas of text, for example crossing a structural division, one of the spanning mechanisms mentioned above must be used; for example where the line is thought to mark a deletion, the delSpan element may be used. Where it is desired simply to record the marking of a span of text in circumstances where it is not possible to surround the text with a hi element, the span element may be used with the rend attribute indicating the style of line-marking.

More work needs to be done on clarifying the treatment of other textual features marked by lines which might so overlap or nest. For example, in many Middle English manuscripts (e.g. the Jesus and Digby verse collections) marginal sidebars may indicate metrical structure: couplets may be linked in pairs, with the pairs themselves linked into stanzas. Or, marginal sidebars may indicate emphasis, or may point out a region of text on which there is some annotation: in many manuscripts of Chaucer's Wife of Bath's Prologue lines 655–8 are marked with nesting parentheses against which the scribe has written nota.

At the lowest level, all such features could be captured by use of the note element, containing a prose description of the manuscript at this point, enhanced by a link to a visual representation (or facsimile) of the feature in question. It is not yet clear how best to mark up such phenomena so as to obtain more usefully structured encodings. For example, in the Chaucer example just cited, one may wish to record that the nota is written in the Hengwrt manuscript in the right margin against a single large left parenthesis bracketing the four lines, with two right parentheses in the right margin bracketing two overlapping pairs of lines: the first and third, the second and fourth. The note element allows us to record that the scribe wrote nota, but is not well-adapted to show that the nota points both at all four lines and at two pairs of lines within the four lines.

11.7 Headers, Footers, and Similar Matter

As a rule, matter associated with the page break (signature, catchword, page number) should be drawn into the pb element as attributes: see section 3.10 Reference Systems. In text-critical situations where these elements need tagging in their own right (for instance, when the catch-word presents a variant reading, or spacing in the header or footer is significant for compositor identification), the element fw may be used:
  • fw (forme work) contains a running head (e.g. a header, footer), catchword, or similar material appearing on the current page.
The name fw is short for ‘forme work’. It may be used to encode any of the unchanging portions of a page forme, such as:
  • running heads (whether repeated on every page, or changing on every page)
  • running footers
  • page numbers
  • catch-words
  • other material repeated from page to page, which falls outside the stream of the text
It should not be used for marginal glosses, annotations, or textual variants, which should be tagged using gloss, note, or the text-critical tags described in chapter 12 Critical Apparatus, respectively.
For example:
<fw type="head" place="top-centre">Poëms.</fw>
<fw type="pageNum" place="top-right">29</fw>
<fw type="sig" place="bot-centre">E3</fw>
<fw type="catch" place="bot-right">TEMPLE</fw>

11.8 Other Primary Source Features not Covered in these Guidelines

We repeat the advice given at the beginning of this chapter, that these recommendations are not intended to meet every transcriptional circumstance ever likely to be faced by any scholar. They are intended rather as a base to enable encoding of the most common phenomena found in the course of scholarly transcription of primary source materials. These guidelines particularly do not address the encoding of physical description of textual witnesses: the materials of the carrier, the medium of the inscribing implement, the organisation of the carrier materials themselves (as quiring, collation, etc.), authorial instructions or scribal markup, etc., except insofar as these are involved in the broader question of manuscript description, as addressed by the msdescription module described in chapter 10 Manuscript Description.

11.9 Transcription Module

The module described in this chapter makes available the following components:
Module transcr: Transcription of primary sources
The selection and combination of modules to form a TEI schema is described in 1.2 Defining a TEI Schema.


Notes
34.
The coordinate space may be thought of as a grid superimposed on a rectangular space. Rectangular areas of the grid are defined as four numbers a b c d: the first two identify the grid point which is at the upper left corner of the rectangle; the second two give the grid point located at the lower right corner of the rectangle. The grid point a b is understood to be the point which is located a points from the origin along the x (horizontal) axis, and b points from the origin along the y (vertical) axis.
35.
The coordinate space used here is based on pixels, but the mapping between pixels and units in the coordinate space need not be one-to-one; it might be convenient to define a more delicate grid, to enable us to address much smaller parts of the image. This can be done simply by supplying appropriate values for the attributes which define the coordinate space; for example doubling them all would map each pixel to two grid points in the coordinate space.
36.
The image is taken from the collection at http://ancilla.unice.fr/Illustr.html, and was digitized from a copy in the Bibliothèque Municipale de Lyon, by whose kind permission it is included here.
37.
The manuscript contains several other substitutions, ignored here for the sake of clarity.


Copyright TEI Consortium 2007. Licensed under the GPL. Copying and redistribution is permitted and encouraged.
Version 1.0.