
TEI vs. Play.dtd

 

TEI (the Text Encoding Initiative) is the leading standard for marking up literature, plays and other non-technical text. (DocBook is the standard for technical works.) However, even the simplified versions of TEI may be more complex than most books require, especially for volunteer efforts such as Project Gutenberg.

The following comparison is from Martin Mueller's A very gentle introduction to the TEI. I've reformatted it into a side-by-side layout and improved spacing, line breaks, and colors to simplify comparison. The markup on the left is in Martin Mueller's "baby" version of TEI. The markup on the right is from Jon Bosak, one of the inventors of XML. I've included specific comments on each below, plus overall recommendations. (See the original source for Martin's arguments in favor of TEI. I have only excerpted the sample XML.)

 
 

Martin Mueller's teixbaby.dtd

Jon Bosak's play.dtd

Declaration

teixbaby.dtd:
<?xml version="1.0"?>
<!DOCTYPE TEI.2 SYSTEM "teixbaby.dtd">

play.dtd:
<?xml version="1.0"?>
<!DOCTYPE PLAY SYSTEM "play.dtd">

Root

teixbaby.dtd:
<TEI.2>

play.dtd:
<PLAY>

Header

teixbaby.dtd:
<teiHeader>
  <fileDesc>
    <titleStmt>
      <title>Hamlet, Prince of Denmark: an electronic edition</title>
      <author>Shakespeare,William</author>
    </titleStmt>
    <publicationStmt>
      <publisher>Houghton Mifflin</publisher>
      <pubPlace>Boston MA</pubPlace>
      <date>1997</date>
    </publicationStmt>
    <sourceDesc>
      <bibl>
        <title>The Riverside Shakespeare</title>
        <author>Shakespeare,William</author>
        <publisher>Boston: Houghton Mifflin,1974</publisher>
      </bibl>
    </sourceDesc>
  </fileDesc>
</teiHeader>

play.dtd:
<TITLE>The Tragedy of Hamlet, Prince of Denmark</TITLE>
<FM>
  <P>Text placed in the public domain by Moby Lexical Tools, 1992.</P>
  <P>SGML markup by Jon Bosak, 1992-1994.</P>
  <P>XML version by Jon Bosak, 1996-1998.</P>
  <P>This work may be freely copied and distributed worldwide.</P>
</FM>

text

teixbaby.dtd:
<text>

play.dtd:
(no corresponding tag)

front (cast list)

teixbaby.dtd:
<front>
  <div type="castlist">
    <list>
      <item id="Oph">OPHELIA, daughter to Polonius</item>
      <item id="King">CLAUDIUS, King of Denmark</item>
      <item id="Queen">GERTRUDE, Queen of Denmark</item>
    </list>
  </div>
</front>

play.dtd:
<PERSONAE>
  <TITLE>Dramatis Personae</TITLE>
  <PERSONA>CLAUDIUS, king of Denmark.</PERSONA>
  <PERSONA>GERTRUDE, queen of Denmark, and mother to Hamlet.</PERSONA>
  <PERSONA>OPHELIA, daughter to Polonius.</PERSONA>
</PERSONAE>

body, divs (act, scene)

teixbaby.dtd:
<body>
  <div type="act" n="4">
    <div n="4.5" type="scene">

play.dtd:
<SCNDESCR>SCENE Denmark.</SCNDESCR>
<PLAYSUBT>HAMLET</PLAYSUBT>
<ACT>
  <TITLE>ACT IV</TITLE>
  <SCENE>
    <TITLE>SCENE V. Elsinore. A room in the castle.</TITLE>

Queen

teixbaby.dtd:
<stage><hi rend="i">Enter</hi> KING.</stage>
<sp who="Queen"><speaker><hi rend="i">Queen.</hi></speaker>
  <l n="37" part="Y">Alas, look here, my lord.</l>
</sp>

play.dtd:
<STAGEDIR>Enter KING CLAUDIUS</STAGEDIR>
<SPEECH><SPEAKER>QUEEN GERTRUDE</SPEAKER>
  <LINE>Alas, look here, my lord.</LINE>
</SPEECH>

Ophelia

teixbaby.dtd:
<sp who="Oph"><speaker><hi rend="i">Oph.</hi></speaker>
  <stage><hi rend="i">Song.</hi></stage>
  <lg part="M" type="song">
    <l n="38">"Larded all with sweet flowers,</l>
    <l n="39">Which bewept to the ground did not go</l>
    <l n="40">With true-love showers."</l>
  </lg>
</sp>

play.dtd:
<SPEECH><SPEAKER>OPHELIA</SPEAKER>
  <LINE><STAGEDIR>Sings</STAGEDIR></LINE>
  <LINE>Larded with sweet flowers</LINE>
  <LINE>Which bewept to the grave did not go</LINE>
  <LINE>With true-love showers.</LINE>
</SPEECH>

King

teixbaby.dtd:
<sp who="King"><speaker> <hi rend="i">King.</hi></speaker>
  <l n="41" part="Y">How do you, pretty lady?</l>
</sp>

play.dtd:
<SPEECH><SPEAKER>KING CLAUDIUS</SPEAKER>
  <LINE>How do you, pretty lady?</LINE>
</SPEECH>

Ophelia

teixbaby.dtd:
<sp who="Oph"><speaker> <hi rend="i">Oph.</hi></speaker>
  <p>
    <lb n="42"/>Well, <rs key="God">God</rs> dild you! They say the owl was a
    <lb n="43"/>baker's daughter. Lord, we know what we are, but
    <lb n="44"/>know not what we may be. <rs key="God">God</rs> be at your table!
  </p>
</sp>

play.dtd:
<SPEECH><SPEAKER>OPHELIA</SPEAKER>
  <LINE>Well, God 'ild you! They say the owl was a baker's</LINE>
  <LINE>daughter. Lord, we know what we are, but know not</LINE>
  <LINE>what we may be. God be at your table!</LINE>
</SPEECH>

King

teixbaby.dtd:
<sp who="King"><speaker> <hi rend="i">King.</hi></speaker>
  <l n="45" part="Y">Conceit upon her father.</l>
</sp>

play.dtd:
<SPEECH><SPEAKER>KING CLAUDIUS</SPEAKER>
  <LINE>Conceit upon her father.</LINE>
</SPEECH>

Ophelia

teixbaby.dtd:
<sp who="Oph"><speaker><hi rend="i">Oph.</hi></speaker>
  <p>
    <lb n="46"/>Pray let's have no words of this, but when
    <lb n="47"/>they ask you what it means, say you this:
  </p>
  <stage><hi rend="i">Song.</hi></stage>
  <lg part="M" type="song">
    <l n="48">"To-morrow is <rs key="StValentine">Saint Valentine's</rs> day,</l>
    <l n="49">All in the morning betime,</l>
    <l n="50">And I a maid at your window,</l>
    <l n="51">To be your <rs key="StValentine">Valentine</rs>.</l>
    <l n="52">"Then up he rose and donn'd his clo'es,</l>
    <l n="53">And dupp'd the chamber-door,</l>
    <l n="54">Let in the maid, that out a maid</l>
    <l n="55">Never departed more."</l>
  </lg>
</sp>

play.dtd:
<SPEECH><SPEAKER>OPHELIA</SPEAKER>
  <LINE>Pray you, let's have no words of this; but when they</LINE>
  <LINE>ask you what it means, say you this:</LINE>
  <STAGEDIR>Sings</STAGEDIR>
  <LINE>To-morrow is Saint Valentine's day,</LINE>
  <LINE>All in the morning betime,</LINE>
  <LINE>And I a maid at your window,</LINE>
  <LINE>To be your Valentine.</LINE>
  <LINE>Then up he rose, and donn'd his clothes,</LINE>
  <LINE>And dupp'd the chamber-door;</LINE>
  <LINE>Let in the maid, that out a maid</LINE>
  <LINE>Never departed more.</LINE>
</SPEECH>

(closing tags)

teixbaby.dtd:
        </div>
      </div>
    </body>
  </text>
</TEI.2>

play.dtd:
    </SCENE>
  </ACT>
</PLAY>
 

Martin Mueller's teixbaby.dtd

Pros

  • The header is more detailed (but, as noted below, still missing some useful information).
  • div type=act and type=scene: 'n' is an important attribute
  • sp (speech): 'who' is an important attribute (but, as noted below, appears redundant here)
  • rs key=God and key=StValentine are useful (but, as noted below, the tag name is obscure).
  • lg groups all lines of a song (but, as noted below, the information could be attached to sp/speech in the first instance).
  • l part=Y and lg part=M may capture useful information (see TEI Lite: Prose, Verse and Drama for details).

Cons

  • The header is complicated.
  • The header is missing some useful information, e.g. canonical title as a distinct field, credit for the TEI markup, copyright.
  • Several fields in the header are "free format" and therefore allow inconsistencies. Additional tags may be useful, e.g. firstName and lastName within author; city and state within pubPlace. (To the extent that the header will usually be generated, the additional tags will not add any complexity for the typical creator.)
  • The "text" tag appears unnecessary. (However, DTB has a tag with a similar role: "book"; perhaps in both cases there's a good reason.)
  • For castlist: the "div" appears unnecessary; why not put type=castlist on the list?
  • For castlist: there's no explicit title. I doubt that "Dramatis Personae" is universal (though perhaps it's the default in TEI Baby?).
  • div type=act and type=scene do not indicate that the source uses roman numerals.
  • div type=scene: the n=4.5 is confusing. If the "4" is the act, that information is redundant with the previous div and thus a potential source of error. (If the information is useful or required for TEI, it can be generated. It should not appear in the source.)
  • div type=scene: no title is included
  • stage: hi rend=i for "Enter" and "Song" is presentational markup. That should be avoided unless it's an exception to a general rule. A structural alternative: span class=action.
  • sp (speech): 'who' appears redundant with 'speaker' here, provided the speaker's name is presented consistently throughout the play. That is probably true for most plays, though rarely true when sp/speech is used in prose.
  • speaker: hi rend=i is presentational markup that adds nothing.
  • line numbers are useful, though they could probably be generated automatically
  • lg type=song: the "lg" appears unnecessary in the first case; why not put the type on the earlier sp (speech) tag? (That doesn't work in the second case since the song is only part of the speech, but I still think it's a useful simplification.)
  • p lb: why are lines specified as line breaks within a paragraph here and one other place, but plain "l" (line) elsewhere? lb strikes me as a poor choice (especially for numbered lines) since it doesn't wrap the line, i.e. it's a tag with no balancing end tag.
  • rs key=God and key=StValentine are useful but the tag name is obscure.
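
Two of the points above (the redundant "4" in n=4.5, and line numbers that could be generated) can be illustrated with a short script. This is only a minimal sketch in Python, assuming a TEI-style sample in which the scene and line 'n' attributes have been left out of the source; the function name add_generated_numbers is hypothetical, and lines are simply counted from 1 within each scene, which will not match the lineation of a printed edition such as the Riverside.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: TEI-style nesting with no hand-entered 'n'
# on the scene divs or on the lines.
SAMPLE = """
<body>
  <div type="act" n="4">
    <div type="scene">
      <sp who="Queen"><l>Alas, look here, my lord.</l></sp>
      <sp who="King"><l>How do you, pretty lady?</l></sp>
    </div>
    <div type="scene">
      <sp who="King"><l>Conceit upon her father.</l></sp>
    </div>
  </div>
</body>
"""


def add_generated_numbers(body):
    """Fill in scene n ("act.scene") and per-scene line n from document order."""
    for act in body.findall("div[@type='act']"):
        act_n = act.get("n")
        scenes = act.findall("div[@type='scene']")
        for scene_index, scene in enumerate(scenes, start=1):
            # The act number is taken from the parent div, never retyped.
            scene.set("n", f"{act_n}.{scene_index}")
            for line_index, line in enumerate(scene.iter("l"), start=1):
                line.set("n", str(line_index))
    return body


body = add_generated_numbers(ET.fromstring(SAMPLE))
scene_numbers = [d.get("n") for d in body.iter("div") if d.get("type") == "scene"]
line_numbers = [line.get("n") for line in body.iter("l")]
print(scene_numbers)
print(line_numbers)
```

Because the act number is read from the enclosing div, the scene value cannot drift out of sync with it, which is the potential source of error noted above.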

Questions

  • For castlist: adding id to the list items looks strange. Is it implicitly associating the description with the character?
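
One plausible reading (an assumption here, not something confirmed by teixbaby.dtd itself) is that each 'who' on a speech is meant to point at the 'id' of a cast list item. If so, the association can be checked mechanically. The following Python sketch uses a hypothetical function name, resolve_speakers, and a sample built from the excerpt above plus one deliberately unmatched speaker ("Ham") to show what a broken reference looks like.

```python
import xml.etree.ElementTree as ET

# Sample based on the excerpt above; the "Ham" speech is a hypothetical
# addition with no matching cast list item.
SAMPLE = """
<text>
  <front>
    <div type="castlist">
      <list>
        <item id="Oph">OPHELIA, daughter to Polonius</item>
        <item id="King">CLAUDIUS, King of Denmark</item>
      </list>
    </div>
  </front>
  <body>
    <sp who="King"><l>How do you, pretty lady?</l></sp>
    <sp who="Oph"><l>Larded all with sweet flowers,</l></sp>
    <sp who="Ham"><l>(hypothetical speech)</l></sp>
  </body>
</text>
"""


def resolve_speakers(text):
    """Match each speech's 'who' against the cast list 'id' values."""
    cast = {item.get("id"): item.text for item in text.iter("item") if item.get("id")}
    resolved, unresolved = [], []
    for speech in text.iter("sp"):
        who = speech.get("who")
        if who in cast:
            resolved.append((who, cast[who]))
        else:
            unresolved.append(who)
    return resolved, unresolved


resolved, unresolved = resolve_speakers(ET.fromstring(SAMPLE))
print(resolved)
print(unresolved)
```

Under that reading, the ids are what make 'who' more than a duplicate of 'speaker': they tie each speech to a single cast list entry regardless of how the name is printed.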

Suggestions

  • Header: get rid of hierarchy that's not useful to Project Gutenberg (PG).
  • Header: if the following tags are kept at all, consider spelling out fileDesc, titleStmt, publicationStmt, sourceDesc, bibl. It's more important to have tags that are self-describing than to save a few bytes, especially since these fields are only used in the header.

Jon Bosak's play.dtd

Pros

  • The header structure is flatter (but, as noted below, is missing some useful information).
  • The document structure is flatter.
  • The tag names are simpler and more logical, e.g. "speech" is self-documenting whereas "sp" is not.

Cons

  • Uppercase tags have (I think) largely fallen out of favor.
  • The header is unstructured, e.g. credit for the markup and copyright.
  • The header is missing useful information, at least to the extent this edition can be traced to a specific published work.
  • 'FM' is less clear than TEI's 'front' (though I prefer frontMatter to both).
  • 'SCNDESCR' is awful. Why not add a description tag to SCENE? If that's not feasible, how about spelling out sceneDescription? It's self-describing and doesn't take up much more space, especially since this tag is not used often.
  • 'PLAYSUBT' is awful. If that information is useful (which is not clear here), why not just 'subtitle'?
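
The renaming suggested above is mechanical, so it could be applied to existing play.dtd files by script. This Python sketch is only an illustration: the RENAMES table contains just the three replacements proposed in this list (frontMatter, sceneDescription, subtitle), every other tag is simply lowercased, and the function name rename_tags is hypothetical. A matching DTD would still have to be written separately.

```python
import xml.etree.ElementTree as ET

# Only the replacements proposed above; all other tags are lowercased.
RENAMES = {
    "FM": "frontMatter",
    "SCNDESCR": "sceneDescription",
    "PLAYSUBT": "subtitle",
}

SAMPLE = """<PLAY>
  <TITLE>The Tragedy of Hamlet, Prince of Denmark</TITLE>
  <FM><P>XML version by Jon Bosak, 1996-1998.</P></FM>
  <SCNDESCR>SCENE Denmark.</SCNDESCR>
  <PLAYSUBT>HAMLET</PLAYSUBT>
</PLAY>"""


def rename_tags(root):
    """Rename tags in place; text and nesting are left untouched."""
    for element in root.iter():
        element.tag = RENAMES.get(element.tag, element.tag.lower())
    return root


renamed = rename_tags(ET.fromstring(SAMPLE))
tags = [element.tag for element in renamed.iter()]
print(tags)
```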

Questions

  • LINE STAGEDIR: is LINE required here? STAGEDIR stood alone elsewhere.

Suggestions

  • Add a few tags and attributes to incorporate the advantages of the TEI version without losing the overall simplicity of this version.
  • Or, start with XHTML and add tags and attributes from here and (as needed) TEI.
  • Dublin Core may be a better model than TEI for enhancing the header.
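
To make the Dublin Core suggestion concrete, here is a rough Python sketch of a flat header built from Dublin Core elements. The wrapper element name "header", the function name build_header, and the mapping of Jon Bosak's front matter onto particular Dublin Core fields are all assumptions made for illustration; only the element names (title, creator, contributor, source, rights) and the namespace URI come from Dublin Core itself.

```python
import xml.etree.ElementTree as ET

# Dublin Core element set namespace.
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC)


def build_header(fields):
    """Build a flat header: one Dublin Core element per (name, value) pair."""
    header = ET.Element("header")  # hypothetical wrapper element
    for name, value in fields:
        ET.SubElement(header, f"{{{DC}}}{name}").text = value
    return header


# Values taken from the play.dtd sample above; the field mapping is a guess.
header = build_header([
    ("title", "The Tragedy of Hamlet, Prince of Denmark"),
    ("creator", "Shakespeare, William"),
    ("contributor", "Jon Bosak (SGML and XML markup)"),
    ("source", "Moby Lexical Tools, 1992"),
    ("rights", "This work may be freely copied and distributed worldwide."),
])
xml_text = ET.tostring(header, encoding="unicode")
print(xml_text)
```

Each field is a single self-describing element, so the result stays as flat as the play.dtd front matter while making the markup credit and the rights statement distinct fields.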

 

Posted Oct. 28, 2004


Classicosm is a Product Architect site.
classicosm -at- product architect -dot- com (Feedback welcome!)
Copyright 2004 by Scott S. Lawton. All Rights Reserved. "Classicosm" and "A world of timeless value" are service marks owned by Scott S. Lawton.

