The XWF format is the output of a simplifying adapter applied to XTF files. It provides a stream of words and discontinuities suitable for feeding to parsers which don't care about all the gory details of the written text.
The data model is presently defined by cdl/tools/xwf.dtd:
The parser expects all parseable sequences to begin with a
<d type="text"/> discontinuity.
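As a minimal sketch of the start-of-sequence rule above, the following Python function checks whether the first child of an XWF document is a <d type="text"/> discontinuity. It assumes that XWF elements live in the namespace http://emegir.info/xwf and that the stream is wrapped in a single root element; the root name xwf used in the usage example is hypothetical, not taken from the DTD.

```python
import xml.etree.ElementTree as ET

# Assumed XWF namespace URI.
XWF_NS = "http://emegir.info/xwf"


def begins_with_text_discontinuity(xwf_source: str) -> bool:
    """Return True if the first child of the root is <d type="text"/>."""
    root = ET.fromstring(xwf_source)
    children = list(root)  # direct children, in document order
    if not children:
        return False
    first = children[0]
    # ElementTree spells namespaced tags as "{namespace}localname".
    return first.tag == f"{{{XWF_NS}}}d" and first.get("type") == "text"


if __name__ == "__main__":
    # Hypothetical sample; the <xwf> wrapper element is an assumption.
    sample = (
        '<xwf xmlns="http://emegir.info/xwf">'
        '<d type="text"/><w form="lugal"/>'
        "</xwf>"
    )
    print(begins_with_text_discontinuity(sample))
```

A check of this kind could run before handing a stream to a parser, so that a sequence missing its leading discontinuity is rejected early rather than misparsed.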
http://emegir.info/xwf
w element in the XTF file
w element in the XTF file (required).
w element in the XTF file (optional).
When type="text", xml:lang contains the language of the text as a whole; the parser uses it to determine when to treat words as foreign.
;] are not reported).
When type="field", form is set to the value of the XTF f element's type field. When type="break" or type="blank", form is the unit: sign, word, or line.
Breakage or anepigraphic regions represented in the XTF file as a number of columns are emitted with form="line" and with size set to a conventional 50 lines.
form attribute.
The output character set of the XWF stream is the same as that of the XTF input stream; this should normally be Unicode/UTF-8.
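The discontinuity rules described above can be illustrated with a short Python sketch that walks an XWF stream and summarises each d element according to its type. The sample document, its xwf root element, and the attribute values ("sux", "sg", "lugal") are hypothetical and serve only as illustration; the namespace URI and the type, form, size, and xml:lang attributes are those mentioned in the text.

```python
import xml.etree.ElementTree as ET

XWF_NS = "http://emegir.info/xwf"  # assumed XWF namespace URI
# The xml: prefix is always bound to this namespace by the XML specification.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample: the <xwf> wrapper and all values are illustrative.
SAMPLE = f"""<xwf xmlns="{XWF_NS}">
  <d type="text" xml:lang="sux"/>
  <w form="lugal"/>
  <d type="field" form="sg"/>
  <d type="break" form="line" size="50"/>
</xwf>"""


def describe(d):
    """Summarise one discontinuity according to its type attribute."""
    kind = d.get("type")
    form = d.get("form")
    if kind == "text":
        return f"text start, lang={d.get(XML_LANG)}"
    if kind == "field":
        return f"field {form}"
    if kind in ("break", "blank"):
        # For breaks and blanks, form names the unit and size counts units.
        return f"{kind}: {d.get('size')} {form}(s)"
    return f"other discontinuity: {kind}"


def summarise(xwf_source):
    """Return one summary string per <d> element, in document order."""
    root = ET.fromstring(xwf_source)
    return [describe(d) for d in root.iter(f"{{{XWF_NS}}}d")]


if __name__ == "__main__":
    for line in summarise(SAMPLE):
        print(line)
```

The last sample element mirrors the column case: a region recorded in XTF as a number of columns arrives in XWF as a break measured in lines, with the conventional size of 50.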
The XTF to XWF transformation is implemented by cdl/tools/xtf2xwf.xsl:
XWF questions can be directed to Steve
Tinney.