atfsplit.plx
Steve Tinney (stinney@sas.upenn.edu)
atfsplit.plx -- split up ATF files into their constituent PQ-files
atfsplit.plx [options] [file]
- -base dir
Use dir as the base directory into which the files are split. By default, the files are split into directories named D000, D001, etc.
- -cat
Write the output straight to STDOUT, as Unix cat does. Use with -list to
extract a sub-corpus from a larger file.
- -dir dir
Create the files in 'dir', which is appended to 'base' if given. To split
the files into the current directory with no subdirectories, use
'-dir .'. If the directory name ends in a digit, it is incremented every
thousand files (similar to the default behaviour with the directory names
D000, D001, D002, etc.).
- -dryrun
Just print the names of the files which would be generated; don't create any files.
- -except
Use with -list; output everything except the texts given in the list.
- -install
Install the individual PQ-files into the cdl/texts tree.
- -list filename
Read a list of P/Q IDs from filename and output only those texts.
- -shallow
When building pathnames, do not include mid-level directories of the form P/P000xxx,
Q/Q100xxx etc.
- -show-updates
Produce a list of updated texts.
- -update
Only produce the ATF file for a text if the current version is different
from what is in the archive being split.
- -verbose
Print the names of files as they are generated.
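The interplay of splitting with -list and -except can be pictured with a minimal Python sketch. This is an illustration only, not the script's actual Perl code. It assumes that each text in the input starts with an ATF "&-line" such as "&P000001 = ..." and that IDs are P or Q followed by six digits; the function names split_texts and select are hypothetical.

```python
import re

# Assumption: a text begins at a line like "&P000001 = Some designation".
AND_LINE = re.compile(r"^&([PQ]\d{6})\b")

def split_texts(lines):
    """Yield (text_id, lines_of_text) for each text in a multi-text stream.

    Lines that appear before the first &-line are dropped.
    """
    current_id, chunk = None, []
    for line in lines:
        m = AND_LINE.match(line)
        if m:
            # A new &-line closes the previous text, if there was one.
            if current_id is not None:
                yield current_id, chunk
            current_id, chunk = m.group(1), []
        if current_id is not None:
            chunk.append(line)
    if current_id is not None:
        yield current_id, chunk

def select(texts, wanted, exclude=False):
    """Keep texts whose ID is in `wanted` (like -list);
    with exclude=True keep all the others (like -list with -except)."""
    for text_id, chunk in texts:
        if (text_id in wanted) != exclude:
            yield text_id, chunk
```

Joining the selected chunks and writing them to standard output would then correspond roughly to combining -cat with -list.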
atfsplit reads a file which may contain more than one transliteration
and splits it up into one file per transliteration. The output is
grouped in directories containing at most 1000 files each, the
subdirectories being named D000, D001, etc. With the -install option
the files are split directly into the cdl/texts tree.
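The grouping into directories of at most 1000 files can be sketched as follows. This is a hedged illustration in Python, not the script's own code. The helper name bucket_dir is hypothetical, and it assumes that the whole run of trailing digits in the directory name is treated as a zero-padded counter, which matches the D000, D001 sequence but is not confirmed for other names.

```python
import re

def bucket_dir(index, base="D000", per_dir=1000):
    """Return the directory name for the index-th output file (0-based).

    Assumption: the trailing digits of `base` form a counter that is
    incremented once for every `per_dir` files; a name without trailing
    digits is returned unchanged.
    """
    m = re.search(r"(\d+)$", base)
    if not m:
        return base
    digits = m.group(1)
    number = int(digits) + index // per_dir
    # Preserve the zero padding of the original name.
    return base[:m.start()] + str(number).zfill(len(digits))
```

Under these assumptions, files 0 to 999 land in D000 and file 1000 is the first one placed in D001.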
Copyright (c) Steve Tinney 2004.
Released under the GNU General Public License
(http://www.gnu.org/copyleft/gpl.html).