Archimedes Workflows
Documentation from Harvard (by M. Hyman/M. Schiefsky):
Documentation from MPIWG (text by St. Trzeciok; additional files by B. Fuchs):
Documentation by B. Fuchs
- Workflow Archimedes
Metadata
Upload the text in: Filemaker IT server, archimedes_authors
Producing images
see documentation of the library
Transcription and production of the text-xmls
Synchronisation of text-xmls and images
Production of thumbnails
Production of cut-outs (cut-outs are drawings or similar illustrations on the images, which are tagged inside the text-xmls)
use cutout-tool
Correction of the text-xmls
Gap correction tool
NB: works only in Safari (Mac browser)
Frequency-sorted morphological “miss” lists
useful for “misses” which occur more than once
Tool: editor
Workflow:
download the chosen text-xml from the repository
choose the same text in the Frequency-sorted morphological “miss” lists
find the relevant words
copy the raw form and open the relevant text-xml in an editor
search for the copied raw form and replace it with the corrected form
save the text-xml
parse the xml with XML Validator/SGML Parser
upload the text
Using BBEdit one can get a list of all occurring raw forms in the text (Smultron and jEdit do not have this option)
This makes it possible to compare the number of entities found in the Frequency-sorted morphological “miss” lists with the number found using the editor, which helps a lot to avoid introducing new mistakes.
There is also the option to add new forms to the list of the Formmaker tool
NB: Unfortunately there is no link from the Formmaker tool back to the Frequency-sorted morphological “miss” lists, which slows down the correction/supplementation of the morphology. From working experience, it is much faster to use the Formmaker tool for morphology supplementation.
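The count comparison described above can also be done on the command line. The following is a minimal sketch, assuming a shell with GNU or BSD grep (the -o option is not in every grep). The function name count_form and the file name in the example are hypothetical. It counts plain substring matches and knows nothing about xml markup, so a form that is part of a longer word is counted too.

```shell
#!/bin/sh
# count_form FORM FILE
# Prints how many times the literal string FORM occurs in FILE.
# -F treats FORM as a fixed string; -o prints every match on its own
# line, so wc -l counts occurrences rather than matching lines.
count_form() {
    grep -oF -- "$1" "$2" | wc -l | tr -d ' '
}

# Example (hypothetical form and file name):
#   count_form "rawform" texts/archimedes/xml/example.xml
```

If the number printed here differs from the number given in the “miss” list, it is worth checking the text-xml again before replacing anything.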
Correction of single mistakes + supplementation of the morphology by the survey of morphosyntactic rules for neologisms and spelling variations
From working experience, it is useful and time-saving to combine these tasks with each other
Tools:
Editor/Arboreal + Browser/Overviewtool + Formmaker Tool
NB: working with two monitors is recommended
Browser vs. Overviewtool
Using the Overviewtool instead of the ECHO or Archimedes environment in a browser makes it possible to display the image and the text next to each other. It is recommended for correcting texts with many morphological misses that occur only once, because no time is spent loading the images separately.
Unfortunately, with long xml-texts it takes a long time to load both the images and the text. In that case the Overviewtool slows the correction down rather than speeding it up.
NB: ECHO does not display <gap/>
Editor vs. Arboreal
The main difference between correcting in Arboreal and correcting in an editor is that only the editor makes it possible to change the xml-structure.
That means people who are not supposed to change the xml-structure, or who want to prevent themselves from doing so, should rather use Arboreal
Formmaker vs. Arboreal
The advantage of using Arboreal for the morphological supplementation is that the generated form can be sent directly to the relevant server, which performs the morphological analysis
Forms created with Formmaker, by contrast, have to be added separately.
On the other hand, it is sometimes better to work without Arboreal; e.g. if a lot of xml-structure has to be added because abbreviations have not been decoded, Formmaker is useful as a separate tool.
Workflow Editor:
download the relevant text from the text-repository
open the file in an editor
open the relevant text in the Overviewtool or in the ECHO/Archimedes environment using a browser
find the words displayed in black (in ECHO/Archimedes, morphologically analysed forms appear brownish, unanalysed forms black)
decide whether the form is a neologism/spelling variation or the result of a transcription error in the xml
depending on that decision, either add the form to Formmaker or correct the mistake in the editor
save the file after a working session
parse the file with XML Validator/SGML Parser
upload the text
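Before uploading, a quick local well-formedness check can catch broken tags introduced while editing. This is a minimal sketch, assuming that xmllint (part of libxml2) is installed on the machine. The function name check_wellformed and the file name are hypothetical. It only tests well-formedness and is a pre-check, not a replacement for the XML Validator/SGML Parser step above.

```shell
#!/bin/sh
# check_wellformed FILE
# Returns 0 if FILE is well-formed XML, non-zero otherwise.
# --noout suppresses the parsed document; error messages go to stderr.
check_wellformed() {
    xmllint --noout "$1"
}

# Example (hypothetical file name):
#   check_wellformed example.xml && echo "well-formed"
```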
Workflow Arboreal:
download the relevant text from the text-repository
open the file in Arboreal
generate IDs.
NB: files which already have s-ids do not need this step
get a morphological analysis from Donatus and highlight the unanalysed forms
One can also upload a morphology file from the last session, if one was saved. This may speed up the work when handling large files or when Donatus is not available.
correct the highlighted term or add unknown vocabulary
send new vocabulary to Donatus
save the file (as well as the morphological analysis from Donatus if needed)
parse the file with XML Validator/SGML Parser
upload the text
Producing parallel texts
NB: Parallel texts cannot be displayed on Archimedes, only in the ECHO environment
Lemmatized Corpus Search
Dictionary Lookup Tool, Dictionary Headword Access
XML Validator, SGML Parser
Corpus Language Statistics
used to display corpus word counts by language
xpath access ???
Working Group Home Page
GForge has been used for documentation and to coordinate tasks
Local Text administration (by St. Trzeciok)
By command line
Prerequisites
OS X or any other UNIX-based system
User account and password for the Archimedes repository (provided by the IT department)
Establishing a local text repository on the desktop of your computer
Create a folder on your desktop named e.g. sources
Open the program Terminal (a new shell window should appear on your screen)
Change the directory to the new local text repository folder. Type: cd Desktop/sources and press enter
Adjust the network protocol. Type: export CVS_RSH=ssh and press enter
Download the texts from the permanent text repository. Type: cvs -d :ext:username@archimedes.mpiwg-berlin.mpg.de:/archimedes/cvsroot co texts/archimedes/xml and press Enter
Type in your password and press enter
The texts will be in the subdirectory sources/texts/archimedes/xml
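The steps above can be collected into a small script. This is a minimal sketch, not a tested project tool. The variable names ARCH_USER, ARCH_ROOT, SOURCES and DRY_RUN and the function name checkout_texts are hypothetical, and "username" is a placeholder for your own account. Setting DRY_RUN=1 prints the cvs command instead of contacting the server.

```shell
#!/bin/sh
# Placeholder: replace "username" with your repository account name.
ARCH_USER="${ARCH_USER:-username}"
ARCH_ROOT=":ext:${ARCH_USER}@archimedes.mpiwg-berlin.mpg.de:/archimedes/cvsroot"
SOURCES="${SOURCES:-$HOME/Desktop/sources}"

# checkout_texts: create the local folder and check out the texts into it.
# With DRY_RUN=1 the cvs command is printed instead of executed.
checkout_texts() {
    mkdir -p "$SOURCES" && cd "$SOURCES" || return 1
    export CVS_RSH=ssh    # let cvs connect through ssh
    ${DRY_RUN:+echo} cvs -d "$ARCH_ROOT" co texts/archimedes/xml
}

# Usage:
#   DRY_RUN=1 checkout_texts   # only show the command
#   checkout_texts             # real checkout, asks for the password
```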
Adding a new text to the permanent text repository
Open the program Terminal (a new shell window should appear on your screen)
Change the directory to the local text repository folder. Type: cd Desktop/sources and press enter
Adjust the network protocol. Type: export CVS_RSH=ssh and press enter
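Schedule the file for addition. Type: cvs -d :ext:username@archimedes.mpiwg-berlin.mpg.de:/archimedes/cvsroot add filename.xml and press enter
Type in your password and press enter
Commit the new file (cvs add only schedules it; the upload happens on commit). Type: cvs -d :ext:username@archimedes.mpiwg-berlin.mpg.de:/archimedes/cvsroot commit filename.xml and press enter
Type in your password again and press enter
Press i (=insert) to add a comment
Press esc after the completion of your comment and type: :wq (=write and quit)
The new file will be uploaded into the permanent text repository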
Adding changed texts to the permanent text repository
Open the program Terminal (a new shell window should appear on your screen)
Change the directory to the local text repository folder. Type: cd Desktop/sources and press enter
Adjust the network protocol. Type: export CVS_RSH=ssh and press enter
Type: cvs -d :ext:username@archimedes.mpiwg-berlin.mpg.de:/archimedes/cvsroot commit texts/archimedes/xml
Type in your password and press enter
Press i (=insert) and add a comment describing what you have changed in the files (very important!)
Press esc after the completion of your comment and type: :wq (=write and quit)
The changed files will be uploaded into the permanent text repository
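The editor step can be avoided by passing the comment on the command line with the -m option of cvs commit. This is a minimal sketch under the same assumptions as the commands above. The names ARCH_USER, ARCH_ROOT, DRY_RUN and commit_texts are hypothetical, and "username" is a placeholder. Because the comment matters, the function refuses to run with an empty one.

```shell
#!/bin/sh
ARCH_USER="${ARCH_USER:-username}"   # placeholder account name
ARCH_ROOT=":ext:${ARCH_USER}@archimedes.mpiwg-berlin.mpg.de:/archimedes/cvsroot"

# commit_texts "COMMENT"
# Commits texts/archimedes/xml with COMMENT as the log message (-m),
# so cvs does not open the editor. Run it from the sources folder.
# With DRY_RUN=1 the cvs command is printed instead of executed.
commit_texts() {
    if [ -z "${1:-}" ]; then
        echo "commit_texts: please describe what you changed" >&2
        return 1
    fi
    export CVS_RSH=ssh
    ${DRY_RUN:+echo} cvs -d "$ARCH_ROOT" commit -m "$1" texts/archimedes/xml
}

# Usage:
#   commit_texts "corrected gaps on several pages"
```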
Refreshing your local text repository
Open the program Terminal (a new shell window should appear on your screen)
Change the directory to the local text repository folder. Type: cd Desktop/sources and press enter
Adjust the network protocol. Type: export CVS_RSH=ssh and press enter
Type: cvs -d :ext:username@archimedes.mpiwg-berlin.mpg.de:/archimedes/cvsroot up and press enter
Type in your password and press enter
Your local repository will be updated from the permanent text repository
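Before a real update it can be useful to preview what would change. This is a minimal sketch, assuming the same placeholder account as above. The function names preview_update and list_conflicts are hypothetical. It relies on the global cvs option -n, which makes cvs report what it would do without changing any file on disk, and on the status letters that cvs prints before each file name (for example U for updated, M for locally modified, C for conflict).

```shell
#!/bin/sh
ARCH_USER="${ARCH_USER:-username}"   # placeholder account name
ARCH_ROOT=":ext:${ARCH_USER}@archimedes.mpiwg-berlin.mpg.de:/archimedes/cvsroot"

# preview_update: show what "up" would do without touching local files.
preview_update() {
    export CVS_RSH=ssh
    cvs -n -d "$ARCH_ROOT" up
}

# list_conflicts: read cvs update output on stdin and print the files
# flagged "C" (local changes conflict with changes in the repository).
list_conflicts() {
    sed -n 's/^C //p'
}

# Usage:
#   preview_update 2>/dev/null | list_conflicts
```

Files listed by list_conflicts are the ones to look at by hand before running the actual update.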
archimedes_workflows.txt · Last modified: 2020/10/10 14:13 by 127.0.0.1