Differences

This shows you the differences between two versions of the page.

linguisticsweb:tutorials:linguistics_tutorials:automaticannotation:stanford_pos_tagger [2019/01/26 22:51]
sabinebartsch [author: Sabine Bartsch, TU Darmstadt]
linguisticsweb:tutorials:linguistics_tutorials:automaticannotation:stanford_pos_tagger [2019/03/05 21:25] (current)
sabinebartsch [3.1 Input parameters]
Line 1: Line 1:
 ====== The Stanford POS Tagger ======
  
-==== author: Sabine Bartsch, TU Darmstadt ====
+==== author: Sabine Bartsch, Technische Universität Darmstadt ====
  
-Tutorial builds on input from the Stanford NLP website.
+Tutorial builds on software and input from the [[https://nlp.stanford.edu/software/tagger.html|Stanford PoS Tagger website]].
  
 Related tutorial: [[linguisticsweb:tutorials:linguistics_tutorials:automaticannotation:stanford_pos_tagger_python|Stanford PoS Tagger: tagging from Python]]
Line 45: Line 45:
 | **-model**  | different taggers are available, but at least one has to be specified: e.g. edu.stanford.nlp.tagger.maxent.MaxentTagger |
 | **-textFile**  | for plain text input files  |
-| -xmlInput  | Example value: <body>; The value specified here determines the element of an xml file the contents of which is being tagged.  |
+| **-xmlInput**  | Example value: <body>; The value specified here determines the element of an xml file the contents of which is being tagged.  |
-| **-outputFormat**  | xml, tsv, slashTags, -tagSeparator \# |
+| **-outputFormat**  | xml, tsv, slashTags, -tagSeparator \#|
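
As a rough illustration of how the input parameters in this table combine on the command line, the following Python sketch assembles an argument list for a single tagger run. It is a sketch under stated assumptions, not part of the tutorial: the helper name ''build_tagger_command'', the jar name, the model path and the input file name are all hypothetical placeholders, and the sketch assumes that ''edu.stanford.nlp.tagger.maxent.MaxentTagger'' is the Java class being launched while ''-model'' receives the path to a trained model file.

```python
def build_tagger_command(jar, model, text_file,
                         output_format="slashTags", xml_input=None):
    """Return the argument list for one tagger run (illustrative helper)."""
    cmd = [
        "java", "-mx300m",                # modest heap size; adjust as needed
        "-classpath", jar,                # path to the tagger jar (assumed name)
        "edu.stanford.nlp.tagger.maxent.MaxentTagger",
        "-model", model,                  # trained model file (assumed path)
        "-textFile", text_file,           # plain text input file
        "-outputFormat", output_format,   # xml, tsv or slashTags
    ]
    if xml_input is not None:
        # Name of the XML element whose contents should be tagged.
        cmd += ["-xmlInput", xml_input]
    return cmd


if __name__ == "__main__":
    command = build_tagger_command(
        "stanford-postagger.jar",                      # hypothetical jar name
        "models/english-left3words-distsim.tagger",    # hypothetical model path
        "sample-input.txt",                            # hypothetical input file
        output_format="tsv",
    )
    # Print the command; pass the list to subprocess.run(command) to execute it.
    print(" ".join(command))
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.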
  
  
Line 123: Line 123:
 Please note that for different languages the tagger uses different tag-sets as there is no universal tag-set that fits all linguistic phenomena in all languages. Make sure you find out what tag-set is being used in a model for a specific language and what the tags mean.
  
-  * English: the Penn Treebank site. There is a simple listings on the [[http://www.comp.leeds.ac.uk/amalgam/tagsets/upenn.html|AMALGAM project page]]
+  * English: [[https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html|Penn Tree Bank tag set]]
-  * Chinese: [[http://www.cis.upenn.edu/~chinese/|the Penn Chinese Treebank]]
+  * Chinese: [[https://verbs.colorado.edu/chinese/posguide.3rd.ch.pdf|Penn Chinese Treebank]]
-  * German: [[http://www.ims.uni-stuttgart.de/forschung/ressourcen/lexika/TagSets/stts-table.html|Stuttgart-Tübingen Tag Set (STTS)]]
+  * German: [[http://www.sfs.uni-tuebingen.de/resources/stts-1999.pdf|Stuttgart-Tübingen Tag Set (STTS)]]
   * French: [[http://www.llf.cnrs.fr/Gens/Abeille/French-Treebank-fr.php|the French Treebank]]
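
To make the advice about checking tag meanings concrete, here is a minimal Python sketch that splits one line of tagged output into (token, tag) pairs and looks up a gloss for each tag. Several things are assumptions rather than part of the tutorial: the function name ''split_tagged'', the ''_'' separator (whatever was set with ''-tagSeparator'' should be passed instead), the sample sentence, and the four-entry gloss table, which covers only a handful of Penn Treebank tags and would have to be replaced with the tag-set of the model actually in use.

```python
# A few Penn Treebank tags with short glosses; deliberately incomplete.
PENN_GLOSSES = {
    "DT": "determiner",
    "JJ": "adjective",
    "NN": "noun, singular or mass",
    "VBZ": "verb, 3rd person singular present",
}


def split_tagged(line, separator="_"):
    """Split 'token<sep>TAG' items into (token, tag) pairs.

    The split happens at the last separator, so tokens that themselves
    contain the separator stay intact. Items without a separator get
    the tag None.
    """
    pairs = []
    for item in line.split():
        token, found, tag = item.rpartition(separator)
        if found:
            pairs.append((token, tag))
        else:
            pairs.append((item, None))
    return pairs


if __name__ == "__main__":
    sample = "The_DT tagger_NN works_VBZ"   # hypothetical tagger output
    for token, tag in split_tagged(sample):
        print(token, tag, PENN_GLOSSES.get(tag, "unknown tag"))
```

Tags missing from the table are reported as unknown rather than guessed, which is the safer behaviour when switching between models for different languages.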