Using GATE as an Annotation Tool
Tom Kenter, Diana Maynard
28th January 2005


Contents

1 Introduction
  1.1 What is GATE for?
  1.2 Overview of GATE
2 Getting started
  2.1 Download and install the software
  2.2 Documentation
  2.3 A GATE session
    2.3.1 Start the GATE application
    2.3.2 Importing/loading/saving resources
    2.3.3 Annotation schemas
    2.3.4 Start annotating
    2.3.5 Manual annotation
    2.3.6 Automatic annotation
    2.3.7 Viewing annotations
    2.3.8 Saving data in datastores
    2.3.9 Save data as XML
    2.3.10 Restore application from file
3 Working with Ontologies
  3.1 Ontology Editor
  3.2 OntoGazetteer
    3.2.1 .lst file
    3.2.2 mappings.def
    3.2.3 lists.def
  3.3 Jape Transducer
  3.4 Creating a pipeline
  3.5 Portability
    3.5.1 Language resources: gate:/ path names
  3.6 Processing resources: Saving application state
  3.7 How to generate annotations automatically

1 Introduction

This manual is designed as an introduction to GATE 3 for people who have no experience at all with the tool. The first part covers the basics of using GATE as an annotation tool; the second part covers some more advanced aspects of using the ontology functionality. For more detailed information, we refer the reader to the GATE User Guide (see Section 2.2).

1.1 What is GATE for?

But first, what is GATE for? GATE can be used for many things, but one of its most typical uses is annotating pages. This means that you have a collection of pages (a corpus) and a number of concepts (an Annotation Schema) that are expected to occur in these pages. GATE provides an easy-to-use interface for indicating which pieces of text denote which of your concepts. In GATE you can do the annotating by hand, or you can let GATE do it automatically by using Gazetteers, etc. For example, GATE automatically annotates all HTML tags it finds in your text (you will find them in the Annotation Set called ’Original markups annotations’).
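To make the idea of annotation concrete, here is a minimal sketch in Python of what list-based automatic annotation amounts to: each annotation is a concept attached to a span of text given by its start and end offsets. This is not the GATE API and not how GATE's Gazetteer is implemented; the `Annotation` class, the `annotate_with_lists` function, the sample text and the word lists are all hypothetical names chosen for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A typed span of text, identified by character offsets (hypothetical structure)."""
    start: int
    end: int
    concept: str


def annotate_with_lists(text, word_lists):
    """Return an Annotation for every whole-word occurrence of a listed phrase.

    `word_lists` maps a concept name to the phrases that denote it,
    loosely imitating a list-based lookup. This is an illustration only.
    """
    annotations = []
    for concept, phrases in word_lists.items():
        for phrase in phrases:
            # \b restricts matches to whole words; re.escape treats the phrase literally.
            pattern = r"\b" + re.escape(phrase) + r"\b"
            for match in re.finditer(pattern, text):
                annotations.append(Annotation(match.start(), match.end(), concept))
    # Sort by position so the annotations read in document order.
    return sorted(annotations, key=lambda a: (a.start, a.end))


# Made-up sample data: one "page" and two concepts with their word lists.
text = "Diana works in Sheffield. Tom visited Sheffield in January."
word_lists = {
    "Person": ["Diana", "Tom"],
    "Location": ["Sheffield"],
}

annotations = annotate_with_lists(text, word_lists)
for ann in annotations:
    print(ann.concept, ann.start, ann.end, repr(text[ann.start:ann.end]))
```

Storing offsets rather than copies of the text keeps the original page untouched and lets several annotations cover the same or overlapping spans, which is the property manual and automatic annotation both rely on.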

1.2 Overview of GATE

GATE is an architecture that provides functionality for plugging in all kinds of NLP software, such as POS taggers, sentence splitters, Named Entity recognizers, etc. It works with resources, of which there are two main kinds: Language Resources and Processing Resources.
• Language Resource (LR): refers to data-only resources such as lexicons, corpora, thesauri or ontologies. Some LRs come with software (e.g. WordNet has both a user query interface...