Evaluation of a Multi-user System of Voice Interaction Using Grammars
Elizabete Munzlinger, Fabricio da Silva Soares, and Carlos Henrique Quartucci Forster
Instituto Tecnológico de Aeronáutica, Divisão de Ciência da Computação, Praça Marechal Eduardo Gomes, 50 – 12.228-900 São José dos Campos, Brasil {bety, p2p, forster}@ita.br

Abstract. This paper presents an experimental study of the design of grammars for a voice interface system. The influence of grammar design on the behavior of the voice recognition system, in terms of accuracy and computational cost, is assessed through tests. By redesigning a grammar, we show that both characteristics can be significantly improved.

Keywords: Grammar, multi-user interface, automatic speech recognition.

1 Introduction
Many speech recognition systems require every new user to train the system to recognize his or her voice by reading texts at length. This training is necessary because these systems often use large vocabularies [1]. It is desirable to have a system that needs no training and can recognize the same words when spoken by different voices, with different accents [5]. Applications that use recognized commands do not need such a large vocabulary; it can be restricted to the needs of the particular application. Grammars associated with the application limit the words that are possible in each context. A well-designed grammar can make the application a multi-user system.

This paper presents an experimental study of the design of grammars for a voice interface system for home automation (Domotic). It describes the design of grammars based on accuracy and performance tests made with an ASR (Automatic Speech Recognition) component used to recognize Brazilian Portuguese. Knowing how to design better grammars is a first step toward the automatic generation of grammars for multi-user interactive applications.

The grammar was used in a prototype Domotic system that controls up to 32 devices through voice recognition. The system uses the parallel port of the computer and is connected to an electronic circuit that activates the devices. For the ASR system, IBM Via Voice was chosen because it supports Brazilian Portuguese. The Domotic application was developed in Java and uses the IBM Java Speech Technology API, which gives access to and works together with IBM Via Voice through the JSAPI API [4].
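The idea of restricting the vocabulary to what one context needs can be illustrated with a small sketch. It is not the grammar used in the prototype and it does not call JSAPI. The class name `CommandGrammar`, the two-word sentence form (action followed by device), and the Portuguese word lists are all assumptions made for illustration.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: a command grammar of the form <action> <device>.
// The class, the sentence form and the word lists are illustrative
// assumptions; they are not taken from the paper.
final class CommandGrammar {
    private final Set<String> actions;
    private final Set<String> devices;

    CommandGrammar(List<String> actions, List<String> devices) {
        this.actions = new LinkedHashSet<>(actions);
        this.devices = new LinkedHashSet<>(devices);
    }

    // Number of distinct words that must be told apart in this context.
    int vocabularySize() {
        Set<String> all = new LinkedHashSet<>(actions);
        all.addAll(devices);
        return all.size();
    }

    // Accepts only sentences of exactly two words: an action, then a device.
    Optional<String[]> parse(String utterance) {
        String[] words = utterance.trim().toLowerCase(Locale.ROOT).split("\\s+");
        if (words.length == 2
                && actions.contains(words[0])
                && devices.contains(words[1])) {
            return Optional.of(words);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        CommandGrammar g = new CommandGrammar(
                List.of("ligar", "desligar"),
                List.of("luz", "ventilador", "alarme"));
        System.out.println(g.vocabularySize());                 // 5
        System.out.println(g.parse("ligar luz").isPresent());   // true
        System.out.println(g.parse("abrir porta").isPresent()); // false
    }
}
```

With only five words in the active context, any sentence outside the grammar is rejected outright, which is the property the restricted-vocabulary approach relies on.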
C. Baranauskas et al. (Eds.): INTERACT 2007, LNCS 4663, Part II, pp. 452 – 455, 2007. © IFIP International Federation for Information Processing 2007

2 Grammar Design
A grammar is built from a set of sentences separated by production rules and structured as a tree composed of nodes. The nodes of the grammar are held in a static structure that describes a hierarchy: the main node and the set of nodes that depend on it. Every node of the grammar has a name that specifies its category [3]. In a two-dimensional layout (Figure 1), one can see the possible connections between the levels of the tree, following its hierarchy down to the terminal symbols.

At first, we designed a grammar for general use (by systems with several contexts) based on the morphological analysis of sentences in the Portuguese language and made up of many rules that define, for example, verbs, subjects, forms of address, pronouns and articles. Thus the rule that defines an article comprises two sub-rules, one for definite articles and one for indefinite articles. In total, the grammar has 64 sub-rules and 167 terminal symbols. This complex grammar was found to degrade the performance of the recognition system to the point that the application could not be executed. It took at least 980 MB of memory and 100% CPU occupancy for 1 minute for allocation and processing of the structure of the...
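The tree structure described in this section, and the way sub-rules and terminal symbols add up, can be sketched as follows. This is a minimal illustration under stated assumptions: the class `GrammarNode` and its methods are hypothetical, and only the article rule with its two sub-rules is modeled, using the standard Portuguese definite and indefinite articles. It does not reproduce the 64 sub-rules or 167 terminal symbols of the grammar discussed above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a grammar node: a named rule that may hold
// sub-rules and terminal symbols. Not the data structure of the ASR engine.
final class GrammarNode {
    final String name;
    final List<GrammarNode> subRules = new ArrayList<>();
    final List<String> terminals = new ArrayList<>();

    GrammarNode(String name, String... terminals) {
        this.name = name;
        this.terminals.addAll(List.of(terminals));
    }

    // Attaches a sub-rule and returns this node so calls can be chained.
    GrammarNode add(GrammarNode child) {
        subRules.add(child);
        return this;
    }

    // Counts all rules below this node (the node itself is not counted).
    int countSubRules() {
        int n = 0;
        for (GrammarNode child : subRules) {
            n += 1 + child.countSubRules();
        }
        return n;
    }

    // Counts terminal symbols in this node and in every node below it.
    // A word listed under two rules is counted twice.
    int countTerminals() {
        int n = terminals.size();
        for (GrammarNode child : subRules) {
            n += child.countTerminals();
        }
        return n;
    }

    public static void main(String[] args) {
        GrammarNode article = new GrammarNode("article")
                .add(new GrammarNode("definiteArticle", "o", "a", "os", "as"))
                .add(new GrammarNode("indefiniteArticle", "um", "uma", "uns", "umas"));
        System.out.println(article.countSubRules());  // 2
        System.out.println(article.countTerminals()); // 8
    }
}
```

Even this single rule contributes two sub-rules and eight terminals, which gives a sense of how a grammar built on general morphological categories grows quickly.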