Word graphs can represent a large number of different utterance hypotheses very compactly. However, they usually contain a great deal of redundancy in the form of word hypotheses that cover almost identical intervals in time. We address this problem by introducing hypergraphs for speech processing. Hypergraphs can be regarded as an extension of word graphs and charts in which an edge may have several start and end vertices. Converting an ordinary word graph to a hypergraph can reduce the number of edges considerably. We define hypergraphs formally, present an algorithm for converting word graphs into hypergraphs, and state consistency properties for edges and their combination. Finally, we present some empirical results concerning graph size and parsing efficiency.
Jan W. Amtrup, Volker Weber