A hybrid generative/discriminative framework to train a semantic parser from an un-annotated corpus

Abstract

We propose a hybrid generative/discriminative framework for semantic parsing that combines the hidden vector state (HVS) model with hidden Markov support vector machines (HM-SVMs). The HVS model is an extension of the basic discrete Markov model in which context is encoded as a stack-oriented state vector. HM-SVMs combine the advantages of hidden Markov models and support vector machines. Using a modified K-means clustering method, a small set of the most representative sentences is automatically selected from an un-annotated corpus. These sentences, together with their abstract annotations, are used to train an HVS model, which is then applied to the whole corpus to generate semantic parsing results. The most confident parsing results are selected to form a fully-annotated corpus, which is used to train the HM-SVMs. The proposed framework has been tested on the DARPA Communicator Data. Experimental results show that the hybrid framework improves on the baseline HVS parser. Compared with HM-SVMs trained on the fully-annotated corpus, the hybrid framework gave comparable performance using only a small set of lightly annotated sentences.
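As a rough illustration of the sentence-selection step mentioned in the abstract, the sketch below clusters sentences and keeps the one closest to each cluster centre. It uses plain K-means over bag-of-words vectors, not the modified K-means of the paper, whose details are not given in this record. The function names, the bag-of-words representation, the initialisation from the first k sentences, and the sample sentences are all assumptions made for illustration.

```python
from collections import Counter


def bag_of_words(sentence, vocab):
    # Represent a sentence as word counts over a fixed vocabulary (assumed feature choice).
    counts = Counter(sentence.lower().split())
    return [float(counts[w]) for w in vocab]


def sq_dist(a, b):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(vector, centroids):
    # Index of the centroid closest to the given vector.
    return min(range(len(centroids)), key=lambda c: sq_dist(vector, centroids[c]))


def select_representatives(sentences, k, iterations=20):
    # Hypothetical helper: plain K-means, then one sentence per centroid.
    k = min(k, len(sentences))
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    vectors = [bag_of_words(s, vocab) for s in sentences]
    # Deterministic initialisation from the first k sentences (an assumption).
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest_centroid(v, centroids)].append(v)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    chosen = []
    for centroid in centroids:
        # Pick the sentence whose vector lies closest to this centroid.
        best = min(range(len(sentences)), key=lambda i: sq_dist(vectors[i], centroid))
        if sentences[best] not in chosen:
            chosen.append(sentences[best])
    return chosen


# Made-up example sentences, loosely in the style of a travel-booking domain.
sentences = [
    "i want a flight from boston to denver",
    "book a hotel room in denver",
    "i need a flight from dallas to boston",
    "find a hotel in boston for two nights",
    "show me a flight to denver",
]
reps = select_representatives(sentences, k=2)
print(reps)
```

In the framework described above, only the selected sentences would then need abstract annotations before training the HVS model; the rest of the corpus stays un-annotated.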

Divisions: College of Engineering & Physical Sciences > Systems analytics research institute (SARI)
Additional Information: © 2008. Licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported license (http://creativecommons.org/licenses/by-nc-sa/3.0/). Some rights reserved.
ISBN: 978-1-905593-44-6
Last Modified: 04 Nov 2024 09:41
Date Deposited: 24 Jan 2013 14:33
Full Text Link: http://dl.acm.o ... .cfm?id=1599221
Related URLs: http://www.scop ... tnerID=8YFLogxK (Scopus URL)
PURE Output Type: Other chapter contribution
Published Date: 2008-01-01
Authors: Zhou, Deyu
He, Yulan (ORCID Profile 0000-0003-3948-5845)
