Exploring differential topic models for comparative summarization of scientific papers

Abstract

This paper investigates differential topic models (dTM) for summarizing the differences among document groups. Starting from a simple probabilistic generative model, we propose dTM-SAGE, which explicitly models the deviations of group-specific word distributions from a background word distribution to indicate how words are used differently across document groups. This makes it more effective at capturing the unique characteristics needed to compare document groups. To generate dTM-based comparative summaries, we propose two sentence scoring methods that measure the discriminative capacity of a sentence. Experimental results on a dataset of scientific papers show that our dTM-based comparative summarization methods significantly outperform the generic baselines and state-of-the-art comparative summarization methods under ROUGE metrics.
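The idea of modelling group-specific deviations from a background word distribution can be illustrated with a small sketch. This is not the dTM-SAGE model itself: it assumes a plain smoothed log-ratio in place of the paper's generative model and inference, and the function names (`log_freqs`, `group_deviations`, `sentence_score`) and the toy data are hypothetical. The sentence score shown is one plausible stand-in, not either of the two scoring methods proposed in the paper.

```python
import math
from collections import Counter


def log_freqs(docs, vocab, smoothing=1.0):
    """Smoothed log word frequencies over a list of tokenized documents."""
    counts = Counter(w for doc in docs for w in doc)
    total = sum(counts[w] + smoothing for w in vocab)
    return {w: math.log((counts[w] + smoothing) / total) for w in vocab}


def group_deviations(groups, smoothing=1.0):
    """Per-group deviation of each word's log frequency from the background.

    Assumption: the background is estimated from all groups pooled together,
    and the deviation is a simple log-ratio (no sparsity prior, no topics).
    """
    vocab = sorted({w for docs in groups.values() for doc in docs for w in doc})
    all_docs = [doc for docs in groups.values() for doc in docs]
    background = log_freqs(all_docs, vocab, smoothing)
    return {
        g: {w: lf - background[w]
            for w, lf in log_freqs(docs, vocab, smoothing).items()}
        for g, docs in groups.items()
    }


def sentence_score(sentence, deviation):
    """Hypothetical score: mean deviation of the sentence's known words."""
    words = [w for w in sentence if w in deviation]
    if not words:
        return 0.0
    return sum(deviation[w] for w in words) / len(words)


# Toy data: two groups of tokenized "papers" (illustrative only).
groups = {
    "A": [["topic", "model", "sparse"], ["topic", "model", "inference"]],
    "B": [["neural", "model", "training"], ["neural", "network", "training"]],
}
dev = group_deviations(groups)
```

Under these assumptions, a word used more often in one group than in the pooled corpus gets a positive deviation for that group, so sentences built from such words score higher as candidates for that group's comparative summary.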

Divisions: College of Engineering & Physical Sciences > School of Informatics and Digital Engineering > Computer Science
College of Engineering & Physical Sciences > Systems analytics research institute (SARI)
Additional Information: This work is licensed under a Creative Commons Attribution 4.0 International License. License details: http://creativecommons.org/licenses/by/4.0/
Event Title: 26th International Conference on Computational Linguistics
Event Type: Other
Event Dates: 2016-12-11 - 2016-12-16
ISBN: 978-4-87974-702-0
PURE Output Type: Conference contribution
Published Date: 2016-12-11
Accepted Date: 2016-12-01
Authors: He, Lei
Li, Wei
Zhuge, Hai (ORCID Profile 0000-0001-8250-6408)


Version: Published Version

License: Creative Commons Attribution
