Carnegie Mellon University

AIDA/OPERA: Hypothesis-Based Reasoning Based on Integrated Cross-Media Data

Natural Language Processing and Computational Linguistics

By Eduard Hovy

Advances in the automated analysis of text, images/video, and speech/audio now make possible two complementary lines of research: integrating their results and performing hypothesis-based reasoning over those results. Our proposed system OPERA (Operations-oriented Probabilistic Extraction with Reasoning and Analysis) is an integrated solution to the challenges of TA1, TA2, and TA3 as follows:

Existing high-performance media analysis: We build upon three bodies of existing technology: entity and event understanding, developed most recently in DARPA’s DEFT program; image and video analysis, developed most recently in IARPA’s Aladdin program; and speech and audio recognition, developed most recently under DARPA’s GALE program.

Cross-media integration: We describe two complementary approaches to integrating the results of the media analysis engines. First, going bottom-up, our tested high-performance entity linking and event coreference algorithms will be adapted to handle output from all TA1 engines, providing the core integration capabilities. Second, going top-down, the hypotheses constructed by the TA3 engine will be provided to the TA2/TA1 analysis engines to guide re-interpretation of their initial results in context. Such top-down guidance greatly simplifies integration inference.

Hypothesis creation, management, and utility-based hypothesis space exploration (both standalone and guided by the analyst): As a foundation we employ PowerLoom, a mature knowledge representation and reasoning engine with a long history of DARPA successes, and on top of it we implement a novel hypothesis reasoning engine with sophisticated belief calculation and propagation. We extend current belief propagation techniques, which are limited to fixed numbers of options, to handle open-ended possibilities. We also provide a novel metric of probabilistic utility that guides the exploration of the hypothesis space.
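
To make the last two ideas concrete, the following is a minimal illustrative sketch, not OPERA's actual method. It assumes that open-ended possibilities can be approximated by reserving some belief mass for a placeholder option, and that utility can be approximated as belief times an analyst-assigned payoff. All names here (OTHER, normalize, add_option, explore) are hypothetical and do not come from PowerLoom or from the OPERA system.

```python
import heapq

OTHER = "<other>"  # hypothetical placeholder for options not yet observed


def normalize(scores, other_mass=0.1):
    """Convert non-negative scores into beliefs, reserving other_mass for unseen options."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    beliefs = {k: (1.0 - other_mass) * v / total for k, v in scores.items()}
    beliefs[OTHER] = other_mass
    return beliefs


def add_option(beliefs, name, share=0.5):
    """Admit a newly discovered option by moving a share of the reserved mass to it."""
    updated = dict(beliefs)
    moved = updated[OTHER] * share
    updated[OTHER] -= moved
    updated[name] = updated.get(name, 0.0) + moved
    return updated


def explore(hypotheses, budget):
    """Return up to `budget` hypothesis names, highest (belief * payoff) first."""
    # heapq is a min-heap, so utilities are negated to pop the largest first.
    heap = [(-belief * payoff, name) for name, belief, payoff in hypotheses]
    heapq.heapify(heap)
    order = []
    while heap and len(order) < budget:
        _, name = heapq.heappop(heap)
        order.append(name)
    return order
```

In this sketch the set of options is never closed: the reserved mass shrinks as new options are admitted, while the beliefs continue to sum to one. The exploration order depends on both belief and payoff, so a less likely hypothesis with a high payoff can be examined before a more likely one with a low payoff.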