Hi! I'm a researcher at Square Enix AI & Arts Alchemy, working towards ML-powered, next-generation interactive experiences in gaming. I am also a project assistant professor at the University of Tokyo (Miyao Lab), where I lead the dialogue and LLM group. Some of my current research interests are:
- Generative/LLM Agents
- Character-driven Dialogue Generation and Storytelling
- Analyzing LLM behavior and hallucinations (circuit analysis)
- Emergent communication and situated language learning / multi-agent collaborative communication
Recently I've become increasingly interested in computational creativity and the potential for AI in music creation, and I often work on projects in these areas in my spare time. If you're also interested in these topics, especially if you're located in Japan, feel free to get in touch.
Previously I worked at the Japanese AI/robotics startup Preferred Networks. Prior to moving to Japan, I was a faculty research scientist at Johns Hopkins University, and did postdoctoral work at the University of Cambridge (with Anna Korhonen) and at University College London (with Sebastian Riedel). Before that I completed my PhD in Computer Science at UMass Amherst with David Smith and Mark Johnson.
Links:
CV Google Scholar Github
My Erdős–Bacon number is arguably no greater than 8.
Wolfe:
Wolfe is a probabilistic programming language that enables practitioners to develop machine learning models in a declarative manner. Wolfe models are written in Scala and compiled by Wolfe into highly optimized inference and learning routines (using Scala's own abstract syntax trees!), enabling researchers to focus on modelling while Wolfe does the heavy lifting. It currently features matrix factorization, message passing, and alternating directions dual decomposition (AD3); it can perform many structured prediction tasks, visualize inference in factor graphs, and more.
Natural Language Toolkit (NLTK):
The Natural Language Toolkit is a collection of open-source Python modules that can be used freely for research or pedagogical purposes. There's also a book documenting how to use the NLTK, which doubles as an introductory computational linguistics coursebook. During the summer of 2008 I worked on the NLTK under the Google Summer of Code program, implementing a suite of dependency parsers under the supervision of Sebastian Riedel and Jason Baldridge.
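To give a flavor of the dependency-parsing side of the toolkit, here is a minimal sketch adapted from the NLTK book's dependency-grammar example (illustrative only; exact class locations can vary across NLTK versions):

```python
import nltk

# A toy dependency grammar: each rule lists the possible dependents of a head word.
grammar = nltk.DependencyGrammar.fromstring("""
'shot' -> 'I' | 'elephant' | 'in'
'elephant' -> 'an' | 'in'
'in' -> 'pajamas'
'pajamas' -> 'my'
""")

# Projective dependency parsing over a tokenized sentence.
parser = nltk.ProjectiveDependencyParser(grammar)
for tree in parser.parse('I shot an elephant in my pajamas'.split()):
    print(tree)
```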
Mind the gap between conversations for improved long-term dialogue generation
Qiang Zhang, Jason Naradowsky, and Yusuke Miyao
EMNLP Findings 2023
[abstract]
[paper]
[bib]
Knowing how to end and resume conversations over time is a natural part of communication, allowing for discussions to span weeks, months, or years. The duration of gaps between conversations dictates which topics are relevant and which questions to ask, and dialogue systems which do not explicitly model time may generate responses that are unnatural. In this work we explore the idea of making dialogue models aware of time, and present GapChat, a multi-session dialogue dataset in which the time between each session varies. While the dataset is constructed in real-time, progress on events in speakers' lives is simulated in order to create realistic dialogues occurring across a long timespan. We expose time information to the model and compare different representations of time and event progress. In human evaluation we show that time-aware models perform better in metrics that judge the relevance of the chosen topics and the information gained from the conversation.
Ask an Expert: Leveraging Language Models to Improve Strategic Reasoning in Goal-Oriented Dialogue Models
Qiang Zhang, Jason Naradowsky, and Yusuke Miyao
ACL Findings 2023
[abstract]
[paper]
[bib]
Existing dialogue models may encounter scenarios which are not well-represented in the training data, and as a result generate responses that are unnatural, inappropriate, or unhelpful. We propose the "Ask an Expert" framework in which the model is trained with access to an "expert" which it can consult at each turn. Advice is solicited via a structured dialogue with the expert, and the model is optimized to selectively utilize (or ignore) it given the context and dialogue history. In this work the expert takes the form of an LLM. We evaluate this framework in a mental health support domain, where the structure of the expert conversation is outlined by pre-specified prompts which reflect a reasoning strategy taught to practitioners in the field. Blenderbot models utilizing "Ask an Expert" show quality improvements across all expert sizes, including those with fewer parameters than the dialogue model itself. Our best model provides a 10% improvement over baselines, approaching human-level scores on "engagingness" and "helpfulness" metrics.
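As a rough, hypothetical sketch of the kind of turn loop the framework describes (the prompts and function names below are placeholders of my own, not the paper's actual interface):

```python
# Hypothetical sketch of an "Ask an Expert"-style turn (not the paper's code).
# `expert` stands in for an expert LLM, `responder` for the dialogue model.

PROMPTS = [
    "What is the user's main concern?",
    "What emotional state does the user seem to be in?",
    "What would a supportive next step be?",
]

def ask_expert_turn(dialogue_history, expert, responder):
    """Solicit structured advice from the expert, then let the dialogue model
    decide how much of that advice to use when generating the next response."""
    context = "\n".join(dialogue_history)
    advice = []
    for prompt in PROMPTS:                        # structured expert dialogue
        answer = expert(f"{context}\n{prompt}")   # one advice step per prompt
        advice.append(f"{prompt} {answer}")
    # The responder conditions on both the history and the (optional) advice;
    # during training it learns to selectively use or ignore the advice.
    return responder(dialogue_history, advice)

if __name__ == "__main__":
    expert = lambda prompt: "(expert advice)"
    responder = lambda history, advice: "(response conditioned on advice)"
    print(ask_expert_turn(["User: I've been feeling overwhelmed lately."], expert, responder))
```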
Fiction-Writing Mode: An Effective Control for Human-Machine Collaborative Writing
Wenjie Zhong, Jason Naradowsky, Hiroya Takamura, Ichiro Kobayashi, and Yusuke Miyao
EACL 2023
[abstract]
[paper]
[bib]
We explore the idea of incorporating concepts from writing skills curricula into human-machine collaborative writing scenarios, focusing on adding writing modes as a control for text generation models. Using crowd-sourced workers, we annotate a corpus of narrative text paragraphs with writing mode labels. Classifiers trained on this data achieve an average accuracy of 87% on held-out data. We fine-tune a set of large language models to condition on writing mode labels, and show that the generated text is recognized as belonging to the specified mode with high accuracy. To study the ability of writing modes to provide fine-grained control over generated text, we devise a novel turn-based text reconstruction game to evaluate the difference between the generated text and the author's intention. We show that authors prefer text suggestions made by writing mode-controlled models on average 61.1% of the time, with satisfaction scores 0.5 higher on a 5-point ordinal scale. When evaluated by humans, stories generated via collaboration with writing mode-controlled models achieve high similarity with the professionally written target story. We conclude by identifying the most common mistakes found in the generated stories.
Emergent Communication with Attention
Ryokan Ri, Ryo Ueda, and Jason Naradowsky
CogSci 2023
[abstract]
[paper]
[bib]
To develop computational agents that better communicate using their own emergent language, we endow the agents with an ability to focus their attention on particular concepts in the environment. Humans often understand an object or scene as a composite of concepts, and those concepts are further mapped onto words. We implement this intuition as cross-modal attention mechanisms in Speaker and Listener agents in a referential game and show attention leads to more compositional and interpretable emergent language. We also demonstrate how attention aids in understanding the learned communication protocol by investigating the attention weights associated with each message symbol and the alignment of attention weights between Speaker and Listener agents. Overall, our results suggest that attention is a promising mechanism for developing more human-like emergent language.
Rethinking Offensive Text Detection as a Multi-Hop Reasoning Problem
Qiang Zhang, Jason Naradowsky, and Yusuke Miyao
ACL Findings 2022
[abstract]
[paper]
[bib]
We introduce the task of implicit offensive text detection in dialogues, where a statement may have either an offensive or non-offensive interpretation, depending on the listener and context. We argue that reasoning is crucial for understanding this broader class of offensive utterances and release SLIGHT, a dataset to support research on this task. Experiments using the data show that state-of-the-art methods of offense detection perform poorly when asked to detect implicitly offensive statements, achieving only ∼11% accuracy.
In contrast to existing offensive text detection datasets, SLIGHT features human-annotated chains of reasoning which describe the mental process by which an offensive interpretation can be reached from each ambiguous statement. We explore the potential for a multi-hop reasoning approach by utilizing existing entailment models to score the probability of these chains and show that even naive reasoning models can yield improved performance in most situations. Furthermore, analysis of the chains provides insight into the human interpretation process and emphasizes the importance of incorporating additional commonsense knowledge.
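To make the naive chain-scoring idea concrete, here is a sketch of how step-wise entailment probabilities from an off-the-shelf NLI model could be multiplied along a reasoning chain. This is my own illustration rather than the paper's implementation, and the entailment label index is the one documented for the roberta-large-mnli checkpoint:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Off-the-shelf NLI model; label order (contradiction, neutral, entailment)
# follows the roberta-large-mnli model card.
MODEL = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)
ENTAILMENT = 2

def entailment_prob(premise: str, hypothesis: str) -> float:
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, ENTAILMENT].item()

def chain_score(statement: str, chain: list[str]) -> float:
    """Score a reasoning chain by multiplying step-wise entailment probabilities."""
    score, premise = 1.0, statement
    for step in chain:
        score *= entailment_prob(premise, step)
        premise = step
    return score
```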
Amp-Space: A Large-scale Dataset for Fine-grained Timbre Transformation
Jason Naradowsky
DAFx 2021
[abstract]
[paper]
[bib]
[code]
We release Amp-Space, a large-scale dataset of paired audio samples: a source audio signal and an output signal, the result of a timbre transformation. The transformations we study come from black-box musical tools (amplifiers, stompboxes, studio effects) traditionally used to shape the sound of guitars, basses, or synthesizers. For each sample of transformed audio, the set of parameters used to create it is given. Samples are from both real and simulated devices, the latter allowing for orders of magnitude more data than found in comparable datasets. We demonstrate potential use cases of this data by (a) pre-training a conditional WaveNet model on synthetic data, showing that it reduces the number of samples necessary to digitally reproduce a real musical device, and (b) training a variational autoencoder to shape a continuous space of timbre transformations for creating new sounds through interpolation.
Machine Translation System Selection from Bandit Feedback
Jason Naradowsky, Xuan Zhang, and Kevin Duh
AMTA 2020
[abstract]
[paper]
[bib]
Adapting machine translation systems in the real world is a difficult problem. In contrast to offline training, users cannot provide the type of fine-grained feedback (such as correct translations) typically used for improving the system. Moreover, different users have different translation needs, and even a single user's needs may change over time.
In this work we take a different approach, treating the problem of adaptation as one of selection. Instead of adapting a single system, we train many translation systems using different architectures, datasets, and optimization methods. Using bandit learning techniques on simulated user feedback, we learn a policy to choose which system to use for a particular translation task. We show that our approach can (1) quickly adapt to address domain changes in translation tasks, (2) outperform the single best system in mixed-domain translation tasks, and (3) make effective instance-specific decisions when using contextual bandit strategies.
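As a small illustration of selection from bandit feedback, here is a standard UCB1 policy choosing among a pool of systems under simulated scalar rewards. This is my own sketch of the general setup; the paper itself explores several bandit and contextual-bandit strategies:

```python
import math
import random

class UCB1Selector:
    """Pick which MT system to use for each request, given only scalar feedback."""

    def __init__(self, n_systems: int):
        self.counts = [0] * n_systems
        self.values = [0.0] * n_systems

    def select(self) -> int:
        total = sum(self.counts)
        for i, c in enumerate(self.counts):
            if c == 0:                # try every system once first
                return i
        # UCB1: mean reward plus an exploration bonus.
        return max(range(len(self.counts)),
                   key=lambda i: self.values[i] + math.sqrt(2 * math.log(total) / self.counts[i]))

    def update(self, arm: int, reward: float) -> None:
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

# Toy simulation: system 2 is best for the (hidden) current domain.
random.seed(0)
true_quality = [0.45, 0.55, 0.70]
selector = UCB1Selector(len(true_quality))
for _ in range(1000):
    arm = selector.select()
    reward = 1.0 if random.random() < true_quality[arm] else 0.0  # simulated user feedback
    selector.update(arm, reward)
print(selector.counts)  # most pulls should go to system 2
```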
Pow-Wow: A Dataset and Study on Collaborative Communication in Pommerman
Takuma Yoneda, Matthew Walter, and Jason Naradowsky
Language in Reinforcement Learning (LaReL), 2020
[abstract]
[paper]
[bib]
In multi-agent learning, agents must coordinate with each other in order to succeed. For humans, this coordination is typically accomplished through the use of language. In this work we perform a controlled study of human language use in a competitive team-based game, and search for useful lessons for structuring communication protocols between autonomous agents.
We construct Pow-Wow, a new dataset for studying situated goal-directed human communication. Using the Pommerman game environment, we enlisted teams of humans to play against teams of AI agents, recording their observations, actions, and communications. We analyze the types of communications which result in effective game strategies, annotate them accordingly, and present corpus-level statistical analysis of how trends in communications affect game outcomes. Based on this analysis, we design a communication policy for learning agents, and show that agents which utilize communication achieve higher win-rates against baseline systems than those which do not.
Meta-learning Extractors for Music Source Separation
David Samuel, Aditya Ganeshan, and Jason Naradowsky
ICASSP 2020
[abstract]
[paper]
[bib]
[code]
[colab]
We propose a hierarchical, meta-learning-inspired model for music source separation (Meta-TasNet) in which a generator model is used to predict the weights of individual extractor models. This enables efficient parameter sharing while still allowing for instrument-specific parameterization. Meta-TasNet is shown to be more effective than models trained independently or in a multi-task setting, and achieves performance comparable with state-of-the-art methods. In comparison to the latter, our extractors contain fewer parameters and have faster run-time performance. We discuss important architectural considerations, and explore the costs and benefits of this approach.
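A toy PyTorch sketch of the weight-generation mechanism (a generator predicting the parameters of an instrument-specific extractor from an instrument embedding); this illustrates only the general idea, not the Meta-TasNet architecture itself:

```python
import torch
import torch.nn as nn

class WeightGenerator(nn.Module):
    """Predict the weights of a small linear extractor from an instrument embedding."""

    def __init__(self, n_instruments: int, emb_dim: int, feat_dim: int):
        super().__init__()
        self.embed = nn.Embedding(n_instruments, emb_dim)
        # Outputs a (feat_dim x feat_dim) weight matrix plus a bias vector.
        self.to_weights = nn.Linear(emb_dim, feat_dim * feat_dim + feat_dim)
        self.feat_dim = feat_dim

    def forward(self, instrument: torch.Tensor, features: torch.Tensor) -> torch.Tensor:
        params = self.to_weights(self.embed(instrument))            # (batch, d*d + d)
        w, b = params[:, : self.feat_dim ** 2], params[:, self.feat_dim ** 2:]
        w = w.view(-1, self.feat_dim, self.feat_dim)
        # Apply the generated, instrument-specific extractor to shared features.
        return torch.bmm(features.unsqueeze(1), w).squeeze(1) + b

gen = WeightGenerator(n_instruments=4, emb_dim=16, feat_dim=32)
out = gen(torch.tensor([0, 2]), torch.randn(2, 32))                 # batch of two instruments
print(out.shape)  # torch.Size([2, 32])
```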
Emergent Communication with World Models
Alex Cowen-Rivers and Jason Naradowsky
NeurIPS 2019 Workshop on Emergent Communication (EmeCom)
[abstract]
[paper]
[bib]
We introduce Language World Models, a class of language-conditional generative models which interpret natural language messages by predicting latent codes of future observations. This provides a visual grounding of the message, similar to an enhanced observation of the world, which may include objects outside of the listening agent's field of view. We incorporate this "observation" into a persistent memory state, and allow the listening agent's policy to condition on it, akin to the relationship between memory and controller in a World Model. We show this improves effective communication and task success in 2D gridworld speaker-listener navigation tasks. In addition, we develop two losses framed specifically for our model-based formulation to promote positive signalling and positive listening. Finally, because messages are interpreted in a generative model, we can visualize the model's beliefs to gain insight into how the communication channel is utilized.
Language Modeling for Morphologically Rich Languages: Character-Aware Modeling for Word-Level Prediction
Daniela Gerz, Ivan Vulić, Edoardo Ponti, Jason Naradowsky, Roi Reichart, and Anna Korhonen
TACL 2018
[abstract]
[paper]
[bib]
Neural architectures are prominent in the construction of language models (LMs). However, word-level prediction is typically agnostic of subword-level information (characters and character sequences) and operates over a closed vocabulary, consisting of a limited word set. Indeed, while subword-aware models boost performance across a variety of NLP tasks, previous work did not evaluate the ability of these models to assist next-word prediction in language modeling tasks. Such subword-level informed models should be particularly effective for morphologically-rich languages (MRLs) that exhibit high type-to-token ratios. In this work, we present a large-scale LM study on 50 typologically diverse languages covering a wide variety of morphological systems, and offer new LM benchmarks to the community, while considering subword-level information. The main technical contribution of our work is a novel method for injecting subword-level information into semantic word vectors, integrated into the neural language modeling training, to facilitate word-level prediction. We conduct experiments in the LM setting where the number of infrequent words is large, and demonstrate strong perplexity gains across our 50 languages, especially for morphologically-rich languages. Our code and data sets are publicly available.
A Structured Variational Autoencoder for Morphological Inflection
Lawrence Wolf-Sonkin, Jason Naradowsky, Sebastian J. Mielke, and Ryan Cotterell
ACL 2018
[abstract]
[paper]
[bib]
Statistical morphological inflectors are typically trained on fully supervised, type-level data. One remaining open research question is the following: How can we effectively exploit raw, token-level data to improve their performance? To this end, we introduce a novel generative latent-variable model for the semi-supervised learning of inflection generation. To enable posterior inference over the latent variables, we derive an efficient variational inference procedure based on the wake-sleep algorithm. We experiment on 23 languages, using the Universal Dependency corpora in a simulated low-resource setting, and find an average improvement of 10%.
Hypothesis Only Baselines in Natural Language Inference
Adam Poliak, Jason Naradowsky, Aparajita Haldar, Rachel Rudinger, and Benjamin Van Durme
*SEM 2018
[abstract]
[paper]
[bib]
Best Paper Award
We propose a hypothesis only baseline for diagnosing Natural Language Inference (NLI). Especially when an NLI dataset assumes inference is occurring based purely on the relationship between a context and a hypothesis, it follows that assessing entailment relations while ignoring the provided context is a degenerate solution. Yet, through experiments on ten distinct NLI datasets, we find that this approach, which we refer to as a hypothesis-only model, is able to significantly outperform a majority-class baseline across a number of NLI datasets. Our analysis suggests that statistical irregularities may allow a model to perform NLI in some datasets beyond what should be achievable without access to the context.
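The core of a hypothesis-only baseline is simple: train a classifier that sees the hypothesis but never the premise, and compare it against the majority class. A minimal scikit-learn sketch of that setup, on toy data (my own illustration, not the paper's implementation):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy NLI examples: (premise, hypothesis, label). The baseline ignores the premise.
train = [
    ("A man is playing a guitar on stage.", "A man is performing music.", "entailment"),
    ("A man is playing a guitar on stage.", "The man is asleep at home.", "contradiction"),
    ("Two dogs run through a field.", "Animals are outside.", "entailment"),
    ("Two dogs run through a field.", "The dogs are swimming.", "contradiction"),
]

hypotheses = [h for _, h, _ in train]   # premises are deliberately discarded
labels = [y for _, _, y in train]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(hypotheses, labels)

# If this premise-blind classifier beats a majority-class baseline on real NLI data,
# the dataset's hypotheses contain statistical artifacts.
print(clf.predict(["A person is outdoors."]))
```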
Gender Bias in Coreference Resolution
Rachel Rudinger, Jason Naradowsky, Brian Leonard, and Benjamin Van Durme
NAACL 2018
[abstract]
[paper]
[bib]
We present an empirical study of gender bias in coreference resolution systems. We first introduce a novel, Winograd schema-style set of minimal pair sentences that differ only by pronoun gender.
With these Winogender schemas, we evaluate and confirm systematic gender bias in three publicly-available coreference resolution systems, and correlate this bias with real-world and textual gender statistics.
Improvised Robotic Design with Found Objects
Azumi Maekawa, Ayaka Kume, Hironori Yoshida, Jun Hatori, Jason Naradowsky, and Shunta Saito
NeurIPS Machine Learning for Creativity and Design 2018
[abstract]
[paper]
[bib]
We present a study in the creative design of robots using found objects. In particular, this study focuses on learning locomotion techniques for robots with nontraditional and arbitrarily shaped limbs, in this case, tree branches. Through the use of 3D scanning, simulation, and deep reinforcement learning, we show that we can effectively learn movement strategies for such robots with unorthodox shapes.
Automatic Illumination Effects for 2D Characters
Zhengyan Gao, Taizan Yonetsuji, Tatsuya Takamura, Toru Matsuoka, and Jason Naradowsky
NeurIPS Machine Learning for Creativity and Design 2018
[abstract]
[paper]
[bib]
In this work we present a system to apply realistic lighting effects to 2D character illustrations. Our approach involves a cascaded set of predictions: first generating a normal map of the character's 3D structure, utilizing it (and a light source) to predict a global shadow map, and combining both to render the final effect. This allows our system to paint natural overlapping shadows, and recreate complex lighting effects, such as rim-lighting, while maintaining the ease of use associated with CGI. Results on original illustrations show the effectiveness of our method.
The Hitachi/JHU CHiME-5 system: Advances in speech recognition for everyday home environments using multiple microphone arrays
Naoyuki Kanda, Rintaro Ikeshita, Shota Horiguchi, Yusuke Fujita, Kenji Nagamatsu (Hitachi, Ltd), Xiaofei Wang, Vimal Manohar, Nelson Enrique Yalta Soplin, Matthew Maciejewski, Szu-Jui Chen, Aswin Shanmugam Subramanian, Ruizhi Li, Zhiqi Wang, Jason Naradowsky, L. Paola Garcia-Perera, and Gregory Sell
CHiME 2018
[abstract]
[paper]
[bib]
This paper presents Hitachi's and JHU's efforts in developing a CHiME-5 system to recognize dinner-party speech recorded by multiple microphone arrays. We newly developed (1) a way to apply multiple data augmentation methods, (2) residual bidirectional long short-term memory, (3) 4-channel acoustic models, (4) multiple-array combination methods, (5) a hypothesis deduplication method, and (6) a speaker adaptation technique for the neural beamformer. As a result, our best system in category B achieved a word error rate (WER) of 52.38% on the development set, corresponding to a 35% relative WER reduction from the state-of-the-art baseline. Our best system also achieved a WER of 48.20% on the evaluation set, the second-best result in the CHiME-5 competition.
Programming with a Differentiable Forth Interpreter
Matko Bosnjak, Tim Rocktäschel, Jason Naradowsky, and Sebastian Riedel
ICML 2017
[abstract]
[paper]
[bib]
Given that in practice training data is scarce for all but a small set of problems, a core question is how to incorporate prior knowledge into a model. In this paper, we consider the case of prior procedural knowledge for neural networks, such as knowing how a program should traverse a sequence, but not what local actions should be performed at each step. To this end, we present an end-to-end differentiable interpreter for the programming language Forth which enables programmers to write program sketches with slots that can be filled with behaviour trained from program input-output data. We can optimise this behaviour directly through gradient descent techniques on user-specified objectives, and also integrate the program into any larger neural computation graph. We show empirically that our interpreter is able to effectively leverage different levels of prior program structure and learn complex behaviours such as sequence sorting and addition. When connected to outputs of an LSTM and trained jointly, our interpreter achieves state-of-the-art accuracy for end-to-end reasoning about quantities expressed in natural language stories.
Modeling exclusion with a differentiable factor graph constraint
Jason Naradowsky and Sebastian Riedel
ICML 2017, DeepStruct
[abstract]
[paper]
[bib]
With the adoption of general neural network architectures, many researchers have opted to trade informative priors for powerful models and big data. However, for many structured prediction tasks the complex relationships between variables in the output space are often difficult to learn from the available data alone. Such relationships often centre around the notion of exclusion: that predicting one structure prohibits the prediction of others. In this work we formulate a differentiable factor graph exclusion constraint to incorporate this prior belief into neural end-to-end architectures. We demonstrate the effectiveness of this method in the context of extracting event information from clusters of related news articles, and introduce meta-inference learning to determine the ideal number of inference iterations to simulate.
Break it down for me: A study in automated lyric annotation
Lucas Sterckx, Jason Naradowsky, Bill Byrne, Thomas Demeester, and Chris Develder
EMNLP 2017
[abstract]
[paper]
[bib]
Comprehending lyrics, as found in songs and poems, can pose a challenge to human and machine readers alike. This motivates the need for systems that can understand the ambiguity and jargon found in such creative texts, and provide commentary to aid readers in reaching the correct interpretation.
We introduce the task of automated lyric annotation (ALA). Like text simplification, a goal of ALA is to rephrase the original text in a more easily understandable manner. However, in ALA the system must often include additional information to clarify niche terminology and abstract concepts. To stimulate research on this task, we release a large collection of crowdsourced annotations for song lyrics. We analyze the performance of translation and retrieval models on this task, measuring performance with both automated and human evaluation. We find that each model captures a unique type of information important to the task.
A Neural Forth Abstract Machine
Matko Bosnjak, Tim Rocktäschel, Jason Naradowsky, and Sebastian Riedel
NIPS 2017, NAMPI
[abstract]
[paper]
[bib]
Noise reduction and targeted exploration in imitation learning for Abstract Meaning Representation parsing
James Goodman, Andreas Vlachos and Jason Naradowsky
ACL 2016
[abstract]
[paper]
[bib]
Semantic parsers map natural language statements into meaning representations, and must abstract over syntactic phenomena, resolve anaphora, and identify word senses to eliminate ambiguous interpretations. Abstract meaning representation (AMR) is a recent example of one such semantic formalism which, similar to a dependency parse, utilizes a graph to represent relationships between concepts (Banarescu et al., 2013). As with dependency parsing, transition-based approaches are a common approach to this problem. However, when trained in the traditional manner these systems are susceptible to the accumulation of errors when they find undesirable states during greedy decoding. Imitation learning algorithms have been shown to help these systems recover from such errors.
To effectively use these methods for AMR parsing we find it highly beneficial to introduce two novel extensions: noise reduction and targeted exploration. The former mitigates the noise in the feature representation, a result of the complexity of the task. The latter targets the exploration steps of imitation learning towards areas which are likely to provide the most information in the context of a large action-space. We achieve state-of-the-art results, and improve upon standard transition-based parsing by 4.7 F1 points.
UCL+Sheffield at SemEval-2016 Task 8: Imitation learning for AMR parsing with an α-bound
James Goodman, Andreas Vlachos and Jason Naradowsky
SemEval 2016
[abstract]
[paper]
[bib]
We develop a novel transition-based parsing algorithm for the abstract meaning representation parsing task using exact imitation learning, in which the parser learns a statistical model by imitating the actions of an expert on the training data. We then use the imitation learning algorithm DAGGER to improve the performance, and apply an α-bound as a simple noise reduction technique. Our performance on the test set was 60% in F-score, and the performance gains on the development set due to DAGGER were up to 1.1 points of F-score. The α-bound improved performance by up to 1.8 points.
Learning with Joint Inference and Latent Linguistic Structure in Graphical Models
Jason Naradowsky
Doctoral Dissertation, 2014
Supervisors: David Smith and
Mark Johnson
[abstract]
[paper]
[bib]
A human listener, charged with the difficult task of mapping language to meaning, must infer a rich hierarchy of linguistic structures, beginning with an utterance and culminating in an understanding of what was spoken. Much in the same manner, developing complete natural language processing systems requires the processing of many different layers of linguistic information in order to solve complex tasks, like answering a query or translating a document.
Historically the community has largely adopted a “divide and conquer” strategy, choosing to split up such complex tasks into smaller fragments which can be tackled independently, with the hope that these smaller contributions will also yield benefits to NLP systems as a whole. These individual components can be laid out in a pipeline and processed in turn, one system’s output becoming input for the next. This approach poses two problems. First, errors propagate, and, much like the childhood game of “telephone”, combining systems in this manner can lead to unintelligible outcomes. Second, each component task requires annotated training data to act as supervision for training the model. These annotations are often expensive and time-consuming to produce, may differ from each other in genre and style, and may not match the intended application.
In this dissertation we pursue novel extensions of joint inference techniques for natural language processing. We present a framework that offers a general method for constructing and performing inference using graphical model formulations of typical NLP problems. Models are composed using weighted Boolean logic constraints, and inference is performed using belief propagation. The systems we develop are composed of two parts: one a representation of syntax, the other a desired end task (part-of-speech tagging, semantic role labeling, named entity recognition, or relation extraction). By modeling these problems jointly, both models are trained in a single, integrated process, with uncertainty propagated between them. This mitigates the accumulation of errors typical of pipelined approaches. We further advance previous methods for performing efficient inference on graphical model representations of combinatorial structure, like dependency syntax, extending them to various forms of phrase structure parsing.
Finding appropriate training data is a crucial problem for joint inference models. We observe that in many circumstances, the output of earlier components of the pipeline is often irrelevant – only the end task output is important. Yet we often have strong a priori assumptions regarding what this structure might look like: for phrase structure syntax the model should represent a valid tree, for dependency syntax it should represent a directed graph. We propose a novel marginalization-based training method in which the error signal from end task annotations is used to guide the induction of a constrained latent syntactic representation. This allows training in the absence of syntactic training data, where the latent syntactic structure is instead optimized to best support the end task predictions. We find that across many NLP tasks this training method offers performance comparable to fully supervised training of each individual component, and in some instances improves upon it by learning latent structures which are more appropriate for the task.
Improving NLP through Marginalization of Hidden Syntactic Structure
Jason Naradowsky,
Sebastian Riedel, and
David Smith
EMNLP 2012
[abstract]
[paper]
[bib]
Many NLP tasks make predictions that are inherently coupled to syntactic relations, but for many languages the resources required to provide such syntactic annotations are unavailable. For others it is unclear exactly how much of the syntactic annotations can be effectively leveraged with current models, and what structures in the syntactic trees are most relevant to the current task.
We propose a novel method which avoids the need for any syntactically annotated data when predicting a related NLP task. Our method couples latent syntactic representations, constrained to form valid dependency graphs or constituency parses, with the prediction task via specialized factors in a Markov random field. At both training and test time we marginalize over this hidden structure, learning the optimal latent representations for the problem. Results show that this approach provides significant gains over a syntactically uninformed baseline, outperforming models that observe syntax on an English relation extraction task, and performing comparably to them in semantic role labeling.
Grammarless Parsing for Joint Inference
Jason Naradowsky,
Tim Vieira, and
David Smith
COLING 2012
[abstract]
[paper]
[bib]
Many NLP tasks interact with syntax. The presence of a named entity span, for example, is often a clear indicator of a noun phrase in the parse tree, while a span in the syntax can help indicate the lack of a named entity in the spans that cross it. For these types of problems joint inference offers a better solution than a pipelined approach, and yet large joint models are rarely pursued. In this paper we argue this is due in part to the absence of a general framework for joint inference which can efficiently represent syntactic structure.
We propose an alternative and novel method in which constituency parse constraints are imposed on the model via combinatorial factors in a Markov random field, guaranteeing that a variable configuration forms a valid tree. We apply this approach to jointly predicting parse and named entity structure, for which we introduce a zero-order semi-CRF named entity recognizer which also relies on a combinatorial factor. At the junction between these two models, soft constraints coordinate between syntactic constituents and named entity spans, providing an additional layer of flexibility on how these models interact. With this architecture we achieve the best-reported results on both CRF-based parsing and named entity recognition on sections of the OntoNotes corpus, and outperform state-of-the-art parsers on an NP-identification task, while remaining asymptotically faster than traditional grammar-based parsers.
Combinatorial Constraints for Constituency Parsing in Graphical Models
Jason Naradowsky and
David Smith
Technical Report, University of Massachusetts Amherst, 2012.
Unsupervised Bilingual Morpheme Segmentation and Alignment with Context-rich Hidden Semi-Markov Models
Jason Naradowsky and
Kristina Toutanova
ACL 2011
[abstract]
[paper]
[slides]
[bib]
This paper describes an unsupervised dynamic graphical model for morphological segmentation and bilingual morpheme alignment for statistical machine translation. The model extends Hidden Semi-Markov chain models by using factored output nodes and special structures for its conditional probability distributions. It relies on morpho-syntactic and lexical source-side information (part-of-speech, morphological segmentation) while learning a morpheme segmentation over the target language. Our model outperforms a competitive word alignment system in alignment quality. Used in a monolingual morphological segmentation setting it substantially improves accuracy over previous state-of-the-art models on three Arabic and Hebrew datasets.
A Discriminative Model for Joint Morphological Disambiguation and Dependency Parsing
John Lee,
Jason Naradowsky, and
David Smith
ACL 2011
[abstract]
[paper]
[bib]
Most previous studies of morphological disambiguation and dependency parsing have been pursued independently. Morphological taggers operate on n-grams and do not take into account syntactic relations; parsers use the ``pipeline'' approach, assuming that morphological information has been separately obtained.
However, in morphologically-rich languages, there is often considerable interaction between morphology and syntax, such that neither can be disambiguated without the other. In this paper, we propose a discriminative model that jointly infers morphological properties and syntactic structures. In evaluations on various highly-inflected languages, this joint model outperforms both a baseline tagger in morphological disambiguation, and a pipeline parser in head selection.
Feature Induction for Online Constraint-based Phonology Acquisition
Jason Naradowsky,
Joe Pater, and
David Smith
Synthesis Project, Presented at NECPHON 2011
[abstract]
[paper]
[bib]
Log-linear models provide a convenient method for coupling existing machine learning methods to constraint-based linguistic formalisms like optimality theory and harmonic grammar. While the learning methods themselves have been well studied in this domain, the question of how these constraints originate is often left unanswered. We present a novel, error-driven approach to constraint induction that performs lightweight decisions based on local information. When evaluated on the task of reproducing human gradient phonotactic judgements, a model trained with this procedure can sometimes nearly match the performance of state-of-the-art methods that rely on global information and individual assessment of all possible constraints. We conclude by discussing methods for incorporating context and linguistic bias into the induction scheme to produce more accurate grammars.
Learning Hidden Metrical Structure with a Log-linear Model of Grammar
Jason Naradowsky,
Joe Pater,
David Smith, and
Robert Staubs
Workshop on Computational Modelling of Sound Pattern Acquisition 2010
Polylingual Topic Models
David Mimno,
Hanna Wallach,
Jason Naradowsky,
David Smith and
Andrew McCallum
EMNLP 2009
[abstract]
[paper]
[bib]
Topic models are a useful tool for analyzing large text collections, but have previously been applied in only monolingual, or at most bilingual, contexts. Meanwhile, massive collections of interlinked documents in dozens of languages, such as Wikipedia, are now widely available, calling for tools that can characterize content in many languages. We introduce a polylingual topic model that discovers topics aligned across multiple languages. We explore the model's characteristics using two large corpora, each with over ten different languages, and demonstrate its usefulness in supporting machine translation and tracking topic trends across languages.
Improving Morphology Induction by Learning Spelling Rules
Jason Naradowsky and
Sharon Goldwater
IJCAI 2009
[abstract]
[paper]
[slides]
[bib]
Unsupervised learning of morphology is an important task for human learners and in natural language processing systems. Previous systems focus on segmenting words into substrings (taking => tak.ing), but sometimes a segmentation-only analysis is insufficient (e.g., taking may be more appropriately analyzed as take.ing, with a spelling rule accounting for the deletion of the stem-final e). In this paper, we develop a Bayesian model for simultaneously inducing both morphology and spelling rules. We show that the addition of spelling rules improves performance over the baseline morphology-only model.
Polylingual Topic Models
David Mimno,
Hanna Wallach,
Limin Yao,
Jason Naradowsky and
Andrew McCallum
The Learning Workshop (Snowbird) 2009