Distributional Learning of Syntax

A course at LSA Summer Institute, Chicago, July 2015


Alexander Clark, King's College London.

Course Overview

This course will look at the computational and mathematical theory of how grammars can be learned from strings. Any theory of language acquisition must at bottom rest on some solution to this problem.

In the last ten years there has been very rapid progress in this field using what is called distributional learning; we will cover a family of techniques for learning context-free grammars and richer formalisms equivalent to Minimalist Grammars. We will review empirical work using naturally occurring corpora, but the primary goal is to understand the fundamental principles that underlie this problem.

Slides and Materials

Course materials: the lecture slides will be posted after each lecture.

Lecture 1

The first lecture will be a general introduction. We will look at a specific learning algorithm for substitutable context-free grammars and relate it to the general problem of language acquisition.
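The core inference step behind learning from substitutability can be sketched in a few lines. The idea is that two strings observed in the same context are assumed to be intersubstitutable everywhere, so substrings sharing a context are merged into one class. The sketch below is illustrative only (character-level strings, a toy sample, and the function name are my own choices), not the course's algorithm:

```python
from collections import defaultdict

def substitution_classes(sample):
    """Group the substrings of a sample into classes, merging any two
    substrings that share at least one context (prefix, suffix).
    For substitutable languages, one shared context is taken as
    evidence of full substitutability."""
    context_to_subs = defaultdict(set)
    for s in sample:
        n = len(s)
        for i in range(n):
            for j in range(i + 1, n + 1):
                # the context of s[i:j] is the pair (s[:i], s[j:])
                context_to_subs[(s[:i], s[j:])].add(s[i:j])

    # union-find over substrings, merging those that share a context
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    def union(x, y):
        parent[find(x)] = find(y)

    for subs in context_to_subs.values():
        subs = list(subs)
        for other in subs[1:]:
            union(subs[0], other)

    classes = defaultdict(set)
    for x in parent:
        classes[find(x)].add(x)
    return list(classes.values())

# "cat" and "dog" share the context ("the ", " sleeps"),
# so they end up in the same substitution class.
classes = substitution_classes(["the cat sleeps", "the dog sleeps"])
```

From these classes one can then read off a context-free grammar with one nonterminal per class, which is where the full algorithm goes beyond this sketch.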

Lecture 2

This lecture will look at the weak learning of a larger class of context-free grammars. We will consider learning models where the learner can use queries, and relate this to the use of probabilistic data.

Lecture 3

We will look at learning grammars beyond the context-free; in particular, at learning Multiple Context-Free Grammars, which are equivalent to Stabler's Minimalist Grammars. We will also look at learning copying.
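Copying is the standard illustration of why formalisms beyond context-free are needed: the copy language {ww : w in {a,b}*} is not context-free, but a 2-MCFG can derive it because its nonterminals generate tuples of strings (here, the pair (w, w)) rather than single strings. A minimal membership test for that language (the function name and the two-letter alphabet are illustrative assumptions):

```python
def in_copy_language(s):
    """Membership in the copy language {ww : w in {a,b}*}:
    the string must have even length, use only a and b, and
    consist of two identical halves."""
    n = len(s)
    if n % 2 != 0 or any(c not in "ab" for c in s):
        return False
    return s[: n // 2] == s[n // 2:]
```

A string like "abab" is in the language while "abba" is not; the point of the MCFG treatment is that the two halves are built in parallel by a single derivation, something no context-free rule can enforce.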

Lecture 4

We will look at strong learning — learning syntactic structure — and how this relates more generally to the problem of learning semantics. This will be the moment to discuss the relation of these distributional learning ideas to syntactic and semantic bootstrapping and more generally to linguistic theory.