Book Review: Foundations of Statistical Natural Language Processing

“Foundations of Statistical Natural Language Processing” by Christopher D. Manning and Hinrich Schütze was published in 1999, but do not let its age deter you from reading this useful book. It remains an important foundational text in a fast-moving field.

Unlike many authors, Manning & Schütze do not make the mistake of trying to cover too many topics and problem domains. Instead, they concentrate on a relatively small group of important problems and give each the coverage it deserves. This choice of fundamental problems, combined with the in-depth coverage, accounts for the book’s continuing value.

The book is divided into four sections of four chapters each:

  • Preliminaries: Introduction, Mathematical Foundations, Linguistic Essentials, Corpus-Based Work
  • Words: Collocations, Statistical Inference (n-gram Models over Sparse Data), Word Sense Disambiguation, Lexical Acquisition
  • Grammar: Markov Models, Part-of-Speech Tagging, Probabilistic Context Free Grammars, Probabilistic Parsing
  • Applications and Techniques: Statistical Alignment and Machine Translation, Clustering, Topics in Information Retrieval, Text Categorization

The Preliminaries section might be a little pedestrian for some readers, but it provides the background required for everything that follows. Not everyone has a good grounding in both statistics and linguistics. In fact, most readers in the target audience will lack sufficient knowledge of one or the other, so it is good that the basics are explained. Modern natural language processing relies on statistics, so some basic statistics is required. Similarly, some basic linguistics is required if you intend to write software that understands even the simplest language!

As the title suggests, the subjects listed above could be considered a “foundational” set that must be mastered before more sophisticated problems are tackled. Although work has progressed on such problems in the intervening years, many remain research-level and are considered “unsolved” (e.g. conversational agents, question answering, and machine translation).

As well as giving a good, in-depth review of its topic, each chapter ends with an excellent further reading section. More than a bare bibliography, these sections point to landmark papers and discuss the alternative approaches that are possible for specific sub-problems.

This book is strongly recommended for readers who have whetted their appetite for natural language processing with books such as Natural Language Processing with Python by Bird et al. and wish to know more. In particular, it lays the groundwork with good, in-depth coverage of foundational topics in this difficult subject area.
