https://wiki.inf.ed.ac.uk/MLforNLP/WebHome

# Machine Learning for Natural Language Processing (ML-for-NLP)

This reading group focuses on Machine Learning techniques that may be applied to the field of Natural Language Processing. Participants are encouraged to suggest topics, papers, or tutorials (which need not involve any current application in NLP) by adding them to the lists below. Suggesting a paper does not constitute any sort of commitment to presenting that paper.

Meetings are held approximately weekly on Thursdays, in room 4.02 at 3pm unless otherwise stated.

Announcements for this group will be made by email, and it is possible to sign up to the mailing list here.

## News

- No news is good news.

## Tools for Research. Meeting notes

###### Unix Tools

The screen command. Cheatsheet, John's screen resources (John says: Note that the screen configuration files are named .screenrc and .screenrc.gen, and so you have to explicitly look for dotfiles to see them. The viewRepos.sh and launchRepositoryEditors.sh scripts are pretty specific to my directory arrangement and may need tweaking for others' setups.)

Unix for Poets by Ken Church is a nice guide to using the Unix tool set: tr, grep, sort, uniq, wc, rev, sed, awk, shuf, cat, tac, tail, cut, paste, etc.

Basic stuff but super useful

-dave
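
The core exercise in that style of guide is counting word frequencies with a pipeline along the lines of `tr -sc 'A-Za-z' '\n' | sort | uniq -c | sort -rn`. For anyone who wants the same result inside a script, here is a minimal Python sketch of that idea. The function name `word_frequencies` and the sample sentence are made up for illustration, and the tokenisation (runs of ASCII letters) is an assumption that mirrors the `tr` step rather than anything prescribed by the guide.

```python
import re
from collections import Counter

def word_frequencies(text):
    """Count words, most frequent first (hypothetical helper)."""
    # Like `tr -sc 'A-Za-z' '\n'`: treat every run of ASCII letters as a word.
    words = re.findall(r"[A-Za-z]+", text)
    # Like `sort | uniq -c | sort -rn`: tally the words and order by count.
    return Counter(words).most_common()

sample = "the cat sat on the mat and the dog sat too"
for word, count in word_frequencies(sample)[:3]:
    print(count, word)
```

The shell pipeline remains the quicker tool for one-off questions at the prompt; the Python version is mainly handy when the counts feed into further processing.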

###### LaTeX & PDFs & Bibliography Management

JabRef - bibliography management

Okular - pdf highlighting

###### Research organization

###### Large-Scale Computation

Eddie notes from Des' webpage

###### Machine learning & Coding libraries

Apache Commons and the FileUtils class

## Rota

Empty.

## Paper Recommendations

Please add papers you consider appropriate for ML-for-NLP. Please also add thematic categories that are not covered.

###### Statistical Significance Testing

- Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms

###### Loss Minimization

- MERT and variants:
  - http://www.cs.jhu.edu/~ozaidan/zmert/zaidan09_ZMERT.pdf Zaidan 2009, "Z-MERT: A Fully Configurable Open Source Tool for Minimum Error Rate Training of Machine Translation Systems"
  - http://www.statmt.org/book/ Koehn 2010 (book), "Statistical Machine Translation"
  - http://www.aclweb.org/anthology/P/P11/P11-2031.pdf Clark, Dyer et al. 2011, "Better Hypothesis Testing for Statistical Machine Translation: Controlling for Optimizer Instability"

- MIRA:
  - http://dl.acm.org/citation.cfm?id=944936 Crammer and Singer 2003, "Ultraconservative Online Algorithms for Multiclass Problems"
  - http://dl.acm.org/citation.cfm?id=1613747 Chiang, Marton et al. 2008, "Online Large-Margin Training of Syntactic and Structural Translation Features"

- Parsing:
  - http://acl.ldc.upenn.edu/P/P96/P96-1024.pdf Goodman 1996, "Parsing Algorithms and Metrics"
  - http://acl.ldc.upenn.edu/N/N07/N07-1051.pdf Petrov and Klein 2007, "Improved Inference for Unlexicalized Parsing"

- Ranking:
  - http://research.microsoft.com/pubs/68133/lambdarank.pdf Burges, Ragno and Le 2007, "Learning to Rank with Non-Smooth Cost Functions"

###### Bayesian Methods

- http://www.isi.edu/natural-language/people/bayes-with-tears.pdf Knight 2009, "Bayesian Inference with Tears"
- http://www.cs.berkeley.edu/~klein/papers/tutorial-acl2007.pdf Liang and Klein 2007, "Structured Bayesian Nonparametric Models with Variational Inference"
- http://cocosci.berkeley.edu/tom/papers/exemplar.pdf Shi, Feldman et al. "Performing Bayesian Inference With Exemplar Models"
- http://cocosci.berkeley.edu/tom/papers/mechanism.pdf Shi, Griffiths et al. "Exemplar Models as a Mechanism for Performing Bayesian Inference"

###### Topic Models

- http://www.cs.umass.edu/~wallach/publications/wallach09rethinking.pdf Wallach, Mimno et al. 2009, "Rethinking LDA: Why Priors Matter"

###### Sampling Methods

- http://www.cs.ubc.ca/%7Earnaud/doucet_defreitas_gordon_smcbookintro.ps Doucet, de Freitas et al. "An Introduction to Sequential Monte Carlo Methods"

###### Dynamical Systems

- http://www.cs.ubc.ca/~arnaud/doucet_johansen_tutorialPF.pdf Doucet and Johansen 2008, "A Tutorial on Particle Filtering and Smoothing: Fifteen Years Later"

###### Information Theory

- http://www-lmmb.ncifcrf.gov/~toms/paper/primer/primer.pdf Schneider, "Information Theory Primer"
- http://cm.bell-labs.com/cm/ms/what/shannonday/shannon1948.pdf Shannon 1948, "A Mathematical Theory of Communication"

###### Variational Methods

###### Probabilistic Generative Models

###### Graphical Models

###### Deep learning and Energy Based Models

###### Language Modeling

- http://ftp.cs.toronto.edu/pub/gh/MacKay+Peto-1995.pdf MacKay and Peto 1995, "A Hierarchical Dirichlet Language Model", *Natural Language Engineering*, vol. 1, no. 3, pp. 1–19

## Other Reading Groups

- Cambridge http://www.wiki.cl.cam.ac.uk/rowiki/NaturalLanguage/ReadingGroup
- Toronto http://learning.cs.toronto.edu/mlreading.html

## Useful Links

- Machine Learning Q&A: http://metaoptimize.com
- Machine Learning blog: http://hunch.net/

## Past meetings

### 3pm, Thursday 6th September, room 1.16 (Planning Meeting)

- Discuss ideas for topics to be presented
- Discuss whether or not you are happy for our meetings to become more exercise-based

### 3pm, Thursday 13th September, room 3.02 (Information Theory Primer)

### 3pm, Thursday 20th September, room 4.02 (Information Theory MacKay)

- Chapter 2 of MacKay (pages 34-48 of the pdf, pages 22-36 of the book), focusing on solutions:
  - Generation and inference:
    - Exercise 2.4
    - Exercise 2.5
    - Example 2.6

### 3pm, Thursday 27th September, room 4.02 (Information Theory MacKay)

- Chapter 2 of MacKay (pages 34-48 of the pdf, pages 22-36 of the book), focusing on solutions:
  - Bayesian predictive distribution:
    - Exercise 2.8
  - Jensen's inequality (important for all versions of EM):
    - Exercise 2.14
    - Example 2.15
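
Since Jensen's inequality is flagged here as important for every version of EM, a small numerical check may help make the connection concrete. The sketch below is not taken from MacKay: the function name `elbo_and_log_evidence` and the numbers in `joint` are made up for illustration. It compares the log evidence, log Σ_z p(x, z), with the lower bound Σ_z q(z) log(p(x, z) / q(z)) that Jensen's inequality gives for the concave log function.

```python
import math

def elbo_and_log_evidence(joint, q):
    """Return (lower bound, log evidence) for one fixed observation x.

    joint[z] holds p(x, z) and q[z] is any distribution over z.
    Both are hypothetical inputs chosen for illustration.
    """
    log_evidence = math.log(sum(joint))
    # Jensen: log E_q[p/q] >= E_q[log(p/q)] because log is concave.
    bound = sum(qz * math.log(pz / qz) for pz, qz in zip(joint, q) if qz > 0)
    return bound, log_evidence

joint = [0.10, 0.30, 0.05]          # made-up values of p(x, z) for z = 0, 1, 2
uniform_q = [1 / 3, 1 / 3, 1 / 3]   # an arbitrary choice of q
loose_bound, log_ev = elbo_and_log_evidence(joint, uniform_q)

# Choosing q(z) = p(z | x) makes the bound tight.
posterior = [p / sum(joint) for p in joint]
tight_bound, _ = elbo_and_log_evidence(joint, posterior)

print(f"log evidence       = {log_ev:.4f}")
print(f"bound, uniform q   = {loose_bound:.4f}")
print(f"bound, posterior q = {tight_bound:.4f}")
```

The bound holds for any valid q and becomes an equality when q is the exact posterior over z, which is the property the E-step of EM relies on.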

### 3pm, Thursday 4th October, room 4.02 (Hypothesis Testing)

### 3pm, Thursday 18th October, room 4.02 (Variational EM)

bkj-VBwalkthrough.pdf: Variational EM

### Meetings in 2011

- Visit ML-for-NLP meetings 2011 for a list of meetings held in 2011.