Below is a list of my publications, including book chapters, journal articles, and refereed conference papers and workshop presentations. Also shown are non-refereed technical reports, my PhD thesis, and patents. Please see my Google Scholar profile for a full list of citations and co-authors.
Generalization Bounds
Mark D. Reid
To appear in the Encyclopedia of Machine Learning, November, 2010.
{ PDF (Pre-print) | BibTeX }
@incollection{Reid:2010a,
Author = {Reid, Mark D.},
Booktitle = {Encyclopedia of Machine Learning},
Editor = {Sammut, C. and Webb, G.},
Publisher = {Springer},
Title = {Generalization Bounds},
Volume = {XXVI},
Year = {2010}}
Squinting at a Sea of Dots: Visualising Australian Readerships using Statistical Machine Learning
Julieanne Lamond and Mark D. Reid
Resourceful Reading: The New Empiricism, eResearch and Australian Literary Culture
{ BibTeX }
@incollection{Lamond:2010,
Address = {Sydney},
Author = {Lamond, Julieanne V. and Reid, Mark D.},
Booktitle = {Resourceful Reading: The New Empiricism, eResearch and Australian Literary Culture},
Editor = {Bode, Katherine and Dixon, Robert},
Pages = {223--239},
Publisher = {Sydney University Press},
Title = {Squinting at a Sea of Dots: Visualising Australian Readerships using Statistical Machine Learning},
Year = {2010}}
Mixability is Bayes Risk Curvature Relative to Log Loss
Tim van Erven, Mark D. Reid and Robert C. Williamson
Journal of Machine Learning Research (Vol. 13)
{ PDF | JMLR | Abstract | BibTeX }
@article{van-Erven:2012,
Author = { {van Erven}, Tim and Reid, Mark D. and Williamson, Robert C.},
Journal = {Journal of Machine Learning Research},
Month = {May},
Pages = {1639--1663},
Title = {Mixability is Bayes Risk Curvature Relative to Log Loss},
Volume = {13},
Year = {2012}}
Information, Divergence and Risk for Binary Experiments
Mark D. Reid and Robert C. Williamson
Journal of Machine Learning Research (Vol. 12)
{ PDF | JMLR | Abstract | BibTeX }
@article{Reid:2011,
Author = {Reid, Mark D. and Williamson, Robert C.},
Journal = {Journal of Machine Learning Research},
Month = {March},
Pages = {731--817},
Title = {Information, Divergence and Risk for Binary Experiments},
Volume = {12},
Year = {2011}}
Composite Binary Losses
Mark D. Reid and Robert C. Williamson
Journal of Machine Learning Research (Vol. 11)
{ PDF | JMLR | Abstract | BibTeX }
@article{Reid:2009b,
Author = {Reid, M.D. and Williamson, R.C.},
Journal = {Journal of Machine Learning Research},
Month = {September},
Title = {Composite Binary Losses},
Volume = {11},
Year = {2010}}
Cross-training and its Application to Skill-Mining
Daniel Oblinger, Mark Reid, Mark Brodie, and Rodrigo de Salvo Braz.
IBM Systems Journal (Vol. 41, No. 3)
{ PDF | Abstract | BibTeX }
@article{Oblinger:2002,
Author = {Oblinger, Daniel and Reid, Mark D. and Brodie, Mark and {de Salvo Braz}, Rodrigo},
Journal = {IBM Systems Journal},
Number = {3},
Pages = {449--460},
Title = {Cross-training and its Application to Skill-Mining},
Volume = {41},
Year = {2002}}
Mixability in Statistical Learning
Tim van Erven, Peter Grünwald, Mark D. Reid, and Robert Williamson
Neural Information Processing Systems (NIPS 2012)
{ PDF | Abstract | BibTeX }
@conference{vanErven:2012,
Address = {Lake Tahoe, USA},
Author = {van Erven, Tim and Gr\"{u}nwald, Peter and Reid, Mark D. and Williamson, Robert C.},
Booktitle = {Proceedings of Neural Information Processing Systems},
Month = {December},
Title = {Mixability in Statistical Learning},
Year = {2012}}
Interpreting Prediction Markets: A Stochastic Approach
Rafael M. Frongillo, Nicolás Della Penna, and Mark D. Reid
Neural Information Processing Systems (NIPS 2012)
{ PDF | Abstract | BibTeX }
@conference{Frongillo:2012,
Address = {Lake Tahoe, USA},
Author = {Frongillo, Rafael and Della Penna, Nicol\'{a}s and Reid, Mark D.},
Booktitle = {Proceedings of Neural Information Processing Systems},
Month = {December},
Title = {Interpreting Prediction Markets: A Stochastic Approach},
Year = {2012}}
Tighter Variational Representations of f-Divergences via Restriction to Probability Measures
Avraham Ruderman, Dario García-García, James Petterson, and Mark D. Reid
International Conference on Machine Learning (ICML 2012)
{ PDF | Abstract | BibTeX | Discuss }
We show that the variational representations for f-divergences currently used in the literature can be tightened. This has implications for a number of recently proposed methods based on this representation. As an example application we use our tighter representation to derive a general f-divergence estimator based on two i.i.d. samples and derive the dual program for this estimator that performs well empirically. We also point out a connection between our estimator and MMD.
@conference{Ruderman:2012,
Address = {Edinburgh, Scotland},
Author = {Ruderman, Avraham and Garc{\'\i}a-Garc{\'\i}a, Dar{\'\i}o and Petterson, James and Reid, Mark D.},
Booktitle = {Proceedings of the International Conference on Machine Learning},
Month = {June},
Title = {Tighter Variational Representations of f-Divergences via Restriction to Probability Measures},
Year = {2012}}
The Convexity and Design of Composite Multiclass Losses
Mark D. Reid, Peng Sun, and Robert C. Williamson
International Conference on Machine Learning (ICML 2012)
{ PDF | Abstract | BibTeX | Discuss }
We consider composite loss functions for multiclass prediction comprising a proper (i.e., Fisher-consistent) loss over probability distributions and an inverse link function. We establish conditions for their (strong) convexity and explore their implications. We also show how the separation of concerns afforded by using this composite representation allows for the design of families of losses with the same Bayes risk.
@conference{Reid:2012,
Address = {Edinburgh, Scotland},
Author = {Reid, Mark D. and Williamson, Robert C. and Sun, Peng},
Booktitle = {Proceedings of the International Conference on Machine Learning},
Month = {June},
Title = {The Convexity and Design of Composite Multiclass Losses},
Year = {2012}}
AOSO-LogitBoost: Adaptive One-Vs-One LogitBoost for Multi-Class Problem
Peng Sun, Mark D. Reid, and Jie Zhou
International Conference on Machine Learning (ICML 2012)
{ PDF | Abstract | BibTeX | Discuss }
This paper is dedicated to the improvement of model learning in multi-class LogitBoost for classification. Motivated by the statistical view, LogitBoost can be seen as additive tree regression. Important facts in such a setting are 1) coupled classifier output as a sum-to-zero constraint and 2) a dense Hessian matrix arising in tree node split gain and node value fitting. On the one hand, the setting is too complicated for a tractable model learning algorithm; on the other hand, too aggressive a simplification of the setting may lead to degraded performance. For example, the original LogitBoost is outperformed by ABC-LogitBoost due to the latter's more careful treatment of the above two key points.
In this paper we propose improved methods to address the challenge: we adopt 1) a vector tree (i.e., each node value is a vector) that enforces the sum-to-zero constraint and 2) adaptive block coordinate descent that exploits the dense Hessian when computing tree split gain and node values. Higher classification accuracy and faster convergence are observed on a range of public data sets compared to both the original LogitBoost and ABC-LogitBoost.
@conference{Sun:2012,
Address = {Edinburgh, Scotland},
Author = {Sun, Peng and Reid, Mark D. and Zhou, Jie},
Booktitle = {Proceedings of the International Conference on Machine Learning},
Month = {June},
Title = {AOSO-LogitBoost: Adaptive One-Vs-One LogitBoost for Multi-Class Problem},
Year = {2012}}
Crowd & Prejudice: An Impossibility Theorem for Crowd Labelling without a Gold Standard
Nicolás Della Penna and Mark D. Reid
Collective Intelligence (CI 2012)
{ PDF | Abstract | BibTeX }
@inproceedings{DellaPenna:2012,
Author = {Della Penna, Nicol\'{a}s and Reid, Mark D.},
Booktitle = {Proceedings of Collective Intelligence (CI)},
Title = {Crowd \& Prejudice: An Impossibility Theorem for Crowd Labelling without a Gold Standard},
Year = {2012}}
Composite Multiclass Losses
Elodie Vernet, Robert C. Williamson, and Mark D. Reid
Neural Information Processing Systems (NIPS 2011)
{ PDF | Abstract | BibTeX }
@inproceedings{Vernet:2011,
Author = {Vernet, Elodie and Williamson, Robert C. and Reid, Mark D.},
Booktitle = {Proceedings of Neural Information Processing Systems (NIPS 2011)},
Title = {Composite Multiclass Losses},
Year = {2011}}
Mixability is Bayes Risk Curvature Relative to Log Loss
Tim van Erven, Mark D. Reid, and Robert C. Williamson
Conference on Learning Theory (COLT 2011)
{ Video | PDF | Abstract | BibTeX }
@inproceedings{van-Erven:2011,
Author = { {van Erven}, Tim and Reid, Mark D. and Williamson, Robert C.},
Booktitle = {Proceedings of the 24th Annual Conference on Learning Theory},
Title = {Mixability is Bayes Risk Curvature Relative to Log Loss},
Year = {2011}}
Convexity of Proper Composite Binary Losses
Mark D. Reid and Robert C. Williamson
International Conference on Artificial Intelligence and Statistics (AISTATS 2010)
{ PDF | Abstract | BibTeX }
Kernel Conditional Quantile Estimation via Reduction Revisited
Novi Quadrianto, Kristian Kersting, Mark Reid, Tiberio Caetano, and Wray Buntine
IEEE International Conference on Data Mining (ICDM 2009)
{ PDF | Abstract | BibTeX }
@inproceedings{Quadrianto:2009,
Author = {Quadrianto, Novi and Kersting, Kristian and Reid, Mark D. and Caetano, Tiberio and Buntine, Wray},
Booktitle = {Proceedings of the IEEE International Conference on Data Mining (ICDM)},
Title = {Kernel Conditional Quantile Estimation via Reduction Revisited},
Year = {2009}}
Generalised Pinsker Inequalities
Mark D. Reid and Robert C. Williamson
Conference on Learning Theory (COLT 2009)
{ PDF | Slides | Abstract | BibTeX }
@inproceedings{Reid:2009,
Author = {Reid, Mark D. and Williamson, Robert C.},
Booktitle = {Proceedings of the 22nd Annual Conference on Learning Theory},
Title = {Generalised Pinsker Inequalities},
Year = {2009}}
Surrogate Regret Bounds for Proper Losses
Mark D. Reid and Robert C. Williamson
International Conference on Machine Learning (ICML 2009)
{ PDF | Slides | Abstract | BibTeX }
@inproceedings{Reid:2009a,
Author = {Reid, Mark D. and Williamson, Robert C.},
Booktitle = {Proceedings of the International Conference on Machine Learning},
Pages = {897--904},
Title = {Surrogate Regret Bounds for Proper Losses},
Year = {2009}}
Improving Rule Evaluation Using Multitask Learning
Mark D. Reid
International Conference on Inductive Logic Programming (ILP 2004)
{ PDF | Slides | Abstract | BibTeX }
@inproceedings{Reid:2004,
Author = {Reid, Mark D.},
Booktitle = {Proceedings of the 14th International Conference on ILP},
Pages = {252--269},
Title = {Improving Rule Evaluation Using Multitask Learning},
Year = {2004}}
Using ILP to Improve Planning in Hierarchical Reinforcement Learning
Mark Reid and Malcolm Ryan
International Conference on Inductive Logic Programming (ILP 2000)
{ PDF | Abstract | BibTeX }
@inproceedings{Reid:2000,
Author = {Reid, Mark D. and Ryan, Malcolm},
Booktitle = {Proceedings of the 10th International Conference on ILP},
Pages = {174--190},
Title = {Using ILP to Improve Planning in Hierarchical Reinforcement Learning},
Year = {2000}}
Learning to Fly: An Application of Hierarchical Reinforcement Learning
Malcolm Ryan and Mark Reid
International Conference on Machine Learning (ICML 2000)
{ PDF | Abstract | BibTeX }
Hierarchical reinforcement learning promises to be the key to scaling reinforcement learning methods to large, complex, real-world problems. Many theoretical models have been proposed but so far there has been little in the way of empirical work published to demonstrate these claims.
In this paper we begin to fill this void by demonstrating the application of the RL-TOPs hierarchical reinforcement learning system to the problem of learning to control an aircraft in a flight simulator. We explain the steps needed to encode the background knowledge for this domain and present experimental data to show the success of this technique.
@inproceedings{Ryan:2000,
Author = {Ryan, Malcolm and Reid, Mark D.},
Booktitle = {Proceedings of the 17th International Conference on Machine Learning (ICML)},
Pages = {807--814},
Title = {Learning to Fly: An Application of Hierarchical Reinforcement Learning},
Year = {2000}}
NRMIS: A Noise Resistant Model Inference System
Eric McCreath and Mark Reid
Discovery Science (DS 1999)
{ PDF | Abstract | BibTeX }
@inproceedings{McCreath:1999,
Author = {McCreath, Eric and Reid, Mark D.},
Booktitle = {Discovery Science},
Pages = {252--263},
Title = {NRMIS: A Noise Resistant Model Inference System},
Year = {1999}}
Interpreting Prediction Markets: A Stochastic Approach
Rafael Frongillo, Nicolás Della Penna, and Mark D. Reid
Workshop on Markets, Mechanisms, and Multi-Agent Models at ICML 2012
{ PDF | Abstract }
Bandit Market Makers
Nicolás Della Penna and Mark D. Reid
Poster at the Second Workshop on Computational Social Science and the Wisdom of Crowds at NIPS 2011
{ PDF | Abstract }
Anatomy of a Learning Problem
Mark D. Reid, James Montgomery, and Mindika Premachandra
Talk at the Relations Between Machine Learning Problems Workshop at NIPS 2011.
{ PDF | Abstract | Slides | Video }
DEFT Guessing: Using Inductive Transfer to Improve Rule Evaluation from Limited Data
Mark D. Reid
School of Computer Science and Engineering, The University of New South Wales, Sydney, Australia.
{ PDF | Abstract | BibTeX }
Algorithms that learn sets of rules describing a concept from its examples have been widely studied in machine learning and have been applied to problems in medicine, molecular biology, planning and linguistics. Many of these algorithms use a separate-and-conquer strategy, repeatedly searching for rules that explain different parts of the example set. When examples are scarce, however, it is difficult for these algorithms to evaluate the relative quality of two or more rules that fit the examples equally well.
This dissertation proposes, implements and examines a general technique for modifying rule evaluation in order to improve learning performance in these situations. This approach, called Description-based Evaluation Function Transfer (Deft), adjusts the way rules are evaluated on a target concept by taking into account the performance of similar rules on a related support task that is supplied by a domain expert. Central to this approach is a novel theory of task similarity that is defined in terms of syntactic properties of rules, called descriptions, which define what it means for rules to be similar. Each description is associated with a prior distribution over classification probabilities derived from the support examples and a rule’s evaluation on a target task is combined with the relevant prior using Bayes’ rule. Given some natural conditions regarding the similarity of the target and support task, it is shown that modifying rule evaluation in this way is guaranteed to improve estimates of the true classification probabilities.
Algorithms to efficiently implement Deft are described, analysed and used to measure the effect these improvements have on the quality of induced theories. Empirical studies of this implementation were carried out on two artificial and two real-world domains. The results show that the inductive transfer of evaluation bias based on rule similarity is an effective and practical way to improve learning when training examples are limited.
@phdthesis{Reid:2007,
Address = {Sydney, Australia},
Author = {Reid, Mark D.},
School = {University of New South Wales},
Title = {DEFT Guessing: Using Inductive Transfer to Improve Rule Evaluation from Limited Data},
Year = {2007}}
PSI Draft Specification
Mark D. Reid, James Montgomery, and Barry Drake
{ Project Site | Specification | Abstract }
The Protocols and Structures for Inference (PSI) project aims to develop an architecture for presenting machine learning algorithms, their inputs (data) and outputs (predictors) as resource-oriented RESTful web services in order to make machine learning technology accessible to a broader range of people than just machine learning researchers.
Currently, many machine learning implementations (e.g., in toolkits such as Weka, Orange, Elefant, Shogun, SciKit.Learn, etc.) are tied to specific choices of programming language, and data sets to particular formats (e.g., CSV, svmlight, ARFF). This limits their accessibility, since new users may have to learn a new programming language to run a learner or write a parser for a new data format, and their interoperability, since data format converters and multiple language platforms are required. To address these limitations, the aim of the PSI service architecture is to present the main inferential entities – relations, attributes, learners, and predictors – as web resources that are accessible via a common interface. By enforcing a consistent interface for the entities involved in learning, interoperability is improved and irrelevant implementation details can be hidden to promote accessibility.
Conditional Random Fields and Support Vector Machines: A Hybrid Approach
Qinfeng Shi, Mark D. Reid, and Tiberio Caetano
arXiv:1009.3346 [cs.LG]
{ arXiv | Abstract | BibTeX }
We propose a novel hybrid loss for multiclass and structured prediction problems that is a convex combination of log loss for Conditional Random Fields (CRFs) and a multiclass hinge loss for Support Vector Machines (SVMs). We provide a sufficient condition for when the hybrid loss is Fisher consistent for classification. This condition depends on a measure of dominance between labels - specifically, the gap in per observation probabilities between the most likely labels. We also prove Fisher consistency is necessary for parametric consistency when learning models such as CRFs.
We demonstrate empirically that the hybrid loss typically performs at least as well as - and often better than - both of its constituent losses on a variety of tasks. In doing so we also provide an empirical comparison of the efficacy of probabilistic and margin based approaches to multiclass and structured prediction and the effects of label dominance on these results.
@misc{Shi:2010,
Author = {Shi, Qinfeng and Reid, Mark D. and Caetano, Tib{\'e}rio S.},
Howpublished = {arXiv:1009.3346v1 [cs.LG]},
Month = {September},
Title = {Conditional Random Fields and Support Vector Machines: A Hybrid Approach},
Year = {2010}}
Information, Divergence and Risk for Binary Experiments
Mark D. Reid and Robert C. Williamson
arXiv:0901.0356v1 [stat.ML]
{ arXiv | Abstract | BibTeX }
@misc{Reid:2009c,
Author = {Reid, Mark D. and Williamson, Robert C.},
Howpublished = {arXiv:0901.0356v1 [stat.ML]},
Month = {January},
Title = {Information, Divergence and Risk for Binary Experiments},
Year = {2009}}
Determining Page Complexity
Barry James Drake and Mark Darren Reid
Australian Application Number: 2006252174
{ Entry }