Musings about machine learning and other things

Posts

Showing posts from April, 2007

LaTeX support

- April 17, 2007

I added the following two lines to the header section of this page's template: <script type="text/javascript" src="http://www.maths.nottingham.ac.uk/personal/drw/LaTeXMathML.js"> </script> Suddenly, I can type (almost) anything in LaTeX, e.g. \$a_n\$ becomes $a_n$. Fancy! (If you do not see anything fancy, then either JavaScript is disabled in your browser or you are using Internet Explorer without MathML support. In the latter case you may want to download MathPlayer by Design Science.) Many thanks to Peter Jipsen and the folks who developed ASCIIMathML, which serves as the basis of LaTeXMathML by Douglas R. Woodall. Examples showing what is possible with LaTeXMathML can be found here. This is an indispensable tool! The nice thing is that MathML is scalable: $E=m c^2$

The Fastest Mixing Markov Chain on a Graph

- April 16, 2007

The paper can be found here. The authors are Stephen Boyd, Persi Diaconis and Lin Xiao. I found the paper while looking through the papers of Persi Diaconis, a notable mathematician and magician. The paper is about exactly what the title suggests: You are given a finite graph, and you set up a random walk on it by choosing the transition probabilities between vertices that are connected by an edge. The transition matrix must be symmetric, so that the uniform distribution is a stationary distribution of the walk. Assuming that the associated (symmetric) Markov chain is irreducible and aperiodic, the state distribution converges to the uniform distribution. The task is to maximize the rate of this convergence. The solution is the Fastest Mixing Markov Chain (FMMC) on the graph. The authors show that finding it is a convex optimization problem and give a polynomial-time algorithm (based on semidefinite programming) to solve it. They also give a subgradient method that can be more effective for larger graphs.
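To make the setup concrete, here is a minimal Python sketch; it is not the paper's algorithm. It builds a symmetric walk from per-edge probabilities (putting the leftover mass at each vertex on a self-loop is an assumption of this sketch) and measures the total variation distance to the uniform distribution after a number of steps. The 4-vertex graph, the two edge-probability choices, and the names `symmetric_walk` and `tv_to_uniform` are made up for illustration.

```python
def symmetric_walk(n, edge_probs):
    """Build a symmetric transition matrix from per-edge probabilities.

    edge_probs maps an undirected edge (i, j) to the probability of moving
    along it in either direction. Whatever mass is left at a vertex stays
    there as a self-loop (an assumption of this sketch).
    """
    P = [[0.0] * n for _ in range(n)]
    for (i, j), p in edge_probs.items():
        P[i][j] = P[j][i] = p
    for i in range(n):
        stay = 1.0 - sum(P[i])
        if stay < -1e-12:
            raise ValueError(f"edge probabilities at vertex {i} exceed 1")
        P[i][i] = max(stay, 0.0)
    return P


def tv_to_uniform(P, start, steps):
    """Total variation distance to uniform after `steps` steps from `start`."""
    n = len(P)
    dist = [0.0] * n
    dist[start] = 1.0
    for _ in range(steps):
        # One step of the chain: new_dist[j] = sum_i dist[i] * P[i][j].
        dist = [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]
    return 0.5 * sum(abs(p - 1.0 / n) for p in dist)


# Hypothetical example graph: a triangle (1, 2, 3) with a pendant vertex 0.
EDGES = [(0, 1), (1, 2), (1, 3), (2, 3)]

# Two valid symmetric walks on the same graph, differing only in edge weights.
fast = symmetric_walk(4, {e: 1.0 / 3.0 for e in EDGES})
slow = symmetric_walk(4, {e: 0.1 for e in EDGES})
```

Comparing `tv_to_uniform(fast, 0, 20)` with `tv_to_uniform(slow, 0, 20)` shows how much the choice of edge probabilities matters. The FMMC problem asks for the choice with the best asymptotic rate.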

The Loss Rank Principle by Marcus Hutter

- April 15, 2007

I found the paper posted by Marcus Hutter on arXiv quite interesting. The paper is about model (or rather predictor) selection. The idea is a familiar one, but the details appear to be novel: You want to find a model that yields a small loss on the available dataset, while yielding a larger loss on most other datasets. Classification: The simplest case is supervised learning with a finite target set. Then you can count the number of target label variations for which the predictor's loss is smaller than its loss when the true targets are used. This idea sounds very similar to the way Rademacher complexity works, see e.g. the paper of Lugosi and Wegkamp, where a localized version of Rademacher complexity is investigated. Regression: For continuous targets you can use a grid with increasing resolution (assuming that the range of the targets is bounded) and count the number of grid points for which the predictor's loss is less than its loss on the true dataset.
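The counting idea for classification can be sketched in a few lines of Python. This is only one reading of it, under stated assumptions: the predictor is refit on each relabeling, the loss is the number of training errors, and ties count (loss less than or equal to the loss on the true labels). The k-nearest-neighbour predictor, the toy data, and the names `knn_training_loss` and `loss_rank` are illustrative and not taken from the paper.

```python
from itertools import product


def knn_training_loss(xs, ys, k):
    """Number of training points misclassified by k-NN fit on (xs, ys).

    Each point counts as its own nearest neighbour. Labels are 0/1 and k
    should be odd so that the majority vote has no ties.
    """
    errors = 0
    for x, y in zip(xs, ys):
        nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x))[:k]
        votes = sum(ys[i] for i in nearest)
        prediction = 1 if 2 * votes > k else 0
        errors += prediction != y
    return errors


def loss_rank(xs, ys, k):
    """Count the labelings whose refit loss is at most the loss on ys."""
    true_loss = knn_training_loss(xs, ys, k)
    return sum(
        knn_training_loss(xs, labels, k) <= true_loss
        for labels in product((0, 1), repeat=len(xs))
    )


# Hypothetical toy data: six points on a line with binary labels.
XS = [0.0, 1.0, 2.1, 3.3, 4.6, 6.0]
YS = [0, 0, 0, 1, 1, 1]

rank_k1 = loss_rank(XS, YS, 1)  # very flexible predictor
rank_k3 = loss_rank(XS, YS, 3)  # smoother predictor
```

With k = 1 every relabeling is fit perfectly, so the count is all 2^6 = 64 labelings: the predictor is too flexible to single out the observed data. With k = 3 fewer relabelings reach a loss as small as the true one, and a small count is what the principle looks for.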

Why this Blog??

- April 14, 2007

I am struggling with organizing the notes I make after lectures or after reading a paper. Hence I will experiment with this fancy way of keeping track of my thoughts. Of course, I will be happy to receive feedback from the occasional reader. We will see how well it goes!
