Posts

Showing posts from May, 2007

Constrained MDPs and the reward hypothesis

It's been a looong ago that I posted on this blog. But this should not mean the blog is dead. Slow and steady wins the race, right? Anyhow, I am back and today I want to write about constrained Markovian Decision Process (CMDPs). The post is prompted by a recent visit of Eugene Feinberg , a pioneer of CMDPs, of our department, and also by a growing interest in CMPDs in the RL community (see this , this , or this paper). For impatient readers, a CMDP is like an MDP except that there are multiple reward functions, one of which is used to set the optimization objective, while the others are used to restrict what policies can do. Now, it seems to me that more often than not the problems we want to solve are easiest to specify using multiple objectives (in fact, this is a borderline tautology!). An example, which given our current sad situation is hard to escape, is deciding what interventions a government should apply to limit the spread of a virus while maintaining economic

Minimize!

A matlab code for minimizing a multivariate function whose partial derivatives are available by Carl Rasmussen is downloadable from here . The routine looks pretty efficient, at least on the classical Rosenbrock function.

A notion of function compression

The following compressibility concept is introduced by Harnik and Naor in their recent paper : Given a function $f$ over some domain, a compression algorithm for $f$ should efficiently compress an input $x$ in a way that will preserve the information needed to compute $f(x)$. Note that if $f$ is available (and efficiently computable) then compression is trivial as then $y=f(x)$ will serve the purpose of the compact representation of $x$. Actually, the concept was originally studied in the framework of NP decision problems where unless $P=NP$ $f$ is not efficiently computable, hence the trivial solution is not available. I am wondering if this compressibility notion could be used in learning theory or function approximation? Consider e.g. classification problems so that the output of $f$ is $\{0,1\}$. In order to prevent the trivial solution we may require that $x$ be compressed to some $y$ such that for some fixed (efficiently computable) function $g$, $f(g(y))=f(x)$. We do not requir