TY - GEN
T1 - Conditional mean embeddings as regressors
AU - Grünewälder, S.
AU - Lever, G.
AU - Gretton, A.
AU - Baldassarre, L.
AU - Patterson, S.
AU - Pontil, M.
PY - 2012
Y1 - 2012
N2 - We demonstrate an equivalence between reproducing kernel Hilbert space (RKHS) embeddings of conditional distributions and vector-valued regressors. This connection introduces a natural regularized loss function which the RKHS embeddings minimise, providing an intuitive understanding of the embeddings and a justification for their use. Furthermore, the equivalence allows the application of vector-valued regression methods and results to the problem of learning conditional distributions. Using this link we derive a sparse version of the embedding by considering alternative formulations. Further, by applying convergence results for vector-valued regression to the embedding problem we derive minimax convergence rates which are O(log(n)/n) – compared to current state of the art rates of O(n^{-1/4}) – and are valid under milder and more intuitive assumptions. These minimax upper rates coincide with lower rates up to a logarithmic factor, showing that the embedding method achieves nearly optimal rates. We study our sparse embedding algorithm in a reinforcement learning task where the algorithm shows significant improvement in sparsity over an incomplete Cholesky decomposition.
AB - We demonstrate an equivalence between reproducing kernel Hilbert space (RKHS) embeddings of conditional distributions and vector-valued regressors. This connection introduces a natural regularized loss function which the RKHS embeddings minimise, providing an intuitive understanding of the embeddings and a justification for their use. Furthermore, the equivalence allows the application of vector-valued regression methods and results to the problem of learning conditional distributions. Using this link we derive a sparse version of the embedding by considering alternative formulations. Further, by applying convergence results for vector-valued regression to the embedding problem we derive minimax convergence rates which are O(log(n)/n) – compared to current state of the art rates of O(n^{-1/4}) – and are valid under milder and more intuitive assumptions. These minimax upper rates coincide with lower rates up to a logarithmic factor, showing that the embedding method achieves nearly optimal rates. We study our sparse embedding algorithm in a reinforcement learning task where the algorithm shows significant improvement in sparsity over an incomplete Cholesky decomposition.
M3 - Conference contribution/Paper
BT - Proceedings of the 29th International Conference on Machine Learning, Edinburgh, Scotland, 2012
ER -
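
The abstract's central equivalence says the conditional mean embedding of P(Y | X = x) minimises a regularised vector-valued least-squares loss, so the estimate reduces in practice to kernel ridge regression weights: alpha(x) = (K + n*lambda*I)^{-1} k_x, with the embedding mu(x) = sum_i alpha_i(x) phi(y_i). A minimal numpy sketch under that reading; the Gaussian kernel, function names, and data below are illustrative assumptions, not taken from the paper:

import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    # Gram matrix k(a, b) = exp(-||a - b||^2 / (2 sigma^2))
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2))

def cme_weights(X, x_test, lam=1e-3, sigma=1.0):
    # alpha(x) = (K + n*lam*I)^{-1} k_x; the embedding of P(Y | X = x)
    # is then mu(x) = sum_i alpha_i(x) phi(y_i).
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    k_x = rbf_kernel(X, x_test, sigma)
    return np.linalg.solve(K + n * lam * np.eye(n), k_x)

# Usage: estimate E[g(Y) | X = x] as alpha(x)^T g(Y_sample).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
Y = np.sin(3 * X) + 0.1 * rng.normal(size=(200, 1))
alpha = cme_weights(X, np.array([[0.5]]))
print(float(alpha[:, 0] @ Y[:, 0]))  # approx E[Y | X = 0.5] ~ sin(1.5)

This sketch is the dense estimator only; the sparse variant and the incomplete Cholesky comparison discussed in the abstract are not reproduced here.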