--- title: "The `kantorovich` package" date: "2016-01-11" output: html_document --- ```{r setup0, include=FALSE} knitr::opts_chunk$set(collapse=TRUE) ``` I have just released a first version of the `kantorovich` package on [Github](https://github.com/stla/kantorovich). It is based on the code of my post [Using R to compute the Kantorovich distance](http://stla.github.io/stlapblog/posts/KantorovichWithR.html). This package has two main features: * It computes the extreme joinings of two probability measures $\mu$ and $\nu$ on a finite set; * It computes the Kantorovich distance between these two measures, for a given distance on their finite state space. With the help of the `rccd` and `gmp` packages, the `kantorovich` package can return the *exact* values of the extreme joinings and of the Kantorovich distance. ## Quick example As an example, take $\mu$ and $\nu$ the uniform probability measures on a finite set having three elements. ```{r} mu <- nu <- c(1/3, 1/3, 1/3) ``` The `ejoinings` function returns the extreme joinings of $\mu$ and $\nu$. In this case these are the $6!$ permutation matrices: ```{r} library(kantorovich) ejoinings(mu, nu) ``` Since `mu` and `nu` were unnamed, the vector names `c(1,2,3)` has been automatically assigned to them. The Kantorovich distance between $\mu$ and $\nu$ is relative to a given distance on the state space of $\mu$ and $\nu$, represented by their vector names. By default, the `kantorovich` package takes the discrete $0\mathrm{-}1$ distance. Obviously the Kantorovich distance is $0$ on this example, because $\mu=\nu$. ```{r} kantorovich(mu, nu) ``` Note the message returned by both the `ejoinings` and the `kantorovich` functions. In order to get exact results, use rational numbers with the `gmp` package: ```{r, message=FALSE, warning=FALSE} library(gmp) mu <- nu <- as.bigq(c(1,1,1), c(3,3,3)) # shorter: as.bigq(c(1,1,1), 3) ejoinings(mu, nu) ``` ## User-specified distance Let us try an example with a user-specified distance. Let's say that the state space of $\mu$ and $\nu$ is $\{a, b, c\}$, and then we use `c("a","b","c")` as the vector names. ```{r} mu <- as.bigq(c(1,2,4), 7) nu <- as.bigq(c(3,1,5), 9) names(mu) <- names(nu) <- c("a", "b", "c") ``` ### Define distance as a matrix The distance can be specified as a matrix. Assume the distance $\rho$ is given by $\rho(a,b)=1$, $\rho(a,c)=2$ and $\rho(b,c)=4$. The `bigq` matrices offered by the `gmp` package do not handle dimension names. But, in our example, the distance $\rho$ takes only integer values, therefore one can use a numerical matrix: ```{r} M <- matrix( c( c(0, 1, 2), c(1, 0, 4), c(2, 4, 0) ), byrow = TRUE, nrow = 3, dimnames = list(c("a","b","c"), c("a","b","c"))) kantorovich(mu, nu, dist=M) ``` If the distance takes rational values, one can proceed as before with a character matrix: ```{r} M <- matrix( c( c("0", "3/13", "2/13"), c("1/13", "0", "4/13"), c("2/13", "4/13", "0") ), byrow = TRUE, nrow = 3, dimnames = list(c("a","b","c"), c("a","b","c"))) kantorovich(mu, nu, dist=M) ``` ### Define distance as a function One can enter the distance as a function. In such an example, this does not sound convenient: ```{r} rho <- function(x,y){ if(x==y) { return(0) } else { if(x=="a" && y=="b") return(1) if(x=="a" && y=="c") return(2) if(x=="b" && y=="c") return(4) return(rho(y,x)) } } kantorovich(mu, nu, dist=rho) ``` Using a function could be more convenient in the case when the names are numbers: ```{r} names(mu) <- names(nu) <- 1:3 ``` But one has to be aware that there are in character mode: ```{r} names(mu) ``` Thus, one can define a distance function as follows, for example with $\rho(x,y)=\frac{x-y}{x+y}$: ```{r} rho <- function(x,y){ x <- as.numeric(x); y <- as.numeric(y) return(as.bigq(x-y, x+y)) } kantorovich(mu, nu, dist=rho) ``` ## A non-square example The `kantorovich` package also handles the case when `mu` and `nu` have different lengths, such as this example: ```{r} mu <- as.bigq(c(1,2,4), 7) nu <- as.bigq(c(3,1), 4) names(mu) <- c("a", "b", "c") names(nu) <- c("b", "c") ejoinings(mu, nu) kantorovich(mu, nu) ```