---
title: "The change of variables of formula for probability density functions"
date: "2015-12-25"
output:
html_document:
keep_md: yes
---
```{r setup0, echo=FALSE}
knitr::opts_chunk$set(fig.path="assets/fig/ChangeOfVariables-")
source('assets/Rfunctions/polyCurve_v1.R', encoding='UTF-8')
source('assets/Rfunctions/giftools.R', encoding='UTF-8')
library(scales) # to use alpha()
```
```{r parameters, include=FALSE}
sigma <- 0.5
mu <- 4*sigma
rgba <- col2rgb("gray")
tgray <- rgb(rgba[1,1], rgba[2,1], rgba[3,1], alpha=95, maxColorValue=255)
rgba <- col2rgb("pink")
tpink <- rgb(rgba[1,1], rgba[2,1], rgba[3,1], alpha=95, maxColorValue=255)
cex.axis <- 1.2
lwd.axis <- 2
ats <- sigma*seq(-4,4,by=2)+mu
labs <- c(expression(mu-4*sigma), expression(mu-2*sigma), expression(mu), expression(mu+2*sigma), expression(mu+4*sigma))
M <- exp(mu+1.8)
knorm <- 16; klnorm <- knorm # scales
cex.explabel <- 1.3
giffile0 <- "lognormal.gif"
giffile <- paste0("./assets/gif/", giffile0)
```
"How to derive the *pdf* of the random variable $Y=h(X)$ when one knows the *pdf* of the random variable $X$?". This question very frequently occurs on the internet.
For a general function $h$, there is no direct formula to get the *pdf* of the random variable $Y=h(X)$ knowing the *pdf* of $X$ (assuming $X$ has a *pdf*). There is a formula in case when $h$ is a *differentiable one-to-one mapping* from the range (the *support*, I should say) of $X$ to the range of $Y$.
Take for example a random variable $X \sim {\cal N}(\mu, \sigma^2)$ and set $Y=\exp(X)$.
The animation below shows some simulations of $X$ and the corresponding values of $Y$. The density of $X$ is shown in blue and the one of $Y$ is shown in orange in the vertical direction.
```{r animation, eval=!file.exists(giffile), message=FALSE, echo=FALSE, results='hide'}
library(animation)
nsims <- 50
saveGIF({
for (i in 1:nsims) {
par(mar=c(3, 1, 0.1, 1))
plot(0, 0, type = "n", lwd=4, col="blue", xlim=c(-2, mu+4*sigma), ylim=c(0, M),
ylab=NA, xlab=NA, axes=FALSE, yaxs="i")
axis(1, at=ats, labels=labs, col="grey", lwd=lwd.axis, cex.axis=cex.axis)
abline(v=0, col="grey", lwd=lwd.axis)
curve(knorm*dnorm(x,mu,sigma), add=TRUE, lwd=4, col="blue", from=-0.1)
curve(exp(x), col="green", add=TRUE, lwd=3)
text(x=mu+2.6*sigma, y=exp(mu+3.5*sigma),"exp(x)", cex=cex.explabel, col="green")
x <- seq(0, M, length=100)
y <- klnorm*dlnorm(x,mu,sigma)
lines(-y,x, lwd=4, col="darkorange")
# simulations
set.seed(31415)
simulation <- rnorm(i,mu,sigma)
#
cexpoints <- rep(1,i)
cexpoints[i] <- 1.8
cols <- alpha(rep("red", i), 0.5)
cols[i] <- alpha("red",1)
points(x=simulation, y=rep(0,i), col=cols, pch=19, cex=cexpoints, xpd=TRUE)
points(x=rep(0,i), y=exp(simulation), col=cols, pch=19, cex=cexpoints)
x0 <- simulation[i]
segments(x0,0,x0,exp(x0), col="green", lty=2)
segments(x0,exp(x0),0,exp(x0), col="green", lty=2)
}
}, movie.name = giffile0, interval = 1,
ani.width = 600, ani.height = 370, autobrowse=FALSE, loop=0
)
gif_compress(giffile0, giffile, extra.opts="--colors 256")
file.remove(giffile0)
```
```{r, results='asis', echo=FALSE}
cat(sprintf('
', giffile))
```
Now the question is: knowing the density $f_{\textrm{blue}}$ of $X$, what is the density $f_{\textrm{orange}}$ of $Y$ ?
Taking a point $y$ in the range of $Y$, the density $f_{\textrm{orange}}$ provides the probability that $Y$ belongs to a small area $\mathrm{d}y$ around $y$ by the formula
$$
\Pr(Y \in \mathrm{d}y) \approx f_{\textrm{orange}}(y)|\mathrm{d}y|
$$
where $|\mathrm{d}y|$ denotes the length of the small interval $\mathrm{d}y$. This formula is not a rigorous one, but it allows to do exact derivations when it is carefully used. The probability $\Pr(Y \in \mathrm{d}y)$ is the pink area on this figure:
```{r figure, fig.width=6, fig.height=3.7, fig.align='center', echo=FALSE}
x0 <- 14; x1 <- 18 # shaded area
par(mar=c(1.5, 1, 0.1, 1))
# normal density
xx <- seq(mu-4*sigma, mu+4*sigma, length.out = 20)
yy <- knorm*dnorm(xx,mu,sigma)
plot(xx, yy, type = "n", lwd=4, col="blue", xlim=c(-2, mu+4*sigma), ylim=c(0, M),
panel.first =
polyCurve(xx, yy, from = log(x0), to = log(x1),
col = tgray, border = tgray),
ylab=NA, xlab=NA, axes=FALSE, yaxs="i")
curve(exp(x), col="green", add=TRUE, lwd=3)
text(x=mu+2.6*sigma,y=exp(mu+3.5*sigma),"exp(x)", cex=cex.explabel, col="green")
curve(knorm*dnorm(x,mu,sigma), add=TRUE, lwd=4, col="blue", from=-0.1)
#axis(1, at=ats, labels=labs, col="grey", lwd=lwd.axis, cex.axis=cex.axis)
abline(v=0, col="grey", lwd=lwd.axis)
abline(h=0, col="grey", lwd=lwd.axis)
# lognormal density
x <- seq(0, M, length=100)
y <- klnorm*dlnorm(x,mu,sigma)
y0 <- klnorm*dlnorm(x0,mu,sigma); y1 <- klnorm*dlnorm(x1,mu,sigma)
xx <- seq(-y0, 0, length.out = 30)
F <- approxfun(-klnorm*dlnorm(seq(x0,x1, length.out = 20),mu,sigma), seq(x0,x1, length.out = 20))
yy <- F(pmin(xx, -y1))
lines(-y,x, lwd=4, col="darkorange",
panel.first =
polyCurve(xx, yy, from = min(xx), to = max(xx),
col = tpink, border = tpink))
segments(xx[1], yy[1], log(x0), x0, lty="dashed")
segments(-klnorm*dlnorm(x1, mu,sigma), x1, log(x1), x1, lty="dashed")
segments(log(x0), x0, log(x0), 0, lty="dashed")
segments(log(x1), x1, log(x1), 0, lty="dashed")
#
axis(1, at=mean(c(log(x0),log(x1))), labels=expression(italic(x)), col="grey", lwd=lwd.axis, cex.axis=cex.axis, padj=-1.3)
axis(2, pos=0, at=mean(c(x0,x1)), labels=NA, col="grey", lwd=lwd.axis, cex.axis=cex.axis, tcl=0.35)
text(x=0, y=.97*mean(c(x0,x1)), labels=expression(italic(y)), pos=4, offset=0.6)
```
This probability also equals the probability $\Pr(X \in \mathrm{d}x)$, shown by the grey area below the blue curve, where $x=\log(y)$ because of $y=\exp(x)$, and $\mathrm{d}x$ is the small interval around $x$. And one similarly has
$$
\Pr(X \in \mathrm{d}x) \approx f_{\textrm{blue}}(x)|\mathrm{d}x|.
$$
It is clear that $|\mathrm{d}x| \neq |\mathrm{d}y|$. Recall that these two lengths are very small, hence the green function - now let us call it $h$ instead of $\exp$ - is like a segment on the interval $\mathrm{d}x$, and the slope of this segment is the value $h'(x)$ of the derivative of $h$ at $x$. Therefore $|\mathrm{d}y| \approx h'(x)|\mathrm{d}x|$, and we finally get
$$
\Pr(Y \in \mathrm{d}y) = \Pr(X \in \mathrm{d}x) \approx f_{\textrm{blue}}(x)\frac{|\mathrm{d}y|}{h'(x)}.
$$
Expressing the right-hand side in terms of $y=h(x)$ instead of $x=h^{-1}(y)$, this gives
$$
\Pr(Y \in \mathrm{d}y) \approx f_{\textrm{blue}}\bigl(h^{-1}(y)\bigr)\frac{|\mathrm{d}y|}{h'\bigl(h^{-1}(y)\bigr)},
$$
or, because of $\frac{1}{h'\bigl(h^{-1}(y)\bigr)}={(h^{-1})}'(y)$, this can be written
$$
\Pr(Y \in \mathrm{d}y) \approx {(h^{-1})}'(y)\times f_{\textrm{blue}}\bigl(h^{-1}(y)\bigr)|\mathrm{d}y|.
$$
By identifying this formula by the one defining the density of $Y$:
$$
\Pr(Y \in \mathrm{d}y) \approx f_{\textrm{orange}}(y)|\mathrm{d}y|,
$$
we finally get
$$
f_{\textrm{orange}}(y) = {(h^{-1})}'(y)\times f_{\textrm{blue}}\bigl(h^{-1}(y)\bigr).
$$
This is the so-called *change of variables formula*.
Be careful about one point: this formula is not correct in general. In my example, the factor $k$ relating $|\mathrm{d}x|$ and $|\mathrm{d}y|$ by $|\mathrm{d}y| \approx k|\mathrm{d}x|$ is $k = h'(x)$ because $h'(x)>0$ in this example ($h$ is increasing), and one has to take $k=-h'(x)$ in the case $h'(x)<0$. The general formula includes the absolute value:
$$
\boxed{f_{\textrm{orange}}(y) = \bigl|{(h^{-1})}'(y)\bigr|\times f_{\textrm{blue}}\bigl(h^{-1}(y)\bigr)}.
$$