--- title: "Calling a Haskell function in R - a float expansion example" date: "2016-08-11" output: html_document: highlight: pygments --- <style type="text/css"> /* background code blocks */ ol.linenums li { background-color: #f5f5f5; } /* font size code blocks */ pre code { background-color: #f5f5f5; } /* inline code */ code { background-color: #f5f5f5; } </style> <style type="text/css"> a:link { text-decoration: underline; -webkit-text-decoration-color: yellow; -moz-text-decoration-color: yellow; text-decoration-color: yellow; border-bottom: yellow 0px solid } u { border-bottom: yellow; } </style> <style type="text/css">code{white-space: pre;}</style> <style type="text/css"> div.sourceCode, pre { overflow-x: auto; } table.sourceCode, tr.sourceCode, td.lineNumbers, td.sourceCode { margin: 0; padding: 0; vertical-align: baseline; border: none; } table.sourceCode { width: 100%; line-height: 100%; } td.lineNumbers { text-align: right; padding-right: 4px; padding-left: 4px; color: #aaaaaa; border-right: 1px solid #aaaaaa; } td.sourceCode { padding-left: 5px; } code > span.kw { color: #007020; font-weight: bold; } /* Keyword */ code > span.dt { color: #902000; } /* DataType */ code > span.dv { color: #40a070; } /* DecVal */ code > span.bn { color: #40a070; } /* BaseN */ code > span.fl { color: #40a070; } /* Float */ code > span.ch { color: #4070a0; } /* Char */ code > span.st { color: #4070a0; } /* String */ code > span.co { color: #60a0b0; font-style: italic; } /* Comment */ code > span.ot { color: #007020; } /* Other */ code > span.al { color: #ff0000; font-weight: bold; } /* Alert */ code > span.fu { color: #06287e; } /* Function */ code > span.er { color: #ff0000; font-weight: bold; } /* Error */ code > span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */ code > span.cn { color: #880000; } /* Constant */ code > span.sc { color: #4070a0; } /* SpecialChar */ code > span.vs { color: #4070a0; } /* VerbatimString */ code > span.ss { color: #bb6688; } /* SpecialString */ code > span.im { } /* Import */ code > span.va { color: #19177c; } /* Variable */ code > span.cf { color: #007020; font-weight: bold; } /* ControlFlow */ code > span.op { color: #666666; } /* Operator */ code > span.bu { } /* BuiltIn */ code > span.ex { } /* Extension */ code > span.pp { color: #bc7a00; } /* Preprocessor */ code > span.at { color: #7d9029; } /* Attribute */ code > span.do { color: #ba2121; font-style: italic; } /* Documentation */ code > span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */ code > span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */ code > span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */ </style> ***(latest update : `r Sys.time()`)*** <br/> ```{r setup, include=FALSE} # use highlight: pygments knitr::opts_chunk$set(echo = TRUE, collapse=TRUE) ghc.path <- "/opt/ghc/7.10.3/bin" Sys.setenv(PATH = paste0(Sys.getenv()["PATH"], ":", ghc.path)) ``` In [the previous article](http://stla.github.io/stlapblog/posts/DyadicExpansion.html), I wrote a R function returning the binary expansion of a real number $u \in [0,1]$. In the present article, I will: - write a similar function in Haskell; - write this function in a way compatible with R, inside a module; - compile this module in a dynamic linker suitable for R (`dll` with Windows, `so` with Linux); - call the function from R through the dynamic linker. The creation of a Haskell function compatible with R is allowed by the [Foreign Function Interface (FFI)](https://wiki.haskell.org/Foreign_Function_Interface), in other words the `Foreign` module. I learnt how to do such things with the help of [this blog post by Neil Mitchell](http://neilmitchell.blogspot.be/2011/10/calling-haskell-from-r.html). ## Binary (and more) expansion in Haskell Let's go to Haskell. The `floatExpansion` function below is obtained by a small modification of the `floatToDigits` function of the `Numeric` package. It returns the expansion of a real number $u \in [0,1]$ in a given integer base. ```{r, engine='haskell', engine.path='ghc'} import Numeric let floatExpansion :: RealFloat a => Integer -> a -> [Int]; floatExpansion base u = replicate (- snd expansion) 0 ++ fst expansion where expansion = floatToDigits base u floatExpansion 2 0.125 ``` ## First dynamic linker: string output Firstly, I show how to make this function compatible with R when its output is a string instead of a list. It is easy to convert a list to a string in Haskell: ```{r, engine='haskell', engine.path='ghc'} show [0, 0, 1] ``` To get the output as a vector in R, more work is needed, and I will do it in the next section. ### Make the function compatible with R To make the function compatible with R, there are two rules: - Every argument must be a pointer (`Ptr`) to a C compatible type: `CInt`, `CDouble` or `CString`. - The result must be `IO ()`. A value of type `Ptr` represents a pointer to an object. This type is provided by the [`Foreign.Ptr` module](https://hackage.haskell.org/package/base-4.9.0.0/docs/Foreign-Ptr.html), which is imported via the `Foreign` module. The types `CInt`, `CDouble` and `CString` are provided by the [`Foreign.C` module](https://hackage.haskell.org/package/base-4.9.0.0/docs/Foreign-C.html). We end up with this module: ```haskell -- FloatExpansionString.hs {-# LANGUAGE ForeignFunctionInterface #-} module FloatExpansion where import Foreign import Foreign.C import Numeric foreign export ccall floatExpansionR :: Ptr CInt -> Ptr CDouble -> Ptr CString -> IO () floatExpansionR :: Ptr CInt -> Ptr CDouble -> Ptr CString -> IO () floatExpansionR base u result = do base <- peek base u <- peek u expansion <- newCString $ show $ floatExpansion (toInteger base) u poke result $ expansion floatExpansion :: RealFloat a => Integer -> a -> [Int] floatExpansion base u = replicate (- snd expansion) 0 ++ fst expansion where expansion = floatToDigits base u ``` ### Compilation We need the following C file to do the compilation, as explained in the [GHC users guide](https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/win32-dlls.html#making-dlls-to-be-called-from-other-languages). ```c // StartEnd.c #include <Rts.h> void HsStart() { int argc = 1; char* argv[] = {"ghcDll", NULL}; // argv must end with NULL // Initialize Haskell runtime char** args = argv; hs_init(&argc, &args); } void HsEnd() { hs_exit(); } ``` Then we compile the library with the command: ```bash ghc -shared -fPIC -dynamic -lHSrts-ghc7.10.3 FloatExpansionString.hs StartEnd.c -o FloatExpansionString.so ``` This creates the dynamic linker `FloatExpansionString.so`. ### Call in R We firstly load the library with: ```{r} dyn.load("FloatExpansionString.so") .C("HsStart") ``` And we invoke the function with the help of the `.C` function, as follows: ```{r} .C("floatExpansionR", base=2L, x=0.125, result="")$result ``` It works. But it would be better to have a vector as output. ## Second dynamic linker: vector output To get the output as a vector, the additional modules we need are: `Foreign.R`, `Foreign.R.Types` and `Data.Vector.SEXP`. They are provided by the [`inline-r` package](https://hackage.haskell.org/package/inline-r). The `[Int]` type of the output list of the `floatExpansion` function must be converted to `[Int32]`. We write a simple function `intToInt32` to help us to do the conversion. It works with the help of the `Data.Int` module which is imported via the `Foreign` module. We end up with this module: ```haskell -- FloatExpansion.hs {-# LANGUAGE ForeignFunctionInterface #-} {-# LANGUAGE DataKinds #-} module FloatExpansion where import Foreign import Foreign.C import Foreign.R (SEXP) import qualified Foreign.R.Type as R import qualified Data.Vector.SEXP as DV import Numeric foreign export ccall floatExpansionR :: Ptr CInt -> Ptr CDouble -> Ptr (SEXP s R.Int) -> IO () floatExpansionR :: Ptr CInt -> Ptr CDouble -> Ptr (SEXP s R.Int) -> IO () floatExpansionR base u result = do base <- peek base u <- peek u let expansion = map intToInt32 $ floatExpansion (toInteger base) u poke result $ DV.toSEXP $ DV.fromList expansion intToInt32 :: Int -> Int32 intToInt32 i = fromIntegral (i :: Int) :: Int32 floatExpansion :: RealFloat a => Integer -> a -> [Int] floatExpansion base u = replicate (- snd expansion) 0 ++ fst expansion where expansion = floatToDigits base u ``` We compile the library as before. And we load it in R as before: ```{r} dyn.load("FloatExpansion.so") .C("HsStart") ``` And we invoke the function with the help of the `.C` function, as follows: ```{r} .C("floatExpansionR", base=2L, x=0.125, result=list(0L))$result ``` In fact, the output is a list with one element, the desired vector. Let's write a user-friendly function: ```{r} floatExpand <- function(x, base=2L){ .C("floatExpansionR", base=as.integer(base), x=as.double(x), result=list(integer(1)))$result[[1]] } ``` ```{r, include=FALSE} num2dyadic <- function(u, nmax=1024L){ out <- integer(nmax) i <- j <- 0L while(u>0 && i<nmax){ j <- 1L + max(0L,floor(-log2(u*(1+.Machine$double.eps^.5)))) if(i+j <= nmax){ i <- i + j out[i] <- 1L u <- 2L^j*u - 1L }else{ i <- nmax } } return(out[1:i]) } ``` Let's compare it with my R function `num2dyadic`: ```{r} u <- runif(5000) system.time(lapply(u, floatExpand)) system.time(lapply(u, num2dyadic)) ``` It is faster. And I have checked that the two functions always return the same results. Moreover the "RHaskell" function allows more than the binary expansion, for example the ternary expansion: ```{r} floatExpand(1/3+1/27, base=3) ``` Quite nice, isn't it ?