How to create functions with R
In this post, you will learn how to create functions in R. Functions can be useful for all kinds of things, like speeding up coding by wrapping different functions together.
Basics
The basic idea behind a function is to feed the function (i.e. the action) with an input and retrieve a transformed output.
Consider the example below of a “car wrecker”. The “function” is the car wrecking machine, the input is a car and the output is a cube. The function is the action of squeezing the car. On a very general level, the function could be written as something like this:
function(input=car)
squeeze
return(cube)
Functions in R
In R, to create a function, you simply need to use the command function. You need to save it with = or <-, giving it any name you like. Here I call the function my_function.
Importantly, you need to give the function the name of your input; here I use x.
my_function = function(x){
  print(x)
}
my_function(x = 10)
This is the same as
random_function_name = function(abc){
  print(abc)
}
random_function_name(abc = 10)
This function simply prints your input. Let’s now try a more complex action.
Let’s write a function to square numbers. Let’s call this function sq (or square_number if you prefer).
\begin{equation} \text{sq}(x) = x^2 \end{equation}
This looks like this:
sq = function(x){
  y = x^2
  return(y)
}
If you test it with sq(x = 10), you get 100.
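As a small illustration of the squaring function, the sketch below repeats the definition of sq and calls it on a single number and on a vector. The vector input is an addition for this example; it works because ^ operates element by element in R.

```r
# Same squaring function as in the text
sq = function(x){
  y = x^2
  return(y)
}

sq(x = 10)          # a single number: returns 100
sq(x = c(1, 2, 3))  # a vector: each element is squared, returns 1 4 9
```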
We can create a more general function by making the power a variable. We simply call it power, and we store the result in y.
power_function = function(x, power){
  y = x^power
  return(y)
}
Again, note that you can call power anything you want: p, w, whatever. The same applies to y; you can call it output or whatever you want.
This is the same, just with different names.
power_function = function(x, p){
  output = x^p
  return(output)
}
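To see how a function with two inputs is called, here is a minimal sketch using the second version of power_function. The values 2 and 3 are arbitrary and chosen only for this illustration.

```r
# Same function as in the text, with inputs x and p
power_function = function(x, p){
  output = x^p
  return(output)
}

power_function(x = 2, p = 3)  # inputs matched by name: 2^3 = 8
power_function(2, 3)          # inputs matched by position: also 8
power_function(p = 3, x = 2)  # named inputs can be given in any order: also 8
```

Naming the inputs in the call is a bit more typing, but it avoids mixing up x and p when a function has several inputs.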
Functions as wrappers
One useful way to use functions is to wrap other functions (e.g. to save time).
Imagine you often run the following line of code
round(prop.table(table(mtcars$gear, mtcars$carb), margin = 2), digits = 3)
You are doing a tabulation (table), then you are computing row or column proportions (prop.table) on this table, and finally you are rounding the numbers (round). We have 3 functions here: table, prop.table and round.
Instead of repeating this convoluted code all the time, you can wrap it into a function.
Let’s call this function pt (prop-table), with an input that is a table (tab). We turn the margin parameter of prop.table and the digits parameter of round into inputs of our function.
We can put some initial (default) values in the function itself. Here I set the initial value of margin, which I call mar, to 1 (meaning I want the row proportions) and the initial value of digits, which I call dig, to 3.
This looks like this:
pt = function(tab, mar = 1, dig = 3){
  out = round(prop.table(tab, margin = mar), digits = dig)
  return(out)
}
Let’s test our function pt:
pt(table(mtcars$gear, mtcars$carb), mar = 1, dig = 3)
Changing dig to 1 means rounding to 1 digit, and changing mar to 2 means we are now taking the column proportions
pt(table(mtcars$gear, mtcars$carb), mar = 2, dig = 1)
We can also input a table object
tab = table(mtcars$gear, mtcars$carb)
pt(tab, mar = 2, dig = 1)
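Because mar and dig have initial values, they can be left out of the call. The sketch below repeats the definition of pt and shows this; the rowSums check at the end is an addition for this example, only there to confirm that mar = 1 gives row proportions.

```r
# Same wrapper as in the text
pt = function(tab, mar = 1, dig = 3){
  out = round(prop.table(tab, margin = mar), digits = dig)
  return(out)
}

tab = table(mtcars$gear, mtcars$carb)

# No mar or dig given: the initial values mar = 1 and dig = 3 are used
pt(tab)

# Override only mar; dig keeps its initial value of 3
pt(tab, mar = 2)

# Sanity check: with mar = 1, each row should sum to roughly 1
# (not always exactly 1, because the cells were rounded)
rowSums(pt(tab))
```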
Functions with lapply
Functions can be useful in situations where you have to repeat code, such as in a bootstrapping framework.
Let’s say that you want to run the following code many times. You draw two random samples from a normal distribution (x1 and x2) and then you take the squared differences, (x1-x2)^2.
n=50
x1 = rnorm(n)
x2 = rnorm(n)
(x1-x2)^2
One thing you can do to re-run this code many times is to wrap it into a function and then use lapply to repeat it. This can also be computationally efficient if you then use things like parallel versions of lapply.
We create a function called f_repeat (again, call it what you wish). The only input we give it is i, because we are going to use lapply to loop over it.
f_repeat = function(i){
  n = 50
  x1 = rnorm(n)
  x2 = rnorm(n)
  output = (x1 - x2)^2
  return(output)
}
You can run the function empty (without any input) and it will work, because i is never used inside the function
f_repeat()
Now we can loop it or repeat it with lapply like this (50 times)
lapply(1:50, function(i) f_repeat(i))
Note that you can also use the function replicate to do the same thing
replicate(50, expr = f_repeat(), simplify = FALSE)
I personally prefer lapply because it is more flexible in complex settings, but replicate will work fine here.
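Once the code has been repeated, the result is a list with one element per repetition. The sketch below shows one possible way to work with that list. The seed value, the names res, means and res_mat, and the choice of the mean as a summary are assumptions made for this illustration only.

```r
# Same function as in the text
f_repeat = function(i){
  n = 50
  x1 = rnorm(n)
  x2 = rnorm(n)
  output = (x1 - x2)^2
  return(output)
}

set.seed(123)  # arbitrary seed so the random draws can be reproduced
res = lapply(1:50, function(i) f_repeat(i))

length(res)       # one list element per repetition
length(res[[1]])  # each element holds n = 50 squared differences

# One summary number per repetition (assumption: we care about the mean)
means = sapply(res, mean)

# Stack the repetitions into a matrix, one row per repetition
res_mat = do.call(rbind, res)
dim(res_mat)
```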