In this post, you will learn how to create functions in R. Functions can be useful for all kinds of things, like speeding up coding by wrapping different functions together.


Basics

The basic idea behind a function is to feed the function (i.e. an action) an input and get back a transformed output.

Consider the example below of a “car wrecker”. The “function” is the car wrecking machine, that is, the action of squeezing the car. The input is a car and the output is a cube. On a very general level, the function could be written as something like this:

 function(input=car)
 squeeze
 return(cube)

Functions in R

In R, you create a function with the keyword function and save it under a name with = or <-. You can give it any name you like. Here I call the function my_function.

Importantly, you also need to name the input of the function. Here I use x.

my_function = function(x){
  print(x)
}

my_function(x = 10)

This is the same as

random_function_name = function(abc){
  print(abc)
}

random_function_name(abc = 10)

This function simply prints your input. Let’s now try a more complex action.


Let’s write a function to square numbers. Let’s call this function sq (or square_function if you prefer).

\begin{equation} \text{sq}(x) = x^2 \end{equation}

In R, this looks like this:

sq = function(x){
  y = x^2
  return(y)
}

If you test it with sq(x = 10), you get 100.

We can create a more general function by making the power an input as well. We simply call this input power, and we store the result in y.
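As a small sketch of how sq behaves with other inputs: the input name is optional when calling, and because ^ works element by element in R, the same function also squares a whole vector. The object name squares is just my choice for this illustration.

```r
# Same definition as above, repeated so the snippet runs on its own
sq = function(x){
  y = x^2
  return(y)
}

# Arguments can be matched by position, so writing x = is optional
sq(10)

# ^ is applied element by element, so sq() also accepts a vector
squares = sq(1:5)
print(squares)
```

So there is no need to loop over numbers one by one here; you can pass them all at once.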

power_function = function(x, power){
  y = x^power
  return(y)
}

Again, note that you can call power anything you want: p, w, whatever. The same applies to y, which you could call output or anything else.

This is the same, just with different names.

power_function = function(x, p){
  output = x^p
  return(output)
}
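To check that the naming really is free, here is a minimal sketch of a few equivalent ways to call the function. I use the first version (with power as the argument name). The names cube_by_name, cube_by_position and cube_reordered are only there for illustration.

```r
# Definition repeated so the snippet runs on its own
power_function = function(x, power){
  y = x^power
  return(y)
}

# Arguments matched by name
cube_by_name = power_function(x = 2, power = 3)

# Arguments matched by position (first is x, second is power)
cube_by_position = power_function(2, 3)

# With names, the order does not matter
cube_reordered = power_function(power = 3, x = 2)

# With power = 2 we are back to the squaring function
power_function(10, 2)
```

Naming the arguments is a bit more typing, but it protects you from mixing up x and power.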

Functions as wrapper

One useful way to use functions is to wrap other functions (e.g. to save time).

Imagine you often run the following line of code:

round(prop.table(table(mtcars$gear, mtcars$carb), margin = 2), digits = 3)

You are doing a tabulation (table), then you are computing column percentages on this table (prop.table with margin = 2) and finally you are rounding the numbers (round). We have 3 functions here: table, prop.table and round.

Instead of repeating this convoluted code all the time, you can wrap it into a function.

Let’s call this function pt (prop-table), with an input that is a table (tab). We also turn the margin parameter of prop.table and the digits parameter of round into inputs of our function.

We can put some default values in the function itself. Here I set the default value of margin, which I call mar, to 1 (meaning I want the row percentages) and the default value of digits, which I call dig, to 3.

The function looks like this:

pt = function(tab, mar = 1, dig = 3){
  out = round(prop.table(tab, margin = mar), digits = dig)
  return(out)
}

Let’s test our function pt

pt(table(mtcars$gear, mtcars$carb), mar = 1, dig = 3)

Changing dig to 1 means rounding to 1 digit, and changing mar to 2 means we are now taking the column percentages.

pt(table(mtcars$gear, mtcars$carb), mar = 2, dig = 1)

We can also pass in a table object that we saved beforehand:

tab = table(mtcars$gear, mtcars$carb)
pt(tab, mar = 2, dig = 1)
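Because mar and dig have default values, you can also leave them out. The sketch below checks that pt(tab) gives the same result as spelling out the defaults, and that the row percentages add up to (roughly) 1. The names default_call, explicit_call and row_totals are mine, just for this check.

```r
# Definition repeated so the snippet runs on its own
pt = function(tab, mar = 1, dig = 3){
  out = round(prop.table(tab, margin = mar), digits = dig)
  return(out)
}

tab = table(mtcars$gear, mtcars$carb)

# Leaving out mar and dig falls back on the defaults (mar = 1, dig = 3)
default_call = pt(tab)
explicit_call = pt(tab, mar = 1, dig = 3)
identical(default_call, explicit_call)

# Sanity check: with mar = 1 every row should sum to about 1
# (not always exactly 1, because of the rounding)
row_totals = rowSums(pt(tab))
print(row_totals)
```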

Functions with lapply

Functions can be useful in situations where you have to repeat code, such as in a bootstrapping framework.

Let’s say that you want to run the following code many times. You draw two random samples from a normal distribution (x1 and x2) and then you take the squared differences ((x1-x2)^2).

n=50
x1 = rnorm(n)
x2 = rnorm(n)
(x1-x2)^2

One way to re-run this code many times is to wrap it into a function and then use lapply to repeat it. This can also be computationally efficient if you later switch to a parallel version of lapply.

We create a function called f_repeat (again call it what you wish). The only input we give it is i, because we are going to use lapply to loop it.

f_repeat = function(i){
  n=50
  x1 = rnorm(n)
  x2 = rnorm(n)
  output = (x1-x2)^2
  return(output)
}
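If you want to change the sample size without editing the function, one option is to turn n into an input with a default value, as we did with pt. This is only a sketch: f_repeat_n is a hypothetical name I made up for the variant, and res is just a name for the result.

```r
# Hypothetical variant of f_repeat: n is now an input with default 50
f_repeat_n = function(i, n = 50){
  x1 = rnorm(n)
  x2 = rnorm(n)
  output = (x1 - x2)^2
  return(output)
}

# Uses the default sample size
length(f_repeat_n(1))

# Overrides the sample size
length(f_repeat_n(1, n = 200))

# Extra arguments given to lapply are passed on to the function
res = lapply(1:3, f_repeat_n, n = 10)
lengths(res)
```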

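You can even call f_repeat without any input and it will still work, because i is never used inside the function and R only evaluates an input when it is actually needed.

f_repeat()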

Now we can loop it or repeat with lapply like this (50 times)

lapply(1:50, function(i) f_repeat(i))

Note that you can also use the function replicate and do the same thing

replicate(50, expr = f_repeat(), simplify = FALSE)

I personally prefer lapply because it is more flexible in complex settings, but replicate works fine here.
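Once you have the list of simulations, you usually want to summarise it. Below is a minimal sketch that fixes the random seed so the results are reproducible and then takes the mean of each simulation with sapply. The seed value 123 and the names sims, sims_again and sim_means are arbitrary choices of mine.

```r
# Definition repeated so the snippet runs on its own
f_repeat = function(i){
  n = 50
  x1 = rnorm(n)
  x2 = rnorm(n)
  output = (x1 - x2)^2
  return(output)
}

# Fixing the seed makes the random draws reproducible
set.seed(123)
sims = lapply(1:50, f_repeat)

# Same seed, same draws
set.seed(123)
sims_again = lapply(1:50, f_repeat)
identical(sims, sims_again)

# One mean squared difference per simulation
sim_means = sapply(sims, mean)
length(sim_means)

# Should be close to 2, the variance of the difference
# between two independent standard normal variables
mean(sim_means)
```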


All the functions

square_function = function(x){
  y = x^2
  return(y)
}

power_function = function(x, power){
  y = x^power
  return(y)
}

pt = function(tab, mar = 1, dig = 3){
  out = round(prop.table(tab, margin = mar), digits = dig)
  return(out)
}

f_repeat = function(i){
  n=50
  x1 = rnorm(n)
  x2 = rnorm(n)
  output = (x1-x2)^2
  return(output)
}

# testing the functions
square_function(x = 10)
power_function(x = 10, power = 3)
pt(table(mtcars$gear, mtcars$carb), mar = 2, dig = 1)
lapply(1:50, function(i) f_repeat(i))
replicate(50, expr = f_repeat(), simplify = FALSE)