Chapter 6 Going further

Here are a few links and suggestions about what else you might like to do with R.

6.1 Exporting figures

Exporting figures can be done using the following structure:

# Open up a blank plot file, pdf,jpeg etc.
<plot_function("file",...)>
# Write the plot to the file
<plot_object>
# Close the file
dev.off()

For example, to export the volcano plot from Figure 5.7 to a pdf, we do:

# Open up a blank pdf file
pdf("volcano_plot.pdf")

# Write the plot to the file
dat_tf %>%
  # Add a threshold for significant observations
  mutate(threshold = if_else(log_fc >= 2 & log_pval >= 1.3 |
                               log_fc <= -2 & log_pval >= 1.3,"A", "B")) %>%
  # Plot with points coloured according to the threshold
  ggplot(aes(log_fc,log_pval, colour = threshold)) +
  geom_point(alpha = 0.5) + # Alpha sets the transparency of the points
  # Add dotted lines to indicate the threshold, semi-transparent
  geom_hline(yintercept = 1.3, linetype = 2, alpha = 0.5) + 
  geom_vline(xintercept = 2, linetype = 2, alpha = 0.5) +
  geom_vline(xintercept = -2, linetype = 2, alpha = 0.5) +
  # Set the colour of the points
  scale_colour_manual(values = c("A"= "red", "B"= "black")) +
  xlab("log2 fold change") + ylab("-log10 p-value") + # Relabel the axes
  theme_minimal() + # Set the theme
  theme(legend.position="none") # Hide the legend

# Close the file
dev.off()

If I had saved the plot to an object called vplot, I would call that object between opening and closing the file, instead of building the plot from dat_tf as shown here.
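As a minimal sketch of that variant, the code below saves a plot to vplot and then writes it to a pdf. The data frame dat_example is a made-up stand-in for dat_tf so that the example runs on its own, and the file is written to a temporary directory rather than the working directory; both are assumptions for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for dat_tf, so the example is self-contained
dat_example <- data.frame(log_fc   = c(-3, -1, 0, 1.5, 2.5),
                          log_pval = c(2, 0.5, 0.1, 1, 3))

# Save the plot to an object instead of drawing it straight away
vplot <- ggplot(dat_example, aes(log_fc, log_pval)) +
  geom_point(alpha = 0.5) +
  xlab("log2 fold change") + ylab("-log10 p-value") +
  theme_minimal()

# Open up a blank pdf file (in a temporary directory for this sketch)
out_file <- file.path(tempdir(), "volcano_plot.pdf")
pdf(out_file)

# Write the plot object to the file
print(vplot)

# Close the file
dev.off()
```

Typing vplot on its own works at the console, but inside a function, a loop or a sourced script the plot is only drawn if you call print() explicitly, so I have used print() here.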

Here is a general guide to the various formats you can export to.

Alternatively, if you are working in ggplot2 you can use the ggsave function as described in R4DS 28.7. By default it saves the most recently displayed plot.

# Plot
dat_tf %>%
  # Add a threshold for significant observations
  mutate(threshold = if_else(log_fc >= 2 & log_pval >= 1.3 |
                               log_fc <= -2 & log_pval >= 1.3,"A", "B")) %>%
  # Plot with points coloured according to the threshold
  ggplot(aes(log_fc,log_pval, colour = threshold)) +
  geom_point(alpha = 0.5) + # Alpha sets the transparency of the points
  # Add dotted lines to indicate the threshold, semi-transparent
  geom_hline(yintercept = 1.3, linetype = 2, alpha = 0.5) + 
  geom_vline(xintercept = 2, linetype = 2, alpha = 0.5) +
  geom_vline(xintercept = -2, linetype = 2, alpha = 0.5) +
  # Set the colour of the points
  scale_colour_manual(values = c("A"= "red", "B"= "black")) +
  xlab("log2 fold change") + ylab("-log10 p-value") + # Relabel the axes
  theme_minimal() + # Set the theme
  theme(legend.position="none") # Hide the legend

# Use ggsave to save the plot as pdf
ggsave("volcano_plot.pdf", width = 20, height = 20, units = "cm")
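ggsave can also be given a saved plot object through its plot argument, and it picks the file format from the file extension. The sketch below assumes a made-up data frame dat_example in place of dat_tf and writes a png to a temporary directory; the object and file names are illustrative only.

```r
library(ggplot2)

# Hypothetical stand-in for dat_tf, so the example is self-contained
dat_example <- data.frame(log_fc   = c(-3, -1, 0, 1.5, 2.5),
                          log_pval = c(2, 0.5, 0.1, 1, 3))

# Save the plot to an object
vplot <- ggplot(dat_example, aes(log_fc, log_pval)) +
  geom_point(alpha = 0.5) +
  theme_minimal()

# Name the plot explicitly rather than relying on the last plot drawn.
# The .png extension tells ggsave which format to write.
png_file <- file.path(tempdir(), "volcano_plot.png")
ggsave(png_file, plot = vplot, width = 20, height = 20, units = "cm", dpi = 300)
```

Naming the plot explicitly is a little safer in longer scripts, where the last plot drawn may not be the one you meant to save.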

6.2 Exporting data

There is a full manual for the import and export of data in R. However, here are a few pointers:

6.2.1 Writing to a file

One of the most portable ways to share data is to write it to a csv file, which can be opened in many programs. The readr package, part of the tidyverse, contains two functions for writing csv files: write_csv and, for Excel, write_excel_csv. The latter adds a byte order mark at the start of the file, which tells Excel that the file is UTF-8 encoded. See R4DS writing to a file.

For example, to write dat_tf to a csv file called 04072018_transformed_data.csv in our working directory, for sharing with a colleague who uses Excel, the code is of the form <function>(<r-object>,"filename") like so:

write_excel_csv(dat_tf,"04072018_transformed_data.csv")

Note that the file name is a string and is in quotes.
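As a rough sketch of the plain write_csv form, the code below writes a small data frame to a csv file and reads it back in to check the contents. The data frame dat_example and the file name are made up for illustration, and the file goes to a temporary directory rather than the working directory.

```r
library(readr)

# Hypothetical example data, standing in for dat_tf
dat_example <- data.frame(gene   = c("a", "b", "c"),
                          log_fc = c(-2.5, 0.3, 2.1),
                          stringsAsFactors = FALSE)

# Write to a csv file: <function>(<r-object>, "filename")
csv_file <- file.path(tempdir(), "example_data.csv")
write_csv(dat_example, csv_file)

# Read the file back in, stating the column types explicitly
dat_back <- read_csv(csv_file,
                     col_types = cols(gene = col_character(),
                                      log_fc = col_double()))
dat_back
```

A csv file only stores text, so the column types have to be guessed or stated again when the file is read back in.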

6.2.2 For R

If you are exporting data to use again yourself in R, R's own .rds format is a good choice, as it preserves the structure of the R object, such as the column types.

In the tidyverse, write_rds follows the same structure as write_csv.

You can read back in using read_rds.
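Here is a minimal sketch of the round trip, assuming a made-up data frame dat_example with a factor column and a file in a temporary directory; both names are illustrative only.

```r
library(readr)

# Hypothetical example data with a factor column
dat_example <- data.frame(
  gene  = c("a", "b", "c"),
  group = factor(c("ctrl", "treated", "ctrl"), levels = c("ctrl", "treated")),
  stringsAsFactors = FALSE
)

# Write the object to an .rds file, same structure as write_csv
rds_file <- file.path(tempdir(), "example_data.rds")
write_rds(dat_example, rds_file)

# Read it back in
dat_back <- read_rds(rds_file)

# Check whether the factor column and its levels were kept
is.factor(dat_back$group)
levels(dat_back$group)
```

Had the same data gone through a csv file, the group column would come back as plain text and the factor levels would need to be set again.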

6.3 Joining the R community

It’s worth joining the RStudio Community and following community members on Twitter such as Jenny Bryan, Hadley Wickham, Yihui Xie, Mara Averick, David Robinson and Julia Silge.

I would not recommend DataCamp due to their misconduct.

And swirl, an R package that teaches R interactively in the console, is free.

6.4 Communication: creating reports, presentations and websites

R Markdown (Allaire et al. 2019) enables us to do literate programming, saving time as we can create analyses, reports, dashboards or web apps at the same time as writing code. R Markdown can use multiple programming languages. See also R4DS R Markdown and R4DS R Markdown formats and R Markdown: The Definitive Guide.
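To give a flavour of what this looks like, here is a rough sketch that writes a tiny R Markdown file from R and renders it to html. The file name and contents are made up for illustration, and the render step is only attempted if the rmarkdown package and pandoc are available. In practice you would normally create the .Rmd file in RStudio rather than with writeLines.

```r
# Three backticks, built here so the code chunk markers can be written out
fence <- strrep("`", 3)

# Hypothetical contents of a minimal R Markdown document:
# a YAML header, some text, and one R code chunk
rmd_lines <- c(
  "---",
  "title: \"Example report\"",
  "output: html_document",
  "---",
  "",
  "Some text describing the analysis.",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
)

# Write the document to a temporary directory
rmd_file <- file.path(tempdir(), "example_report.Rmd")
writeLines(rmd_lines, rmd_file)

# Render to html, running the code chunk and including its output
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  rmarkdown::render(rmd_file, quiet = TRUE)
}
```

The text and the code live in the same file, so re-rendering after a change to the data or the code updates the report as well.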

As linked in Section 2.2.1, Claus Wilke has written a very nice guide to visualising data using R called Fundamentals of Data Visualization, which is very helpful when it comes to thinking about how best to create figures for a report, poster or presentation.

You can use blogdown to build websites. I created this guide to building an academic website with blogdown.

6.4.1 Using bookdown to write a thesis or dissertation

I used the bookdown package to create these materials (Xie 2018) and you can use it to write a thesis or dissertation, as detailed very nicely in this blog by Edd Berry.

6.5 Machine Learning

If you are interested in machine learning, then TensorFlow is a good place to start; see, for example, Leon Eyrich Jessen’s Deep Learning for Cancer Immunotherapy tutorial.

6.6 Version control

Another thing you may wish to consider is version control, “a system that records changes to a file or set of files over time so that you can recall specific versions later”.

To get started, have a look at these slides by Alice Bartlett and check out the RStudio version control guide.

References

Allaire, JJ, Yihui Xie, Jonathan McPherson, Javier Luraschi, Kevin Ushey, Aron Atkins, Hadley Wickham, Joe Cheng, Winston Chang, and Richard Iannone. 2019. Rmarkdown: Dynamic Documents for R. https://CRAN.R-project.org/package=rmarkdown.

Xie, Yihui. 2018. Bookdown: Authoring Books and Technical Documents with R Markdown. https://CRAN.R-project.org/package=bookdown.