Chapter 6 Going further
Here are a few links and suggestions about what else you might like to do with R.
6.1 Exporting figures
Exporting figures can be done using the following structure:
# Open up a blank plot file, pdf,jpeg etc.
# Write the plot to the file
# Close the file
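As a minimal sketch of that open, write, close pattern, assuming base R graphics only: the temporary file path and the plotted numbers below are illustrative and are not part of the course data.

```r
# Hypothetical output path; in practice use a quoted name such as "my_plot.pdf"
out_file <- tempfile(fileext = ".pdf")

# Open up a blank pdf file (for pdf() the width and height are in inches)
pdf(out_file, width = 6, height = 4)

# Write the plot to the file
plot(1:10, (1:10)^2, xlab = "x", ylab = "x squared")

# Close the file; the pdf is only complete once the device is closed
invisible(dev.off())

# Check that the file was created
file.exists(out_file)
```

Swapping pdf() for png() or jpeg() changes the output format; those two devices measure width and height in pixels by default.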
For example, to export the volcano plot from Figure 5.7 to a pdf, we do:
# Open up a blank plot file, pdf, jpeg etc.
pdf("volcano_plot.pdf") # File name assumed here, any name ending in .pdf works
# Write the plot to the file
dat_tf %>%
  # Add a threshold for significant observations
  mutate(threshold = if_else(log_fc >= 2 & log_pval >= 1.3 |
                               log_fc <= -2 & log_pval >= 1.3, "A", "B")) %>%
  # Plot with points coloured according to the threshold
  ggplot(aes(log_fc, log_pval, colour = threshold)) +
  geom_point(alpha = 0.5) + # Alpha sets the transparency of the points
  # Add dotted lines to indicate the threshold, semi-transparent
  geom_hline(yintercept = 1.3, linetype = 2, alpha = 0.5) +
  geom_vline(xintercept = 2, linetype = 2, alpha = 0.5) +
  geom_vline(xintercept = -2, linetype = 2, alpha = 0.5) +
  # Set the colour of the points
  scale_colour_manual(values = c("A" = "red", "B" = "black")) +
  xlab("log2 fold change") + ylab("-log10 p-value") + # Relabel the axes
  theme_minimal() + # Set the theme
  theme(legend.position = "none") # Hide the legend
# Close the file
dev.off()
If I had saved the plot to an object called vplot, I would call that object instead of making the plot using dat_tf as shown here.
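A sketch of that variant, assuming ggplot2 is installed: toy_dat is a made-up stand-in for dat_tf, and the temporary file path is hypothetical.

```r
library(ggplot2)

# Hypothetical stand-in for dat_tf: a few made-up fold changes and p-values
toy_dat <- data.frame(
  log_fc   = c(-3.1, -0.4, 0.2, 2.6, 1.1),
  log_pval = c(2.0, 0.3, 0.9, 1.8, 1.5)
)

# Save the plot to an object instead of drawing it straight away
vplot <- ggplot(toy_dat, aes(log_fc, log_pval)) +
  geom_point(alpha = 0.5) +
  theme_minimal()

# Open the file, write the stored plot to it, close the file
out_file <- tempfile(fileext = ".pdf") # Hypothetical path
pdf(out_file)
print(vplot) # Explicit print() also works inside functions and loops
invisible(dev.off())
```

Typing vplot on its own line at the console has the same effect as print(vplot), but inside a function or a loop the plot is only drawn when it is printed explicitly.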
Here is a general guide to the various formats you can export to.
Alternatively, if you are working in ggplot2, you can use the ggsave function as described in R4DS 28.7.
# Plot
dat_tf %>%
  # Add a threshold for significant observations
  mutate(threshold = if_else(log_fc >= 2 & log_pval >= 1.3 |
                               log_fc <= -2 & log_pval >= 1.3, "A", "B")) %>%
  # Plot with points coloured according to the threshold
  ggplot(aes(log_fc, log_pval, colour = threshold)) +
  geom_point(alpha = 0.5) + # Alpha sets the transparency of the points
  # Add dotted lines to indicate the threshold, semi-transparent
  geom_hline(yintercept = 1.3, linetype = 2, alpha = 0.5) +
  geom_vline(xintercept = 2, linetype = 2, alpha = 0.5) +
  geom_vline(xintercept = -2, linetype = 2, alpha = 0.5) +
  # Set the colour of the points
  scale_colour_manual(values = c("A" = "red", "B" = "black")) +
  xlab("log2 fold change") + ylab("-log10 p-value") + # Relabel the axes
  theme_minimal() + # Set the theme
  theme(legend.position = "none") # Hide the legend
# Use ggsave to save the last plot displayed as pdf
ggsave("volcano_plot.pdf", width = 20, height = 20, units = "cm")
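By default ggsave saves the last plot that was displayed. If the plot is stored in an object, it can be passed through the plot argument instead. The sketch below assumes ggplot2 is installed and uses made-up data and a hypothetical file name.

```r
library(ggplot2)

# Hypothetical stand-in for dat_tf
toy_dat <- data.frame(
  log_fc   = c(-3.1, -0.4, 0.2, 2.6),
  log_pval = c(2.0, 0.3, 0.9, 1.8)
)

# Store the plot in an object rather than displaying it
vplot <- ggplot(toy_dat, aes(log_fc, log_pval)) +
  geom_point(alpha = 0.5)

# Hypothetical file name in the session's temporary directory
out_file <- file.path(tempdir(), "volcano_toy.pdf")

# ggsave picks the output format from the file extension
ggsave(out_file, plot = vplot, width = 20, height = 20, units = "cm")
```

Naming the plot explicitly avoids saving the wrong figure when several plots have been drawn in the same session.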
6.2 Exporting data
There is a full manual for the import and export of data in R. However, here are a few pointers:
6.2.1 Writing to a file
One of the most portable ways to share data is by writing to a csv file. These files can be opened in many programs. The tidyverse package contains two functions for writing csv files: write_csv and, for Excel, write_excel_csv. The latter adds a bit of metadata that tells Excel about the file encoding. See R4DS writing to a file.
For example, to write dat_tf to a csv file in our working directory for sharing with a colleague who uses Excel, we pass the object and then the file name to write_excel_csv.
Note that the file name is a string and is in quotes.
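A minimal sketch of writing a csv file for Excel, assuming the readr package (part of the tidyverse) is installed: toy_dat and the file name are hypothetical stand-ins, not the course data.

```r
library(readr)

# Hypothetical stand-in for dat_tf
toy_dat <- data.frame(
  protein  = c("P1", "P2", "P3"),
  log_fc   = c(-3.1, 0.2, 2.6),
  log_pval = c(2.0, 0.9, 1.8)
)

# Hypothetical file name; a bare quoted name such as "my_data.csv"
# would write into the working directory instead
out_file <- file.path(tempdir(), "toy_dat.csv")

# The object comes first, then the file name as a quoted string
write_excel_csv(toy_dat, out_file)

# Read the file back in to check the contents
toy_back <- read_csv(out_file, show_col_types = FALSE)
```

Swapping write_excel_csv for write_csv takes the same two arguments and leaves out the Excel encoding hint.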
6.2.2 For R
If you are exporting data to use yourself in R, the custom .rds format is a good choice and preserves the R structure. In the tidyverse, write_rds follows the same structure as write_csv. You can read the data back in using read_rds.
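A minimal round trip through an .rds file, assuming readr is installed: the data frame and file name are made up for illustration.

```r
library(readr)

# Hypothetical data with a factor column, which a csv file would not preserve
toy_dat <- data.frame(
  protein = factor(c("P1", "P2", "P3")),
  log_fc  = c(-3.1, 0.2, 2.6)
)

# Hypothetical file name in the session's temporary directory
out_file <- file.path(tempdir(), "toy_dat.rds")

# Same structure as write_csv: the object first, then the quoted file name
write_rds(toy_dat, out_file)

# Read the object back in
toy_back <- read_rds(out_file)

# Compare the original and restored objects, including column types
same <- identical(toy_dat, toy_back)
```

Because the object is stored as R sees it, column types such as factors come back unchanged, whereas a csv file has to be re-parsed on import.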
6.3 Joining the R community
It’s worth joining the RStudio Community and following community members on Twitter such as Jenny Bryan, Hadley Wickham, Yihui Xie, Mara Averick, David Robinson and Julia Silge.
I would not recommend DataCamp due to their misconduct. The swirl package, on the other hand, is free.
6.4 Communication: creating reports, presentations and websites
R Markdown (Allaire et al. 2019) enables us to do literate programming, saving time as we can create analysis, reports, dashboards or web apps at the same time as writing code. R Markdown can use multiple programming languages. See also R4DS R Markdown and R4DS R Markdown formats and the R Markdown: The Definitive Guide.
As linked in Section 2.2.1, Claus Wilke has written a very nice guide to visualising data using R called Fundamentals of Data Visualization, which is very helpful when it comes to thinking about how best to create figures for a report, poster or presentation.
You can use blogdown to build websites. I created this guide to building an academic website with blogdown.
6.5 Machine Learning
If you are interested in machine learning, then TensorFlow is a good place to start, for example Leon Eyrich Jessen’s Deep Learning for Cancer Immunotherapy tutorial.
6.6 Version control
Another thing you may wish to consider is version control, “a system that records changes to a file or set of files over time so that you can recall specific versions later”.
To get started, have a look at these slides by Alice Bartlett and check out the RStudio version control guide.
Allaire, JJ, Yihui Xie, Jonathan McPherson, Javier Luraschi, Kevin Ushey, Aron Atkins, Hadley Wickham, Joe Cheng, Winston Chang, and Richard Iannone. 2019. Rmarkdown: Dynamic Documents for R.
Xie, Yihui. 2018. Bookdown: Authoring Books and Technical Documents with R Markdown.