GRS Workshop Introduction to ggplot
Getting started - why ggplot?
Eugene Hickey
March 9th 2022

---

# About me
Eugene Hickey
lecturer in physics
Technological University Dublin

---

# Acknowledgments
- Raffaella Salvante, co-pilot for this workshop and administrator of the Graduate Research School
- Graduate Research School, for the opportunity to provide this workshop
- xaringan 📦 developed by Yihui Xie
- flipbookr 📦 developed by Gina Reynolds
- learnr 📦 developed by Garrick Aden-Buie

---

# Target Audience

- graduate students looking for better ways to present their data
- people currently using tools like MS Excel for visualisations

---

# Why R?

- working with a mouse isn't reproducible
- difficult to log exactly what you've done
- hard to repeat for a series of diagrams
- difficult to be inspired by other people's work
- good to separate sources of data and the visualisations that display them
- R uses a series of commands that input, manipulate, and display data
- lots of contributors around the world, from diverse fields

---

# Why ggplot?

- some plots can be easier to produce using base graphics

.pull-left[
```r
hist(LOS_model$Age)
```
<img src="data:image/png;base64,#01-why-ggplot_files/figure-html/base_hist-1.png" width="80%" />
]

.pull-right[
```r
ggplot(data = LOS_model, aes(Age)) +
  geom_histogram(bins = 10)
```
<img src="data:image/png;base64,#01-why-ggplot_files/figure-html/ggplot_hist-1.png" width="80%" />
]

---

# Why ggplot?
- anything moderately complicated is better in ggplot

.pull-left[
```r
# David Robinson
# par(mar = c(1.5, 1.5, 1.5, 1.5))
library(dplyr)  # provides filter()

# one colour per nutrient
colors <- 1:6
names(colors) <- unique(top_data$nutrient)

# legend approach from
# 5 x 4 grid of panels, plus a bottom row (panel 21) for the legend
m <- matrix(c(1:20, 21, 21, 21, 21), nrow = 6, ncol = 4, byrow = TRUE)
layout(mat = m, heights = c(.18, .18, .18, .18, .18, .1))

top_data$combined <- paste(top_data$name, top_data$systematic_name)
for (gene in unique(top_data$combined)) {
  sub_data <- filter(top_data, combined == gene)
  plot(expression ~ rate, sub_data, col = colors[sub_data$nutrient], main = gene)
  for (n in unique(sub_data$nutrient)) {
    m <- lm(expression ~ rate, filter(sub_data, nutrient == n))
    # only draw a line when a slope could be estimated
    if (!is.na(m$coefficients[2])) {
      abline(m, col = colors[n])
    }
  }
}

# create a new plot for legend
plot(1, type = "n", axes = FALSE, xlab = "", ylab = "")
legend("top", names(colors), col = colors, horiz = TRUE, lwd = 4)
```
]

.pull-right[
![](data:image/png;base64,#01-why-ggplot_files/figure-html/baseplot-label-out-1.png)<!-- -->
]

---

# Why ggplot?
- anything moderately complicated is better in ggplot

.pull-left[
```r
ggplot(top_data, aes(rate, expression, color = nutrient)) +
  geom_point(show.legend = FALSE) +
  geom_smooth(method = "lm", se = FALSE, show.legend = FALSE) +
  facet_wrap(~systematic_name, scales = "free_y")
```
]

.pull-right[
![](data:image/png;base64,#01-why-ggplot_files/figure-html/ggplot-label-out-1.png)<!-- -->
]

---

# Lots of addin packages for ggplot

ggalignment, ggallin, ggalluvial, ggalt, ggamma, gganimate, ggarchery, ggasym, ggbeeswarm, ggborderline, ggbreak, ggBubbles, ggbuildr, ggbump, ggchangepoint, ggcharts, ggChernoff, ggcleveland, ggconf, ggcorrplot, ggdag, ggdark, ggDCA, ggdemetra, ggdendro, ggdensity, ggdist, ggdmc, gge, ggeasy, ggedit, ggeffects, ggenealogy, ggESDA, ggetho, ggExtra, ggfacto, ggfan, ggfittext, ggfocus, ggforce, ggformula, ggfortify, ggfun, ggfx, gggap, gggenes, ggghost, gggibbous, gggrid, ggh4x, gghalfnorm, gghalves, gghdr, ggheatmap, gghighlight, gghilbertstrings, ggHoriPlot, ggimage, ggimg, gginference, gginnards, ggip, ggiraph, ggiraphExtra, ggjoy, gglasso, gglm, gglogo, ggloop, gglorenz, ggm, ggmap, ggmatplot, ggmcmc, ggmix, ggmosaic, ggmotif, ggmr, ggmuller, ggmulti, ggnetwork, ggnewscale, ggnormalviolin, ggnuplot, ggOceanMaps, ggokabeito, ggpacman, ggpage, ggparallel, ggparliament, ggparty, ggpattern, ggperiodic, ggplot.multistats, ggplot2, ggplot2movies, ggplotAssist, ggplotgui, ggplotify, ggplotlyExtra, ggpmisc, ggPMX, ggpointdensity, ggpol, ggpolar, ggpolypath, ggpp, ggprism, ggpubr, ggpval, ggQC, ggQQunif, ggquickeda, ggquiver, ggraph, ggraptR, ggrasp, ggrastr, ggrepel, ggResidpanel, ggridges, ggrisk, ggROC, ggroups, ggsci, ggseas, ggseg, ggseg3d, ggseqlogo, ggshadow, ggside, ggsignif, ggsn, ggsoccer, ggsolvencyii, ggsom, ggspatial, ggspectra, ggstance, ggstar, ggstatsplot, ggstream, ggstudent, ggswissmaps, ggtea, ggtern, ggtext, ggThemeAssist, ggthemes, ggtikz, ggTimeSeries, ggupset, ggvenn, ggVennDiagram, ggversa, ggvis, ggvoronoi, ggwordcloud, ggx

---
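# Why ggplot? A self-contained sketch

The faceted `top_data` plot shown earlier depends on data that is not included in these slides. As a rough, runnable stand-in, the sketch below applies the same pattern (points, one linear fit per colour group, one panel per facet) to the built-in `mtcars` dataset. The choice of `mtcars` and of the `wt`, `mpg`, `am`, and `cyl` columns is an assumption made purely for illustration.

```r
library(ggplot2)

# mtcars stands in for top_data (an assumption, not the workshop data)
p <- ggplot(mtcars, aes(wt, mpg, color = factor(am))) +
  geom_point(show.legend = FALSE) +
  # one straight-line fit per colour group within each panel
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, show.legend = FALSE) +
  # one panel per number of cylinders, each with its own y axis
  facet_wrap(~cyl, scales = "free_y")

p
```

Changing the facet variable or the colour mapping is a one-line edit, whereas the base graphics version would need its layout matrix and loops reworked.

---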
# And others that produce ggplot objects, which can then be modified like any other

.pull-left[
```r
fviz_cluster_example
```
![](data:image/png;base64,#01-why-ggplot_files/figure-html/unnamed-chunk-4-1.png)<!-- -->
]

.pull-right[
```r
fviz_cluster_example +
  theme_classic()
```
![](data:image/png;base64,#01-why-ggplot_files/figure-html/unnamed-chunk-5-1.png)<!-- -->
]

---

# Other reasons

- ggplot output is easy to make publication-ready
- easier to make a sequence of visualisations
- fits in nicely with the rest of the tidyverse
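
The first two points can be sketched in a few lines. This is a minimal illustration rather than a recommended workflow: the `mtcars` dataset, the object names `base` and `plots`, the output folder, and the figure size are all assumptions chosen for the example.

```r
library(ggplot2)

# Shared base: data, aesthetic mapping, labels, and a clean theme
# (mtcars is a stand-in dataset, not the workshop data)
base <- ggplot(mtcars, aes(wt, mpg)) +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon") +
  theme_classic(base_size = 12)

# A sequence of plots that differ only in the layers added on top
plots <- list(
  points = base + geom_point(),
  trend  = base + geom_point() +
    geom_smooth(method = "lm", formula = y ~ x),
  by_cyl = base + geom_point(aes(color = factor(cyl))) +
    labs(color = "Cylinders")
)

# Save each plot at a fixed physical size and resolution
out_dir <- tempdir()  # hypothetical output location
for (name in names(plots)) {
  ggsave(file.path(out_dir, paste0(name, ".png")), plots[[name]],
         width = 12, height = 8, units = "cm", dpi = 300)
}
```

Because every plot starts from the same `base` object, a change to the labels or theme propagates to the whole sequence.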