ORIGINAL RESEARCH article
Use ggbreak to effectively utilize plotting space to deal with large datasets and outliers
- 1Southern Medical University, China
With the rapid increase of large-scale datasets, biomedical data visualization is facing challenges. The data may be large, have different orders of magnitude, contain extreme values, and the data distribution is not clear. Here we present an R package ggbreak that allows users to create broken axes using ggplot2 syntax. It can effectively use the plotting area to deal with large datasets (especially for long sequential data), data with different magnitudes, and contain outliers. The ggbreak package increases the available visual space for a better presentation of the data and detailed annotation, thus improves our ability to interpret the data The ggbreak package is fully compatible with ggplot2 and it is easy to superpose additional layers and applies scale and theme to adjust the plot using the ggplot2 syntax. The ggbreak package is open-source software released under the Artistic-2.0 license, and it is freely available on CRAN (https://CRAN.R-project.org/package=ggbreak) and Github (https://github.com/YuLab-SMU/ggbreak).
Keywords: Axis break, gap plot, long sequential data, outlier, Ggplot2
Received: 13 Sep 2021;
Accepted: 12 Oct 2021.
Copyright: © 2021 Yu, Xu, Chen, Feng, Zhan and Zhou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Prof. Guangchuang Yu, Southern Medical University, Guangzhou, China, firstname.lastname@example.org