Abstract
Open-source research software has proven indispensable in modern biomedical image analysis. A multitude of open-source platforms drive image analysis pipelines and help disseminate novel analytical approaches and algorithms. Recent advances in machine learning have enabled unprecedented improvements in these approaches. However, these novel algorithms come with new requirements that must be met for them to remain genuinely open source. To understand how these requirements are met, we have collected 50 biomedical image analysis models and performed a meta-analysis of their respective papers, source code, datasets, and trained model parameters. We conclude that, while there are many positive trends in openness, only a fraction of all publications make all the necessary elements available to the research community.
Introduction
The source code of data analysis algorithms made freely available for redistribution and modification (i.e. open source) has beyond any doubt been driving the ongoing revolution in Data Science (DS), Machine Learning (ML), and Artificial Intelligence (AI) (Sonnenburg et al., 2007; Landset et al., 2015; Abadi et al., 2016; Paszke et al., 2019). By encouraging open collaboration, the open-source model of code redistribution allows researchers to build upon their peers' work on a global scale, fueling rapid iterative improvement in the respective fields (Sonnenburg et al., 2007). Conversely, "closed-source" publications not only hamper the development of the field but also make it hard for researchers to reproduce the results disseminated in research articles. While the methods described in published work are, de jure, publicly disclosed, reverse engineering an advanced algorithm implementation may often take weeks or months, making such works hard to reproduce.
Open source, however, comes in a great variety of shapes and kinds. Remarkably, merely making the source code of research software available publicly or upon request does not per se make it open source. Usage and redistribution of any original creation, be it a research article or source code, lies within the legal boundaries of copyright law, which differs significantly from country to country. Publicly available code without an explicit attribution of a respective open-source license therefore cannot be counted or treated as open source. Given the sheer diversity of licenses, it may be difficult to judge which specific license is right for one's project; yet the choice must always be dictated by the project and the intent of its authors. Consulting the list of licenses approved by the Open Source Initiative is generally considered a good starting point.
The importance of open-source software for computational biomedical image analysis has become self-evident over the past three decades. Packages like ImageJ/Fiji (Schindelin et al., 2012; Schneider et al., 2012), CellProfiler (Carpenter et al., 2006), KNIME (Tiwari and Sekhar, 2007), and Icy (de Chaumont et al., 2011) not only perform the bulk of quantification tasks in wet labs but also serve as platforms for distributing modules containing cutting-edge algorithms. The ability to install and use these modules and algorithms via a point-and-click interface has allowed research groups without image analysis specialists to obtain a qualitatively new level of biomedical insight from their data. Yet, as the field transitions into the data-driven, representation-learning paradigm of biomedical image analysis, the availability of datasets and trained model parameters becomes as important as open-source code.
The ability to download trained parameters may allow researchers to skip the initial model training and focus on gradual model improvement through a technique known as transfer learning (West et al., 2007; Pan and Yang, 2010). Transfer learning has proven effective in the Computer Vision (Deng et al., 2009) and Natural Language Processing (Wolf et al., 2020) domains (further reviewed in Yakimovich et al., 2021). However, the complexity of sharing the trained parameters of a model differs significantly between ML algorithms. For example, while the parameters of a conventional ML algorithm like linear regression may be conveniently shared in the text of an article, this is impossible for DL models with millions of parameters. This, in turn, requires rethinking conventional approaches to sharing ML/DL models under an open-source license.
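The gap between the two regimes can be illustrated with a back-of-the-envelope parameter count (a minimal sketch; the layer sizes below are hypothetical, chosen only to mimic a small VGG-style network):

```python
# Illustrative parameter counts: a classical model's parameters fit in a
# sentence, while even a modest CNN's do not (hypothetical layer sizes).

def dense_params(n_in, n_out):
    """Parameters of a fully connected layer with bias."""
    return n_in * n_out + n_out

def conv_params(c_in, c_out, k):
    """Parameters of a 2D convolution layer (k x k kernels) with bias."""
    return c_in * c_out * k * k + c_out

# Linear regression on 4 features: 5 numbers, easily printed in an article.
linreg = dense_params(4, 1)

# A toy VGG-style stack ending in a 1000-way classifier: millions of numbers.
cnn = (conv_params(3, 64, 3) + conv_params(64, 128, 3)
       + conv_params(128, 256, 3) + dense_params(256 * 7 * 7, 1000))

print(linreg)  # 5
print(cnn)     # 12915816, i.e. roughly 13 million parameters
```

Even this toy network would occupy tens of megabytes on disk, which is why trained parameters must be deposited as downloadable files rather than reported in the article text.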
In this review, we collate ML models for biomedical image analysis recently published in the peer-reviewed literature and available as open source. We describe the open-source licenses used, code availability, data availability, biomedical and ML tasks, as well as the availability of model parameters. We make the collated collection of open-source models available via a GitHub repository and call on the research community to contribute their models to it via pull requests. Furthermore, we provide descriptive statistics of our observations and discuss the pros and cons of the status quo in the field of biomedical image analysis, as well as perspectives in the general DS context. Several efforts to create biomedical ML model repositories or so-called "zoos" (e.g. bioimage.io) and web-based task consolidators (Hollandi et al., 2020; Stringer et al., 2021) have been undertaken. Here, rather than proposing a competing effort, we propose a continuous survey of the field "as is". We achieve this by collating metadata of published papers and their respective source code, data, and model parameters (also known as weights or checkpoints).
Continuous Biomedical Image Analysis Model Survey
To understand the availability, reproducibility, and accessibility of published biomedical image analysis models, we have collected a survey meta-dataset of 50 model articles and preprints published within the last 10 years. During our collection effort, we prioritized publications with accompanying source code freely available online. In an attempt to minimize bias, we made sure that no individual medical imaging modality or biomedical task represents more than 25% of our dataset. Additionally, we attempted to sample models published by the biomedical community (e.g. Nature group journals) and the engineering community (IEEE group journals and conferences), as well as models published as preprints. For each publication we noted the biomedical imaging modality, the biomedical task (e.g. cancer), the open-source license used, the reported model performance with its respective metric, whether the model addresses a supervised task, whether the model parameters can be downloaded (along with the respective link), and links to the code and dataset. Notably, reported performance is highly dependent on the dataset or benchmark used. Therefore, to avoid confusion or bias, we recorded the best reported performance for illustrative purposes only; identical performance on a different dataset should not be expected. For the purpose of this review, we split this meta-dataset into three tables according to the ML task of the models. The full dataset is available on GitHub (https://github.com/casus/bim). To ensure the completeness and correctness of this meta-dataset, we invite the research community to contribute additions and corrections to our survey meta-dataset.
The first display table obtained from our meta-dataset contains 14 models aimed at biomedical image classification (Table 1). The most prevalent imaging modalities for this ML task are computed tomography (CT) and digital pathology, both highly clinically relevant modalities. We noted that most publications had an open-source license clearly defined in their repositories. The consensus on the choice of metric is rather low, making it difficult to compare one model to another. Although most models had both source code and datasets available, only 4 out of 14 had trained model parameters available for download.
TABLE 1
| Imaging Modality | Biomed Task | License | Reported Performance | Parameters Download | References |
|---|---|---|---|---|---|
| CT | Lung tumor | Apache-2.0 | 0.93 Accuracy | No | LaLonde et al. (2020) |
| CT | Lung tumor | MIT | 0.76 AUC | No | Guo et al. (2020) |
| CT | Pulmonary nodule | GPL-3.0 | 0.90 Accuracy | No | Zhu et al. (2018a) |
| CT | Pulmonary nodule | MIT | 0.96 AUC | No | Al-Shabi et al. (2019) |
| CT | Pulmonary nodule | MIT | 0.95 AUC | No | Dey et al. (2018) |
| Dermatoscopy | Skin tumor | N/a | 0.93 Accuracy | No | Datta et al. (2021) |
| Dermatoscopy | Skin tumor | MIT | 0.81 AUC | Yes | Zunair and Ben Hamza (2020) |
| Mammography | Breast tumor | CC BY-NC-ND 4.0 | 0.93 AUC | Yes | Shen et al. (2021) |
| Digital Pathology | Breast tumor | CC BY-NC-ND 4.0 | 0.63 F1 | No | Pati et al. (2022) |
| Mammography | Breast tumor | CC BY-NC-SA 4.0 | 0.84 Accuracy | Yes | Shen et al. (2019) |
| Digital Pathology | Breast tumor | MIT | 0.93 Accuracy | Yes | Rakhlin et al. (2018) |
| Digital Pathology | Lung tumor | GPL-3.0 | 0.53 Kappa | No | Wei et al. (2019) |
| Digital Pathology | Lung tumor | MIT | 0.97 AUC | No | Coudray et al. (2018) |
| Fluorescence microscopy | Host-pathogen interactions | N/a | 0.92 Accuracy | No | Fisch et al. (2019) |
Biomedical Image Classification Models. Here, AUC is Area under curve, CT is computed tomography.
The second display table contains 25 models (Table 2) aimed at biomedical image segmentation, a task relevant for obtaining quantitative insights from biomedical images (e.g. the size of a tumor). Similarly to the classification models, the vast majority of the segmentation models have a well-defined open-source license, with only a few exceptions. Again, the consensus on performance metric choices is rather low, although Dice score reports clearly dominate. On the other hand, the percentage of models with pre-trained parameters available for download is slightly higher than for the classification models (36% vs 29%). Still, for both segmentation and classification tasks, over half of the models do not provide pre-trained parameters for download.
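For reference, the two segmentation metrics that dominate Table 2 are straightforward to compute for binary masks; a minimal sketch, operating on flattened 0/1 mask lists:

```python
def dice(pred, truth):
    """Dice similarity coefficient (DSC) between two binary masks."""
    inter = sum(p & t for p, t in zip(pred, truth))
    return 2 * inter / (sum(pred) + sum(truth))

def iou(pred, truth):
    """Intersection over Union (Jaccard index) between two binary masks."""
    inter = sum(p & t for p, t in zip(pred, truth))
    union = sum(p | t for p, t in zip(pred, truth))
    return inter / union

pred, truth = [1, 1, 0, 1], [1, 0, 0, 1]
print(dice(pred, truth))  # 0.8
```

Note that the two metrics are monotonically related (DSC = 2·IoU / (1 + IoU)), so a paper reporting one can be compared to a paper reporting the other only after conversion, and never across different datasets.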
TABLE 2
| Imaging Modality | Biomed Task | License | Reported Performance | Parameters Download | References |
|---|---|---|---|---|---|
| 3D microscopy | Nuclei detection | MIT | 0.937 AP | No | Hirsch and Kainmueller (2020) |
| CT | Kidney tumor | GPL-3.0 | 0.95 Dice | No | Müller and Kramer (2021) |
| CT | Pulmonary nodule | BSD-3-Clause | N/a | No | Hancock and Magnan (2019) |
| CT | Pulmonary nodule | CC BY-NC-SA 4.0 | 0.55 IoU | Yes | Aresta et al. (2019) |
| CT | Pulmonary nodule | MIT | 0.83 DSC | No | Keetha et al. (2020) |
| CT | Pancreas & Brain tumor | MIT | 0.84 Dice | No | Oktay et al. (2018) |
| CT, Dermatoscopy | Lung tumor and Skin tumor | N/a | 0.9965 Jaccard | No | Kaul et al. (2019) |
| CT | Brain tumor | Apache-2.0 | 0.89 Dice | No | Isensee et al. (2018) |
| MRI | Brain tumor | Apache-2.0 | 0.79 Dice | No | Wang et al. (2021) |
| MRI | Brain tumor | CC BY-NC-ND 4.0 | 0.76 Dice | No | Baek et al. (2019) |
| Digital Pathology | Breast tumor | CC BY-NC-ND 4.0 | 0.893 F1 | Yes | Le et al. (2020) |
| Digital Pathology | Lung tumor | CC-BY | 0.83 Accuracy | No | Tomita et al. (2019) |
| Digital Pathology | Multiple pathologies | MIT | N/a | No | Khened et al. (2021) |
| Electron microscopy | Multiple pathologies | MIT | 0.5 VI | Yes | Lee et al. (2017) |
| Fluorescence microscopy | Cellular structures reconstruction | N/a | 20 x Enhancement in DOF | Yes | Wu et al. (2019) |
| Fluorescence microscopy | Nuclei detection | BSD-3-Clause | 0.94 Accuracy | Yes | Weigert et al. (2020) |
| Microscopy | Cellular reconstruction | N/a | 0.69 AP | No | Hirsch et al. (2020) |
| MRI | Brain tumor | BSD-3-Clause | 0.87 Dice | Yes | Wang et al. (2018) |
| MRI | Brain tumor | MIT | 0.85 Dice | Yes | Havaei et al. (2017) |
| MRI | Brain tumor | MIT | 0.90 Dice | No | Isensee et al. (2018) |
| MRI | Brain tumor | MIT | 0.91 Dice | No | Myronenko (2019) |
| SHG | Bone disease | GPL-3.0 | 0.78 Accuracy | No | Schmarje et al. (2019) |
| Time-lapse microscopy | Nuclei detection | N/a | 0.92 Accuracy | Yes | Shailja et al. (2021) |
| Ultrasound imaging | Intraventricular hemorrhage | MIT | 0.89 Dice | No | Valanarasu et al. (2020) |
| MRI | Brain tumor | N/a | 0.81 Dice | Yes | Larrazabal et al. (2021) |
Biomedical Image Segmentation Models. Here, CT is computed tomography, DSC is Dice similarity coefficient, AP is Average Precision, IoU is Intersection over Union, DOF is Depth of field, AUC is Area under curve, SHG is Second harmonic generation microscopy.
Finally, we also examined biomedical image analysis models aimed at less popular ML tasks, including data generation, object detection, and reconstruction (Table 3). Apart from digital pathology and CT scans, this group of models also covers light and electron microscopy. Remarkably, only 19% of the models in this group had downloadable model parameters, even though almost all of them had well-attributed open-source licenses. This may suggest that parameter sharing is not yet common in highly specialized fields like microscopy. Interestingly, for this and the other groups of ML tasks, we found that parameter sharing was more common in models submitted as part of a data challenge. This may simply be a result of data challenge participation conditions.
TABLE 3
| Imaging Modality | Biomed Task | ML Task | License | Parameters Download | References |
|---|---|---|---|---|---|
| Mammography | Breast tumor | Classification & Detection | N/a | Yes | Ribli et al. (2018) |
| Fluorescence microscopy | Cellular structures reconstruction | Data generation | Apache-2.0 | No | Eschweiler et al. (2021) |
| CT | Pulmonary nodule | Detection | Apache-2.0 | No | Zhu et al. (2018b) |
| CT | Pulmonary nodule | Detection | MIT | No | Li and Fan (2020) |
| Digital Pathology | Multiple pathologies | Graph embedding | AGPL-3.0 | No | Jaume et al. (2021) |
| Mammography | Breast tumor | Image Inpainting & Data generation | CC BY-NC-ND 4.0 | Yes | Wu et al. (2018) |
| Confocal microscopy | Cellular structures reconstruction | Reconstruction | Apache-2.0 | No | Vizcaíno et al. (2021) |
| Cryo-electron microscopy | Cellular structures reconstruction | Reconstruction | GPL-3.0 | No | Zhong et al. (2019) |
| Cryo-electron microscopy | Protein structures reconstruction | Reconstruction | GPL-3.0 | No | Ullrich et al. (2019) |
| Electron microscopy | Cellular structures reconstruction | Reconstruction | N/a | No | Guay et al. (2021) |
| 3D microscopy | Image acquisition | Reconstruction | BSD-3-Clause | No | Saha et al. (2020) |
Other Biomedical Image Models. Here, CT is computed tomography.
Meta-Analysis of Trends in Biomedical Image Analysis Models
To understand general trends in our collection of open-source models, we computed the respective fraction of each descriptive category assigned to each work. The assignment was performed through careful analysis of the respective research article, code repository, dataset repository, and the availability of the trained model parameters (Figure 1). While admittedly 50 papers constitute a relatively small sample size, we made the best reasonable effort to ensure the sampling was unbiased. Specifically, the set of models we reviewed addresses the following biomedical tasks (from most to least frequent): pulmonary nodule, brain tumor, breast tumor, cellular structures reconstruction, lung tumor, cell nuclei detection, multiple pathologies, skin tumor, protein structures reconstruction, kidney tumor, pancreas and brain tumor, lung tumor and skin tumor, host-pathogen interactions, bone disease, image acquisition, and intraventricular hemorrhage (Figure 1A).
FIGURE 1
From the perspective of imaging modalities, the models we reviewed span the following: computed tomography (CT), digital pathology, magnetic resonance imaging (MRI), mammography, fluorescence microscopy, 3D microscopy, cryo-electron microscopy, dermatoscopy, electron microscopy, confocal microscopy, CT and dermatoscopy, light and electron microscopy, second harmonic generation microscopy, time-lapse microscopy, and ultrasound imaging (Figure 1B). From the perspective of ML tasks, these models covered segmentation, reconstruction, classification, object detection, image inpainting and data generation, graph embedding, and combined classification and detection (Figure 1C). 86% of the models we reviewed addressed supervised tasks and 14% unsupervised tasks (Figure 1D).
Within our collection of open-source models, we noted that 32% of the authors selected the MIT license, 18% Apache-2.0, 12% GPL-3.0, 10% the BSD-3-Clause license, and 8% the CC BY-NC-SA 4.0 license. Remarkably, another 8% published their code without license attribution, arguably making it harder for the field to understand the freedom to operate with the code made available with the paper (Figure 1E). Within these papers, 84% of the authors made the dataset used to train the model available and clearly indicated it within the paper or the code repository (Figure 1F). Overall, the vast majority of the works we selected thus carry a clear open-source license designation as well as an available dataset.
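The fractions above were obtained by simple category counting over the meta-dataset; a minimal sketch of the tally, using the license column of Table 1 as example input:

```python
from collections import Counter

# License column of Table 1 (classification models), in row order.
licenses = [
    "Apache-2.0", "MIT", "GPL-3.0", "MIT", "MIT", "N/a", "MIT",
    "CC BY-NC-ND 4.0", "CC BY-NC-ND 4.0", "CC BY-NC-SA 4.0",
    "MIT", "GPL-3.0", "MIT", "N/a",
]

counts = Counter(licenses)
fractions = {lic: n / len(licenses) for lic, n in counts.items()}

# Print categories from most to least frequent.
for lic, frac in sorted(fractions.items(), key=lambda kv: -kv[1]):
    print(f"{lic}: {frac:.0%}")
```

The same tally applied to all 50 entries of the full meta-dataset yields the percentages reported in Figure 1E.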
Remarkably, while providing the model's source code and, in most cases, the dataset, an impressive 68% of the contributions we reviewed did not provide trained model parameters (Figure 1G). Broken down by publisher or repository, 43% and 31% of the papers published by the Nature group and Springer, respectively, provided model parameters, but only 25% of IEEE papers and 14% of arXiv preprints did. Altogether, the low percentage of shared parameters suggests that efforts to reproduce these papers come with the caveat of provisioning a hardware setup capable of wielding the computational load required by the respective model, in some cases requiring access to high-capacity computing. Furthermore, instead of simply building upon the trained models, the efforts of the authors would first have to be reproduced. Needless to say, should any of these papers become seminal, such high-performance computations would have to be repeated time and time again, possibly taking days of GPU computation.
Interestingly, of the authors who chose to make the trained parameters available to readers, around 25% deposited the parameters on GitHub, while 19% and 6% opted for the Google Drive and Dropbox services, respectively. The rest deposited their parameters on proprietary or other services (Figure 1H).
Discussion
The advent of ML, and specifically representation learning, is opening a new horizon for biomedical image analysis. Yet the success of these new advanced ML approaches brings new requirements and standards to ensure quality and reproducibility (Hernandez-Boussard et al., 2020; Mongan et al., 2020; Norgeot et al., 2020; Heil et al., 2021; Laine et al., 2021). Several minimalistic quality standards applicable to the clinical setting have been proposed (Hernandez-Boussard et al., 2020; Mongan et al., 2020; Norgeot et al., 2020), and while coming from slightly different perspectives, they overlap on essential topics like dataset description, comparison to baselines, and hyperparameter sharing. For example, the CLAIM (Mongan et al., 2020) and MINIMAR (Hernandez-Boussard et al., 2020) approaches aim to adhere to the clinical tradition. Their authors define checklists covering the structure of an academic biomedical paper, requiring either a lengthy biomedical problem description (CLAIM) or descriptive statistics of the dataset's internal structure (MINIMAR). At the same time, MI-CLAIM (Norgeot et al., 2020) adheres to the Data Science tradition, focusing specifically on data preprocessing and baseline comparison. Remarkably, even though item 24 of the CLAIM checklist explicitly mentions the importance of specifying the source of the starting weights (parameters) if transfer learning is employed, all three approaches fail to explicitly encourage sharing of the trained model parameters. Instead of proposing yet another checklist, the current survey aims to understand the extent to which model parameters are shared in the biomedical image analysis field and to emphasize the importance of parameter sharing for fostering reproducibility in the field.
The past three decades have successfully demonstrated the viability of the open-source model for research software in this field, as well as the role of open-source software in fostering scientific progress. However, the change of modeling paradigm to DL requires new checks and balances to ensure that results are reproducible and efforts are not duplicated. Furthermore, major computational efforts inevitably come with an environmental footprint (Strubell et al., 2020). Making the parameters of trained models available to the research community could not only minimize this footprint but also open new prospects for researchers wishing to fine-tune pre-trained models for their task of choice. Such an approach has proven incredibly fruitful in the field of natural language processing (Zhang et al., 2020).
Remarkably, in the current survey we found that only 32% of the biomedical models we reviewed made the trained model parameters available for download. On the one hand, this low number may be explained by the fact that many journals and conferences do not require trained models to warrant publication. On the other hand, with the parameters of some models requiring hundreds of megabytes of storage, suitable venues for sharing these files are scarce. Interestingly, while some researchers shared their trained model parameters via platforms like GitHub, Google Drive, and Dropbox, many opted for proprietary or other sites to share these parameters (Figure 1H). In our opinion, this indicates the necessity of hubs and platforms for sharing trained biomedical image analysis models.
It is worth noting that most cloud storage services like Google Drive or Dropbox are better suited to instant file sharing than to archival deposition of model parameters. These storage solutions offer neither data immutability nor digital object identifiers, and files can simply be overwritten or disappear, leaving crucial content inaccessible. Authors opting to self-host model parameters also likely underestimate the workload of serving archival data over the long term. Instead of the aforementioned approaches to model sharing, one should take advantage of efforts like BioImage.io, TensorFlow Hub (Paper, 2021), PyTorch Hub, DLHub (Chard et al., 2019), or similar, in order to foster the consistency and reproducibility of results. Arguably, one of the most intuitive model-parameter-sharing experiences for end-users is currently offered by the HuggingFace platform in the domain of natural language processing, largely thanks to the platform's own ML library, which allows for improved compatibility (Wolf et al., 2020).
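Regardless of the hosting choice, one simple mitigation is to publish a cryptographic checksum next to the parameter file, so that readers can detect a silently replaced or corrupted download; a minimal sketch (the file name is hypothetical):

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk_size):
            digest.update(block)
    return digest.hexdigest()

# Readers compare against the checksum published alongside the weights, e.g.:
# assert sha256_of("model_weights.pt") == "<published hex digest>"
```

A checksum does not replace immutable, DOI-backed deposition, but it makes later substitution of the file detectable at near-zero cost to the authors.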
Interestingly, the vast majority of authors chose MIT or Apache-2.0 as their open-source license. Both are known as permissive, rather than copyleft, licenses. Furthermore, both are clearly formulated and easy to use. It is tempting to speculate that their popularity is a result of the simplicity and openness these licenses offer.
It is worth noting, however, that our survey is limited to the papers we reviewed. To improve the representativeness of our meta-analysis, as well as to encourage the dissemination of open-source models in biomedical image analysis, we call on our peers to contribute to our collection via the GitHub repository. Specifically, we invite researchers to fork our repository, make additions to the list following the contribution guidelines, and merge them in via a pull request. This way we hope not only to obtain an up-to-date picture of the field but also to ensure that the code, datasets, and trained model parameters are easier to find.
Statements
Author contributions
AY conceived the idea. AY, ST, VS, and RL reviewed the published works and collated the data. AY, ST, VS, and RL wrote the manuscript.
Funding
This work was partially funded by the Center for Advanced Systems Understanding (CASUS) which is financed by Germany’s Federal Ministry of Education and Research (BMBF) and by the Saxon Ministry for Science, Culture and Tourism (SMWK) with tax funds on the basis of the budget approved by the Saxon State Parliament. This work has been partially funded by OPTIMA. OPTIMA is funded through the IMI2 Joint Undertaking and is listed under grant agreement No. 101034347. IMI2 receives support from the European Union’s Horizon 2020 research and innovation programme and the European Federation of Pharmaceutical Industries and Associations (EFPIA). IMI supports collaborative research projects and builds networks of industrial and academic experts in order to boost pharmaceutical innovation in Europe. The views communicated within are those of OPTIMA. Neither the IMI nor the European Union, EFPIA, or any Associated Partners are responsible for any use that may be made of the information contained herein.
Conflict of interest
AY was employed by Roche Pharma International Informatics, Roche Diagnostics GmbH, Mannheim, Germany.
The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
1
AbadiM.BarhamP.ChenJ.ChenZ.DavisA.DeanJ.et al (2016). “TensorFlow: A System for Large-Scale Machine Learning,” in 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16) (usenix.org), 265
2
Al-ShabiM.LanB. L.ChanW. Y.NgK. H.TanM. (2019). Lung Nodule Classification Using Deep Local-Global Networks. Int. J. Comput. Assist. Radiol. Surg.14, 1815–1819. 10.1007/s11548-019-01981-7
3
ArestaG.JacobsC.AraújoT.CunhaA.RamosI.van GinnekenB.et al (2019). iW-Net: an Automatic and Minimalistic Interactive Lung Nodule Segmentation Deep Network. Sci. Rep.9, 11591. 10.1038/s41598-019-48004-8
4
BaekS.HeY.AllenB. G.BuattiJ. M.SmithB. J.TongL.et al (2019). Deep Segmentation Networks Predict Survival of Non-small Cell Lung Cancer. Sci. Rep.9, 17286. 10.1038/s41598-019-53461-2
5
CarpenterA. E.JonesT. R.LamprechtM. R.ClarkeC.KangI. H.FrimanO.et al (2006). CellProfiler: Image Analysis Software for Identifying and Quantifying Cell Phenotypes. Genome Biol.7, R100. 10.1186/gb-2006-7-10-r100
6
ChardR.LiZ.ChardK.WardL.BabujiY.WoodardA.et al (2019). “DLHub: Model and Data Serving for Science,” in 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (ieeexplore.ieee.org), 283–292. 10.1109/ipdps.2019.00038
7
CoudrayN.OcampoP. S.SakellaropoulosT.NarulaN. (2018). “Classification and Mutation Prediction from Non–small Cell Lung Cancer Histopathology Images Using Deep Learning,”Nat. Med.Available at: https://www.nature.com/articles/s41591-018-0177-5?sf197831152=1.
8
DattaS. K.ShaikhM. A.SrihariS. N.GaoM. (2021). “Soft Attention Improves Skin Cancer Classification Performance,” in Interpretability of Machine Intelligence in Medical Image Computing, and Topological Data Analysis and its Applications for Medical Data (Springer International Publishing), 13–23. 10.1007/978-3-030-87444-5_2
9
de ChaumontF.DallongevilleS.Olivo-MarinJ.-C. (2011). “ICY: A New Open-Source Community Image Processing Software,” in 2011 IEEE International Symposium on Biomedical Imaging: From Nano to Macro (ieeexplore.ieee.org), 234–237. 10.1109/isbi.2011.5872395
10
DengJ.DongW.SocherR.LiL.-J.LiK.Fei-FeiL. (2009). “ImageNet: A Large-Scale Hierarchical Image Database,” in 2009 IEEE Conference on Computer Vision and Pattern Recognition, 248–255. 10.1109/cvpr.2009.5206848
11
DeyR.LuZ.HongY. (2018). “Diagnostic Classification of Lung Nodules Using 3D Neural Networks,” in 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018) (ieeexplore.ieee.org), 774–778. 10.1109/isbi.2018.8363687
12
EschweilerD.RethwischM.JarchowM.KoppersS.StegmaierJ. (2021). 3D Fluorescence Microscopy Data Synthesis for Segmentation and Benchmarking. PLoS One16, e0260509. 10.1371/journal.pone.0260509
13
FischD.YakimovichA.CloughB.WrightJ.BunyanM.HowellM.et al (2019). Defining Host-Pathogen Interactions Employing an Artificial Intelligence Workflow. Elife8, e40560. 10.7554/eLife.40560
14
GuayM. D.EmamZ. A. S.AndersonA. B.AronovaM. A.PokrovskayaI. D.StorrieB.et al (2021). Dense Cellular Segmentation for EM Using 2D-3D Neural Network Ensembles. Sci. Rep.11, 2561–2611. 10.1038/s41598-021-81590-0
15
GuoH.KrugerU.WangG.KalraM. K.YanP. (2020). Knowledge-Based Analysis for Mortality Prediction from CT Images. IEEE J. Biomed. Health Inf.24, 457–464. 10.1109/JBHI.2019.2946066
16
HancockM. C.MagnanJ. F. (2019). Level Set Image Segmentation with Velocity Term Learned from Data with Applications to Lung Nodule Segmentation. arXiv [eess.IV]. Available at: http://arxiv.org/abs/1910.03191 (Accessed March 31, 2022).
17
HavaeiM.DavyA.Warde-FarleyD.BiardA.CourvilleA.BengioY.et al (2017). Brain Tumor Segmentation with Deep Neural Networks. Med. Image Anal.35, 18–31. 10.1016/j.media.2016.05.004
18
HeilB. J.HoffmanM. M.MarkowetzF.LeeS. I.GreeneC. S.HicksS. C. (2021). Reproducibility Standards for Machine Learning in the Life Sciences. Nat. Methods18, 1132–1135. 10.1038/s41592-021-01256-7
19
Hernandez-BoussardT.BozkurtS.IoannidisJ. P. A.ShahN. H. (2020). MINIMAR (MINimum Information for Medical AI Reporting): Developing Reporting Standards for Artificial Intelligence in Health Care. J. Am. Med. Inf. Assoc.27, 2011–2015. 10.1093/jamia/ocaa088
20
HirschP.KainmuellerD. (2020). “An Auxiliary Task for Learning Nuclei Segmentation in 3D Microscopy Images,” in Proceedings Of the Third Conference On Medical Imaging With Deep Learning Proceedings of Machine Learning Research. Editors ArbelT.Ben AyedI.de BruijneM.DescoteauxM.LombaertH.PalC. (Montreal, QC, Canada: PML), 304
21
HirschP.MaisL.KainmuellerD. (2020). PatchPerPix for Instance Segmentation. arXiv [cs.CV]. Available at: http://arxiv.org/abs/2001.07626 (Accessed March 30, 2022).
22
HollandiR.SzkalisityA.TothT.TasnadiE.MolnarC.MatheB.et al (2020). nucleAIzer: A Parameter-free Deep Learning Framework for Nucleus Segmentation Using Image Style Transfer. Cell. Syst.10, 453–e6. 10.1016/j.cels.2020.04.003
23
IsenseeF.KickingerederP.WickW.BendszusM.Maier-HeinK. H. (2018). “Brain Tumor Segmentation and Radiomics Survival Prediction: Contribution to the BRATS 2017 Challenge,” in Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries (Quebec City, Canada: Springer International Publishing), 287–297. 10.1007/978-3-319-75238-9_25
24
JaumeG.PatiP.AnklinV.FoncubiertaA.GabraniM. (2021). “HistoCartography: A Toolkit for Graph Analytics in Digital Pathology,” in Proceedings Of the MICCAI Workshop On Computational Pathology Proceedings of Machine Learning Research. Editors AtzoriM.BurlutskiyN.CiompiF.LiZ.MinhasF.MüllerH. (PMLR), 117
Kaul, C., Manandhar, S., Pears, N. (2019). "Focusnet: An Attention-Based Fully Convolutional Network for Medical Image Segmentation," in 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019) (IEEE), 455–458. 10.1109/isbi.2019.8759477
Keetha, N. V., Samson, A. B. P., Annavarapu, C. S. R. (2020). U-det: A Modified U-Net Architecture with Bidirectional Feature Network for Lung Nodule Segmentation. arXiv [eess.IV]. Available at: http://arxiv.org/abs/2003.09293 (Accessed March 31, 2022).
Khened, M., Kori, A., Rajkumar, H., Krishnamurthi, G., Srinivasan, B. (2021). A Generalized Deep Learning Framework for Whole-Slide Image Segmentation and Analysis. Sci. Rep. 11, 11579. 10.1038/s41598-021-90444-8
Laine, R. F., Arganda-Carreras, I., Henriques, R., Jacquemet, G. (2021). Avoiding a Replication Crisis in Deep-Learning-Based Bioimage Analysis. Nat. Methods 18, 1136–1144. 10.1038/s41592-021-01284-3
LaLonde, R., Torigian, D., Bagci, U. (2020). "Encoding Visual Attributes in Capsules for Explainable Medical Diagnoses," in Medical Image Computing and Computer Assisted Intervention – MICCAI 2020 (Springer International Publishing), 294–304. 10.1007/978-3-030-59710-8_29
Landset, S., Khoshgoftaar, T. M., Richter, A. N., Hasanin, T. (2015). A Survey of Open Source Tools for Machine Learning with Big Data in the Hadoop Ecosystem. J. Big Data 2, 1–36. 10.1186/s40537-015-0032-1
Larrazabal, A. J., Martínez, C., Dolz, J., Ferrante, E. (2021). "Orthogonal Ensemble Networks for Biomedical Image Segmentation," in Medical Image Computing and Computer Assisted Intervention – MICCAI 2021 (Strasbourg, France: Springer International Publishing), 594–603. 10.1007/978-3-030-87199-4_56
Le, H., Gupta, R., Hou, L., Abousamra, S., Fassler, D., Torre-Healy, L., et al. (2020). Utilizing Automated Breast Cancer Detection to Identify Spatial Distributions of Tumor-Infiltrating Lymphocytes in Invasive Breast Cancer. Am. J. Pathol. 190, 1491–1504. 10.1016/j.ajpath.2020.03.012
Lee, K., Zung, J., Li, P., Jain, V., Sebastian Seung, H. (2017). Superhuman Accuracy on the SNEMI3D Connectomics Challenge. arXiv [cs.CV]. Available at: http://arxiv.org/abs/1706.00120.
Li, Y., Fan, Y. (2020). DeepSEED: 3D Squeeze-and-Excitation Encoder-Decoder Convolutional Neural Networks for Pulmonary Nodule Detection. Proc. IEEE Int. Symp. Biomed. Imaging 2020, 1866–1869. 10.1109/ISBI45749.2020.9098317
Mongan, J., Moy, L., Kahn, C. E., Jr. (2020). Checklist for Artificial Intelligence in Medical Imaging (CLAIM): A Guide for Authors and Reviewers. Radiol. Artif. Intell. 2, e200029. 10.1148/ryai.2020200029
Müller, D., Kramer, F. (2021). MIScnn: A Framework for Medical Image Segmentation with Convolutional Neural Networks and Deep Learning. BMC Med. Imaging 21, 12. 10.1186/s12880-020-00543-7
Myronenko, A. (2019). "3D MRI Brain Tumor Segmentation Using Autoencoder Regularization," in Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries (Granada, Spain: Springer International Publishing), 311–320. 10.1007/978-3-030-11726-9_28
Norgeot, B., Quer, G., Beaulieu-Jones, B. K., Torkamani, A., Dias, R., Gianfrancesco, M., et al. (2020). Minimum Information about Clinical Artificial Intelligence Modeling: The MI-CLAIM Checklist. Nat. Med. 26, 1320–1324. 10.1038/s41591-020-1041-y
Oktay, O., Schlemper, J., Le Folgoc, L., Lee, M., Heinrich, M., Misawa, K., et al. (2018). Attention U-Net: Learning Where to Look for the Pancreas. arXiv [cs.CV]. Available at: http://arxiv.org/abs/1804.03999 (Accessed March 31, 2022).
Pan, S., Yang, Q. (2010). A Survey on Transfer Learning. IEEE Trans. Knowl. Data Eng. 22 (10). 10.1109/tkde.2009.191
Paper, D. (2021). "Simple Transfer Learning with TensorFlow Hub," in State-of-the-Art Deep Learning Models in TensorFlow: Modern Machine Learning in the Google Colab Ecosystem. Editor Paper, D. (Berkeley, CA: Apress), 153–169. 10.1007/978-1-4842-7341-8_6
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., et al. (2019). PyTorch: An Imperative Style, High-Performance Deep Learning Library. Adv. Neural Inf. Process. Syst. 32. Available at: https://proceedings.neurips.cc/paper/2019/hash/bdbca288fee7f92f2bfa9f7012727740-Abstract.html (Accessed March 24, 2022).
Pati, P., Jaume, G., Foncubierta-Rodríguez, A., Feroce, F., Anniciello, A. M., Scognamiglio, G., et al. (2022). Hierarchical Graph Representations in Digital Pathology. Med. Image Anal. 75, 102264. 10.1016/j.media.2021.102264
Rakhlin, A., Shvets, A., Iglovikov, V., Kalinin, A. A. (2018). "Deep Convolutional Neural Networks for Breast Cancer Histology Image Analysis," in Image Analysis and Recognition (Póvoa de Varzim, Portugal: Springer International Publishing), 737–744. 10.1007/978-3-319-93000-8_83
Ribli, D., Horváth, A., Unger, Z., Pollner, P., Csabai, I. (2018). Detecting and Classifying Lesions in Mammograms with Deep Learning. Sci. Rep. 8, 4165. 10.1038/s41598-018-22437-z
Saha, D., Schmidt, U., Zhang, Q., Barbotin, A., Hu, Q., Ji, N., et al. (2020). Practical Sensorless Aberration Estimation for 3D Microscopy with Deep Learning. Opt. Express 28, 29044–29053. 10.1364/OE.401933
Schindelin, J., Arganda-Carreras, I., Frise, E., Kaynig, V., Longair, M., Pietzsch, T., et al. (2012). Fiji: An Open-Source Platform for Biological-Image Analysis. Nat. Methods 9, 676–682. 10.1038/nmeth.2019
Schmarje, L., Zelenka, C., Geisen, U., Glüer, C.-C., Koch, R. (2019). "2D and 3D Segmentation of Uncertain Local Collagen Fiber Orientations in SHG Microscopy," in Pattern Recognition (Dortmund, Germany: Springer International Publishing), 374–386. 10.1007/978-3-030-33676-9_26
Schneider, C. A., Rasband, W. S., Eliceiri, K. W. (2012). NIH Image to ImageJ: 25 Years of Image Analysis. Nat. Methods 9, 671–675. 10.1038/nmeth.2089
Shailja, S., Jiang, J., Manjunath, B. S. (2021). "Semi-supervised Segmentation and Graph-Based Tracking of 3D Nuclei in Time-Lapse Microscopy," in 2021 IEEE 18th International Symposium on Biomedical Imaging (ISBI) (Nice, France: IEEE), 385–389. 10.1109/isbi48211.2021.9433831
Shen, L., Margolies, L. R., Rothstein, J. H., Fluder, E., McBride, R., Sieh, W. (2019). Deep Learning to Improve Breast Cancer Detection on Screening Mammography. Sci. Rep. 9, 12495. 10.1038/s41598-019-48995-4
Shen, Y., Wu, N., Phang, J., Park, J., Liu, K., Tyagi, S., et al. (2021). An Interpretable Classifier for High-Resolution Breast Cancer Screening Images Utilizing Weakly Supervised Localization. Med. Image Anal. 68, 101908. 10.1016/j.media.2020.101908
Sonnenburg, S., Braun, M. L., Ong, C. S., Bengio, S., Bottou, L., Holmes, G., et al. (2007). The Need for Open Source Software in Machine Learning. Available at: https://www.jmlr.org/papers/volume8/sonnenburg07a/sonnenburg07a.pdf (Accessed March 24, 2022).
Stringer, C., Wang, T., Michaelos, M., Pachitariu, M. (2021). Cellpose: A Generalist Algorithm for Cellular Segmentation. Nat. Methods 18, 100–106. 10.1038/s41592-020-01018-x
Strubell, E., Ganesh, A., McCallum, A. (2020). Energy and Policy Considerations for Modern Deep Learning Research. AAAI 34, 13693–13696. 10.1609/aaai.v34i09.7123
Tiwari, A., Sekhar, A. K. (2007). Workflow Based Framework for Life Science Informatics. Comput. Biol. Chem. 31, 305–319. 10.1016/j.compbiolchem.2007.08.009
Tomita, N., Abdollahi, B., Wei, J., Ren, B., Suriawinata, A., Hassanpour, S. (2019). Attention-Based Deep Neural Networks for Detection of Cancerous and Precancerous Esophagus Tissue on Histopathological Slides. JAMA Netw. Open 2, e1914645. 10.1001/jamanetworkopen.2019.14645
Ullrich, K., van den Berg, R., Brubaker, M., Fleet, D., Welling, M. (2019). Differentiable Probabilistic Models of Scientific Imaging with the Fourier Slice Theorem. arXiv [cs.LG]. Available at: http://arxiv.org/abs/1906.07582 (Accessed March 31, 2022).
Valanarasu, J. M. J., Sindagi, V. A., Hacihaliloglu, I., Patel, V. M. (2020). "KiU-Net: Towards Accurate Segmentation of Biomedical Images Using Over-complete Representations," in Medical Image Computing and Computer Assisted Intervention – MICCAI 2020 (Lima, Peru: Springer International Publishing), 363–373. 10.1007/978-3-030-59719-1_36
Vizcaíno, J. P., Saltarin, F., Belyaev, Y., Lyck, R., Lasser, T., Favaro, P., et al. (2021). Learning to Reconstruct Confocal Microscopy Stacks from Single Light Field Images. IEEE Trans. Comput. Imaging 7, 775. 10.1109/TCI.2021.3097611
Wang, G., Li, W., Ourselin, S., Vercauteren, T. (2018). "Automatic Brain Tumor Segmentation Using Cascaded Anisotropic Convolutional Neural Networks," in Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries (Quebec City, Canada: Springer International Publishing), 178–190. 10.1007/978-3-319-75238-9_16
Wang, W., Chen, C., Ding, M., Yu, H., Zha, S., Li, J. (2021). "TransBTS: Multimodal Brain Tumor Segmentation Using Transformer," in Medical Image Computing and Computer Assisted Intervention – MICCAI 2021, Strasbourg, France, September 27–October 1, 2021 (Springer International Publishing), 109–119. 10.1007/978-3-030-87193-2_11
Wei, J. W., Tafe, L. J., Linnik, Y. A., Vaickus, L. J., Tomita, N., Hassanpour, S. (2019). Pathologist-level Classification of Histologic Patterns on Resected Lung Adenocarcinoma Slides with Deep Neural Networks. Sci. Rep. 9, 3358. 10.1038/s41598-019-40041-7
Weigert, M., Schmidt, U., Haase, R., Sugawara, K., Myers, G. (2020). "Star-convex Polyhedra for 3D Object Detection and Segmentation in Microscopy," in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Snowmass Village, CO, United States, March 2–5, 2020, 3666–3673. 10.1109/wacv45572.2020.9093435
West, J., Ventura, D., Warnick, S. (2007). Spring Research Presentation: A Theoretical Foundation for Inductive Transfer, 1. Provo, UT, United States: Brigham Young University, College of Physical and Mathematical Sciences.
Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., et al. (2020). "Transformers: State-of-the-Art Natural Language Processing," in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Online, November 16–20, 2020 (Association for Computational Linguistics). 10.18653/v1/2020.emnlp-demos.6
Wu, E., Wu, K., Cox, D., Lotter, W. (2018). "Conditional Infilling GANs for Data Augmentation in Mammogram Classification," in Image Analysis for Moving Organ, Breast, and Thoracic Images (Granada, Spain: Springer International Publishing), 98–106. 10.1007/978-3-030-00946-5_11
Wu, Y., Rivenson, Y., Wang, H., Luo, Y., Ben-David, E., Bentolila, L. A., et al. (2019). Three-dimensional Virtual Refocusing of Fluorescence Microscopy Images Using Deep Learning. Nat. Methods 16, 1323–1331. 10.1038/s41592-019-0622-5
Yakimovich, A., Beaugnon, A., Huang, Y., Ozkirimli, E. (2021). Labels in a Haystack: Approaches beyond Supervised Learning in Biomedical Applications. Patterns 2, 100383. 10.1016/j.patter.2021.100383
Zhang, N., Li, L., Deng, S., Yu, H., Cheng, X., Zhang, W., et al. (2020). Can Fine-Tuning Pre-trained Models Lead to Perfect NLP? A Study of the Generalizability of Relation Extraction. Available at: https://openreview.net/forum?id=3yzIj2-eBbZ (Accessed April 4, 2022).
Zhong, E. D., Bepler, T., Davis, J. H., Berger, B. (2019). Reconstructing Continuous Distributions of 3D Protein Structure from Cryo-EM Images. arXiv [q-bio.QM]. Available at: http://arxiv.org/abs/1909.05215 (Accessed March 30, 2022).
Zhu, W., Liu, C., Fan, W., Xie, X. (2018a). "DeepLung: Deep 3D Dual Path Nets for Automated Pulmonary Nodule Detection and Classification," in 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, NV, United States, March 12–15, 2018, 673–681. 10.1109/wacv.2018.00079
Zhu, W., Vang, Y. S., Huang, Y., Xie, X. (2018b). "DeepEM: Deep 3D ConvNets with EM for Weakly Supervised Pulmonary Nodule Detection," in Medical Image Computing and Computer Assisted Intervention – MICCAI 2018 (Granada, Spain: Springer International Publishing), 812–820. 10.1007/978-3-030-00934-2_90
Zunair, H., Ben Hamza, A. (2020). Melanoma Detection Using Adversarial Training and Deep Transfer Learning. Phys. Med. Biol. 65, 135005. 10.1088/1361-6560/ab86d3
Keywords
machine learning, deep learning, open source, bioimaging, image analysis, medical imaging
Citation
Li R, Sharma V, Thangamani S and Yakimovich A (2022) Open-Source Biomedical Image Analysis Models: A Meta-Analysis and Continuous Survey. Front. Bioinform. 2:912809. doi: 10.3389/fbinf.2022.912809
Received
04 April 2022
Accepted
13 June 2022
Published
05 July 2022
Volume
2 - 2022
Edited by
Jan Eglinger, Friedrich Miescher Institute for Biomedical Research (FMI), Switzerland
Reviewed by
Jean-Karim Hériché, European Molecular Biology Laboratory Heidelberg, Germany
Christopher Schmied, Fondazione Human Technopole, Italy
Copyright
© 2022 Li, Sharma, Thangamani and Yakimovich.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Artur Yakimovich, a.yakimovich@hzdr.de
†These authors have contributed equally to this work
This article was submitted to Computational BioImaging, a section of the journal Frontiers in Bioinformatics
Disclaimer
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.