SSAM-lite: A Light-Weight Web App for Rapid Analysis of Spatially Resolved Transcriptomics Data

The combination of a cell’s transcriptional profile and location defines its function in a spatial context. Spatially resolved transcriptomics (SRT) has emerged as the assay of choice for characterizing cells in situ. SRT methods can resolve gene expression up to single-molecule resolution. A particular computational problem with single-molecule SRT methods is the correct aggregation of mRNA molecules into cells. Traditionally, aggregating mRNA molecules into cell-based features begins with the identification of cells via segmentation of the nucleus or the cell membrane. However, recently a number of cell-segmentation-free approaches have emerged. While these methods have been demonstrated to be more performant than segmentation-based approaches, they are still not easily accessible since they require specialized knowledge of programming languages and access to large computational resources. Here we present SSAM-lite, a tool that provides an easy-to-use graphical interface to perform rapid and segmentation-free cell-typing of SRT data in a web browser. SSAM-lite runs locally and does not require computational experts or specialized hardware. Analysis of a tissue slice of the mouse somatosensory cortex took less than a minute on a laptop with modest hardware. Parameters can interactively be optimized on small portions of the data before the entire tissue image is analyzed. A server version of SSAM-lite can be run completely offline using local infrastructure. Overall, SSAM-lite is portable, lightweight, and easy to use, thus enabling a broad audience to investigate and analyze single-molecule SRT data.


Differences between SSAM and SSAM-lite
In order to improve the simplicity, usability and performance of SSAM-lite, we implemented certain heuristics and changed the feature set of SSAM-lite compared to SSAM's guided mode. This allowed for the implementation of a cleaner graphical user interface and simplified the parameter definitions. The differences are described in the following text.

KDE heuristics
By default, the original SSAM package uses the sklearn implementation of KDE in its run_kde function. It also provides a fast_kde function, which employs a heuristic to produce the KDE: instead of computing an activation value for every pixel in the pixel matrix, it visits only pixels close to the input coordinates and updates the surrounding pixels up to a distance beyond which the activation is considered negligible (the Gaussian curve decays with distance from the mean).
The difference between SSAM-lite's KDE implementation and SSAM's fast_kde lies in the way local pixels are updated in the pixel matrix. SSAM applies the Gaussian KDE only to a small cutout of three bandwidths centered at the molecule coordinate, whereas SSAM-lite first assigns each molecule coordinate to a position in pixel space and then projects a pre-calculated Gaussian kernel template onto the pixel space. This improves performance because the Gaussian kernel has to be generated only once, and the results of the KDE are obtained directly in pixel space (Figure S2). However, the output of SSAM-lite's KDE can deviate slightly from SSAM's original output when both bandwidth and resolution are low, although we believe these differences are negligible (Figure S2).
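The template projection described above can be illustrated with a minimal Python sketch. It is not the SSAM-lite source code: the function names, the unnormalized kernel, the three-bandwidth cutoff and the rounding of each coordinate to its nearest pixel are assumptions made for illustration, and only the standard library is used to keep the example self-contained.

```python
import math

def gaussian_template(bandwidth, cutoff=3.0):
    """Build an unnormalized Gaussian kernel template once (assumed cutoff: 3 bandwidths)."""
    radius = int(math.ceil(cutoff * bandwidth))
    template = [
        [math.exp(-(dx * dx + dy * dy) / (2.0 * bandwidth ** 2))
         for dx in range(-radius, radius + 1)]
        for dy in range(-radius, radius + 1)
    ]
    return template, radius

def stamp_kde(coords, width, height, bandwidth):
    """Add the pre-computed template around the pixel assigned to each molecule."""
    template, radius = gaussian_template(bandwidth)
    grid = [[0.0] * width for _ in range(height)]
    for x, y in coords:
        # Assumption: each molecule is snapped to its nearest pixel.
        px, py = int(round(x)), int(round(y))
        for ty, row in enumerate(template):
            gy = py + ty - radius
            if not 0 <= gy < height:
                continue  # template row falls outside the pixel matrix
            for tx, value in enumerate(row):
                gx = px + tx - radius
                if 0 <= gx < width:
                    grid[gy][gx] += value
    return grid

# Hypothetical usage: one molecule near the centre of an 11 x 11 pixel matrix.
density = stamp_kde([(5.2, 4.9)], width=11, height=11, bandwidth=1.0)
```

In this sketch the sub-pixel offset of each molecule is discarded when it is snapped to a pixel. This is one plausible source of the small deviations at low bandwidth and resolution mentioned above.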

Per-gene expression threshold
SSAM implements a per-gene expression threshold in addition to the total expression threshold. We found that the total expression threshold always dominated the per-gene expression threshold, so removing the per-gene threshold has a negligible effect and allows us to simplify SSAM-lite by reducing the number of parameters.

Automated scaling of coordinates to pixel matrix size
Unlike the original SSAM, SSAM-lite includes a subroutine that automatically scales the input coordinates to fill the complete pixel matrix and determines the width-to-height ratio from the coordinate input. This leads to optimal memory usage and gives direct control over the vector field size in pixels. In SSAM, by contrast, the user must provide the (metric) sample height and width externally and then define the shape of the pixel matrix through a scaling factor that determines the sampling distance. Beyond optimizing memory consumption, this eliminates an additional parameter and allows for a simpler interface.
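A minimal sketch of such a scaling step is given below. The function name, the choice to anchor the minimum coordinate at pixel 0 and the rounding of the derived height are assumptions for illustration, not the SSAM-lite implementation.

```python
import math

def scale_to_pixel_matrix(coords, pixel_width):
    """Rescale (x, y) coordinates so they span a matrix of the requested width.

    The height is derived from the coordinate extent so that the aspect ratio
    is preserved (assumed behaviour, for illustration only).
    """
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    x_min, y_min = min(xs), min(ys)
    x_span = max(xs) - x_min
    y_span = max(ys) - y_min
    if x_span <= 0:
        raise ValueError("coordinates must span a non-zero width")
    scale = (pixel_width - 1) / x_span          # one factor for both axes
    pixel_height = int(math.ceil(y_span * scale)) + 1
    scaled = [((x - x_min) * scale, (y - y_min) * scale) for x, y in coords]
    return scaled, pixel_width, pixel_height

# Hypothetical usage: metric coordinates mapped onto a 101-pixel-wide matrix.
scaled, w, h = scale_to_pixel_matrix(
    [(100.0, 50.0), (300.0, 150.0), (200.0, 100.0)], pixel_width=101
)
```

Because the matrix shape follows from the data, the only size-related parameter the user sets in this sketch is the pixel width.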

Area-based structure removal
The original SSAM implementation includes a function to remove small structures from the output cell-type map by (i) removing structures ('blobs') with a pixel count below a user-defined threshold and (ii) filling empty blobs below a certain pixel count in the cell-type map. These features were omitted from SSAM-lite's initial release because we did not consider them essential, and including them would have complicated the interface. However, we are considering including them in a future release.

Correlation-based cutoff
The original SSAM algorithm provides a threshold for the minimal correlation required before a pixel is classified, whereas SSAM-lite classifies all pixels above the expression threshold. We did not implement the correlation threshold because, of the two thresholding concepts, the pixel matrix norm threshold (which is used in SSAM-lite) seemed more intuitive and, in our experience, gives a good estimate of whether a cell is present locally.
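The following sketch illustrates classification with an expression threshold only. It assumes that the pixel norm is the sum of the per-gene KDE values and that each retained pixel is assigned to the signature with the highest Pearson correlation. Both are assumptions for illustration, and the function names are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equally long vectors (0.0 if one is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def classify_pixels(pixel_vectors, signatures, threshold):
    """Label every pixel whose total expression exceeds the threshold."""
    labels = []
    for vec in pixel_vectors:
        if sum(vec) <= threshold:      # assumed norm: total expression
            labels.append(None)        # pixel left unclassified
            continue
        # No minimal-correlation cutoff: the best-matching signature always wins.
        labels.append(max(signatures, key=lambda name: pearson(vec, signatures[name])))
    return labels

# Hypothetical usage with two cell-type signatures over three genes.
signatures = {"A": [1.0, 0.0, 0.0], "B": [0.0, 1.0, 1.0]}
pixels = [[5.0, 0.5, 0.2], [0.1, 4.0, 3.0], [0.1, 0.1, 0.0]]
labels = classify_pixels(pixels, signatures, threshold=2.0)
```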

Input and output masks
The original SSAM can make use of user-provided input and output masks, which restrict the data processing to certain parts of the image. This feature does not affect the computation of the guided mode analysis. The masks must be computed by an external tool and require careful fitting to the output cell-type map, which is why we decided against including them in our light-weight tool.

Low-end hardware
To demonstrate the algorithm's ability to scale to modest hardware, we analyzed the mouse SSp osmFISH dataset by Codeluppi et al. on a Lenovo b570e laptop and a 2017 Samsung S8+ smartphone running Android 9. The SSAM-lite algorithm was run with the parameter settings bandwidth=5, pixel width=500 and threshold=2. The runtime and memory footprint of the function runKDE were monitored using the DevTools performance monitoring tool of Chrome v96. The procedure was repeated three times per test.

SSAM-lite KDE heuristic performance increase
To demonstrate that our new implementation of KDE is substantially faster than the implementation in SSAM, we implemented SSAM-lite's heuristic in Python and compared it to the naive, C-based KDE function run_kde, which is the default in the SSAM package, and to a Python version of SSAM's fast_kde algorithm.
Using a simulated dataset of random coordinates on a square patch, we determined the resource consumption characteristics of the different KDE implementations by varying the 'cell-type map width', 'n_coordinates' and 'kernel bandwidth' parameters. The naive SSAM run_kde proved the slowest, with runtimes typically about 100-fold longer than those of the SSAM fast_kde heuristic. SSAM-lite's added heuristic outperformed SSAM's fast_kde by up to 10-fold in terms of runtime. The runtime of SSAM's default run_kde algorithm scaled quadratically with cell-type map width (that is, linearly with the cell-type map pixel count), whereas the SSAM fast_kde and the SSAM-lite heuristic implementations showed a constant runtime for all cell-type map sizes (Figure S3).
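The scaling behaviour can be made plausible with a simple cost model, sketched below in Python. It is not the benchmark code and makes two assumptions: a brute-force KDE evaluates every pixel against every coordinate, and a local heuristic touches only a square window of three bandwidths around each coordinate. The function names are hypothetical, and the model ignores the cost of allocating the pixel matrix.

```python
import math

def brute_force_updates(map_width, map_height, n_coordinates):
    """Kernel evaluations if every pixel is compared with every coordinate."""
    return map_width * map_height * n_coordinates

def local_updates(n_coordinates, bandwidth, cutoff=3.0):
    """Pixel updates if only a (2r + 1) x (2r + 1) window per coordinate is touched."""
    side = 2 * int(math.ceil(cutoff * bandwidth)) + 1
    return n_coordinates * side * side

# Doubling the map width (on a square patch) quadruples the brute-force cost,
# while the local cost does not depend on the map size at all.
small = brute_force_updates(100, 100, 1000)
large = brute_force_updates(200, 200, 1000)
local = local_updates(1000, bandwidth=2.0)
```

Under this model, the local cost depends only on the number of coordinates and the bandwidth, which is consistent with the constant runtime observed across cell-type map sizes.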