A Gene Transfer-Positive Cell Sorting System Utilizing Membrane-Anchoring Affinity Tag

Gene delivery efficiency is a key limiting factor in gene studies and gene therapy, especially for cells that are hard to transfect. Here we develop an affinity cell sorting system that allows efficient enrichment of gene transfer-positive cells. The system expresses an enhanced green fluorescent protein (EGFP) fused with an N-terminal high-affinity Twin-Strep-Tag (TST), which is anchored to the outer surface of the cell membrane through a glycosylphosphatidylinositol (GPI) membrane-anchoring structure. The EGFP permits microscopy and flow cytometry analysis of the gene transfer-positive cells, and the TST tag at the N terminus of EGFP allows efficient affinity sorting of the positive cells using Strep-Tactin magnetic beads. The cell sorting system enables efficient isolation of gene transfer-positive cells in a simple, convenient, and fast manner. Cell sorting on transfected K-562 cells resulted in a final positive cell percentage of up to 95.0% with a positive cell enrichment fold of 5.8 times. In gene overexpression experiments, cell sorting increased the overexpression fold from 10 times to 58 times, and in shRNA gene knockdown experiments, it increased the gene knockdown efficiency from 12% to 53%. In addition, cell sorting in CRISPR/Cas9 genome editing experiments allowed more substantial gene modification, with the editing percentage increasing from 20% to 79%. The gene transfer-positive cell sorting system holds great potential for all gene transfer studies, especially those on hard-to-transfect cells.


INTRODUCTION
In vitro gene delivery has achieved great progress (Zhao et al., 2012). Both virus-mediated and nonvirus-mediated gene delivery can attain high transfection efficiency for most cells (Lundstrom, 2003; Woods et al., 2003; Heller et al., 2005; Abbasalipour et al., 2019). However, the transfection efficiency is still inadequate for hard-to-transfect cells such as lymphoma/leukemia cells and primary cells (Huang et al., 1998; Migliaccio et al., 2000; Guven et al., 2005). Improving the positive cell percentage for these cells remains a key issue in gene function study. Hotta et al., 2009; Moriarity et al., 2014). However, it is not applicable to suspension cells because the dead cells killed by the drug cannot be removed. In practice, a pilot experiment is usually required to determine the drug concentration for each cell line because cell lines differ in their sensitivity to drug treatment. In addition, it is worth noting that drug administration may cause toxicity and lead to unpredictable side effects on gene expression and cell signaling.
The FACS method employs vectors expressing fluorescent proteins, such as EGFP, mCherry, RFP, YFP, and BFP, and sorts the fluorescence-positive cells on a flow cytometer after gene delivery (Sutermaster and Darling, 2019; Pan and Wan, 2020). This method requires a flow cytometer equipped with a sorting module, which is expensive and not readily accessible to most labs. In addition, although the procedure is simple, the limited sorting speed makes it less suitable for experiments that require a large number of cells (Sutermaster and Darling, 2019; Pan and Wan, 2020).
The common MACS method utilizes antibody-conjugated magnetic microspheres to bind and sort the target cells expressing the corresponding antigen on the cell surface (Pan and Wan, 2020). For this purpose, H-2Kk (Wei et al., 2001) and truncated LNGFR (Matheson et al., 2014) are usually encoded on the vectors and localize to the cell surface when expressed in the transfected cells. The transfection-positive cells can then be isolated using magnetic beads coupled with the corresponding antibody or binding ligand. Because these molecules themselves have important biological functions, their overexpression and membrane anchoring on the cell surface might alter the gene expression profile and the cell phenotypes. For example, LNGFR is a type I transmembrane cell surface glycoprotein of the tumor necrosis factor receptor superfamily (Dechant and Barde, 2002). Overexpression of LNGFR can promote the osteogenic differentiation of rat extraembryonic mesenchymal stem cells, so using LNGFR as the sorting marker may alter normal cell signaling and influence the experimental results (Li et al., 2017). Therefore, we still lack a fast, simple, and widely applicable system to enrich gene transfer-positive cells.
Here, we describe a versatile gene transfer-positive cell sorting system based on an affinity fluorescent tag protein that is encoded on a vector and localizes to the cell surface once expressed. Specifically, the tag comprises the Enhanced Green Fluorescent Protein (EGFP) with an N-terminal Twin-Strep-Tag (TST) (Schmidt et al., 2013; Maertens et al., 2015; Yeliseev et al., 2017) and a C-terminal membrane positioning signal module. The gene transfer-positive cells express and display the affinity fluorescent tags on the cell surface and, therefore, can be sorted using Magrose Strep-Tactin magnetic beads, which bind TST tags with high affinity. This positive cell sorting system is efficient, simple, low-cost, and convenient to operate, and thus has great potential in diverse gene function research and related applications, including gene overexpression, gene knockdown, reporter gene assays, and genome editing.

Design the Affinity Fluorescent Sorting Tag Protein
We designed the sorting tag protein on the basis of enhanced green fluorescent protein (EGFP) to allow the identification of the gene transfer-positive cells with microscopy and flow cytometry. To realize affinity sorting of the positive cells, we fused EGFP with an N-terminal TST tag, the dimer version of Strep-TagII that binds Strep-Tactin with high affinity (Rai et al., 2014), resembling the streptavidin-biotin interaction (Schmidt and Skerra, 2015). Further, to locate the sorting tag protein on the cell surface, we started from six membrane-anchoring modules. Among them, three membrane-anchoring motifs from BY55, DAF, and CEAM7 can anchor the protein to the outer layer of the lipid bilayer through a glycosylphosphatidylinositol (GPI) molecule (Ferguson and Williams, 1988; Kinoshita et al., 2008; Paulick and Bertozzi, 2008) (Figure 1A). Three modules from ITB3, ITA5, and ITAV are transmembrane domains (TMDs), which can insert into the eukaryotic cell membrane and anchor the molecules to it (Ling et al., 1999; Winnard et al., 2007; Fu et al., 2019) (Figure 1A). We used SignalP-5.0 Server (Almagro Armenteros et al., 2019) to predict the corresponding module sequences and splicing sites of the given signal peptide (Supplementary Table S1). Meanwhile, to avoid evoking potential integrin signaling, we replaced all the intracellular amino acid residues involved in functional interactions (Arnaout et al., 2007) (Supplementary Table S2).
The GPI modification motif coding sequences, the transmembrane domain coding sequences, and the TST coding sequence were synthesized and joined with the reading frame of EGFP by multiplex PCR and then cloned into the pEGFP-C2 vector. Transfected cells harboring these plasmids will display corresponding sorting tags on the cell surface. Strep-Tactin magnetic beads can thus bind and enrich the transfection-positive cells (Figure 1B).

Sorting Tags Locate to the Cell Surface
First, to evaluate the cell membrane targeting ability of the six sorting tags, we transfected the corresponding expression plasmids into Lenti-X 293T cells growing on glass slides and observed them by confocal laser scanning fluorescence microscopy. We found that all six sorting tags were expressed at a high level and effectively located to the cell membrane (Figure 2). The EGFP protein without a membrane positioning signal was distributed throughout the whole cell (Figure 2A), and the three GPI sorting fluorescent tags exhibited an obvious membrane targeting effect (Figure 2B), to a higher degree than the three TMD sorting tags (Figure 2C). Notably, the cells expressing GPI sorting tags displayed unaffected cell morphology, but the cells expressing TMD sorting tags presented a rounded shape (Figure 2C). This indicates that overexpression of the integrin TMD proteins might disturb cell adhesion.

Affinity Cell Sorting Enriches Transfection-Positive Cells
We transfected the sorting tag vectors into three cell lines and performed affinity cell sorting with the Strep-Tactin magnetic beads to enrich the transfection-positive cells. We found that, in K-562, a suspension leukemia cell line, all six sorting tags significantly enriched fluorescence-positive cells. For the three GPI type sorting tags, TST-EGFP-GPI BY55, TST-EGFP-GPI DAF, and TST-EGFP-GPI CEAM7, the affinity sorting increased the positive cell percentage from 15%, 16%, and 16% to 86%, 87%, and 88%, respectively, as determined with flow cytometry (Figures 3A,B). Meanwhile, for the three TMD type sorting tags, TST-EGFP-TM ITB3, TST-EGFP-TM ITA5, and TST-EGFP-TM ITAV, the positive cell ratio increased from 28%, 24%, and 35% to 78%, 68%, and 82%, respectively (Figures 3A,B).
Further affinity cell sorting with the three GPI type sorting tags in the Jurkat T-cell-derived leukemia cells showed that TST-EGFP-GPI BY55, TST-EGFP-GPI DAF, and TST-EGFP-GPI CEAM7 increased the positive cell percentage from 13% to 67%, 77%, and 63%, respectively (Supplementary Figures S1A,B).
In addition, we calculated the positive cell enrichment fold to represent the ability of the six tags in cell sorting. The results showed that the three GPI type sorting tags had higher enrichment fold than the TMD type sorting tags in all the three cell lines, including K-562, Lenti-X 293T, and 22Rv1 cells ( Figure 3C).
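The enrichment fold can be illustrated with the K-562 flow cytometry percentages quoted above. The sketch below assumes the fold is simply the positive-cell percentage after sorting divided by the percentage before sorting; the text does not spell out the formula, and the function name and shortened tag labels are illustrative only.

```python
def enrichment_fold(before_pct: float, after_pct: float) -> float:
    """Assumed definition: positive-cell % after sorting / positive-cell % before."""
    if not 0 < before_pct <= 100 or not 0 <= after_pct <= 100:
        raise ValueError("percentages must lie within (0, 100] and [0, 100]")
    return after_pct / before_pct


# K-562 positive-cell percentages (before, after) as quoted in the text.
# Keys are shorthand labels for the six TST-EGFP sorting tags.
k562_percentages = {
    "GPI-BY55": (15, 86),
    "GPI-DAF": (16, 87),
    "GPI-CEAM7": (16, 88),
    "TM-ITB3": (28, 78),
    "TM-ITA5": (24, 68),
    "TM-ITAV": (35, 82),
}

k562_folds = {
    tag: enrichment_fold(before, after)
    for tag, (before, after) in k562_percentages.items()
}

for tag, fold in k562_folds.items():
    print(f"{tag}: {fold:.2f}-fold")
```

Under this definition, every GPI type tag gives a larger fold than any TMD type tag in K-562, partly because the GPI constructs start from a lower pre-sort percentage.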
During the cell sorting, we noticed that when we put the tube on a magnetic stand to separate the bead/cell complexes, they moved abruptly toward the magnet, which might dislodge the bound positive cells. We therefore tried separating the bead/cell complexes by free settling instead of applying an external magnetic field, and obtained a higher positive cell percentage. In the case of cell sorting with the TST-EGFP-GPI BY55 tag in K-562 cells, the positive cell ratio reached up to 95% (Figure 3D).
In addition, we determined the cell sorting enrichment fold from EGFP expression at the RNA level. RNA was extracted from the transfected cells before or after cell sorting, and EGFP expression was measured using RT-qPCR with the β-actin gene as a control. We found that all six sorting tags could efficiently enrich the transfection-positive cells and lead to a dramatically higher EGFP expression level in the resulting cells. Consistent with the flow cytometry results, the three GPI type sorting tags, TST-EGFP-GPI BY55, TST-EGFP-GPI DAF, and TST-EGFP-GPI CEAM7, had a higher enrichment fold than the three TMD type ones in K-562 (Figure 4A), Lenti-X 293T (Figure 4B), and 22Rv1 cells (Figure 4C). The three GPI type sorting tags worked more effectively and reached an enrichment fold of over ten times in K-562 cells (Figure 4A). In the other suspension cell line, Jurkat, the enrichment folds of the three GPI type sorting tags were also over nine times (Supplementary Figure S1C). Notably, the enrichment fold of the six sorting tags fluctuated considerably in the Lenti-X 293T and 22Rv1 cells (Figures 4B,C), indicating that detaching and resuspending the adherent cells might disturb cell sorting. Given its higher and more stable enrichment fold, we chose the TST-EGFP-GPI BY55 sorting tag for further evaluation and application.

Cell Sorting Helps Gene Overexpression Analysis
We first tested the cell sorting system in a gene overexpression experiment. The coding sequence of TST-EGFP-GPI BY55 was inserted into the pcDNA3.1 vector in place of the neomycin resistance gene to obtain the pcDNA3.1-GPI BY55 sorting vector for gene overexpression (Supplementary Figure S3B). The vector could effectively drive the expression and membrane targeting of the sorting tag (Supplementary Figure S2A). Then we cloned two transcription factor genes, CEBPB and CTCF, from K-562 cDNA into this vector and transfected K-562 cells. The expression level of the target genes in transfected cells before or after affinity sorting was determined by RT-qPCR. We observed that transfection raised the CEBPB expression level to ten times that of the control, and affinity cell sorting raised it further, to up to 58 times, in the sorted cells (Figure 5A). Similarly, transfection with the CTCF expression plasmid increased the CTCF mRNA level to 14 times that of the endogenous gene, and further affinity cell sorting increased the expression level to 27 times (Figure 5B). It indicates that the affinity cell sorting system can greatly help the gene overexpression study.

FIGURE 3 | positive cells in the sorted K-562, Lenti-X 293T, and 22Rv1 cells determined by flow cytometry analysis. Values are from three independent biological replicates. (C) The enrichment fold of positive cells after affinity cell sorting with six sorting tags in K-562, Lenti-X 293T, and 22Rv1. Data from three independent biological replicates. (D) Flow cytometry analysis of K-562 cells transfected with TST-EGFP-GPI BY55 sorting tag plasmids and sorted through free settling strategy. The grey, blue, and orange layers represent negative control cells, sorting tag-transfected cells, and enriched cells. The bar chart represents data from three biological replicates.

Frontiers in Bioengineering and Biotechnology | www.frontiersin.org June 2022 | Volume 10 | Article 930966

Cell Sorting Assists shRNA Gene Knockdown Assay
Furthermore, we investigated the affinity cell sorting system in the shRNA gene knockdown assay. The encoding sequence of TST-EGFP-GPI BY55 was cloned into the pLKO.1 vector in the place of the puromycin resistance gene to construct a plasmid for gene knockdown assay (Supplementary Figure S3C). Fluorescence microscopy showed that the affinity sorting vector could effectively express the EGFP sorting tags and locate them to the cell surface (Supplementary Figure S2B). Then we constructed two shRNA expression plasmids targeting the BCR-ABL fusion gene, a vital marker molecule of

Values represent CEBPB or CTCF expression level normalized to the β-actin reference gene, determined using RT-qPCR experiments. Error bars, means ± SEM of three independent experiments. **p < 0.01, and ****p < 0.0001 as determined by an unpaired, two-tailed Student's t-test.
Studies have shown that reducing BCR-ABL expression can inhibit the proliferation of leukemia cells (Szczylik et al., 1991; Skorski et al., 1993; Liu et al., 2021). Hence, we evaluated the proliferation ability of the shRNA-transfected cells before or after affinity cell sorting using the CCK-8 kit. The results showed that both ABL-shRNA1 and ABL-shRNA2 transfection significantly inhibited cell proliferation (Figure 6B). Notably, a stronger inhibition effect was observed in the cells enriched by affinity cell sorting (Figure 6B). It indicates that the transfection-positive cell affinity sorting system can substantially assist gene knockdown experiments.

Cell Sorting for Genome Editing
The CRISPR/Cas9 system has been widely used in gene function research and the treatment of genetic diseases (Jinek et al., 2012; Cong et al., 2013; Mali et al., 2013). Increasing the editing positive cell ratio is also pivotal in genome editing research. Here, we also evaluated the affinity cell sorting system in the genome editing experiment. The expression unit of the TST-EGFP-GPI BY55 sorting tag was cloned into the high-fidelity eCas9 (Kleinstiver

The cell proliferation analysis of the K-562 cells expressing ABL-shRNA1 or ABL-shRNA2, with or without enrichment by affinity cell sorting, measured using the CCK-8 kit. Data from three biological replicate wells. (C) Bar chart showing genome editing frequency at rs1388941 locus in K-562 cells transfected with CRISPR/eCas9-GPI BY55 vector encoding sgRNA targeting rs1388941 region, with or without enrichment by cell sorting. The values represent the allele-specific unaltered gene percentage from three getPCR experiments. (D) Genome editing in K-562 cells using CRISPR/eCas9-GPI BY55 vector encoding the HOXB13 gene sgRNA. The values represent the unaltered gene percentage from three getPCR experiments. Error bars, means ± SEM. **p < 0.01, ***p < 0.001, and ****p < 0.0001 as determined by an unpaired, two-tailed Student's t-test.
Figure S3D). Fluorescence microscopy analysis showed that cells transfected with the vector could effectively express the EGFP sorting tags and localize them to the cell membrane (Supplementary Figure S2C). Functional and mechanistic studies of disease risk-associated SNPs play pivotal roles in genetic pathology (Huang et al., 2014; Gao et al., 2018; Ma et al., 2021; Ren et al., 2021). Modifying risk SNP sites through genome editing to obtain risk SNP cell models is essential in such studies. Here, we chose rs1388941, a risk SNP site associated with osteonecrosis in children with acute lymphoblastic leukemia (Karol et al., 2015), as an example for the genome editing study and designed a sgRNA targeting the neighboring region. We transfected K-562 cells, which are A/G heterozygous at the SNP site, and performed affinity cell sorting to enrich the transfection-positive cells. After cell sorting, genomic DNA was extracted, and the unedited ratio at both alleles was determined through the getPCR method (Li et al., 2019a). We found that the editing ratio at both alleles was only about 7% before sorting. Remarkably, the editing ratio significantly increased to about 30% and 25% (Figure 6C) in enriched cells.
To better demonstrate the potential of the affinity cell sorting system in genome editing experiments, we further performed genome editing using a sgRNA that targets the HOXB13 gene and has a high editing activity in Lenti-X 293T cells (Li et al., 2019b). Analysis in K-562 cells showed that cell sorting significantly improved the editing efficiency at this sgRNA target from 20% to 79% (Figure 6D). It indicates that the transfection-positive cell sorting system can promote genome editing experiments with hard-to-transfect cells.
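Because getPCR reports the percentage of unaltered alleles, the editing percentages quoted above follow by subtraction from 100%. The sketch below shows that conversion; the function name is illustrative, and the unaltered percentages are back-calculated from the editing percentages in the text (20% and 79% at the HOXB13 target) rather than taken from the raw data.

```python
def editing_efficiency(unaltered_pct: float) -> float:
    """Editing % implied by a getPCR unaltered (wild-type) allele percentage."""
    if not 0 <= unaltered_pct <= 100:
        raise ValueError("unaltered_pct must lie within [0, 100]")
    return 100.0 - unaltered_pct


# Back-calculated illustration values (not raw measurements):
# 20% editing <-> 80% unaltered, 79% editing <-> 21% unaltered.
unaltered_before_sorting = 80.0
unaltered_after_sorting = 21.0

edit_before = editing_efficiency(unaltered_before_sorting)
edit_after = editing_efficiency(unaltered_after_sorting)
improvement = edit_after / edit_before

print(f"editing before sorting: {edit_before:.0f}%")
print(f"editing after sorting:  {edit_after:.0f}%")
print(f"improvement: {improvement:.2f}-fold")
```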

Universal Sorting Plasmid Co-Transfection Permits Efficient Positive-Cell Sorting
Presently, positive cell sorting usually requires cloning the target genes into a given vector that encodes the sorting marker. To overcome this limitation, we investigated the strategy of co-transfecting the target plasmid with a universal plasmid expressing the sorting tag, which allows the sorting system to be easily transferred to any other experiment. Notably, this strategy removes the need to construct a vector for each type of application, and experiments can start directly with existing plasmids.
We co-transfected Lenti-X 293T cells with the TST-EGFP-GPI BY55 expression plasmid and the pmCherry-N1 in equal proportion. Laser confocal microscopy analysis showed that the EGFP and the mCherry displayed an obvious co-positive pattern ( Figure 7A). The positive cells for EGFP and mCherry accounted for 52% and 48%, respectively. Remarkably, EGFP and mCherry double-positive cells accounted for up to 44% (Supplementary Figure S4).
Furthermore, we performed similar co-transfection on K-562 cells and applied affinity cell sorting to enrich the transfection-positive cells. Flow cytometry analysis showed that EGFP and mCherry displayed a prominent co-positive feature again (Figures 7B-D). In the co-transfected K-562 cells, the proportions of EGFP positive cells and mCherry positive cells were 36% and 32%, respectively. Remarkably, the double-positive cells accounted for up to 29% (Figure 7B). In cells enriched by affinity cell sorting, the proportion of EGFP positive, mCherry positive, and double-positive cells reached 93%, 90%, and 86%, respectively (Figures 7B-D). It indicates that co-transfection with a universal sorting plasmid can allow effective affinity cell sorting to enrich the positive cells containing the target plasmid.
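One way to quantify the co-positive pattern is to compare the observed double-positive fraction with the fraction expected if EGFP and mCherry uptake were statistically independent, which is the product of the two marginal fractions. The sketch below applies this check to the percentages quoted above; the independence baseline is our illustrative assumption, not an analysis reported in the text, and the function names are hypothetical.

```python
def expected_double_positive(p_egfp: float, p_mcherry: float) -> float:
    """Double-positive fraction expected if the two markers were independent."""
    return p_egfp * p_mcherry


def co_positivity_ratio(p_egfp: float, p_mcherry: float, p_both: float) -> float:
    """Observed double-positive fraction relative to the independence baseline."""
    return p_both / expected_double_positive(p_egfp, p_mcherry)


# (EGFP+, mCherry+, double-positive) fractions quoted in the text.
populations = {
    "Lenti-X 293T, unsorted": (0.52, 0.48, 0.44),
    "K-562, unsorted": (0.36, 0.32, 0.29),
}

ratios = {name: co_positivity_ratio(*vals) for name, vals in populations.items()}

for name, (p_g, p_r, p_both) in populations.items():
    baseline = expected_double_positive(p_g, p_r)
    print(f"{name}: observed {p_both:.0%}, independent baseline {baseline:.1%}, "
          f"ratio {ratios[name]:.2f}")
```

A ratio well above 1 means that cells taking up one plasmid are much more likely than chance to carry the other, which is what makes sorting on the EGFP tag a useful proxy for the co-transfected plasmid.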
Interestingly, in the co-transfected K-562 cells, the ratio of EGFP single-positive, mCherry single-positive, and EGFP/mCherry double-positive cells was approximately 1:1:6. Assuming that each liposome microdroplet carries multiple plasmid molecules and that plasmids of similar size have an equal chance of entering cells, if three plasmids enter one cell, each being an EGFP or mCherry plasmid with equal probability, one-eighth of the cells receive only EGFP plasmids, one-eighth receive only mCherry plasmids, and the remaining six-eighths receive both, giving a ratio of 1:1:6. Thus, we propose that an average of three plasmids entered each cell under the transfection conditions.
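The 1:1:6 argument can be checked with a small calculation. The sketch below assumes, as the text does, that the two plasmids are mixed in equal proportion and that each of the n plasmids entering a cell is independently an EGFP or an mCherry plasmid with probability 1/2; it additionally simplifies by letting every transfected cell take up exactly n plasmids. The function name is illustrative.

```python
from fractions import Fraction


def population_fractions(n_plasmids: int):
    """Return (EGFP-only, mCherry-only, double-positive) fractions of
    transfected cells when each cell takes up exactly n_plasmids plasmids,
    each independently EGFP or mCherry with probability 1/2."""
    if n_plasmids < 1:
        raise ValueError("n_plasmids must be at least 1")
    single = Fraction(1, 2) ** n_plasmids  # all n plasmids are the same kind
    double = 1 - 2 * single                # at least one plasmid of each kind
    return single, single, double


for n in range(1, 6):
    egfp_only, mcherry_only, both = population_fractions(n)
    # Express the result as the ratio 1 : 1 : x used in the text.
    print(f"n = {n}: 1 : 1 : {both / egfp_only}")
```

In this model the ratio is 1 : 1 : (2^n - 2), so n = 3 is the only whole number of plasmids that reproduces 1:1:6; n = 2 would give 1:1:2 and n = 4 would give 1:1:14.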

Sorting Plasmid Co-Transfection Assists Gene Knockdown and Overexpression Experiments
Next, we co-transfected the pEGFP-GPI BY55 plasmid with the pLKO.1 plasmid expressing ABL shRNA into K-562 cells and performed affinity cell sorting. RT-qPCR analysis showed that ABL-shRNA1 and ABL-shRNA2 downregulated ABL gene expression by 29% and 31%, respectively. Noticeably, in the cells enriched by affinity sorting, the knockdown efficiencies of ABL-shRNA1 and ABL-shRNA2 increased dramatically to 66% and 72%, respectively ( Figure 7E), comparable to the levels of the single-plasmid transfection experiment.
Further, we applied the sorting plasmid co-transfection strategy in gene overexpression experiments. The pEGFP-GPI BY55 plasmids were co-transfected with the pcDNA3.1 vector encoding the CEBPB or CTCF gene into K-562 cells. RT-qPCR analysis showed that the transfection increased the expression levels of CEBPB and CTCF by nine times and five times, respectively. In the sorted cells, the expression levels of CEBPB and CTCF increased dramatically to 46 times and 24 times, respectively (Figure 7F), comparable to those of the single-plasmid transfection experiments. It indicates that the sorting plasmid co-transfection strategy can effectively enrich positive cells through affinity cell sorting and benefit gene knockdown and gene overexpression assays to an extent comparable to the single-plasmid transfection strategy.

DISCUSSION
We developed a gene transfer-positive cell sorting system to help gene studies, especially those in hard-to-transfect cells. The system is based on a fluorescent affinity sorting tag designed by fusing EGFP with an N-terminal TST peptide and a GPI signal module from the BY55 gene. The positive cells expressing the sorting tags on the cell surface can bind Strep-Tactin magnetic beads and hence are enriched effectively. Besides, the EGFP module of the sorting tag enables the positive cells to be evaluated through fluorescence microscopy and flow cytometry. Furthermore, we demonstrated the great potential of the sorting system in a series of gene function studies, including gene overexpression, shRNA knockdown, and genome editing. Our cell sorting system has several advantages compared to the existing magnetic cell sorting methods. Firstly, we use a GPI anchor-linked EGFP molecule as the basis of the sorting tag, which is less likely to disturb cell signaling and gene function than the truncated LNGFR (Matheson et al., 2014) and mouse H-2Kk (Wei et al., 2001) molecules used in previous methods. Secondly, the TST is used as the affinity ligand displayed on the cell surface and enables affinity cell sorting using Strep-Tactin® or Strep-Tactin®XT magnetic beads. The nM or even pM level affinity of the TST for its receptor makes the system more efficient at pulling out the positive cells than the SBP tags (Matheson et al., 2014) and the H-2Kk (Wei et al., 2001) molecules. Thirdly, the EGFP module in our sorting tag allows convenient evaluation of the positive cells through fluorescence microscopy and flow cytometry, which is difficult for the existing MACS methods. Fourthly, the GPI anchoring structure that displays the sorting tag on the cell surface in our system is more efficient than the transmembrane domain used in the existing MACS methods.
Compared with the drug screening methods, our sorting system is more time-saving and efficient because of its better versatility for different cell lines and its ability to enrich transfection-positive cells with a one-step affinity sorting operation. In contrast, drugs usually take several days to kill the transfection-negative cells, and pre-experiments are required to determine the working drug concentration for different cell lines. Meanwhile, our sorting system is expected to cause fewer side effects on cell function because it does not bring about the cell toxicity that the screening drugs do. Unlike the drug screening methods, whose application is restricted to adherent cells, our sorting system is applicable to both adherent and suspension cells.
Compared to the FACS method, our cell sorting system also has several advantages. Firstly, it does not require equipment like the cell sorter, which is expensive and not readily available in most laboratories. Secondly, the throughput of our cell sorting system is easy to expand by using more affinity beads, and sorting can be performed in parallel. By contrast, even with a cutting-edge cell sorter, the sorting speed is still limited, resulting in prolonged sorting time if many cells are needed. Thirdly, our cell sorting procedure is milder and causes less mechanical damage to the cells than the FACS method.
Finally, our gene transfer-positive cell sorting system displayed great potential to enrich gene transfection-positive cells in gene study applications, including gene overexpression, shRNA gene knockdown, and genome editing. In the future, the application can be easily expanded to other gene study fields by simply inserting the expression cassette of TST-EGFP-GPI BY55 into the target vectors. Furthermore, we can further expand the versatility of the cell sorting system by replacing the EGFP with other fluorescent proteins such as mCherry, dsRed, RFP, YFP, and BFP, and by replacing the TST tags with other affinity tags such as CBP (calmodulin-binding peptide), MBP (maltose-binding protein), and His-Tag protein tags. In addition, the sorting system should work well in a wide range of biological systems with GPI anchoring systems, including most eukaryotes and some archaea (Yadav and Khan, 2018; Nakano et al., 2021).
Notably, the co-transfection experiment with the TST-EGFP-GPI BY55 expression plasmid and the pmCherry-N1 plasmid showed a strongly co-positive pattern for the two fluorescent proteins. This property permits gene transfer-positive cell sorting in a simpler but still efficient way by co-transfecting existing vectors with the universal TST-EGFP-GPI BY55 expression plasmid. More interestingly, the co-transfection strategy can be further expanded to other transfection-positive cell sorting systems such as FACS and MACS. The co-transfection strategy avoids the trouble of reconstructing the target gene plasmid on the sorting vector, allows the target plasmid to accommodate larger insertions, and hence reduces time and labor costs. In summary, the gene transfer-positive cell sorting system possesses great potential to boost gene function study in hard-to-transfect cells.

Plasmid Construction and Gene Cloning
The coding sequences of the six membrane location signals were synthesized in the pUC57a vector (GENEWIZ Co., Suzhou, China) and amplified separately through PCR. The EGFP coding sequence was amplified from the pEGFP-C2 plasmid, and the Twin-Strep-Tag (TST) coding sequence containing 30 amino acid residues (WSHPQFEK-GGGSGGGSGGS-SAWSHPQFEK) was obtained by primer self-PCR. Then the above PCR products were mixed at a molar ratio of 1:1:1:1 and subjected to overlapping PCR to obtain the full-length sorting tag coding sequence. The resulting PCR products were digested by FastDigest AgeI and FastDigest BglII (Thermo Fisher, Waltham, MA, United States) and then joined with the FastDigest AgeI/FastDigest BglII linearized pEGFP-C2 vector to obtain the expression vectors for the six affinity sorting tags (Supplementary Figure S3A). Primer sequences are shown in Supplementary Table S3.
For constructing the CEBPB and CTCF overexpression plasmids, the corresponding CDS sequences were amplified from K-562 cDNA using primers listed in Supplementary  (Li et al., 2019b). The annealed oligos (Supplementary Table S6) bearing the gRNA sequence were inserted into BbsI linearized CRISPR/eCas9-GPI BY55 vector using T4 DNA ligase. All plasmids for cell transfection were extracted with an Endo-Free Plasmid Mini Kit I (D6948-02, OMEGA, Guangzhou, China) and purified by ethanol precipitation.

Confocal Laser Scanning Microscopy
The Lenti-X 293T cells were inoculated on a glass slide in a 24-well plate at a density that allowed the cells to reach 70% confluence at transfection. On the next day, 0.8 µg of pEGFP-C2, pEGFP-GPI BY55, pEGFP-GPI DAF, pEGFP-GPI CEAM7, pEGFP-TM ITB3, pEGFP-TM ITA5, or pEGFP-TM ITAV plasmid was transfected using 1×PEI reagent (#408727, Sigma, St. Louis, MO, United States) as described previously (Ma et al., 2021) with a DNA:PEI ratio of 1:1.5. Forty-eight hours post-transfection, cells were washed twice with 1×PBS, fixed with 4% paraformaldehyde at room temperature for 10-15 min in the dark, and then gently washed twice with 1×PBS. Then, the nuclei were counterstained with 10 μg/ml DAPI (4′,6-diamidino-2-phenylindole, C0060, Solarbio, Beijing, China) reagent according to the instructions by 15 min incubation at 37°C, followed by two washes with 1×PBS. The slides were sealed with Antifade Mounting Medium (S2100, Solarbio, Beijing, China) and stored in a wet box in the dark. The cells were observed under an LSM900 Super Resolution Laser Scanning Confocal Microscope (ZEISS, Oberkochen, Baden-Württemberg, Germany).

Cell Transfection and Affinity Cell Sorting
For Lenti-X 293T and 22Rv1, cells were inoculated in 6-well plates and transfected on the next day as described previously (Ma et al., 2021). The cells were subjected to affinity cell sorting 36-48 h post-transfection. For adherent Lenti-X 293T and 22Rv1 grown in 6-well plates, cells were rinsed with 500 µl 1×PBS solution and dispersed into a single-cell suspension through incubation with 500 µl 1× Non-enzymatic Cell Dissociation Solution (C5914, Sigma, St. Louis, MO, United States) for 5-10 min. The cells were collected and washed twice with 1×PBS solution and then resuspended in 250 µl binding buffer (IMDM with 2% FBS). For suspension Jurkat and K-562, cells were collected directly, washed twice with 1×PBS solution, and then resuspended in 250 µl binding buffer.
For each cell sorting reaction, 100 µl BeaverBeads™ Magrose Strep-Tactin (#70808, Beaver, Suzhou, China) were washed twice, resuspended in 250 µl binding buffer, and then mixed gently with the cell suspension. The mixture was placed on a rotating mixer and incubated at 10 rpm for 15 min at room temperature. Then the magnetic beads were separated by placing the tube on a magnetic rack for 2 min. After removing the supernatant, the bead/cell complexes were gently rinsed twice with IMDM medium without FBS. Finally, the gene transfection-positive cells captured on the magnetic beads were released in 300 μl complete medium by rotating at 15 rpm for 5 min. The magnetic beads were separated on the magnetic rack to collect the supernatant containing transfection-positive cells.
For the cell sorting strategy separating the beads/cells complex by free settling, 1.5 ml D-PBS (E607009, Sangon Biotech, Shanghai, China) solution containing 0.1% BSA (A600332, Sangon Biotech, Shanghai, China) was used as the binding buffer and washing buffer to prepare cell suspension and beads. After incubation on a rotating mixer, the beads/cells complexes were separated from the unbound cells by free settling for 1 min, utilizing their density difference. The bead/cell complexes were resuspended in 1.5 ml D-PBS solution containing 0.1% BSA, and the supernatant was removed after 1 min free settling. The transfection-positive cells captured on the beads were collected in a D-PBS solution containing 0.1% BSA or a specific cell medium dependent on the downstream applications.

Flow Cytometry
Cells from the affinity cell sorting experiment were directly subjected to flow cytometry analysis on a NovoCyte (ACEA Biosciences, San Diego, California, United States) or Guava easyCyte (Luminex, Austin, Texas, United States). The default detector gain was used for FSC and SSC, while the detector gain of FITC was adjusted to place the negative control cell peak between 1×10² and 1×10³. For each analysis, 10,000 events were acquired at a low flow rate. In multicolor fluorescence analysis of EGFP and DsRed transfected cells, fluorescence compensation was performed using compensation control cells that were transfected with the EGFP and DsRed plasmids separately.
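The positive cell percentage and the enrichment fold reported for the sorting experiments can be illustrated with a minimal sketch. It assumes that positive events are those with a FITC signal above a single threshold gate and that the enrichment fold is the ratio of the positive percentage after sorting to that before sorting; the function names, the gate value, and the intensity values are hypothetical and are not taken from the instrument software.

```python
def percent_positive(fitc_values, threshold):
    """Percentage of acquired events whose FITC signal exceeds the gate."""
    if not fitc_values:
        raise ValueError("no events acquired")
    positive = sum(1 for value in fitc_values if value > threshold)
    return 100.0 * positive / len(fitc_values)


def enrichment_fold(pct_before, pct_after):
    """Assumed definition: positive % after sorting / positive % before."""
    if pct_before <= 0:
        raise ValueError("pre-sort positive percentage must be > 0")
    return pct_after / pct_before


# Hypothetical FITC intensities (a real run would acquire 10,000 events).
unsorted_events = [50, 120, 300, 5000, 80, 20000, 150, 90, 400, 60]
sorted_events = [8000, 12000, 300, 5000, 9000, 20000, 15000, 7000, 4000, 6000]

# Assumed gate placed just above the negative control peak (10^2 to 10^3).
GATE = 1_000

before = percent_positive(unsorted_events, GATE)
after = percent_positive(sorted_events, GATE)
fold = enrichment_fold(before, after)
print(f"before: {before:.1f}%  after: {after:.1f}%  fold: {fold:.1f}")
```

In practice the gate would be drawn from the untransfected negative control rather than fixed in advance.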

RNA Extraction and Quantitative RT-PCR
RNA samples were prepared using the GeneJET RNA Purification Kit (K0732, Thermo Scientific, Waltham, MA, United States) according to the product user guide. Residual genomic DNA was removed with the RapidOut DNA Removal Kit (K2981, Thermo Scientific, Waltham, MA, United States) according to the user guide. RNA was reverse transcribed into cDNA with the High-capacity cDNA Reverse Transcription Kit (4368813, Applied Biosystems, Waltham, MA, United States) using the supplied random primers, following the product manual. The cDNA products were used directly for subsequent experiments or stored in a −80°C freezer.
The mRNA expression levels of EGFP, ABL, CEBPB, or CTCF in the transfected cells before or after cell sorting were determined by qPCR using Taq388 mix (Du et al., 2022) on a QIAGEN Q-Rex machine as previously described (Ma et al., 2021) with primers listed in Supplementary Table S6. Each pair of PCR primers was tested, and primers with good specificity and amplification efficiency were selected for quantitative PCR analysis. The endogenous ACTB (β-actin) gene expression was used for normalization. The enrichment fold of the EGFP mRNA after cell sorting was calculated to characterize the cell sorting efficiency of each sorting tag. In the gene overexpression and knockdown experiments, the empty vector was used as the control in the transfection.
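The normalization to ACTB and the EGFP mRNA enrichment fold can be sketched as follows. This is a minimal illustration that assumes the common 2^(−ΔΔCt) calculation with approximately 100% amplification efficiency for both primer pairs; the text above does not state the exact formula used, and the Ct values below are hypothetical.

```python
def relative_expression(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Fold change of a target gene versus a control sample (2^-ddCt).

    ct_target / ct_ref:           target and reference (e.g. ACTB) Ct, test sample
    ct_target_ctrl / ct_ref_ctrl: the same two Ct values in the control sample
    Assumes roughly 100% amplification efficiency for both primer pairs.
    """
    delta_sample = ct_target - ct_ref              # normalize test sample
    delta_control = ct_target_ctrl - ct_ref_ctrl   # normalize control sample
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values: EGFP and ACTB in sorted versus unsorted cells.
egfp_enrichment = relative_expression(
    ct_target=20.0, ct_ref=18.0,            # sorted cells
    ct_target_ctrl=22.5, ct_ref_ctrl=18.0,  # unsorted cells
)
print(f"EGFP mRNA enrichment fold: {egfp_enrichment:.2f}")
```

The same function applies to the overexpression and knockdown comparisons by treating the empty-vector transfection as the control sample.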

getPCR
Genomic DNA was prepared from transfected cells 48 h post-transfection, with or without cell sorting, using the TIANamp genomic DNA kit (#DP304-03, Tiangen Biological Technology, Beijing, China). The genome-editing efficiency was evaluated using the getPCR method (Li et al., 2019a) with primers listed in Supplementary Table S6 that had been evaluated for amplification efficiency and specificity. The qPCR was performed using the Taq 388 mix (Du et al., 2022) on a QIAGEN Q-Rex machine with the following program: initial denaturation at 95°C for 5 min, then 40 cycles of 95°C for 30 s, 67°C for 30 s, and 72°C for 15 s with fluorescence acquisition, followed by a final melting curve step increasing from 65°C to 95°C.
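As a rough illustration of how an editing percentage can be derived from such qPCR data, the sketch below assumes that one amplicon amplifies only the unedited (wild-type) allele, that a second control amplicon is unaffected by editing, and that the remaining wild-type fraction is estimated against an unedited control sample by 2^(−ΔΔCt). These assumptions, the function name, and the Ct values are illustrative only; the exact calculation is defined in the getPCR reference cited above.

```python
def editing_percentage(ct_wt_edited, ct_ctrl_edited, ct_wt_unedited, ct_ctrl_unedited):
    """Estimate the edited-allele percentage from wild-type-specific qPCR.

    ct_wt_*:   Ct of the amplicon assumed to amplify only unedited alleles
    ct_ctrl_*: Ct of a control amplicon assumed to be unaffected by editing
    *_edited / *_unedited: edited sample versus unedited control sample
    """
    ddct = (ct_wt_edited - ct_ctrl_edited) - (ct_wt_unedited - ct_ctrl_unedited)
    wild_type_fraction = 2.0 ** -ddct
    # Clamp to [0, 1] so measurement noise cannot yield a negative percentage.
    wild_type_fraction = min(max(wild_type_fraction, 0.0), 1.0)
    return 100.0 * (1.0 - wild_type_fraction)


# Hypothetical Ct values for an edited sample and an unedited control.
editing = editing_percentage(
    ct_wt_edited=27.0, ct_ctrl_edited=24.0,
    ct_wt_unedited=25.0, ct_ctrl_unedited=24.0,
)
print(f"estimated editing: {editing:.1f}%")
```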

Cell Proliferation Assays
K-562 cells transfected with ABL shRNA, with or without enrichment through affinity cell sorting, were dispersed into a single-cell suspension and inoculated in a 96-well plate at a density of 1,000 cells per well. K-562 cells transfected with the pLKO.1-GPI BY55 blank plasmid were also inoculated as the control. At 0, 24, 48, 72, and 96 h post-seeding, 1/10 volume of CCK-8 reagent (Cell Counting Kit-8, MA0218, Meilun, Dalian, China) was added to the cells, and the OD at 450 nm was measured on a microplate reader after a 3-h incubation, with 600 nm as the reference wavelength. Values were obtained from three independent replicate wells, and statistical significance was calculated using a two-tailed Student's t-test.
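The comparison of replicate wells at one time point can be sketched as below. The sketch assumes that the 600 nm reference reading is subtracted from the 450 nm reading for each well and that the t-test is the equal-variance (pooled) form; with three wells per group there are 4 degrees of freedom, for which the two-tailed critical value at α = 0.05 is 2.776. All OD values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def corrected_od(od450, od600):
    """Subtract the reference-wavelength reading from each well (assumed)."""
    return [a - b for a, b in zip(od450, od600)]


def student_t(group_a, group_b):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    t_stat = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t_stat, df


# Two-tailed critical t value at alpha = 0.05 for 4 degrees of freedom.
T_CRIT_DF4 = 2.776

# Hypothetical readings from three replicate wells at one time point.
control = corrected_od([1.25, 1.30, 1.20], [0.05, 0.05, 0.05])
knockdown = corrected_od([0.85, 0.90, 0.80], [0.05, 0.05, 0.05])

t_stat, df = student_t(control, knockdown)
significant = df == 4 and abs(t_stat) > T_CRIT_DF4
print(f"t = {t_stat:.2f}, df = {df}, significant at 0.05: {significant}")
```

An exact p-value would require the t-distribution, for example from a statistics package; the critical-value comparison is used here only to keep the sketch dependency-free.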

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.

AUTHOR CONTRIBUTIONS
LY: Data curation, formal analysis, investigation, methodology, visualization, writing-original draft. LC, SM and QZ: Investigation, validation, visualization, writing-review and editing. QH: Conceptualization, methodology, project administration, supervision, funding acquisition, visualization, writing-review and editing. All authors have read and agreed to the published version of the manuscript.