ORIGINAL RESEARCH article

Front. Plant Sci., 20 November 2018

Sec. Computational Genomics

Volume 9 - 2018 | https://doi.org/10.3389/fpls.2018.01685

Identification and Analysis of Rice Yield-Related Candidate Genes by Walking on the Functional Network

  • 1. School of Aerospace Engineering, Xiamen University, Xiamen, China

  • 2. School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China

  • 3. School of Information Science and Engineering, Xiamen University, Xiamen, China

Abstract

Rice (Oryza sativa L.) is one of the most important staple foods in the world. Candidate genes associated with rice yield can be identified by applying a random walk with restart model to a functional similarity network. A five-fold cross-validation experiment demonstrated the high performance of this approach and the robustness of the restart parameter r. We also assessed the strength of associations between known seeds and candidate genes based on the resulting scores. The candidates ranking at the top of the results list were considered to be the most relevant rice yield-related genes. This study provides a valuable alternative for rice breeding and biology research. The relevant dataset and script can be downloaded from http://lab.malab.cn/jj/rice.htm.

Introduction

Rice (Oryza sativa L.) is one of the most important food crops worldwide and is the main food source for more than half of the global population (Mahender et al., 2016; Li et al., 2017). In the developing world, rice provides 27% of dietary energy and 20% of dietary protein (Huang et al., 2013). However, despite genetic improvements in grain yield delivered by the exploitation of semi-dwarfism and heterosis over the past 50 years, a substantial increase in grain productivity of the major crops is still required to feed a growing world population (Abe et al., 2018). The prime breeding target is to increase both grain size and grain number, because they affect both yield potential and end-use quality (Okada et al., 2018). However, the simultaneous improvement of grain quality and grain yield is a major challenge because of the well-established negative correlation between these two traits, which is controlled by quantitative trait loci and influenced by environmental changes. Additionally, which genes in quantitative trait loci regulate grain size and number has not been clarified (Borzee et al., 2018; Li et al., 2018). Therefore, the identification of genetic variants associated with improvements in grain yield would facilitate the breeding of new high-yielding rice varieties and may also be applicable to other crops (You et al., 2017).

Vast numbers of genetic variants have been detected by traditional genome-wide association studies and recent sequencing studies, and connecting the functional implications of these results to known genes has become a standard task (Li et al., 2015; Dehury et al., 2017; Torres and Henry, 2018; Wu et al., 2018). We previously developed a database, RicyerDB, to collect all known rice yield-related genes by integrating multiple omics data, information from the literature, and associated databases (Jiang et al., 2018). This work also established a search tool to query a particular gene, and to provide insights into gene functions and locations. Any rice yield-related gene can therefore be easily queried and the findings downloaded through the webpage, while candidate genes can be screened and prioritized to identify those most likely to be associated with known genes.

To achieve this goal, several approaches have been proposed from the perspective of computational systems biology (Behroozi-Khazaei and Nasirahmadi, 2017; He et al., 2017; Liu E. et al., 2017; Liu Y. et al., 2017; Xiong et al., 2017; Maione and Barbosa, 2018; Zhang M. et al., 2018; Zhou et al., 2018). For example, the Endeavour tool uses the guilt-by-association principle to rank candidate genes according to their functional similarities to a set of predefined seed genes (Aerts et al., 2006; Tranchevent et al., 2008, 2016). In recent years, protein–protein interaction (PPI) networks have been used to achieve a global inference across all genes (Liu et al., 2010; Lee, 2011; Rezadoost et al., 2016; Wang et al., 2016; Zeng et al., 2016; Luo and Liu, 2017; Holland and Johnson, 2018; Vlaic et al., 2018). PPI networks have also been used to provide a simplified yet systematic measure of functional similarities between genes (Chen et al., 2017a, 2018a).

Some methods for identifying yield-related genes have linked profile and sequence technology to facilitate the prediction of related genes. For example, Odilbekov et al. (2018) used machine learning and integrated this analysis with data obtained from spectroradiometer, infrared thermometer, and chlorophyll fluorescence measurements to identify the most predictive proxy measurements for studying Septoria tritici blotch disease of wheat.

Hybrid breeding is an effective tool to improve yield in rice, although parental selection remains a difficult issue. Xu et al. (2018) compared six genomic selection methods, including the least absolute shrinkage and selection operator and the support vector machine, to evaluate their predictabilities, and demonstrated their use in predicting the hybrid performance of rice. Although these studies have achieved good results, microarray and sequencing techniques remain expensive.

The main aim of this research was to use current knowledge to identify rice yield-related genes with network prediction methods. We proposed a computational systems biology approach for the identification of candidate genes via a random walk model on a PPI network with functional similarities (Kohler et al., 2008). Starting from known nodes, our method simulates the process in which a random walker either travels to a neighboring node or restarts from the known nodes, scores each gene using the probability that the walker is at that gene at steady state, and then ranks candidate genes according to their scores. Using a series of cross-validation experiments, we systematically demonstrated the robustness of our method, and applied our approach to predict a landscape of associations between known genes and candidates.

Materials and Methods

Flowchart Overview

We modeled the problem of identifying candidate genes associated with a set of known genes as a prioritization problem, and solved it using a three-step approach. As shown in Figure 1, taking the set of known genes as input, we first standardized the gene names between STRING (Szklarczyk et al., 2015) and RicyerDB (Jiang et al., 2018). Then, we constructed a protein–protein network whose edges are scored by functional similarity. Next, we applied a random walk with restart (RWR) algorithm to the network to calculate a score for each candidate gene, and ranked the candidates to obtain a ranking list as the output (Chen et al., 2012a,b; Chen, 2016; Chen X. et al., 2016; Li et al., 2016; Peng et al., 2016; Zhu et al., 2018). Finally, the top candidate genes were verified according to their functions and the published literature.

FIGURE 1

Construction of the Functional Similarity Network

The functional similarity network is described as a graph G = (V, E), where V represents the nodes of the network and E represents the edges. The background network was taken from the STRING database, which records known and potential interactions among proteins. The known rice yield-related genes were identified from our previous work with RicyerDB (Jiang et al., 2018). To standardize gene names between STRING and RicyerDB, genes were retrieved by reference to National Center for Biotechnology Information gene names. Functional similarities among genes in the background network were captured by scoring the edges E according to Gene Ontology (GO) annotations. Using the latest release of the GO database (Ashburner et al., 2000; Chen L. et al., 2016; Raza, 2016; The Gene Ontology, 2017), each edge was assigned the shared functional significance score of the two GO-annotated genes it connects.

The shared functional significance score F(i,j) between genes i and j was measured by the Weighted Shared Functions approach, which treats a gene's functions as a set of functional categories in GO. Functions shared by a small number of genes are taken to be far more significant than those shared by a large number of genes. Each function has its own significance, defined as the inverse of the number of genes sharing the function. When two genes, i and j, have m functions in common, i.e., F(i)∩F(j) = {f1, f2, …, fm}, F(i,j) is the sum of the significances of the functions shared between them:

\[ F(i,j) = \sum_{n=1}^{m} \mathrm{sig}(f_n), \qquad \mathrm{sig}(f_n) = \frac{1}{|\mathrm{Genes}(f_n)|} \]

Here sig(fn) denotes the significance of a function fn (n = 1, 2, ..., m) shared between genes i and j, and |Genes(fn)| is the number of genes sharing the function fn. We calculated the ranking score, p, for each gene in the functional similarity network and ranked the genes in descending order of p.
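The Weighted Shared Functions score can be illustrated with a minimal Python sketch. The gene names, GO identifiers, and function names (`function_significance`, `shared_function_score`) are hypothetical and chosen only for illustration; this is not the script distributed with the article.

```python
from collections import defaultdict

def function_significance(annotations):
    """Return sig(f) = 1 / |Genes(f)| for every function f in the annotations."""
    genes_per_function = defaultdict(set)
    for gene, functions in annotations.items():
        for f in functions:
            genes_per_function[f].add(gene)
    return {f: 1.0 / len(genes) for f, genes in genes_per_function.items()}

def shared_function_score(gene_i, gene_j, annotations, significance):
    """Return F(i, j): the summed significance of functions shared by both genes."""
    shared = annotations[gene_i] & annotations[gene_j]
    return sum(significance[f] for f in shared)

# Hypothetical toy annotations (gene -> set of GO terms); not real rice data.
annotations = {
    "geneA": {"GO:1", "GO:2"},
    "geneB": {"GO:1", "GO:2", "GO:3"},
    "geneC": {"GO:1"},
    "geneD": {"GO:3"},
}
sig = function_significance(annotations)
score_ab = shared_function_score("geneA", "geneB", annotations, sig)  # 1/3 + 1/2
score_ac = shared_function_score("geneA", "geneC", annotations, sig)  # 1/3
```

In this toy case, geneA and geneB share a rarer function (GO:2, annotated to two genes) in addition to a common one, so their edge scores higher than the edge between geneA and geneC.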

Random Walking on the Functional Similarity Network

We achieved the goal of identifying candidates related to known seeds by calculating a score for each candidate and then ranking the candidates to obtain a ranking list. The higher the rank, the more likely the gene was to be related to the given source nodes. For this purpose, we adapted the RWR method in the functional similarity network.

At the beginning, the walker chooses the seeds as the starting point. In each step of the walking process, the walker may start on a new journey with probability r or move on with probability 1−r. When moving on, the walker may move at random to one of its direct neighbors.

In our application, the initial probability vector P0 was constructed such that equal probabilities were assigned to the nodes representing the known seed genes, with the sum of the probabilities equal to 1. This is equivalent to letting the random walker begin from each of the known seed genes with equal probability. The transition matrix W is the column-normalized adjacency matrix of the graph, and Pt is a vector in which the ith element holds the probability of being at node i at time step t. Formally, the RWR is defined as:

\[ P_{t+1} = (1 - r)\, W P_t + r P_0 \qquad (1) \]

Candidate genes were ranked according to the values in the steady-state probability vector P. The vector Pt changes with time step t, and the steady-state vector can be obtained by iterating Equation (1) until convergence. The iteration stops when the change between Pt and Pt+1 falls below 10^−10. In this paper, we set the default value of the parameter to r = 0.3 (see Results section for details).
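The walk described above (restart at the seeds with probability r, otherwise move to a neighbor) can be sketched in plain Python. This is a minimal sketch under stated assumptions: the change between successive vectors is measured with the L1 norm (the text does not name the norm), the toy graph and the names `column_normalize` and `random_walk_with_restart` are hypothetical, and edge weights in practice would be the functional similarity scores.

```python
def column_normalize(adjacency):
    """Divide each column by its sum so that every non-empty column sums to 1."""
    n = len(adjacency)
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    return [[adjacency[i][j] / col_sums[j] if col_sums[j] else 0.0
             for j in range(n)] for i in range(n)]

def random_walk_with_restart(adjacency, seeds, r=0.3, tol=1e-10, max_iter=10000):
    """Iterate P_{t+1} = (1 - r) W P_t + r P_0 until the L1 change is below tol."""
    n = len(adjacency)
    W = column_normalize(adjacency)
    # Equal initial probability on every seed node, summing to 1.
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        p_next = [(1 - r) * sum(W[i][j] * p[j] for j in range(n)) + r * p0[i]
                  for i in range(n)]
        change = sum(abs(a - b) for a, b in zip(p_next, p))  # assumed L1 norm
        p = p_next
        if change < tol:
            break
    return p

# Hypothetical toy network: a path 0 - 1 - 2 - 3 with node 0 as the only seed.
toy_adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(toy_adjacency, seeds={0}, r=0.3)
# Rank the non-seed nodes (candidates) in descending order of score.
candidate_ranking = sorted((i for i in range(4) if i != 0),
                           key=lambda i: scores[i], reverse=True)
```

On this toy path the steady-state score decays with distance from the seed, so the candidates are ranked 1, 2, 3.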

Validation Method

We adopted a five-fold cross-validation experiment to assess the capability of RWR to recover held-out seeds. All seed genes were divided as equally as possible into five parts; one part was then removed as a test set and added to the candidate genes. All candidate genes were ranked by RWR to determine the rankings of the test genes. This procedure was repeated until every part had been used as the test set.

In the context of the functional similarity network, the above validation procedure was equivalent to moving one part of the seed genes into the candidate genes and determining whether these held-out seeds received a high rank. The r parameter of RWR was varied over the range [0, 1] and the rankings of the five parts were determined for each value. Receiver operating characteristic (ROC) curves were plotted, and area under the ROC curve (AUC) values were used to evaluate the performance at each r.
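The validation procedure can be sketched as follows. This is an illustrative sketch only: the fold assignment by shuffling, the fixed random seed, and the function names `five_fold_splits` and `auc_from_scores` are assumptions, and the AUC is computed through its pairwise-comparison form (the fraction of held-out seed versus non-seed pairs in which the seed scores higher, with ties counted as one half) rather than by plotting a curve.

```python
import random

def five_fold_splits(seeds, k=5, rng_seed=0):
    """Shuffle the seed genes and deal them into k parts of near-equal size."""
    shuffled = list(seeds)
    random.Random(rng_seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def auc_from_scores(scores, held_out):
    """AUC: probability that a held-out seed outscores a non-seed candidate."""
    pos = [s for g, s in scores.items() if g in held_out]
    neg = [s for g, s in scores.items() if g not in held_out]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical seed identifiers; 136 matches the number of seeds in the text.
folds = five_fold_splits(["seed%d" % i for i in range(136)])
fold_sizes = [len(f) for f in folds]

# Hypothetical RWR scores for four candidates, two of which are held-out seeds.
toy_scores = {"a": 0.9, "b": 0.8, "c": 0.3, "d": 0.1}
toy_auc = auc_from_scores(toy_scores, held_out={"a", "c"})
```

In a full run, each fold in turn would be removed from the seed set, RWR would be rerun with the remaining seeds, and `auc_from_scores` would be applied to the resulting candidate scores.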

Results

Data Sources

We obtained the rice background protein–protein network from the STRING database. In the network, protein associations were derived either directly from physical interactions or from functional links supported by experimental evidence and computational methods (Jensen et al., 2009). The network consists of 6,561 nodes and 567,034 edges, which represent proteins and the interactions between them, respectively. In our study, 136 known genes were selected as seed genes and the other genes as candidate genes. We downloaded O. sativa Japonica protein network data from STRING version 10.5 (Szklarczyk et al., 2015).

Proteins with accurate functional annotations are vital to biological research. We obtained functional annotation information from the GO Consortium (Ashburner et al., 2000), and downloaded GO annotations of O. sativa from the most recent GO version. GO enrichment analysis is used to interpret high-throughput molecular data. GO annotation is the list of all annotated genes linked to ontological terms describing those genes.

The RicyerDB database integrates publicly available resources to construct a public platform for browsing and the interactive visualization of yield-related genes. The first release of RicyerDB contained more than 400 manually curated gene information entries which were all associated with rice yield.

Performance of the Proposed Method

The score vector P (the steady-state probability of the walker being at each node) was calculated for all genes in the network at each value of the parameter r. Candidate genes were then ranked in descending order of P score.

To find the optimal parameter, genes were ranked according to the calculated P scores at nine different r-values (r = 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, and 0.9). The numbers of matched seed genes in each of the five parts were used to assess the effectiveness of RWR. Figure 2 shows all five cases: the number of matched seeds among the top 500 candidates (measured at cutoffs of every 100) in the ranking list for r = 0.3 was higher than for the other r-values in most cases.

FIGURE 2

The numbers of matched seed nodes were summed over all ranking results, and r = 0.3 gave the maximum number of matches overall. The parameter r = 0.3 was therefore selected to calculate vector P and obtain the ranking results. To further assess the robustness of parameter r, we repeated the five-fold cross-validation 100 times and applied statistical analysis to compare the rankings of all seeds at different r-values in our model; the results are shown in Figure 3.
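The parameter selection described here, counting held-out seeds that appear above each rank cutoff and summing the counts per r-value, can be sketched as below. The rankings, gene labels, parameter labels, and function names are hypothetical toy values, not results from the study, and small cutoffs are used in place of 100 to 500 so the example stays readable.

```python
def matched_at_cutoffs(ranked_genes, held_out_seeds,
                       cutoffs=(100, 200, 300, 400, 500)):
    """Count held-out seeds appearing within the top k of the ranking, per cutoff k."""
    held_out = set(held_out_seeds)
    return {k: sum(1 for g in ranked_genes[:k] if g in held_out) for k in cutoffs}

def select_r(rankings_by_r, held_out_seeds, cutoffs):
    """Pick the parameter value whose ranking has the largest summed match count."""
    totals = {r: sum(matched_at_cutoffs(ranking, held_out_seeds, cutoffs).values())
              for r, ranking in rankings_by_r.items()}
    best = max(totals, key=totals.get)
    return best, totals

# Hypothetical toy rankings for two parameter settings ("s*" are held-out seeds).
held_out = {"s1", "s2"}
rankings_by_r = {
    "r_low": ["c1", "s1", "c2", "c3", "s2", "c4"],
    "r_high": ["c1", "c2", "c3", "s1", "c4", "s2"],
}
toy_cutoffs = (2, 4, 6)
counts_low = matched_at_cutoffs(rankings_by_r["r_low"], held_out, toy_cutoffs)
best_label, totals = select_r(rankings_by_r, held_out, toy_cutoffs)
```

Here the setting labeled `r_low` places a held-out seed within the top two, so its summed count (4) exceeds that of `r_high` (3) and it is selected.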

FIGURE 3

Prioritization of Candidate Genes and Validation by Literature Review

In the functional similarity network, all candidate genes were prioritized by RWR according to the vector P at steady state. We manually searched PubMed for the top 100 candidate genes (Table 1) and their association with yield. This verified eight candidate genes associated with rice production. LOC_Os11g40150 (rank 39), also known as OsRad51A1, is a key component of homologous recombination in DNA repair. Its direct interaction with OsNAC14 recruits factors involved in DNA damage repair and defense response, resulting in improved tolerance to drought (Shim et al., 2018). LOC_Os04g37619 (rank 11), named ZEP, is one of the key genes involved in biosynthesis of the hormone abscisic acid (ABA) in rice. Ion beam irradiation can enhance the expression of genes involved in ABA biosynthesis, increasing the content of endogenous ABA in rice (Chen et al., 2014).

Table 1

Ranking | Gene name | P score | PubMed ID
1 | LOC_Os06g09390 | 0.001487617 | PMID: 20713616, PMID: 27555860
2 | LOC_Os06g50480 | 0.001475286 |
3 | LOC_Os02g02480 | 0.00146746 |
4 | LOC_Os08g42470 | 0.001461294 |
5 | LOC_Os01g03340 | 0.000941415 |
6 | LOC_Os01g03390 | 0.00080268 | PMID: 12972663
7 | LOC_Os01g04040 | 0.00080268 |
8 | LOC_Os01g04050 | 0.00080268 |
9 | LOC_Os07g02350 | 0.000775571 | PMID: 16240106, PMID: 11416158
10 | LOC_Os08g02640 | 0.000669873 |
11 | LOC_Os04g37619 | 0.000640376 | PMID: 24634194
12 | LOC_Os11g35500 | 0.00062345 | PMID: 29813124, PMID: 29402905
13 | LOC_Os05g41970 | 0.000594578 | PMID: 1731968
14 | LOC_Os12g16890 | 0.000594578 |
15 | LOC_Os01g03680 | 0.000584668 |
16 | LOC_Os07g10580 | 0.000564849 | PMID: 28158863, PMID: 22108719
17 | LOC_Os06g50340 | 0.000561268 | PMID: 19704753, PMID: 16511358
18 | LOC_Os10g14150 | 0.000555163 | PMID: 19201764
19 | LOC_Os01g55540 | 0.000551598 | PMID: 15753104
20 | LOC_Os10g22860 | 0.00054974 | PMID: 23384860, PMID: 28101092
21 | LOC_Os10g32990 | 0.000547737 | PMID: 23384860, PMID: 28101092
22 | LOC_Osm1g00450 | 0.000540982 |
23 | LOC_Os01g60670 | 0.000536737 |
24 | LOC_Os07g11410 | 0.00053512 |
25 | LOC_Os01g13800 | 0.000533159 |
26 | LOC_Os02g13780 | 0.000533159 |
27 | LOC_Os10g06760 | 0.000533159 | PMID: 23384860, PMID: 28101092
28 | LOC_Os10g13970 | 0.000533159 | PMID: 23384860, PMID: 28101092
29 | LOC_Os10g19160 | 0.000533159 | PMID: 23384860, PMID: 28101092
30 | LOC_Os02g57530 | 0.000532385 | PMID: 14754915
31 | LOC_Os10g21810 | 0.000529529 |
32 | LOC_Os01g47730 | 0.000507068 |
33 | LOC_Os07g11920 | 0.000505391 | PMID: 28158863, PMID: 22108719
34 | LOC_Os01g07870 | 0.00049357 |
35 | LOC_Os03g54790 | 0.000492652 |
36 | LOC_Os01g18670 | 0.000492651 |
37 | LOC_Os07g42300 | 0.000483507 | PMID: 24466124
38 | LOC_Os11g10100 | 0.000478643 |
39 | LOC_Os11g40150 | 0.000478361 | PMID: 28071676
40 | LOC_Os12g31370 | 0.000478361 | PMID: 28071676
41 | LOC_Os03g05740 | 0.000472443 |
42 | LOC_Os08g38720 | 0.000468006 |
43 | LOC_Os03g50330 | 0.000462237 |
44 | LOC_Os04g08740 | 0.000461766 | PMID: 19417056
45 | LOC_Os01g42650 | 0.000461755 | PMID: 16263700
46 | LOC_Os03g27290 | 0.000460621 | PMID: 19217306, PMID: 15672456
47 | LOC_Os10g39670 | 0.000460227 |
48 | LOC_Os01g65230 | 0.000459159 |
49 | LOC_Os03g54780 | 0.000456546 |
50 | LOC_Os08g03640 | 0.000456163 |
51 | LOC_Os01g14830 | 0.000454589 |
52 | LOC_Os01g10820 | 0.000453601 |
53 | LOC_Os10g42110 | 0.000449388 |
54 | LOC_Os03g26860 | 0.000448345 |
55 | LOC_Os07g41750 | 0.000448221 |
56 | LOC_Os03g17580 | 0.000448145 |
57 | LOC_Os10g42940 | 0.000447386 | PMID: 24715026, PMID: 10873582
58 | LOC_Os03g03570 | 0.000446501 | PMID: 10364408
59 | LOC_Os12g43550 | 0.000445728 |
60 | LOC_Os03g49500 | 0.000444206 | PMID: 29767552
61 | LOC_Os10g04674 | 0.000442469 | PMID: 24145853, PMID: 17986178
62 | LOC_Os10g06740 | 0.000442469 | PMID: 28154240
63 | LOC_Os01g05980 | 0.000442411 |
64 | LOC_Os10g33650 | 0.000440094 |
65 | LOC_Os01g18150 | 0.000438562 |
66 | LOC_Os01g22490 | 0.000436139 |
67 | LOC_Os02g18550 | 0.000436139 |
91 | LOC_Os05g50930 | 0.000408491 |
92 | LOC_Os10g39440 | 0.000408336 | PMID: 24372780, PMID: 18335199
93 | LOC_Os08g06630 | 0.000407594 |
94 | LOC_Osp1g00820 | 0.000407028 | PMID: 25658309
95 | LOC_Osp1g01050 | 0.000407028 | PMID: 25658309
96 | LOC_Osp1g00420 | 0.00040642 | PMID: 25658309
97 | LOC_Os05g49320 | 0.000404017 |
98 | LOC_Os12g07720 | 0.000400566 | PMID: 14756303
99 | LOC_Os10g06930 | 0.000399998 | PMID: 29356995
100 | LOC_Os03g06410 | 0.000399411 | PMID: 1731968

The top 100 candidate genes in the ranking list.

Taken together, of the top 100 candidate genes in the ranking list, 46 candidate genes predicted by our method had been confirmed to be correlated with rice yield in the PubMed literature (Table 1). Top-ranked candidates were found to have a high confirmation rate in terms of their association with rice yield, especially the top 20 candidates (Table 2).

Table 2

Top n | Confirmation Number | Confirmation Rate
20 | 11 | 55%
30 | 16 | 53.33%
40 | 20 | 50%
60 | 26 | 43.33%
70 | 30 | 42.86%
80 | 33 | 41.25%
100 | 46 | 46%

The confirmation rate of top 100 candidate genes in the ranking list.

The confirmation rate was calculated by dividing the confirmation number by the corresponding number of top n. It represented the effectiveness of the confirmation.
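As a worked check of this calculation, the short Python sketch below recomputes the confirmation rates from the confirmation numbers listed in Table 2. The variable names are illustrative; the counts are taken directly from the table.

```python
# Confirmation numbers from Table 2: top n -> number of confirmed candidates.
confirmed = {20: 11, 30: 16, 40: 20, 60: 26, 70: 30, 80: 33, 100: 46}

# Confirmation rate (%) = confirmation number / n * 100, rounded to two decimals.
rates = {n: round(100.0 * count / n, 2) for n, count in confirmed.items()}
```

The recomputed values agree with the rates reported in Table 2, for example 11 / 20 = 55% for the top 20 and 30 / 70 = 42.86% for the top 70.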

We conducted GO analysis to assess the functional enrichment of the top 100 candidate genes (Figure 4). The GO term annotated to the most candidates was GO:0005524 (ATP binding), which refers to a binding motif within the primary structure of an ATP binding protein. A recently identified rice ATP binding cassette plays multiple roles in plant growth, development and environmental stress responses (Zhang X.D. et al., 2018). ATP binding has also been shown to play an important role in rice development (Coneva et al., 2014; Zhao et al., 2015; Chang et al., 2016; Lei et al., 2018).

FIGURE 4

Discussion

In the present study, we identified genes associated with rice yield using the RWR method on a functional similarity network. We demonstrated the high performance of the RWR approach via a five-fold cross-validation experiment and showed the robustness of the parameter r. As an application of the RWR approach, we predicted a landscape of associations between known seeds and candidate genes.

Our work has the following advantages. First, the RWR method can predict associations between known seed genes and candidate genes by spreading the information carried by the known seeds through their neighbors. Second, the interaction network provides a systematic view of functional similarities between genes, calculated from GO terms. Finally, the robustness of the parameter r leads to a high level of accuracy in making predictions, and the procedure used to select the parameter can be adapted to other datasets.

Rice is the most important food crop worldwide. Use of the RWR method in the functional similarity network can identify candidate genes associated with known rice yield-related genes, while gene ranking saves experimental time in the exploitation of rice as a major crop. Future development of our research will include the collection of more rice yield-related genes from online databases and the literature. Subsequent analysis with an effective prediction algorithm will enable the prediction of novel genes that can boost rice yield. In the future, we will further develop computational models for the identification and analysis of rice yield-related microRNAs and long non-coding RNAs, building on the work of Chen et al. (Chen and Yan, 2013; Chen and Huang, 2017; Chen et al., 2017b, 2018b).

Statements

Author contributions

CW designed the research. XZ performed the research. FX analyzed the data. JJ wrote the manuscript. All authors read and approved the manuscript.

Funding

The work was supported by the Natural Science Foundation of China (Nos. 91735306, 61872114, and 61872309).

Acknowledgments

We thank Sarah Williams, Ph.D., from Liwen Bianji, Edanz Group China (www.liwenbianji.cn), for editing the English text of a draft of this manuscript.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

  • AUC

    area under the ROC curve

  • GO

    gene ontology

  • ROC

    receiver operating characteristic

  • RWR

    random walk with restart.

References

  • 1

    Abe, K., Oshima, M., Akasaka, M., Konagaya, K. I., Nanasato, Y., Okuzaki, A., et al. (2018). Development and characterization of transgenic dominant male sterile rice toward an outcross-based breeding system. Breed. Sci. 68, 248–257. doi: 10.1270/jsbbs.17090

  • 2

    Aerts, S., Lambrechts, D., Maity, S., Van Loo, P., Coessens, B., De Smet, F., et al. (2006). Gene prioritization through genomic data fusion. Nat. Biotechnol. 24, 537–544. doi: 10.1038/nbt1203

  • 3

    Ashburner, M., Ball, C. A., Blake, J. A., Botstein, D., Butler, H., Cherry, J. M., et al. (2000). Gene ontology: tool for the unification of biology. Gene Ontology Consortium. Nat. Genet. 25, 25–29. doi: 10.1038/75556

  • 4

    Behroozi-Khazaei, N., and Nasirahmadi, A. (2017). A neural network based model to analyze rice parboiling process with small dataset. J. Food Sci. Technol. 54, 2562–2569. doi: 10.1007/s13197-017-2701-x

  • 5

    Borzee, A., Heo, K., and Jang, Y. (2018). Relationship between agro-environmental variables and breeding Hylids in rice paddies. Sci. Rep. 8:8049. doi: 10.1038/s41598-018-26222-w

  • 6

    Chang, Z., Chen, Z., Yan, W., Xie, G., Lu, J., Wang, N., et al. (2016). An ABC transporter, OsABCG26, is required for anther cuticle and pollen exine formation and pollen-pistil interactions in rice. Plant Sci. 253, 21–30. doi: 10.1016/j.plantsci.2016.09.006

  • 7

    Chen, L., Zhang, Y.-H., Zou, Q., Chu, C., and Ji, Z. (2016). Analysis of the chemical toxicity effects using the enrichment of Gene Ontology terms and KEGG pathways. Biochim. Biophys. Acta Gen. Sub. 1860, 2619–2626. doi: 10.1016/j.bbagen.2016.05.015

  • 8

    Chen, X., You, Z. H., Yan, G. Y., and Gong, D. W. (2016). IRWRLDA: improved random walk with restart for lncRNA-disease association prediction. Oncotarget 7, 57919–57931. doi: 10.18632/oncotarget.11141

  • 9

    Chen, Q. F., Ya, H. Y., Feng, Y. R., and Jiao, Z. (2014). Expression of the key genes involved in ABA biosynthesis in rice implanted by ion beam. Appl. Biochem. Biotechnol. 173, 239–247. doi: 10.1007/s12010-014-0837-y

  • 10

    Chen, X. (2016). miREFRWR: a novel disease-related microRNA-environmental factor interactions prediction method. Mol. Biosyst. 12, 624–633. doi: 10.1039/c5mb00697j

  • 11

    Chen, X., and Huang, L. (2017). LRSSLMDA: laplacian regularized sparse subspace learning for MiRNA-disease association prediction. PLoS Comput. Biol. 13:e1005912. doi: 10.1371/journal.pcbi.1005912

  • 12

    Chen, X., Huang, L., Xie, D., and Zhao, Q. (2018a). EGBMMDA: extreme gradient boosting machine for MiRNA-disease association prediction. Cell Death Dis. 9:3. doi: 10.1038/s41419-017-0003-x

  • 13

    Chen, X., Xie, D., Wang, L., Zhao, Q., You, Z. H., and Liu, H. (2018b). BNPMDA: bipartite network projection for MiRNA-disease association prediction. Bioinformatics 34, 3178–3186. doi: 10.1093/bioinformatics/bty333

  • 14

    Chen, X., Liu, M. X., and Yan, G. Y. (2012a). Drug-target interaction prediction by random walk on the heterogeneous network. Mol. Biosyst. 8, 1970–1978. doi: 10.1039/c2mb00002d

  • 15

    Chen, X., Liu, M. X., and Yan, G. Y. (2012b). RWRMDA: predicting novel human microRNA-disease associations. Mol. Biosyst. 8, 2792–2798. doi: 10.1039/c2mb25180a

  • 16

    Chen, X., Xie, D., Zhao, Q., and You, Z. H. (2017a). MicroRNAs and complex diseases: from experimental results to computational models. Brief Bioinform. doi: 10.1093/bib/bbx130 [Epub ahead of print].

  • 17

    Chen, X., Yan, C. C., Zhang, X., and You, Z. H. (2017b). Long non-coding RNAs and complex diseases: from experimental results to computational models. Brief Bioinform. 18, 558–576. doi: 10.1093/bib/bbw060

  • 18

    Chen, X., and Yan, G. Y. (2013). Novel human lncRNA-disease association inference based on lncRNA expression profiles. Bioinformatics 29, 2617–2624. doi: 10.1093/bioinformatics/btt426

  • 19

    Coneva, V., Simopoulos, C., Casaretto, J. A., El-Kereamy, A., Guevara, D. R., Cohn, J., et al. (2014). Metabolic and co-expression network-based analyses associated with nitrate response in rice. BMC Genomics 15:1056. doi: 10.1186/1471-2164-15-1056

  • 20

    Dehury, B., Behera, S. K., and Negi, S. (2017). Overcoming the limitation of GWAS platforms using systems biology approach. Curr. Bioinform. 12, 156–170. doi: 10.2174/15748936116661604261708

  • 21

    He, Y. H., Liang, X. F., He, S., Yuan, X. C., Wang, Q. C., Cai, W. J., et al. (2017). Circadian clock gene of grass carp (Ctenopharyngodon idellus): genomic structure and tissue expression pattern of period1 gene. Curr. Bioinform. 12, 312–319. doi: 10.2174/1574893611666160527101628

  • 22

    Holland, D. O., and Johnson, M. E. (2018). Stoichiometric balance of protein copy numbers is measurable and functionally significant in a protein-protein interaction network for yeast endocytosis. PLoS Comput. Biol. 14:e1006022. doi: 10.1371/journal.pcbi.1006022

  • 23

    Huang, R., Jiang, L., Zheng, J., Wang, T., Wang, H., Huang, Y., et al. (2013). Genetic bases of rice grain shape: so many genes, so little known. Trends Plant Sci. 18, 218–226. doi: 10.1016/j.tplants.2012.11.001

  • 24

    Jensen, L. J., Kuhn, M., Stark, M., Chaffron, S., Creevey, C., Muller, J., et al. (2009). STRING 8–a global view on proteins and their functional interactions in 630 organisms. Nucleic Acids Res. 37, D412–D416. doi: 10.1093/nar/gkn760

  • 25

    Jiang, J., Xing, F., Zeng, X. X., and Zou, Q. (2018). RicyerDB: a database for collecting rice yield-related genes with biological analysis. Int. J. Biol. Sci. 14, 965–970. doi: 10.7150/ijbs.23328

  • 26

    Kohler, S., Bauer, S., Horn, D., and Robinson, P. N. (2008). Walking the interactome for prioritization of candidate disease genes. Am. J. Hum. Genet. 82, 949–958. doi: 10.1016/j.ajhg.2008.02.013

  • 27

    Lee, I. (2011). Probabilistic functional gene societies. Prog. Biophys. Mol. Biol. 106, 435–442. doi: 10.1016/j.pbiomolbio.2011.01.003

  • 28

    Lei, L., Chen, J., Liu, Y., Wang, L., Zhao, G., and Chen, Z. Y. (2018). Dietary wheat bran oil is equally as effective as rice bran oil in reducing plasma cholesterol. J. Agric. Food Chem. 66, 2765–2774. doi: 10.1021/acs.jafc.7b06093

  • 29

    Li, F., Xie, J., Zhu, X., Wang, X., Zhao, Y., Ma, X., et al. (2018). Genetic basis underlying correlations among growth duration and yield traits revealed by GWAS in rice (Oryza sativa L.). Front. Plant Sci. 9:650. doi: 10.3389/fpls.2018.00650

  • 30

    Li, J., Li, H. Y., Zhi, J. K., Shen, C. Z., Yang, X. S., and Xu, J. C. (2017). Codon usage of expansin genes in Populus trichocarpa. Curr. Bioinform. 12, 452–461. doi: 10.2174/1574893611666161008195145

  • 31

    Li, M., Zheng, R. Q., Li, Q., Wang, J. X., Wu, F. X., and Zhang, Z. H. (2016). Prioritizing disease genes by using search engine algorithm. Curr. Bioinform. 11, 195–202. doi: 10.2174/1574893611666160125220905

  • 32

    Li, P., Guo, M., Wang, C., Liu, X., and Zou, Q. (2015). An overview of SNP interactions in genome-wide association studies. Brief. Funct. Genomics 14, 143–155. doi: 10.1093/bfgp/elu036

  • 33

    Liu, E., Zeng, S., Chen, X., Dang, X., Liang, L., Wang, H., et al. (2017). Identification of putative markers linked to grain plumpness in rice (Oryza sativa L.) via association mapping. BMC Genet. 18:89. doi: 10.1186/s12863-017-0559-6

  • 34

    Liu, Y., Zeng, X., He, Z., and Zou, Q. (2017). Inferring MicroRNA-disease associations by random walk on a heterogeneous network with multiple data sources. IEEE/ACM Trans. Comput. Biol. Bioinform. 14, 905–915. doi: 10.1109/tcbb.2016.2550432

  • 35

    Liu, X., Tang, W. H., Zhao, X. M., and Chen, L. (2010). A network approach to predict pathogenic genes for Fusarium graminearum. PLoS One 5:e13021. doi: 10.1371/journal.pone.0013021

  • 36

    Luo, J. W., and Liu, C. C. (2017). An effective method for identifying functional modules in dynamic PPI networks. Curr. Bioinform. 12, 66–79. doi: 10.2174/1574893611666160831113726

  • 37

    Mahender, A., Anandan, A., Pradhan, S. K., and Pandit, E. (2016). Rice grain nutritional traits and their enhancement using relevant genes and QTLs through advanced approaches. Springerplus 5:2086. doi: 10.1186/s40064-016-3744-6

  • 38

    Maione, C., and Barbosa, R. M. (2018). Recent applications of multivariate data analysis methods in the authentication of rice and the most analyzed parameters: a review. Crit. Rev. Food Sci. Nutr. doi: 10.1080/10408398.2018.1431763 [Epub ahead of print].

  • 39

    Odilbekov, F., Armoniene, R., Henriksson, T., and Chawade, A. (2018). Proximal phenotyping and machine learning methods to identify Septoria Tritici blotch disease symptoms in wheat. Front. Plant Sci. 9:685. doi: 10.3389/fpls.2018.00685

  • 40

    Okada, S., Onogi, A., Iijima, K., Hori, K., Iwata, H., Yokoyama, W., et al. (2018). Identification of QTLs for rice grain size using a novel set of chromosomal segment substitution lines derived from Yamadanishiki in the genetic background of Koshihikari. Breed. Sci. 68, 210–218. doi: 10.1270/jsbbs.17112

  • 41

    Peng, W., Wang, J. X., Zhang, Z., and Wu, F. X. (2016). Applications of random walk model on biological networks. Curr. Bioinform. 11, 211–220. doi: 10.2174/1574893611666160223200823

  • 42

    Raza, K. (2016). Reconstruction, topological and gene ontology enrichment analysis of cancerous gene regulatory network modules. Curr. Bioinform. 11, 243–258. doi: 10.2174/1574893611666160115212806

  • 43

    Rezadoost, H., Karimi, M., and Jafari, M. (2016). Proteomics of hot-wet and cold-dry temperaments proposed in Iranian traditional medicine: a network-based study. Sci. Rep. 6:30133. doi: 10.1038/srep30133

  • 44

    Shim, J. S., Oh, N., Chung, P. J., Kim, Y. S., Choi, Y. D., and Kim, J. K. (2018). Overexpression of OsNAC14 improves drought tolerance in rice. Front. Plant Sci. 9:310. doi: 10.3389/fpls.2018.00310

  • 45

    Szklarczyk, D., Franceschini, A., Wyder, S., Forslund, K., Heller, D., Huerta-Cepas, J., et al. (2015). STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 43, D447–D452. doi: 10.1093/nar/gku1003

  • 46

    The Gene Ontology Consortium (2017). Expansion of the gene ontology knowledgebase and resources. Nucleic Acids Res. 45, D331–D338. doi: 10.1093/nar/gkw1108

  • 47

    Torres, R. O., and Henry, A. (2018). Yield stability of selected rice breeding lines and donors across conditions of mild to moderately severe drought stress. Field Crops Res. 220, 37–45. doi: 10.1016/j.fcr.2016.09.011

  • 48

    Tranchevent, L. C., Ardeshirdavani, A., ElShal, S., Alcaide, D., Aerts, J., Auboeuf, D., et al. (2016). Candidate gene prioritization with endeavour. Nucleic Acids Res. 44, W117–W121. doi: 10.1093/nar/gkw365

  • 49

    Tranchevent, L. C., Barriot, R., Yu, S., Van Vooren, S., Van Loo, P., Coessens, B., et al. (2008). ENDEAVOUR update: a web resource for gene prioritization in multiple species. Nucleic Acids Res. 36, W377–W384. doi: 10.1093/nar/gkn325

  • 50

    Vlaic, S., Conrad, T., Tokarski-Schnelle, C., Gustafsson, M., Dahmen, U., Guthke, R., et al. (2018). ModuleDiscoverer: identification of regulatory modules in protein-protein interaction networks. Sci. Rep. 8:433. doi: 10.1038/s41598-017-18370-2

  • 51

    Wang, F., Song, B. X., Zhao, X., Miao, Y. T., Li, D. Y., Zhou, N., et al. (2016). Prediction and analysis of the protein-protein interaction networks for chickens, cattle, dogs, horses and rabbits. Curr. Bioinform. 11, 131–142. doi: 10.2174/1574893611666151203221255

  • 52

    Wu, T. Y., Gruissem, W., and Bhullar, N. K. (2018). Targeting intra-cellular transport combined with efficient uptake and storage significantly increases grain iron and zinc levels in rice. Plant Biotechnol. J. doi: 10.1111/pbi.12943 [Epub ahead of print].

  • 53

    Xiong, X., Duan, L., Liu, L., Tu, H., Yang, P., Wu, D., et al. (2017). Panicle-SEG: a robust image segmentation method for rice panicles in the field based on deep learning and superpixel optimization. Plant Methods 13:104. doi: 10.1186/s13007-017-0254-7

  • 54

    Xu, Y., Wang, X., Ding, X., Zheng, X., Yang, Z., Xu, C., et al. (2018). Genomic selection of agronomic traits in hybrid rice using an NCII population. Rice 11:32. doi: 10.1186/s12284-018-0223-4

  • 55

    You, Z. H., Huang, Z. A., Zhu, Z., Yan, G. Y., Li, Z. W., Wen, Z., et al. (2017). PBMDA: a novel and effective path-based computational model for miRNA-disease association prediction. PLoS Comput. Biol. 13:e1005455. doi: 10.1371/journal.pcbi.1005455

  • 56

    Zeng, J., Li, D., Wu, Y., Zou, Q., and Liu, X. (2016). An empirical study of features fusion techniques for protein-protein interaction prediction. Curr. Bioinform. 11, 4–12. doi: 10.2174/1574893611666151119221435

  • 57

    Zhang, M., Li, J., Chen, F., and Kong, Q. (2018). Unary non-structural fertilizer response model for rice crops and its field experimental verification. Sci. Rep. 8:2792. doi: 10.1038/s41598-018-21163-w

  • 58

    Zhang, X. D., Zhao, K. X., and Yang, Z. M. (2018). Identification of genomic ATP binding cassette (ABC) transporter genes and Cd-responsive ABCs in Brassica napus. Gene 664, 139–151. doi: 10.1016/j.gene.2018.04.060

  • 59

    Zhao, G., Shi, J., Liang, W., Xue, F., Luo, Q., Zhu, L., et al. (2015). Two ATP binding cassette G transporters, rice ATP binding cassette G26 and ATP Binding cassette G15, collaboratively regulate rice male reproduction. Plant Physiol. 169, 2064–2079. doi: 10.1104/pp.15.00262

  • 60

    Zhou, X., Bai, X., and Xing, Y. (2018). A rice genetic improvement boom by next generation sequencing. Curr. Issues Mol. Biol. 27, 109–126. doi: 10.21775/cimb.027.109

  • 61

    Zhu, L., Su, F., Xu, Y., and Zou, Q. (2018). Network-based method for mining novel HPV infection related genes using random walk with restart algorithm. Biochim. Biophys. Acta Mol. Basis Dis. 1864, 2376–2383. doi: 10.1016/j.bbadis.2017.11.021

Summary

Keywords

rice, yield, random walking, function, network

Citation

Jiang J, Xing F, Wang C and Zeng X (2018) Identification and Analysis of Rice Yield-Related Candidate Genes by Walking on the Functional Network. Front. Plant Sci. 9:1685. doi: 10.3389/fpls.2018.01685

Received

28 June 2018

Accepted

30 October 2018

Published

20 November 2018

Volume

9 - 2018

Edited by

Arun Kumar Sangaiah, VIT University, India

Reviewed by

Jing Lu, Walmart Labs, United States; Feng Zhu, Zhejiang University, China

Copyright

*Correspondence: Chunyu Wang, Xiangxiang Zeng,

This article was submitted to Bioinformatics and Computational Biology, a section of the journal Frontiers in Plant Science

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
