1. Research Purpose and Significance; Domestic and International Research Status (Literature Review)
Significance:
Cotton (Gossypium spp.) is widely cultivated for its single-celled fiber, which is used in the textile industry. However, a comprehensive genetic analysis of fiber development is difficult when it relies only on a few key genes identified by map-based cloning or similar methods, because the trait is complex and the data are noisy. Long noncoding RNAs (lncRNAs) are transcripts at least 200 nucleotides in length with no apparent coding capacity. They are involved in various biological regulatory processes, but their detailed functions in plants have not yet been elucidated. With the rapid advance of deep sequencing technology, many lncRNAs have been discovered in the cotton genome[1].
The current surge of omics data, especially expression profiles covering all transcripts (mRNA and lncRNA), makes it possible to build a systematic understanding of fiber development rather than only listing differentially expressed genes (DEGs). These data can therefore be used for deeper analysis.
2. Basic Research Content and Questions
Objective:
We aim to obtain a systematic view of a complex trait (fiber growth) at the molecular (RNA) level, together with some basic knowledge of the regulatory relationships involving long noncoding transcripts.
Content:
3. Research Methods and Plan
Research method:
As described in the significance section above, sufficient data are available for analysis, and a high-performance Linux server is available for running the computational programs.
We will use computational methods to integrate transcript expression data and GWAS results, construct a gene co-expression network, and identify the modules that are meaningful for fiber development.
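As an illustration only, the idea of building a co-expression network and prioritising modules with GWAS results can be sketched as follows. This is a minimal sketch under stated assumptions, not the project's actual pipeline: it assumes edges are defined by a hard Pearson-correlation threshold, that modules are the connected components of the resulting graph, and that modules are ranked by how many GWAS candidate genes they contain. The transcript names, expression values, threshold, and GWAS hit are all made up for the example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def coexpression_modules(expression, threshold=0.9):
    """Return modules as connected components of the graph whose edges
    join transcripts with |r| >= threshold (an assumed, simplified rule)."""
    neighbours = {name: set() for name in expression}
    for a, b in combinations(expression, 2):
        if abs(pearson(expression[a], expression[b])) >= threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)

    modules, seen = [], set()
    for start in expression:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        modules.append(component)
    return modules


def rank_by_gwas_overlap(modules, gwas_genes):
    """Order modules by how many GWAS candidate genes they contain."""
    return sorted(modules, key=lambda m: len(m & gwas_genes), reverse=True)


# Toy data: hypothetical transcripts with made-up values across five samples.
expression = {
    "mRNA_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "lncRNA_X": [2.1, 3.9, 6.2, 8.0, 9.9],
    "mRNA_B": [5.0, 1.0, 4.0, 2.0, 3.0],
    "lncRNA_Y": [10.0, 2.5, 8.1, 4.2, 6.0],
}
gwas_genes = {"mRNA_A"}  # hypothetical GWAS candidate

modules = coexpression_modules(expression, threshold=0.9)
top_module = rank_by_gwas_overlap(modules, gwas_genes)[0]
print(sorted(top_module))
```

In this toy case the lncRNA that lands in the same module as the GWAS candidate becomes a candidate regulator worth following up. A real analysis would more likely use a dedicated co-expression tool with soft thresholding and would need to handle far more transcripts than an all-pairs loop in plain Python can manage.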
4. Research Innovations
Unique features or innovations:
1) This work could provide a systematic view of a complex cotton trait, fiber development, by integrating useful existing information.
2) Exploring lncRNA function is an emerging focus of epigenetic research in plant genomes, especially in allopolyploids. Our work could offer some preliminary knowledge for such studies.
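5. Research Plan and Progress
July 2016 - August 2016: Learn basic Linux system operations and become proficient with R and RStudio. In parallel, write and run a Perl pipeline script that computes expression levels for all transcripts in the 47 RNA-seq datasets. Study clustering theory in detail and become proficient with the pheatmap R package. Complete the list of previously reported fiber-related genes.
September 2016 - January 2017:
1. Study multivariate statistics in depth (including clustering, classification, and dimensionality reduction), fully master R and RStudio, and learn the bioinformatics algorithms relevant to the project (HMM, MCMC, etc.). Audit the College of Science courses "Mathematical Statistics" and "Biomathematical Models", and take "Numerical Analysis" as a cross-disciplinary minor course. Aim to use computational tools with an understanding of both what they do and why they work, in order to uncover the biological patterns hidden in large volumes of data.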
