Received: June 23, 2016; Revised: July 25, 2016
Abstract: Long non-coding RNA (lncRNA) data come in many types, which makes useful information hard to extract. To address this problem, this paper proposes a visualization system for multi-source lncRNA data based on the Generic Genome Browser (GBrowse). The system consists mainly of a web server and lncRNA data storage. The web server comprises an HTTP service and the GBrowse web components, and the storage layer supports flat text files, MySQL, SQLite, and other backends. Building the system involves GBrowse installation and configuration, multi-source lncRNA data collection, preprocessing, storage, and configuration of data access and visualization. A prototype system collects six human lncRNA datasets: human gene annotation, genome sequence, histone modification H3K4me3 signals and their predicted loci, and transcription factor CTCF binding-site signals and their predicted loci. After preprocessing, the data are stored in lncRNA databases built with MySQL, SQLite, and similar backends, and the data access methods and visualization parameters are configured. The experimental results show that multi-source lncRNA data can be integrated and visualized within the GBrowse framework and displayed together in genomic coordinate space, which lets researchers inspect the data more intuitively and helps them formulate new scientific hypotheses.
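The preprocessing step mentioned in the abstract can be illustrated with a minimal sketch. It assumes, hypothetically, that predicted loci (for example H3K4me3 or CTCF peaks) arrive as tab-separated BED lines and need to be rewritten as GFF3 feature lines before loading; the paper does not state which input formats or conversion tools were actually used. The function name `bed_to_gff3` and the default `source` and `feature_type` values are illustrative only.

```python
def bed_to_gff3(bed_line, source="example", feature_type="peak"):
    """Convert one tab-separated BED line into one GFF3 feature line.

    BED coordinates are 0-based and half-open; GFF3 coordinates are
    1-based and inclusive, so only the start position shifts by one.
    Assumption: names contain no GFF3-reserved characters (; = & ,).
    """
    fields = bed_line.rstrip("\n").split("\t")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])

    gff_start = start + 1  # 0-based -> 1-based
    gff_end = end          # half-open end equals inclusive 1-based end

    # Optional BED columns: name, score, strand.
    name = fields[3] if len(fields) > 3 else f"{chrom}_{gff_start}_{gff_end}"
    score = fields[4] if len(fields) > 4 else "."
    strand = fields[5] if len(fields) > 5 and fields[5] in ("+", "-") else "."

    attributes = f"ID={name};Name={name}"
    columns = [chrom, source, feature_type, str(gff_start), str(gff_end),
               score, strand, ".", attributes]
    return "\t".join(columns)


if __name__ == "__main__":
    print("##gff-version 3")
    print(bed_to_gff3("chr1\t99\t200\tpeak1\t500\t+"))
```

The coordinate shift is the part most easily gotten wrong: a BED interval `99-200` covers the same bases as the GFF3 interval `100-200`.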
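Funding: National Natural Science Foundation of China (61301220); Yangzhou University Undergraduate Academic Science and Technology Innovation Fund (x2015423, x2015444)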
Citation:
SUN Lei, CHEN Xuan, TANG Hong, WEI Li-Ting, JI Lan-Yang, SHI Sheng-Fei, YANG Xiao-Hua. Visualization System of Multi-Source Long Non-Coding RNA Data Based on GBrowse. COMPUTER SYSTEMS APPLICATIONS, 2017, 26(3): 81-85