We developed Tilescope, a fully integrated data processing pipeline for analyzing high-density tiling-array data http://tilescope. packages, such as ExpressYourself [3] or MIDAS [4], are available to process and analyze the data sets generated in such studies. However, limited by their manufacturing methodology, traditional microarrays are not amenable to systematic coverage of large genomes or even of some large genomic regions. To fully realize the parallel-measurement potential of microarray technology, the current trend is to present large genomic regions (for example, ENCODE regions or a complete human chromosome) or even an entire genome on one or several microarrays in an unbiased fashion by using oligonucleotides (that is, tiles) uniformly sampled from the presented genomic sequences. Recent technology breakthroughs [5,6] made it possible for such oligonucleotides, typically 25-60 base pairs (bp) in length, to be chemically synthesized directly on the microarray slides at very high density (up to 6.6 million elements in less than 2 cm²). Such oligonucleotide tiling microarrays, which give unprecedented genomic coverage and resolution, can be used for genomic studies of gene expression [7-10], chromatin immunoprecipitation (ChIP-chip) [11], copy number variation [12], histone modification [13], and chromatin DNaseI sensitivity [14].

As with any nascent technology, ready-to-use data analysis software packages for tiling array experiments are hard to find. Existing data processing software for traditional microarrays cannot be used, since the considerably larger size and different nature of tiling array data require a new analysis approach [15]. Recently, a model-based method for tiling array ChIP-chip data analysis has been proposed [16]. Two other methods, based on curve fitting [17] and multi-channel combination [18], respectively, have also been developed for tiling array transcription data analysis.
The excellent open-source Bioconductor software project [19] provides many sophisticated statistical methods written in R for microarray data analysis. However, as a software toolbox and a programming environment, it is rather difficult for non-programmers to use. Here we present Tilescope, an automated data processing pipeline for analyzing data sets generated in experiments using high-density tiling microarrays. Suitable microarray data processing methods, either previously published elsewhere or newly developed, were implemented and made conveniently available in a single online software pipeline. It has a user-friendly interface and is freely accessible over the World Wide Web. The software performs data normalization, combination of replicate experiments, tile scoring, and feature identification. We demonstrate the modular nature of the pipeline design by showing how different methods can be plugged in: at major data processing steps, such as normalization and feature identification, several methods are available, and the user chooses among them depending on the nature of the data and the data-analysis goal. The program can process gene expression and ChIP-chip tiling microarray data. The results, presented in a clear, well-organized manner, can be downloaded for further analysis.

System implementation and user interface

Tilescope was developed entirely in Java. Java was chosen as the programming language because of its built-in threading capability and its excellent library support for graphical user interface and networking development. More importantly, it was chosen because of its object-oriented nature: the program code is organized into coherent classes, which naturally modularizes the system and greatly facilitates parallel system development and subsequent system updating, a desideratum for any software engineering project of nontrivial complexity.
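The plug-in idea described above, in which interchangeable methods sit at the normalization and feature-identification steps, can be illustrated with a minimal Java sketch. Everything in it is an assumption made for illustration: the names `TilePipelineSketch`, `Normalizer`, `FeatureFinder`, `MEDIAN_CENTER`, and `threshold` are hypothetical and are not Tilescope's actual classes, and median centering, replicate averaging, and a fixed cutoff merely stand in for whichever methods the pipeline really offers.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

/** Hypothetical sketch of a pluggable tiling-array pipeline; not Tilescope's actual API. */
final class TilePipelineSketch {

    /** Pluggable normalization step: returns a normalized copy of one replicate. */
    interface Normalizer {
        double[] normalize(double[] logIntensities);
    }

    /** Pluggable feature-identification step: returns indices of tiles called as features. */
    interface FeatureFinder {
        int[] find(double[] tileScores);
    }

    /** Example plug-in (assumed): subtract the replicate's median from every tile. */
    static final Normalizer MEDIAN_CENTER = values -> {
        double med = median(values);
        return Arrays.stream(values).map(v -> v - med).toArray();
    };

    /** Example plug-in (assumed): call every tile whose score exceeds a fixed cutoff. */
    static FeatureFinder threshold(double cutoff) {
        return scores -> IntStream.range(0, scores.length)
                .filter(i -> scores[i] > cutoff)
                .toArray();
    }

    static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return n % 2 == 1 ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    /**
     * Normalizes each replicate, combines replicates by averaging per tile
     * (the average doubles as the tile score in this sketch), then finds features.
     */
    static int[] run(List<double[]> replicates, Normalizer normalizer, FeatureFinder finder) {
        int tiles = replicates.get(0).length;
        double[] combined = new double[tiles];
        for (double[] replicate : replicates) {
            double[] normalized = normalizer.normalize(replicate);
            for (int i = 0; i < tiles; i++) {
                combined[i] += normalized[i] / replicates.size();
            }
        }
        return finder.find(combined);
    }
}
```

Because each step is an interface, swapping in a different method means passing a different object to `run`, for example `TilePipelineSketch.run(replicates, TilePipelineSketch.MEDIAN_CENTER, TilePipelineSketch.threshold(3.0))`; the rest of the pipeline code is untouched.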
As a web-accessible program system, Tilescope is composed of three connected components: an applet, a servlet, and a pipeline program. The applet is the graphical interface through which the user interacts with Tilescope. It is automatically downloaded and launched inside a Java-enabled web browser whenever the pipeline web page is visited. Through the Tilescope applet, a user can upload array data files to the pipeline server, select appropriate pipeline parameters and methods, run the data processing program, and view or download analysis results. The applet, however, cannot run the pipeline program directly. Instead, it sends data processing requests to the servlet, a server program that acts as the proxy of the pipeline program on the web and responds to the applet's requests. The servlet, the central layer of Tilescope, runs two 'daemon' threads in the background. These threads handle file upload and data processing requests (that is, they accept and schedule each request, or reject it, based on the current system load), prepare the pipeline running environment, and start the back-end pipeline program with the user-specified parameters. The pipeline program does the heavy lifting: the actual data processing. This modular design – the separation between.
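The accept-and-schedule-or-reject behaviour attributed to the servlet's background threads can be sketched with standard `java.util.concurrent` classes. This is only an illustration under stated assumptions: the class name `RequestSchedulerSketch` is hypothetical, a bounded queue stands in for whatever load measure Tilescope actually uses, a single daemon thread is shown where the text describes two, and no servlet API is involved.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical sketch of load-based request handling by a background daemon thread. */
final class RequestSchedulerSketch {
    // Assumption: "system load" is approximated by the number of queued requests.
    private final BlockingQueue<Runnable> pending;

    RequestSchedulerSketch(int maxPending) {
        this.pending = new ArrayBlockingQueue<>(maxPending);
        Thread worker = new Thread(this::drain, "pipeline-daemon");
        worker.setDaemon(true); // a daemon thread does not keep the JVM alive at shutdown
        worker.start();
    }

    /** Accepts and schedules the request, or rejects it (returns false) when the queue is full. */
    boolean submit(Runnable pipelineJob) {
        return pending.offer(pipelineJob);
    }

    /** Runs queued jobs one at a time, in arrival order, until interrupted. */
    private void drain() {
        try {
            while (true) {
                Runnable job = pending.take(); // blocks while no request is waiting
                try {
                    job.run();
                } catch (RuntimeException e) {
                    // A failing job must not kill the daemon; a real system would log this.
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

A caller would treat a `false` return from `submit` as a rejection and report to the user that the server is busy. Keeping this logic apart from the job itself mirrors the separation the text draws between the servlet layer and the back-end pipeline program.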