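A roundup of analysis packages for oligo/BAC CGH arrays
This is a work in progress. When I save a post as a draft I usually end up never publishing it, so I'm posting it as is.
I plan to revise it next week.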
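Purpose and motivation
I want to analyze CGH microarrays from platforms I don't normally use and compare them with my own experimental data, so I'm collecting tools (packages) for analyzing other vendors' platforms here *1.
The targets are BAC arrays and Agilent CGH arrays. # That was the plan, but some of these tools also work with Affy.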
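Background
In my current research I analyze chromosomal structural abnormalities using Affymetrix's GenomeWideSNP 6.0 array.
Recently, a large number of chromosomal structural abnormalities have been reported across various psychiatric disorders *2. Besides being published in papers, these data are (in some cases) also available from NCBI GEO.
The packages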
cghMCR (R/BioC package)
Depends on DNAcopy, marray, arrayQuality
Find chromosome regions showing common gains/losses.
Based on the algorithm proposed by Dr. Lynda Chin's lab, this package provides functions that identify chromosome regions that show gains/losses commonly observed across different samples profiled using arrayCGH platform.
CGHcall (R/BioC package)
Depends on CGHbase, impute, DNAcopy
It seems to be meant for tumor profiles, so it may not be well suited to chromosomal structural abnormalities (in terms of its parameters).
Calling aberrations for array CGH tumor profiles.
Calls aberrations for array CGH data using a six state mixture model as well as several biological concepts that are ignored by existing algorithms. Visualization of profiles is also provided.
DNAcopy (R/BioC package)
DNA copy number data analysis
Segments DNA copy number data using circular binary segmentation to detect regions with abnormal copy number
Looking at the sample data
> library(DNAcopy)
> data(coriell)
> is(coriell)
[1] "data.frame" "list"       "oldClass"   "vector"
> head(coriell)
        Clone Chromosome Position Coriell.05296 Coriell.13330
1  GS1-232B23          1        0            NA      0.207470
2  RP11-82d16          1      468      0.008824      0.063076
3  RP11-62m23          1     2241     -0.000890      0.123881
4  RP11-60j11          1     4504      0.075875      0.154343
5 RP11-111O05          1     5440      0.017303     -0.043890
6  RP11-51b04          1     7000     -0.006770      0.094144
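Coriell.05296 and Coriell.13330 are sample names. It looks like the data need to be arranged as a data.frame in this layout (one row per clone, one column per sample).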
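As a rough sketch of how this data.frame then feeds into the segmentation, the following follows the usage pattern I know from the DNAcopy vignette: wrap one sample column in a CNA object, smooth single-point outliers, then run circular binary segmentation. The sample label "c05296" and the variable names are just illustrative choices, and set.seed() is there only because segment() uses a permutation-based test, so results can otherwise vary slightly between runs.

```r
library(DNAcopy)
data(coriell)

# Build a CNA object from one sample column plus chromosome and position.
cna <- CNA(genomdat  = cbind(coriell$Coriell.05296),
           chrom     = coriell$Chromosome,
           maploc    = coriell$Position,
           data.type = "logratio",
           sampleid  = "c05296")   # illustrative label

# Smooth single-point outliers before segmenting.
smoothed <- smooth.CNA(cna)

# Circular binary segmentation; seeded because it relies on permutations.
set.seed(1)
seg <- segment(smoothed, verbose = 0)

# One row per segment: ID, chrom, loc.start, loc.end, num.mark, seg.mean
head(seg$output)
```

Segments whose seg.mean is clearly away from 0 are the candidates for gains or losses. The default parameters are only a starting point and would need tuning for a different array platform.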
http://www.bioconductor.org/packages/bioc/html/DNAcopy.html
aCGH (R/BioC package)
Depends on cluster, survival, multtest, sma
Classes and functions for Array Comparative Genomic Hybridization data.
Functions for reading aCGH data from image analysis output files and clone information files, creation of aCGH S3 objects for storing these data. Basic methods for accessing/replacing, subsetting, printing and plotting aCGH objects.
snapCGH (R/BioC package)
Depends on limma, tilingArray, DNAcopy, GLAD, cluster, methods, aCGH
Segmentation, normalisation and processing of aCGH data.
Methods for segmenting, normalising and processing aCGH data; including plotting functions for visualising raw and segmented data for individual and multiple arrays.
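This is the one that was recommended on the BioC mailing list (n=1). It pulls together several of the other R/BioC packages.
It is also nice that read.maimages() can read directly from files that record the spot fluorescence intensities.
> library(snapCGH)
> datadir <- system.file("testdata", package = "snapCGH")
> targets <- readTargets("targets.txt", path = datadir)
> RG1 <- read.maimages(targets$FileName, path = datadir, source = "genepix")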
SMAP (R/BioC package)
A Segmental Maximum A Posteriori Approach to Array-CGH Copy Number Profiling
Functions and classes for DNA copy number profiling of array-CGH data
MANOR (R/BioC package)
CGH Micro-Array NORmalization
We propose importation, normalization, visualization, and quality control functions to correct identified sources of variability in array-CGH experiments.
GLAD (R/BioC package)
Gain and Loss Analysis of DNA
Analysis of array CGH data : detection of breakpoints in genomic profiles and assignment of a status (gain, normal or lost) to each chromosomal regions identified.
ADaCGH (R package, on CRAN)
Analysis of data from aCGH experiments
Analysis and plotting of array CGH data. Allows usage of Circular Binary Segmentation, wavelet-based smoothing, ACE method (CGH Explorer), HMM, BioHMM, GLAD, CGHseg, and Price's modification of Smith & Waterman's algorithm. Most computations are parallelized. Figures are imagemaps with links to IDClight (http://idclight.bioinfo.cnio.es).
http://cran.r-project.org/web/packages/ADaCGH/index.html
WebApp version
http://adacgh2.bioinfo.cnio.es/
cgh (R)
Microarray CGH analysis using the Smith-Waterman algorithm
Functions to analyze microarray comparative genome hybridization data using the Smith-Waterman algorithm
CNVFinder (Perl)
The CNVFinder algorithm has been designed to detect copy number variants (CNVs) in human population from large-insert clone DNA microarray covering the entire human genome in tiling path resolution (WGTP platform).
CAPWeb
It doesn't look like this can be installed locally.
CAPweb is a web tool devoted to the analysis of copy number microarray data. CAPweb starts from image analysis results (a gpr file for example) and goes up to biological results. Different formats are supported by CAPweb:
MAIA - GENEPIX - SPOT - IMAGENE - AGILENT - Affymetrix SNP 100k/500k.
CGHScan (Java Swing)
CGHScan analyzes comparative genomic hybridization data to delineate the boundaries of deleted or divergent regions in a particular genome as compared to a reference genome. For more information on how the program works, consult the documentation.
CGH-Plotter (MATLAB)
MATLAB Toolbox for CGH-data Analysis
The CGH-Plotter is a MATLAB toolbox with a graphical user interface for comparative genomic hybridization (CGH) data analysis. The CGH-Plotter identifies putative groups of genes whose copy-number is deleted or amplified using k-means clustering and dynamic programming. The CGH-Plotter allows also representative illustrations of CGH-data. The CGH-Plotter is platform independent and requires MATLAB 6.1 or higher in order to operate.
http://bioinformatics.oxfordjournals.org/cgi/screenpdf/19/13/1714
CGHweb (R/BioC package)
Related packages include waveslim, quantreg, snapCGH, cghFLasso, FASeg, GLAD, GDD, and gplots.
http://compbio.med.harvard.edu/CGHweb/
CNIT
Affy GeneChip
What is CNIT program?
Copy number inferring tool (CNIT) is designed for Affymetrix GeneChip to analyze copy number of each SNP allele. CNIT can be applicable in chromosome-abnormal disease, cancer and copy number variation studies, and can provide accurate CN estimations with low false-positive rate.
CNVtools (R/BioC package)
A package for running association tests in case-control designs.
CNVtools is an R package for performing robust case control and quantitative trait association analyses of Copy Number Variants.
The package implements a robust association framework by unifying genotyping and association testing into a single model. This is done by incorporating a disease model, which is either a logistic regression disease model for a dichotomous disease variable or a standard regression for a quantitative trait, into the mixture model for the signal. Association is assessed via a likelihood ratio test. The procedure is assay/platform independent and can be applied whenever there is a univariate diploid copy number eg SNP genotyping assays (R coordinate), Array-CGH or quantitative PCR.
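To get a feel for the "likelihood ratio test on a logistic disease model" part, here is a minimal base-R sketch on simulated data. It is deliberately not the CNVtools API and not its joint mixture model: it plugs the raw signal straight into a logistic regression and compares it with an intercept-only model. The sample size, copy-number frequencies, and noise level are all made-up assumptions.

```r
set.seed(42)
n <- 400

# Hypothetical case/control labels (1 = case, 0 = control).
disease <- rbinom(n, 1, 0.5)

# Hypothetical copy number: in this simulation deletions (CN = 1)
# are more common among cases than among controls.
p_del <- ifelse(disease == 1, 0.3, 0.1)
copy_number <- ifelse(runif(n) < p_del, 1, 2)

# Noisy univariate signal centered on the true copy number.
signal <- rnorm(n, mean = copy_number, sd = 0.3)

# Null model: disease status does not depend on the signal.
null_fit <- glm(disease ~ 1, family = binomial)
# Alternative model: logistic regression of disease on the signal.
alt_fit <- glm(disease ~ signal, family = binomial)

# Likelihood ratio statistic, compared to chi-squared with 1 df.
lrt_stat <- as.numeric(2 * (logLik(alt_fit) - logLik(null_fit)))
p_value <- pchisq(lrt_stat, df = 1, lower.tail = FALSE)

cat("LRT statistic:", lrt_stat, " p-value:", p_value, "\n")
```

According to the description above, CNVtools instead folds the disease model into the mixture model for the signal, so the copy-number calling and the association test are fitted together rather than in two steps like this.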
dChip
Affy GeneChip
http://www.hsph.harvard.edu/~cli/complab/dchip/
PennCNV
Illumina and Affymetrix arrays (other arrays can also be processed if the input files are prepared in the right format)
PennCNV is a free software tool for Copy Number Variation (CNV) detection from SNP genotyping arrays. Currently it can handle signal intensity data from Illumina and Affymetrix arrays. With appropriate preparation of file format, it can also handle other types of SNP arrays and oligonucleotide arrays.
http://www.openbioinformatics.org/penncnv/
# Preparing signal intensity files from other types of arrays (Agilent, Nimblegen, Affy, etc)
http://www.openbioinformatics.org/penncnv/penncnv_input.html#_Toc214852007
QuantiSNP
As the name suggests, it is for SNP arrays: Illumina and Affy.
QuantiSNP is an analytical tool for the analysis of copy number variation using whole genome SNP genotyping data. In its first implementation it was developed for data arising from Illumina platforms and this is fully described in Colella and Yau et al., 2007. At present we are in the process of further developing QuantiSNP with a particular interest in adapting the algorithm for cancer sample analysis.
ISACGH
WebApp
Merging DNA copy number and gene expression to the analysis of Array CGH
http://gepas3.bioinfo.cipf.es/cgi-bin/tutoXX?c=/isacgh/isacgh.config
InSilicoArray CGH
Vendor and third-party software (commercial)
Copy Number Analysis Module (CNAM)
ImaGene CGH
NimbleScan
DNACopy and segMNT algorithms are available in NimbleScan software to analyze CGH data.
Partek GS for Copy Number Data
Illumina, GeneChip compatible.
http://www.partek.com/partekgs_copynumber
CGH Analytics Software
The Agilent CGH Analytics 3.4 software provides an intuitive user interface for visually exploring, detecting and analyzing aberration patterns from multiple Comparative Genomic Hybridization (CGH) microarray profiles. It accepts data output from Agilent Feature Extraction software and displays chromosomal deletions and amplifications at multiple zoom levels simultaneously. Take advantage of the new joint analysis module to detect changes in both copy number and gene expression from experiments that have aCGH and gene expression data available.
It turns out there are kind people in this world...
I found a similar list:
http://www.nslij-genetics.org/cnv/programs.html