6 Coverage Analysis
Read coverage is an important factor for interpreting DNA methylomes.
Different analyses require different coverage levels. For example, SNV calling requires higher coverage than a DMR study does. SNV calling depends on the read counts of all four nucleotides (A, T, C and G), whereas DNA methylation levels depend only on the T and C read counts aligned to cytosines.
In CGmapTools, we proposed two measures for evaluating the coverage of DNA methylomes: OverAll Coverage (OAC) and Methylation-Effective Coverage (MEC).
- OAC is the average read coverage over all nucleotides on both strands, calculated from the ATCGmap file.
- MEC is the average read coverage over cytosines only, calculated from the CGmap file. Generally, the MEC is slightly higher than half of the OAC.
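The two definitions can be illustrated with a minimal Python sketch. It is not the CGmapTools implementation: the column layouts are assumptions (ATCGmap with the Watson and Crick A/T/C/G/N counts in columns 6 to 15, CGmap with the total C/T read count in column 8), the toy records are made up, and whether the tool counts N reads or skips uncovered positions is not stated here, so the sketch simply counts everything it is given.

```python
from statistics import mean

# Toy records (invented values). Assumed ATCGmap columns:
# chr, nuc, pos, context, dinuc, Watson A T C G N, Crick A T C G N, meth level
ATCGMAP = """\
chr1 C 101 CG CG 0 2 8 0 0 0 0 0 6 0 0.80
chr1 A 102 -- -- 9 0 0 0 0 7 0 0 0 0 na
chr1 G 103 CHH CC 0 0 0 10 0 0 2 6 0 0 0.75
"""

# Assumed CGmap columns:
# chr, nuc, pos, context, dinuc, meth level, methylated count, total C/T count
CGMAP = """\
chr1 C 101 CG CG 0.80 8 10
chr1 G 103 CHH CC 0.75 6 8
"""


def overall_coverage(atcgmap_text):
    """Mean of per-position read totals, summed over both strands."""
    totals = []
    for line in atcgmap_text.splitlines():
        fields = line.split()
        if len(fields) < 15:
            continue  # skip blank or malformed lines
        totals.append(sum(int(x) for x in fields[5:15]))
    return mean(totals)


def methylation_effective_coverage(cgmap_text):
    """Mean of the C/T read totals reported for cytosine sites."""
    depths = []
    for line in cgmap_text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or malformed lines
        depths.append(int(fields[7]))
    return mean(depths)


oac = overall_coverage(ATCGMAP)
mec = methylation_effective_coverage(CGMAP)
print(f"OAC = {oac:.2f}, MEC = {mec:.2f}, MEC/OAC = {mec / oac:.2f}")
```

In this toy data the MEC lands a little above half of the OAC, because only the C and T reads on the cytosine-carrying strand contribute to it.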
CGmapTools provides functions for basic coverage statistics (cgmaptools oac stat and cgmaptools mec stat) and for visualizing coverage in bins across the genome (cgmaptools oac bin and cgmaptools mec bin).
6.1 oac
- Command
cgmaptools oac -h
# Usage: cgmaptools oac <command> [options]
# Version: 0.1.2
# Updated on: Dec. 14th, 2018
# Commands:
# bin * overall coverage in bins
# stat * overall coverage statistics globally
6.1.1 oac bin
- Command
cgmaptools oac bin -h
# Usage: cgmaptools oac bin [-i <ATCGmap>] [-B 5000000]
# (aka ATCGmapCovInBins)
# Description: Generate the overall coverage in Bins.
# Contact: Guo, Weilong; guoweilong@126.com;
# Last Update: 2016-12-07
# Output Ex:
# chr1 1 5000 29.0000
# chr1 5001 10000 30.0396
# chr2 1 5000 35.0755
# chr2 5001 10000 40.0027
# chr3 1 5000 na
#
# Options:
# -h, --help show this help message and exit
# -i FILE File name end with .ATCGmap or .ATCGmap.gz. If not
# specified, STDIN will be used.
# -B BIN_SIZE Define the size of bins [Default: 5000000]
# -f FILE, --figure-type=FILE
# png, pdf, eps. Will not generate figure if not
# specified
# -H FLOAT Height of figure in inch [Default: 4]
# -W FLOAT Width of figure in inch [Default: 8]
# -p STRING Prefix for output figures
# -t STRING, --title=STRING
# title in the output figures
Example
cgmaptools oac bin -i WG.ATCGmap.gz -B 1000 -f png -p WG -t WG > WG.oac_bin.data
Output figure
6.1.2 oac stat
- Command
cgmaptools oac stat -h
# Usage: cgmaptools oac stat [-i <ATCGmap>]
# (aka ATCGmapStatCov)
# Description: Get the distribution of overall coverages.
# Contact: Guo, Weilong; guoweilong@126.com;
# Last Update: 2018-05-02
# Output Ex:
# OverAllCov global 47.0395
# OverAllCov chr1 45.3157
# OverAllCov chr10 47.7380
# CovAndCount 1 1567
# CovAndCount 2 655
# CovAndCount 3 380
#
# Options:
# -h, --help show this help message and exit
# -i FILE File name end with .ATCGmap or .ATCGmap.gz. If not
# specified, STDIN will be used.
# -f FILE, --figure-type=FILE
# png, pdf, eps. Will not generate figure if not
# specified
# -H FLOAT Scale ratio for the Height of figure [Default: 4]
# -W FLOAT Width of figure in inch [Default: 8]
# -p STRING Prefix for output figures
Example
cgmaptools oac stat -i WG.ATCGmap.gz -p WG -f png > WG.oac_stat.data
Output format:
The output format of bin:
chr1 1 5000 29.0000
chr1 5001 10000 30.0396
chr2 1 5000 35.0755
chr2 5001 10000 40.0027
chr3 1 5000 na
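For downstream work, the bin output can be read into Python with a few lines. The sketch below assumes the four columns are chromosome, bin start, bin end and average coverage (an interpretation of the example, not a documented specification) and that na marks a bin with no coverage value; parse_bins is a hypothetical helper name.

```python
import math

# The example records from the help text, inlined so the sketch runs as-is.
BIN_OUTPUT = """\
chr1 1 5000 29.0000
chr1 5001 10000 30.0396
chr2 1 5000 35.0755
chr2 5001 10000 40.0027
chr3 1 5000 na
"""


def parse_bins(text):
    """Return (chrom, start, end, coverage) tuples; 'na' becomes NaN."""
    rows = []
    for line in text.splitlines():
        fields = line.split()  # works for tab- or space-separated columns
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        chrom, start, end, cov = fields
        value = math.nan if cov == "na" else float(cov)
        rows.append((chrom, int(start), int(end), value))
    return rows


bins = parse_bins(BIN_OUTPUT)
covered = [cov for _, _, _, cov in bins if not math.isnan(cov)]
print(f"{len(covered)} of {len(bins)} bins have a coverage value")
```

Keeping na as NaN, rather than dropping the row, preserves the bin positions for plotting along the genome.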
The output format of stat:
OverAllCov global 47.0395
OverAllCov chr1 45.3157
OverAllCov chr10 47.7380
CovAndCount 1 1567
CovAndCount 2 655
CovAndCount 3 380
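The OverAllCov records can be compared against the global value to spot chromosomes with unusual coverage. This is a small sketch under the assumption that each such record is a tag, a chromosome name (or global), and a mean coverage; chromosome_means is a hypothetical helper, not part of CGmapTools.

```python
# The example records from the help text, inlined so the sketch runs as-is.
STAT_OUTPUT = """\
OverAllCov global 47.0395
OverAllCov chr1 45.3157
OverAllCov chr10 47.7380
CovAndCount 1 1567
CovAndCount 2 655
CovAndCount 3 380
"""


def chromosome_means(text, tag="OverAllCov"):
    """Collect the mean-coverage records carrying the given tag."""
    means = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[0] == tag:
            means[fields[1]] = float(fields[2])
    return means


means = chromosome_means(STAT_OUTPUT)
global_mean = means.pop("global")
for chrom, cov in sorted(means.items()):
    # Third column: chromosome coverage relative to the global mean.
    print(f"{chrom}\t{cov:.4f}\t{cov / global_mean:.3f}")
```

Passing tag="MethEffectCove" should let the same function read the mec stat output, since its help text shows the same three-field layout with that tag.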
6.2 mec
- Command
cgmaptools mec -h
# Usage: cgmaptools mec <command> [options]
# Version: 0.1.2
# Updated on: Dec. 14th, 2018
# Commands:
# bin * methylation effective coverage in bins
# stat * methylation effective coverage statistics globally
6.2.1 mec bin
- Command
cgmaptools mec bin -h
# Usage: cgmaptools mec bin [-i <CGmap>] [-B 5000000]
# (aka CGmapCovInBins)
# Description: Generate the methylation-effective coverage in Bins.
# Contact: Guo, Weilong; guoweilong@126.com;
# Last Update: 2018-01-02
# Output Ex:
# chr1 1 5000 29.0000
# chr1 5001 10000 30.0396
# chr2 1 5000 35.0755
# chr2 5001 10000 40.0027
# chr3 1 5000 na
#
# Options:
# -h, --help show this help message and exit
# -i FILE File name end with .CGmap or .CGmap.gz. If not
# specified, STDIN will be used.
# -B BIN_SIZE Define the size of bins [Default: 5000000]
# -f FILE, --figure-type=FILE
# png, pdf, eps. Will not generate figure if not
# specified
# -H FLOAT Height of figure in inch [Default: 4]
# -W FLOAT Width of figure in inch [Default: 8]
# -p STRING Prefix for output figures
# -t STRING, --title=STRING
# title in the output figures
# -C CONTEXT, --context=CONTEXT
# specific context: CG, CH, CHG, CHH, CA, CC, CT, CW
# use all sites if not specified
Example
cgmaptools mec bin -i WG.CGmap.gz -B 1000 -f png -p WG -t WG > WG.mec_bin.data
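When oac bin and mec bin are run with the same bin size, their outputs can be joined bin by bin to see how the MEC relates to the OAC along the genome. The sketch below uses invented MEC values and inlined text in place of files such as WG.oac_bin.data and WG.mec_bin.data; the four-column layout (chromosome, start, end, coverage) is an assumption drawn from the example output, and both helper names are hypothetical.

```python
# Toy inputs; the MEC values are invented for illustration.
OAC_BINS = """\
chr1 1 5000 29.0000
chr1 5001 10000 30.0396
chr3 1 5000 na
"""
MEC_BINS = """\
chr1 1 5000 16.0000
chr1 5001 10000 15.5000
chr3 1 5000 na
"""


def load_bins(text):
    """Map (chrom, start, end) to coverage, skipping 'na' bins."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 4 and fields[3] != "na":
            key = (fields[0], int(fields[1]), int(fields[2]))
            table[key] = float(fields[3])
    return table


def mec_to_oac_ratio(oac_text, mec_text):
    """Per-bin MEC/OAC for bins present, with a value, in both inputs."""
    oac, mec = load_bins(oac_text), load_bins(mec_text)
    return {k: mec[k] / oac[k] for k in oac if k in mec and oac[k] > 0}


ratios = mec_to_oac_ratio(OAC_BINS, MEC_BINS)
for (chrom, start, end), ratio in sorted(ratios.items()):
    print(f"{chrom}:{start}-{end}\t{ratio:.3f}")
```

Bins reported as na in either file are left out of the result rather than given a ratio.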
6.2.2 mec stat
- Command
cgmaptools mec stat -h
# Usage: cgmaptools mec stat [-i <CGmap>]
# (aka CGmapStatCov)
# Description: Get the distribution of methylation-effective coverages.
# Contact: Guo, Weilong; guoweilong@126.com
# Last Update: 2018-05-02
# Output Ex:
# MethEffectCove global 47.0395
# MethEffectCove chr1 45.3157
# MethEffectCove chr10 47.7380
# CovAndCount 1 1567
# CovAndCount 2 655
# CovAndCount 3 380
#
# Options:
# -h, --help show this help message and exit
# -i FILE File name end with .CGmap or .CGmap.gz. If not
# specified, STDIN will be used.
# -f FILE, --figure-type=FILE
# png, pdf, eps. Will not generate figure if not
# specified
# -H FLOAT Scale factor for the Height of figure [Default: 4]
# -W FLOAT Width of figure in inch [Default: 11]
# -p STRING Prefix for output figures
# -C CONTEXT, --context=CONTEXT
# specific context: CG, CH, CHG, CHH, CA, CC, CT, CW
# use all sites if not specified
Example
cgmaptools mec stat -i WG.CGmap.gz -p WG -f png > WG.mec_stat.data
Output figure
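The CovAndCount records can also be summarized numerically, for example as the fraction of sites covered at or above a chosen depth. This sketch assumes each record reads as tag, coverage depth, number of sites at that depth, which is an interpretation of the example rather than a documented format. It uses only the three example records from the help text, so the numbers are illustrative.

```python
# The CovAndCount example records from the help text.
STAT_OUTPUT = """\
MethEffectCove global 47.0395
CovAndCount 1 1567
CovAndCount 2 655
CovAndCount 3 380
"""


def coverage_histogram(text):
    """Map coverage depth to the number of sites reported at that depth."""
    hist = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[0] == "CovAndCount":
            hist[int(fields[1])] = int(fields[2])
    return hist


def fraction_at_least(hist, min_cov):
    """Fraction of counted sites whose depth is >= min_cov."""
    total = sum(hist.values())
    if total == 0:
        return 0.0
    return sum(n for cov, n in hist.items() if cov >= min_cov) / total


hist = coverage_histogram(STAT_OUTPUT)
for depth in sorted(hist):
    print(f">= {depth}x\t{fraction_at_least(hist, depth):.3f}")
```

A cumulative view like this is often easier to compare between samples than the raw per-depth counts.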