1. Peak Finding
1.1 Example Data
a. Login to MolBioCloud using X2GO.
b. Create a ‘macs2’ folder in your Desktop.
c. Download the MACS example dataset for FoxA1 and save it to the macs2 folder.
d. The example dataset consists of Input_tags.bed and Treatment_tags.bed for FoxA1 treatment.
1.2. Start the terminal
Start the terminal and change to the macs2 directory
1.3. Run macs2
Run macs2 using the following commands;
macs2 callpeak -t Treatment_tags.bed -c Input_tags.bed -f BED -g hs -n FoxA1 -B –p 0.005
The –t option specifies the treatment bed file, -c specifies the input (control) bed file, -g specifies the human as the genome, -n specifies the name for the output file, -B specifies that a bedgraph pileup is stored in the current directory and –p specifies the pvalue cutoff threshold.
Running TIme
1.4. Output files
FoxA1_peaks.narrowPeak
FoxA1_peaks.xls
FoxA1_summits.bed
FoxA1test_model.r
FoxA1_control_lambda.bdg
FoxA1_treat_pileup.bdg
Please see the MACS site for more info on passing different other parameters.
https://github.com/taoliu/MACS
1.5. Model PDF
Run ‘Rscript FoxA1_model.r’ in the shell to generate the PDF of the model
2. Extract Peak Sequences
Run the below commands to get some idea about your results
wc -l Input_tags.bed wc -l FoxA1_summits.bed grep -w 'chr1' FoxA1_summits.bed | wc -l
2.1. Extract top 1000 peaks
...from the FoxA1_summits.bed using the below command;
head –n 1000 FoxA1_summits.bed > 1000summits.bed
2.2. Reformat BED file
Convert the 5 column 1000summits.bed to 4 column 1000summits4.bed file by cutting and saving only the first 4 columns;
cut –f 1-4 1000summits.bed > 1000summits4.bed
2.3. Extract FASTA sequences
Extract FASTA format sequences of the 1000 peaks, using UCSC genome browser to table browser option, as a custom track, save it as sequence.fasta.
3. MEME-CHIP
Run meme-chip for two motifs, choosing DNA as molecule with ZOOPS option.
meme-chip -meme-mod zoops -meme-nmotifs 2 1000summitsequence.fasta
4. Peak Annotation
Run chipseek with 4 column data