Welcome to PolyASite

Repository for 3' end sequencing data


Check out our "About" section to easily figure out
how to get the data you want.

  • A-seq

    • In the A-seq protocol, reverse transcription is accomplished by an anchored oligo(dT) primer. The products of reverse transcription and PCR amplification are expected to have six As preceding the 3' adapter. Libraries are sequenced in the sense direction, so the 3' adapter sequence and the preceding As must be removed to pinpoint the 3' end.

    Publications & data sets

    • Martin, G., Gruber, A. R., Keller, W. & Zavolan, M. Genome-wide analysis of pre-mRNA 3’ end processing reveals a decisive role of human cleavage factor I in the regulation of 3' UTR length. Cell Rep 1, 753–763 (2012).
    • Gruber, A. R. et al. Global 3’ UTR Shortening Has a Limited Effect on Protein Abundance in Proliferating T Cells. Nat Commun 5, 5465 (2014).
    • Gruber, A. R., Martin, G., Keller, W. & Zavolan, M. Cleavage factor Im is a key regulator of 3’ UTR length. RNA Biol 9, 1405–1412 (2012).
  • DRS

    • In the direct RNA sequencing (DRS) protocol, 3' ends of transcripts are hybridized to poly(dT)-coated flow cell surfaces where antisense strand synthesis is initiated. This has the advantage that no prior reverse transcription or cDNA amplification is needed.

    Publications & data sets

    • Yao, C. et al. Transcriptome-wide analyses of CstF64-RNA interactions in global regulation of mRNA alternative polyadenylation. Proc Natl Acad Sci U S A 109, 18773–18778 (2012).
    • Lackford, B. et al. Fip1 regulates mRNA alternative polyadenylation to promote stem cell self-renewal. EMBO J (2014).
    • Ji, X. et al. αCP Poly(C) Binding Proteins Act as Global Regulators of Alternative Polyadenylation. Mol Cell Biol 33, 2560–2573 (2013).
    • Rehfeld, A. et al. Alternative polyadenylation of tumor suppressor genes in small intestinal neuroendocrine tumors. Front Endocrinol (Lausanne) 5, 46 (2014).
  • PAS-Seq

    • In the PAS-Seq protocol, reverse transcription is accomplished by an anchored oligo(dT) primer. The products of reverse transcription and PCR amplification are expected to have 20 As preceding the 3' adapter. Libraries are sequenced in the anti-sense direction with a custom primer, so reads must be reverse-complemented to pinpoint the 3' end.

    Publications & data sets

    • Shepard, P. J. et al. Complex and dynamic landscape of RNA polyadenylation revealed by PAS-Seq. RNA 17, 761–772 (2011).
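
The read clean-up steps mentioned for A-seq (removing the 3' adapter and the preceding As) and PAS-Seq (reverse-complementing reads) can be sketched with standard Unix tools. This is only a minimal illustration on bare sequences, not the processing used for the published data sets: the function names are hypothetical, and the adapter below is a placeholder, not the actual A-seq adapter. Real libraries are better handled by a dedicated adapter trimmer that works on FASTQ records.

```shell
#!/usr/bin/env bash
# Placeholder adapter sequence (assumption); replace with the adapter of your library.
ADAPTER="TGGAATTCTCGGGTGCCAAGG"

# A-seq style (sense reads): remove the adapter and everything after it,
# then strip the run of As at the 3' end. Note that this also removes
# genomically encoded As directly upstream of the tail.
trim_adapter_and_polya() {
  sed -E "s/${ADAPTER}.*//; s/A+\$//"
}

# PAS-Seq style (anti-sense reads): reverse-complement each sequence line.
revcomp() {
  rev | tr 'ACGTNacgtn' 'TGCANtgcan'
}

# One sequence per line on stdin, cleaned sequence on stdout.
echo "GATTACAGGCAAAAAA${ADAPTER}" | trim_adapter_and_polya   # prints GATTACAGGC
echo "TTTTTTGCCTGTAATC" | revcomp                            # prints GATTACAGGCAAAAAA
```

After reverse complementation, the PAS-Seq read ends in the run of As, so the same poly(A) stripping can then be applied to locate the 3' end.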






PolyASite takes action


Once the PolyASite resource was established, we used it in our own research and developed two tools that exploit its comprehensive annotation of 3' ends.

PAQR

PAQR is a method for quantifying polyadenylation site usage from RNA sequencing data. It detects drops in the read coverage profile of RNA-seq libraries at the positions of annotated poly(A) sites. In this way, the relative poly(A) site usage can be calculated for genes whose 3' UTR length varies depending on the chosen cleavage site.
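
To illustrate what "relative poly(A) site usage" means, here is a minimal sketch that converts per-site values into per-gene fractions. It does not reproduce PAQR's coverage-based inference; it only shows the final normalization, under the assumption that each site has already been assigned an expression value. The three-column input layout (gene, site, value) and the function name are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical input, tab-separated: gene <TAB> site <TAB> value attributed to that site.
# Output: gene <TAB> site <TAB> fraction of the gene total contributed by that site.
relative_usage() {
  awk -F'\t' -v OFS='\t' '
    # First pass over the stream: remember every row and sum the values per gene.
    { gene[NR] = $1; site[NR] = $2; val[NR] = $3 + 0; total[$1] += $3 }
    END {
      for (i = 1; i <= NR; i++) {
        t = total[gene[i]]
        print gene[i], site[i], (t > 0 ? sprintf("%.2f", val[i] / t) : "NA")
      }
    }'
}

printf 'geneA\tproximal\t30\ngeneA\tdistal\t70\ngeneB\tsingle\t12\n' | relative_usage
```

In this toy input, the proximal and distal sites of geneA receive 0.30 and 0.70, and the only site of geneB receives 1.00.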

KAPAC

KAPAC stands for k-mer activity on polyadenylation site choice. It is a method that infers position-dependent activities of sequence motifs on 3' end processing from changes in poly(A) site usage between conditions.

Availability

KAPAC can be applied directly to 3' end quantifications obtained from the data sets listed in the experiments tab. To also allow KAPAC to be applied to RNA-seq data, PAQR was developed such that its output feeds directly into KAPAC. Both tools are available at https://github.com/zavolanlab/PAQR_KAPAC.


A full use case applying both tools through a Snakemake pipeline is available on Zenodo. The archive contains all input data, scripts and software packages needed to run it out of the box on a Linux system. It can be downloaded from https://doi.org/10.5281/zenodo.1147433

3' end quantification made easy

Using our poly(A) site cluster annotation, transcript 3' ends can be quantified in three easy steps.

1. If you have sequencing data generated with a dedicated 3'-end sequencing protocol, use any mapping software you like, for example segemehl or STAR, to create an alignment file in SAM format (or its binary equivalent, BAM).

2. Next, from your SAM or BAM file generate a BED file with samtools and bedtools. Have a look at this sample code:


  # If you have a SAM file (convert to BAM on the fly, read it from stdin):
  samtools view -Sb myfile.sam | bamToBed -i stdin > myfile.bed

  # If you already have a BAM file:
  bamToBed -i myfile.bam > myfile.bed



3. Use our cluster annotation and a Perl script called PAS-quant.pl to analyze your data within seconds. The output is a tab-separated (TSV) file that can easily be read with R or any spreadsheet software.


  # Download the Perl script:
  curl http://polyasite.unibas.ch/scripts/PAS-quant.pl > PAS-quant.pl

  # Download the cluster annotation for your species of interest:
  curl http://polyasite.unibas.ch/clusters/Mus_musculus/r1.0/clusters.bed > mouse.bed
  curl http://polyasite.unibas.ch/clusters/Homo_sapiens/r1.0/clusters.bed > human.bed

  # Run the Perl script with your data:
  perl PAS-quant.pl --clusters=human.bed --sample=myfile.bed > output.tsv



If you don't have any time to lose, here is a one-liner that goes from your SAM file to 3' end quantification data:


  samtools view -Sb myfile.sam | bamToBed -i stdin | perl PAS-quant.pl --clusters=human.bed --pipe > output.tsv


Paper

If you use PolyASite in your research, please cite the following publication:

Gruber, A. J. et al. A comprehensive analysis of 3’ end sequencing data sets reveals novel polyadenylation signals and the repressive role of heterogeneous ribonucleoprotein C on cleavage and polyadenylation. Genome Res. 26, 1145–1159 (2016).

About PolyASite

PolyASite was built by Manuel Belmadani, Andreas R. Gruber, Andreas J. Gruber, and Ralf Schmidt. If you have any suggestions on how PolyASite could be improved, please let us know: ralf.schmidt@unibas.ch