renameSeqlevels

renameSeqlevels

Teresa Tavella
Dear all,

I would like to ask whether it is possible to change the seqnames of a BAM file
by passing a character vector to the function renameSeqlevels. I need this
because the seqnames have to match in order to use the function
summarizeOverlaps or countOverlaps/findOverlaps.
From the BAM file below I have extracted the locus annotations from the
seqnames (i.e. ERCC00002, NC_001133.9, etc.) and I have created a list (same
length as the seqlevels of the BAM file).


*bamfile*
GAlignments object with 6 alignments and 0 metadata columns:
                                                                     seqnames strand       cigar    qwidth     start       end     width     njunc
                                                                        <Rle>  <Rle> <character> <integer> <integer> <integer> <integer> <integer>
  [1] DQ459430_gene=ERCC00002_loc:ERCC00002|1-1061|+_exons:1-1061_segs:1-1061      +     8M2D27M        35      1025      1061        37         0
  [2] DQ459430_gene=ERCC00002_loc:ERCC00002|1-1061|+_exons:1-1061_segs:1-1061      +     8M2D27M        35      1025      1061        37         0
  [3] DQ459430_gene=ERCC00002_loc:ERCC00002|1-1061|+_exons:1-1061_segs:1-1061      -         36M        36      1025      1060        36         0
  [4] DQ459430_gene=ERCC00002_loc:ERCC00002|1-1061|+_exons:1-1061_segs:1-1061      -         36M        36      1026      1061        36         0
  [5] DQ459430_gene=ERCC00002_loc:ERCC00002|1-1061|+_exons:1-1061_segs:1-1061      +         35M        35      1027      1061        35         0
  [6] DQ459430_gene=ERCC00002_loc:ERCC00002|1-1061|+_exons:1-1061_segs:1-1061      +         35M        35      1027      1061        35         0
  -------
*gffile*
GRanges object with 6 ranges and 12 metadata columns:
         seqnames           ranges strand |       source     type     score
            <Rle>        <IRanges>  <Rle> |     <factor> <factor> <numeric>
  [1] NC_001133.9 [ 24837,  25070]      + | s_cerevisiae     exon      <NA>
  [2] NC_001133.9 [ 25048,  25394]      + | s_cerevisiae     exon      <NA>
  [3] NC_001133.9 [ 27155,  27786]      + | s_cerevisiae     exon      <NA>
  [4] NC_001133.9 [ 73431,  73792]      + | s_cerevisiae     exon      <NA>
  [5] NC_001133.9 [165314, 165561]      + | s_cerevisiae     exon      <NA>
  [6] NC_001133.9 [165388, 165781]      + | s_cerevisiae     exon      <NA>
          phase     gene_id  transcript_id exon_number   gene_name
      <integer> <character>    <character> <character> <character>
  [1]      <NA> XLOC_000040 TCONS_00000191           1        FLO9
  [2]      <NA> XLOC_000040 TCONS_00000192           1        FLO9
  [3]      <NA> XLOC_000041 TCONS_00000193           1        FLO9
  [4]      <NA> XLOC_000055 TCONS_00000200           1   YAL037C-A
  [5]      <NA> XLOC_000075 TCONS_00000100           1     YAR010C
  [6]      <NA> XLOC_000075 TCONS_00000219           1     YAR010C
                                         oId nearest_ref  class_code
                                 <character> <character> <character>
  [1]   {TRINITY_GG_normal}16_c1_g1_i1.mrna1        rna8           x
  [2]   {TRINITY_GG_normal}16_c0_g1_i1.mrna1        rna8           x
  [3]   {TRINITY_GG_normal}12_c0_g1_i1.mrna1        rna8           x
  [4]    {TRINITY_GG_normal}3_c3_g1_i1.mrna1       rna31           x
  [5] {TRINITY_GG_normal}3479_c0_g1_i1.mrna1       rna77           x
  [6]   {TRINITY_GG_normal}24_c0_g1_i1.mrna1       rna77           x
           tss_id
      <character>
  [1]       TSS42
  [2]       TSS43
  [3]       TSS44
  [4]       TSS71
  [5]      TSS118
  [6]      TSS118
  -------

Is it possible to replace the seqlevels with the list?
I have tried:

bamfile1 <- renameSeqlevels(seqlevels(bamfile), listx)

Thank you for any advice,

Kind regards,

Teresa

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: renameSeqlevels

Martin Morgan
Rsamtools and GenomicAlignments are Bioconductor packages, so please ask on
the Bioconductor support site

   https://support.bioconductor.org

You cannot rename the seqlevels in the BAM file itself; you can, however,
rename the seqlevels of the object(s) you have created from the BAM file.
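
A minimal sketch of what renaming on the object could look like, assuming the
GenomicAlignments and GenomeInfoDb packages are installed. The toy alignments,
the regular expression, and the names long_name, aln, old, new, map and aln1
are illustrative assumptions, not taken from the original data. The point is
that renameSeqlevels takes the object itself as its first argument (not
seqlevels(object)) and a character vector of new names, here named by the old
seqlevels:

```r
library(GenomicAlignments)
library(GenomeInfoDb)

# Toy stand-in for the alignments read from the BAM file (hypothetical data).
long_name <- "DQ459430_gene=ERCC00002_loc:ERCC00002|1-1061|+_exons:1-1061_segs:1-1061"
aln <- GAlignments(
    seqnames = Rle(factor(rep(long_name, 2))),
    pos      = c(1025L, 1027L),
    cigar    = c("36M", "35M"),
    strand   = strand(c("-", "+"))
)

# Assumption: the wanted name is the field between "_loc:" and the first "|".
old <- seqlevels(aln)
new <- sub(".*_loc:([^|]+)\\|.*", "\\1", old)

# Named character vector: names are the old seqlevels, values the new ones.
map <- setNames(new, old)

# Rename on the object; the BAM file on disk is left untouched.
aln1 <- renameSeqlevels(aln, map)
seqlevels(aln1)
```

Once the seqlevels of the alignments and of the annotation agree, the overlap
functions can match them up. The original call failed because its first
argument was the character vector seqlevels(bamfile) rather than the object.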

Martin

