Package ‘genomes’ documentation

Introduction

genomes is a Python package that provides easy access to metadata about genome assemblies. For any assembly, you can retrieve the list of chromosomes and the length of each chromosome in base pairs.

Installation

To install the package, type:

$ sudo easy_install genomes

That’s it. If this fails because you lack sufficient permissions, you can install it elsewhere, for instance in your home directory:

$ cd ~
$ pip install -e git+https://github.com/xapple/genomes#egg=genomes

Examples

Here are all the things you can do with it:

from genomes import Assembly
a = Assembly('sacCer2')
print a.chrmeta
print a.guess_chromosome_name('chr1')

Code

class genomes.Assembly(assembly)

The only object provided by the library.

Parameters: assembly (string) – A valid assembly name.

chrmeta

A dictionary of chromosome metadata:

>>> from genomes import Assembly
>>> a = Assembly('TAIR10')
>>> print a.chrmeta
{u'c': {'length': 154478, 'refseq': u'NC_000932.1'}, u'm': {'length': 366924, 'refseq': u'NC_001284.2'}, u'1': {'length': 30427671, 'refseq': u'NC_003070.9'}, u'3': {'length': 23459830, 'refseq': u'NC_003074.8'}, u'2': {'length': 19698289, 'refseq': u'NC_003071.7'}, u'5': {'length': 26975502, 'refseq': u'NC_003076.8'}, u'4': {'length': 18585056, 'refseq': u'NC_003075.7'}}
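
As a minimal sketch of working with this dictionary, the snippet below totals the chromosome lengths and lists the chromosomes from longest to shortest. The sample data is copied from the TAIR10 output above so that the code runs without the package installed. The helper names total_length and by_length are illustrative assumptions and are not part of genomes.

```python
# Sample copied from the TAIR10 output in this documentation. With the
# package installed, Assembly('TAIR10').chrmeta would be used instead.
chrmeta = {
    'c': {'length': 154478, 'refseq': 'NC_000932.1'},
    'm': {'length': 366924, 'refseq': 'NC_001284.2'},
    '1': {'length': 30427671, 'refseq': 'NC_003070.9'},
    '2': {'length': 19698289, 'refseq': 'NC_003071.7'},
    '3': {'length': 23459830, 'refseq': 'NC_003074.8'},
    '4': {'length': 18585056, 'refseq': 'NC_003075.7'},
    '5': {'length': 26975502, 'refseq': 'NC_003076.8'},
}


def total_length(chrmeta):
    """Sum the 'length' field (base pairs) over all chromosomes."""
    return sum(meta['length'] for meta in chrmeta.values())


def by_length(chrmeta):
    """Return chromosome names ordered from longest to shortest."""
    return sorted(chrmeta, key=lambda name: chrmeta[name]['length'],
                  reverse=True)


print(total_length(chrmeta))
print(by_length(chrmeta))
```

Both helpers only read the 'length' key, so they should work on the chrmeta of any assembly that follows the layout shown above.
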
guess_chromosome_name(chromosome_name)

Searches the assembly for chromosome synonym names, and returns the canonical name of the chromosome.

Parameters: chromosome_name (string) – Any given name for a chromosome in this assembly.
Returns: The same name or another name for the chromosome, or None if the chromosome is unknown.

>>> from genomes import Assembly
>>> a = Assembly('sacCer2')
>>> print a.guess_chromosome_name('chrR')
2micron
>>> a = Assembly('hg19')
>>> print a.guess_chromosome_name('NC_000023.9')
chrX
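
Because the method returns None for unknown chromosomes, callers will usually want to check the result before using it. The sketch below separates resolved names from unknown ones. StubAssembly is a hypothetical stand-in that only mimics the documented return contract so the example runs without the package; resolve_names is likewise an illustrative helper, not part of genomes.

```python
class StubAssembly(object):
    """Hypothetical stand-in for genomes.Assembly (assumption, not the real class)."""

    def __init__(self, synonyms):
        # Maps any known chromosome name to its canonical name.
        self.synonyms = synonyms

    def guess_chromosome_name(self, chromosome_name):
        # Mirrors the documented contract: canonical name, or None if unknown.
        return self.synonyms.get(chromosome_name)


def resolve_names(assembly, names):
    """Split names into a {given: canonical} dict and a list of unknown names."""
    resolved, unknown = {}, []
    for name in names:
        canonical = assembly.guess_chromosome_name(name)
        if canonical is None:
            unknown.append(name)
        else:
            resolved[name] = canonical
    return resolved, unknown


# The 'chrR' -> '2micron' pair is taken from the sacCer2 example above;
# 'chrZZ' is a made-up name used to exercise the None branch.
assembly = StubAssembly({'chrR': '2micron', '2micron': '2micron'})
resolved, unknown = resolve_names(assembly, ['chrR', '2micron', 'chrZZ'])
print(resolved)
print(unknown)
```

With the package installed, passing a real Assembly instance instead of the stub should work the same way, since resolve_names relies only on guess_chromosome_name.
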
