Nicolas Cellier d0ac9ec43b Initial commit
2023-08-22 23:33:47 +02:00

A simple tool to download NCBI reference genomes from their assembly accessions.

Installation

pip install git+...

As a Python library

>>> from pyassembly_dl import AssemblyDownloader  # import path assumed from the package name
>>> refseq_adl = AssemblyDownloader("/tmp/assembly_genomes", db="refseq")
>>> refseq_adl.download("GCF_001735525.1")
'/tmp/assembly_genomes/GCF_001735525.1.gb'
>>> uids = [
...     "GCF_001735525.1",
...     "GCF_025402875.1",
...     "GCF_007197645.1",
...     "GCF_900111765.1",
...     "GCF_900109545.1",
...     "GCF_001027285.1",
...     "GCF_001189295.1",
...     "GCF_002343915.1",
...     "GCF_022870945.1",
...     "GCF_002222655.1",
... ]
>>> refseq_adl.download_many(uids)
['/tmp/assembly_genomes/GCF_001735525.1.gb',
 '/tmp/assembly_genomes/GCF_025402875.1.gb',
 '/tmp/assembly_genomes/GCF_007197645.1.gb',
 '/tmp/assembly_genomes/GCF_900111765.1.gb',
 '/tmp/assembly_genomes/GCF_900109545.1.gb',
 '/tmp/assembly_genomes/GCF_001027285.1.gb',
 '/tmp/assembly_genomes/GCF_001189295.1.gb',
 '/tmp/assembly_genomes/GCF_002343915.1.gb',
 '/tmp/assembly_genomes/GCF_022870945.1.gb',
 '/tmp/assembly_genomes/GCF_002222655.1.gb']
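
Before handing a long list to the downloader, it can help to check that each identifier at least looks like an assembly accession. The following is a minimal sketch and not part of pyassembly_dl: `split_accessions` is a hypothetical helper, and the pattern assumes the usual shape of a `GCF_` or `GCA_` prefix, nine digits, and a version suffix.

```python
import re

# Assumed accession shape: GCF_ (RefSeq) or GCA_ (GenBank), nine digits, ".version".
ACCESSION_RE = re.compile(r"GC[AF]_\d{9}\.\d+")


def split_accessions(uids):
    """Return (valid, invalid) lists of accessions, preserving input order."""
    valid, invalid = [], []
    for uid in uids:
        cleaned = uid.strip()  # tolerate stray whitespace from copy-paste
        if ACCESSION_RE.fullmatch(cleaned):
            valid.append(cleaned)
        else:
            invalid.append(cleaned)
    return valid, invalid


# The second entry has only eight digits, so it is reported as invalid.
valid, invalid = split_accessions(
    ["GCF_001735525.1", "GCF_02540287.1", " GCA_000005845.2 "]
)
print(valid)
print(invalid)
```

The `valid` list could then be passed to `download_many`, while `invalid` can be reported to the user before any network request is made.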

As a command-line tool

pyassembly_dl --folder /tmp/assembly_genomes/ GCF_001735525.1 GCF_025402875.1 GCF_001189295.1

This should output:

/tmp/assembly_genomes/GCF_001735525.1.gb
/tmp/assembly_genomes/GCF_025402875.1.gb
/tmp/assembly_genomes/GCF_001189295.1.gb
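
When the command is called from a script, the printed paths can be mapped back to their accessions. This is a rough sketch that assumes the output format shown above (one path per line, named `<accession>.gb`); `index_by_accession` is a hypothetical helper, not something the tool provides.

```python
from pathlib import Path


def index_by_accession(cli_output):
    """Map each accession to the path printed for it (one path per line)."""
    paths = (Path(line.strip()) for line in cli_output.splitlines() if line.strip())
    # Path.stem drops only the final ".gb" suffix, leaving e.g. "GCF_001735525.1".
    return {path.stem: path for path in paths}


# Sample text copied from the expected output above.
sample = """\
/tmp/assembly_genomes/GCF_001735525.1.gb
/tmp/assembly_genomes/GCF_025402875.1.gb
/tmp/assembly_genomes/GCF_001189295.1.gb
"""
index = index_by_accession(sample)
print(index["GCF_025402875.1"])
```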