Browsed by
Author: admin

UPDATED: Sequence Assembly

I’ve asked Juan to upload data to the class directory on the Centaurus cluster (hpc-student.uncc.edu). So instead of transferring from Dropbox, you can go directly to the cluster and copy the data you need into your working directory. You will find: sequence reads from one of the better chloroplast genome samples generated with our Ion Torrent instrument; ERR008613 (a set of paired-end Illumina sequence reads from ends of 200bp E. coli fragments); ERR022075 (a set of paired-end Illumina…

UPDATED: Read Mapping

Mapping short sequence reads to a reference sequence is a common task in genomics. Many different results can be extracted from a mapped sequence, depending on the original experimental design that produced the sequence reads and on the analysis that follows the mapping. For example: a genomic consensus for an individual (against the reference genome for that species); location of SNPs and other variations in one genome relative to the other; location of expressed transcripts (coding mRNAs, noncoding RNAs such…

Updated: NGS QC

In this exercise we will focus primarily on quality analysis and quality control of Illumina sequencing data, since that is the type of NGS data you are currently most likely to encounter in new datasets. You can view older versions of this exercise for tips on how to handle Ion Torrent or 454 data if you encounter that in your work. What sequence data looks like: the FASTQ file. NGS read files tend to be distributed in *.fastq format. To…
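As a rough illustration of what a FASTQ record holds, here is a minimal Python sketch. It assumes the common four-line record layout (header, sequence, "+" separator, quality string) and the Phred+33 quality encoding used by current Illumina output. The read name and the function names are made up for illustration; this is not the tool used in the exercise.

```python
def parse_fastq(lines):
    """Yield (read_id, sequence, quality_string) tuples.

    Assumes strict four-line records with no wrapped sequence lines.
    """
    rows = [line.rstrip("\n") for line in lines if line.strip()]
    for i in range(0, len(rows) - 3, 4):
        header, seq, separator, qual = rows[i:i + 4]
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header}")
        yield header[1:], seq, qual


def phred_scores(qual, offset=33):
    """Convert a quality string to integer Phred scores (Phred+33 assumed)."""
    return [ord(char) - offset for char in qual]


def mean_quality(qual, offset=33):
    """Average Phred score across one read."""
    scores = phred_scores(qual, offset)
    return sum(scores) / len(scores)


# Hypothetical single-read example, not taken from the class data.
example = [
    "@read1 hypothetical",
    "ACGT",
    "+",
    "II5!",
]

records = list(parse_fastq(example))
for read_id, seq, qual in records:
    print(read_id, seq, phred_scores(qual), mean_quality(qual))
```

Older Illumina pipelines used a Phred+64 offset, so the `offset` argument is left adjustable rather than hard-coded.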

Calling variants from mixed viral samples

This is just an example of some bioinformatics work I’m currently doing, not a homework assignment. I learned to do something new yesterday and so I thought I’d put up a tutorial about it since this illustrates a few software installation, simple pipelining and scripting tasks that we commonly do in the class. Where’s the data from? We collect samples of wastewater and sequence the viral fragments out of them. Often the viral genome sequence we get is not complete,…

New computer setup: Apple

Here’s what I do, in order, to set my computer up to be able to install software that we use in the class. Disclaimer: I just got a new work laptop, so it doesn’t have much of an environment on it yet. I can’t know what you’ve done to your computer in the past, and so some of these things may not work for you if your environment’s already like this: 1. Administrator privileges. 2. Xcode: Install the Xcode libraries that…

Read mapping and simple variants

Mapping short sequence reads to a reference sequence is a common task in genomics. Many different results can be extracted from a mapped sequence, depending on the original experimental design that produced the sequence reads and on the analysis that follows the mapping. For example: a genomic consensus for an individual (against the reference genome for that species); location of SNPs and other variations in one genome relative to the other; location of expressed transcripts (coding mRNAs, noncoding RNAs such…

Genome annotation with prokka

Why use Prokka? First, because a benchmark test has shown it to be at least as accurate as RAST or xBASE2 at reproducing known annotation in most annotation categories. Second, because it’s fast and you can run it on a standard laptop within a short time, while sending your genome out for annotation to the RAST server can take a day or so to return. If you’re interested in learning to use the RAST server, search the site…

Got a Fresh Apple?

How do I get my computer ready for Bioinformatics 2111? I just got a new Apple laptop, so I get to do a fresh start. By the time I get through 4 years with a machine, I’ve generally installed and tweaked so much bioinformatics software and supporting libraries (and probably installed some of it via obsolete tools) that I literally cannot tell students exactly how to replicate the working environment I have. As always, xkcd knows how it is:…

Microbial community analysis with QIIME2

This tutorial makes use of the data from the NC Urban Microbiome Project, a collaboration seeded by the Department of Bioinformatics and Genomics and involving participants from our department as well as Civil Engineering, Biology, and Geography and Earth Science. Our goal in lab this week is to analyze 16S ribosomal RNA sequences from mixed microbial samples using QIIME2. The QIIME2 analysis will tell us what identifiable microbes are present in the samples (usually at the genus level rather than…

Whole genome shotgun metagenomics with MetaPhlAn

Like last week’s tutorial, this tutorial uses Urban Environmental Genomics Project data. The original version of the tutorial was developed by Anju Lulla for our student interns. Preparation and software installation: You can use MetaPhlAn on the cluster, and that’s probably the best idea. If you have a reasonably powerful laptop, you can run it there on these small data sets. You can download MetaPhlAn by cloning it with Mercurial (hg): hg clone https://bitbucket.org/biobakery/metaphlan2. Note: this will put metaphlan2’s directory as a…
