Lab 1: Sequence QC And Assembly

The sequence data files for the exercise are preloaded on the hpc-student cluster in directories /projects/class/yoursection. Students registered for 8203 have a different directory. Each directory contains several files of sequencing reads that you can use together or separately to create an assembly. In the first part of the exercise, you will trim and assemble the 200 bp fragment Illumina reads alone to see what kind of assembly you get for the E. coli genome, and to learn the basics…

UPDATED: Sequence Assembly

I’ve asked Juan to upload data to the class directory on the Centaurus cluster (hpc-student.uncc.edu). So instead of transferring from Dropbox, you can go directly to the cluster and copy the data you need into your working directory. You will find: sequence reads from one of the better chloroplast genome samples generated with our Ion Torrent instrument; ERR008613 (a set of paired-end Illumina sequence reads from ends of 200 bp E. coli fragments); ERR022075 (a set of paired-end Illumina…

Lab 2: Genome Annotation and Comparison

Why use Prokka? Because in a benchmark test it has been shown to be as accurate as, or more accurate than, RAST or xBASE2 at reproducing known annotation in most annotation categories. It can run on a normal laptop for a small genome and is easy and convenient to use. Setup: Prokka is not installed on the hpc-student cluster yet, so we’re going to learn how to install a program for your own use on the cluster. Jon Halter in University Research…

UPDATED: Read Mapping

Mapping short sequence reads to a reference sequence is a common task in genomics. Many different results can be extracted from a mapped sequence, depending on the original experimental design that produced the sequence reads and on the analysis that follows the mapping. For example: a genomic consensus for an individual (against the reference genome for that species); location of SNPs and other variations in one genome relative to the other; location of expressed transcripts (coding mRNAs, noncoding RNAs such…

Updated: NGS QC

In this exercise we will focus primarily on quality analysis and quality control of Illumina sequencing data, since that is the type of NGS data you are currently most likely to encounter in new datasets. You can view older versions of this exercise for tips on how to handle Ion Torrent or 454 data if you encounter that in your work. What sequence data looks like: the FASTQ file. NGS read files tend to be distributed in *.fastq format. To…

Calling variants from mixed viral samples

This is just an example of some bioinformatics work I’m currently doing, not a homework assignment. I learned to do something new yesterday, so I thought I’d put up a tutorial about it, since it illustrates a few software installation, simple pipelining, and scripting tasks that we commonly do in the class. Where’s the data from? We collect samples of wastewater and sequence the viral fragments out of them. Often the viral genome sequence we get is not complete,…

New computer setup: Apple

Here’s what I do, in order, to set my computer up to be able to install software that we use in the class. Disclaimer: I just got a new work laptop, so it doesn’t have much of an environment on it yet. I can’t know what you’ve done to your computer in the past, and so some of these things may not work for you if your environment’s already like this: 1. Administrator privileges 2. Xcode: Install the Xcode libraries that…

Read mapping and simple variants

Mapping short sequence reads to a reference sequence is a common task in genomics. Many different results can be extracted from a mapped sequence, depending on the original experimental design that produced the sequence reads and on the analysis that follows the mapping. For example: a genomic consensus for an individual (against the reference genome for that species); location of SNPs and other variations in one genome relative to the other; location of expressed transcripts (coding mRNAs, noncoding RNAs such…

Got a Fresh Apple?

How do I get my computer ready for Bioinformatics 2111? I just got a new Apple laptop, so I get to do a fresh start. By the time I get through four years with a machine, I’ve generally installed and tweaked so much bioinformatics software and so many supporting libraries (and probably installed some of it via obsolete tools) that I literally cannot tell students how to exactly replicate the working environment I have. As always, xkcd knows how it is:…

Microbial community analysis with QIIME2

This tutorial makes use of the data from the NC Urban Microbiome Project, a collaboration seeded by the Department of Bioinformatics and Genomics and involving participants from our department as well as Civil Engineering, Biology, and Geography and Earth Science. Our goal in lab this week is to analyze 16S ribosomal RNA sequences from mixed microbial samples using QIIME2. The QIIME2 analysis will tell us what identifiable microbes are present in the samples (usually at the genus level rather than…
