Lab 1: Sequence QC And Assembly

The sequence data files for the exercise are preloaded on the hpc-student cluster in directories /projects/class/yoursection. Students registered for 8203 have a different directory. Each directory contains several files of sequencing reads that you can attempt to use together or separately to create an assembly. In the first part of the exercise, you will trim and assemble the 200bp fragment Illumina reads alone to see what kind of assembly you get for the E. coli genome, and to learn the basics…

Troubleshooting the read mapping lab

Hey everyone! A few people are reporting some trouble in making this lab work so I updated all my software and went in. I am just going to walk through what worked with one sample.

I started with the BC26 chloroplast file from SRA. This is fine:

    fastq-dump SRR1763773

Moved it to a shorter filename:

    mv SRR1763773.fastq BC26.fastq

Built the bowtie2 index from the reference genome file. This is fine:

    bowtie2-build NC_007898.fasta NC_007898

This command should produce this output, including…
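
If you want to repeat these steps without retyping them, they can be collected into one small script. This is only a sketch: the DRY_RUN switch and the run() helper are additions for illustration and are not part of the lab. By default the script only prints the commands, so you can check them before anything is downloaded or built.

```shell
#!/usr/bin/env bash
# Sketch of the walkthrough steps collected into one script.
# DRY_RUN and run() are illustrative additions, not part of the lab.
set -euo pipefail

ACCESSION="SRR1763773"     # SRA run used in the walkthrough
SAMPLE="BC26"              # shorter sample name
REFERENCE="NC_007898"      # reference genome, expected as NC_007898.fasta
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the commands, 0 = run them

run() {
    # Print the command; execute it only when DRY_RUN is 0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

run fastq-dump "$ACCESSION"                           # fetch the reads from SRA
run mv "${ACCESSION}.fastq" "${SAMPLE}.fastq"         # shorter filename
run bowtie2-build "${REFERENCE}.fasta" "$REFERENCE"   # build the bowtie2 index
```

Setting DRY_RUN=0 in the environment runs the commands for real, and set -e stops the script at the first step that fails, which makes it easier to see where a problem starts.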

BINF 6203: GAGE tutorial

This lab is offered as an opportunity for extra credit or for your project requirement (if applicable). It was presented by Dr. Luo during a guest lecture visit last year, so there is no “tech support” available for it this year. The challenge is to figure out the tools from the tutorials and (if possible) apply them to your count data from the Vibrio data set. Due by the last day of class if you choose to do it….

BINF 6203: Expression Statistics

Last week you used transcriptome and read mapping tools to get read counts for your Vibrio vulnificus expression data, with two different pipelines. Now let’s analyze the data and get the top differentially expressed genes.

BINF 6203: Gene expression read quantitation

In this lab and the next, we are going to use two different methods to calculate differential expression for the same RNASeq dataset. In a nutshell, we have measured gene expression under two conditions (two replicates each) and we want to find out which genes are the most significantly differentially expressed between the two conditions. We are concerned with two things — magnitude of differential expression, and significance of differential expression. We want to see only genes with both high…
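
The idea of requiring both criteria at once can be sketched with a short awk filter. Everything specific here is an assumption for illustration: the three-column layout (gene, log2 fold change, adjusted p-value), the two cutoffs, and the toy table, which is made-up data and not output from the lab pipelines.

```shell
#!/usr/bin/env bash
# Sketch: keep genes that pass BOTH a magnitude and a significance cutoff.
# The table is made-up illustration data, not results from the lab.
set -euo pipefail

MIN_ABS_LOG2FC=1      # assumed cutoff: at least a two-fold change
MAX_PADJ=0.05         # assumed cutoff on the adjusted p-value

toy_table() {
    # Hypothetical columns: gene, log2 fold change, adjusted p-value
    printf 'gene\tlog2fc\tpadj\n'
    printf 'geneA\t3.2\t0.001\n'    # large change, significant
    printf 'geneB\t0.2\t0.001\n'    # significant, but tiny change
    printf 'geneC\t-4.0\t0.40\n'    # large change, not significant
    printf 'geneD\t-2.5\t0.01\n'    # large decrease, significant
}

filter_de_genes() {
    awk -F'\t' -v fc="$MIN_ABS_LOG2FC" -v p="$MAX_PADJ" '
        NR == 1 { print; next }                 # keep the header line
        {
            mag = ($2 < 0) ? -$2 : $2           # absolute log2 fold change
            if (mag >= fc && $3 <= p) print     # both conditions must hold
        }'
}

toy_table | filter_de_genes
```

Only geneA and geneD pass: geneB is significant but barely changes, and geneC changes a lot but is not significant.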

BINF 6203: Annotation with RAST

In this lab exercise, you’ll use the myRAST software (or the RAST website) to annotate an assembled genome. Try this first with the E. coli genome assembly that you generated a couple of weeks ago (your contigs.fasta file) and then try to annotate a sequenced but poorly characterized genome from the NCBI microbial genomes collection. E. coli will have tons of genomic neighbors with well-documented annotation, while annotating a genome from among the uncategorized (or poorly categorized) environmental strains in…

BINF 6203: NGS Genome Assembly

Part 1: assemble a chloroplast genome

Today’s first challenge is to assemble the chloroplast genome data that was generated by the BINF 6350 class. You do not have to assemble all of the chloroplasts — pick one file to work with for this exercise. If I were choosing, I’d probably look for one of the larger files (as it will contain more reads) and also verify the quality with FastQC to make sure that the data quality isn’t particularly bad….

BINF 6203: NGS Data QC

Today, you’ll use trimming software to clean up several NGS data sets. We’ll take a look at the following:

- Chloroplast genome sequences generated by Spring 2014 BINF 6350 students. (Ion Torrent, single end)
- Some very old and crappy 454 sequence data for Vibrio vulnificus E64MW (single end)
- Some high-quality RNASeq data for Vibrio vulnificus C-strains (paired end Illumina)

If you do not have space to download the full SRA files you can get pre-sampled data here. There are four software…

Link for script input examples: https://www.dropbox.com/sh/uhp9j1qcns2t58w/AAD8sew-8TjcCF2YWzq8NO8Ca?dl=0

BINF 2111: Getting variable values from an input file (Lab)

So far in your UNIX journey, you’ve learned how to:

- Run built-in and custom UNIX commands
- Put a series of commands together into a simple script that follows sequential logic, and construct a simple conditional that uses a test operator
- Introduce loops, user input, and more complex conditionals to automate your workflow

It may not seem like it yet, but you have already learned a lot about common programming tasks. The last thing we’re going to do with bash, before…
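
As a preview of what the lab title describes, here is a minimal sketch of one way to get variable values from an input file in bash. The file layout (one "name value" pair per line), the setting names and the demo values are all hypothetical; the lab itself may use a different format.

```shell
#!/usr/bin/env bash
# Sketch: read name/value pairs from an input file into shell variables.
# The file layout and all names below are hypothetical.
set -euo pipefail

# Create a small demo input file so the example is self-contained.
INPUT_FILE="$(mktemp)"
printf 'sample demo1\nthreads 4\n' > "$INPUT_FILE"

sample=""
threads=""

# read splits each line on whitespace: first word -> name, the rest -> value
while read -r name value; do
    case "$name" in
        sample)  sample="$value" ;;
        threads) threads="$value" ;;
        *)       echo "ignoring unknown setting: $name" >&2 ;;
    esac
done < "$INPUT_FILE"

rm -f "$INPUT_FILE"
echo "sample=$sample threads=$threads"
```

The case statement makes the script accept only the settings it knows about, so a typo in the input file produces a warning instead of silently creating a new variable.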
