BINF 6215: Deposit your data in the SRA (Part 1: BioSamples)
Since your class has created an interesting dataset, we are going to work on creating an SRA submission for your chloroplast data. I recently built a submission for a set of sixteen Vibrio vulnificus RNA-Seq samples. This is a walk-through of what I had to do to get that dataset submitted.
Things you should have ready before you start
- have your written laboratory protocol close at hand
- have a description of your experiment ready, in terms of biological samples, conditions and replicates
- have all your samples sorted and named so you know exactly what you are uploading
If you look at the quick start guide NCBI offers, you’ll probably get the idea that you should build the SRA submission first, then the BioProject submission, then the BioSample submission. That is exactly backwards. You need BioSample IDs to enter into your BioProject, and you need a BioProject ID BEFORE you can finish an SRA submission, so the order that works is BioSample first, then BioProject, then SRA.
Step 1: Lay out your samples and upload the descriptions
If you have more than one or two samples, get a batch template. I got the one for Microbes. It’s pretty self-explanatory: describe your samples and fill in the required fields in the spreadsheet. In this example, date of isolation and geographic location were irrelevant because all strains were lab-maintained mono-isolate cultures, but the fields could not be left blank, so I entered "not applicable" or "not collected".
I also had to add a custom column at the end of the table (not visible) and enter my sample IDs in that column so each sample would have one completely unique attribute. The number I put in that field is the sample number in the V. vulnificus study. You’ll note I’ve got a BioProject ID in my spreadsheet. That’s because I did things in the wrong order, and I will most likely have to e-mail NCBI to get it fixed. So, don’t be like me. Start with the BioSamples.
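If you would rather generate the attribute table from a script than type it by hand, the spreadsheet step above can be sketched roughly as follows. This is only an illustration under assumptions: the column names, the sample names, and the custom sample_id column are hypothetical stand-ins, so take the real headers and required fields from the batch template you actually download. The sketch fills empty fields with a placeholder, checks that the custom column really is unique per sample, and writes tab-separated text.

```python
import csv
import io

# Hypothetical column names: copy the real headers from your downloaded template.
COLUMNS = ["sample_name", "organism", "strain",
           "collection_date", "geo_loc_name", "sample_id"]
PLACEHOLDER = "not applicable"


def build_rows(samples):
    """Return one complete row per sample, using the placeholder for missing values."""
    return [{col: sample.get(col) or PLACEHOLDER for col in COLUMNS}
            for sample in samples]


def check_unique(rows, column):
    """Raise ValueError if any value in the given column is repeated."""
    seen = set()
    for row in rows:
        value = row[column]
        if value in seen:
            raise ValueError(f"duplicate {column}: {value}")
        seen.add(value)


def to_tsv(rows):
    """Serialize the rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up example samples; sample_id plays the role of the custom unique column.
samples = [
    {"sample_name": "Vv_sample_01", "organism": "Vibrio vulnificus", "sample_id": "1"},
    {"sample_name": "Vv_sample_02", "organism": "Vibrio vulnificus", "sample_id": "2"},
]

rows = build_rows(samples)
check_unique(rows, "sample_name")
check_unique(rows, "sample_id")
tsv_text = to_tsv(rows)
print(tsv_text)
```

Running the uniqueness check before you upload is cheaper than finding out from the submission form that two rows look like the same sample.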
Next, proceed to create a BioSample entry. It’s easy: once you fill in the form below correctly, the rest of the process prompts you for everything it needs in an unambiguous way. In all my submissions, I’m setting the data to be released on July 1, 2015. If our papers are published before then, I can always go back and ask NCBI to move the release date forward.
Once you get to the end of this process you can submit, and NCBI will convert your attribute files into a list of samples with identifiers. This is what it will look like once your file is processed.
References: SRA submission quick-start guide