Prepare Your Computer
Many of the assignments in BINF 6203 are scaled so that you can complete them on a reasonably powerful laptop without using the University Research Computing cluster. That said, to do the assignments you will need to prepare your computer by installing certain tools. I am going to assume that you’re working on an Apple computer running macOS (formerly OS X), because that’s the type of computer I have available to set up and test, but the general requirements are the same if you are setting up a Windows laptop with the latest version of Windows. If your personal computer already runs Linux, you probably know what to do.
I don’t have an entirely fresh laptop to set up, but I just updated my laptop to macOS Mojave, so I’m going to run through the series of steps that I follow to get all my software in place after a major update. Disclaimer: this may not catch every last step of a setup process for a new machine. As you work with, update and customize a new computer over a period of years, your environment will get more and more idiosyncratic and it will be hard to recall all the customizations you’ve made.
You need a Terminal (shell) window
If this is the first time you’ve ever had a command line relationship with a computer it is going to feel difficult at first. If you are totally unfamiliar with a UNIX command line environment, you can start by working through the Codecademy Learn The Command Line tutorial.
Pep talk: rules for using UNIX and not driving yourself crazy
- It’s never the computer, it’s always you. You typed it wrong, or you forgot a step, or your input was formatted wrong. Look again, look closer, and figure it out.
- Don’t get mad. That’s just the way UNIX works, it’s not personal, and you have to figure it out, so just power through.
- Someone else did it wrong already and figured it out already. Whatever you are trying to solve has been encountered by a beginning UNIX user or bioinformatician before you. Stack Overflow is your friend — join it, search for the answer to your question that is probably already there, and if not post a question and someone will help you figure it out.
Apple computers come with a built-in terminal application because macOS is built on a BSD-derived Unix foundation (this is why I prefer them). The Terminal application is in the Utilities folder under your Applications folder. Put it in your Dock so it’s easy to find.
Newer versions of Windows offer bash shell (terminal) functionality. On Windows you can also install Cygwin and/or PuTTY. The former is more fully featured, the latter is great if you just need to connect to the URC cluster from your Windows desktop. If you’re feeling frisky, you could partition your Windows machine so you can dual boot either Windows or Linux. (If you did not understand what I just said, either don’t try it, or mentally prepare for a tough process with a steepish learning curve.)
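If you want to confirm that your terminal works before going further, here is a minimal sketch of a few harmless first commands. They only print information and change nothing, and the exact output will depend on your machine.

```shell
# A few read-only commands to try in a fresh terminal window.
echo "Shell: ${SHELL:-unknown}"   # the shell program you are running
echo "System: $(uname -s)"        # prints Darwin on a Mac, Linux on Linux
echo "Current folder: $(pwd)"     # where you are in the file system
ls | head -n 5                    # first few items in the current folder
```

If each line prints something sensible instead of an error, your shell is ready for the rest of the setup.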
You need a suite of developer tools for Unix
Developer tools include things like code libraries and compilers that will turn readable code into executable programs. You often need these available to install research software even if you aren’t writing your own code.
On an Apple computer, this means you need Xcode installed. You can install it from the App Store like any other Apple application. If Xcode or parts of Xcode are already on your computer, try opening it; it may ask you to install additional required components. Then install the Xcode command line tools by typing xcode-select --install at the command line. You may be prompted to agree to licenses and install other needed software. Once you do that you should be good to go.
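To see whether the developer tools are already in place, something like the following sketch may help. The function name check_dev_tools is my own invention for illustration; it relies on xcode-select -p, which prints the active developer directory when the tools are installed, and it falls back to a plain message on machines that have no xcode-select at all.

```shell
# Report whether the Xcode command line tools appear to be installed.
# check_dev_tools is a hypothetical helper name, not an Apple tool.
check_dev_tools() {
    if command -v xcode-select >/dev/null 2>&1 && xcode-select -p >/dev/null 2>&1; then
        echo "Command line tools found at: $(xcode-select -p)"
    else
        echo "Command line tools not found. On a Mac, run: xcode-select --install"
    fi
    # A C compiler on your PATH is a good sign that source installs can work.
    if command -v cc >/dev/null 2>&1; then
        echo "C compiler: $(command -v cc)"
    else
        echo "No C compiler (cc) on your PATH yet."
    fi
}
check_dev_tools
```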
You need some package managers
Package managers are software applications that make installing other software applications easier. When I want to install, say, the samtools sequence analysis program, I need to have a bunch of other software installed before I even do that or it won’t work right. These software “prerequisites” are called dependencies, and a package manager takes care of them automatically, IF the software is available via that package manager.
The first one I usually install is Homebrew, because a lot of common genomics software is available through Homebrew, as well as other favorite general-purpose UNIX packages. To get it, just copy and paste this command at the UNIX command line:
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
Once you have Homebrew, you can use brew install packagename to install any available package, brew update to update brew itself and its lists of available packages and versions, and brew upgrade to upgrade either a single package (if you name one) or everything you’ve previously installed, along with all the relevant dependencies.
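As a rough sketch of how these commands fit together, the block below runs a read-only checkup and lists the usual routine as comments. The helper name brew_checkup is hypothetical, and I am assuming samtools is available as a Homebrew package on your machine; nothing here installs or changes anything.

```shell
# Read-only Homebrew checkup: safe to run, installs nothing.
# brew_checkup is a hypothetical helper name used for illustration.
brew_checkup() {
    if ! command -v brew >/dev/null 2>&1; then
        echo "Homebrew is not installed yet."
        return 0
    fi
    brew --version              # confirm that brew itself runs
    brew doctor                 # report common setup problems
    brew deps --tree samtools   # show the dependencies brew would handle
}
brew_checkup

# When you are ready to change your system, the usual routine is:
#   brew install samtools   # install a package plus its dependencies
#   brew update             # refresh brew and its package lists
#   brew upgrade            # upgrade everything you have installed
```

Running the checkup first is a cheap way to catch a broken setup before you start installing course software.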
Some packages you want won’t be Homebrew friendly, and you’ll have to figure out how to install them on your own from source or use a different method. Other package-manager-adjacent tools that you might need include pip (the Python package installer), mercurial and git (both revision control tools), and miniconda. If the software install instructions tell you that you need one of those, you can figure out how to install it at the time.
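Before you go hunting for install instructions, it can help to check which of these tools you already have. This is a minimal sketch: report_tools is a made-up helper name, and the command names I pass to it (pip3 for the Python 3 installer, hg for mercurial, conda for miniconda) are my assumptions about how those tools usually appear on the command line.

```shell
# Print where each named tool lives, or say that it is missing.
# report_tools is a hypothetical helper name used for illustration.
report_tools() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: $(command -v "$tool")"
        else
            echo "$tool: not installed"
        fi
    done
}
report_tools pip3 git hg conda
```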
On an Ubuntu Linux machine (including the Ubuntu bash shell available on newer versions of Windows), one package manager that is consistently pretty useful to have is apt-get. Cygwin uses its own setup program to add packages rather than apt-get. The ecosystem for setting up and installing scientific software on these machines is quite different from macOS, though, and you will need to look for install instructions in packages as we need them.
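To make the platform difference concrete, here is a small sketch that only prints the install command you would probably use for a package, depending on which package manager it finds. The helper name install_hint is hypothetical, and I am assuming the package has the same name under both Homebrew and apt-get, which is not always true.

```shell
# Suggest (but do not run) an install command for the given package.
# install_hint is a hypothetical helper name used for illustration.
install_hint() {
    pkg="$1"
    if command -v brew >/dev/null 2>&1; then
        echo "brew install $pkg"
    elif command -v apt-get >/dev/null 2>&1; then
        echo "sudo apt-get install $pkg"
    else
        echo "No supported package manager found; check the install instructions for $pkg"
    fi
}
install_hint samtools
```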
You probably need python3 and R
Even if I don’t make you use them a lot in this class, you are going to want to learn them eventually, so you might as well learn how to get them set up.
You need to make some choices for installing Python, and stick to them, so that you don’t have two competing Python installations on your computer. First, the current version of Python is 3.x. Python 2.x is deprecated. Some operating systems might still make Python 2 available to you, but don’t be tempted. Install, learn and update Python 3. Second, you are going to have to keep updating Python. You can install and maintain it through Homebrew with brew install python. This is a great choice if you don’t have Python 3 on your computer at all yet, as it can then live in your Homebrew Cellar with all your other packages and get updated every time you run brew upgrade. However, if Python 3 already lives on your computer (type python3 --version at the command line and hit enter to find out) then you’ll probably just want to take upgrades from Apple as they occur, unless you have a really important reason to have the bleeding-edge version.
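One way to find out what you already have, and to spot competing installations, is a check along these lines. The helper name check_python is hypothetical. The 2>&1 is there because Python 2 prints its version to the error stream rather than to standard output.

```shell
# Show which python commands exist, where they live, and their versions.
# check_python is a hypothetical helper name used for illustration.
check_python() {
    for name in python3 python; do
        if command -v "$name" >/dev/null 2>&1; then
            echo "$name -> $(command -v "$name") ($("$name" --version 2>&1))"
        else
            echo "$name -> not found"
        fi
    done
}
check_python
```

If python3 points into your Homebrew folders, Homebrew is managing it; if it points into /usr/bin, it came with the operating system or the developer tools.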
To install R, just type brew install r and you’ll be in business. To use R, some people like RStudio, and some people like the command line. We won’t use R until we get to gene expression analysis, so you have some time before you need to do this.
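Once R is installed, a quick sanity check might look like the sketch below. The helper name check_r is hypothetical; Rscript is the command that ships with R for running a single expression or script without opening an interactive session.

```shell
# Confirm that R runs by asking it to print its own version string.
# check_r is a hypothetical helper name used for illustration.
check_r() {
    if command -v Rscript >/dev/null 2>&1; then
        # Rscript runs one R expression and exits, handy for quick checks.
        Rscript -e 'cat(R.version.string, "\n")'
    else
        echo "R is not installed yet. With Homebrew: brew install r"
    fi
}
check_r
```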