New computer setup: Apple
Here’s what I do, in order, to set up my computer so that I can install the software we use in the class. Disclaimer: I just got a new work laptop, so it doesn’t have much of an environment on it yet. I can’t know what you’ve done to your computer in the past, so some of these steps may not work for you if your environment is already set up differently:
1. Administrator privileges
2. Xcode
Install the Xcode command line tools, which provide the compilers and libraries that support software development on macOS.
xcode-select --install
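If you’re not sure whether the tools are already there, a quick check like the one below can save you a reinstall. This is only a sketch: it assumes that xcode-select -p prints the active developer directory and returns an error when nothing is installed, and xcode_status is a variable name I made up for the example.

```shell
# Sketch: report whether the Xcode command line tools appear to be installed.
# xcode_status is a hypothetical variable name used only for this example.
if command -v xcode-select >/dev/null 2>&1 && xcode-select -p >/dev/null 2>&1; then
    xcode_status="installed at $(xcode-select -p)"
else
    xcode_status="not installed"
fi
echo "Command line tools: $xcode_status"
```

If it reports "not installed", run the install command above and check again.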
3. Conda
You will need conda to download and install various software packages. Conda is the package manager itself, and two different distributions bundle it. Anaconda is a big, heavy-duty distribution that provides a graphical user interface and preloads a whole bunch of packages. I prefer miniconda, because mostly I’m not browsing for conda packages; I’m homing in on a few very specific packages that I need, and I don’t want to bring all that other junk along with them.
To install miniconda, you can find links to all the latest installer package versions at https://docs.conda.io/en/latest/miniconda.html. Download your relevant installer .pkg and follow the install instructions. If you’re installing this on your own computer you’ll only be able to install it for yourself, but that’s fine.
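Once the installer finishes, something like the following should tell you whether conda is visible from your shell. Treat it as a sketch: conda_status is a name I invented for the example, and you may need to open a new terminal window before conda shows up on your PATH.

```shell
# Sketch: confirm that conda is on your PATH after the installer finishes.
# conda_status is a hypothetical variable name used only for this example.
if command -v conda >/dev/null 2>&1; then
    conda_status="found"
    conda --version      # prints the installed conda version
    conda info --base    # prints the directory where the conda install lives
else
    conda_status="missing"
    echo "conda not found; try opening a new terminal window first"
fi
echo "conda: $conda_status"
```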
By starting with conda as our primary package manager, we’re making a choice. We’re always going to try to install things through conda first, and only if that isn’t possible will we switch to either Python’s pip3 package manager or Homebrew, another package manager that supports scientific software.
4. Python
Conda installs its own python version, which will be tucked away in a conda-specific location in your home directory. When you install miniconda for yourself, the conda version of python will become your default; when you call python3 to run a script at the command line, that’s the version that will be used. (Unless you mess around with your environment variables, which is NOT RECOMMENDED until you know what you’re doing, and honestly then only if you want to spend some serious TIME on it.)
[Figures not reproduced: the default python3 before and after installing miniconda.]
This can be a little confusing, because you will also have other python versions hanging around on your computer (see figure at top), plus the system python version in /usr/bin/python, which you should never replace or repoint to a different executable, because Apple uses it for system functions.
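To see which python3 your shell will actually pick up, you can try a check like this one. It’s a minimal sketch, and python_path is just a name I chose for the example. If conda’s python is your default, the path it prints should sit inside your miniconda directory.

```shell
# Sketch: show which python3 the shell will run by default.
# python_path is a hypothetical variable name used only for this example.
python_path=$(command -v python3 || true)
if [ -n "$python_path" ]; then
    echo "Default python3: $python_path"
    python3 --version
    # Ask the interpreter itself where it lives.
    python3 -c 'import sys; print(sys.executable)'
else
    echo "python3 not found on PATH"
fi
```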
If you start by installing with conda, then just assume conda’s python is going to be your default python, and keep it and its packages up to date.
You can update it by typing:
conda update python
This will trigger a whole bunch of packages getting updated, at least the first time. Conda will present you with a “package plan” and tell you what it’s going to update.
5. Important python packages
I’d just go ahead and install the following python packages right from the start. There might be others that you’ll want later, but it’s a safe bet that you’ll want to have these.
conda install scipy (this will install both scipy and numpy which contain scientific and numerical functions for python)
Note: you might get a warning at this point that your pip installer is not up to date. If it is out of date, use the command that is provided in the warning text and update pip.
e.g. /Library/Frameworks/Python.framework/Versions/3.10/bin/python3.10 -m pip install --upgrade pip
conda install pandas (pandas is a highly convenient data science library that you will most definitely need)
conda install biopython (contains functions that manipulate sequence data)
There are probably more, but I always put these three on right away because I 100% know that I’m going to want them.
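After the installs finish, a quick import test is one way to confirm that your default python3 can see the new packages. This is a sketch under a couple of assumptions: check_module is a helper I made up for the example, and it relies on the fact that biopython is imported under the name Bio rather than biopython.

```shell
# Sketch: check whether the default python3 can import each package.
# check_module is a hypothetical helper defined only for this example.
check_module() {
    if python3 -c "import $1" >/dev/null 2>&1; then
        echo "$1: ok"
    else
        echo "$1: missing"
    fi
}

# biopython is imported as "Bio"; numpy comes along with scipy.
for module in numpy scipy pandas Bio; do
    check_module "$module"
done
```

If a package shows up as missing, it’s worth checking that python3 is pointing at conda’s python before reinstalling anything.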
6. Add conda channels
This will help you find relevant packages that are available for conda installation. We can add the ‘bioconda’ channel which contains useful packages for bioinformatics. Type the following to add this channel to your configuration.
conda config --add channels bioconda
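To confirm that the channel was added, you can ask conda to print its channel list. A minimal sketch follows; channels_checked is a flag I invented for the example. Channels added with --add go to the top of the list, so bioconda should appear first if you just added it.

```shell
# Sketch: list the channels conda is currently configured to search.
# channels_checked is a hypothetical flag used only for this example.
if command -v conda >/dev/null 2>&1; then
    conda config --show channels
    channels_checked="yes"
else
    echo "conda not found on PATH"
    channels_checked="no"
fi
```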
To locally install a software package that we use in the course, try this example:
conda install -c conda-forge -c bioconda -c defaults prokka
This will install the prokka prokaryotic gene-finding package (and a lot of other stuff). When the install is done, try typing the command ‘prokka’ and see what happens. You can also type ‘blastp -help’ and you’ll find that a recent version of NCBI’s BLAST software has been installed and is ready for you to use.
Warning: conda may not always install the most up-to-date version of software. For packages whose functions and models have been stable for a long time, this won’t be much of an issue. However, for packages that update frequently, or when the conda version is very different from the current release, you may want to follow the developer’s instructions to install a more current version instead.
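One way to judge how far behind conda is for a given package is to compare what you have installed with what the channel offers. The sketch below wraps that in a small helper; show_conda_versions is a name I made up, and it assumes the package comes from the bioconda channel. You would still need to look up the developer’s current release yourself.

```shell
# Sketch: compare the installed version of a package with what bioconda offers.
# show_conda_versions is a hypothetical helper defined only for this example.
show_conda_versions() {
    pkg="$1"
    if ! command -v conda >/dev/null 2>&1; then
        echo "conda not found on PATH"
        return 1
    fi
    echo "Installed version of $pkg:"
    conda list "$pkg"
    echo "Versions of $pkg available from bioconda:"
    conda search -c bioconda "$pkg"
}

# Example call, using the prokka package from this section.
show_conda_versions prokka || true
```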
7. Homebrew
There are a few software packages that are better installed with the Homebrew package manager than with conda. As long as you don’t try to install the same package with both, Homebrew should not interfere with your conda environment or vice versa.
To get Homebrew, paste the following command into your shell window:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
Your computer will want to know your password to verify that you have admin access during this process. It may take about 10 minutes to install Homebrew, so don’t give up on it if it seems to hang there for a bit.
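If the brew command isn’t found after the installer finishes, it may just not be on your PATH yet. Here is a sketch that looks in the two usual install locations (/opt/homebrew on Apple silicon Macs, /usr/local on Intel Macs) and loads brew into the current shell. The brew_status variable is a name I invented for the example, and your install could live somewhere else if you customized it.

```shell
# Sketch: find brew in its usual install locations and load it into this shell.
# brew_status is a hypothetical variable name used only for this example.
brew_status="missing"
for prefix in /opt/homebrew /usr/local; do
    if [ -x "$prefix/bin/brew" ]; then
        # brew shellenv prints the export lines that put brew on your PATH.
        eval "$("$prefix/bin/brew" shellenv)"
        brew_status="found in $prefix"
        break
    fi
done
echo "Homebrew: $brew_status"
if [ "$brew_status" != "missing" ]; then
    brew --version
fi
```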
Once the install finishes, you can test it out by installing samtools and bcftools, two packages that are commonly used to analyze high throughput sequence data.
brew install samtools
brew install bcftools
Like conda, Homebrew will tell you what other packages it plans to install to make the install you wanted work right. Everything will end up in a Homebrew-specific location on your computer, but executables will end up in a generic system location, which is one reason why you should not try to install the same software twice with conda and with Homebrew. To see where the samtools and bcftools executables got installed, use the unix ‘which’ command, e.g. ‘which samtools’. To see which version got installed, type ‘samtools --version’.
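The location and version checks can be rolled into one small loop, sketched here. report_tool is a helper name I made up for the example; it uses the shell’s command -v, which does the same job as ‘which’ for this purpose, and prints only the first line of each tool’s version output.

```shell
# Sketch: report where each tool lives on the PATH and which version it is.
# report_tool is a hypothetical helper defined only for this example.
report_tool() {
    tool="$1"
    if tool_path=$(command -v "$tool"); then
        echo "$tool: $tool_path"
        "$tool" --version | head -n 1
    else
        echo "$tool: not found on PATH"
    fi
}

for t in samtools bcftools; do
    report_tool "$t"
done
```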