BINF 6215: UNIX on your Mac
The commands covered here are also (mostly) covered in Chapter 4 of Haddock and Dunn, should you need to review.
There are two different ways to navigate around the same set of files on your Mac — via the Finder, pointing and clicking at file and directory icons like you’re probably used to, and at the command line, like a badass. The entire point of this class is to put you on the road to being a badass who doesn’t need their hand held by a commercial software package in order to get things done.
So how do you access the command line? Look in your Dock: is there an icon of a black window with an arrow prompt (>)? Click on that. If it’s not in your Dock, you’ll find it in Applications/Utilities. It’s your Terminal window. Once it opens, you can type commands directly into that window.
Two things to know. One, you’re talking to a program called a command line interpreter, commonly known as a “shell”. Different machines have different default shells and they interpret commands differently. So command lines will not always work the same in every place, although the most common commands are pretty consistent. You can tell what kind of shell you are in by typing
echo $0
What kind of shell does your Mac use?
The second thing to know is that when you are at the command prompt, you are in a virtual location in the filesystem. That location determines which files and programs you can interact with directly. To find out where you are, you type
pwd
which means “print working directory”. Where are you?
Every location in the UNIX filesystem is described by a path of directories relative to the root directory. The location of the root directory is /. So if you are in /Users/yourname, you are in your home directory, which is contained inside the Users directory, which is contained inside the root directory. You can also find out what files are “around” you in the directory you’re in. Type
ls
to see a list of files and directories contained in the directory you are in. If you type
ls -a
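you’ll see some special files that don’t appear in the regular list. Their names start with a dot, which is what hides them. Those files also won’t appear if you look at your home directory in the Finder, but they’re there. The most important file that could be hidden there is a file called .bash_profile, which you can edit to modify how your shell environment behaves. (That file is read by the bash shell. If echo $0 told you that you are in zsh, which is the default on macOS 10.15 and later, the equivalent file is .zprofile.)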
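If you want to see the difference between ls and ls -a without poking around in your real home directory, here is a small sketch. The scratch directory and both file names are made up for the illustration.

```shell
# Make a throwaway directory so nothing in your home directory is touched
demo=$(mktemp -d)

# One ordinary file and one "dot file" (hypothetical names)
touch "$demo/visible.txt" "$demo/.hidden_file"

# Plain ls skips names that start with a dot
plain=$(ls "$demo")
echo "ls shows:"
echo "$plain"

# ls -a lists everything, including . (this directory) and .. (the one above)
all=$(ls -a "$demo")
echo "ls -a shows:"
echo "$all"

# Clean up the throwaway directory
rm -r "$demo"
```

The two extra entries that ls -a prints, . and .., are not files you created. They are the shorthand names for the current directory and the directory above it.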
Setting shell behavior
If you type
env
you will get a description of some very important things, including the path where your command line interpreter looks for programs that you have installed, like so
PATH=/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/opt/X11/bin
It’s really important that your environment is set up so that you’re looking for programs in the right path when you try to run them from scripts or at the command line. You can add directories to your path by creating a .bash_profile file containing the command:
export PATH="$HOME/newdirectory/directory:$PATH"
which would add the directory /Users/yourname/newdirectory/directory to your path in any new shell window you open. Even if you don’t have a .bash_profile file already, $PATH has a value, because it is first set in a system-level file. Including $PATH at the end of the export line in your .bash_profile makes sure you keep the system path as well as the new directories you are adding.
If you don’t have the right directories in your path and you type the name of a program into your shell, the program will not run. The shell will report “command not found” because it does not know where to look for that program.
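To watch the path do its job, you can try something like the following sketch. It changes PATH only for the current shell session and never touches your .bash_profile. The program name hello_demo and the scratch directory are invented for this example.

```shell
# Print each directory in your PATH on its own line (easier to read than one long string)
echo "$PATH" | tr ':' '\n'

# Make a throwaway directory holding a tiny program (hello_demo is a made-up name)
demo=$(mktemp -d)
printf '#!/bin/sh\necho "hello from my own program"\n' > "$demo/hello_demo"
chmod +x "$demo/hello_demo"

# Before: the directory is not in PATH, so the shell cannot find the program
command -v hello_demo >/dev/null || echo "hello_demo is not on the PATH yet"

# Put the new directory in front of the existing PATH, for this session only
export PATH="$demo:$PATH"

# After: the shell finds the program by name
hello_demo
```

Because the export line ends with :$PATH, everything that was already in the path is still there. Leave that part off and ordinary commands like ls would stop being found.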
Finding programs on your computer
If you want to know whether your system can find a particular program, use the which command. First let’s see if your system has the “vi” editor. Type:
which vi
which vim
Your system should return a directory path location in each case.
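If you need to check for several programs at once, a short loop saves typing. This sketch uses command -v, a shell built-in that does the same kind of lookup as which. The function name check_program is just something I made up for the example.

```shell
# Report whether the shell can find a program by name
check_program() {
    if location=$(command -v "$1"); then
        echo "$1 found at $location"
    else
        echo "$1 is not on your PATH"
    fi
}

# Check a few programs that come up in this tutorial
for prog in ls vi vim nucmer; do
    check_program "$prog"
done
```

What gets printed depends on what is installed on your machine, so your results may differ from your neighbor's.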
Using vi
vi is a text-based editor that has been around on UNIX systems forever; vim is a more modern version with some added features. Along with other editors like pico and emacs, these programs can be used to edit text files and to write code. Some editors offer syntax highlighting and other help with writing good quality code; others are very basic. Which you use is really up to you. I find vi/vim easiest to use for simple editing tasks, like editing my .bash_profile or doing a global search and replace in a file. To create and edit a .bash_profile in your home directory, type:
vi ~/.bash_profile
You will be shifted into the vi environment. There is a lot to learn about vi and we aren’t going to cover it all here. The basics are that vi has two modes: “command mode” (where it is by default when you open it) and “insert mode” (which allows you to add text). To switch into insert mode, type:
i
Yes, the letter i. Just i, nothing but i. Add the text:
export PATH="$HOME/software/bin:$PATH"
By adding this text, you are telling the shell interpreter to look for executable programs in the system path, but to also look for executable programs in a directory called software/bin, which is under your home directory.
To get out of “insert mode”, hit the escape key. Then to save your new .bash_profile and exit vi, type:
:wq
Commands like this one, typed at the bottom of the vi window, are preceded by a “:”. “wq” means write-quit. To learn more about vi, there are hundreds of tutorials on the internet. This is one of them.
Once you have quit vi, type:
source ~/.bash_profile
“source” is the command that tells the shell interpreter to read instructions from a shell script. Note: if you want to learn everything there is to know about bash shell scripting, or close to it, work through Mendel Cooper’s shell scripting tutorials. However, this is a skill that you can learn as you need it and it will make more sense to learn it when you have a problem in mind that you need to solve, as we will in a few days.
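The difference between running a script and sourcing it is easier to see than to describe. In this sketch, the file settings.sh and the variable COURSE_NAME are invented for the illustration. I use the single dot, which is the portable spelling of source and works in any UNIX shell, not just bash.

```shell
# Write a one-line settings file in a throwaway directory (hypothetical file and variable)
demo=$(mktemp -d)
echo 'export COURSE_NAME="BINF 6215"' > "$demo/settings.sh"

# Running the file starts a separate shell; the variable disappears when that shell exits
sh "$demo/settings.sh"
echo "after running:  ${COURSE_NAME:-not set}"

# Sourcing reads the same lines into the shell you are already in, so the variable sticks
. "$demo/settings.sh"
echo "after sourcing: ${COURSE_NAME:-not set}"
```

This is why you source your .bash_profile after editing it: the new PATH has to land in the shell you are typing in, not in a short-lived copy.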
Moving around in the filesystem
To move from directory to directory in the file system, you use the command “cd”. “pwd” told you that you were located in the directory /Users/yourname. But what if you want to move to another location? You might have noticed that in your system path, /usr/local/bin is one of the places that your shell interpreter looks for executable programs. To move to that directory, type:
cd /usr/local/bin
Then type:
ls
to see what programs are present. Another location you might be interested in is your Dropbox. If you don’t have a Dropbox account by now, you should get one. If you do, your Dropbox is probably located in /Users/yourname/Dropbox. If you join the 6215-DATA folder that I invited you to using your UNCC email address, you can practice copying files using some of the data in that folder. Use the cd command to move to the 6215-DATA folder in your Dropbox by typing:
cd ~/Dropbox/6215-DATA
- When you preface a path with ~/, that means “start at my home directory”.
- When you preface a path or command with ./, that means “look in this directory”.
- When you preface a path with ../, that means “start in the directory one above this one”.
- When you preface a path with ../../, that means “start in the directory two above this one”.
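Here is a small sketch of those prefixes in action. The directory names project, reads and run1 are made up, and everything happens inside a throwaway directory.

```shell
# Build a small nested directory tree to walk around in (hypothetical names)
demo=$(mktemp -d)
mkdir -p "$demo/project/reads/run1"

# Start at the bottom of the tree
cd "$demo/project/reads/run1"

# ../ means "one directory above this one"
cd ../
one_up=$(basename "$PWD")
echo "one level up from run1: $one_up"

# ./ means "starting from this directory"
cd ./run1

# ../../ means "two directories above this one"
cd ../../
two_up=$(basename "$PWD")
echo "two levels up from run1: $two_up"
```

From run1, one level up is reads and two levels up is project. Typing cd ~ at any point takes you straight back to your home directory.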
Renaming, moving and copying files
To copy some files from your current directory to your home directory, you can type:
mkdir ~/data
cp -R cplastSum2014 ~/data
This sequence of commands creates a directory called “data” under your home directory, and makes a copy of the “cplastSum2014” folder in that data directory. You will be using the chloroplast data for some of the exercises in this course so you can go ahead and make that copy. It might take some time.
If you want to rename a file, you can use the “mv” command. For instance, if you want to call your directory “chloroplasts” instead of “cplastSum2014”, you would type:
cd ~/data
mv cplastSum2014 chloroplasts
The first command moves you to your data directory, where the files you want the second command to affect are located; the second command renames the directory.
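If you would like to rehearse the copy and rename steps before doing them on the real data, this sketch plays them out in a throwaway directory. The folder cplast_example and the file a.fasta are stand-ins I made up; they are not part of the course data.

```shell
# Set up a throwaway "source" folder with one small file in it (hypothetical names)
demo=$(mktemp -d)
mkdir -p "$demo/source/cplast_example"
echo ">seq1" > "$demo/source/cplast_example/a.fasta"

# Make a destination directory, then copy the whole folder into it.
# -R means "recursive": copy the directory and everything inside it.
mkdir "$demo/data"
cp -R "$demo/source/cplast_example" "$demo/data"

# Rename the copy; mv with two names in the same directory is a rename
cd "$demo/data"
mv cplast_example chloroplasts

# The copy has its new name, and the original is still where it was
ls "$demo/data"
ls "$demo/source"
```

Note that cp leaves the original in place, while mv does not: after the mv, there is no longer anything called cplast_example in the data directory.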
Commands and flags
You may have noticed that when we used the cp command above, we followed it with “-R”. That’s what’s called a flag. Flags change settings in the program you are running, and can also be followed by a parameter value. Flags are different for every UNIX command and program (their usage is specific to the command), although you’ll often find that -i followed by a filename is input, -o followed by a filename is output, and some other conventions are shared between commands. To find out what command line options the cp command can be used with, just type
cp
at the command line with no other flags or filenames. To get a full manual page describing the command usage in detail, or to get command line help, type:
man cp
cp -h
Not all programs will have UNIX manpages, but most programs will have a command line help option.
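To get a feel for flags with and without parameter values, here is a sketch using two standard commands, sort and head, on a made-up file called genes.txt. These are not the only flags those commands take; man sort and man head have the full lists.

```shell
# A small file to practice on (hypothetical name and contents)
demo=$(mktemp -d)
printf 'gene3\ngene1\ngene2\n' > "$demo/genes.txt"

# No flags: sort prints the lines in alphabetical order
sort "$demo/genes.txt"

# A flag with no value: -r reverses the sort order
sort -r "$demo/genes.txt"

# A flag followed by a value: -n 1 tells head to print only the first line
first=$(head -n 1 "$demo/genes.txt")
echo "first line of the unsorted file: $first"

# A flag followed by a filename: -o sends the sorted output to that file
sort -o "$demo/sorted.txt" "$demo/genes.txt"
cat "$demo/sorted.txt"
```

The same letter can mean different things to different programs, so check the help for each command rather than assuming.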
Now let’s look at command line options for a little program called nucmer, which should be on your workstation. nucmer is part of the MUMmer package of sequence analysis software. MUMmer is not a native UNIX program, but most developers will provide similar command line feedback options with programs that are meant to run at the command line. So you can type just “nucmer” to get the usage, and “nucmer -h” to get a help page describing all the options.
- Is nucmer on your machine?
- Where is the executable program located?
- What does nucmer do?
- What are mandatory elements that have to be present for nucmer to run?
- How do you change the prefix of the output files from nucmer?
- How do you set the program to use matches that are unique in both the query and the reference?
When you start working with command line programs to build pipelines, a critical step is examining these options and making sure you know what your parameters should be, where the input will come from, and where the output will go.
We’ve barely scratched the surface with these UNIX operations, but now you should at least feel comfortable about doing basic operations with your files from the command line. If you want some more UNIX tips in book format in addition to what you find in Haddock and Dunn, there are three chapters of Developing Bioinformatics Computer Skills (by Gibas and Jambeck) that are relevant. You can probably get them on a free trial basis if you have not used Safari Books Online before.