Using the Ontologizer (Part 2)
Despite the willingness of the Ontologizer developers to correspond with me, we have not yet arrived at a solution for the problem of graphs not displaying in the Ontologizer, but we did arrive at a workaround. In the Ontologizer menu (shown in the Apple top menu bar when the main Ontologizer window is active), go to the Window pulldown and select “Log” to see the log file. In the log file, Ontologizer will tell you where it put the input to graphviz, e.g.
You can move this file to your results directory and use /usr/local/bin/dot to write out a PNG file either for all of your enriched genes, or for a single category, as shown below.
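Here is a minimal sketch of that dot call, wrapped in Python so it can be scripted. The file names are placeholders I made up (use whatever path the Ontologizer log actually reports), and the only Graphviz options used are the standard -Tpng and -o. It renders whichever graph file you hand it.

```python
import os
import subprocess

DOT = "/usr/local/bin/dot"  # path used in this post; adjust if dot lives elsewhere


def build_dot_command(dot_file, png_file, dot_path=DOT):
    """Return the argument list for rendering a Graphviz file as PNG."""
    return [dot_path, "-Tpng", dot_file, "-o", png_file]


def render_png(dot_file, png_file, dot_path=DOT):
    """Run dot; raises CalledProcessError if Graphviz reports a failure."""
    subprocess.run(build_dot_command(dot_file, png_file, dot_path), check=True)


if __name__ == "__main__":
    # Hypothetical file names: substitute the file named in the Ontologizer log.
    src = "results/ontologizer_graph.dot"
    if os.path.exists(DOT) and os.path.exists(src):
        render_png(src, "results/enriched_terms.png")
```

Typed directly in a terminal, the same call would be dot followed by -Tpng, the input file, -o, and the output PNG name.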
Otherwise, all of the internal navigation in the Ontologizer works, so you can navigate to individual locus IDs and see what functional information is attached. You can also write out annotation files and results tables; the graph display is the only part that doesn’t work quite right.
Choices in the Ontologizer
There are parameter and method choices for you to make within the Ontologizer and they will change your results.
The first major thing to consider is the choice of enrichment analysis method. The basic method for calculating enrichment (Term-for-term) does not take into account the inherent structure of the Gene Ontology, and may over-call enriched terms because it calls both sub-categories and the larger categories that contain them. The Parent-Child method (fully described here, and shown to be more effective than other contemporaneous methods at identifying enriched GO terms without over-calling) takes that structure into account, but it tends to return a list of rather high-level categories that annotate lots of genes in the population, which makes the results somewhat difficult to interpret except by browsing to individual genes one at a time.
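To make the difference concrete, here is a rough sketch of the calculation behind the two methods. It is not the Ontologizer's own code. It assumes Term-for-term is a one-sided hypergeometric test per term and that Parent-Child, as I understand the published description, runs the same test but counts only genes annotated to the term's parent(s). All gene counts below are made up for illustration.

```python
from math import comb


def hypergeom_tail(pop_total, pop_annotated, study_total, study_annotated):
    """P(X >= study_annotated) when study_total genes are drawn without
    replacement from pop_total genes, pop_annotated of which carry the term."""
    upper = min(pop_annotated, study_total)
    denom = comb(pop_total, study_total)
    hits = sum(
        comb(pop_annotated, k) * comb(pop_total - pop_annotated, study_total - k)
        for k in range(study_annotated, upper + 1)
    )
    return hits / denom


# Made-up counts: 20000 genes in the population, 200 annotated to the term,
# 300 genes in the study set, 12 of those annotated to the term.
p_term_for_term = hypergeom_tail(20000, 200, 300, 12)

# Parent-Child (assumed form): restrict both sets to genes annotated to the
# parent term, e.g. 1500 in the population and 60 in the study set.
p_parent_child = hypergeom_tail(1500, 200, 60, 12)

print(p_term_for_term, p_parent_child)
```

With the parent term already enriched in the study set, the child term has to be over-represented relative to its parent to score well, which is why the Parent-Child p-value is usually the less extreme of the two.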
The second major thing to consider is whether to use a statistical correction for multiple tests, such as the Bonferroni correction. There are several options available, and you can investigate the differences between Bonferroni, Bonferroni-Holm, etc. on your own, but the main point here is that you should use a multiple testing correction. It is likely to be the difference between hundreds of genes showing up as statistically significant vs. tens (note: you want the tens, not hundreds that are mostly false positives and aren’t really significant).
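If you want to see what the corrections actually do to a list of p-values, here is a small sketch of the textbook Bonferroni and Bonferroni-Holm adjustments. It is written from the standard definitions, not taken from the Ontologizer, and the p-values are made up.

```python
def bonferroni(p_values):
    """Multiply every p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm(p_values):
    """Step-down Holm adjustment: the smallest p-value is multiplied by m,
    the next by m - 1, and so on, keeping adjusted values non-decreasing."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, p_values[i] * (m - rank))
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# Made-up raw p-values for four GO terms.
raw = [0.01, 0.04, 0.03, 0.005]
print(bonferroni(raw))
print(holm(raw))
```

Holm never gives a larger adjusted p-value than plain Bonferroni, so it keeps a few more terms while still controlling the same family-wise error rate.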