CRAP - a Character-state Reconstruction Analysis Program
--------------------------------------------------------

This file contains the documentation for using CRAP.

CRAP was written to quantify the robustness of most-parsimonious
reconstructions of ancestral states on phylogenetic trees for binary
character data. It does this by reconstructing ancestral states and
counting character-state changes over a range of relative costs of
gains and losses (step matrices). The parsimony algorithms used are
those described by Swofford and Maddison (1992).

DISCLAIMER
----------

CRAP is really a utility more than a stand-alone application. The
output it generates is intended to be viewed in other applications.
This is my first C program, and the code is rather inelegant; I welcome
suggestions for improvement. It does work, but error checking is
minimal and it is not very forgiving with respect to input file format
and command-line options - see IMPORTANT POINTS, below.

Compiling CRAP
--------------

This program is distributed as C source code in the file
crap-VERSION.c, where VERSION is the number of the current version. It
was developed and tested on different versions of Unix (Linux and
Solaris). I haven't tried compiling and running it on other platforms,
so if you do, I would be interested in knowing whether it works.

To build an executable file, the source needs to be compiled, e.g.:

    gcc crap-VERSION.c -o crap

(substitute the actual version number for VERSION)

Using CRAP
----------

CRAP reads NEXUS files that contain a DATA block (with character states
coded as 0 or 1) and a TREES block. For a description of the NEXUS
format, I recommend the MacClade manual (Maddison and Maddison, 1992).

The program is invoked from the command line as follows:

    crap -cX -tY [options] FILENAME

where X and Y are integers (beginning with 1) that refer to the
character and tree you are interested in, respectively.
If 'a' is used in place of an integer for X or Y (but not both), all
characters or trees in the input file will be analyzed - this is useful
for looking at permuted data. Note that the -c and -t options are
required even if your file only contains one character and one tree.

Options:

    -mgX, -mlY
        These options set the maximum cost of gains and losses,
        respectively - in other words, the range over which CRAP will
        reconstruct ancestral states. X and Y are floating-point
        numbers with one decimal place. The default value for both is
        10.0.

    -iX
        This specifies how much CRAP should increment the cost of gains
        or losses at each step. The default value is 0.1 - i.e., CRAP
        will hold one cost fixed at 1.0 and increment the other cost by
        0.1 until the maximum value (see above) is reached.

Output:

CRAP writes three output files: a data file containing the number of
gains and losses reconstructed at each step over the range of costs, a
Gnuplot command file that will allow you to plot the data using the
program Gnuplot, and a NEXUS tree file with numbers at internal nodes
that represent the cost of gains or losses at which the ancestral state
of each node changes under parsimony. To distinguish between the two
(costs of gains and costs of losses), numbers preceded by '999' are
costs of losses. The tree file should be viewed with the program
TreeView by Rod Page - I use the '999' trick because TreeView does not
accept negative numbers for node labeling.

The files are named FILENAME.cX.tY.SUFFIX, where FILENAME is the name
of the original input file, X and Y are numbers referring to the
character and tree analyzed, respectively, and SUFFIX is either 'data',
'plot', or 'tree'. If multiple characters or trees are specified
initially, then the output tree file is not written.

Example
-------

To analyze character 1 on tree 1 in the file 'testfile':

    crap -c1 -t1 testfile

CRAP will create the files testfile.c1.t1.data, testfile.c1.t1.plot,
and testfile.c1.t1.tree.

To look at all the characters on tree 1:

    crap -ca -t1 testfile

If you have Gnuplot installed, you can then graph the number of gains
and losses with the command:

    gnuplot testfile.c1.t1.plot

To view the robustness values at internal nodes on the tree, transfer
testfile.c1.t1.tree to a Mac or PC and open it with TreeView.

IMPORTANT POINTS
----------------

DO NOT use a translation table in the trees block! Trees should contain
the taxon names themselves, NOT numbers referring to a list of names
elsewhere in the DATA block. MacClade gives you the option of not using
a translation table. I hope to fix this in later versions.

All trees have to be strictly bifurcating - polytomies are not
tolerated. I hope to fix this in later versions.

The '.data' file written contains 3 columns of numbers: 1) the
difference between the cost of gains and the cost of losses, 2) the
number of gains inferred at that weighting, and 3) the number of losses
inferred at that weighting. If multiple characters or trees are
examined, the data for each are separated by blank lines. This is
useful for making graphs with Gnuplot, but not with other graphing
programs like DeltaGraph or Excel that prefer data to be in tabular
format. I have written a small utility in Java that will convert CRAP
'.data' files into a tabular format for use with these other programs.

For More Information, reporting bugs, etc.
------------------------------------------

For bug reporting, send email to Rick Ree. If the program absolutely
refuses to work for you, feel free to send me your data file and I will
try to get it to work. In the future I hope to rewrite it from scratch
in Java.
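
A sketch of the underlying calculation
--------------------------------------

To make the idea of reconstructing states under unequal costs of gains
and losses more concrete, here is a minimal sketch in C of a standard
Sankoff-style down-pass for one binary character on a strictly
bifurcating tree. It is NOT taken from the CRAP source: the Node
struct and the names subtree_cost and tree_cost are hypothetical,
chosen only for illustration, and the sketch returns only the minimum
total cost for one pair of costs (it does not count gains and losses
separately or write any files).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type for a strictly bifurcating tree.
   Tips have left == right == NULL and an observed state of 0 or 1. */
typedef struct Node {
    const struct Node *left, *right;
    int state;               /* used at tips only */
} Node;

#define INF 1e30             /* stands in for "impossible" */

static double min2(double a, double b) { return a < b ? a : b; }

/* Fill cost[s] with the minimum cost of the subtree rooted at n,
   given that n itself is assigned state s.
   gain = cost of a 0 -> 1 change, loss = cost of a 1 -> 0 change. */
static void subtree_cost(const Node *n, double gain, double loss,
                         double cost[2])
{
    double l[2], r[2];

    assert(n != NULL);
    if (n->left == NULL && n->right == NULL) {
        /* A tip can only have its observed state. */
        cost[0] = (n->state == 0) ? 0.0 : INF;
        cost[1] = (n->state == 1) ? 0.0 : INF;
        return;
    }
    assert(n->left != NULL && n->right != NULL);  /* no polytomies */
    subtree_cost(n->left, gain, loss, l);
    subtree_cost(n->right, gain, loss, r);

    /* Parent in state 0: each child either stays 0 or is gained. */
    cost[0] = min2(l[0], gain + l[1]) + min2(r[0], gain + r[1]);
    /* Parent in state 1: each child either stays 1 or is lost. */
    cost[1] = min2(loss + l[0], l[1]) + min2(loss + r[0], r[1]);
}

/* Minimum total cost of the character on the whole tree. */
double tree_cost(const Node *root, double gain, double loss)
{
    double c[2];
    subtree_cost(root, gain, loss, c);
    return min2(c[0], c[1]);
}
```

For the tree ((A,B),C) with A and B in state 1 and C in state 0,
tree_cost returns 1.0 when gains and losses both cost 1.0. Calling it
repeatedly while holding one cost at 1.0 and stepping the other
corresponds to the kind of sweep controlled by -mg, -ml, and -i.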