NEXUS CLASS LIBRARY

Class NxsDiscreteMatrix

Friends

class NxsAllelesBlock, class NxsCharactersBlock

Data Members

data, ncols, nrows

Member Functions

AddRows, AddState, AddState, CopyStatesFromFirstTaxon, DebugSaveMatrix, DuplicateRow, Flush, GetDiscreteDatum, GetNumStates, GetNumStates, GetObsNumStates, GetState, GetState, IsGap, IsGap, IsMissing, IsMissing, IsPolymorphic, IsPolymorphic, NxsDiscreteMatrix, ~NxsDiscreteMatrix, Reset, SetGap, SetGap, SetMissing, SetMissing, SetPolymorphic, SetPolymorphic, SetState, SetState

Class Description

Class providing storage for the discrete data types (dna, rna, nucleotide, standard, and protein) inside a DATA or CHARACTERS block. This class is also used to store the data for an ALLELES block. Maintains a matrix in which each cell is an object of the class NxsDiscreteDatum. NxsDiscreteDatum stores the state for a particular combination of taxon and character as an integer. Ordinarily, there will be a single state recorded for each taxon-character combination, but exceptions exist if there is polymorphism for a taxon-character combination, or if there is uncertainty about the state (e.g., in dna data, the data file might have contained an R or Y entry). Please consult the documentation for the NxsDiscreteDatum class for the details about how states are stored.

For data stored in an ALLELES block, rows of the matrix correspond to individuals and columns to loci. Each NxsDiscreteDatum must therefore store information about both genes at a single locus for a single individual in the case of diploid data. To do this, two macros HIWORD and LOWORD are used to divide up the unsigned value into two words. A maximum of 255 distinct allelic forms can be accommodated by this scheme, assuming at minimum a 32-bit architecture.

Because it is not known in advance how many rows are going to be necessary, the NxsDiscreteMatrix class provides the AddRows method, which expands the number of rows allocated for the matrix while preserving data already stored.
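The two-word packing described above for diploid ALLELES data can be pictured with a small sketch. The helper names PackAlleles, HiWordSketch, and LoWordSketch are hypothetical, and the 16-bit split point is an assumption made for illustration only; the library's own HIWORD and LOWORD macros, and the limit they impose on allele codes, may be defined differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers (not part of the library): pack two allele codes
// into one 32-bit unsigned value, one code per half.
const unsigned kHalfBits = 16;            // assumed split point
const std::uint32_t kHalfMask = 0xFFFFu;  // mask for one half

inline std::uint32_t PackAlleles(std::uint32_t first, std::uint32_t second)
{
    // Each code must fit in one half of the packed value.
    assert(first <= kHalfMask && second <= kHalfMask);
    return (first << kHalfBits) | second;
}

// Upper half: the first allele code.
inline std::uint32_t HiWordSketch(std::uint32_t v) { return v >> kHalfBits; }

// Lower half: the second allele code.
inline std::uint32_t LoWordSketch(std::uint32_t v) { return v & kHalfMask; }
```

Under these assumptions, PackAlleles(3, 7) yields a value whose upper half reads back as 3 and whose lower half reads back as 7.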

Key to symbols and colors

public, protected, private, A = abstract, C = constructor, D = destructor, I = inline, S = static, V = virtual, F = friend

 

Data Members
     NxsDiscreteDatum   **data
       
storage for the data
     unsigned   ncols
       
number of columns (characters) in the data matrix
     unsigned   nrows
       
number of rows (taxa) in the data matrix

 

Member Functions
    void   AddRows(unsigned nAddRows)
       
Allocates memory for nAddRows additional rows and updates the variable nrows. Data already stored in data is not destroyed; the newly-allocated rows are added at the bottom of the existing matrix.
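A minimal sketch of the growth behavior described for AddRows, assuming a simplified stand-in for the matrix. MatrixSketch is hypothetical: it uses std::vector in place of the raw NxsDiscreteDatum ** array and plain unsigned cells in place of NxsDiscreteDatum objects, so it shows only the bookkeeping, not the library's actual allocation code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the matrix; cells are plain unsigned values here.
struct MatrixSketch
{
    unsigned nrows;
    unsigned ncols;
    std::vector<std::vector<unsigned> > data;

    MatrixSketch(unsigned rows, unsigned cols)
        : nrows(rows), ncols(cols),
          data(rows, std::vector<unsigned>(cols, 0u)) {}

    // Append nAddRows zero-filled rows at the bottom and update nrows.
    // Rows already stored keep their contents.
    void AddRows(unsigned nAddRows)
    {
        data.resize(nrows + nAddRows, std::vector<unsigned>(ncols, 0u));
        nrows += nAddRows;
    }
};
```

With this model, a 2 x 3 matrix that receives AddRows(4) ends up with 6 rows, and any value written to the first two rows beforehand is still in place.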
    void   AddState(unsigned i, unsigned j, unsigned value)
       
Adds state directly to the NxsDiscreteDatum object at data[i][j]. Assumes data is non-NULL, i is in the range [0..nrows), and j is in the range [0..ncols). The value argument is assumed to be either zero or a positive integer. Calls private member function AddState to do the real work; look at the documentation for that function for additional details.
    void   AddState(NxsDiscreteDatum &d, unsigned value)
       
Adds an additional state to the array states of d. If states is NULL, allocates memory for two integers and assigns 1 to the first and value to the second. If states is non-NULL, allocates a new int array long enough to hold states already present plus the new one being added here, then deletes the old states array. Assumes that we are not trying to set either the missing state or the gap state here; the functions SetMissing or SetGap, respectively, should be used for those purposes. Also assumes that we do not want to overwrite the state. This function always adds states to those already present; use SetState to overwrite the state.
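The following sketch models how a states array might grow when a state is added. It is not the library's code: StatesSketch, NumStatesSketch, and AddStateSketch are hypothetical names, an empty vector stands in for a NULL states array, and the layout [count, state_1 .. state_count, polymorphism flag] is inferred from the descriptions on this page. Treating a gap datum like a missing one, and carrying the existing polymorphism flag forward, are further assumptions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of the states array: empty = NULL (missing), {0} = gap.
typedef std::vector<unsigned> StatesSketch;

// Number of states recorded; 0 for both missing and gap.
inline unsigned NumStatesSketch(const StatesSketch &s)
{
    return s.empty() ? 0u : s[0];
}

// Append value to the states already present without overwriting them.
inline void AddStateSketch(StatesSketch &s, unsigned value)
{
    const unsigned oldCount = NumStatesSketch(s);
    if (oldCount == 0) {
        // Nothing stored yet: two cells, count 1 followed by the state.
        s.assign(1, 1u);
        s.push_back(value);
        return;
    }
    StatesSketch grown;
    grown.push_back(oldCount + 1);                                  // new count
    grown.insert(grown.end(), s.begin() + 1, s.begin() + 1 + oldCount);
    grown.push_back(value);                                         // new state
    // Flag cell: keep the old flag if one existed, otherwise start at 0.
    grown.push_back(oldCount > 1 ? s.back() : 0u);
    s.swap(grown);
}
```

Starting from an empty array, adding state 2 gives [1, 2]; adding state 3 afterwards gives [2, 2, 3, 0] in this model.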
    void   CopyStatesFromFirstTaxon(unsigned i, unsigned j)
       
Sets state of taxon i and character j to state of first taxon for character j. Assumes i is in the range [0..nrows) and j is in the range [0..ncols). Also assumes data is non-NULL. Calls private function CopyFrom to do the actual work.
    void   DebugSaveMatrix(ostream &out, unsigned colwidth)
       
Performs a dump of the current contents of the data matrix stored in the variable data. Translates missing data elements to the '?' character and gap states to '-'; otherwise, calls GetState to provide the representation.
    unsigned   DuplicateRow(unsigned row, unsigned count, unsigned startCol, unsigned endCol)
       
Duplicates columns startCol to endCol in row row of the matrix. If additional storage is needed to accommodate the duplication, this is done automatically through the use of the AddRows method. Note that count includes the row already present, so if count is 10, then 9 more rows will actually be added to the matrix to make a total of 10 identical rows. The parameters startCol and endCol default to 0 and ncols, so if duplication of the entire row is needed, these need not be explicitly specified in the call to DuplicateRow. Return value is number of additional rows allocated to matrix (0 if no rows needed to be allocated). Assumes data is non-NULL, row is in the range [0..nrows), startCol is in the range [0..ncols), and endCol is either UINT_MAX, in which case it is reset to ncols - 1, or is in the range (startCol..ncols).
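The meaning of count and of the return value can be illustrated with a sketch. DuplicateRowSketch and GridSketch are hypothetical and use plain unsigned cells. The sketch assumes that the copies occupy the rows immediately following row, that endCol is inclusive, and that the grid grows only when it is too short; these details are assumptions, not a statement of how the library implements DuplicateRow.

```cpp
#include <cassert>
#include <vector>

typedef std::vector<std::vector<unsigned> > GridSketch;

// Hypothetical sketch: make rows row+1 .. row+count-1 copies of columns
// [startCol..endCol] of row. Returns the number of rows that had to be added.
inline unsigned DuplicateRowSketch(GridSketch &g, unsigned row, unsigned count,
                                   unsigned startCol, unsigned endCol)
{
    assert(row < g.size() && count >= 1);
    const unsigned ncols = static_cast<unsigned>(g[row].size());
    assert(startCol <= endCol && endCol < ncols);

    // Grow the grid only if the copies would run past the last row.
    unsigned added = 0;
    if (row + count > g.size()) {
        added = row + count - static_cast<unsigned>(g.size());
        g.resize(g.size() + added, std::vector<unsigned>(ncols, 0u));
    }

    // count includes the source row, so only count - 1 copies are written.
    for (unsigned i = 1; i < count; ++i)
        for (unsigned c = startCol; c <= endCol; ++c)
            g[row + i][c] = g[row][c];
    return added;
}
```

Calling the sketch on a one-row grid with count equal to 10 adds 9 rows and returns 9; calling it again with a count that already fits returns 0.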
    void   Flush()
       
Deletes all cells of data, setting data to NULL, and resets nrows and ncols to 0.
    NxsDiscreteDatum   &GetDiscreteDatum(unsigned i, unsigned j)
       
Assumes that data is non-NULL, i is in the range [0..nrows) and j is in the range [0..ncols). Returns reference to the NxsDiscreteDatum object at row i, column j of matrix.
    unsigned   GetNumStates(unsigned i, unsigned j)
       
Returns number of states for taxon i and character j. Assumes data is non-NULL, i is in the range [0..nrows), and j is in the range [0..ncols). Calls private member function GetNumStates to do the actual work.
    unsigned   GetNumStates(NxsDiscreteDatum &d)
       
Returns total number of states assigned to d. Returns 0 for both gap and missing states.
    unsigned   GetObsNumStates(unsigned j)
       
Returns number of states for character j over all taxa. Note: this function is rather slow, as it must walk through each taxon for the specified character, adding the states encountered to a set, then finally returning the size of the set. Thus, if this function is called often, it would be advisable to initialize an array using this function, then refer to the array subsequently. Assumes j is in the range [0..ncols) and data is non-NULL. Includes all taxa (i.e. there is no mechanism here for treating some taxa as deleted for a particular analysis). Missing and gap states are ignored.
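The caching advice above can be sketched as follows. The cell model is hypothetical: each cell is a vector listing the states observed, with an empty vector standing in for missing or gap data. ObsNumStatesSketch mirrors the set-based walk described for GetObsNumStates, and CacheObsNumStates (also a hypothetical name) fills an array once so that later lookups avoid the walk.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical cell model: empty = missing or gap, otherwise a list of states.
typedef std::vector<unsigned> CellSketch;
typedef std::vector<std::vector<CellSketch> > CellGridSketch;

// Slow path: walk every row of column j and count the distinct states seen.
inline unsigned ObsNumStatesSketch(const CellGridSketch &g, unsigned j)
{
    std::set<unsigned> seen;
    for (std::size_t i = 0; i < g.size(); ++i)
        seen.insert(g[i][j].begin(), g[i][j].end());
    return static_cast<unsigned>(seen.size());
}

// Compute the count once per column; later lookups are plain array reads.
inline std::vector<unsigned> CacheObsNumStates(const CellGridSketch &g,
                                               unsigned ncols)
{
    std::vector<unsigned> cache(ncols);
    for (unsigned j = 0; j < ncols; ++j)
        cache[j] = ObsNumStatesSketch(g, j);
    return cache;
}
```

The cache must be rebuilt whenever the matrix contents change, since nothing in this sketch tracks modifications.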
    unsigned   GetState(unsigned i, unsigned j, unsigned k)
       
Returns the kth state possessed by taxon i and character j. This taxon-character combination will have more than one state if there is ambiguity or polymorphism. Assumes that i is in the range [0..nrows) and j is in the range [0..ncols). Also assumes that at least one state is present (i.e., not the gap or missing state). Use the function GetNumStates to determine the number of states present. Assumes k is in the range [0..ns), where ns is the value returned by GetNumStates.
    unsigned   GetState(NxsDiscreteDatum &d, unsigned k)
       
Returns the internal unsigned representation of the state stored in d at position k of the array d.states. Assumes that the state is not the missing or gap state. Use IsMissing and IsGap prior to calling this function to ensure this function will succeed. Assumes that k is in the range [0..d.states[0]).
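The recommended calling pattern (check for missing and gap first, then loop over the state count) can be sketched on a simplified model. All names ending in Sketch are hypothetical. The model assumes an empty vector for a NULL states array, {0} for the gap state, and state k stored at index k + 1, which is inferred from the descriptions on this page rather than taken from the library source.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of the states array: empty = NULL (missing), {0} = gap.
typedef std::vector<unsigned> StatesSketch;

inline bool IsMissingSketch(const StatesSketch &s) { return s.empty(); }
inline bool IsGapSketch(const StatesSketch &s) { return !s.empty() && s[0] == 0; }
inline unsigned NumStatesSketch(const StatesSketch &s) { return s.empty() ? 0u : s[0]; }

// kth state, k in [0..count); cell 0 holds the count, so states start at 1.
inline unsigned GetStateSketch(const StatesSketch &s, unsigned k)
{
    assert(!IsMissingSketch(s) && !IsGapSketch(s));
    assert(k < s[0]);
    return s[k + 1];
}

// Collect every state of a datum; missing and gap yield an empty list.
inline std::vector<unsigned> AllStatesSketch(const StatesSketch &s)
{
    std::vector<unsigned> out;
    if (IsMissingSketch(s) || IsGapSketch(s))
        return out;
    for (unsigned k = 0; k < NumStatesSketch(s); ++k)
        out.push_back(GetStateSketch(s, k));
    return out;
}
```

Guarding with the missing and gap checks before the loop keeps the preconditions of the state accessor satisfied.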
    bool   IsGap(unsigned i, unsigned j)
       
Returns true if the state for taxon i, character j, is set to the gap symbol, false otherwise. Assumes data is non-NULL, i is in the range [0..nrows) and j is in the range [0..ncols).
    bool   IsGap(NxsDiscreteDatum &d)
       
Returns true if the gap state is stored, otherwise returns false. Note: returns false if this datum represents missing data (often the gap state is equated with missing data, but the distinction is made here).
    bool   IsMissing(unsigned i, unsigned j)
       
Returns true if the state for taxon i, character j, is set to the missing data symbol, false otherwise. Assumes i is in the range [0..nrows) and j is in the range [0..ncols).
    bool   IsMissing(NxsDiscreteDatum &d)
       
Returns true if the missing state is stored, false otherwise. Note that this function returns false if the gap state is stored (often the gap state is equated with missing data, but the distinction is maintained here).
    bool   IsPolymorphic(unsigned i, unsigned j)
       
Returns true if character j is polymorphic in taxon i, false otherwise. Assumes data is non-NULL, i is in the range [0..nrows) and j is in the range [0..ncols).
    bool   IsPolymorphic(NxsDiscreteDatum &d)
       
Returns true if the number of states is greater than 1 and polymorphism has been specified. Returns false if the state stored is the missing state, the gap state, or if the number of states is 1.
C     NxsDiscreteMatrix(unsigned rows, unsigned cols)
       
Initializes nrows to rows and ncols to cols. In addition, memory is allocated for data (each element of the matrix data is a NxsDiscreteDatum object, which can do its own initialization).
D     ~NxsDiscreteMatrix()
       
Deletes memory allocated in the constructor for data member data.
    void   Reset(unsigned rows, unsigned cols)
       
Deletes all cells of data and reallocates memory to create a new matrix object with nrows = rows and ncols = cols. Assumes rows and cols are both greater than 0.
    void   SetGap(unsigned i, unsigned j)
       
Sets state stored at data[i][j] to the gap state. Assumes i is in the range [0..nrows) and j is in the range [0..ncols). Calls the private SetGap member function to do the actual work.
    void   SetGap(NxsDiscreteDatum &d)
       
Assigns the gap state to d, erasing any previously stored information. The gap state is designated internally as a states array one element long, with the single element set to the value 0.
    void   SetMissing(unsigned i, unsigned j)
       
Sets state stored at data[i][j] to the missing state. Assumes data is non-NULL, i is in the range [0..nrows) and j is in the range [0..ncols). Calls the private member function SetMissing to do the actual work.
    void   SetMissing(NxsDiscreteDatum &d)
       
Assigns the missing state to d, erasing any previously stored information. The missing state is stored internally as a NULL value for the states array.
    void   SetPolymorphic(unsigned i, unsigned j, unsigned value)
       
Sets polymorphism state of taxon i and character j to value. Pass 1 for value if the taxon in row i is polymorphic for the character in column j, or 0 if it is uncertain which of the stored states applies. Assumes data is non-NULL, i is in the range [0..nrows) and j is in the range [0..ncols). Also assumes that the number of states stored is greater than 1. Calls private member function SetPolymorphic to do the actual work.
    void   SetPolymorphic(NxsDiscreteDatum &d, unsigned value)
       
Sets the polymorphism cell (last cell in d.states) to value. Warning: has no effect if there are fewer than 2 states stored!
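A short sketch of the polymorphism cell, assuming the same simplified model of the states array used for illustration on this page: [count, state_1 .. state_count, flag], with the flag cell present only when more than one state is stored. The function names are hypothetical, and the sketch only mirrors the behavior described here, including the no-op for fewer than 2 states.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of the states array: empty = NULL (missing), {0} = gap.
typedef std::vector<unsigned> StatesSketch;

inline unsigned NumStatesSketch(const StatesSketch &s)
{
    return s.empty() ? 0u : s[0];
}

// Write value into the last cell, but only when a flag cell exists.
inline void SetPolymorphicSketch(StatesSketch &s, unsigned value)
{
    if (NumStatesSketch(s) < 2)
        return;  // fewer than 2 states: no flag cell, so nothing changes
    s.back() = value;
}

// True only for multi-state data whose flag cell is nonzero.
inline bool IsPolymorphicSketch(const StatesSketch &s)
{
    return NumStatesSketch(s) > 1 && s.back() != 0;
}
```

In this model a single-state array such as [1, 4] is left untouched by SetPolymorphicSketch, while [2, 0, 1, 0] becomes [2, 0, 1, 1] when the flag is set to 1.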
    void   SetState(unsigned i, unsigned j, unsigned value)
       
Sets state of taxon i and character j to value. Assumes data is non-NULL, i is in the range [0..nrows) and j is in the range [0..ncols). Assumes that this function will not be called if there is missing data or the state is the gap state, in which case the functions SetMissing or SetGap, respectively, should be called instead. Calls the private member function SetState to do the actual work.
    void   SetState(NxsDiscreteDatum &d, unsigned value)
       
Assigns value to the 2nd cell in d.states (1st cell in d.states array is set to 1 to indicate that there is only one state). Warning: if one or more states (including the gap state) are already assigned to d, they will be forgotten. Use the function AddState if you want to preserve states already stored in d. Assumes the state being set is neither the missing state nor the gap state; use SetMissing or SetGap, respectively, for those.
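The three overwriting setters can be contrasted in a small sketch. It assumes a simplified model in which an empty vector stands in for a NULL states array; the function names are hypothetical and the code only reflects the representations described on this page (NULL for missing, a one-element array holding 0 for gap, and [1, value] for a single state).

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of the states array: empty = NULL (missing), {0} = gap.
typedef std::vector<unsigned> StatesSketch;

// Missing: no array at all (modeled here as an empty vector).
inline void SetMissingSketch(StatesSketch &s) { s.clear(); }

// Gap: one cell holding 0.
inline void SetGapSketch(StatesSketch &s) { s.assign(1, 0u); }

// Single state: count 1 in the first cell, the state in the second.
// Whatever was stored before is discarded.
inline void SetStateSketch(StatesSketch &s, unsigned value)
{
    s.clear();
    s.push_back(1u);
    s.push_back(value);
}
```

Each setter replaces the previous contents entirely, which is the difference from AddState, where existing states are kept.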