NEXUS CLASS LIBRARY

Class NxsCharactersBlock

Friends

class NxsAssumptionsBlock

Enums

DataTypesEnum

Data Members

activeChar, activeTaxon, assumptionsBlock, charLabels, charPos, charStates, datatype, eliminated, equates, formerly_datablock, gap, interleaving, labels, matchchar, matrix, missing, nchar, ncharTotal, newchar, newtaxa, ntax, ntaxTotal, respectingCase, symbols, taxa, taxonPos, tokens, transposing

Member Functions

ApplyDelset, ApplyExset, ApplyIncludeset, ApplyRestoreset, BuildCharPosArray, CharLabelToNumber, Consume, DebugShowMatrix, DeleteTaxon, ExcludeCharacter, GetActiveCharArray, GetActiveTaxonArray, GetCharLabel, GetCharPos, GetDataType, GetGapSymbol, GetInternalRepresentation, GetMatchcharSymbol, GetMaxObsNumStates, GetMissingSymbol, GetNChar, GetNCharTotal, GetNTax, GetNTaxTotal, GetNumActiveChar, GetNumActiveTaxa, GetNumEliminated, GetNumEquates, GetNumMatrixCols, GetNumMatrixRows, GetNumStates, GetObsNumStates, GetOrigCharIndex, GetOrigCharNumber, GetOrigTaxonIndex, GetOrigTaxonNumber, GetState, GetStateLabel, GetSymbols, GetTaxonLabel, GetTaxPos, HandleCharlabels, HandleCharstatelabels, HandleDimensions, HandleEliminate, HandleEndblock, HandleFormat, HandleMatrix, HandleNextState, HandleStatelabels, HandleStdMatrix, HandleTaxlabels, HandleTokenState, HandleTransposedMatrix, IncludeCharacter, IsActiveChar, IsActiveTaxon, IsDeleted, IsEliminated, IsExcluded, IsGapState, IsInSymbols, IsInterleave, IsLabels, IsMissingState, IsPolymorphic, IsRespectCase, IsTokens, IsTranspose, NxsCharactersBlock, ~NxsCharactersBlock, PositionInSymbols, Read, Report, Reset, ResetSymbols, RestoreTaxon, ShowStateLabels, ShowStates, TaxonLabelToNumber, WriteStates

Class Description

This class handles reading and storage for the NEXUS block CHARACTERS. It overrides the member functions Read and Reset, which are abstract virtual functions in the base class NxsBlock.
The issue of bookkeeping demands a careful explanation. Users are allowed to control the number of characters analyzed either by "eliminating" or "excluding" characters. Characters can be eliminated (by using the ELIMINATE command) at the time of execution of the data file, but not thereafter. Characters can, however, be excluded at any time after the data are read. No storage is provided for eliminated characters, whereas excluded characters must be stored because at any time they could be restored to active status. Because one can depend on eliminated characters continuing to be eliminated, it would be inefficient to constantly have to check whether a character has been eliminated. Hence, the characters are renumbered so that one can efficiently traverse the entire range of non-eliminated characters. The original range of characters will hereafter be denoted [0..ncharTotal), whereas the new, reduced range will be denoted [0..nchar). The two ranges coincide exactly if ncharTotal = nchar (i.e., no ELIMINATE command was specified in the CHARACTERS block).
The possibility of eliminating and excluding characters creates a confusing situation that is exacerbated by the fact that character indices used in the code begin at 0 whereas character numbers in the data file begin at 1. The convention used hereafter will be to specify "character number k" when discussing 1-offset character numbers in the data file and either "character index k" or simply "character k" when discussing 0-offset character indices. There are several functions (and data structures) that keep track of the correspondence between character indices in the stored data matrix and character numbers in the original data file.
The array charPos can be used to find the index of one of the original characters in the matrix. The function GetCharPos provides public access to the protected charPos array. For example, if character 9 (= character number 10) was the only one eliminated, GetCharPos(9) would return UINT_MAX, indicating that character 9 does not now exist. GetCharPos(10) returns 9, indicating that original character 10 (= character number 11 in the data file) corresponds to character 9 in the stored data matrix. All public functions in which a character index must be supplied (such as GetInternalRepresentation) assume that the index is the current position of the character in the data matrix. This allows one to quickly traverse the data matrix without having to constantly check whether or not a character was eliminated. Note that GetNChar returns nchar, not ncharTotal, and this function should be used to obtain the end point for a traversal of characters in the matrix. Other functions requiring a (current) character index are:

 GetInternalRepresentation
 GetNumStates
 GetObsNumStates
 GetOrigCharIndex
 GetOrigCharNumber
 GetState
 HandleNextState
 HandleTokenState
 IsGapState
 IsMissingState
 IsPolymorphic
 ShowStateLabels
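The renumbering described above can be sketched independently of the library. The following is a minimal, self-contained illustration and not the library's own BuildCharPosArray code; the function name buildCharPos and the use of std::vector and std::set are assumptions made for the example.

```cpp
#include <cassert>
#include <climits>
#include <set>
#include <vector>

// Hypothetical stand-in for the charPos bookkeeping: maps each original
// character index in [0..ncharTotal) to its current column index, or to
// UINT_MAX if that character was eliminated.
std::vector<unsigned> buildCharPos(unsigned ncharTotal,
                                   const std::set<unsigned> &eliminated)
	{
	// every eliminated index must be a valid original index
	assert(eliminated.empty() || *eliminated.rbegin() < ncharTotal);
	std::vector<unsigned> pos(ncharTotal, UINT_MAX);
	unsigned next = 0;	// next free column in the reduced matrix
	for (unsigned k = 0; k < ncharTotal; k++)
		{
		if (eliminated.count(k) == 0)
			pos[k] = next++;
		}
	return pos;
	}
```

With 12 original characters and only character 9 eliminated, this sketch reproduces the example in the text: element 9 holds UINT_MAX and element 10 holds 9.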
The function IsEliminated is exceptional in requiring (by necessity) the original character index. The function GetOrigCharIndex returns the original character index for any current character index. This is useful only when outputting information that will be seen by the user, and in this case, it is really the character number that should be output. To get the original character number, either add 1 to GetOrigCharIndex or call the GetOrigCharNumber function (which simply returns GetOrigCharIndex + 1). A character may be excluded by calling the function ExcludeCharacter and providing the current character index or by calling the function ApplyExset and supplying an exclusion set comprising original character indices. These functions manipulate a bool array, activeChar, which can be queried using one of two functions: IsActiveChar or IsExcluded. The array activeChar is nchar elements long, so IsActiveChar and IsExcluded both accept only current character indices. Thus, a normal loop through all characters in the data matrix should look something like this:
 for(unsigned j = 0; j < nchar; j++)
 	{
 	if (IsExcluded(j))
 		continue;
 	.
 	.
 	.
 	}
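As a sketch of how the loop above combines with the original-index lookup when reporting to the user, the following stand-alone example collects the 1-offset data-file numbers of all active characters. The helper names invertCharPos and activeCharNumbers are hypothetical, and plain vectors stand in for the class's charPos and activeChar arrays.

```cpp
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical helper: inverts a charPos-style map so that orig[j] is the
// original index of the character now stored in column j.
std::vector<unsigned> invertCharPos(const std::vector<unsigned> &charPos)
	{
	std::vector<unsigned> orig;
	for (unsigned k = 0; k < charPos.size(); k++)
		{
		if (charPos[k] != UINT_MAX)
			{
			assert(charPos[k] == orig.size());	// columns are assigned in order
			orig.push_back(k);
			}
		}
	return orig;
	}

// Collects the 1-offset data-file numbers of all characters that are
// stored in the matrix and not currently excluded.
std::vector<unsigned> activeCharNumbers(const std::vector<unsigned> &charPos,
                                        const std::vector<bool> &activeChar)
	{
	std::vector<unsigned> orig = invertCharPos(charPos);
	assert(activeChar.size() == orig.size());
	std::vector<unsigned> numbers;
	for (unsigned j = 0; j < orig.size(); j++)
		{
		if (!activeChar[j])
			continue;	// excluded: skip, as in the loop above
		numbers.push_back(orig[j] + 1);	// index -> data-file number
		}
	return numbers;
	}
```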
A corresponding set of data structures and functions exists to provide the same services for taxa. Thus, ntax holds the current number of taxa, whereas ntaxTotal holds the number of taxa specified in the TAXA block. If data is provided in the MATRIX command for all taxa listed in the TAXA block, ntax will be equal to ntaxTotal. If data is not provided for some of the taxa, the ones left out are treated just like eliminated characters. The function GetTaxPos can be used to query the taxonPos array, which behaves like the charPos array does for characters: UINT_MAX for element i means that the taxon whose original index was i has been eliminated and no data is stored for it in the matrix. Otherwise, GetTaxPos(i) returns the current index corresponding to the taxon with an original index of i. The function GetNTax returns ntax, whereas GetNTaxTotal must be used to gain access to ntaxTotal (but this is seldom necessary). The functions GetOrigTaxonIndex and GetOrigTaxonNumber behave like their character counterparts, GetOrigCharIndex and GetOrigCharNumber. Like characters, taxa can be temporarily inactivated so that they do not participate in any analyses until they are reactivated by the user. Inactivation of a taxon is referred to as deleting the taxon, whereas restoring a taxon means reactivating it. Thus, the ApplyDelset, DeleteTaxon, and RestoreTaxon functions correspond to the ApplyExset, ExcludeCharacter, and IncludeCharacter functions for characters. To query whether a taxon is currently deleted, use either IsActiveTaxon or IsDeleted. A normal loop across all active taxa can be constructed as follows:
 for (unsigned i = 0; i < ntax; i++)
 	{
 	if (IsDeleted(i))
 		continue;
 	.
 	.
 	.
 	}
Below is a table showing the correspondence between the elements of a CHARACTERS block in a NEXUS file and the variables and member functions of the NxsCharactersBlock class that can be used to access each piece of information stored. Items in parentheses should be viewed as "see also" items.
 NEXUS         Command        Data           Member
 Command       Attribute      Member         Functions
 ---------------------------------------------------------------------
 DIMENSIONS    NEWTAXA        newtaxa
 
               NTAX           ntax           GetNTax
                              (ntaxTotal)    (GetNumMatrixRows)
 
               NCHAR          nchar          GetNChar
                              (ncharTotal)   (GetNumMatrixCols)
 
 FORMAT        DATATYPE       datatype       GetDataType
 
               RESPECTCASE    respectingCase IsRespectCase
 
               MISSING        missing        GetMissingSymbol
 
               GAP            gap            GetGapSymbol
 
               SYMBOLS        symbols        GetSymbols
 
               EQUATE         equates        GetEquateKey
                                             GetEquateValue
                                             GetNumEquates
 
               MATCHCHAR      matchchar      GetMatchcharSymbol
 
               (NO)LABELS     labels         IsLabels
 
               TRANSPOSE      transposing    IsTranspose
 
               INTERLEAVE     interleaving   IsInterleave
 
               ITEMS          (Note: only STATES implemented)
 
               STATESFORMAT   (Note: only STATESPRESENT implemented)
 
               (NO)TOKENS     tokens         IsTokens
 
 ELIMINATE                    eliminated     IsEliminated
                                             GetNumEliminated
 
 MATRIX                       matrix         GetState
                                             GetInternalRepresentation
                                             GetNumStates
                                             GetNumMatrixRows
                                             GetNumMatrixCols
                                             IsPolymorphic

Key to symbols and colors

public, protected, private, A = abstract, C = constructor, D = destructor, I = inline, S = static, V = virtual, F = friend

 

Enums
enum DataTypesEnum
  standard = 1
    indicates matrix holds characters with arbitrarily-assigned, discrete states, such as discrete morphological data
  dna = 2
    indicates matrix holds DNA sequences (states A, C, G, T)
  rna = 3
    indicates matrix holds RNA sequences (states A, C, G, U)
  nucleotide = 4
    indicates matrix holds nucleotide sequences
  protein = 5
    indicates matrix holds amino acid sequences
  continuous = 6
    indicates matrix holds continuous data

 

Data Members
     bool   *activeChar
       
`activeChar[i]' true if character i not excluded; i is in range [0..nchar)
     bool   *activeTaxon
       
`activeTaxon[i]' true if taxon i not deleted; i is in range [0..ntax)
     NxsAssumptionsBlock   *assumptionsBlock
       
pointer to the ASSUMPTIONS block in which exsets, taxsets and charsets are stored
     NxsStringVector   charLabels
       
storage for character labels (if provided)
     unsigned   *charPos
       
maps character numbers in the data file to column numbers in matrix (necessary if some characters have been eliminated)
     NxsStringVectorMap   charStates
       
storage for character state labels (if provided)
     DataTypesEnum   datatype
       
flag variable (see DataTypesEnum)
     NxsUnsignedSet   eliminated
       
set of (0-offset) character indices that have been eliminated (will remain empty if no ELIMINATE command encountered)
     NxsStringMap   equates
       
list of associations defined by EQUATE attribute of FORMAT command
     bool   formerly_datablock
       
true if this object was originally read in as a DATA block rather than as a CHARACTERS block, false otherwise
     char   gap
       
gap symbol for use with molecular data
     bool   interleaving
       
indicates matrix will be in interleaved format
     bool   labels
       
indicates whether or not labels will appear on left side of matrix
     char   matchchar
       
match symbol to use in matrix
     NxsDiscreteMatrix   *matrix
       
storage for discrete data
     char   missing
       
missing data symbol
     unsigned   nchar
       
number of columns in matrix (same as ncharTotal unless some characters were eliminated, in which case ncharTotal > nchar)
     unsigned   ncharTotal
       
total number of characters (same as nchar unless some characters were eliminated, in which case ncharTotal > nchar)
     bool   newchar
       
true unless CHARLABELS or CHARSTATELABELS command read
     bool   newtaxa
       
true if NEWTAXA keyword encountered in DIMENSIONS command
     unsigned   ntax
       
number of rows in matrix (same as ntaxTotal unless fewer taxa appeared in CHARACTERS MATRIX command than were specified in the TAXA block, in which case ntaxTotal > ntax)
     unsigned   ntaxTotal
       
number of taxa (same as ntax unless fewer taxa appeared in CHARACTERS MATRIX command than were specified in the TAXA block, in which case ntaxTotal > ntax)
     bool   respectingCase
       
if true, RESPECTCASE keyword specified in FORMAT command
     char   *symbols
       
list of valid character state symbols
     NxsTaxaBlock   *taxa
       
pointer to the TAXA block in which taxon labels are stored
     unsigned   *taxonPos
       
maps taxon numbers in the data file to row numbers in matrix (necessary if fewer taxa appear in CHARACTERS block MATRIX command than are specified in the TAXA block)
     bool   tokens
       
if false, data matrix entries must be single symbols; if true, multicharacter entries are allowed
     bool   transposing
       
indicates matrix will be in transposed format

 

Member Functions
    unsigned   ApplyDelset(NxsUnsignedSet &delset)
       
Deletes (i.e., excludes from further analyses) taxa whose indices are contained in the set delset. The taxon indices refer to original taxon indices, not current indices (originals will equal current ones if number of taxa in TAXA block equals number of taxa in MATRIX command). Returns the number of taxa actually deleted (some may have already been deleted)
    unsigned   ApplyExset(NxsUnsignedSet &exset)
       
Excludes characters whose indices are contained in the set exset. The indices supplied should refer to the original character indices, not current character indices. Returns number of characters actually excluded (some may have already been excluded).
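The index translation this implies (original indices in the set, current indices in activeChar) can be sketched as follows. This is an illustration under stated assumptions rather than the library's implementation: the free function applyExset is hypothetical, vectors stand in for the charPos and activeChar arrays, and silently skipping eliminated characters is an assumption of the sketch.

```cpp
#include <cassert>
#include <climits>
#include <set>
#include <vector>

// Hypothetical illustration of ApplyExset-style semantics: exset holds
// ORIGINAL character indices; charPos translates them to current indices;
// activeChar is indexed by current index. Returns the number of characters
// whose status actually changed from active to excluded.
unsigned applyExset(const std::set<unsigned> &exset,
                    const std::vector<unsigned> &charPos,
                    std::vector<bool> &activeChar)
	{
	unsigned numExcluded = 0;
	for (std::set<unsigned>::const_iterator it = exset.begin(); it != exset.end(); ++it)
		{
		assert(*it < charPos.size());
		unsigned j = charPos[*it];
		if (j == UINT_MAX)
			continue;	// assumption: eliminated characters are skipped
		assert(j < activeChar.size());
		if (activeChar[j])
			{
			activeChar[j] = false;
			numExcluded++;	// count only characters not already excluded
			}
		}
	return numExcluded;
	}
```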
    unsigned   ApplyIncludeset(NxsUnsignedSet &inset)
       
Includes characters whose indices are contained in the set inset. The indices supplied should refer to the original character indices, not current character indices.
    unsigned   ApplyRestoreset(NxsUnsignedSet &restoreset)
       
Restores (i.e., includes in further analyses) taxa whose indices are contained in the set restoreset. The taxon indices refer to original taxon indices, not current indices (originals will equal current ones if number of taxa in TAXA block equals number of taxa in MATRIX command).
    void   BuildCharPosArray(bool check_eliminated)
       
Use to allocate memory for (and initialize) charPos array, which keeps track of the original character index in cases where characters have been eliminated. This function is called by HandleEliminate in response to encountering an ELIMINATE command in the data file, and this is probably the only place where BuildCharPosArray should be called with check_eliminated true. BuildCharPosArray is also called in HandleMatrix, HandleCharstatelabels, HandleStatelabels, and HandleCharlabels.
V   unsigned   CharLabelToNumber(NxsString s)
       
Converts a character label to a 1-offset number corresponding to the character's position within charLabels. This method overrides the virtual function of the same name in the NxsBlock base class. If s is not a valid character label, returns the value 0.
    void   Consume(NxsCharactersBlock &other)
       
Transfers all data from other to this object, leaving other completely empty. Used to convert a NxsDataBlock object to a NxsCharactersBlock object in programs where it is desirable to just have a NxsCharactersBlock for storage but also allow users to enter the information in the form of the deprecated NxsDataBlock. This function does not make a copy of such things as the data matrix, instead just transferring the pointer to that object from other to this. This is why it was named Consume rather than CopyFrom.
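The pointer-transfer idea can be shown with a deliberately tiny stand-in. The struct MatrixOwner below is hypothetical and owns only one heap-allocated vector and a count; it is meant to illustrate the "transfer, do not copy" semantics, not the actual set of members that Consume moves.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical miniature of an object that owns a heap-allocated matrix.
struct MatrixOwner
	{
	std::vector<int> *matrix;	// owned; may be NULL
	unsigned nchar;

	MatrixOwner() : matrix(NULL), nchar(0) {}
	~MatrixOwner() { delete matrix; }

	// Takes over other's storage without copying it, leaving other empty.
	void consume(MatrixOwner &other)
		{
		assert(this != &other);
		delete matrix;			// release whatever this object held before
		matrix = other.matrix;	// transfer the pointer, not the data
		nchar = other.nchar;
		other.matrix = NULL;	// other no longer owns anything
		other.nchar = 0;
		}

private:
	// copying would lead to a double delete, so it is disabled
	MatrixOwner(const MatrixOwner &);
	MatrixOwner &operator=(const MatrixOwner &);
	};
```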
V   void   DebugShowMatrix(ostream &out, bool use_matchchar, char *marginText)
       
Provides a dump of the contents of the matrix variable. Useful for testing whether data is being read as expected. If marginText is NULL, matrix output is placed flush left. If each line of output should be prefaced with a tab character, specify "\t" for marginText.
I   void   DeleteTaxon(unsigned i)
       
Deletes taxon whose 0-offset current index is i. If taxon has already been deleted, this function has no effect.
I   void   ExcludeCharacter(unsigned i)
       
Excludes character whose 0-offset current index is i. If character has already been excluded, this function has no effect.
I   bool   *GetActiveCharArray()
       
Returns activeChar data member (pointer to first element of the activeChar array). Access to this protected data member is necessary in certain circumstances, such as when a NxsCharactersBlock object is stored in another class, and that other class needs direct access to the activeChar array even though it is not derived from NxsCharactersBlock.
I   bool   *GetActiveTaxonArray()
       
Returns activeTaxon data member (pointer to first element of the activeTaxon array). Access to this protected data member is necessary in certain circumstances, such as when a NxsCharactersBlock object is stored in another class, and that other class needs direct access to the activeTaxon array even though it is not derived from NxsCharactersBlock.
I   NxsString   GetCharLabel(unsigned i)
       
Returns label for character i, if a label has been specified. If no label was specified, returns string containing a single blank (i.e., " ").
I   unsigned   GetCharPos(unsigned origCharIndex)
       
Returns current index of character in matrix. This may differ from the original index if some characters were removed using an ELIMINATE command. For example, character number 9 in the original data matrix may now be at position 8 if the original character 8 was eliminated. The parameter origCharIndex is assumed to range from 0 to ncharTotal - 1.
I   unsigned   GetDataType()
       
Returns value of datatype.
I   char   GetGapSymbol()
       
Returns the gap symbol currently in effect. If no gap symbol specified, returns '\0'.
I   int   GetInternalRepresentation(unsigned i, unsigned j, unsigned k)
       
Returns internal representation of the state for taxon i, character j. In the normal situation, k is 0 meaning there is only one state with no uncertainty or polymorphism. If there are multiple states, specify a number in the range [0..n) where n is the number of states returned by the GetNumStates function. Use the IsPolymorphic function to determine whether the multiple states correspond to uncertainty in state assignment or polymorphism in the taxon. The value returned from this function is one of the following:
  • -3 means gap state (see note below)
  • -2 means missing state (see note below)
  • an integer 0 or greater is internal representation of a state
Note: gap and missing states are actually represented internally in a different way; for a description of the actual internal representation of states, see the documentation for NxsDiscreteDatum.
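A sketch of how a caller might turn the values described above into text follows. It is not library code: the function cellToString is hypothetical, the vector states stands for the values obtained with k = 0, 1, ..., and the sketch assumes that a gap or missing cell reports that value as its only state. The delimiters follow the convention given for ShowStateLabels (parentheses for polymorphism, curly brackets for uncertainty).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Values quoted in the description above for gap and missing states.
const int kGapState = -3;
const int kMissingState = -2;

// Hypothetical helper: renders the states of one matrix cell as text.
// symbols, gap and missing play the roles of the like-named data members.
std::string cellToString(const std::vector<int> &states, bool polymorphic,
                         const std::string &symbols, char gap, char missing)
	{
	assert(!states.empty());
	if (states[0] == kGapState)
		return std::string(1, gap);
	if (states[0] == kMissingState)
		return std::string(1, missing);
	std::string out;
	for (unsigned k = 0; k < states.size(); k++)
		{
		assert(states[k] >= 0 && (unsigned)states[k] < symbols.size());
		out += symbols[states[k]];	// state code -> symbol
		}
	if (states.size() == 1)
		return out;	// single state: no delimiters needed
	// parentheses for polymorphism, curly brackets for uncertainty
	return polymorphic ? "(" + out + ")" : "{" + out + "}";
	}
```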
I   char   GetMatchcharSymbol()
       
Returns the matchchar symbol currently in effect. If no matchchar symbol specified, returns '\0'.
V   unsigned   GetMaxObsNumStates()
       
Returns the maximum observed number of states for any character. Note: this function is rather slow, as it must walk through each row of each column, adding the states encountered to a set, then finally returning the size of the set. Thus, if this function is called often, it would be advisable to initialize an array using this function, then refer to the array subsequently.
I   char   GetMissingSymbol()
       
Returns the missing data symbol currently in effect. If no missing data symbol specified, returns '\0'.
I   unsigned   GetNChar()
       
Returns the value of nchar.
I   unsigned   GetNCharTotal()
       
Returns the value of ncharTotal.
I   unsigned   GetNTax()
       
Returns the value of ntax.
I   unsigned   GetNTaxTotal()
       
Returns the value of ntaxTotal.
    unsigned   GetNumActiveChar()
       
Performs a count of the number of characters for which activeChar array reports true.
    unsigned   GetNumActiveTaxa()
       
Performs a count of the number of taxa for which activeTaxon array reports true.
I   unsigned   GetNumEliminated()
       
Returns the number of characters eliminated with the ELIMINATE command.
I   unsigned   GetNumEquates()
       
Returns the number of stored equate associations.
I   unsigned   GetNumMatrixCols()
       
Returns the number of actual columns in matrix. This number is equal to nchar, but can be smaller than ncharTotal since the user could have eliminated some of the characters.
I   unsigned   GetNumMatrixRows()
       
Returns the number of actual rows in matrix. This number is equal to ntax, but can be smaller than ntaxTotal since the user did not have to provide data for all taxa specified in the TAXA block.
I   unsigned   GetNumStates(unsigned i, unsigned j)
       
Returns the number of states for taxon i, character j.
IV   unsigned   GetObsNumStates(unsigned j)
       
Returns the number of states for character j over all taxa. Note: this function is rather slow, as it must walk through each row, adding the states encountered to a set, then finally returning the size of the set. Thus, if this function is called often, it would be advisable to initialize an array using this function, then refer to the array subsequently.
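The caching advice can be sketched as below. The type StateMatrix and both function names are hypothetical stand-ins (one state code per cell, no gap or missing handling); the point is only that the per-column walk is done once and the results are then read from an array.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for the matrix: rows are taxa, columns are
// characters, one state code per cell.
typedef std::vector<std::vector<int> > StateMatrix;

// Walks every row of column j, collecting the distinct states in a set.
unsigned obsNumStates(const StateMatrix &m, unsigned j)
	{
	std::set<int> seen;
	for (unsigned i = 0; i < m.size(); i++)
		{
		assert(j < m[i].size());
		seen.insert(m[i][j]);
		}
	return (unsigned)seen.size();
	}

// Computes the count once per column so that later lookups are simple
// array accesses instead of repeated walks through the matrix.
std::vector<unsigned> cacheObsNumStates(const StateMatrix &m, unsigned nchar)
	{
	std::vector<unsigned> cache(nchar);
	for (unsigned j = 0; j < nchar; j++)
		cache[j] = obsNumStates(m, j);
	return cache;
	}
```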
    unsigned   GetOrigCharIndex(unsigned j)
       
Returns the original character index in the range [0..ncharTotal). Will be equal to j unless some characters were eliminated.
I   unsigned   GetOrigCharNumber(unsigned j)
       
Returns the original character number (used in the NEXUS data file) in the range [1..ncharTotal]. Will be equal to j + 1 unless some characters were eliminated.
    unsigned   GetOrigTaxonIndex(unsigned i)
       
Returns the original taxon index in the range [0..ntaxTotal). Will be equal to i unless data was not provided for some taxa listed in a preceding TAXA block.
I   unsigned   GetOrigTaxonNumber(unsigned i)
       
Returns the original taxon number (used in the NEXUS data file) in the range [1..ntaxTotal]. Will be equal to i + 1 unless data was not provided for some taxa listed in a preceding TAXA block.
I   char   GetState(unsigned i, unsigned j, unsigned k)
       
Returns symbol from symbols list representing the state for taxon i and character j. The normal situation in which there is only one state with no uncertainty or polymorphism is represented by k = 0. If there are multiple states, specify a number in the range [0..n) where n is the number of states returned by the GetNumStates function. Use the IsPolymorphic function to determine whether the multiple states correspond to uncertainty in state assignment or polymorphism in the taxon. Assumes symbols is non-NULL.
    NxsString   GetStateLabel(unsigned i, unsigned j)
       
Returns label for character state j at character i, if a label has been specified. If no label was specified, returns string containing a single blank (i.e., " ").
I   char   *GetSymbols()
       
Returns data member symbols. Warning: returned value may be NULL.
I   NxsString   GetTaxonLabel(unsigned i)
       
Returns label for taxon number i (i ranges from 0 to ntax - 1).
I   unsigned   GetTaxPos(unsigned origTaxonIndex)
       
Returns current index of taxon in matrix. This may differ from the original index if some taxa were listed in the TAXA block but not in the DATA or CHARACTERS block. The parameter origTaxonIndex is assumed to range from 0 to ntaxTotal - 1.
    void   HandleCharlabels(NxsToken &token)
       
Called when CHARLABELS command needs to be parsed from within the CHARACTERS block. Deals with everything after the token CHARLABELS up to and including the semicolon that terminates the CHARLABELS command. If an ELIMINATE command has been processed, labels for eliminated characters will not be stored.
    void   HandleCharstatelabels(NxsToken &token)
       
Called when CHARSTATELABELS command needs to be parsed from within the CHARACTERS block. Deals with everything after the token CHARSTATELABELS up to and including the semicolon that terminates the CHARSTATELABELS command. Resulting charLabels vector will store labels only for characters that have not been eliminated, and likewise for charStates. Specifically, `charStates[0]' refers to the vector of character state labels for the first non-eliminated character.
    void   HandleDimensions(NxsToken &token, NxsString newtaxaLabel, NxsString ntaxLabel, NxsString ncharLabel)
       
Called when DIMENSIONS command needs to be parsed from within the CHARACTERS block. Deals with everything after the token DIMENSIONS up to and including the semicolon that terminates the DIMENSIONS command. newtaxaLabel, ntaxLabel and ncharLabel are simply "NEWTAXA", "NTAX" and "NCHAR" for this class, but may be different for derived classes that use newtaxa, ntax and nchar for other things (e.g., ntax is number of populations in an ALLELES block).
    void   HandleEliminate(NxsToken &token)
       
Called when ELIMINATE command needs to be parsed from within the CHARACTERS block. Deals with everything after the token ELIMINATE up to and including the semicolon that terminates the ELIMINATE command. Any character numbers or ranges of character numbers specified are stored in the NxsUnsignedSet eliminated, which remains empty until an ELIMINATE command is processed. Note that, as with all sets, the character ranges are adjusted so that their offset is 0. For example, given "eliminate 4-7;" in the data file, the eliminated set would contain the values 3, 4, 5 and 6 (not 4, 5, 6 and 7). It is assumed that the ELIMINATE command comes before character labels and/or character state labels have been specified; an error message is generated if the user attempts to use ELIMINATE after a CHARLABELS, CHARSTATELABELS, or STATELABELS command.
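The offset adjustment in the "eliminate 4-7;" example amounts to the following. The helper addEliminatedRange is hypothetical and std::set<unsigned> stands in for NxsUnsignedSet; parsing of the command itself is not shown.

```cpp
#include <cassert>
#include <set>

// Hypothetical helper: records the 1-offset range [first..last] read from
// the data file as 0-offset indices in the set.
void addEliminatedRange(std::set<unsigned> &eliminated, unsigned first, unsigned last)
	{
	assert(first >= 1 && first <= last);
	for (unsigned n = first; n <= last; n++)
		eliminated.insert(n - 1);	// character number -> character index
	}
```

Calling addEliminatedRange(eliminated, 4, 7) on an empty set leaves it holding 3, 4, 5 and 6.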
    void   HandleEndblock(NxsToken &token, NxsString charToken)
       
Called when the END or ENDBLOCK command needs to be parsed from within the CHARACTERS block. Does two things:
  • checks to make sure the next token in the data file is a semicolon
  • eliminates character labels and character state labels for characters that have been eliminated
V   void   HandleFormat(NxsToken &token)
       
Called when FORMAT command needs to be parsed from within the CHARACTERS block. Deals with everything after the token FORMAT up to and including the semicolon that terminates the FORMAT command.
V   void   HandleMatrix(NxsToken &token)
       
Called when MATRIX command needs to be parsed from within the CHARACTERS block. Deals with everything after the token MATRIX up to and including the semicolon that terminates the MATRIX command.
V   bool   HandleNextState(NxsToken &token, unsigned i, unsigned j)
       
Called from HandleStdMatrix or HandleTransposedMatrix function to read in the next state. Always returns true except in the special case of an interleaved matrix, in which case it returns false if a newline character is encountered before the next token.
    void   HandleStatelabels(NxsToken &token)
       
Called when STATELABELS command needs to be parsed from within the CHARACTERS block. Deals with everything after the token STATELABELS up to and including the semicolon that terminates the STATELABELS command. Note that the numbers of states are shifted back one before being stored so that the character numbers in the NxsStringVectorMap objects are 0-offset rather than being 1-offset as in the NEXUS data file.
V   void   HandleStdMatrix(NxsToken &token)
       
Called from HandleMatrix function to read in a standard (i.e., non-transposed) matrix. Interleaving, if applicable, is dealt with herein.
    void   HandleTaxlabels(NxsToken &token)
       
Called when TAXLABELS command needs to be parsed from within the CHARACTERS block. Deals with everything after the token TAXLABELS up to and including the semicolon that terminates the TAXLABELS command.
V   unsigned   HandleTokenState(NxsToken &token, unsigned j)
       
Called from HandleNextState to read in the next state when TOKENS was specified. Looks up state in character states listed for the character to make sure it is a valid state, and returns state's value (0, 1, 2, ...). Note: does NOT handle adding the state's value to matrix. Save the return value (call it k) and use the following command to add it to matrix: matrix->AddState(i, j, k);
V   void   HandleTransposedMatrix(NxsToken &token)
       
Called from HandleMatrix function to read in a transposed matrix. Interleaving, if applicable, is dealt with herein.
I   void   IncludeCharacter(unsigned i)
       
Includes character whose 0-offset current index is i. If character is already active, this function has no effect.
I   bool   IsActiveChar(unsigned j)
       
Returns true if character j is active. If character j has been excluded, returns false. Assumes j is in the range [0..nchar).
I   bool   IsActiveTaxon(unsigned i)
       
Returns true if taxon i is active. If taxon i has been deleted, returns false. Assumes i is in the range [0..ntax).
I   bool   IsDeleted(unsigned i)
       
Returns true if taxon number i has been deleted, false otherwise.
    bool   IsEliminated(unsigned origCharIndex)
       
Returns true if the character whose original index is origCharIndex was eliminated, false otherwise. Returns false immediately if eliminated set is empty.
I   bool   IsExcluded(unsigned j)
       
Returns true if character j has been excluded. If character j is active, returns false. Assumes j is in the range [0..nchar).
I   bool   IsGapState(unsigned i, unsigned j)
       
Returns true if the state at taxon i, character j is the gap state, false otherwise. Assumes matrix is non-NULL.
    bool   IsInSymbols(char ch)
       
Returns true if ch can be found in the symbols array. The value of respectingCase is used to determine whether or not the search should be case sensitive. Assumes symbols is non-NULL.
I   bool   IsInterleave()
       
Returns true if INTERLEAVE was specified in the FORMAT command, false otherwise.
I   bool   IsLabels()
       
Returns true if LABELS was specified in the FORMAT command, false otherwise.
I   bool   IsMissingState(unsigned i, unsigned j)
       
Returns true if the state at taxon i, character j is the missing state, false otherwise. Assumes matrix is non-NULL.
I   bool   IsPolymorphic(unsigned i, unsigned j)
       
Returns true if taxon i is polymorphic for character j, false otherwise. Assumes matrix is non-NULL. Note that return value will be false if there is only one state (i.e., one cannot tell whether there is uncertainty using this function).
I   bool   IsRespectCase()
       
Returns true if RESPECTCASE was specified in the FORMAT command, false otherwise.
I   bool   IsTokens()
       
Returns true if TOKENS was specified in the FORMAT command, false otherwise.
I   bool   IsTranspose()
       
Returns true if TRANSPOSE was specified in the FORMAT command, false otherwise.
C     NxsCharactersBlock(NxsTaxaBlock *tb, NxsAssumptionsBlock *ab)
       
Initializes id to "CHARACTERS", taxa to tb, assumptionsBlock to ab, ntax, ntaxTotal, nchar and ncharTotal to 0, newchar to true, newtaxa, interleaving, transposing, respectingCase, tokens and formerly_datablock to false, datatype to `NxsCharactersBlock::standard', missing to '?', gap and matchchar to '\0', and matrix, charPos, taxonPos, activeTaxon, and activeChar to NULL. The ResetSymbols member function is called to reset the symbols data member. Assumes that tb and ab point to valid NxsTaxaBlock and NxsAssumptionsBlock objects, respectively.
D     ~NxsCharactersBlock()
       
Deletes any memory allocated to the arrays symbols, charPos, taxonPos, activeChar, and activeTaxon. Flushes the containers charLabels, eliminated, and deleted. Also deletes memory allocated to matrix.
    unsigned   PositionInSymbols(char ch)
       
Returns position of ch in symbols array. The value of respectingCase is used to determine whether the search should be case sensitive or not. Assumes symbols is non-NULL. Returns UINT_MAX if ch is not found in symbols.
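A stand-alone sketch of the lookup described here is given below. The free function positionInSymbols is a hypothetical re-implementation for illustration; in particular, using toupper for the case-insensitive comparison is an assumption of the sketch.

```cpp
#include <cassert>
#include <cctype>
#include <climits>
#include <cstring>

// Hypothetical re-implementation for illustration only. Returns the
// 0-offset position of ch in symbols, or UINT_MAX if it is not present.
unsigned positionInSymbols(const char *symbols, char ch, bool respectingCase)
	{
	assert(symbols != NULL);
	unsigned n = (unsigned)std::strlen(symbols);
	for (unsigned i = 0; i < n; i++)
		{
		char s = symbols[i];
		if (s == ch)
			return i;	// exact match always counts
		if (!respectingCase &&
			std::toupper((unsigned char)s) == std::toupper((unsigned char)ch))
			return i;	// case-insensitive match
		}
	return UINT_MAX;
	}
```

Because UINT_MAX doubles as the "not found" value, callers should test for it before using the result as an array index.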
V   void   Read(NxsToken &token)
       
This function provides the ability to read everything following the block name (which is read by the NxsReader object) to the END or ENDBLOCK statement. Characters are read from the input stream in. Overrides the abstract virtual function in the base class.
V   void   Report(ostream &out)
       
This function outputs a brief report of the contents of this CHARACTERS block. Overrides the abstract virtual function in the base class.
V   void   Reset()
       
Returns NxsCharactersBlock object to the state it was in when first created.
    void   ResetSymbols()
       
Resets standard symbol set after a change in datatype is made. Also flushes equates list and installs standard equate macros for the current datatype.
I   void   RestoreTaxon(unsigned i)
       
Restores taxon whose 0-offset current index is i. If taxon is already active, this function has no effect.
    void   ShowStateLabels(ostream &out, unsigned i, unsigned j, unsigned first_taxon)
       
Looks up the state(s) at row i, column j of matrix and writes it (or them) to out. If there is uncertainty or polymorphism, the list of states is surrounded by the appropriate set of symbols (i.e., parentheses for polymorphism, curly brackets for uncertainty). If TOKENS was specified, the output takes the form of the defined state labels; otherwise, the correct symbol is looked up in symbols and output.
I   void   ShowStates(ostream &out, unsigned i, unsigned j)
       
Shows the states for taxon i, character j, on the stream out. Uses symbols array to translate the states from the way they are stored (as integers) to the symbol used in the original data matrix. Assumes i is in the range [0..ntax) and j is in the range [0..nchar). Also assumes matrix is non-NULL.
IV   unsigned   TaxonLabelToNumber(NxsString s)
       
Converts a taxon label to a number corresponding to the taxon's position within the list maintained by the NxsTaxaBlock object. This method overrides the virtual function of the same name in the NxsBlock base class. If s is not a valid taxon label, returns the value 0.
    void   WriteStates(NxsDiscreteDatum &d, char *s, unsigned slen)
       
Writes out the state (or states) stored in this NxsDiscreteDatum object to the buffer s using the symbols array to do the necessary translation of the numeric state values to state symbols. In the case of polymorphism or uncertainty, the list of states will be surrounded by parentheses or curly brackets (respectively). Assumes s is non-NULL and long enough to hold everything printed.