Standard diversity indices

1949 : Simpson (Nature)

Simpson concentration : probability that two individuals chosen at random from a given community belong to the same species.


Gini-Simpson index : probability that two individuals chosen at random from a given community belong to different species (= 1 - Simpson concentration).
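
The two probabilities above can be sketched in a few lines, assuming relative abundances pi that sum to 1 and sampling with replacement. The function names and the example community are illustrative, not taken from the cited paper.

```python
def simpson_concentration(p):
    """Probability that two individuals drawn with replacement share a species."""
    return sum(pi * pi for pi in p)


def gini_simpson(p):
    """Probability that the two individuals belong to different species."""
    return 1.0 - simpson_concentration(p)


# Hypothetical community of three species (relative abundances sum to 1).
abundances = [0.5, 0.3, 0.2]
print(simpson_concentration(abundances))  # about 0.38
print(gini_simpson(abundances))           # about 0.62
```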


1948-1962 : Shannon & Weaver

Shannon entropy : average amount of information in the community, given that :

  • rare species carry more information than common species ;
  • the information value of a species is proportional to the negative logarithm of its relative abundance.


Measures the loss of information due to the loss of a species,
or the uncertainty about the species obtained when one individual is chosen at random.
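
A minimal sketch of this measure, assuming the usual form H = -sum(pi * ln(pi)) with natural logarithms (the base only changes the unit). The two example communities are hypothetical.

```python
import math


def shannon_entropy(p):
    """H = -sum(p_i * ln p_i); absent species (p_i = 0) contribute nothing."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


# Two hypothetical communities of four species each.
even = [0.25, 0.25, 0.25, 0.25]
uneven = [0.85, 0.05, 0.05, 0.05]

print(shannon_entropy(even))    # equals ln(4): maximal uncertainty for 4 species
print(shannon_entropy(uneven))  # lower: one species dominates
```

The uneven community has the same richness but a lower entropy, because the identity of a randomly chosen individual is easier to guess.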


1961-1970 : Rényi

Rényi entropy :

  • q -> 0 : weights all possible species increasingly equally, regardless of their probabilities.
  • q = 0 : Hartley entropy (logarithm of the species richness).
  • q -> 1 : Shannon entropy (as a limit).
  • q -> inf : increasingly determined by the most abundant species.
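
The special cases listed above can be checked numerically. This is a sketch assuming the standard form H_q = ln(sum(pi^q)) / (1 - q), with the q = 1 and q = inf cases taken as limits. The function name and tolerance are illustrative choices.

```python
import math


def renyi_entropy(p, q):
    """Rényi entropy of order q for relative abundances p (natural log)."""
    p = [pi for pi in p if pi > 0]          # absent species are ignored
    if math.isinf(q):
        return -math.log(max(p))            # limit q -> inf: most abundant species only
    if abs(q - 1.0) < 1e-12:
        return -sum(pi * math.log(pi) for pi in p)  # limit q -> 1: Shannon entropy
    return math.log(sum(pi ** q for pi in p)) / (1.0 - q)


abundances = [0.5, 0.3, 0.2]  # hypothetical community
for q in (0, 0.5, 1, 2, math.inf):
    print(q, renyi_entropy(abundances, q))
```

At q = 0 the result is ln(3), the logarithm of the species richness, and the values decrease as q grows.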


Hill’s numbers

1973 : Hill (Ecology)


The “doubling” property = if two equally large, completely distinct communities (no shared species) each have diversity X, and if these communities are combined, then the diversity of the combined communities should be 2X.

Most raw diversity indices (standard diversity indices, H) do not obey this property,
but their numbers equivalents do.
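
A small numerical illustration of the doubling property, using Shannon entropy as the raw index and its exponential as the numbers equivalent. The community and the helper names are hypothetical.

```python
import math


def shannon_entropy(p):
    """Raw index: H = -sum(p_i * ln p_i)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def pool_disjoint(p_a, p_b):
    """Pool two equally large communities that share no species."""
    return [0.5 * x for x in p_a] + [0.5 * x for x in p_b]


community = [0.5, 0.3, 0.2]
# Second community: same abundance structure, entirely different species.
pooled = pool_disjoint(community, community)

h_single = shannon_entropy(community)
h_pooled = shannon_entropy(pooled)

print(h_single, h_pooled)                      # raw entropy grows by ln(2), it does not double
print(math.exp(h_single), math.exp(h_pooled))  # the numbers equivalent doubles
```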


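The numbers equivalent (or effective number of elements) of a diversity index = the number of equally likely elements needed to produce the given value of the index.
For example, the numbers equivalent of S equally abundant species is S.

Every diversity measure H has a numbers equivalent D ; for the Hill number of order q (q != 1), with pi the relative abundance of species i :

    qD = ( sum(pi^q) )^(1 / (1 - q))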


The order q determines a diversity measure’s sensitivity to species abundance (rare or common species) :

  • orders higher than 1 are disproportionately sensitive to the most common species,
  • while orders lower than 1 are disproportionately sensitive to the rare species.
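
The effect of the order q can be sketched as follows, assuming the usual Hill number form qD = (sum(pi^q))^(1/(1-q)) with the q = 1 case taken as the limit exp(Shannon entropy). The function name and example community are illustrative.

```python
import math


def hill_number(p, q):
    """Effective number of species of order q."""
    p = [pi for pi in p if pi > 0]
    if abs(q - 1.0) < 1e-12:
        # Limit q -> 1: exponential of Shannon entropy.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))


# Hypothetical community: one dominant species and three rare ones.
abundances = [0.85, 0.05, 0.05, 0.05]
for q in (0, 1, 2):
    print(q, hill_number(abundances, q))
```

At q = 0 the rare species count as much as the dominant one (the result is the richness, 4). At q = 2 the value falls towards 1 because the dominant species drives the result.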


                              q = 0                         q = 1                                 q = 2

Standard diversity indices    Species richness              Shannon entropy                       Simpson concentration, Gini-Simpson index

Effective numbers :           Species richness              Exponential of Shannon entropy        Inverse of Simpson concentration
  - Hill numbers
  - exp(Rényi entropy)
    (total dissimilarity)


Similarity-sensitive measures

1982 : Rao (Theoretical Population Biology)

Rao’s quadratic entropy :

  • measure of average conflict among species
  • efficient index of functional diversity
  • expected dissimilarity between two individuals selected at random, with replacement, from a given species assemblage
  • Gini-Simpson index + pairwise distances between species
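
A minimal sketch of the quadratic entropy, assuming the usual form Q = sum over i, j of dij * pi * pj, where dij is the distance between species i and j. The distance matrices below are made up for illustration.

```python
def rao_quadratic_entropy(p, d):
    """Expected distance between two individuals drawn with replacement."""
    n = len(p)
    return sum(p[i] * p[j] * d[i][j] for i in range(n) for j in range(n))


abundances = [0.5, 0.3, 0.2]

# All distinct species at distance 1: Rao's index reduces to Gini-Simpson.
unit_distances = [[0, 1, 1],
                  [1, 0, 1],
                  [1, 1, 0]]

# Hypothetical distances in which species 1 and 2 are close relatives.
close_pair = [[0.0, 0.2, 1.0],
              [0.2, 0.0, 1.0],
              [1.0, 1.0, 0.0]]

print(rao_quadratic_entropy(abundances, unit_distances))  # about 0.62
print(rao_quadratic_entropy(abundances, close_pair))      # smaller: similar species add less
```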

Both Simpson and Rao indices :

  • respond non-linearly to increasing diversity
  • do not obey the replication principle (the “doubling” property extended to any number of communities)


1992 : Faith (Biological Conservation)

Phylogenetic diversity (PD) = sum of the lengths of all those branches that are members of the corresponding minimum spanning path (smallest assemblage of branches from the cladogram for the complete set of taxa such that, for any two members of a subset of the taxa, a path connecting the two can be found that uses only branches in the assemblage).


2009 : Allen (American Naturalist)

Phylogenetic entropy index :

  • places a high value on distinctive species but has the property that when members of a species become rare in proportion to other species, it is never desirable to eliminate them
  • Shannon entropy + phylogenetic differences
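
As a sketch, assuming the form usually given for this index, Hp = -sum(Li * ai * ln(ai)), where Li is the length of branch i and ai the total relative abundance of the species descended from it. The tree encoding (a flat list of branches) and the example tree are hypothetical.

```python
import math


def phylogenetic_entropy(branches):
    """branches: list of (L_i, a_i) pairs, one per branch of the tree."""
    return -sum(L * a * math.log(a) for L, a in branches if a > 0)


def shannon_entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


abundances = [0.5, 0.3, 0.2]

# Star tree: every species sits on its own branch of length 1.
star_tree = [(1.0, pi) for pi in abundances]

# Hypothetical tree of depth 2 in which species A and B share an ancestor.
tree = [
    (1.0, 0.5),  # tip branch of A
    (1.0, 0.3),  # tip branch of B
    (1.0, 0.8),  # internal branch carrying A + B
    (2.0, 0.2),  # tip branch of C
]

print(phylogenetic_entropy(star_tree), shannon_entropy(abundances))  # identical
print(phylogenetic_entropy(tree))
```

With a star tree of unit branch lengths the index equals Shannon entropy, which is the sense in which it combines Shannon entropy with phylogenetic differences.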

Both Shannon and Allen indices :

  • respond non-linearly to increasing diversity
  • do not obey the replication principle


Generalization


                              q = 0                         q = 1                                 q = 2

Standard diversity indices    Species richness              Shannon entropy                       Simpson concentration, Gini-Simpson index

Effective numbers :           Species richness              Exponential of Shannon entropy        Inverse of Simpson concentration
  - Hill numbers
  - exp(Rényi entropy)
    (total dissimilarity)

Generalization                Faith PD (ultrametric tree)   Allen’s entropy                       Rao’s quadratic entropy
  (similarity information)                                  (generalization of Shannon entropy)   (generalization of Gini-Simpson index)


Hill numbers + similarity-sensitive measures

Improve the Hill numbers formula by adding a similarity-sensitive parameter = a measure that reflects the varying dissimilarities between species (a number specifying how similar they are).


2010 : Chao (Phil. Trans. R. Soc.)

Family of effective number similarity-sensitive measures, tailored specifically to phylogenetic diversity (similarity derived from a tree = distance).

Mean phylogenetic diversity of order q :

  • family of diversity measures taking into account phylogenetic similarities, derived from a phylogenetic tree.
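  • multiplied by the interval duration T, it gives the phylogenetic diversity of order q through T years ago (product of T and the mean diversity over that interval).


Phylogenetic diversity of order q through T years ago (for q != 1) :

    qPD(T) = T * ( sum((Li / T) * ai^q) )^(1 / (1 - q))

with :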

  • Li = length of branch i

  • ai = total abundance descended from branch i

  • q = sensitivity parameter

  • T = time depth considered (e.g. age of the root) if ultrametric tree, in which case sum(Li * ai) = T
    = replaced by the mean quantity sum(Li * ai) if non-ultrametric tree
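
A sketch of these quantities for an ultrametric tree, assuming the mean phylogenetic diversity has the form (sum((Li / T) * ai^q))^(1/(1-q)) and that the q = 1 case is the limit exp(-sum((Li / T) * ai * ln(ai))). The flat list-of-branches encoding and the example tree are hypothetical.

```python
import math


def mean_phylogenetic_diversity(branches, q, T):
    """branches: list of (L_i, a_i) pairs; T: time depth of the tree."""
    terms = [(L / T, a) for L, a in branches if a > 0]
    if abs(q - 1.0) < 1e-12:
        return math.exp(-sum(w * a * math.log(a) for w, a in terms))
    return sum(w * a ** q for w, a in terms) ** (1.0 / (1.0 - q))


# Hypothetical ultrametric tree of depth T = 2 with species A, B, C
# (abundances 0.5, 0.3, 0.2); A and B share an ancestor 1 time unit ago.
T = 2.0
tree = [
    (1.0, 0.5),  # tip branch of A
    (1.0, 0.3),  # tip branch of B
    (1.0, 0.8),  # internal branch carrying A + B
    (2.0, 0.2),  # tip branch of C
]

# For an ultrametric tree, sum(L_i * a_i) equals T.
print(sum(L * a for L, a in tree))

for q in (0, 1, 2):
    mean_div = mean_phylogenetic_diversity(tree, q, T)
    print(q, mean_div, T * mean_div)  # mean diversity, then diversity through T
```

At q = 0 the product T * mean diversity is the total branch length of the tree (5 here). With a star tree whose branches all have length T, the function returns the ordinary Hill numbers.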


Hill numbers = special case for all species maximally distinct, all branch lengths = T

Faith’s PD for all q = all species maximally distinct and equally common, age of the highest node = T

After applying a simple transformation, the phylogenetic diversity measure of Faith (1992) and the phylogenetic entropy of Allen et al. (2009) are special cases of mean phylogenetic diversity.


Properties of the Chao model :

  • satisfies replication, but only under the assumption that all subcommunities have the same mean evolutionary change (as in an ultrametric tree)
  • for non-ultrametric trees, the mean phylogenetic diversity can be greater than the number of species !


2012 : Leinster (Ecology)


A community’s diversity profile = the diversity of order q, calculated for every q and plotted against q.

For q != 1 :

    qDZ(p) = ( sum(pi * (Zp)i^(q - 1)) )^(1 / (1 - q))

with :

  • pi = relative abundances

  • Z = matrix of similarities between species (0 = total dissimilarity, 1 = identical species)

  • q = sensitivity parameter

  • (Zp)i = relative abundance of species similar to the ith
    = expected similarity between an individual of the ith species and an individual chosen at random
    = measures the ordinariness of the ith species within the community
    = inversely related to the diversity
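
A sketch of the similarity-sensitive diversity and of a coarse diversity profile, assuming the form (sum(pi * (Zp)i^(q-1)))^(1/(1-q)) with the q = 1 case taken as the limit exp(-sum(pi * ln((Zp)i))). The similarity matrix below is invented for illustration.

```python
import math


def similarity_sensitive_diversity(p, Z, q):
    """Diversity of order q given abundances p and similarity matrix Z."""
    n = len(p)
    # (Zp)_i: ordinariness of species i within the community.
    zp = [sum(Z[i][j] * p[j] for j in range(n)) for i in range(n)]
    present = [i for i in range(n) if p[i] > 0]
    if abs(q - 1.0) < 1e-12:
        return math.exp(-sum(p[i] * math.log(zp[i]) for i in present))
    return sum(p[i] * zp[i] ** (q - 1.0) for i in present) ** (1.0 / (1.0 - q))


abundances = [0.5, 0.3, 0.2]

# Naive model: species are totally dissimilar (Z = identity matrix).
naive = [[1, 0, 0],
         [0, 1, 0],
         [0, 0, 1]]

# Hypothetical similarities: species 1 and 2 are 80 % similar.
similar = [[1.0, 0.8, 0.0],
           [0.8, 1.0, 0.0],
           [0.0, 0.0, 1.0]]

# Coarse diversity profile: diversity of order q for a few values of q.
for q in (0, 0.5, 1, 2, 5):
    print(q,
          similarity_sensitive_diversity(abundances, naive, q),
          similarity_sensitive_diversity(abundances, similar, q))
```

With the identity matrix the values are the Hill numbers. Once the similarity between species 1 and 2 is taken into account, the diversity is lower at every order printed here.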


Hill numbers = special case of total dissimilarity, Z = identity matrix, (Zp)i = pi

If a measure knows nothing of the commonalities between species, it will evaluate the community as more diverse than it really is. The naive model typically overestimates diversity.


                              q = 0                         q = 1                                 q = 2

Standard diversity indices    Species richness              Shannon entropy                       Simpson concentration, Gini-Simpson index

Effective numbers :           Species richness              Exponential of Shannon entropy        Inverse of Simpson concentration
  - Hill numbers
  - Leinster naive model
  - exp(Rényi entropy)
    (total dissimilarity)

Generalization                Faith PD (ultrametric tree)   Allen’s entropy                       Rao’s quadratic entropy
  (similarity information)                                  (generalization of Shannon entropy)   (generalization of Gini-Simpson index)

Effective numbers :           Faith PD (ultrametric tree)                                         1 / (1 - Rao’s quadratic entropy)
  - Leinster model
  - exp(Rényi entropy)
    (similarity information)

Properties of the Leinster model :

  • Partitioning properties :
      - Effective number : the diversity of a community of equally abundant and totally dissimilar species = species richness
      - Modularity : partition into subcommunities
      - Replication : if the subcommunities are of equal size and diversity, diversity = number of subcommunities * diversity of one subcommunity
  • Elementary properties :
      - Symmetry
      - Absent species
      - Identical species
  • Effect of species similarity on diversity :
      - Monotonicity
      - Naive model
      - Range

Note : Allen and Rao’s entropies do not respect the replication principle.



Glossary

  • Ultrametric tree = tree in which the distances from the root to every branch tip are equal
    = if the branch lengths are proportional to divergence time, all branch tips are the same distance from the tree base (first node) (tree a) on the graph is ultrametric, tree b) is non-ultrametric)



Citations