
DNA Surveillance

Species identification with DNA


What is Witness for the Whales?

Witness for the Whales is a service for identifying cetaceans (whales, dolphins and porpoises) from DNA sequences. It comprises annotated and curated sets of reference nucleotide sequences from a comprehensive and representative range of cetaceans. Witness for the Whales is implemented within DNA Surveillance, a computer system that applies standard phylogenetic methods to identify the affinity of the DNA sequences that you submit.

The reference nucleotide sequences used in Witness for the Whales are derived from tissues obtained from dead stranded animals, victims of fisheries bycatch, osteological material from museum collections, and biopsy samples from free-swimming cetaceans. Sequences were generated in our laboratories or, where possible, obtained only from studies by other known species-specialists. Sequences were considered 'validated' only if specimens were identified by experts in the field and diagnostic skeletal material or photographic records were collected (Dizon et al. 2000). In cases of very rare species, sequences were derived from holotype specimens (Dalebout et al. 2002, Dalebout et al. 2003, van Helden et al. 2002).

For the mtDNA control region, all species in the database are represented by at least one validated sequence (with >95% of sequences derived from known validated specimens overall). For the mtDNA cytochrome b gene, over 90% of species in the database (Vs 3.1) are represented by at least one validated sequence (with >70% of sequences derived from known validated specimens overall). Additional information will be required from major contributors to fully validate these datasets.

A full validation of these cetacean datasets will be presented in an upcoming publication. Note that for sequences derived from biopsy samples collected from free-swimming cetaceans, the experience of the species-specialist was considered sufficient to assume correct species identification. However, while sequences obtained from biopsies can serve as 'references' for the identification of test specimens, they are not considered fully validated.

Standard methods have been used to generate and align the sequences and then to derive their phylogenetic relationships. Both the techniques and the basic results have been validated by publication in peer-reviewed journals. Witness for the Whales differs from other gene databases in that the sequences used have been carefully selected to maximise the reliability of species identification.
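As a rough illustration of comparing a submitted sequence against aligned references, the sketch below computes an uncorrected proportion of differing sites (p-distance) and reports the nearest reference. This is a simplified stand-in, not the phylogenetic method used by DNA Surveillance. The function names, the rule of skipping gaps and ambiguous sites, and the toy sequences are all assumptions made for the example.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Assumption for this sketch: sites where either sequence has a gap
    or any symbol other than A, C, G, T are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def closest_reference(query, references):
    """Return (name, distance) for the reference nearest to the query."""
    return min(
        ((name, p_distance(query, seq)) for name, seq in references.items()),
        key=lambda pair: pair[1],
    )


# Toy, made-up aligned sequences (not real cetacean data).
references = {
    "species_A": "ACGTACGTAC",
    "species_B": "ACGTTCGAAC",
}
name, dist = closest_reference("ACGTACGTAT", references)
print(name, dist)
```

A nearest-distance match of this kind says nothing about statistical support, which is one reason tree-based methods with bootstrap analysis are used in practice.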

Reference Datasets

Below is a list of the current reference datasets available, with links to more information.
Domain                                mtDNA control region (=D-Loop) = Ctrl   Cytochrome b = CytB
All Cetaceans Vs3.1                   Link                                    Link
Mysticetes Vs3.1                      Link                                    Link
Odontocetes Vs3.1                     Link                                    Link
Ziphiidae Vs3.1                       Link                                    Link
Phocoenidae Vs3.1                     Link                                    Link
Delphinidae (Subgroups) Vs3.1
  Delphinidae + Stenoninae Vs3.1      Link                                    Link
  Globicephalinae + Orcininae Vs3.1   Link                                    Link
  Lissodelphininae Vs3.1              Link                                    Link
Balaenidae Vs4.0                      Link                                    N/A
All cetaceans Vs4.0
Mysticetes Vs4.0
Odontocetes Vs4.0
Ziphiidae Vs4.0
Phocoenidae Vs4.0
Delphinidae (Subgroups) Vs4.0
  Delphinidae + Stenoninae Vs4.0
  Globicephalinae + Orcininae Vs4.0
  Lissodelphininae Vs4.0
Humpbacks Vs4.0                       Link                                    N/A
Humpback mtDNA
All cetaceans Vs4.3                   Link                                    Link
Mysticetes Vs4.3                      Link                                    Link
Odontocetes Vs4.3                     Link                                    Link
Ziphiidae Vs4.3                       Link                                    Link
Phocoenidae Vs4.3                     Link                                    Link
Delphinidae (Subgroups) Vs4.3
  Delphinidae + Stenoninae Vs4.3      Link                                    Link
  Globicephalinae + Orcininae Vs4.3   Link                                    Link
  Lissodelphininae Vs4.3              Link                                    Link


There are several issues of interpretation of which the user should be aware. The user must use their professional judgement in applying any results obtained from this site.


The curators of Witness for the Whales are available for consultation, to provide validation of the species identity of cetacean genetic sequences. Please contact us (email: dna-surveillance@auckland.ac.nz) if you would like to discuss this further.

How to Cite

If you use this service to identify species, please cite us as:

History and Changes

DNA Reference Databases Milestones

  1. v 4.3 August 2006
    • Database Vs 4.3 is now taxonomically comprehensive, with a total of 399 control region sequences and 264 cyt b sequences representing 88 species. Sequences from documented specimens now represent all of the 83 species recognized by Rice (1998), with two exceptions: the Atlantic hump-backed dolphin, Sousa teuszii, and the Indian hump-backed dolphin, S. plumbea (the latter of which has not been accepted by the IWC). We have also included seven species proposed in recent publications and three subspecies of baleen whales. Both the control region and CytB datasets include unique sequences from 2-6 specimens for most species. The few exceptions include unusual cases like the vaquita, which has only a single known mtDNA haplotype. For widespread species, we have included control region sequences from representatives of different ocean basins or biogeographic zones, following the regional categories used in the archive listings of the NMFS Southwest Fisheries Science Center.
  2. v 3.1 August 2006
    • Database Vs 3.1 has been revised by replacing all sequences of unknown provenance with sequences of known provenance, keeping the original totals of 285 sequences of the mtDNA control region, representing 78 species, and 165 sequences of the mtDNA cytochrome b gene, representing 83 cetacean species. A total of 47 new control region sequences were submitted to GenBank (Appendix 1). All sequences in the CytB dataset of Vs 3.1 had previously been submitted to GenBank. The aligned sequences for each of the eight Vs 3.1 datasets can now be downloaded from DNA Surveillance for further user-based analyses (e.g., analysis of characters or phylogenetic reconstruction by Maximum Parsimony or Maximum Likelihood).
  3. v 3.1 6 June 2003
    • Initial comprehensive reference dataset, with 285 sequences of the mtDNA control region, representing 78 species, and 165 sequences of the mtDNA cytochrome b gene, representing 83 cetacean species.

Software Milestones

  1. v 3.0 14 October 2004
    • Redesigned web frontend software
    • Introduced reverse complement detection for query sequences
    • Introduced maximum likelihood based search method (method to be published)
    • Gap extension penalty in the phylogenetic analysis changed to 1 again
  2. v 2.0 18 September 2003
    • Changed the page format from frames to cascading style sheets.
    • Gap extension penalty in the phylogenetic analysis changed to 1.7
  3. v 1.4 15 November 2002
    • Introduced software changes to compare query sequences with the reference sequences and then to warn the user when the difference exceeds a threshold.
    • Changed method of calculating evolutionary distance to include sites having IUPAC ambiguity codes.
    • Introduced a list of the species which are represented in each reference database.
    • Expanded a discussion of some of the issues regarding the interpretation of results.
  4. 19 July 2002
    • Added new sidebar links to recent notices and to details of the reference sequences
  5. v 1.3 14 June 2002
    Introduced new features:
    • ability to select Full Alignment in Advanced Search
    • set of sample sequences derived from materials obtained in Japanese markets
    • reference dataset of sequences from humpback populations in N Atlantic, N Pacific and S Pacific Oceans
  6. v 1.2 6 June 2002
    Introduced new features:
    • ability to submit several sequences at the same time
      NOTE: only one reference dataset can be used at a time
    • ability to enter an email address to which results will be sent
  7. v 1.1 21 April 2002
    Introduced new feature:
    • ability to perform bootstrap analysis on phylogenetic tree
  8. v 1.0 21 November 2001
    • Went live!
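
The reverse complement detection listed under software v 3.0 can be illustrated with a minimal sketch. It is not the site's actual implementation: it assumes the query and a reference are already the same length and start at the same position, and it simply keeps whichever orientation matches the reference at more sites. All names and the toy sequences are hypothetical.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def count_matches(a, b):
    """Count positions at which two sequences carry the same base."""
    return sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)


def orient_query(query, reference):
    """Return (sequence, was_reversed), keeping the better-matching strand.

    Assumption for this sketch: a plain position-by-position comparison
    stands in for a proper alignment.
    """
    flipped = reverse_complement(query)
    if count_matches(flipped, reference) > count_matches(query, reference):
        return flipped, True
    return query, False


# Toy, made-up sequences (not real cetacean data).
reference = "AACCGTAGGT"
query = "ACCTACGGTT"  # the reference read from the opposite strand
print(orient_query(query, reference))
```

In a real pipeline the comparison would be made on aligned sequences, since a query rarely starts at the same position as the reference.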


Witness for the Whales is a service of the Population Genetics and Evolution Research Group of the University of Auckland School of Biological Sciences. The programming of this website was funded by the Vice-Chancellor's Development Fund. The maintenance of the reference dataset was funded by grants to C.S. Baker from the New Zealand Marsden Fund, the New Zealand Lottery Board, the International Fund for Animal Welfare (IFAW), and the U.S. Marine Mammal Commission.