Royal Holloway, University of London

FACTS AND ARTIFACTS: POTENTIAL PITFALLS IN BIOINFORMATIC ANALYSIS ARISING FROM DATABASE ANOMALIES
Dr Christopher Southan, Head of Computational Biology, Gemini Genomics, Cambridge

Abstract: Peptide fragments isolated from rat urine gave translation matches to many rat liver ESTs. The combined peptide and nucleic acid data delineated two novel paralogous secreted Ly-6 proteins (SwissProt P81827 and P81828).

Comprehensive bioinformatic analysis uncovered a swathe of database anomalies, including the following: (1) The TIGR Gene EST assembler had merged three paralogous gene products into a single virtual transcript. (2) Three unrelated rat GenBank entries that gave extended sequence matches turned out to be chimeric mRNAs. (3) These chimeric mRNAs had propagated major annotation errors into dbEST, UniGene and LocusLink. (4) One chimera gave rise to a mistranslation in the protein databases. (5) Another chimera included a "cryptic" protein that was not annotated in any database. (6) A fourth anomalous mRNA entry appears to be a pre-mRNA sequence. (7) Aligning the rat urinary proteins with homologues delineated a new family of short secreted Ly-6 proteins not recognised in the current domain databases. (8) Two of the homologues had functional annotations, but these turned out to rest on sequence errors and unverified biochemical results. (9) Forward links to very recent publications were lost because of inconsistent gene names.

The number of anomalies uncovered in this work can be considered unusual. However, they highlight sequence database issues such as: unforeseen consequences of automated sequence parsing, domain recognition thresholds, the verification of functional ontologies, and the authenticity of links to biological data.

This seminar was held at the Department of Computer Science, Royal Holloway, University of London on 13 November 2001.

Last updated Mon, 15-Dec-2008 15:19 GMT / PS
Department of Computer Science, University of London, Egham, Surrey TW20 0EX
Tel/Fax : +44 (0)1784 443421 /439786