Birth

In the late 70s, DNA and protein sequences started being produced quite easily, and with them came the need for storage and analysis tools. The first computer programs to be developed aimed at comparing proteins between different species to understand their function. This was the very beginning of bioinformatics. In fact, the term first appeared in 1970 and referred to the study of information processes in biotic systems, so it did not really mean what is called bioinformatics today [1,2]. In the 80s, Amos Bairoch, today Group Leader at SIB, felt the need to improve an existing protein sequence databank. While working on his PhD, he started to “annotate” proteins by adding information such as their structure and pathological roles, and adapted the database to computers [3]. Swiss-Prot, the manually annotated protein sequence database, was born! This was in 1986. At that time, the various versions were distributed on magnetic tapes by EMBL-Heidelberg. It was the start of a long-lasting collaboration.

From a one-man team to a Swiss Institute

With the emergence of microcomputers and the constant increase in protein dataflow, Swiss-Prot needed specialized annotators, also called biocurators. Biocuration, i.e. the activity of extracting and organizing biological information and making it accessible to both humans and computers, became essential [4]. The Swiss-Prot database was growing exponentially and being used worldwide, while protein curation was going full speed both in Heidelberg/Hinxton and in Geneva. Towards the end of the 90s, the Swiss team lacked funds because of the heterogeneous nature of the project: it was both national and international. In 1996, the Swiss government and the Swiss National Fund nevertheless recognized the scientific value of Swiss-Prot and agreed to finance the project for two years, but no longer. To solve the issue, the SIB Swiss Institute of Bioinformatics was created in 1998: the Geneva Swiss-Prot group had a new home, and its mission was to keep on developing the database. The collaboration with the team at EMBL-EBI could continue [2].

Swiss-Prot becomes UniProtKB/Swiss-Prot

In 2002, thanks to the unparalleled efforts of Rolf Apweiler, now Director of EMBL’s European Bioinformatics Institute (EMBL-EBI), the UniProt consortium was launched. It is a collaboration between SIB, the European Bioinformatics Institute (EBI), and the Protein Information Resource (PIR), funded by the National Institutes of Health. In this context, the UniProt Knowledgebase (UniProtKB) was created, consisting of the curated UniProtKB/Swiss-Prot databank, its automatically annotated supplement TrEMBL, and the PIR protein database [5]. Today, UniProtKB represents the world's most comprehensive catalogue of information on proteins [1]. In the space of 30 years, the number of proteins entered in UniProtKB/Swiss-Prot has risen from 4,000 to 550,000, and the staff from one to 70 people [6], of whom 50 are in Geneva.

 

  1. https://en.wikipedia.org/wiki/Bioinformatics
  2. The beginnings of a database: An interview with Prof. Amos Bairoch, Prolune Protéines à la “Une”, S. Altairac and G. Baillie, Aug. 2006
  3. An unexpected place, V. Baillie Gerritsen, Protein Spotlight, Nov. 2007
  4. Biocuration: A New Challenge for the Tunicate Community, D. Dauga, Genesis, 2015
  5. https://en.wikipedia.org/wiki/UniProt#UniProtKB.2FSwiss-Prot
  6. UniProtKB/Swiss-Prot UniProt release, Feb. 2016