Ever since computers and advanced technology arrived in the life sciences, the quantity of biological data stored in databases has grown exponentially. Simply piling up data, however, is of little help to researchers or to computers. To be useful, the data need to be sorted in one way or another. Such a step is easily performed by specialized software. But as with many things, something is missing without a human touch. Swiss-Prot is a protein sequence database that sprang into existence 30 years ago, when protein sequences were still trickling in. In those days, every sequence could be nursed. Today, however, millions of protein sequences are produced on a monthly basis. How does Swiss-Prot cope? Thanks to its biocurators.
Biocuration down the years: what has changed, what has not changed
Amos Bairoch (founder of Swiss-Prot) described the changes that have occurred over the past 30 years for a biocurator:
- No need to type the sequence in manually anymore;
- Updates are no longer done via floppy disks or CDs;
- Annotation tools are far more efficient nowadays.
However, as Amos underlines, what is even more surprising is how much has NOT changed:
- Biocuration is still about seeking ways to make sense of what is really being reported in papers – whose structure can be highly variable – and the community is still stuck in the same old cycle of experimenting, extracting results from articles and spending lots of money in the process to make those results usable by humans and software alike;
- Biocuration still means wading through figures the size of postage stamps and legends that are cryptic and usually full of mistakes because they are not spell-checked;
- Most of the sequence analysis algorithms for annotating transmembrane domains, signal sequences etc. haven’t changed since the late 1990s;
- And though sequence data has grown exponentially over the years, the same alignment search tool – BLAST – created in 1990 is still being used!