That is why the three researchers began to work on the bioinformatics tool. José Antonio Rodríguez brought the biological question; Asier Fullaondo, the knowledge of bioinformatics tools and databases; and Gorka Prieto, the programming skills. Initially, the three PhD holders developed a piece of software (WREGEX, available to the scientific community on the UPV/EHU’s server) that can be used to search for and predict ‘functional motifs’ automatically (the small groups of amino acids that perform specific tasks in a protein). They tested the programme by predicting motifs that move a protein from the nucleus to the cytoplasm of a cell, the so-called ‘nuclear export signals’. At the end of this research phase, in 2014, a paper was published in the journal Bioinformatics. But, as José Antonio Rodríguez pointed out, “in research the answer to one question opens the door to more questions.” The question on that occasion was: which proteins carry, in their amino acid sequence, a functional motif that is mutated in cancer?
The team took a further step and combined the sequence information for all known human proteins with the COSMIC catalogue, which gathers the mutations linked to cancer. The result was a new version (WREGEX 2.0) that allows a normal protein to be compared with its mutant counterpart in order to predict ‘functional motifs’ that have been modified and could be linked to cancer. “You may also have experience in how a motif functions and want to find out which proteins it could appear in and whether it appears modified in cancer. With this software you can obtain candidates to start studying,” explained Gorka Prieto.
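The comparison between a normal protein and its mutant can be pictured with a short Python sketch. It is only an illustration under stated assumptions, not the WREGEX 2.0 implementation: the regular expression is a simplified, hypothetical export-signal-like pattern, the sequence is a toy example, the substitution is invented rather than taken from COSMIC, and the function names are made up for this example.

```python
import re

# Simplified, illustrative pattern of spaced hydrophobic residues.
# Assumption: this is NOT the motif definition used by WREGEX.
NES_LIKE = re.compile(r"[LIVMF].{2,3}[LIVMF].{2,3}[LIVMF].[LIVMF]")


def find_motifs(sequence, pattern=NES_LIKE):
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in pattern.finditer(sequence)]


def apply_substitution(sequence, position, ref, alt):
    """Apply a single amino acid substitution; position is 1-based."""
    idx = position - 1
    if sequence[idx] != ref:
        raise ValueError(f"expected {ref} at position {position}, found {sequence[idx]}")
    return sequence[:idx] + alt + sequence[idx + 1:]


def affected_motifs(sequence, position, ref, alt, pattern=NES_LIKE):
    """List normal-protein motifs that overlap the mutated residue.

    Each entry reports the 1-based motif start, the normal match, and the
    match found at the same start in the mutant (None if the motif is lost).
    """
    mutant = apply_substitution(sequence, position, ref, alt)
    mutant_hits = dict(find_motifs(mutant, pattern))
    results = []
    for start, text in find_motifs(sequence, pattern):
        if start <= position - 1 < start + len(text):
            results.append(
                {"start": start + 1, "wild_type": text, "mutant": mutant_hits.get(start)}
            )
    return results


if __name__ == "__main__":
    toy_protein = "MNSNELALKLAGLDINKTE"  # toy sequence, for illustration only
    # Hypothetical substitution: leucine at position 13 replaced by alanine.
    for candidate in affected_motifs(toy_protein, 13, "L", "A"):
        print(candidate)
```

Each entry returned is a candidate in the sense Prieto describes: a motif present in the normal protein whose match changes or disappears in the mutant, and which would still have to be checked in the laboratory.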
Once the bioinformatics programme had been developed, it had to be tested, and to do this they carried out a ‘cell export assay’. They again chose various candidates that could constitute a motif responsible for moving the protein out of the cell nucleus. They checked how these candidates functioned and, after modifying them according to the tumour mutations described in the COSMIC catalogue, they ran the assay again. In this way they confirmed that the candidates acted as an ‘export signal’, that the mutation affected the way they worked, and that the software was therefore valid.
So this tool combines three types of information: the protein sequences, the functional motifs and the cancer mutations. “One of the main features of WREGEX 2.0 is that it can simultaneously study highly complex proteomes containing very large numbers of proteins and combine that information, in the case of this trial, with cancer mutations; but the door is open to using other databases containing information about other types of mutations. The advantage, moreover, is that 40,000 proteins a minute can be analysed, whereas with other programs the analysis of a single protein took several minutes,” explained Asier Fullaondo. So with this software it is possible to predict whether an alteration in a protein may influence the development of a disease, and not only cancer.
So far, thirteen pieces of research have used this computing tool, and researchers in China, Japan, Korea, Germany and the United States have accessed the server. In the meantime, the multidisciplinary team formed by the three PhD holders is already thinking about continuing the work to improve the tool.