Research Interests

Currently I am interested in these topics:

  1. Protein function prediction: this is a very broad topic, and several approaches have been proposed to address it. I have worked on this problem from two different perspectives:

    a) Using inference over biological networks: this approach assumes that a certain property is known for a subset of the proteins of interest and unknown for the rest. If the network is built so that the strength (or the presence) of a link reflects functional association, the property can be inferred for the remaining proteins using a simple guilt-by-association mechanism [1] or a label diffusion method [2], which can be framed within the semi-supervised learning paradigm. Within this framework I have participated in the development of S2F, a method for protein function prediction using diffusion on graphs and very little initial information [3]. I have also published in Bioinformatics [4] a tool called GOssTo, which computes semantic similarity networks that can be used to produce the protein association networks these methods require.

    b) Discovering the consensus architecture of a protein: proteins can be viewed as combinations of one or more individual evolutionary units (known as domains). These domains carry out certain biological processes and molecular functions. However, it is well known that the function of a protein is not just the sum of the functions of its individual domains. I have implemented ConSAT [5], a method for obtaining the consensus domain architecture of a set of proteins, which provides function association methods.

  2. Network medicine: network medicine is a new paradigm in the biosciences in which graphs and computational tools are used to solve problems in medicine and human molecular biology. Graphs are used to represent biological entities (proteins, molecules, phenotypes, etc.) and the relations between those entities, providing a holistic view of the problem at hand. Many biomedical questions can therefore be framed in this way and addressed using both Machine Learning and Network Science techniques.

    I have collaborated with Horacio Caniza in the development of methods for building a network that measures similarities between diseases. Using techniques such as semantic similarity and text mining, we have built a new measure that is able to approximate the underlying (real) molecular similarity without taking any genomic information into consideration. This is a particularly interesting problem in the case of rare diseases, where very little (or none at all) pathogenic information is available. A manuscript is currently being prepared [6]. We also plan to use the resulting network to develop a novel gene-disease prediction mechanism based on semi-supervised learning techniques on graphs.

    The second problem I have worked on in this field is the evaluation of different mechanisms for combining human gene networks for the problem of gene-disease association. This project was a collaboration with Giorgio Valentini (Univ. of Milan, Italy) and his lab, and resulted in a publication in Artificial Intelligence in Medicine [7].
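
As an illustration of the two network-inference ideas mentioned in item 1(a), the sketch below scores unlabelled nodes on a toy graph, first with a one-step guilt-by-association rule and then with a generic label diffusion iteration. It is only a minimal example under stated assumptions: it is not the S2F method, and the function names, the damping parameter alpha, and the five-node path graph are all hypothetical choices made for illustration.

```python
import numpy as np

def guilt_by_association(W, y):
    """Score each node by the weighted fraction of its neighbours carrying the label."""
    degree = W.sum(axis=1)
    degree[degree == 0] = 1.0  # isolated nodes: avoid division by zero
    return (W @ y) / degree

def label_diffusion(W, y, alpha=0.8, n_iter=200, tol=1e-9):
    """Iterate f <- alpha * S f + (1 - alpha) * y until the scores stop changing.

    S is the symmetrically normalised adjacency matrix D^-1/2 W D^-1/2.
    alpha (an assumed value) balances network smoothing against the initial labels.
    """
    degree = W.sum(axis=1)
    degree[degree == 0] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    f = y.astype(float).copy()
    for _ in range(n_iter):
        f_next = alpha * (S @ f) + (1 - alpha) * y
        converged = np.max(np.abs(f_next - f)) < tol
        f = f_next
        if converged:
            break
    return f

# Toy network (hypothetical): five proteins on a path 0-1-2-3-4,
# where only protein 0 is known to have the property of interest.
W = np.zeros((5, 5))
for i in range(4):
    W[i, i + 1] = W[i + 1, i] = 1.0
y = np.array([1.0, 0.0, 0.0, 0.0, 0.0])

gba_scores = guilt_by_association(W, y)
diffusion_scores = label_diffusion(W, y)
print("guilt-by-association:", gba_scores)
print("label diffusion:     ", diffusion_scores)
```

In this toy setting the one-step rule only reaches the direct neighbour of the labelled protein, whereas the diffusion scores are non-zero for every protein and decrease with distance from the labelled one. This is why diffusion can be useful when very little initial information is available.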

Bibliography (including work in progress)

* indicates co-first author

[1] R. Dóczi, L. Ökrész, A. E. Romero, A. Paccanaro, L. Bögre, Exploring the evolutionary path of plant MAPK networks, Trends Plant Sci., 17 (9), 518-525, 2012.

[2] P. Radivojac, W. T. Clark, T. R. Oron, A. M. Schnoes, T. Wittkop, A. Sokolov, K. Graim, C. Funk, K. Verspoor, A. Ben-Hur, G. Pandey, J. M. Yunes, A. S. Talwalkar, S. Repo, M. L. Souza, D. Piovesan, R. Casadio, Z. Wang, J. Cheng, H. Fang, J. Gough, P. Koskinen, P. Törönen, J. Nokso-Koivisto, L. Holm, D. Cozzetto, D. W. A. Buchan, K. Bryson, D. T. Jones, B. Limaye, H. Inamdar, A. Datta, S. K. Manjari, R. Joshi, M. Chitale, D. Kihara, A. M. Lisewski, S. Erdin, E. Venner, O. Lichtarge, R. Rentzsch, H. Yang, A. E. Romero, P. Bhat, A. Paccanaro, T. Hamp, R. Kassner, S. Seemayer, E. Vicedo, C. Schaefer, D. Achten, F. Auer, A. Boehm, T. Braun, M. Hecht, M. Heron, P. Hönigschmid, T. A. Hopf, S. Kaufmann, M. Kiening, D. Krompass, C. Landerer, Y. Mahlich, M. Roos, J. Björne, T. Salakoski, A. Wong, H. Shatkay, F. Gatzmann, I. Sommer, M. N. Wass, M. J. E. Sternberg, N. Škunca, F. Supek, M. Bošnjak, P. Panov, S. Džeroski, T. Šmuc, Y. A. I. Kourmpetis, A. D. J. van Dijk, C. J. F. ter Braak, T. Zhou, Q. Gong, X. Dong, W. Tian, M. Falda, P. Fontana, E. Lavezzo, B. Di Camillo, S. Toppo, L. Lan, N. Djuric, Y. Guo, S. Vucetic, A. Bairoch, M. Linial, P. C. Babbitt, S. E. Brenner, C. Orengo, B. Rost, S. D. Mooney, I. Friedberg, A large-scale evaluation of computational protein function prediction, Nat. Methods, 10 (3), 221-227, 2013.

[3] H. Yang, A. E. Romero, P. Bhat, A. Paccanaro, Ab initio protein function prediction using diffusion on graphs (under preparation).

[4] H. Caniza, A. E. Romero*, S. Heron, H. Yang, A. Devoto, M. Frasca, M. Mesiti, G. Valentini, A. Paccanaro, GOssTo: a user-friendly stand-alone and web tool for calculating semantic similarities on the Gene Ontology, Bioinformatics, 30 (15), 2235-2236, 2014.

[5] A. E. Romero, T. Nepusz, R. Sasidharan, A. Paccanaro, ConSAT, a catalogue of protein consensus domain architectures (under review).

[6] H. Caniza, A. E. Romero, A. Paccanaro, A new measure of disease similarity (under preparation).

[7] G. Valentini, A. Paccanaro, H. Caniza, A. E. Romero, M. Re, An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods, Artif. Intell. Med., 61 (2), 63-78, 2014.