Hauptseminar WS 2011/2012

Type:            Seminar (2 SWS)
ECTS:             4.0
Lecturer:      Burkhard Rost
Time:           Monday, 12:15 - 13:45
Room:          MI 01.09.034
Language:    English


 Topics related to the research interests of the group: protein sequence analysis, sequence based predictions, 
 protein structure prediction and analysis; interaction networks.


 Pre-meeting: Monday, August 1, 2 pm
 Topics were assigned during the pre-meeting as below.



24.10.2011         Nicolas Guirao: Protein (structure) family databases
                           Advisor: Andrea Schafferhans

31.10.2011         Hagen Fritsch: Protein design and engineering
                           Advisor: Marc Offman

07.11.2011         Simon Dirmeier:  Homology-based annotation of protein function in silico
                            Advisor: Edda Kloppmann

14.11.2011         Julia-Sophie Heier: Analysing enzyme function using structures
                           Advisor: Andrea Schafferhans

21.11.2011         Julia Gerke: Prediction using Sequences, Structures and more
                           Advisor: Tobias Hamp

28.11.2011         Valentina Klaus: Prediction of functional effects of non-synonymous SNPs
                           Advisor: Shaila Roessle

05.12.2011         Maria Kalemanov: Disease Networks
                           Advisor: Arthur Dong

12.12.2011         Christoph Hamm: Genetic heterogeneity and disease
                           Advisor: Christian Schaefer

09.01.2012         Martin Steinegger: Application of cloud computing in bioinformatics
                           Advisor: Laszlo Kajan

Open Topics

Molecular Dynamics
Advisor: Marc Offman

Combining physical and genetic interaction networks
Advisor: Arthur Dong

Predicting regions with no regular structure (NORS)
Advisor: Esmeralda Vicedo



Protein design and engineering

Dr. Marc Offman

Proteins are central to most biological processes, and their spectrum of functions is seemingly endless. Because proteins are found in almost all living forms and each organism has had to adapt to evolutionary pressure over millions of years, a large number of different proteins have evolved. Some of these proteins could potentially be used as drugs, others need to be adapted (engineered), and for some purposes new proteins need to be designed. In protein engineering and design, either known proteins are adapted to meet certain criteria such as increased stability, function, activity and recognition, or novel protein folds are created. Because proteins are large, complicated molecules with a huge number of degrees of freedom, protein engineering seems to be an unsolvable task. Nevertheless, methods are under constant development and show some success: engineered proteins can already be used as therapeutics and as tools for cell biology.


Homology-based annotation of protein function in silico

Dr. Edda Kloppmann

The majority of known proteins have not yet been characterized experimentally. Annotating protein function in silico using available information on their sequence, their structure, their evolutionary history, and their association with other proteins will help to further the understanding of the known protein sequences. This talk should introduce gene ontologies and focus on the homology-based prediction of protein function.
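
The idea of homology-based annotation transfer can be illustrated with a minimal sketch. It assumes that alignment hits (for example from a sequence search) are already available as records of target ID, percent identity and E-value. The thresholds, the `Hit` type, the function name and the example protein IDs are illustrative assumptions, not taken from any of the tools cited below.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One alignment hit of the query against an annotated protein (hypothetical record)."""
    target_id: str
    percent_identity: float
    e_value: float


def transfer_go_terms(hits, annotations, min_identity=40.0, max_e_value=1e-5):
    """Collect GO terms from all hits that pass both thresholds.

    The default thresholds are arbitrary illustrative values, not recommendations.
    """
    transferred = set()
    for hit in hits:
        if hit.percent_identity >= min_identity and hit.e_value <= max_e_value:
            transferred.update(annotations.get(hit.target_id, ()))
    return transferred


# Hypothetical example data: two annotated proteins and two hits of a query.
annotations = {
    "protA": {"GO:0004672", "GO:0005524"},
    "protB": {"GO:0003677"},
}
hits = [
    Hit("protA", 62.0, 1e-40),  # close homolog: passes both thresholds
    Hit("protB", 25.0, 0.5),    # weak hit: rejected
]
predicted = transfer_go_terms(hits, annotations)
print(sorted(predicted))
```

Real tools typically go further, for instance by weighting terms by hit quality or by using the structure of the Gene Ontology; this sketch only shows the basic transfer step.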


  • A Vinayagam, C del Val, F Schubert, R Eils, KH Glatting, S Suhai, R König. GOPET: a tool for automated predictions of Gene Ontology terms. BMC Bioinformatics (2006)  7: 161.
  • S Götz, JM García-Gómez, J Terol, TD Williams, SH Nagaraj, MJ Nueda, M Robles, M Talón, J Dopazo, A Conesa. High-throughput functional annotation and data mining with the Blast2GO suite. Nucleic Acids Res (2008) 36:3420-35.
  • A Vinayagam, R König, J Moormann, F Schubert, R Eils, K-H Glatting, S Suhai. Applying Support Vector Machines for Gene Ontology based gene function prediction. BMC Bioinformatics (2004) 5:116.


Analysing enzyme function using structures

Dr. Andrea Schafferhans

Annotating protein function is routinely done by transferring annotations from related sequences. However, this is a very crude annotation. This talk should give an introduction to methods that infer more specific information from a structural analysis of the active site.


  • Melo-Minardi,R.C. de et al. (2010) Identification of Subfamily-specific Sites based on Active Sites Modeling and Clustering. Bioinformatics, 26, 3075-3082. http://www.ncbi.nlm.nih.gov/pubmed/20980272
  • Nagao,C. et al. (2010) Relationships between functional subclasses and information contained in active-site and ligand-binding residues in diverse superfamilies. Proteins, 78, 2369-84. www.ncbi.nlm.nih.gov/pubmed/20544971


Protein Function Prediction using Sequences, Structures and more

Tobias Hamp

This is the third and final installment of a series of talks about the prediction of protein function in silico. Whereas the first two focus on function prediction by sequence and structure, respectively, this talk should introduce so-called meta predictors, i.e. tools that combine sequence and structure information and/or make use of other data such as protein-protein interactions to annotate protein function.


Protein (structure) family databases

Dr. Andrea Schafferhans

A number of databases collect classifications of proteins by structure and sequence families. This talk shall give an overview of the commonalities and differences of the most well-known of these databases: CATH, SCOP and Superfamily.



Prediction of functional effects of non-synonymous SNPs

Dr. Shaila Roessle

Single Nucleotide Polymorphisms (SNPs) represent a very large portion of all genetic variations. SNPs found in the coding regions of genes are often non-synonymous, changing a single amino acid in the encoded protein sequence. Non-synonymous SNPs are either "neutral" in the sense that the resulting point-mutated protein is not functionally discernible from the wild-type, or they are "non-neutral" in that the mutant and wild-type differ in function. The ability to identify non-neutral substitutions in an ocean of SNPs could significantly aid in targeting detrimental, disease-causing mutations, as well as SNPs that increase the fitness of particular phenotypes.

There are methods based on physical and comparative considerations that estimate the impact of an amino acid replacement on the three-dimensional structure and function of the protein.
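
The difference between a synonymous and a non-synonymous SNP mentioned above can be made concrete with a short sketch based on the standard genetic code. The function name `classify_snp` and its interface are illustrative assumptions. The sketch only tells whether the encoded amino acid changes; it says nothing about whether the change is neutral, which is what the prediction methods try to estimate.

```python
from itertools import product

# Standard genetic code, with codons enumerated in the conventional TCAG order
# (first base varies slowest, third base fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(codon, position, new_base):
    """Classify a single-base change within one codon.

    `position` is the 0-based index (0, 1 or 2) of the changed base.
    Simplification: changes to or from a stop codon are also reported
    as "non-synonymous".
    """
    mutated = codon[:position] + new_base + codon[position + 1:]
    if CODON_TABLE[codon] == CODON_TABLE[mutated]:
        return "synonymous"
    return "non-synonymous"


# GAG (Glu) -> GTG (Val): the amino acid changes.
print(classify_snp("GAG", 1, "T"))
# GAA (Glu) -> GAG (Glu): the amino acid stays the same.
print(classify_snp("GAA", 2, "G"))
```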


  • A method and server for predicting damaging missense mutations: Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, Kondrashov AS, Sunyaev SR. Nat Methods 7(4):248-249 (2010)
  • SNAP: predict effect of non-synonymous polymorphisms on function. Yana Bromberg and Burkhard Rost. Nucleic Acids Research, 2007, Vol. 35, No. 11, 3823-3835


Disease Networks

Dr. Arthur Dong

Molecular studies of diseases have traditionally focused on single genes (so-called monogenic diseases). However, most common diseases are surprisingly complex, involving the interplay of multiple genes and proteins. The increasing availability of genome-scale data and the rise of systems biology have ushered in a new era of network-based disease studies.


  • The human disease network. PNAS 2007 May 22;104(21):8685-90

  • Network-based classification of breast cancer metastasis. Mol Syst Biol. 3:140 (2007)


Physical and Genetic Interaction Networks

Dr. Arthur Dong

Proteins are the main molecular actors in a cell, but they rarely carry out their functions alone. Instead, they physically interact with each other in most biological processes. The physical interactions can be either permanent, as in protein complexes, or transient, as in signal transduction. Proteins can also be highly correlated without interacting with each other physically; for example, one protein may induce or suppress another protein, or two proteins may participate in the same pathway. Such indirect interactions are termed genetic interactions. Both physical and genetic interactions in a cell form complex networks, with intriguing properties. In this study we combine the two types of networks to obtain further insights.
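
As a minimal sketch of what combining the two network types can mean at the simplest level, both networks can be stored as sets of undirected edges and compared with set operations. The gene names and the helper functions `edge` and `combine` are hypothetical, and this is not the method of the reference below, which interprets genetic interactions in the context of physical pathways.

```python
def edge(a, b):
    """Undirected edge: the order of the two endpoints does not matter."""
    return frozenset((a, b))


def combine(physical, genetic):
    """Split interactions by the evidence types that support them."""
    return {
        "both": physical & genetic,
        "physical_only": physical - genetic,
        "genetic_only": genetic - physical,
    }


# Hypothetical toy networks over genes G1..G6.
physical = {edge("G1", "G2"), edge("G2", "G3"), edge("G4", "G5")}
genetic = {edge("G2", "G1"), edge("G1", "G3"), edge("G4", "G6")}

combined = combine(physical, genetic)
for label, edges in combined.items():
    print(label, sorted(sorted(e) for e in edges))
```

Pairs supported by both evidence types are natural starting points for further analysis, while pairs with only a genetic interaction may point to proteins acting in separate but related pathways.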


  • Kelley, R. and Ideker, T. Systematic interpretation of genetic interactions using protein networks. Nature Biotechnology 23(5):561-566


Genetic heterogeneity and disease

Dipl.-Bioinf. Christian Schaefer

How do mutations influence susceptibility to disease? How could relationships be found between genotype and phenotype? In this seminar general topics like GWAS, SNPs, genetic diseases and examples should be presented and discussed. Special emphasis should be placed on differences between neutral nsSNPs, disease-associated and cancer-related variants.


  • Talavera D., Taylor M., Thornton J. (2010) The (non)malignancy of cancerous amino acidic substitutions. Proteins 78:518-529 
  • Gong S., Blundell T. (2010) Structural and functional restraints on the occurrence of single amino acid variations in human proteins. PLoS ONE 5(2): e9186


  • J. McClellan, M.-C. King (2010) Genetic heterogeneity in human disease, Cell
  • (J. Hardy, A. Singleton (2009) Genomewide association studies and human disease, N Engl J Med)


Automated prediction method assessment practices in bioinformatics

Present an overview of the current state of continuous automated performance assessment solutions in bioinformatics.


Predicting regions with no regular structure (NORS)

Diplom Biol. Esmeralda Vicedo

One common definition of regions of “disorder” in proteins is that they do not adopt a regular three-dimensional (3D) structure in isolation. These disordered regions contrast with regions that are well structured or “ordered”. Notably, there is a great variety of “flavors” of disorder: some regions adopt a unique regular 3D structure only upon binding; others, for example loops, remain irregular; some proteins are almost entirely disordered, and others have only short disordered regions. Numerous computational methods exist that predict disorder based on a variety of concepts. One of these methods, Norsnet, has been developed in our group. Norsnet uses a neural network to predict disordered regions of the “loopy” type (unstructured loops), which are important regions for network complexity.