Creation and visualization of secondary structure consensus for protein families

Warning

This publication does not fall under the Institute of Computer Science but under the Faculty of Science. The official publication page is on the muni.cz website.
Authors

MIDLIK Adam, HUTAŘOVÁ VAŘEKOVÁ Ivana, HUTAŘ Jan, NAVRÁTILOVÁ Veronika, KOČA Jaroslav, BERKA Karel, SVOBODOVÁ VAŘEKOVÁ Radka

Year of publication 2019
Type Other conference presentations
MU Faculty or unit

Faculty of Science

Citation
Description Protein structural data deposited in the Protein Data Bank are a valuable source of information, and their volume is growing continuously (currently more than 150 000 structures). Most protein structures can be classified into protein families based on their similarity [1]. Systematic study of these families is gaining importance and can yield interesting research results. Every protein family has a set of characteristic secondary structure elements (SSEs, namely helices and β-strands), whose arrangement is well defined and consistent throughout the family. However, members of a family always differ to some extent, so a single structure cannot represent the whole family of structures, just as a single amino acid sequence cannot represent the whole family of sequences. For sequences, this problem is solved by multiple sequence alignment, which produces a consensus sequence that can be visualized as a sequence logo [2]. The logo extracts the essential features of the family and shows the similarities and differences within it. For secondary structure, such an approach is currently missing. In this work, we introduce computational methods for extracting and visualizing the secondary structure consensus for a given protein family. Apart from giving an overview of the family, this consensus can also be used as an annotation template for our previously developed program SecStrAnnotator [3]. This allows annotation of SSEs in any family and opens the possibility of automated annotation of key regions (e.g. active sites and channels) based on their position relative to the SSEs.

[1] Dawson NL, Lewis TE, Das S, Lees JG, Lee D, Ashford P, Orengo CA, Sillitoe I (2017) CATH: an expanded resource to predict protein function through structure and sequence. Nucleic Acids Res 45(D1):D289-D295. https://doi.org/10.1093/nar/gkw1098
