AggreProt: A Web Server for Predicting and Engineering Protein Aggregation based on Sequential and Structural Features
Joan Planas-Iglesias*, Simeon Borko*, Jan Swiatkowski*, Matej Elias, Martin Havlasek, Antonin Kunka, Tomas Martinovic, Ekaterina Grakova, Jiri Damborsky, Jan Martinovic#, David Bednar#, manuscript in preparation, 2024.
*shared first authors, #authors for correspondence

AggreProt is a web server for predicting and engineering protein aggregation. The predictor is an ensemble of deep convolutional networks (14 layers of 14 neurons each, each neuron weighted by 30 different parameters), trained on the 1416 hexapeptides represented in WaltzDB. The model was validated on AmyPro37, the subset of the AmyPro database consisting of proteins that do not contain any of the hexapeptides used during training. We developed three different models of the predictor. The first is based exclusively on residue-level, sequence-based features, previously described as “atomic characteristics”. The second additionally takes into account structural information, as provided in WaltzDB and calculated by CORDAX, along with the aggregation prediction of that method. The third is a combination of the two.
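The construction of a validation subset that shares no hexapeptide with the training data can be illustrated with a minimal sketch. It is not the AggreProt code: the function names `hexapeptides` and `independent_subset`, the protein identifiers, and the toy sequences are hypothetical and serve only to show the filtering idea, assuming that overlap is checked on every overlapping six-residue window.

```python
from typing import Dict, List, Set

WINDOW = 6  # hexapeptide length


def hexapeptides(sequence: str) -> Set[str]:
    """Return every overlapping 6-residue window of a sequence."""
    seq = sequence.upper()
    return {seq[i:i + WINDOW] for i in range(len(seq) - WINDOW + 1)}


def independent_subset(proteins: Dict[str, str], training: Set[str]) -> List[str]:
    """Return IDs of proteins that share no hexapeptide with the training set."""
    return [
        pid
        for pid, seq in proteins.items()
        if hexapeptides(seq).isdisjoint(training)
    ]


# Toy data for illustration only; not claimed to be WaltzDB or AmyPro entries.
training = {"STVIIE", "KLVFFA"}
proteins = {
    "toy_A": "MKLVFFAEDV",    # contains KLVFFA, so it is excluded
    "toy_B": "MGSSHHHHHHSS",  # no shared hexapeptide, so it is kept
}

print(independent_subset(proteins, training))  # ['toy_B']
```

Sequences shorter than six residues yield no windows and are therefore kept by this sketch; a real pipeline would likely handle them explicitly.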

AggreProt is designed to make the prediction easy for the user to understand. After the user inputs the sequence to be analyzed (and, optionally, its corresponding structure), the calculations can be run and the results are presented in an easy-to-understand interface. First, a 1D pane represents the protein sequence, which the user can zoom in and out of. Below it, the aggregation propensity is displayed alongside other relevant properties, such as solvent-accessible area and transmembrane probability. Finally, the 3D view is connected to the previous panes to help contextualize the visualized results in the protein 3D structure.