FireProtDB
v2.0
Mutational data for protein stability
Loschmidt Laboratories
Help

Input page

The input page is divided into four main parts: (1) the search form, described in the following section, (2) the main menu, (3) the panel containing information about the FireProtDB database, and (4) the statistics describing the current state of the data.

 

Search form

The basic search form (1) performs a full-text search across several fields: the name of the protein, organism, EC number, UniProt ID, PDB ID, PMID, and DOI. For more precise results, the advanced search (5) can be used. The advanced search allows the user to restrict the search by the values of specific fields and to construct more complex queries using the AND/OR operators. Some search terms, e.g. "The experiment is stabilising", already represent a compound query. The terms are as follows:

  • Protein name: full-text search of protein name, e.g. Lysozyme
  • UniProtKB accession: match to UniProtKB ID
  • Organism: full-text search on the organism from which the protein originates, e.g. Escherichia coli
  • Enzyme commission number: match on EC number
  • InterPro family: whether the protein belongs to specified InterPro families
  • ΔG: The Gibbs free energy of unfolding, associated with the wild-type protein.
  • Tm: The melting temperature of the protein wild-type.
  • The experiment is stabilising: compound query returning all stabilising mutations. It combines the values of ΔΔG, ΔTm, Normalized_ΔΔG, and Fitness. An experiment is considered stabilising if ΔΔG < -0.5 kcal/mol, ΔTm > 1 °C, Normalized_ΔΔG < -0.3, or Fitness > 0.3. The term has two additional options. The "ΔΔG and ΔTm must agree" checkbox returns a mutation only if both ΔΔG and ΔTm mark it as stabilising.
  • The experiment is destabilising: analogous to the stabilising term. An experiment is considered destabilising if ΔΔG > 0.5 kcal/mol, ΔTm < -1 °C, Normalized_ΔΔG > 0.3, or Fitness < -0.3.
  • The experiment is neutral: analogous to the two terms above. An experiment is considered neutral if ΔΔG lies in the interval <-1.0, 1.0> kcal/mol, ΔTm in <-1, 1> °C, Normalized_ΔΔG in <-0.3, 0.3>, or Fitness in <-0.3, 0.3>.
  • ΔΔG: compares the specified value with the ΔΔG values of all existing datapoints.
  • Experiment has/has NOT ΔΔG value: returns all datapoints with/without a known ΔΔG value.
  • Experiment has/has NOT ΔTm value: returns all datapoints with/without a known ΔTm value.
  • ΔTm: compares the specified value with the ΔTm values of all existing datapoints.
  • Sequence: search based on the match of protein sequence
  • Sequence length: search based on the size of protein sequence
  • Mutated position: returns all mutations at a given position. It can be combined with a specific protein to check whether any mutations are known at that position.
  • PDB identifier: limits the search to the protein specified by PDB ID.
  • B-factor: filters datapoints based on the specified flexibility threshold. Only available for datapoints with known/analysed tertiary structure.
  • Dataset: can be used to search for mutations that are included/excluded from specific datasets
  • Conservation: filters datapoints based on the specified conservation threshold.
  • Target amino acid: returns only mutations to the given amino acid.
  • pH: returns all datapoints that were measured at the specified pH.
  • Publication DOI/PMID: search based on the DOI or PMID of the publication from which the experiment was obtained.
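
The thresholds behind the three compound terms can be written down as simple predicates. The following Python sketch is only an illustration of the rules listed above, not the FireProtDB implementation: the `Measurement` class and the function names are hypothetical, the neutral intervals are assumed to be closed, and the "must agree" option is assumed to require that both ΔΔG and ΔTm are present and stabilising.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Measurement:
    """Hypothetical container for one datapoint; None means 'no value'."""
    ddg: Optional[float] = None             # ΔΔG, kcal/mol
    dtm: Optional[float] = None             # ΔTm, °C
    normalized_ddg: Optional[float] = None  # Normalized_ΔΔG
    fitness: Optional[float] = None         # Fitness


def _lt(value, limit):
    # True only if the value exists and is strictly below the limit.
    return value is not None and value < limit


def _gt(value, limit):
    # True only if the value exists and is strictly above the limit.
    return value is not None and value > limit


def _within(value, bound):
    # Assumption: the neutral intervals <-bound, bound> are closed.
    return value is not None and -bound <= value <= bound


def is_stabilising(m, ddg_dtm_must_agree=False):
    ddg_ok, dtm_ok = _lt(m.ddg, -0.5), _gt(m.dtm, 1.0)
    if ddg_dtm_must_agree:
        # Assumption: both values must be present and both stabilising.
        return ddg_ok and dtm_ok
    return ddg_ok or dtm_ok or _lt(m.normalized_ddg, -0.3) or _gt(m.fitness, 0.3)


def is_destabilising(m):
    return (_gt(m.ddg, 0.5) or _lt(m.dtm, -1.0)
            or _gt(m.normalized_ddg, 0.3) or _lt(m.fitness, -0.3))


def is_neutral(m):
    return (_within(m.ddg, 1.0) or _within(m.dtm, 1.0)
            or _within(m.normalized_ddg, 0.3) or _within(m.fitness, 0.3))
```

For example, `is_stabilising(Measurement(ddg=-0.8))` is true, while the same call with `ddg_dtm_must_agree=True` is false because no ΔTm value is available. The predicates are evaluated independently, so under the listed thresholds a single datapoint can satisfy more than one of them.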

 

A complex search query can be constructed by adding new terms with the plus button and combining them using the AND/OR operators. The following picture (5) captures a situation where the user requests all stabilising mutations (without any conflicts across multiple experiments) for the protein with PDB ID 1WQ5. It is also possible to see all available data by clicking Browse database in the menu panel.
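
One way to picture how terms combine is as a small expression tree in which every leaf is a single search term and every inner node is an AND or OR operator. The Python sketch below is a conceptual model only; it does not describe the FireProtDB backend or any public API. The classes `Term`, `And`, `Or`, the record keys, and the three toy records (including their ΔΔG values) are all invented for illustration; only the PDB ID 1WQ5 and the ΔΔG < -0.5 kcal/mol threshold come from the text above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union

Record = Dict[str, object]  # hypothetical flat representation of a datapoint


@dataclass
class Term:
    label: str                      # human-readable name of the search term
    test: Callable[[Record], bool]  # predicate applied to one record


@dataclass
class And:
    left: "Query"
    right: "Query"


@dataclass
class Or:
    left: "Query"
    right: "Query"


Query = Union[Term, And, Or]


def matches(query: Query, record: Record) -> bool:
    """Recursively evaluate the query tree against a single record."""
    if isinstance(query, Term):
        return query.test(record)
    if isinstance(query, And):
        return matches(query.left, record) and matches(query.right, record)
    if isinstance(query, Or):
        return matches(query.left, record) or matches(query.right, record)
    raise TypeError(f"unsupported query node: {query!r}")


def search(query: Query, records: List[Record]) -> List[Record]:
    return [r for r in records if matches(query, r)]


# Toy records with invented values, used only to exercise the sketch.
records = [
    {"pdb_id": "1WQ5", "mutation": "D3A", "ddg": -1.2},
    {"pdb_id": "1WQ5", "mutation": "D3G", "ddg": 0.9},
    {"pdb_id": "TOY1", "mutation": "A5V", "ddg": -0.7},
]

# "PDB identifier is 1WQ5" AND "ΔΔG marks the experiment as stabilising".
query = And(
    Term("PDB identifier", lambda r: r["pdb_id"] == "1WQ5"),
    Term("stabilising ΔΔG", lambda r: r["ddg"] < -0.5),
)

hits = search(query, records)
```

With these toy records, `hits` contains only the D3A entry: the second record fails the ΔΔG term and the third fails the PDB term.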

Statistics

FireProtDB provides various statistics describing the current state of the stored data. The numbers of stabilising and destabilising mutations, the most represented proteins, the most common protein families, and the amino acid distribution can be found at the bottom of the input page.

 

Result page

Once the data has been retrieved from the database according to the provided criteria, a table with all available information about each datapoint is shown, as seen below. The displayed columns can be selected using the panel at the top of the results page (click the arrow button on the right side to list all available columns). The mutation format in the second column is as follows: (i) wild_type_aa position mutation_aa for substitutions, (ii) ins position inserted_aa for insertions, (iii) del position deleted_aa for deletions. Clicking on the mutation hyperlink takes the user to the mutation page, which contains all experimental information for the given mutation (from multiple experiments). Clicking on the protein name opens the protein page.
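
When exported results are processed with a script, the three mutation formats can be told apart with regular expressions. The following Python sketch is one possible parser, not code provided by FireProtDB. It assumes the compact spelling used in the glossary examples (D3A, ins3A, del3D), single-letter amino acid codes, and that an insertion or deletion may list one or more residues; `Mutation` and `parse_mutation` are hypothetical names.

```python
import re
from typing import NamedTuple


class Mutation(NamedTuple):
    kind: str        # "substitution", "insertion", or "deletion"
    position: int    # position in the protein sequence
    wild_type: str   # original residue(s); empty for insertions
    mutant: str      # new residue(s); empty for deletions


_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")
_INSERTION = re.compile(r"^ins(\d+)([A-Z]+)$")
_DELETION = re.compile(r"^del(\d+)([A-Z]+)$")


def parse_mutation(text: str) -> Mutation:
    """Parse a mutation string such as 'D3A', 'ins3A', or 'del3D'."""
    token = "".join(text.split())  # tolerate spaces between the parts
    match = _SUBSTITUTION.match(token)
    if match:
        wild_type, position, mutant = match.groups()
        return Mutation("substitution", int(position), wild_type, mutant)
    match = _INSERTION.match(token)
    if match:
        position, inserted = match.groups()
        return Mutation("insertion", int(position), "", inserted)
    match = _DELETION.match(token)
    if match:
        position, deleted = match.groups()
        return Mutation("deletion", int(position), deleted, "")
    raise ValueError(f"unrecognised mutation format: {text!r}")
```

For example, `parse_mutation("D3A")` yields a substitution of D by A at position 3, and `parse_mutation("del3D")` yields a deletion of D at position 3. Strings that fit none of the three patterns raise `ValueError`.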

 

Protein page

The protein page is organised into several sections, from top to bottom: (i) general information about the protein, such as the UniProt ID and the species from which it was obtained, (ii) a list of protein domains with links to the InterPro database, (iii) tracks showing sequence-level information along the protein sequence, (iv) a table containing sequence measurements, i.e. information that is connected to the wild-type protein or the experiment itself and does not depend on the mutation (it can be sorted and visualised based on values from the panel on the left), (v) a visualisation of the 3D structure of the protein, and (vi) a table of mutations connected to the protein, in the same format as on the results page.

 

Glossary

  • Protein: Name of the protein
  • Mutations: List of mutations resulting in the given measurements. For substitutions, the format is wild-type_aa position mutant_aa (D3A meaning substitution of D by A at position 3). For insertions and deletions, the formats are ins position inserted_aa (ins3A meaning A inserted at position 3) and del position deleted_aa (del3D meaning D removed from position 3).
  • Tm: Temperature during the thermal denaturation when half of the protein is unfolded.
  • ΔTm: The difference in the melting temperature between the wild-type and mutant protein. Positive and negative values denote stabilising and destabilising mutations, respectively.
  • ΔG: Free energy of unfolding measured by the concentration of denaturant or extrapolated from ΔCp in the case of thermal denaturation.
  • ΔΔG: The difference in ΔG between the wild-type and the mutant. Negative and positive values denote stabilising and destabilising mutations, respectively. Values in the interval <-1.0, 1.0> kcal/mol are considered neutral.
  • Normalised ΔΔG: Values as calculated in the Domainome dataset publication. A value of 0 implies no change in stability, while values < -0.3 or > +0.3 represent highly stabilising or destabilising mutations, respectively, as per the reference definitions.
  • Normalised ΔΔG std: Standard deviation of the Normalised ΔΔG between the predictions given by individual homologs.
  • Fitness: Amino acid fitness influencing the relative stability of individual amino acids at a given site. Positive and negative values denote stabilising and destabilising mutations, respectively, with values in the interval <-0.3, 0.3> considered (close to) neutral.
  • Trypsin ML: Comparison of ΔG measurements while using trypsin to control the effects of the protease activity.
  • Chymotrypsin ML: Comparison of ΔG measurements while using chymotrypsin to control the effects of the protease activity.
  • Cm: Concentration of denaturant at which half of the protein is unfolded.
  • ΔCp: Heat capacity change of denaturation.
  • ΔH: Enthalpy change of denaturation.
  • ΔHvH: van’t Hoff enthalpy change of denaturation.
  • Fitness std: Standard deviation of the fitness values between the individual homologs.
  • m: Slope of the dependence of ΔG on denaturant concentration.
  • Reversibility: States whether denaturation is reversible.
  • Stabilizing: Stabilizing tag from the MegaScale dataset.
  • State: Number of transition states. Numbers higher than two signify the existence of an intermediate.
  • Buffer: Name of the buffer.
  • Buffer conc: Buffer concentration.
  • Ion: Name of the added ion.
  • Ion conc: Ion concentration.
  • pH: Experimental pH value.
  • Method: Denaturation method (thermal, chemical).
  • Measure: Experimental technique used to monitor denaturation (CD, Fl, DSC, ...).
  • Exp. temperature: Temperature used in the experiment in the case of chemical denaturation.
  • EC number: Enzyme Commission (EC) number of the protein.
  • UniProtKB: UniProt accession ID.
  • InterPro: InterPro domain accession code.
  • PDB: Accession ID to the PDB database.
  • Publication: DOI or PMID accession.
  • Dataset: Annotation to the source of the data (or other datasets where entry was utilized).
  • Organism: Source organism.
  • Sequence length: Size of the protein sequence.