InterPro

Content
Description: InterPro functionally analyzes protein sequences and classifies them into protein families while predicting the presence of domains and functional sites.
Contact
Research center: EMBL
Laboratory: European Bioinformatics Institute
Primary citation: "The InterPro protein families and domains database: 20 years on"[1]
Release date: 1999
Access
Website: www.ebi.ac.uk/interpro/
Download URL: ftp.ebi.ac.uk/pub/databases/interpro/
Miscellaneous
Data release frequency: 8-weekly
Version: 97.0 (9 November 2023)

InterPro is a database of protein families, protein domains and functional sites in which identifiable features found in known proteins can be applied to new protein sequences[2] in order to functionally characterise them.[3][4]

The contents of InterPro consist of diagnostic signatures and the proteins that they significantly match. The signatures consist of models that describe protein families, domains or sites; these range from simple types, such as regular expressions, to more complex ones, such as hidden Markov models. Models are built from the amino acid sequences of known families or domains and are then used to search uncharacterised sequences (such as those arising from novel genome sequencing) in order to classify them. Each member database of InterPro occupies a different niche, from very high-level, structure-based classifications (SUPERFAMILY and CATH-Gene3D) through to quite specific sub-family classifications (PRINTS and PANTHER).
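
The simplest kind of model mentioned above, a regular expression, can be illustrated with a short sketch. The pattern below follows the general shape of the well-known P-loop (Walker A) motif and is used purely as an example; the signature name, the `scan` function and the query sequence are hypothetical and are not taken from InterPro or its member databases.

```python
import re

# Illustrative signature set. The pattern reads: A or G, any four residues,
# then G, K, and finally S or T. The name "example_p_loop" is made up here.
SIGNATURES = {
    "example_p_loop": re.compile(r"[AG].{4}GK[ST]"),
}


def scan(sequence, signatures=SIGNATURES):
    """Return (signature name, start, end, matched text) for every match."""
    hits = []
    for name, pattern in signatures.items():
        for match in pattern.finditer(sequence):
            hits.append((name, match.start(), match.end(), match.group()))
    return hits


# Hypothetical query sequence, invented for this example.
query = "MSTLKAGPSGSGKSTLLRAV"
for hit in scan(query):
    print(hit)
```

A regular expression either matches or it does not, which is why more complex model types such as hidden Markov models are used when a family or domain is too variable to be captured by a fixed pattern.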

InterPro aims to provide a one-stop shop for protein classification, in which all the signatures produced by the different member databases are placed into entries within the InterPro database. Signatures that represent equivalent domains, sites or families are put into the same entry, and entries can also be related to one another. Where possible, additional information such as a description, consistent names and Gene Ontology (GO) terms is associated with each entry.
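
The grouping of signatures into entries can be pictured with a small data-structure sketch. This is only an illustration of the idea described above, not InterPro's actual schema: the class names, field names, accessions and GO identifier are all hypothetical placeholders.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Signature:
    accession: str   # hypothetical member-database accession
    member_db: str   # hypothetical member-database name


@dataclass
class Entry:
    accession: str   # hypothetical entry accession
    name: str
    entry_type: str  # "family", "domain" or "site"
    signatures: list = field(default_factory=list)
    go_terms: list = field(default_factory=list)
    related: set = field(default_factory=set)  # accessions of related entries


def build_index(entries):
    """Map each signature accession to the entry that contains it."""
    index = {}
    for entry in entries:
        for sig in entry.signatures:
            index[sig.accession] = entry.accession
    return index


def entries_for_hits(hit_accessions, index):
    """Collapse signature hits into the distinct entries they belong to.

    Signatures that are not part of any entry are skipped.
    """
    return sorted({index[acc] for acc in hit_accessions if acc in index})


# Two equivalent signatures from different (made-up) databases share one entry.
kinase = Entry(
    accession="ENTRY0001",
    name="Example kinase domain",
    entry_type="domain",
    signatures=[Signature("SIG_A1", "MemberDB_A"), Signature("SIG_B7", "MemberDB_B")],
    go_terms=["GO:0000000"],  # placeholder identifier
)
family = Entry(
    accession="ENTRY0002",
    name="Example kinase family",
    entry_type="family",
    signatures=[Signature("SIG_C3", "MemberDB_C")],
    related={"ENTRY0001"},
)

index = build_index([kinase, family])
print(entries_for_hits(["SIG_A1", "SIG_B7", "SIG_C3", "SIG_UNKNOWN"], index))
```

Because equivalent signatures resolve to the same entry, a protein matched by several member databases is reported once per entry rather than once per signature.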

  1. ^ Blum M, Chang HY, Chuguransky S, Grego T, Kandasaamy S, Mitchell A, et al. (November 2020). "The InterPro protein families and domains database: 20 years on". Nucleic Acids Research. 49 (D1): D344–D354. doi:10.1093/nar/gkaa977. PMC 7778928. PMID 33156333.
  2. ^ Hunter S, Jones P, Mitchell A, Apweiler R, Attwood TK, Bateman A, et al. (January 2012). "InterPro in 2011: new developments in the family and domain prediction database". Nucleic Acids Research. 40 (Database issue): D306–D312. doi:10.1093/nar/gkr948. PMC 3245097. PMID 22096229.
  3. ^ Apweiler R, Attwood TK, Bairoch A, Bateman A, Birney E, Biswas M, et al. (January 2001). "The InterPro database, an integrated documentation resource for protein families, domains and functional sites". Nucleic Acids Research. 29 (1): 37–40. doi:10.1093/nar/29.1.37. PMC 29841. PMID 11125043.
  4. ^ Apweiler R, Attwood TK, Bairoch A, Bateman A, Birney E, Biswas M, et al. (December 2000). "InterPro--an integrated documentation resource for protein families, domains and functional sites". Bioinformatics. 16 (12): 1145–50. doi:10.1093/bioinformatics/16.12.1145. PMID 11159333.
