Liquid–liquid phase separation sequence-based predictors explained

LLPS often involves sequence regions with distinctive functional characteristics, as well as prion-like and RNA-binding domains. Only a few methods currently exist to predict the propensity of a protein to drive LLPS. The problem is challenging because of the range of biological mechanisms involved in LLPS, the limited knowledge about these mechanisms and the strong context dependence of LLPS. Despite recent advances in the field, only a few LLPS-specific predictors have been developed to relate protein sequence properties to the capability to drive LLPS. Here we review the state-of-the-art LLPS sequence-based predictors, briefly introducing each one and explaining which individual protein characteristics it identifies in the context of LLPS.

Table 2

| Predictor | Published | Description - Type of data |
|---|---|---|
| PSPer[1] | 2019 | PSPer is trained to identify prion-like RNA-binding phase-separation proteins (PSPs). It focuses on this particular feature of LLPS proteins and gives each protein an overall score that depends on the presence of the feature. The method is a probabilistic model trained on an experimental dataset of FUS-like PSPs and on the biophysical characteristics of their regions (prion-like domain (PLD), RNA-binding domain, RNA-recognition motif, disordered regions and additional domains). Training also included a negative dataset of ordered proteins, so its performance is expected to be better for disordered proteins that drive LLPS. |
| PLAAC[2] | 2014 | PLAAC predicts regions of prion-like amino acid composition, which are usually enriched in polar residues, using a hidden Markov model (HMM). The method was developed before the implication of PLDs in LLPS was recognized, and consequently it is not trained to identify the majority of phase-separating regions. |
| PScore[3] | 2017 | PScore is a statistical scoring algorithm that predicts pi-pi interactions. It compares the pi-pi interactions predicted for the target protein with those of all proteins found in the PDB to assign a phase-separation propensity score. |
| catGRANULE[4] | 2016 | catGRANULE was originally trained on yeast proteins, but it has been shown to be useful for predicting human phase-separating proteins.[5] The algorithm uses sequence composition statistics to distinguish proteins that localize to yeast granules from the rest of the yeast proteome. The features used to weight the residues are disorder and nucleic-acid binding propensities, as well as properties of some amino acids. |
| PSPredictor[6] | 2019 | PSPredictor is a machine learning approach for predicting proteins that phase separate, trained on a set of experimentally validated protein sequences from the LLPSDB database. |
| PSAP[7] | 2021 | PSAP is a random forest classifier that predicts the probability that a protein mediates phase separation. The classifier is trained on a set of 90 high-confidence human proteins that drive LLPS. |
| FuzDrop[8] | 2020 | FuzDrop predicts droplet-promoting regions and droplet-driving proteins. The algorithm was trained on a dataset of drivers collected from different public databases, and the output is a per-residue probability of droplet formation. |
| ParSe[9] | 2022 | ParSe v2 explores the possibility that protein-mediated phase separation can be predicted from sequence-based calculations of hydrophobicity, α-helix propensity and a model of the polymer scaling exponent (ν_model). The algorithm was trained on a curated dataset of homotypic phase-separating intrinsically disordered sequences that were experimentally verified to phase-separate in vitro. |
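
Several of the predictors listed above score a protein from the composition of its sequence and report either a per-residue profile or a single overall value. The following Python sketch illustrates that general idea with a sliding window that measures the local fraction of polar residues. It is not the algorithm of any listed predictor: the residue set `POLAR_RESIDUES`, the window size and the function names `windowed_fraction` and `overall_score` are assumptions made for illustration only.

```python
from typing import FrozenSet, List

# Illustrative set of polar, uncharged residues (an assumption for this
# sketch, not a parameter taken from any published predictor).
POLAR_RESIDUES: FrozenSet[str] = frozenset("NQSTY")


def windowed_fraction(sequence: str, window: int = 5,
                      residues: FrozenSet[str] = POLAR_RESIDUES) -> List[float]:
    """Return a per-residue profile: the fraction of `residues` found in a
    window centred on each position (windows are truncated at the termini)."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    seq = sequence.upper()
    half = window // 2
    profile = []
    for i in range(len(seq)):
        segment = seq[max(0, i - half):min(len(seq), i + half + 1)]
        hits = sum(1 for aa in segment if aa in residues)
        profile.append(hits / len(segment))
    return profile


def overall_score(profile: List[float]) -> float:
    """Collapse a per-residue profile into one value (here, its maximum)."""
    return max(profile) if profile else 0.0


if __name__ == "__main__":
    toy_sequence = "MKAAQQNSSYQQAAKL"  # made-up sequence, not a real protein
    profile = windowed_fraction(toy_sequence, window=5)
    print([round(p, 2) for p in profile])
    print(overall_score(profile))
```

Real predictors replace the simple residue count with trained weights or statistical models, so the profile produced here should be read only as a picture of the input and output shapes involved.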

LLPS Simulations

Another important computational resource in the field of LLPS is the theoretical simulation of proteins, particularly intrinsically disordered proteins (IDPs), that drive LLPS. These simulations complement experiments and provide important insights into the molecular mechanisms by which individual proteins drive LLPS. A review by Dignon et al.[10] discussed how these simulations can be applied to interpret experimental results, to explain phase behavior and to provide predictive frameworks for designing proteins with tunable phase transition properties. The challenge is the compromise between the resolution of the model and computational efficiency, since all-atom simulations of large systems involving IDPs are still difficult to perform. Moreover, the molecular interactions among IDPs in the droplet state are still poorly understood, and the combination of experimental data and simulations is indispensable to elucidate them. Improvements in sampling and simulation methods in the next few years may help to clarify these mechanisms.[11]
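
The trade-off between model resolution and computational efficiency can be made concrete with a back-of-the-envelope count of particle pairs. The Python sketch below compares a hypothetical coarse-grained model with one bead per residue against an all-atom description of the same chains. The value of `atoms_per_residue` is a rough placeholder, solvent is ignored, and real simulation codes use cutoffs and neighbor lists, so the numbers indicate only the direction of the scaling rather than actual run times.

```python
from typing import Dict, Optional, Union


def pair_count(n_particles: int) -> int:
    """Number of distinct particle pairs, n * (n - 1) / 2."""
    if n_particles < 0:
        raise ValueError("n_particles must be non-negative")
    return n_particles * (n_particles - 1) // 2


def compare_resolutions(
    n_chains: int,
    residues_per_chain: int,
    atoms_per_residue: int = 16,  # placeholder value, an assumption of this sketch
) -> Dict[str, Union[int, Optional[float]]]:
    """Compare pair counts for one-bead-per-residue and all-atom models."""
    n_beads = n_chains * residues_per_chain
    n_atoms = n_beads * atoms_per_residue
    cg_pairs = pair_count(n_beads)
    aa_pairs = pair_count(n_atoms)
    return {
        "coarse_grained_pairs": cg_pairs,
        "all_atom_pairs": aa_pairs,
        # Ratio is undefined when the coarse-grained system has no pairs.
        "ratio": aa_pairs / cg_pairs if cg_pairs else None,
    }


if __name__ == "__main__":
    # Hypothetical system: 100 chains of 150 residues each.
    print(compare_resolutions(n_chains=100, residues_per_chain=150))
```

Because the pair count grows roughly with the square of the particle number, multiplying the particles per residue by a factor k increases the number of pairs by about k squared, which is one reason reduced-resolution models are attractive for large IDP systems.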

References

  1. Orlando, Gabriele; Raimondi, Daniele; Tabaro, Francesco; Codicè, Francesco; Moreau, Yves; Vranken, Wim F. (2019-04-17). "Computational identification of prion-like RNA-binding proteins that form liquid phase-separated condensates". Bioinformatics. 35 (22): 4617–4623. doi:10.1093/bioinformatics/btz274. PMID 30994888. ISSN 1367-4803.
  2. Lancaster, A. K.; Nutter-Upham, A.; Lindquist, S.; King, O. D. (2014-05-13). "PLAAC: a web and command-line application to identify proteins with prion-like amino acid composition". Bioinformatics. 30 (17): 2501–2502. doi:10.1093/bioinformatics/btu310. PMID 24825614. PMC 4147883. ISSN 1367-4803.
  3. Vernon, Robert McCoy; Chong, Paul Andrew; Tsang, Brian; Kim, Tae Hun; Bah, Alaji; Farber, Patrick; Lin, Hong; Forman-Kay, Julie Deborah (2018-02-09). Shan, Yibing (ed.). "Pi-Pi contacts are an overlooked protein feature relevant to phase separation". eLife. 7: e31486. doi:10.7554/eLife.31486. PMID 29424691. PMC 5847340. ISSN 2050-084X.
  4. Bolognesi, Benedetta; Gotor, Nieves Lorenzo; Dhar, Riddhiman; Cirillo, Davide; Baldrighi, Marta; Tartaglia, Gian Gaetano; Lehner, Ben (2016-06-28). "A Concentration-Dependent Liquid Phase Separation Can Cause Toxicity upon Increased Protein Expression". Cell Reports. 16 (1): 222–231. doi:10.1016/j.celrep.2016.05.076. PMID 27320918. PMC 4929146. ISSN 2211-1247.
  5. Ambadipudi, Susmitha; Biernat, Jacek; Riedel, Dietmar; Mandelkow, Eckhard; Zweckstetter, Markus (2017-08-17). "Liquid–liquid phase separation of the microtubule-binding repeats of the Alzheimer-related protein Tau". Nature Communications. 8 (1): 275. Bibcode:2017NatCo...8..275A. doi:10.1038/s41467-017-00480-0. PMID 28819146. PMC 5561136. ISSN 2041-1723.
  6. Sun, Tanlin; Li, Qian; Xu, Youjun; Zhang, Zhuqing; Lai, Luhua; Pei, Jianfeng (2019-11-15). "Prediction of liquid-liquid phase separation proteins using machine learning". bioRxiv 842336. doi:10.1101/842336. S2CID 209574590.
  7. Mierlo, Guido van; Jansen, Jurriaan R. G.; Wang, Jie; Poser, Ina; Heeringen, Simon J. van; Vermeulen, Michiel (2021-02-02). "Predicting protein condensate formation using machine learning". Cell Reports. 34 (5): 108705. doi:10.1016/j.celrep.2021.108705. hdl:2066/231424. PMID 33535034. S2CID 231804701. ISSN 2211-1247.
  8. Hardenberg, Maarten; Horvath, Attila; Ambrus, Viktor; Fuxreiter, Monika; Vendruscolo, Michele (2020-12-29). "Widespread occurrence of the droplet state of proteins in the human proteome". Proceedings of the National Academy of Sciences. 117 (52): 33254–33262. doi:10.1073/pnas.2007670117. PMID 33318217. PMC 7777240. ISSN 0027-8424.
  9. Ibrahim, Ayyam; Khaodeuanepheng, Nathan; Amarasekara, Dhanush; Correia, John; Lewis, Karen; Fitzkee, Nicholas; Hough, Loren; Whitten, Steven (2023-01-01). "Intrinsically disordered regions that drive phase separation form a robustly distinct protein class". Journal of Biological Chemistry. 299 (1): 102801. doi:10.1016/j.jbc.2022.102801. PMID 36528065. PMC 9860499. ISSN 0021-9258.
  10. Dignon, Gregory L.; Zheng, Wenwei; Mittal, Jeetain (2019-03-01). "Simulation methods for liquid–liquid phase separation of disordered proteins". Current Opinion in Chemical Engineering. Frontiers of Chemical Engineering: Molecular Modeling. 23: 92–98. doi:10.1016/j.coche.2019.03.004. PMID 32802734. PMC 7426017. ISSN 2211-3398.
  11. Shea, Joan-Emma; Best, Robert B.; Mittal, Jeetain (2021-04-01). "Physics-based computational and theoretical approaches to intrinsically disordered proteins". Current Opinion in Structural Biology. Theory and Simulation/Computational Methods ● Macromolecular Assemblies. 67: 219–225. doi:10.1016/j.sbi.2020.12.012. PMID 33545530. PMC 8150118. ISSN 0959-440X.