Equilibrium unfolding explained

Theoretical background

In its simplest form, equilibrium unfolding assumes that the molecule may belong to only two thermodynamic states, the folded state (typically denoted N for "native" state) and the unfolded state (typically denoted U). This "all-or-none" model of protein folding was first proposed by Tim Anson in 1945,^[3] but is believed to hold only for small, single structural domains of proteins (Jackson, 1998); larger domains and multi-domain proteins often exhibit intermediate states. As usual in statistical mechanics, these states correspond to ensembles of molecular conformations, not just one conformation.

The molecule may transition between the native and unfolded states according to the simple kinetic model

$$N \rightleftharpoons U$$

with rate constants $k_f$ and $k_u$ for the folding ($U \rightarrow N$) and unfolding ($N \rightarrow U$) reactions, respectively. The dimensionless equilibrium constant

$$K_{eq} \ \stackrel{\mathrm{def}}{=}\ \frac{k_u}{k_f} = \frac{[U]_{eq}}{[N]_{eq}}$$

can be used to determine the conformational stability $\Delta G^o$ by the equation

$$\Delta G^o = -RT \ln K_{eq}$$

where $R$ is the gas constant and $T$ is the absolute temperature in kelvin. Thus, $\Delta G^o$ is positive if the unfolded state is less stable (i.e., disfavored) relative to the native state.
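As a minimal sketch of the relation between the rate constants and the stability, the following Python snippet evaluates $\Delta G^o = -RT \ln(k_u/k_f)$. The function name, the default temperature and the rate constants are hypothetical values chosen only for illustration.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)


def stability_from_rates(k_f: float, k_u: float, temperature: float = 298.15) -> float:
    """Return Delta G^o (kJ/mol) of unfolding for N <=> U from the rate constants."""
    if k_f <= 0 or k_u <= 0:
        raise ValueError("rate constants must be positive")
    k_eq = k_u / k_f  # K_eq = [U]_eq / [N]_eq
    return -R * temperature * math.log(k_eq)


# Hypothetical rate constants (1/s), for illustration only.
dg = stability_from_rates(k_f=1000.0, k_u=0.1)
print(f"Delta G^o = {dg:.1f} kJ/mol")
```

With folding much faster than unfolding, $K_{eq} < 1$ and the computed stability is positive, in line with the sign convention above.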

The most direct way to measure the conformational stability $\Delta G^o$ of a molecule with two-state folding is to measure its kinetic rate constants $k_f$ and $k_u$ under the solution conditions of interest. However, since protein folding is typically completed in milliseconds, such measurements can be difficult to perform, usually requiring expensive stopped-flow or (more recently) continuous-flow mixers to provoke folding with a high time resolution. Dual polarisation interferometry is an emerging technique to directly measure conformational change and $\Delta G^o$.

Chemical denaturation

In the less demanding technique of equilibrium unfolding, the fractions of folded and unfolded molecules (denoted as $p_N$ and $p_U$, respectively) are measured as the solution conditions are gradually changed from those favoring the native state to those favoring the unfolded state, e.g., by adding a denaturant such as guanidinium hydrochloride or urea. (In equilibrium folding, the reverse process is carried out.) Given that the fractions must sum to one and their ratio must be given by the Boltzmann factor, we have

$$p_N = \frac{1}{1 + e^{-\Delta G/RT}}$$

$$p_U = 1 - p_N = \frac{e^{-\Delta G/RT}}{1 + e^{-\Delta G/RT}} = \frac{1}{1 + e^{\Delta G/RT}}$$
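The two-state populations can be sketched in a few lines of Python. The function name and the sample stabilities are assumptions made for illustration; $\Delta G$ is taken in kJ/mol.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)


def populations(delta_g: float, temperature: float = 298.15) -> tuple[float, float]:
    """Return (p_N, p_U) for a two-state molecule with unfolding free energy delta_g."""
    p_n = 1.0 / (1.0 + math.exp(-delta_g / (R * temperature)))
    return p_n, 1.0 - p_n


# Hypothetical stabilities in kJ/mol, for illustration only.
for delta_g in (-10.0, 0.0, 10.0, 20.0):
    p_n, p_u = populations(delta_g)
    print(f"Delta G = {delta_g:6.1f} kJ/mol  p_N = {p_n:.4f}  p_U = {p_u:.4f}")
```

At $\Delta G = 0$ both states are equally populated, and a stability of a few $RT$ is enough to make the native state dominate.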

Protein stabilities are typically found to vary linearly with the denaturant concentration. A number of models have been proposed to explain this observation, prominent among them the denaturant binding model and the solvent-exchange model (both by John Schellman^[4]) and the Linear Extrapolation Model (LEM; by Nick Pace^[5]). All of these models assume that only two thermodynamic states are populated or depopulated upon denaturation, although they could be extended to interpret more complicated reaction schemes.

The denaturant binding model assumes that there are specific but independent sites on the protein molecule (folded or unfolded) to which the denaturant binds with an effective (average) binding constant $k$. The equilibrium shifts towards the unfolded state at high denaturant concentrations because the unfolded state has more binding sites for the denaturant than the folded state (the difference being $\Delta n$). In other words, the increased number of potential sites exposed in the unfolded state is seen as the reason for denaturation transitions. An elementary treatment results in the following functional form:

$$\Delta G = \Delta G_w - RT \Delta n \ln\left(1 + k[D]\right)$$

where $\Delta G_w$ is the stability of the protein in water and $[D]$ is the denaturant concentration. Thus the analysis of denaturation data with this model requires 7 parameters: $\Delta G_w$, $\Delta n$, $k$, and the slopes and intercepts of the folded and unfolded state baselines.

The solvent exchange model (also called the ‘weak binding model’ or ‘selective solvation’) of Schellman invokes the idea of an equilibrium between the water molecules bound to independent sites on the protein and the denaturant molecules in solution. It has the form:

$$\Delta G = \Delta G_w - RT \Delta n \ln\left(1 + (K - 1) X_D\right)$$

where $K$ is the equilibrium constant for the exchange reaction and $X_D$ is the mole fraction of the denaturant in solution. This model tries to answer the question of whether the denaturant molecules actually bind to the protein or only seem to be bound because denaturants occupy about 20-30% of the total solution volume at the high concentrations used in experiments (i.e., non-specific effects), hence the term ‘weak binding’. As in the denaturant-binding model, fitting to this model also requires 7 parameters. One common theme obtained from both these models is that the binding constants (on the molar scale) for urea and guanidinium hydrochloride are small: ~0.2 $M^{-1}$ for urea and 0.6 $M^{-1}$ for GuHCl.
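The functional forms of the denaturant binding and solvent exchange models can be compared with a short Python sketch. The function names and all parameter values ($\Delta G_w$ = 30 kJ/mol, $\Delta n$ = 40, $k$ = 0.2 $M^{-1}$) are hypothetical and serve only to show the shape of the curves.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)


def dg_binding(dg_water, delta_n, k, conc, temperature=298.15):
    """Denaturant binding model: dG = dG_w - R*T*dn*ln(1 + k*[D])."""
    return dg_water - R * temperature * delta_n * math.log(1.0 + k * conc)


def dg_solvent_exchange(dg_water, delta_n, k_exchange, x_d, temperature=298.15):
    """Solvent exchange model: dG = dG_w - R*T*dn*ln(1 + (K - 1)*X_D)."""
    return dg_water - R * temperature * delta_n * math.log(1.0 + (k_exchange - 1.0) * x_d)


# Hypothetical parameters, for illustration only.
for conc in (0.0, 2.0, 4.0, 6.0, 8.0):
    dg = dg_binding(dg_water=30.0, delta_n=40.0, k=0.2, conc=conc)
    print(f"[D] = {conc:.1f} M  Delta G = {dg:6.2f} kJ/mol")
```

Both functions return $\Delta G_w$ in the absence of denaturant, and the solvent exchange form also returns $\Delta G_w$ at any composition when $K = 1$, i.e. when water and denaturant are equally favored at the sites.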

Intuitively, the difference in the number of binding sites between the folded and unfolded states is directly proportional to the difference in accessible surface area. This forms the basis for the LEM, which assumes a simple linear dependence of stability on the denaturant concentration. The slope of the plot of stability versus denaturant concentration is called the m-value. In purely mathematical terms, the m-value is the derivative of the change in stabilization free energy with respect to denaturant concentration. In addition, a strong correlation between the m-value and the accessible surface area (ASA) exposed upon unfolding, i.e. the difference in ASA between the unfolded and folded states of the studied protein (dASA), has been documented by Pace and co-workers.^[5] In view of this observation, m-values are typically interpreted as being proportional to the dASA. The LEM has no physical basis and is purely empirical, though it is widely used in interpreting solvent-denaturation data. It has the general form:

$$\Delta G = m \left([D]_{1/2} - [D]\right)$$

where the slope $m$ is called the "m-value" (> 0 for the above definition) and $[D]_{1/2}$ (also called $C_m$) represents the denaturant concentration at which 50% of the molecules are folded (the denaturation midpoint of the transition, where $p_N = p_U = 1/2$).

In practice, the observed experimental data at different denaturant concentrations are fit to a two-state model with this functional form for $\Delta G$, together with linear baselines for the folded and unfolded states. The $m$ and $[D]_{1/2}$ are two fitting parameters, along with four others for the linear baselines (slope and intercept for each line); in some cases, the slopes are assumed to be zero, giving four fitting parameters in total. The conformational stability $\Delta G$ can be calculated for any denaturant concentration (including the stability at zero denaturant) from the fitted parameters $m$ and $[D]_{1/2}$. When combined with kinetic data on folding, the m-value can be used to roughly estimate the amount of buried hydrophobic surface in the folding transition state.
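The six-parameter model used in such fits can be written as a forward function that predicts the observed signal at a given denaturant concentration. The sketch below is one possible formulation; the function names, parameter names and numerical values are hypothetical. In an actual analysis these parameters would be adjusted by a nonlinear least-squares routine, which is not shown here.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)


def lem_delta_g(conc: float, m: float, c_mid: float) -> float:
    """LEM stability: Delta G = m * ([D]_1/2 - [D]), in kJ/mol."""
    return m * (c_mid - conc)


def lem_signal(conc, m, c_mid, n_intercept, n_slope, u_intercept, u_slope,
               temperature=298.15):
    """Predicted signal for a two-state transition with linear baselines."""
    p_n = 1.0 / (1.0 + math.exp(-lem_delta_g(conc, m, c_mid) / (R * temperature)))
    a_n = n_intercept + n_slope * conc  # folded-state baseline
    a_u = u_intercept + u_slope * conc  # unfolded-state baseline
    return a_n * p_n + a_u * (1.0 - p_n)


# Hypothetical parameters: m in kJ/(mol*M), midpoint in M, signal in arbitrary units.
params = dict(m=8.0, c_mid=4.0, n_intercept=1.0, n_slope=0.0,
              u_intercept=0.0, u_slope=0.0)
for conc in (0.0, 2.0, 4.0, 6.0, 8.0):
    print(f"[D] = {conc:.1f} M  signal = {lem_signal(conc, **params):.3f}")
print(f"Stability in water: {lem_delta_g(0.0, 8.0, 4.0):.1f} kJ/mol")
```

The stability in water follows from the fitted parameters as $m [D]_{1/2}$, which is the extrapolation to zero denaturant that gives the model its name.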

Structural probes

Unfortunately, the probabilities $p_N$ and $p_U$ cannot be measured directly. Instead, we assay the relative population of folded molecules using various structural probes, e.g., absorbance at 287 nm (which reports on the solvent exposure of tryptophan and tyrosine), far-ultraviolet circular dichroism (180-250 nm, which reports on the secondary structure of the protein backbone), dual polarisation interferometry (which reports the molecular size and fold density) and near-ultraviolet fluorescence (which reports on changes in the environment of tryptophan and tyrosine). However, nearly any probe of folded structure will work; since the measurement is taken at equilibrium, there is no need for high time resolution. Thus, measurements can be made of NMR chemical shifts, intrinsic viscosity, solvent exposure (chemical reactivity) of side chains such as cysteine, backbone exposure to proteases, and various hydrodynamic measurements.

To convert these observations into the probabilities $p_N$ and $p_U$, one generally assumes that the observable $A$ adopts one of two values, $A_N$ or $A_U$, corresponding to the native or unfolded state, respectively. Hence, the observed value equals the linear sum

$$A = A_N p_N + A_U p_U$$

By fitting the observations of $A$ under various solution conditions to this functional form, one can estimate $A_N$ and $A_U$, as well as the parameters of $\Delta G$. The fitting variables $A_N$ and $A_U$ are sometimes allowed to vary linearly with the solution conditions, e.g., temperature or denaturant concentration, when the asymptotes of $A$ are observed to vary linearly under strongly folding or strongly unfolding conditions.
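Once $A_N$ and $A_U$ are known, the linear sum can be inverted to recover the populations, since $p_N = 1 - p_U$ gives $p_U = (A - A_N)/(A_U - A_N)$. The following Python sketch illustrates this inversion; the function names and signal values are hypothetical.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)


def fraction_unfolded(a_obs: float, a_n: float, a_u: float) -> float:
    """Invert A = A_N*p_N + A_U*p_U (with p_N = 1 - p_U) for p_U."""
    if a_u == a_n:
        raise ValueError("the probe does not distinguish the two states")
    return (a_obs - a_n) / (a_u - a_n)


def delta_g_from_signal(a_obs, a_n, a_u, temperature=298.15):
    """Unfolding free energy (kJ/mol) from one observation inside the transition."""
    p_u = fraction_unfolded(a_obs, a_n, a_u)
    if not 0.0 < p_u < 1.0:
        raise ValueError("signal lies outside the range spanned by the baselines")
    return -R * temperature * math.log(p_u / (1.0 - p_u))


# Hypothetical normalized signal: 1.0 when fully native, 0.0 when fully unfolded.
print(fraction_unfolded(0.75, a_n=1.0, a_u=0.0))
print(delta_g_from_signal(0.75, a_n=1.0, a_u=0.0))
```

This point-by-point conversion is only reliable within the transition region, where both populations are appreciable; near the baselines small measurement errors are strongly amplified by the logarithm.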

Thermal denaturation

Assuming two-state denaturation as stated above, one can derive the fundamental thermodynamic parameters, namely $\Delta H$, $\Delta S$ and $\Delta G$, provided one knows the $\Delta C_p$ of the system under investigation.

The thermodynamic observables of denaturation can be described by the following equations:

\begin{align}
\Delta H(T) &= \Delta H(T_d) + \int_{T_d}^{T} \Delta C_p \, dT \\
&= \Delta H(T_d) + \Delta C_p [T - T_d] \\
\Delta S(T) &= \frac{\Delta H(T_d)}{T_d} + \int_{T_d}^{T} \Delta C_p \, d\ln T \\
&= \frac{\Delta H(T_d)}{T_d} + \Delta C_p \ln\frac{T}{T_d} \\
\Delta G(T) &= \Delta H - T \Delta S \\
&= \Delta H(T_d) \frac{T_d - T}{T_d} + \int_{T_d}^{T} \Delta C_p \, dT - T \int_{T_d}^{T} \Delta C_p \, d\ln T \\
&= \Delta H(T_d) \left(1 - \frac{T}{T_d}\right) - \Delta C_p \left[T_d - T + T \ln\left(\frac{T}{T_d}\right)\right]
\end{align}

where $\Delta H$, $\Delta S$ and $\Delta G$ indicate the enthalpy, entropy and Gibbs free energy of unfolding at constant pH and pressure. The temperature $T$ is varied to probe the thermal stability of the system, and $T_d$ is the temperature at which half of the molecules in the system are unfolded. The last equation is known as the Gibbs–Helmholtz equation.
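The temperature dependence can be evaluated numerically as in the following Python sketch, which computes $\Delta H$, $\Delta S$ and $\Delta G$ step by step and, separately, the closed-form Gibbs–Helmholtz expression. The function names and the parameter values ($T_d$ = 333.15 K, $\Delta H(T_d)$ = 400 kJ/mol, $\Delta C_p$ = 6 kJ/(mol K)) are hypothetical, and $\Delta C_p$ is assumed to be independent of temperature.

```python
import math


def unfolding_thermodynamics(temperature, t_d, dh_td, dcp):
    """Return (dH, dS, dG) of unfolding at `temperature` (K).

    dh_td is Delta H(T_d) in kJ/mol; dcp is Delta C_p in kJ/(mol*K),
    taken as temperature independent.
    """
    dh = dh_td + dcp * (temperature - t_d)
    ds = dh_td / t_d + dcp * math.log(temperature / t_d)
    dg = dh - temperature * ds
    return dh, ds, dg


def gibbs_helmholtz(temperature, t_d, dh_td, dcp):
    """Closed-form Delta G(T) from the last line of the derivation."""
    return (dh_td * (1.0 - temperature / t_d)
            - dcp * (t_d - temperature + temperature * math.log(temperature / t_d)))


# Hypothetical protein, for illustration only.
for t in (283.15, 298.15, 313.15, 333.15):
    dh, ds, dg = unfolding_thermodynamics(t, 333.15, 400.0, 6.0)
    print(f"T = {t:.2f} K  dH = {dh:7.1f}  dS = {ds:6.3f}  dG = {dg:6.2f}")
```

By construction $\Delta G(T_d) = 0$, so the native and unfolded states are equally populated at $T_d$, and the two routes to $\Delta G(T)$ agree at every temperature.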

Determining the heat capacity of proteins

In principle one can calculate all the above thermodynamic observables from a single differential scanning calorimetry thermogram of the system, assuming that the $\Delta C_p$ is independent of the temperature. However, it is difficult to obtain accurate values for $\Delta C_p$ this way. More accurately, the $\Delta C_p$ can be derived from the variation of $\Delta H(T_d)$ with $T_d$, which can be obtained from measurements with slight variations in pH or protein concentration. The slope of the linear fit is equal to the $\Delta C_p$. Note that any non-linearity of the data points indicates that $\Delta C_p$ is probably not independent of the temperature.
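The slope-based estimate of $\Delta C_p$ amounts to an ordinary least-squares line through the ($T_d$, $\Delta H(T_d)$) pairs. The sketch below uses synthetic, noise-free data generated with a slope of 6 kJ/(mol K); the function name and all numbers are hypothetical and do not come from any measurement.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic data: T_d in K and Delta H(T_d) in kJ/mol, e.g. from runs at different pH.
t_d_values = [320.0, 325.0, 330.0, 335.0, 340.0]
dh_values = [300.0 + 6.0 * (t - 320.0) for t in t_d_values]

dcp, _ = linear_fit(t_d_values, dh_values)
print(f"Estimated Delta C_p = {dcp:.2f} kJ/(mol*K)")
```

With real data, systematic curvature in the residuals of this fit would be the sign that $\Delta C_p$ varies with temperature.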

Alternatively, the $\Delta C_p$ can also be estimated from the accessible surface area (ASA) of a protein before and after thermal denaturation, as follows:

$\Delta ASA = ASA_{unfolded} - ASA_{native}$

For proteins that have a known 3D structure, the $ASA_{native}$ can be calculated with computer programs such as DeepView (also known as Swiss-PdbViewer). The $ASA_{unfolded}$ can be calculated from tabulated values for each amino acid through the semi-empirical equation:

$ASA_{unfolded} = \left(a_{polar} \times ASA_{polar}\right) + \left(a_{non-polar} \times ASA_{non-polar}\right) + \left(a_{aromatic} \times ASA_{aromatic}\right)$

where the subscripts polar, non-polar and aromatic indicate the parts of the 20 naturally occurring amino acids.

Finally, for proteins there is a linear correlation between $\Delta ASA$ and $\Delta C_p$ through the following equation:^[6]

$\Delta C_p = 0.61 \times \Delta ASA$

Assessing two-state unfolding

One can also assess whether unfolding proceeds according to the two-state model described above. This can be done with differential scanning calorimetry by comparing the calorimetric enthalpy of denaturation, i.e. the area under the peak, $A_{peak}$, to the van 't Hoff enthalpy, defined as follows:

$$\Delta H_{vH}(T) = -R \frac{d\ln K}{dT^{-1}}$$

At $T = T_d$, the $\Delta H_{vH}(T_d)$ can be described as:

$$\Delta H_{vH}(T_d) = \frac{4 R T_d^2 \Delta C_p^{max}}{A_{peak}}$$

When two-state unfolding is observed, $A_{peak} = \Delta H_{vH}(T_d)$. Here $\Delta C_p^{max}$ is the height of the heat capacity peak.
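The definition of the van 't Hoff enthalpy as $-R \, d\ln K / dT^{-1}$ can be checked numerically. The Python sketch below generates $\ln K(T)$ from the Gibbs–Helmholtz expression for a hypothetical two-state protein and differentiates it with respect to $1/T$ by a central finite difference. The function names, the step size and the parameter values are assumptions made for illustration.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)


def ln_k(temperature, t_d, dh_td, dcp):
    """ln K = -Delta G(T) / (R*T), with Delta G from the Gibbs-Helmholtz equation."""
    dg = (dh_td * (1.0 - temperature / t_d)
          - dcp * (t_d - temperature + temperature * math.log(temperature / t_d)))
    return -dg / (R * temperature)


def vant_hoff_enthalpy(temperature, t_d, dh_td, dcp, step=0.01):
    """Estimate -R * d(ln K)/d(1/T) by a central finite difference (kJ/mol)."""
    t_hi, t_lo = temperature + step, temperature - step
    slope = ((ln_k(t_hi, t_d, dh_td, dcp) - ln_k(t_lo, t_d, dh_td, dcp))
             / (1.0 / t_hi - 1.0 / t_lo))
    return -R * slope


# Hypothetical two-state protein: T_d = 333.15 K, Delta H(T_d) = 400 kJ/mol,
# Delta C_p = 6 kJ/(mol*K).
dh_vh = vant_hoff_enthalpy(333.15, 333.15, 400.0, 6.0)
print(f"van 't Hoff enthalpy at T_d = {dh_vh:.2f} kJ/mol")
```

For this ideal two-state model the numerical van 't Hoff enthalpy reproduces the input $\Delta H(T_d)$; for experimental data, a van 't Hoff enthalpy that differs markedly from the calorimetric one points to a departure from two-state behavior.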

Generalization to protein complexes and multi-domain proteins

Using the above principles, equations have been derived that relate a global protein signal, corresponding to the folding states in equilibrium, to the value of a denaturing variable, either temperature or the concentration of a chemical denaturant. These equations cover homomeric and heteromeric proteins, from monomers to trimers and potentially tetramers. They provide a robust theoretical basis for measuring the stability of complex proteins, and for comparing the stabilities of wild-type and mutant proteins.^[7] Such equations cannot be derived for pentamers or higher oligomers because of mathematical limitations (Abel–Ruffini theorem).

Notes and References

1. Lassalle MW, Akasaka K, The use of high-pressure nuclear magnetic resonance to study protein folding, in: Bai Y, Nussinov R (eds.), Protein Folding Protocols, Humana Press, Totowa, New Jersey, pp. 21–38 (2007). ISBN 978-1-59745-189-5. https://archive.org/details/proteinfoldingpr00yawe/page/21
2. Ng SP, Randles LG, Clarke J, The use of high-pressure nuclear magnetic resonance to study protein folding, in: Bai Y, Nussinov R (eds.), Protein Folding Protocols, Humana Press, Totowa, New Jersey, pp. 139–167 (2007). ISBN 978-1-59745-189-5. https://archive.org/details/proteinfoldingpr00yawe/page/139
3. Anson ML, Protein Denaturation and the Properties of Protein Groups, Advances in Protein Chemistry 2, 361–386 (1945)
4. Schellman JA, The thermodynamics of solvent exchange, Biopolymers 34, 1015–1026 (1994)
5. Myers JK, Pace CN, Scholtz JM, Denaturant m values and heat capacity changes: relation to changes in accessible surface areas of protein unfolding, Protein Sci. 4(10), 2138–2148 (1995)
6. Robertson AD, Murphy KP, Protein structure and the energetics of protein stability, Chem. Rev. 97, 1251–1267 (1997)
7. Bedouelle H, Principles and equations for measuring and interpreting protein stability: From monomer to tetramer, Biochimie 121, 29–37 (2016). doi:10.1016/j.biochi.2015.11.013. PMID 26607240.
