Is Posttranslational Folding More Efficient Than Refolding from a Denatured State: A Computational Study
Abstract
The folding of proteins into their native conformation is a complex process that has been extensively studied over the past half-century. The ribosome, the molecular machine responsible for protein synthesis, is known to interact with nascent proteins, adding further complexity to the protein folding landscape. Consequently, it is unclear whether the folding pathways of proteins are conserved on and off the ribosome. The main question remains: to what extent does the ribosome help proteins fold? To address this question, we used coarse-grained molecular dynamics simulations to compare the mechanisms by which the proteins dihydrofolate reductase, type III chloramphenicol acetyltransferase, and d-alanine–d-alanine ligase B fold during and after vectorial synthesis on the ribosome to folding from the full-length unfolded state in bulk solution. Our results reveal that the influence of the ribosome on protein folding mechanisms varies depending on the size and complexity of the protein. Specifically, for a small protein with a simple fold, the ribosome facilitates efficient folding by helping the nascent protein avoid misfolded conformations. However, for larger and more complex proteins, the ribosome does not promote folding and may contribute to the formation of intermediate misfolded states cotranslationally. These misfolded states persist posttranslationally and do not convert to the native state during the 6 μs runtime of our coarse-grain simulations. Overall, our study highlights the complex interplay between the ribosome and protein folding and provides insight into the mechanisms of protein folding on and off the ribosome.
Introduction
Proteins are synthesized by ribosomes during the nonequilibrium process of translation and must fold to a specific native state, dictated by their amino acid sequence, to function. During translation, proteins are synthesized vectorially from N- to C-terminus based on an mRNA template. The nascent protein is initially confined to the ribosome exit tunnel, an ~10 nm long tunnel with a diameter of 1–2 nm that can accommodate approximately 30 amino acids of the elongating protein.1,2 Due to its dimensions, the exit tunnel restricts the ability of the protein to self-interact and form a tertiary structure. However, many proteins fold cotranslationally3−6 as they begin to emerge from the exit tunnel and acquire a tertiary structure before their synthesis is complete. Though some small domains can fold inside the exit tunnel,3−5 most proteins can only begin to fold once they have left the exit tunnel.7−10 The nonequilibrium nature of protein synthesis means that the ability of a protein to fold cotranslationally can depend on the speed at which amino acids are added to the growing nascent chain.11,12 Refolding of a protein from its full-length denatured state, however, allows all segments of the protein to simultaneously fold without the restriction of the exit tunnel or the influence of translation kinetics. Bulk refolding thus presents the opportunity for the formation of a vast number of non-native contacts between amino acids. In general, cotranslational folding is thought to be a beneficial process that aids in the efficient folding of complex proteomes.13−15 The importance of cotranslational folding is highlighted by the recent experimental finding that one-third of Escherichia coli (E. coli) proteins are not able to refold in bulk solution after complete unfolding,16 suggesting that cotranslational folding is critical to their ability to reach their native state.
The folding of a small number of proteins has been experimentally and computationally studied on and off the ribosome.17−22 Evidence so far suggests that the role of the ribosome in folding is protein-specific. For example, structure-based models in combination with an arrest-peptide assay and cryo-EM experiments indicate that the folding of titin I27 is conserved on and off the ribosome.21 Similarly, experiments and molecular simulations of src SH3 show that its folding pathways are the same on and off the ribosome.22 On the other hand, Tanaka et al. used coarse-grained molecular simulation to study the role of the ribosome in guiding multidomain protein folding, finding that folding on the ribosome is more efficient compared to refolding.18 Dabrowski-Tumanski et al. computationally studied a deeply knotted protein and found that the ribosome plays a key role in knot formation.20 In terms of kinetics, single-molecule laser optical tweezer experiments have found that the arrested ribosome nascent chain complexes have reduced protein folding rates compared to folding in bulk.17,23 These studies mostly focus on small proteins (~100 residues) folding on translationally arrested ribosomes. In vivo, many nascent proteins diffuse into the cytosol after synthesis; if folding is not completed on the ribosome, it may complete posttranslationally. Hence, the ribosome may only influence the formation of intermediate states, which nonetheless can change the outcome of folding.24,25 Given the relative paucity of experimental and computational data on the differences between folding on and off the ribosome for large proteins, we believe the influence of the ribosome on protein folding mechanisms remains an open question.
Performing all-atom folding simulations for large proteins is computationally infeasible. In this study, we, therefore, utilize a topology-based coarse-grained model to simulate the refolding in bulk solution as well as the co- and posttranslational folding of three E. coli enzymes (Figure 1): (i) dihydrofolate reductase (DHFR, 159 residues, PDB ID: 4KJK(26)), (ii) type III chloramphenicol acetyltransferase (CAT-III, 213 residues, PDB ID: 3CLA(27)), and (iii) d-alanine–d-alanine ligase B (DDLB, 306 residues, PDB ID: 4C5C(28)). DHFR, the smallest of the three, is composed of two domains.29,30 The adenosine binding domain (ABD) consists of residues 38–106, and the discontinuous loop domain (DLD) comprises residues 1–37 and 107–159 (Figures 1a and S1). DHFR catalyzes the NADPH-dependent reduction of dihydrofolate to tetrahydrofolate and has been a target enzyme of antifolate drugs.31 The native structure of CAT-III is composed of eight β-sheets and five α-helices (Figures 1b and S1); CAT-III is responsible for the high level of bacterial resistance to chloramphenicol.32 Finally, DDLB is a three-domain protein composed of an N-terminal domain (residues 1–85), central domain (residues 86–180), and C-terminal domain (residues 181–306), each of which is classified as α/β. At the secondary structure level, DDLB contains 10 β-sheets and 11 α-helices (Figures 1c and S1) and is an essential enzyme for the proper synthesis and maintenance of the bacterial cell wall.33
In this work, we apply multiple order parameters for protein folding, including the recently described entanglement parameter G, to investigate differences in folding on and off the ribosome. We find that while the ribosome assists the folding of DHFR, it does not promote the folding of CAT-III and DDLB, both of which contain a native entanglement. Our results support a mechanism by which the ribosome may promote the formation of intermediate misfolded states with non-native entanglements; these intermediates are kinetically trapped and persist for long time scales posttranslationally.
Materials and Methods
Simulation Details and Construction of Coarse-Grain Model
We employ a previously published Gō-based coarse-grain model11,34 in which each amino acid is represented by a single interaction site placed at the Cα atom with a specific van der Waals radius for each amino acid; ribosomal RNA is represented as three or four beads per nucleotide, with one bead located at the phosphate position, another at the centroid of the ribose ring, and one at the centroid of each conjugated ring in the base (one bead for pyrimidine nucleobases and two beads for purine nucleobases). The potential energy of a configuration in this model is computed by the equation
$$E = E_{\mathrm{bond}} + E_{\mathrm{dihedral}} + E_{\mathrm{angle}} + E_{\mathrm{elec}} + E_{\mathrm{native}} + E_{\mathrm{non\text{-}native}} \qquad (1)$$
where the terms are, in order, the contributions from bonds, dihedral angles, bond angles, electrostatic interactions, Lennard-Jones-like native interactions, and repulsive non-native interactions. The functional forms and model parameters are described in previous studies.11,34 Parameters for the three proteins in this study were taken from previous work.11
In posttranslational folding simulations, we first performed continuous synthesis using the wild-type mRNA sequences, which are presented in Table S1. Synthesis simulations were conducted using a previously described protocol,11,35 with a cutout of the ribosome exit tunnel and surface. Codon-specific translation times were obtained from a previous study11 (Supplementary Table 8 of ref (11)). Once the protein sequence was fully synthesized, the covalent bond between the C-terminal site and the peptidyl transferase center (PTC) was cleaved and the protein was allowed to diffuse through the ribosome exit tunnel. Protein dissociation from the ribosome was defined as the point at which the C-terminal residue was more than 20 Å from the ribosome surface. At this point, the ribosome was removed and the released protein was allowed to undergo posttranslational folding in the absence of the ribosome.
The refolding simulations were initiated from the unfolded state, characterized by a low fraction of native contacts, Q. Initial conformations for refolding simulations were generated by heating the native state of the protein to 1000 K for 15 ns. The final conformation from heating was then temperature-quenched to 310 K to initialize refolding. All simulations were carried out in OpenMM 7.7,36 using a Langevin thermostat at a temperature of 310 K, with a time step of 15 fs and a friction coefficient of 0.050 ps⁻¹.
To characterize protein folding, we ran 200 statistically independent folding trajectories for each protein under investigation (100 trajectories of refolding and 100 trajectories of posttranslational folding). Each trajectory lasted for 6 μs, which corresponds to approximately 24 seconds of real time given the acceleration of folding in these coarse-grain models relative to real time scales.11,34 For CAT-III and DDLB, which had a high prevalence of misfolded trajectories, we extended the simulation time to 30 and 15 μs, respectively, to determine whether the proteins would eventually fold correctly on a longer time scale.
Calculation of the Fraction of Native Contacts, Q, and Its Usage to Determine Folded Trajectories
Two residues are considered to form a native contact if their α carbons are less than 8 Å apart in the crystal structure. To account for thermal fluctuations in contact distances during simulation, a flexibility parameter Δ = 1.2 was used: a native contact between two residues is classified as formed in a given frame of the simulated trajectory if their distance is shorter than 1.2 times the distance in the crystal structure. The fraction of native contacts, Q, was calculated for each protein during its posttranslational folding or refolding simulations. Only contacts between pairs of residues i and j that both lie within secondary structural elements as identified by STRIDE37 and that satisfy the criterion |i – j| > 3, where i and j are the residue indices, were considered; we excluded any secondary structure segment shorter than four residues from the analysis. To determine when a given trajectory of a protein is folded, we first characterized the fraction of native contacts, Q, of each protein’s native state by performing ten 1.5 μs coarse-grained simulations at 310 K initialized from the native-state coordinates. The threshold for protein folding during refolding or posttranslational folding simulations, Qthreshold, was determined as Qthreshold = QmodeNS – 3σ, where QmodeNS is the average of Qmode over all 15 ns windows of the ten 1.5 μs native-state simulations and σ is the corresponding standard deviation. To determine when folding occurred during refolding or posttranslational folding simulations, the mode of the Q values over a sliding 15 ns window was compared to Qthreshold. A given trajectory is defined as folded if, during its time evolution, Qmode15-ns ≥ Qthreshold; the folding time is the first time at which this condition is met.35,38 The threshold value of Q for each protein is presented in Table 1.
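The folding criterion described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names, the rounding of Q to two decimals before taking the mode, and the expression of the 15 ns window as a number of frames are all assumptions made for the example.

```python
from collections import Counter
from math import dist

def fraction_native_contacts(coords, native_pairs, native_dist, delta=1.2):
    """Q for one frame: share of native pairs closer than delta * native distance."""
    formed = sum(
        1 for (i, j) in native_pairs
        if dist(coords[i], coords[j]) < delta * native_dist[(i, j)]
    )
    return formed / len(native_pairs)

def window_mode(values, ndigits=2):
    """Mode of Q in a window; rounding (an assumption) makes the mode well defined."""
    return Counter(round(v, ndigits) for v in values).most_common(1)[0][0]

def first_folding_frame(q_series, q_threshold, window):
    """Last frame of the first sliding window whose Q mode reaches the threshold.

    Returns None if the trajectory never meets the criterion.
    """
    for start in range(len(q_series) - window + 1):
        if window_mode(q_series[start:start + window]) >= q_threshold:
            return start + window - 1
    return None
```

Multiplying the returned frame index by the time between saved frames gives a folding time; trajectories that return `None` would be counted as not folded within the simulation.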
Table 1
protein | Qthreshold = QmodeNS – 3σ |
---|---|
DHFR | 0.9221 |
CAT-III | 0.9269 |
DDLB | 0.9521 |
Estimating the Folding Time of Slow-Folding Proteins with a Large Proportion of Unfolded Trajectories
When fewer than 50% of the trajectories fold, the folding time cannot be estimated as the median first passage time.
We consider three-state folding kinetics with parallel pathways. State A folds rapidly to the native state N at the rate k1, state B folds slowly to the native state at a much smaller rate k2 (k1 ≫ k2), and there is no interconversion between A and B. The fractions of states A and B then obey the ordinary differential equations
$$\frac{d[A]}{dt} = -k_1 [A], \qquad \frac{d[B]}{dt} = -k_2 [B] \qquad (2)$$
where [A] and [B] are the fractions of the non-native states A and B. The fraction (survival probability) of non-native states at time t is SU(t) = [A](t) + [B](t) = c1 exp(−k1t) + c2 exp(−k2t), where c1 and c2 are constants of integration. The initial condition that the survival probability of the non-native state equals 1 at t = 0 gives SU(t = 0) = c1 + c2 = 1, and hence c2 = 1 – c1.
We therefore computed the survival probability of the unfolded state as a function of time from simulations and fit the resulting time series to the double-exponential equation
$$S_U(t) = c_1 \exp(-k_1 t) + (1 - c_1) \exp(-k_2 t) \qquad (3)$$
where c1, k1, and k2 are the fitting parameters. The time constants of the two kinetic phases are τ1 = 1/k1 and τ2 = 1/k2, with the larger of these two times determining the overall time scale of the folding process (τ2 ≫ τ1). To estimate the uncertainty of the folding time when fitting to double-exponential folding kinetics, we apply bootstrap resampling by randomly selecting trajectories from the list of simulations. We retain only bootstrap samples whose fit has a coefficient of determination R2 > 0.9. This procedure was applied to estimate the folding time of CAT-III.
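As a rough illustration of this procedure, the sketch below builds the empirical survival probability from first passage times, fits the double-exponential form with c2 = 1 – c1, and bootstraps over trajectories while keeping only fits with R2 > 0.9. It is a sketch under stated assumptions: the brute-force grid search stands in for whatever nonlinear least-squares routine was actually used, and all function names and parameter grids are hypothetical.

```python
import math
import random

def survival_probability(first_passage_times, n_total, t_grid):
    """Fraction of trajectories still unfolded at each time in t_grid."""
    return [
        1.0 - sum(1 for fpt in first_passage_times if fpt <= t) / n_total
        for t in t_grid
    ]

def double_exp(t, c1, k1, k2):
    """S_U(t) = c1*exp(-k1*t) + (1 - c1)*exp(-k2*t)."""
    return c1 * math.exp(-k1 * t) + (1.0 - c1) * math.exp(-k2 * t)

def r_squared(t_grid, s_obs, params):
    """Coefficient of determination of the fit; NaN if the data are constant."""
    mean = sum(s_obs) / len(s_obs)
    ss_res = sum((s - double_exp(t, *params)) ** 2 for t, s in zip(t_grid, s_obs))
    ss_tot = sum((s - mean) ** 2 for s in s_obs)
    return float("nan") if ss_tot == 0 else 1.0 - ss_res / ss_tot

def grid_fit(t_grid, s_obs, c1_values, k1_values, k2_values):
    """Least-squares fit by exhaustive search over a user-supplied grid (k1 > k2).

    Assumption: a grid search replaces a proper nonlinear optimizer for brevity.
    """
    best = None
    for c1 in c1_values:
        for k1 in k1_values:
            for k2 in k2_values:
                if k2 >= k1:
                    continue
                sse = sum((s - double_exp(t, c1, k1, k2)) ** 2
                          for t, s in zip(t_grid, s_obs))
                if best is None or sse < best[0]:
                    best = (sse, (c1, k1, k2))
    return best[1]

def bootstrap_tau2(fpts, t_grid, grids, n_boot=100, r2_min=0.9, seed=0):
    """Bootstrap the slow time constant tau2 = 1/k2.

    fpts holds one entry per trajectory: its folding time, or None if unfolded.
    Only resamples whose fit exceeds r2_min are kept.
    """
    rng = random.Random(seed)
    taus = []
    for _ in range(n_boot):
        sample = rng.choices(fpts, k=len(fpts))
        folded = [f for f in sample if f is not None]
        s_obs = survival_probability(folded, len(sample), t_grid)
        params = grid_fit(t_grid, s_obs, *grids)
        if r_squared(t_grid, s_obs, params) > r2_min:
            taus.append(1.0 / params[2])
    return taus
```

Percentiles of the returned list would then give a confidence interval for the folding time; with few folded trajectories the upper bound can become very large, because k2 is poorly constrained by the data.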
Definition of the Progress Variable ς and Use to Monitor the Sequence of Pairs of Native Secondary Structure Elements Formed during the Folding Process
To account for the significant variation in folding times among different trajectories, we monitored folding pathways as a function of a progress variable,39 ς, defined as
$$\varsigma = \left\langle \frac{t_{\mathrm{pair},i}}{t_{\mathrm{fold},i}} \right\rangle \qquad (4)$$
where <···> indicates the average over all folded trajectories, and tpair,i and tfold,i are, respectively, the folding time of the pair and the folding time of the whole protein in folded trajectory i. With this definition, 0 ≤ ς ≤ 1; ς = 0 means that the pair under study folds at the start of the simulation, and ς = 1 indicates that the pair folds as the last step in the folding process. To determine the sequence of formation of pairs of secondary structure elements (defined in Figure S1 and Table S2), we consider only pairs of secondary structure elements that share more than one native contact. A pair is considered to be folded if its fraction of native contacts is larger than the threshold determined from native-state simulations. In our analysis of folding pathways, trajectories that did not fold within the 6 μs simulation duration were excluded.
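A minimal sketch of how ς could be computed and used to order pairs of secondary structure elements is given below. The function names and the dictionary layout are assumptions made for illustration; the inputs are per-trajectory folding times for folded trajectories only.

```python
def progress_variable(t_pair, t_fold):
    """Mean of t_pair,i / t_fold,i over folded trajectories i."""
    ratios = [tp / tf for tp, tf in zip(t_pair, t_fold)]
    return sum(ratios) / len(ratios)

def folding_order(pair_times, t_fold):
    """Sort pair labels by their progress variable, earliest-folding first.

    pair_times maps a pair label (e.g. "C9") to its per-trajectory folding times.
    """
    scores = {name: progress_variable(times, t_fold)
              for name, times in pair_times.items()}
    return sorted(scores, key=scores.get)
```

For example, a pair that folds at 10% of the total folding time in every trajectory has ς = 0.1 and is listed before a pair that folds only when the whole protein does (ς = 1).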
Identifying Entanglement and the Changes in Entanglement
To identify lasso-like entanglements, we used numerically invariant linking numbers,41 which describe the linking between a closed loop and an open segment in three-dimensional space. We use the approximation to the partial Gaussian double integration method proposed by Baiesi and co-workers40 to calculate these partial linking numbers for a closed curve (the loop) and an open curve (a terminus). This procedure is a modified version of the original protocol proposed by Baiesi and co-workers to detect entanglement in coarse-grain protein structures. The original protocol is not computationally efficient for analyzing trajectories because, for each native contact, the linking number must be calculated for all possible combinations of loop and threading segments. In our modified protocol, we only calculate the linking number between the closed loop (closed by a native contact) and the two tails. The closed loop is composed of the peptide backbone connecting residues i and j that form a native contact. Outside this loop is an N-terminal segment composed of residues 5 through i – 4 and a C-terminal segment composed of residues j + 4 through N – 5, where we exclude the first five residues of the N-terminal curve, the last five residues of the C-terminal curve, and four residues before and after the native contact to eliminate the error introduced by both the high flexibility and contiguity of the termini and trivial entanglements in the local structure; this metric is similar to whGLN.42 We can characterize the entanglement of each tail with the loop formed by the native contact using two partial linking numbers, denoted gN and gC. For a given structure of an N-residue protein, with a native contact present at residues (i, j), the coordinates Rl and the gradient dRl of the point l on the curves were calculated as
where rl is the coordinates of the Cα atom in residue l. The linking numbers gN(i, j) and gC(i, j) were calculated as
The total linking number for a native contact (i, j) is therefore estimated as
Comparing the absolute value of the total linking number for a native contact (i, j) to that of a reference state allows us to detect a gain or loss of linking between the backbone trace loop and the terminal open curves as well as any switches in chirality. Therefore, there are six cases of change in linking to consider when using this approach to quantify entanglement (see Supplementary Figure S1 and Table 1 of ref (43)).
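To make the partial linking number concrete, the following Python sketch evaluates a discretized Gauss double integral between a loop and an open tail, both given as lists of Cα coordinates. It assumes the common discretization in which each curve point is the midpoint of two consecutive beads and its gradient is the bond vector between them; this choice, the function names, and the omission of the residue-exclusion rules described above are assumptions of the example rather than a reproduction of the exact expressions used in this work.

```python
import math

def _midpoints_and_bonds(points):
    """Midpoint and bond vector for each pair of consecutive beads."""
    mids = [tuple((a + b) / 2 for a, b in zip(p, q))
            for p, q in zip(points, points[1:])]
    bonds = [tuple(b - a for a, b in zip(p, q))
             for p, q in zip(points, points[1:])]
    return mids, bonds

def _cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def partial_linking(loop_points, tail_points):
    """Discrete Gauss double sum between a loop and an open tail.

    Each term is (R_m - R_n) . (dR_m x dR_n) / |R_m - R_n|^3, with m running
    over tail segments and n over loop segments; the sum is divided by 4*pi.
    """
    loop_mid, loop_bond = _midpoints_and_bonds(loop_points)
    tail_mid, tail_bond = _midpoints_and_bonds(tail_points)
    total = 0.0
    for rm, dm in zip(tail_mid, tail_bond):
        for rn, dn in zip(loop_mid, loop_bond):
            diff = tuple(a - b for a, b in zip(rm, rn))
            norm3 = math.sqrt(sum(c * c for c in diff)) ** 3
            total += sum(a * b for a, b in zip(diff, _cross(dm, dn))) / norm3
    return total / (4 * math.pi)
```

A long straight tail passing through the center of a circular loop gives a value close to ±1, while a tail that stays well outside the loop gives a value close to 0. In a protein, `loop_points` would hold the backbone between residues i and j and `tail_points` the trimmed N- or C-terminal segment; how gN and gC are combined into the total linking number follows eq 7.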
The degree of entanglement G is defined as
$$G(t) = \frac{1}{M} \sum_{(i,j)} \Theta\big( (i,j) \in NC \;\wedge\; g(i,j,t) \neq g^{\mathrm{native}}(i,j) \big) \qquad (8)$$
where the sum runs over the native contacts (i, j) in the crystal structure; NC is the set of native contacts formed in the current structure at time t; and g(i,j,t) and gnative(i,j) are, respectively, the total linking numbers of the contact (i, j) in the structure at time t and in the native structure, estimated using eq 7. M is the total number of native contacts in the native structure, and Θ is a Heaviside step function that equals 1 if the condition is true and 0 if the condition is false.
The difference between g(i,j,t) and G(t) is that g(i,j,t) characterizes the entanglement of a single contact (i, j) in a given structure at time t, whereas G(t) summarizes how many native contacts have changed their entanglement at time t.
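The relation between the per-contact linking number and G(t) can be illustrated with a short Python sketch. It assumes, following the definitions above, that a native contact contributes to G(t) when it is formed at time t and its total linking number differs from the native value, with the count normalized by M. The function name and the dictionary-based data layout are hypothetical.

```python
def degree_of_entanglement(g_native, g_current, formed_contacts):
    """G(t): share of native contacts that are formed and changed in linking.

    g_native        maps each native contact (i, j) to its native linking number
    g_current       maps each native contact (i, j) to its linking number at time t
    formed_contacts is the set NC of native contacts formed at time t
    """
    m = len(g_native)  # M: total number of native contacts
    changed = sum(
        1 for contact in formed_contacts
        if contact in g_native and g_current[contact] != g_native[contact]
    )
    return changed / m
```

With four native contacts, of which three are formed and one of those has lost its native entanglement, the function returns 0.25; a contact whose linking number differs but which is not currently formed does not contribute.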
Clustering and Coarse-Graining Conformational Space (Q, G)
The projection of conformation space onto (Q,G) reveals intermediate states that may be hidden when projected onto Q alone, as two states can have the same value of Q but one may be entangled while the other is not. Entanglement can prevent a protein from reaching its native state, as the loop-threading segment is improperly organized. Entangled states thus can form kinetic traps with large energy barriers preventing progression to the folded state, as large sections of the protein must unfold to allow disentanglement. To derive the log probability surface as a function of (Q,G), we first combined (Q, G) data from refolding and posttranslational folding for each protein and applied the Min–Max algorithm44 for normalization. k-means++ clustering45 was then utilized to identify microstates, with 200, 400, and 400 clusters (microstates) being used for DHFR, CAT-III, and DDLB, respectively. As k-means++ is a distance-based clustering algorithm, the normalization of data was necessary to prevent one order parameter from dominating the distance measure. The resulting clusters were further coarsened into a small number of metastable states using the PCCA+ algorithm46 to facilitate the interpretation of the folding pathways. The number of metastable states was determined based on the presence of a gap in the eigenvalue spectrum of the transition probability matrix; 11, 14, and 13 metastable states were used for DHFR, CAT-III, and DDLB, respectively. Both the clustering and coarse-graining processes were performed by using the PyEMMA47 and Deeptime48 packages.
Identify Folding Pathways along the Order Parameters (Q, G)
To identify folding pathways from the simulated trajectories, the following procedure was followed:
- (1) For each discrete trajectory, the state of the first frame is added to the pathway.
- (2) The trajectory is then advanced, and the next state that differs from the last recorded state in the pathway is identified. If this state has not yet been recorded in the pathway, it is added to the pathway. If the state has already been recorded in the pathway, the pathway is truncated at the first instance of the recorded state and the trajectory is advanced from that point.
- (3) Step (2) is repeated until the end of the trajectory is reached.
This process resulted in pathways that contained no loops and recorded only the on-pathway states for each discrete trajectory. The distribution of distinct pathways and the probabilities of transitioning from one state to another were then estimated based on the pathways of all of the discrete trajectories. The initial, folded, and misfolded states (in the folding/misfolding pathways plots) are colored yellow, blue, and red, respectively. A state is considered misfolded if there is a trajectory that becomes trapped in that state and there is no direct transition from it to the native state. The size of the nodes is proportional to the probability of the state appearing in the coarse-grained trajectories. The size of the edges connecting the nodes is proportional to the number of transitions between states, and the red number beside the edge is the total number of transitions observed in the coarse-grained trajectories.
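The three steps above amount to erasing loops from each discrete trajectory. A minimal Python sketch is shown here; the function names are hypothetical, and states may be any hashable labels such as metastable-state indices.

```python
from collections import Counter

def extract_pathway(states):
    """Loop-free pathway of a discrete trajectory (steps 1 to 3)."""
    pathway = [states[0]]                      # step (1): record the first state
    for s in states[1:]:
        if s == pathway[-1]:                   # still in the last recorded state
            continue
        if s in pathway:                       # revisit: cut the loop back to s
            del pathway[pathway.index(s) + 1:]
        else:                                  # new state: extend the pathway
            pathway.append(s)
    return pathway

def transition_counts(pathways):
    """Number of times each state-to-state transition occurs across pathways."""
    counts = Counter()
    for p in pathways:
        counts.update(zip(p, p[1:]))
    return counts
```

For instance, the discrete trajectory `[0, 0, 1, 2, 1, 3]` yields the pathway `[0, 1, 3]`: the excursion to state 2 is removed when the trajectory returns to state 1. Transition counts gathered over all pathways can then be normalized to estimate transition probabilities.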
Back-Mapping the Coarse-Grained Model to an All-Atom Model for Visualization
To backmap the coarse-grained model to all-atom representation, the first step was to add coarse-grained interaction sites that represent the side-chain center of mass near the corresponding Cα beads. Then, the orientation of the side-chain center of mass beads was optimized through energy minimization while restraining the Cα positions. Next, Pulchra software49 was used to rebuild the nonhydrogen atoms of both the backbone and the side chain. Finally, additional energy minimization was performed in vacuum with position restraints applied to all Cα atoms to obtain the final all-atom structure.
Results and Discussion
DHFR Folds More Efficiently due to Protein Synthesis
To understand the influence of protein synthesis and the ribosome on the folding of DHFR, we constructed a topology-based coarse-grain model (see the Materials and Methods section) and simulated its folding through two different processes. First, we simulated protein refolding starting from a thermally unfolded ensemble. Second, to probe its folding when synthesized by the ribosome, we simulated continuous synthesis and posttranslational folding. This model has been previously shown to reproduce the cotranslational folding of HemK N-terminal domain,2 accurately predict changes in enzyme-specific activities,11 and to predict misfolded conformations of GlpD that qualitatively agree with LiP-MS experiments.35 To characterize the similarities and differences in how proteins reach the native state, we only analyzed the trajectories that resulted in successful folding.
We find that DHFR folds more efficiently when it is synthesized by the ribosome and then folds posttranslationally than when it refolds from the unfolded ensemble, where some trajectories remain trapped in misfolded states (Q < Qthreshold) during the 6 μs of simulation time. Specifically, after synthesis, DHFR rapidly transitions from the initial structural ensemble to the folded ensemble. Since these simulations are out-of-equilibrium, we cannot speak of free-energy landscapes, which are time-independent; instead, we compute log probability landscapes (Figure 2a), which are time-dependent. This nonequilibrium landscape perspective for refolding and posttranslational folding simulations reveals differences between the two processes. DHFR has a well-defined structure composed of two main subdomains: the adenosine binding subdomain (ABD, residues 38–106) and the discontinuous loop subdomain (DLD, residues 1–37 and 107–159) (Figure 1a). In posttranslational folding simulations, this protein samples a smaller region of Q, and the ABD domain folds cotranslationally and is in its native form (QABD = 0.98; Figure S2) at the start of posttranslational simulations. The DLD domain, consisting of both the N-terminus outside of the ribosome exit tunnel and the C-terminus, which is still within the exit tunnel, has a lower degree of native contacts, QDLD = 0.27 (Figure S2). As a result, at the start of the posttranslational simulation, the overall structure of DHFR has approximately 60% of its native contacts formed, and the protein simply rearranges the DLD domain into the correct registry when the C-terminus is released from the exit tunnel. All trajectories reach the folded state (Q ≥ Qthreshold or Qnormalized ≥ 1) with a median folding time of 20.5 ns (95% confidence interval (CI) [18.5 ns, 24.8 ns], computed from bootstrapping).
In contrast, refolding from the thermally unfolded ensemble involves initial conformations with a high degree of disorder (Q < 0.1 for both ABD and DLD domains; Figure S2), sampling a wider range of the log probability landscape (Figure 2a). Overall, the protein takes a longer time to reach the native state compared to posttranslational folding (Figure 2b), with a median folding time of 140.5 ns (95% CI [114.6 ns, 196.1 ns]) (Table 2). Only 92 (95% CI [86, 97]) trajectories fold out of 100 during the simulation. The difference between the median folding times of posttranslational folding and refolding is significant (p-value < 1 × 10⁻⁶, permutation test; Table 2), as is the difference in the number of folded trajectories (p-value = 0.007; Table 2). In both cases, the folding of DHFR proceeds with the ABD folding into its native form first, followed by the folding of the DLD (Figure S2). The folding of DLD is thus rate-limiting to the formation of the overall native structure.
Table 2
protein | refolding: # folded trajectories [95% CI] | refolding: folding time (ns) [95% CI] | posttranslational folding: # folded trajectories [95% CI] | posttranslational folding: folding time (ns) [95% CI] |
---|---|---|---|---|
DHFR | 92 [86, 97] | 140.5 [114.6, 196.1] | 100 [100, 100] | 20.5 [18.5, 24.8] |
p-value (folded trajectories) = 0.007 | ||||
p-value (folding time) < 10⁻⁶ | ||||
CAT-III | 42 [32, 52] | 2.3 × 10⁵ [6.5 × 10⁴, 1.7 × 10¹²] | 31 [22, 40] | 2.05 × 10⁵ [7.8 × 10⁴, 1.6 × 10¹²] |
p-value (folded trajectories) = 0.14 | ||||
p-value (folding time) = 0.96 | ||||
DDLB | 76 [67, 84] | 522.5 [412.1, 712.2] | 78 [70, 86] | 426.3 [264.7, 690.9] |
p-value (folded trajectories) = 0.87 | ||||
p-value (folding time) = 0.18 | ||||
Protein Synthesis Does Not Increase the Folding Efficiency of CAT-III and DDLB
Using the same simulation protocol as for DHFR, we performed refolding and posttranslational folding simulations for CAT-III and DDLB. In contrast to DHFR, the folding dynamics and population of folded trajectories for CAT-III and DDLB are relatively insensitive to posttranslational folding versus refolding. Specifically, the log probability landscape of CAT-III is almost identical between posttranslational folding and refolding (Figure 2c). The progress of normalized Q of the folded trajectories is similar (Figure 2d), and the difference in the number of folded trajectories is insignificant (p-value = 0.14; Table 2). There are a large number of misfolded trajectories (Q < Qthreshold; Table 2) within the simulation time of 6 μs. The proportion of folded trajectories for CAT-III is less than 50%; we therefore estimated its folding time by fitting the survival probability of the unfolded state as a function of time to a three-state kinetic model (eq 3; see Materials and Methods section). There is no statistical difference in folding times for CAT-III between refolding (2.3 × 10⁵ ns, 95% CI [6.5 × 10⁴ ns, 1.7 × 10¹² ns]) and posttranslational folding (2.05 × 10⁵ ns, 95% CI [7.8 × 10⁴ ns, 1.6 × 10¹² ns]), p-value = 0.96 (Table 2).
In the case of DDLB, more than 50% of trajectories are folded; hence, the median folding time could be estimated. We find that the median folding time in refolding is 522.5 ns (95% CI [412.1 ns, 712.2 ns]), compared to 426.3 ns (95% CI [264.7 ns, 690.9 ns]) in posttranslational folding. We find no significant difference in the median folding times or the number of folded trajectories between the refolding and posttranslational folding simulations (p-value = 0.87 for the number of folded trajectories and p-value = 0.18 for the median folding time comparisons; Figure 2f and Table 2). However, there are some observed differences: the log probability landscape in the posttranslational folding of DDLB sampled a smaller region along the Q coordinate, and the local minima were deeper compared to refolding (Figure 2e). This suggests that some native contacts had already formed cotranslationally, before posttranslational folding began.
To test the influence of simulation time on the results, the misfolded trajectories for CAT-III were extended to 30 μs and the misfolded trajectories for DDLB were extended to 15 μs. We find that only one additional trajectory each from the refolding and posttranslational folding simulations of CAT-III folds during this extended duration, at 15 and 29.2 μs, respectively. No misfolded trajectories of DDLB folded in either the refolding or posttranslational folding simulations. This suggests that these misfolded trajectories are kinetically trapped and unlikely to convert to the folded state at longer time scales—consistent with previously published results.11
Measuring the Folding Mechanisms of Proteins Using Progress Variable ς Reveals the Differences for DHFR and Remains Robust for CAT-III and DDLB
Protein folding is typically thought to occur in a hierarchical fashion, with secondary structural elements first forming individually and then cooperatively coalescing into tertiary structures. With this in mind, we characterize the folding process of DHFR, CAT-III, and DDLB as the temporal sequence of formation of their stable pairs of native secondary structural elements with the aid of a progress variable, ς (see the Materials and Methods section, eq 4). The value of ς is relative to the time of complete folding of the protein, with ς = 0 indicating that the pair folds at the start of the simulation and ς = 1 indicating the pair folds as the last step in the folding process. To simplify the analysis, we restrict ourselves to pairs of secondary structures that have more than one native contact, as described in the Materials and Methods section and Table S2.
Based on this analysis, we observe a significant difference for DHFR. In posttranslational folding, all pairs of native secondary structural elements belonging to the ABD domain fold cotranslationally (ς ~ 0), while in refolding, most of the pairs fold at the end of the folding process (ς ~ 1) (Figure 3a and Table 3). This suggests that vectorial synthesis from the N-terminus to the C-terminus prevents the spontaneous cotranslational folding of some β-sheets at the C-terminus (C1, C2) and that the complete folding of DHFR occurs immediately upon release of the C-terminus from the ribosomal exit tunnel. These observations are consistent with previous experimental studies that have found that the central domain (ABD) acts as an independent folding unit during translation, while the DLD domain folds posttranslationally.30 For CAT-III, the sequence of secondary structure pair folding is similar in both refolding and posttranslational folding, with all pairs folding late during the folding process (ς ~ 1; Figure 3b and Table 3). For DDLB, the overall folding order is similar, but some differences were observed: in posttranslational folding, four pairs in the central domain (C13, C19, C22, and C24) fold cotranslationally (ς = 0) and two pairs in the N-terminal domain (C7, C8) fold posttranslationally but before complete folding occurs (ς ~ 0.65; Figure 3c and Table 3), whereas in refolding these pairs fold as the last event. Thus, protein synthesis and posttranslational folding do not significantly perturb the folding mechanisms of CAT-III and DDLB.
Table 3
protein | refolding | posttranslational folding |
---|---|---|
DHFR | C9 → C12 → C7 → (C8, C13) → (C1, C2, C3, C4, C5, C6, C10, C11) | (C4, C5, C6, C7, C8, C9, C11, C12, C13) → C10 → C3 → C1 → C2 |
CAT-III | C14 → (C5, C7, C8, C9, C12) → (C1, C2, C3, C4, C6, C10, C11, C13) | (C5, C7, C8, C9, C10, C11, C12, C14) → (C1, C2, C3, C4, C6, C13) |
DDLB | (C1, C2, C3, C4, C5, C6, C7, C8, C9, C10, C11, C12, C13, C14, C15, C16, C17, C18, C19, C20, C21, C22, C23, C24, C25, C26, C27, C28) | (C13, C19, C22, C24) → (C7, C8) → (C1, C2, C3, C4, C5, C6, C9, C10, C11, C12, C14, C15, C16, C17, C18, C20, C21, C23, C25, C26, C27, C28) |
Native Entanglements Exist in the Crystal Structure of CAT-III and DDLB Proteins
We hypothesized that there is something distinct about the native topologies of CAT-III and DDLB that leads to a large proportion of misfolding. Several recent papers have predicted a link between misfolding involving a change in the entanglement status and long-lived misfolded states,11,35 including the failure to form native entanglements. Indeed, this is the molecular hypothesis explaining the observation that experimental folding rates of proteins decrease as the number of times the threading segment pierces the loop increases.40 To further understand this phenomenon, we investigate whether entanglement may play a role here by calculating the degree of entanglement for these proteins using eq 7.
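To make the notion of a degree of entanglement concrete, the sketch below evaluates a discretized Gauss double sum between two polylines. Equation 7 is not reproduced in this section; the assumption here is that it is based on a Gauss-type linking integral between a loop and a threading segment, and the sketch shows only the generic numerical technique (midpoint rule over segment pairs), not the exact definition used in this work. The function names and the two test rings are hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _mid(a, b):
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2, (a[2] + b[2]) / 2)


def gauss_linking_sum(curve_a, curve_b):
    """Midpoint-rule estimate of (1/4pi) * double integral of
    r . (dr_a x dr_b) / |r|^3 over two polylines (lists of 3D points)."""
    total = 0.0
    for i in range(len(curve_a) - 1):
        da = _sub(curve_a[i + 1], curve_a[i])
        ma = _mid(curve_a[i], curve_a[i + 1])
        for j in range(len(curve_b) - 1):
            db = _sub(curve_b[j + 1], curve_b[j])
            mb = _mid(curve_b[j], curve_b[j + 1])
            r = _sub(ma, mb)
            total += _dot(r, _cross(da, db)) / math.sqrt(_dot(r, r)) ** 3
    return total / (4.0 * math.pi)


def circle(center, u, v, n=60):
    """Closed unit circle (first point repeated at the end) spanned by u and v."""
    return [tuple(center[d] + math.cos(2 * math.pi * k / n) * u[d]
                  + math.sin(2 * math.pi * k / n) * v[d] for d in range(3))
            for k in range(n + 1)]


ring = circle((0, 0, 0), (1, 0, 0), (0, 1, 0))
threaded = circle((1, 0, 0), (1, 0, 0), (0, 0, 1))   # passes through the ring
separate = circle((5, 0, 0), (1, 0, 0), (0, 0, 1))   # far from the ring
g_threaded = gauss_linking_sum(ring, threaded)
g_separate = gauss_linking_sum(ring, separate)
print(round(abs(g_threaded), 2), round(abs(g_separate), 2))
```

For two closed, singly linked rings the magnitude of the sum is close to 1, and for well-separated rings it is close to 0. For an open threading segment, as in a protein, the same sum generally yields a non-integer value.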
We find that the crystal structure of DHFR does not contain any entanglements. In contrast, CAT-III has 16 native entanglements, with 14 of them consisting of a loop located near the N-terminus and a threading segment at the C-terminus. The remaining two native entanglements have a loop located near the C-terminus and a threading segment at the N-terminus. Similarly, DDLB has 36 native entanglements, half of which consist of a loop located closer to the N-terminus and a threading segment at the C-terminus, while the other half have a loop located closer to the C-terminus and a threading segment at the N-terminus. Representative examples of these entanglements are shown in Figure 4a,b for CAT-III and DDLB, respectively. Furthermore, proteins with an entanglement loop closer to the N-terminus were found to fold with more difficulty than proteins with a loop closer to the C-terminus.50 This explains why DHFR (without native entanglements) folds easily, whereas only a small portion of CAT-III trajectories (most of its entanglement loops are located near the N-terminus) fold in our simulations. This observation suggests that entanglement plays an important role in the proper folding of proteins.
Protein Synthesis Assists the Folding of DHFR by Avoiding Misfolded States with Non-Native Entanglements
To further characterize the folding pathways of DHFR, we clustered the conformational space along the order parameters Q and G and then assigned the resulting clusters to metastable states (see the Materials and Methods section). In posttranslational folding, DHFR spontaneously folds to its native state once the C-terminus is released from the ribosome. The two-dimensional log probability surface is concentrated in the region around the folded state (small G, high Q; Figure 5b), which is consistent with our 1D log probability landscape from the previous section. Specifically, the posttranslational folding simulations of DHFR sample only two states, 5 and 10 (the folded state). There are no misfolded trajectories in posttranslational folding, as all trajectories reach the folded state by the end of the simulation. The protein cotranslationally folds to state 5, which has about 60% of native contacts formed (the fraction of native contacts with non-native entanglement is negligible, around 0.16%), and the folding process simply involves diffusion to the folded state (state 10). Folding network analyses reveal that 100% of folding pathways go directly from the initial state 5 to the folded state 10 (Figure 5d). There is no off-pathway state in the posttranslational folding of DHFR.
Refolding from the thermally unfolded ensemble is more complicated than posttranslational folding. The −ln(P) surface spans a broad region covering both non-native (low Q) and near-native (high Q) conformations. We found that the DHFR refolding simulations sample a large number of entangled states, indicated by high values of G (Figures 5a and S3). The protein follows two parallel pathways to reach the native state: the dominant pathway (* → 5 → 10), which is the only pathway observed in posttranslational folding, accounts for 87% of the total trajectories in the refolding simulations, and a small portion (four trajectories, 4% of the total) folds via intermediate state 7 (* → 7 → 10). In addition, 9% of trajectories become trapped in misfolded states (states 7–9): five trajectories in state 7, three in state 8, and one in state 9. The broader −ln(P) surface in refolding is caused by this small number of misfolded trajectories. States 8 and 9 are off-pathway misfolded states, as we do not observe any folding events (conversion to the folded state 10) once the protein visits these states. When the protein samples the near-native state 7, only 40% of trajectories fold successfully (* → 7 → 10/folded), while the remaining 60% end in misfolded states (Figure 5c).
Non-Native Entangled States Act as a Kinetic Trap in Both Refolding and Posttranslational Folding of CAT-III and DDLB
In contrast to DHFR, the ribosome appears to have less effect on the folding/misfolding mechanism of CAT-III. The conformational space is very similar between refolding and posttranslational folding, and these two processes share almost all of the observed states (Figure 6). This is reasonable because, when protein synthesis is completed, only a small portion of native contacts has formed in CAT-III, so the nascent protein can be considered unfolded (Q ~ 30%; Figure 2d). Therefore, when the protein dissociates from the ribosome and undergoes posttranslational folding, the process is similar to folding from the unfolded ensemble.
There are two critical classes of intermediate states in the folding of CAT-III: state 1, which leads to misfolding when some native contacts change entanglement, and state 2, which leads to the native state without a change in entanglement. In posttranslational folding, most trajectories start in state 1 (68%, which then transition to state 3) or state 3 (9%), with some portion of native contacts changing entanglement. These trajectories mainly end up in misfolded states (states 9 and 12). Only 27% (95% CI [19%, 36%]) of all trajectories fold to the native state. In refolding, the process starts in the fully unfolded state 0 and branches to state 1 (40%), where some contacts change entanglement and lead to further misfolding, while a larger number of trajectories go to state 2 (60%) and then fold correctly to the native state. This results in slightly more folded trajectories in refolding (38%, 95% CI [30%, 49%]) than in posttranslational folding. Thus, protein synthesis and posttranslational folding do not increase the folding efficiency of CAT-III compared to refolding but rather cause the protein to partially fold into misfolded intermediate states.
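Confidence intervals of this kind can be estimated by resampling trajectories. The sketch below uses a percentile bootstrap; the procedure used in this work is not stated in this section, and the number of trajectories (100) is an assumption made only for illustration, as are the function and variable names.

```python
import random


def bootstrap_ci(outcomes, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of a list of 0/1 outcomes."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(outcomes)
    fractions = sorted(
        sum(rng.choices(outcomes, k=n)) / n for _ in range(n_boot)
    )
    lower = fractions[int((alpha / 2) * n_boot)]
    upper = fractions[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Assumed data set: 27 folded trajectories (1) out of 100 (hypothetical total).
outcomes = [1] * 27 + [0] * 73
point = sum(outcomes) / len(outcomes)
lower, upper = bootstrap_ci(outcomes)
print(f"folded fraction = {point:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

The width of the interval depends strongly on the number of independent trajectories, so the interval from this sketch should not be expected to reproduce the reported values exactly.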
States 9 and 12 are likely long-lived misfolded states, as even when we extended the simulation time to 30 μs, we did not observe any misfolded trajectories folding to the native state (when considering both Q and G parameters). All of these misfolded states are near-native (high Q) and have a large number of native contacts changing entanglement (Figure S4).
Similar to CAT-III, the ribosome does not aid in the proper folding of DDLB (Figure 7 and Table 4). Our simulations indicate that the overall −ln(P) surface is similar in refolding and posttranslational folding simulations. The dominant folding pathway is * → 2 → 4 → 8 → 12. In the posttranslational folding simulations, if DDLB is in state 2 or 4 after protein synthesis (which occurs in 64% of trajectories), it has a high likelihood of successfully folding posttranslationally (2|4 → folded: 98.4%). On the other hand, if the protein is in state 1 or 3 after protein synthesis (36% of all trajectories in our simulations), it is likely to end in a misfolded state posttranslationally (1|3 → misfolded: 97.7%). Such a dependence of the posttranslational folding outcome on the state formed cotranslationally has also been observed experimentally for other proteins.13,24,25
Table 4
pathway | percent (%) | pathway | percent (%) |
---|---|---|---|
Refolding | | | |
0 → 1 | 28 | 1 → misfolded | 67.9 |
0 → 2 | 72 | 2 → folded | 83.3 |
Posttranslational folding | | | |
cotranslational folding → 1|3 | 36 | 1|3 → misfolded | 97.7 |
cotranslational folding → 2|4 | 64 | 2|4 → folded | 98.4 |
Analysis of refolding pathways shows a distribution similar to that of posttranslational folding, with two classes of trajectories: one leading to correct folding (69%) and the other leading to misfolding (31%). In refolding simulations, proteins start in the fully unfolded state (state 0); those that enter the intermediate misfolded state 1 (28% of transitions, with a change in entanglement) mostly remain trapped in misfolded states (1 → misfolded: 67.9%), while those that enter state 2 mainly transition to the native state 12 (2 → folded: 83.3%).
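The overall folded fractions follow from the branch percentages in Table 4 by the law of total probability. The sketch below assumes that every trajectory ends either folded or misfolded, so that the complement of each listed percentage goes to the other outcome; this two-outcome assumption and the function name are introduced here for illustration only.

```python
def folded_fraction(branches):
    """Sum of P(enter branch) * P(folded | branch) over all branches."""
    entry_total = sum(p_enter for p_enter, _ in branches)
    if abs(entry_total - 1.0) > 1e-6:
        raise ValueError("branch entry probabilities must sum to 1")
    return sum(p_enter * p_fold for p_enter, p_fold in branches)


# (probability of entering the branch, probability of folding from it)
refolding = [
    (0.28, 1.0 - 0.679),  # 0 -> 1, of which 67.9% misfold
    (0.72, 0.833),        # 0 -> 2, of which 83.3% fold
]
posttranslational = [
    (0.36, 1.0 - 0.977),  # states 1|3, of which 97.7% misfold
    (0.64, 0.984),        # states 2|4, of which 98.4% fold
]

refold_yield = folded_fraction(refolding)
post_yield = folded_fraction(posttranslational)
print(f"refolding: {refold_yield:.1%}, posttranslational: {post_yield:.1%}")
```

Under this assumption, the refolding value reproduces the 69% of correctly folding trajectories quoted above, and the posttranslational value comes out a few percentage points lower.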
State 10 is observed in refolding simulations but not in posttranslational folding, while states 7 and 9 are observed in posttranslational folding but not in refolding. These differences are exhibited in a single misfolded trajectory. In both refolding and posttranslational folding, we did not observe the transition from the near-native state 11 to the native state 12.
Overall, protein synthesis does not increase the folding efficiency of CAT-III and DDLB; intermediate states with non-native entanglement form cotranslationally and persist posttranslationally, and these states act as kinetic traps in protein folding. It should be noted that this work uses a “structure-based” model of protein folding, which encodes the native state as the global free-energy minimum in our simulations; hence, the misfolded states (i.e., those observed for CAT-III and DDLB) are metastable and kinetically trapped, meaning that they are separated from the native state by high free-energy barriers and therefore convert to the native state very slowly. One possible limitation of our approach is that the non-native entangled states that we observed could be artifacts of our coarse-grained model. However, in a recent study, we showed that non-native entangled states also occur in all-atom simulations of proteins,43 suggesting that they are not model-dependent. Moreover, recent studies have also reported a correlation between changes in entanglement and digestion patterns from limited proteolysis mass spectrometry.11,35 Taken together, these results suggest that our coarse-grained model predictions are reliable.
Conclusions
Protein folding in vivo is not solely regulated by the ribosome. Various other proteins and folding factors, such as chaperones, play a critical role in the process.51−53 In this study, we aimed to investigate the influence of the ribosome alone on protein folding. While it is commonly believed that the ribosome is generally effective in assisting protein folding to native conformations,14,15,54,55 our data do not consistently support this assumption. We do find that the ribosome increases the folding efficiency of DHFR, in which the two domains ABD and DLD fold independently. The ribosome confines the DLD domain inside the exit tunnel, allowing the ABD domain to fold cotranslationally and without interference; the DLD domain then arranges into the correct native topology once released from the ribosome. In contrast, during refolding, all segments of the protein fold simultaneously, presenting the opportunity for the formation of non-native contacts between amino acids and thus enhancing the probability of being trapped in entangled misfolded states. For CAT-III and DDLB, which contain native entanglements, we did not observe an improvement in folding efficiency due to the ribosome. In some cases, these proteins formed misfolded intermediate states during cotranslational synthesis, and these misfolded states persisted posttranslationally.
In conclusion, our findings suggest that the effect of the ribosome on protein folding is protein-specific and cannot be described by a universal rule. In our simulations, the ribosome improved the folding efficiency of the small, unentangled protein DHFR but not that of the larger, natively entangled proteins CAT-III and DDLB.
Acknowledgments
M.S.L. acknowledges that this work was supported by the National Science Centre, Poland (grant 2019/35/B/ST4/02086). E.P.O. acknowledges support from the National Science Foundation (MCB-1553291) as well as the National Institutes of Health (R35-GM124818). This research was supported in part by the TASK Supercomputer Center in Gdansk and PLGrid Infrastructure in Poland (Prometheus and Ares supercomputers).
Supporting Information Available
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jpcb.3c01694.
Secondary structure diagrams of DHFR, CAT-III, and DDLB; fraction of native contacts versus time of ABD and DLD domains of DHFR; representative structures of the metastable states from clustering of DHFR; representative structures of the metastable states from clustering of CAT-III; representative structures of the metastable states from clustering of DDLB; mRNA templates used in continuous synthesis simulations; and structural definitions from pairs of secondary structures (PDF)
Notes
The authors declare no competing financial interest.
References
1. Voss N. R.; Gerstein M.; Steitz T. A.; Moore P. B. The Geometry of the Ribosomal Polypeptide Exit Tunnel. J. Mol. Biol. 2006, 360, 893–906. 10.1016/j.jmb.2006.05.023.
2. Nissley D. A.; O’Brien E. P. Structural Origins of FRET-Observed Nascent Chain Compaction on the Ribosome. J. Phys. Chem. B 2018, 122, 9927–9937. 10.1021/acs.jpcb.8b07726.
3. O’Brien E. P.; Christodoulou J.; Vendruscolo M.; Dobson C. M. New Scenarios of Protein Folding Can Occur on the Ribosome. J. Am. Chem. Soc. 2011, 133, 513–526. 10.1021/ja107863z.
4. Marino J.; Von Heijne G.; Beckmann R. Small Protein Domains Fold inside the Ribosome Exit Tunnel. FEBS Lett. 2016, 590, 655–660. 10.1002/1873-3468.12098.
5. Nilsson O. B.; Hedman R.; Marino J.; Wickles S.; Bischoff L.; Johansson M.; Müller-Lucks A.; Trovato F.; Puglisi J. D.; O’Brien E. P.; et al. Cotranslational Protein Folding inside the Ribosome Exit Tunnel. Cell Rep. 2015, 12, 1533–1540. 10.1016/j.celrep.2015.07.065.
6. Ciryam P.; Morimoto R. I.; Vendruscolo M.; Dobson C. M.; O’Brien E. P. In Vivo Translation Rates Can Substantially Delay the Cotranslational Folding of the Escherichia Coli Cytosolic Proteome. Proc. Natl. Acad. Sci. U.S.A. 2013, 110, E132–E140. 10.1073/pnas.1213624110.
7. Cabrita L. D.; Cassaignau A. M. E.; Launay H. M. M.; Christopher A. A Structural Ensemble of a Ribosome-Nascent Chain Complex during Co-Translational Protein Folding. Nat. Struct. Mol. Biol. 2017, 23, 278–285. 10.1038/nsmb.3182.
8. Goldman D. H.; Kaiser C. M.; Milin A.; Righini M.; Tinoco I.; Bustamante C. Mechanical Force Releases Nascent Chain-Mediated Ribosome Arrest in Vitro and in Vivo. Science 2015, 348, 457–460. 10.1126/science.1261909.
9. Farías-Rico J. A.; Selin F. R.; Myronidi I.; Frühauf M.; Von Heijne G. Effects of Protein Size, Thermodynamic Stability, and Net Charge on Cotranslational Folding on the Ribosome. Proc. Natl. Acad. Sci. U.S.A. 2018, 115, E9280–E9287. 10.1073/pnas.1812756115.
10. Frydman J. Folding of Newly Translated Proteins In Vivo: The Role of Molecular Chaperones. Annu. Rev. Biochem. 2001, 70, 603–647. 10.1146/annurev.biochem.70.1.603.
11. Jiang Y.; Neti S. S.; Sitarik I.; Pradhan P.; To P.; Xia Y.; Fried S. D.; Booker S. J.; O’Brien E. P. How Synonymous Mutations Alter Enzyme Structure and Function over Long Timescales. Nat. Chem. 2023, 15, 308–318. 10.1038/s41557-022-01091-z.
12. Buhr F.; Jha S.; Thommen M.; Mittelstaet J.; Kutz F.; Schwalbe H.; Rodnina M. V.; Komar A. A. Synonymous Codons Direct Cotranslational Folding toward Different Protein Conformations. Mol. Cell 2016, 61, 341–351. 10.1016/j.molcel.2016.01.008.
13. Alexander L. M.; Goldman D. H.; Wee L. M.; Bustamante C. Non-Equilibrium Dynamics of a Nascent Polypeptide during Translation Suppress Its Misfolding. Nat. Commun. 2019, 10, 2709. 10.1038/s41467-019-10647-6.
14. Liutkute M.; Samatova E.; Rodnina M. V. Cotranslational Folding of Proteins on the Ribosome. Biomolecules 2020, 10, 97. 10.3390/biom10010097.
15. Netzer W. J.; Hartl F. U. Recombination of Protein Domains Facilitated by Co-Translational Folding in Eukaryotes. Nature 1997, 388, 343–349. 10.1038/41024.
16. To P.; Whitehead B.; Tarbox H. E.; Fried S. D. Nonrefoldability Is Pervasive across the E. Coli Proteome. J. Am. Chem. Soc. 2021, 143, 11435–11448. 10.1021/jacs.1c03270.
17. Kaiser C. M.; Goldman D. H.; Chodera J. D.; Tinoco I.; Bustamante C. The Ribosome Modulates Nascent Protein Folding. Science 2011, 334, 1723–1727. 10.1126/science.1209740.
18. Tanaka T.; Hori N.; Takada S. How Co-Translational Folding of Multi-Domain Protein Is Affected by Elongation Schedule: Molecular Simulations. PLoS Comput. Biol. 2015, 11, e1004356. 10.1371/journal.pcbi.1004356.
19. Samelson A. J.; Jensen M. K.; Soto R. A.; Cate J. H. D.; Marqusee S. Quantitative Determination of Ribosome Nascent Chain Stability. Proc. Natl. Acad. Sci. U.S.A. 2016, 113, 13402–13407. 10.1073/pnas.1610272113.
20. Dabrowski-Tumanski P.; Piejko M.; Niewieczerzal S.; Stasiak A.; Sulkowska J. I. Protein Knotting by Active Threading of Nascent Polypeptide Chain Exiting from the Ribosome Exit Channel. J. Phys. Chem. B 2018, 122, 11616–11625. 10.1021/acs.jpcb.8b07634.
21. Tian P.; Steward A.; Kudva R.; Su T.; Shilling P. J.; Nickson A. A.; Hollins J. J.; Beckmann R.; von Heijne G.; Clarke J.; Best R. B. Folding Pathway of an Ig Domain Is Conserved on and off the Ribosome. Proc. Natl. Acad. Sci. U.S.A. 2018, 115, E11284–E11293. 10.1073/pnas.1810523115.
22. Guinn E. J.; Tian P.; Shin M.; Best R. B.; Marqusee S. A Small Single-Domain Protein Folds through the Same Pathway on and off the Ribosome. Proc. Natl. Acad. Sci. U.S.A. 2018, 115, 12206–12211. 10.1073/pnas.1810517115.
23. Liu K.; Maciuba K.; Kaiser C. M. The Ribosome Cooperates with a Chaperone to Guide Multi-Domain Protein Folding. Mol. Cell 2019, 74, 310–319.e7. 10.1016/j.molcel.2019.01.043.
24. Plessa E.; Chu L. P.; Chan S. H. S.; Thomas O. L.; Cassaignau A. M. E.; Waudby C. A.; Christodoulou J.; Cabrita L. D. Nascent Chains Can Form Co-Translational Folding Intermediates That Promote Post-Translational Folding Outcomes in a Disease-Causing Protein. Nat. Commun. 2021, 12, 6447. 10.1038/s41467-021-26531-1.
25. Liu K.; Rehfus J. E.; Mattson E.; Kaiser C. M. The Ribosome Destabilizes Native and Non-Native Structures in a Nascent Multidomain Protein. Protein Sci. 2017, 26, 1439–1451. 10.1002/pro.3189.
26. van den Bedem H.; Bhabha G.; Yang K.; Wright P. E.; Fraser J. S. Automated Identification of Functional Dynamic Contact Networks from X-Ray Crystallography. Nat. Methods 2013, 10, 896–902. 10.1038/nmeth.2592.
27. Leslie A. G. W. Refined Crystal Structure of Type III Chloramphenicol Acetyltransferase at 1·75 Å Resolution. J. Mol. Biol. 1990, 213, 167–186. 10.1016/S0022-2836(05)80129-9.
28. Batson S.; De Chiara C.; Majce V.; Lloyd A. J.; Gobec S.; Rea D.; Fülöp V.; Thoroughgood C. W.; Simmons K. J.; Dowson C. G.; et al. Inhibition of D-Ala:D-Ala Ligase through a Phosphorylated Form of the Antibiotic D-Cycloserine. Nat. Commun. 2017, 8, 1939. 10.1038/s41467-017-02118-7.
29. Arai M.; Iwakura M.; Matthews C. R.; Bilsel O. Microsecond Subdomain Folding in Dihydrofolate Reductase. J. Mol. Biol. 2011, 410, 329–342. 10.1016/j.jmb.2011.04.057.
30. Wales T. E.; Pajak A.; Roeselová A.; Shivakumaraswamy S.; Howell S.; Hartl F. U.; Engen J. R.; Balchin D. Resolving Chaperone-Assisted Protein Folding on the Ribosome at the Peptide Level. bioRxiv 2022. 10.1101/2022.09.23.509153.
31. Liu C. T.; Francis K.; Layfield J. P.; Huang X.; Hammes-Schiffer S.; Kohen A.; Benkovic S. J. Escherichia Coli Dihydrofolate Reductase Catalyzed Proton and Hydride Transfers: Temporal Order and the Roles of Asp27 and Tyr100. Proc. Natl. Acad. Sci. U.S.A. 2014, 111, 18231–18236. 10.1073/pnas.1415940111.
32. Day P. J.; Shaw W. V. Acetyl Coenzyme A Binding by Chloramphenicol Acetyltransferase. Hydrophobic Determinants of Recognition and Catalysis. J. Biol. Chem. 1992, 267, 5122–5127. 10.1016/s0021-9258(18)42739-1.
33. al-Bar O. A. M.; O’Connor C. D.; Giles I. G.; Akhtar M. D-Alanine:D-Alanine Ligase of Escherichia Coli. Expression, Purification and Inhibitory Studies on the Cloned Enzyme. Biochem. J. 1992, 282, 747–752. 10.1042/bj2820747.
34. Nissley D. A.; Vu Q. V.; Trovato F.; Ahmed N.; Jiang Y.; Li M. S.; O’Brien E. P. Electrostatic Interactions Govern Extreme Nascent Protein Ejection Times from Ribosomes and Can Delay Ribosome Recycling. J. Am. Chem. Soc. 2020, 142, 6103–6110. 10.1021/jacs.9b12264.
35. Nissley D. A.; Jiang Y.; Trovato F.; Sitarik I.; Narayan K. B.; To P.; Xia Y.; Fried S. D.; O’Brien E. P. Universal Protein Misfolding Intermediates Can Bypass the Proteostasis Network and Remain Soluble and Less Functional. Nat. Commun. 2022, 13, 3081. 10.1038/s41467-022-30548-5.
36. Eastman P.; Swails J.; Chodera J. D.; McGibbon R. T.; Zhao Y.; Beauchamp K. A.; Wang L. P.; Simmonett A. C.; Harrigan M. P.; Stern C. D.; et al. OpenMM 7: Rapid Development of High Performance Algorithms for Molecular Dynamics. PLoS Comput. Biol. 2017, 13, e1005659. 10.1371/journal.pcbi.1005659.
37. Frishman D.; Argos P. Knowledge-based Protein Secondary Structure Assignment. Proteins 1995, 23, 566–579. 10.1002/prot.340230412.
38. Halder R.; Nissley D. A.; Sitarik I.; O’Brien E. P. Subpopulations of Soluble, Misfolded Proteins Commonly Bypass Chaperones: How It Happens at the Molecular Level. bioRxiv 2021. 10.1101/2021.08.18.456736.
39. Li M. S.; Kouza M.; Hu C. K. Refolding upon Force Quench and Pathways of Mechanical and Thermal Unfolding of Ubiquitin. Biophys. J. 2007, 92, 547–561. 10.1529/biophysj.106.087684.
40. Baiesi M.; Orlandini E.; Seno F.; Trovato A. Exploring the Correlation between the Folding Rates of Proteins and the Entanglement of Their Native States. J. Phys. A Math. Theor. 2017, 50, 504001. 10.1088/1751-8121/aa97e7.
41. Kauffman L. Knots and Physics, XVI; World Scientific Pub. Co.: Singapore, 1993.
42. Niemyska W.; Millett K. C.; Sulkowska J. I. GLN: A Method to Reveal Unique Properties of Lasso Type Topology in Proteins. Sci. Rep. 2020, 10, 15186. 10.1038/s41598-020-71874-2.
43. Vu Q. V.; Sitarik I.; Jiang Y.; Yadav D.; Sharma P.; Fried S. D.; Li M. S.; O’Brien E. P. A Newly Identified Class of Protein Misfolding in All-Atom Folding Simulations Consistent with Limited Proteolysis Mass Spectrometry. bioRxiv 2022. 10.1101/2022.07.19.500586.
44. Visalakshi N. K.; Thangavel K. Impact of Normalization in Distributed K-Means Clustering. Int. J. Soft Comput. 2009, 4, 168–172.
45. Macqueen J. Some Methods for Classification and Analysis of Multivariate Observations, Proceedings of Fifth Berkeley Symposium on Mathematical Statistics and Probability, 1967; pp 281–297.
46. Röblitz S.; Weber M. Fuzzy Spectral Clustering by PCCA+: Application to Markov State Models and Data Classification. Adv. Data Anal. Classif. 2013, 7, 147–179. 10.1007/s11634-013-0134-6.
47. Scherer M. K.; Trendelkamp-Schroer B.; Paul F.; Pérez-Hernández G.; Hoffmann M.; Plattner N.; Wehmeyer C.; Prinz J. H.; Noé F. PyEMMA 2: A Software Package for Estimation, Validation, and Analysis of Markov Models. J. Chem. Theory Comput. 2015, 11, 5525–5542. 10.1021/acs.jctc.5b00743.
48. Hoffmann M.; Scherer M.; Hempel T.; Mardt A.; de Silva B.; Husic B. E.; Klus S.; Wu H.; Kutz N.; Brunton S. L.; Noé F. Deeptime: A Python Library for Machine Learning Dynamical Models from Time Series Data. Mach. Learn. Sci. Technol. 2022, 3, 015009. 10.1088/2632-2153/ac3de0.
49. Rotkiewicz P.; Skolnick J. Fast Procedure for Reconstruction of Full-Atom Protein Models from Reduced Representations. J. Comput. Chem. 2008, 29, 1460–1465. 10.1002/jcc.20906.
50. Baiesi M.; Orlandini E.; Seno F.; Trovato A. Sequence and Structural Patterns Detected in Entangled Proteins Reveal the Importance of Co-Translational Folding. Sci. Rep. 2019, 9, 8426. 10.1038/s41598-019-44928-3.
51. Willmund F.; Del Alamo M.; Pechmann S.; Chen T.; Albanèse V.; Dammer E. B.; Peng J.; Frydman J. The Cotranslational Function of Ribosome-Associated Hsp70 in Eukaryotic Protein Homeostasis. Cell 2013, 152, 196–209. 10.1016/j.cell.2012.12.001.
52. Hartl F. U.; Hayer-Hartl M. Converging Concepts of Protein Folding in Vitro and in Vivo. Nat. Struct. Mol. Biol. 2009, 16, 574–581. 10.1038/nsmb.1591.
53. Kramer G.; Boehringer D.; Ban N.; Bukau B. The Ribosome as a Platform for Co-Translational Processing, Folding and Targeting of Newly Synthesized Proteins. Nat. Struct. Mol. Biol. 2009, 16, 589–597. 10.1038/nsmb.1614.
54. Waudby C. A.; Burridge C.; Cabrita L. D.; Christodoulou J. Thermodynamics of Co-Translational Folding and Ribosome–Nascent Chain Interactions. Curr. Opin. Struct. Biol. 2022, 74, 102357. 10.1016/j.sbi.2022.102357.
55. Frydman J.; Erdjument-Bromage H.; Tempst P.; Ulrich Hartl F. Co-Translational Domain Folding as the Structural Basis for the Rapid de Novo Folding of Firefly Luciferase. Nat. Struct. Biol. 1999, 6, 697–705. 10.1038/10754.