RCI protocol

Step-by-step RCI protocol

The protocol presented here consists of four main blocks: 1) reference correction of chemical shift assignments; 2) calculation of sequence-specific random coil chemical shifts; 3) calculation of the Random Coil Index; and 4) prediction of model-free order parameters and root mean square fluctuations of NMR and MD ensembles.

The performance of the RCI method is strongly dependent on having properly referenced chemical shift data. Users must ensure that the backbone chemical shift assignments of the protein of interest have been correctly referenced to DSS (2,2-dimethyl-2-silapentane-5-sulfonic acid) according to IUPAC-IUBMB standards ¹⁷. The indirect chemical referencing method has been described in detail in a number of publications ^17,18 and the referencing process is usually indicated in BMRB chemical shift assignment files. A recent study revealed that 20%-30% of published assignments require reference correction of one or more nuclei ¹⁹. Users who are uncertain about the quality or correctness of the chemical shift referencing for their protein should include steps 5-11 of the PROCEDURE below. These steps describe a simple referencing correction procedure that is based on a global minimization of the average difference between the chemical shifts in helices and b-strands of a particular protein and the chemical shifts commonly observed in these secondary structure elements in a large protein database.

To generate an accurate set of RCI values it is necessary to first determine a set of sequence-specific random coil chemical shifts (see step 2). In the current protocol, these random coil shifts are calculated from the random coil chemical shifts ²⁰ and neighboring residue corrections ²¹ published by the Dyson group. We chose these reference values because they were obtained under conditions (8 M urea, pH 2.3) that are expected to significantly diminish possible long-range intra-peptide interactions. Also, this set of reference values is the only one that includes i+2 and i-2 adjacent residue corrections.

The Random Coil Index is calculated as the inverse of the average of weighted secondary chemical shifts (see equation 1 below). Weighting coefficients were optimized by maximizing the correlation between RCI and the RMSF of MD ensembles of 14 proteins. The RCI expression was then tested using a set of four new proteins, as well as with a leave-one-out approach on the training set of 14 proteins. Coefficients of correlation between RCI and MD RMSF were 0.82 for both training and test sets. RCI was also found to correlate well with model-free order parameters and RMSFs of NMR ensembles of all 18 proteins, with correlation coefficients of 0.77 and 0.81, respectively. Details of the development and tests of the RCI method have been published elsewhere ¹². Based on the good correlation between RCI and the aforementioned measures of motional amplitudes, we determined empirical expressions that can be used to convert RCI into these parameters. These equations are presented at the end of this protocol.

Despite the good overall agreement between RCI and protein flexibility, one should not over-interpret RCI values at the protein termini. Since the effects of polypeptide chain termination on random coil values and neighboring residue corrections are not well known and not fully taken into account in RCI calculations, the accuracy of RCI flexibility predictions is expected to be low for these regions. While the end-effect correction step makes RCI a slightly better predictor of protein flexibility, the improved correlation is not related to the effects of conformational averaging on chemical shifts. Instead, it apparently originates from the indirect correlation between the increase of secondary chemical shifts and protein flexibility due to their shared dependence on the proximity to the terminal residue. We generally recommend that users exclude RCI values of the three terminal residues from consideration or, if the end-effect correction is applied, interpret their values as the commonly observed profile of protein flexibility at termini.

PROCEDURE

1| Obtain ¹³Ca, ¹³Cb, ¹³CO, ¹Ha, ¹HN, and ¹⁵N assignments from NMR experiments.

2| Calculate the sequence-corrected random coil reference values for the ¹³Ca, ¹³Cb, ¹³CO, ¹Ha, ¹HN, and ¹⁵N chemical shifts by adding the neighboring residue correction factors for residues i+1, i-1, i+2, and i-2 ²¹ to the random coil values of residue i ²⁰.

3| Calculate the secondary chemical shifts for each residue in the protein by subtracting the sequence-corrected random coil chemical shift values from the experimental chemical shifts of the corresponding residue. For example, if the first residue is Ala, the sequence-corrected random coil ¹Ha shift of Ala should be subtracted from the experimental ¹Ha shift of Ala, and so on. This must be done for each assigned backbone shift (¹H, ¹³C and ¹⁵N). For glycines, use the average of the two ¹Ha shifts.
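Steps 2 and 3 can be sketched in a few lines of Python. The example below is a minimal illustration, not the authors' implementation: it covers ¹³Ca only, copies a four-residue subset of the values in Supplementary Tables 1 and 2, and assumes that each correction column (i-2, i-1, i+1, i+2) is looked up by the amino acid found at that position relative to residue i. The function and dictionary names are hypothetical.

```python
# Illustrative subset of 13Ca random coil values (Supplementary Table 1).
RANDOM_COIL_CA = {"A": 52.8, "G": 45.4, "L": 55.5, "P": 63.7}

# Illustrative subset of 13Ca neighbor corrections (Supplementary Table 2),
# keyed by the position of the neighbor relative to residue i.
NEIGHBOR_CA = {
    -2: {"A": 0.01, "G": 0.0, "L": 0.02, "P": 0.04},
    -1: {"A": 0.06, "G": 0.0, "L": 0.03, "P": 0.02},
    +1: {"A": -0.17, "G": 0.0, "L": -0.1, "P": -2.0},
    +2: {"A": -0.02, "G": 0.0, "L": -0.01, "P": -0.22},
}


def corrected_random_coil(sequence, i, random_coil, neighbor):
    """Step 2: random coil shift of residue i plus neighbor corrections."""
    value = random_coil[sequence[i]]
    for offset, table in neighbor.items():
        j = i + offset
        # Neighbors beyond the chain termini contribute nothing (assumption).
        if 0 <= j < len(sequence):
            value += table[sequence[j]]
    return value


def secondary_shift(experimental, sequence, i, random_coil, neighbor):
    """Step 3: experimental shift minus sequence-corrected random coil shift."""
    return experimental - corrected_random_coil(sequence, i, random_coil, neighbor)


if __name__ == "__main__":
    # Leu at index 2 of the toy sequence GALPA, followed by a proline.
    print(corrected_random_coil("GALPA", 2, RANDOM_COIL_CA, NEIGHBOR_CA))
```

In the toy sequence the proline at i+1 dominates the correction, which is why the i+1 column for proline is so much larger than the other entries.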

4| If you are certain that your protein is properly referenced, continue to step 12. If not, re-reference the set of chemical shift assignments as described in steps 5-11.

RE-REFERENCING CHEMICAL SHIFT ASSIGNMENTS

5| If ¹Ha shifts are available, determine the protein’s secondary structure from ¹Ha Chemical Shift Index (A). If the ¹Ha shifts are unavailable, replace step 5 with step 9.

A Chemical Shift Index

(i) Assign index values of -1, 0 and +1 to residues with secondary chemical shifts (ppm) in the respective ranges tabulated below.

Table 1: Chemical shift indices

	-1	0	+1
¹Ha (all amino acids)	< -0.1	-0.1 to 0.1	> 0.1
¹³Ca (all amino acids except proline)	< -0.7	-0.7 to 0.7	> 0.7
¹³Ca (proline)	< -4.0	-4.0 to 4.0	> 4.0
¹³CO (all amino acids except proline)	< -0.5	-0.5 to 0.5	> 0.5
¹³CO (proline)	< -4.0	-4.0 to 4.0	> 4.0
¹³Cb (all amino acids except glycine and proline)	< -0.7	-0.7 to 0.7	> 0.7
¹³Cb (proline)	< -4.0	-4.0 to 4.0	> 4.0

(ii) Referring to the table below, assign a helical state to a group of four or more residues with 'helix' indices not interrupted by a residue with a 'sheet' index. Assign a b-strand state to a group of three or more residues with 'sheet' indices not interrupted by a residue with a 'helix' index.

Table 2: Helix and sheet indices

	-1	+1
¹Ha	helix	sheet
¹³Ca	sheet	helix
¹³CO	sheet	helix
¹³Cb	Not applicable*	sheet

*¹³Cb secondary chemical shifts should only be used for prediction of b-strands.

(iii) When a gap in the aforementioned patterns occurs, use a "local density" criterion to determine the secondary structure of the gap residues. Assign either a helical or b-strand state to a stretch of five residues if 70% of these residues have 'helix' or 'sheet' indices, respectively.

(iv) Identify termination points of helices and b-strands by the first appearance of two consecutive zero indices or indices with the sign opposite to those of the corresponding secondary structure.

(v) If a protein region demonstrates a pattern of chemical shift indices that is not consistent with any aforementioned rules, the secondary structure is assigned as coil.
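The index assignment in (i) and the helix/sheet lookup of Table 2 can be sketched as follows. This is a minimal illustration with hypothetical function names; it covers only the per-residue index and its interpretation, not the run-length, local-density and termination rules in (ii)-(iv).

```python
# Cutoffs (ppm) from Table 1 for non-proline residues; proline uses 4.0 ppm
# for the 13C nuclei, while 1Ha uses 0.1 ppm for all amino acids.
CUTOFF = {"HA": 0.1, "CA": 0.7, "CO": 0.5, "CB": 0.7}
PROLINE_CARBON_CUTOFF = 4.0

# Table 2: meaning of each non-zero index. 13Cb is only used for b-strands.
STATE = {
    ("HA", -1): "helix", ("HA", +1): "sheet",
    ("CA", -1): "sheet", ("CA", +1): "helix",
    ("CO", -1): "sheet", ("CO", +1): "helix",
    ("CB", +1): "sheet",
}


def csi_index(nucleus, residue, secondary_shift):
    """Return -1, 0 or +1 for one secondary chemical shift (Table 1)."""
    if residue == "P" and nucleus != "HA":
        cutoff = PROLINE_CARBON_CUTOFF
    else:
        cutoff = CUTOFF[nucleus]
    if secondary_shift > cutoff:
        return 1
    if secondary_shift < -cutoff:
        return -1
    return 0


def csi_state(nucleus, index):
    """Translate an index into 'helix', 'sheet' or 'none' (Table 2)."""
    return STATE.get((nucleus, index), "none")


if __name__ == "__main__":
    idx = csi_index("HA", "A", -0.25)
    print(idx, csi_state("HA", idx))
```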

6| Subtract the chemical shift of each residue in identified helices and/or b-strands from the average chemical shift of the corresponding residue that is commonly observed in these secondary structure elements. The values of the commonly observed chemical shifts can be found in Supplementary Table 3 ²².

7| Calculate the reference offset for each type of shift (¹³Ca, ¹³Cb, ¹³CO, ¹Ha, ¹HN, and ¹⁵N) by averaging the aforementioned differences between experimental and commonly observed chemical shifts over all helices and/or b-strands in the protein.

8| Add the calculated offsets to the values of the corresponding experimental chemical shifts of each residue to correct the spectral referencing.
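Steps 6-8 amount to averaging a set of differences and adding the result back to the data. The sketch below illustrates this for ¹³Ca with a two-residue subset of Supplementary Table 3; the data layout (tuples of residue type, assigned state and shift) and the function names are assumptions made for the example.

```python
# Illustrative subset of 13Ca averages from Supplementary Table 3 (ppm).
COMMON_CA = {
    ("A", "helix"): 54.86, ("A", "strand"): 50.86,
    ("L", "helix"): 57.54, ("L", "strand"): 53.94,
}


def reference_offset(observations, common):
    """Steps 6-7: mean of (commonly observed - experimental) over helices/strands.

    observations is a list of (residue_type, state, experimental_shift).
    Residues in coil regions are ignored.
    """
    diffs = [
        common[(aa, state)] - shift
        for aa, state, shift in observations
        if state in ("helix", "strand")
    ]
    return sum(diffs) / len(diffs) if diffs else 0.0


def apply_offset(observations, offset):
    """Step 8: add the offset to every experimental shift of this nucleus."""
    return [(aa, state, shift + offset) for aa, state, shift in observations]


if __name__ == "__main__":
    data = [("A", "helix", 55.36), ("L", "strand", 54.44), ("A", "coil", 52.9)]
    offset = reference_offset(data, COMMON_CA)
    print(offset, apply_offset(data, offset))
```

Because the offset is defined as commonly observed minus experimental, adding it to the experimental shifts moves them toward the database averages.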

9| Determine the secondary structure from the chemical shift indices for ¹³Ca, ¹³Cb, ¹³CO, and ¹Ha, if available (A).

10| Determine the final secondary structure assignment for the protein using either rule A or rule B (below) depending on availability of chemical shift assignments.

A) If three or more types of chemical shift are available for a residue, the final prediction is the consensus secondary structure that is predicted by the majority of shifts. In case of b-strand prediction, if an equal number of shifts predict b-strand and non-b-strand structure, use the prediction of the group of shifts without ¹³Cb. Exclude non-b-strand predictions originating from ¹³Cb shifts from the calculation of the consensus, when the choice between a helical structure and a coil structure has to be made.

B) If fewer than three types of chemical shifts are available for a particular residue, the final secondary structure assignment is based on the predictions by individual shifts. Give precedence to these predictions in the following order: ¹³Ca, ¹Ha, ¹³CO, and ¹³Cb.

11| Repeat steps 6-7. If the reference offsets for all types of chemical shifts are small (< 0.0001 ppm), proceed to the next step. If not, repeat steps 8, 9 and 6-7 until the offsets become smaller than 0.0001 ppm.
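The iteration in step 11 can be sketched as a simple loop for one type of chemical shift. The two callables stand in for the secondary structure assignment (steps 9-10) and the offset calculation (steps 6-7); they, the function name and the max_iter safeguard are hypothetical and not part of the protocol.

```python
def rereference(shifts, assign_structure, compute_offset, tol=1e-4, max_iter=100):
    """Repeat offset calculation and correction until the offset is below tol.

    shifts: list of experimental shifts for one nucleus type.
    assign_structure(shifts) -> per-residue states (steps 9-10).
    compute_offset(shifts, structure) -> reference offset in ppm (steps 6-7).
    """
    structure = assign_structure(shifts)
    for _ in range(max_iter):
        offset = compute_offset(shifts, structure)
        if abs(offset) < tol:
            return shifts
        shifts = [s + offset for s in shifts]   # step 8
        structure = assign_structure(shifts)    # steps 9-10
    raise RuntimeError("reference offsets did not converge")


if __name__ == "__main__":
    # Toy stand-ins: every residue is helical, with a common value of 57.0 ppm.
    all_helix = lambda shifts: ["helix"] * len(shifts)
    mean_diff = lambda shifts, _s: sum(57.0 - x for x in shifts) / len(shifts)
    print(rereference([56.0, 57.0], all_helix, mean_diff))
```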

12| Remove secondary chemical shifts of cysteine residues.

13| Substitute ¹³C, ¹H, and ¹⁵N secondary chemical shifts below 0.04 ppm, 0.01 ppm and 0.1 ppm, respectively, with the corresponding “floor values” (0.04 ppm for ¹³Ca, ¹³Cb, ¹³CO, 0.01 ppm for ¹Ha and ¹HN, and 0.1 ppm for ¹⁵N).
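Steps 12 and 13 can be sketched as a single per-value filter. The example assumes that the floor values are applied to the absolute value of each secondary shift (equation 1 uses absolute values), that both cysteine codes used in the supplementary tables (C and B) are removed, and that a missing value is represented by None; these are assumptions of the sketch, as are the names.

```python
# Floor values (ppm) from step 13.
FLOOR = {"CA": 0.04, "CB": 0.04, "CO": 0.04, "HA": 0.01, "HN": 0.01, "N": 0.1}
CYSTEINE_CODES = {"C", "B"}  # oxidized and reduced cysteine (assumption)


def prepare_shift(residue, nucleus, secondary_shift):
    """Drop cysteine shifts (step 12) and apply the floor value (step 13)."""
    if residue in CYSTEINE_CODES or secondary_shift is None:
        return None
    return max(abs(secondary_shift), FLOOR[nucleus])


if __name__ == "__main__":
    print(prepare_shift("A", "CA", -0.01))  # raised to the 13C floor
    print(prepare_shift("C", "N", 1.0))     # cysteine is removed
```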

14| Fill any gaps in the per-residue distributions of the secondary chemical shifts. If a chemical shift for residue i is missing, use the average of the secondary chemical shifts of residues i+1 and i-1 as a proxy for the secondary chemical shift of residue i. If the chemical shifts of residues i+1 and/or i-1 are missing, include the secondary chemical shifts of residues i+2 and/or i-2 in the calculation of the average.

CRITICAL STEP This step is optional. While we consider filling gaps in per-residue distributions of secondary chemical shifts appropriate in the middle of a long (≥ 5 residues) secondary structure element (helix, b-strand, long loop), it may not be desirable to do so for shorter stretches of secondary structure and at transition points between secondary structure elements with significantly different dynamic properties (e.g. a transition from a rigid helix to a mobile loop).
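One possible reading of the gap-filling rule is sketched below: on each side of a missing value, the nearest assigned residue within two positions is used, and the available values are averaged. The protocol text leaves some freedom in how i+2 and i-2 enter the average, so this interpretation, the use of None for missing values and the function name are all assumptions.

```python
def fill_gaps(values):
    """Fill missing (None) secondary shifts from neighbors within two residues."""
    filled = list(values)
    n = len(values)
    for i, value in enumerate(values):
        if value is not None:
            continue
        neighbors = []
        for direction in (-1, +1):
            # Prefer i-1 / i+1; fall back to i-2 / i+2 if that one is missing.
            for distance in (1, 2):
                j = i + direction * distance
                if 0 <= j < n and values[j] is not None:
                    neighbors.append(values[j])
                    break
        if neighbors:
            filled[i] = sum(neighbors) / len(neighbors)
    return filled


if __name__ == "__main__":
    print(fill_gaps([1.0, None, 3.0]))
    print(fill_gaps([1.0, None, None, 4.0]))
```

Only originally assigned values are used as sources, so a filled value never propagates into another gap.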

14| Apply a three-residue averaging to smooth the per-residue distributions of the chemical shifts. This involves taking the secondary shifts of three residues, averaging them and then assigning that average value to the middle residue and repeating the process sequentially for each triplet of residues in the protein. Use the average of the first two and the last two secondary chemical shifts to obtain smoothed values of secondary shifts for the N-terminal and C-terminal residues, respectively.
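The three-residue averaging can be sketched as a sliding window that shrinks to two residues at the chain ends. The sketch assumes that the input list has no missing values (for example, after gap filling); the function name is hypothetical.

```python
def smooth3(values):
    """Three-residue running average; termini use the two nearest values."""
    n = len(values)
    smoothed = []
    for i in range(n):
        # The slice covers i-1..i+1 and is truncated at either terminus.
        window = values[max(0, i - 1): i + 2]
        smoothed.append(sum(window) / len(window))
    return smoothed


if __name__ == "__main__":
    print(smooth3([1.0, 2.0, 3.0, 4.0]))
```

The same function can be reused for the second smoothing pass that is applied to the RCI values later in the procedure.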

15| Multiply ¹³C (¹³Ca, ¹³Cb, and ¹³CO) and ¹H (¹Ha and ¹HN) secondary chemical shifts by 2.5 and 10.0, respectively. If the new secondary chemical shift is below 0.5 ppm, replace it with the "floor value" of 0.5 ppm. Also, replace ¹⁵N secondary chemical shifts below 0.5 ppm with the "floor value" of 0.5 ppm.
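The scaling step can be sketched as one multiplication followed by a floor. The sketch assumes that the inputs are the smoothed, non-negative secondary shifts from the previous steps and treats ¹⁵N as having a scaling factor of 1.0; the names are hypothetical.

```python
# Scaling factors from step 15; 15N is not scaled (factor 1.0).
SCALE = {"CA": 2.5, "CB": 2.5, "CO": 2.5, "HA": 10.0, "HN": 10.0, "N": 1.0}
SCALED_FLOOR = 0.5  # ppm


def scale_shift(nucleus, secondary_shift):
    """Scale a secondary shift and apply the 0.5 ppm floor."""
    return max(secondary_shift * SCALE[nucleus], SCALED_FLOOR)


if __name__ == "__main__":
    for nucleus, value in [("CA", 0.04), ("HA", 0.08), ("N", 0.3), ("CO", 1.0)]:
        print(nucleus, scale_shift(nucleus, value))
```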

16| Determine the weighting coefficients needed to calculate the Random Coil Index (equation 1) that correspond to the set of available secondary chemical shifts (Supplementary Table 4). Multiply each coefficient by 7.5.

17| Calculate the Random Coil Index (RCI) using the following expression.

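\[ \mathrm{RCI} = \left\langle A\,|\Delta\delta_{C\alpha}| + B\,|\Delta\delta_{CO}| + C\,|\Delta\delta_{C\beta}| + D\,|\Delta\delta_{N}| + E\,|\Delta\delta_{NH}| + F\,|\Delta\delta_{H\alpha}| \right\rangle^{-1} \] (1)

where $|\Delta\delta_{C\alpha}|$, $|\Delta\delta_{CO}|$, $|\Delta\delta_{C\beta}|$, $|\Delta\delta_{N}|$, $|\Delta\delta_{NH}|$, and $|\Delta\delta_{H\alpha}|$ are the absolute values of the secondary chemical shifts (in ppm) of Cα, CO, Cβ, N, NH and Hα (written Ca, CO, Cb, N, NH and Ha elsewhere in this protocol), respectively. A, B, C, D, E, and F are weighting coefficients (Supplementary Table 4; note that the table lists the coefficients in the order Ca, CO, Cb, N, Ha, NH). The angle brackets indicate the average. If the RCI value is above 0.6, replace its value with the "ceiling value" of 0.6. At this point, you may wish to apply the end-effect corrections to the N- and C-termini when the RCI values of the first and/or the last three residues progressively decrease toward the corresponding terminal residue (B).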
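Equation 1 together with the coefficient scaling of step 16 can be sketched as follows. The sketch uses the first row of Supplementary Table 4 (all six nuclei available) and assumes that the average is taken over the nuclei supplied for the residue and that the inputs are the scaled secondary shifts from step 15. Both are interpretations rather than statements from the protocol, and the names are hypothetical.

```python
# Weighting coefficients for the full set of nuclei (Supplementary Table 4).
WEIGHTS_ALL = {"CA": 0.74, "CO": 0.72, "CB": 0.13, "N": 0.38, "HA": 0.91, "HN": 0.15}
COEFFICIENT_SCALE = 7.5  # step 16
RCI_CEILING = 0.6


def rci(shifts, weights):
    """Inverse of the average weighted absolute secondary shift (equation 1).

    shifts maps nucleus name -> scaled secondary shift for one residue.
    """
    terms = [COEFFICIENT_SCALE * weights[n] * abs(value) for n, value in shifts.items()]
    average = sum(terms) / len(terms)
    return min(1.0 / average, RCI_CEILING)


if __name__ == "__main__":
    at_floor = {nucleus: 0.5 for nucleus in WEIGHTS_ALL}
    print(rci(at_floor, WEIGHTS_ALL))
```

Large secondary shifts (well-ordered residues) give small RCI values, while shifts close to the random coil values push RCI toward the ceiling.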

B End-effect corrections

(i) Identify the largest RCI value among the four terminal residues.

(ii) Calculate the difference between this maximal RCI value and the RCI value of each remaining terminal residue.

(iii) Multiply the calculated difference by two and add the result to the RCI value of corresponding residue. If the new RCI value is above 0.6, replace it with the “ceiling value” of 0.6.
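The end-effect correction can be sketched as below. The sketch applies rules (i)-(iii) to the four residues at each terminus and corrects every residue in that group other than the maximum; whether the correction should be restricted to residues between the maximum and the chain end is not specified here, so this is one possible reading. The function name and the n_terminal parameter are hypothetical.

```python
def end_effect_correction(rci_values, n_terminal=4, ceiling=0.6):
    """Apply end-effect corrections (i)-(iii) to both termini."""
    n = len(rci_values)
    corrected = list(rci_values)
    if n == 0:
        return corrected
    groups = (range(0, min(n_terminal, n)), range(max(0, n - n_terminal), n))
    for group in groups:
        peak = max(rci_values[i] for i in group)          # rule (i)
        for i in group:
            difference = peak - rci_values[i]             # rule (ii)
            corrected[i] = min(rci_values[i] + 2.0 * difference, ceiling)  # rule (iii)
    return corrected


if __name__ == "__main__":
    profile = [0.1, 0.2, 0.3, 0.25, 0.05, 0.05, 0.2, 0.3, 0.2, 0.1]
    print(end_effect_correction(profile))
```

The residue carrying the maximum is left unchanged because its difference is zero, so the corrected profile rises toward each terminus instead of falling.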

18| Apply a second three-residue averaging to smooth the per-residue distributions of the RCI values. Use the average of the first two and the last two RCI values to obtain smoothed values of RCI for the N-terminal and C-terminal residues, respectively.

19| Predict values of root mean square fluctuations of MD ensembles (MD RMSF), root mean square fluctuations of NMR ensembles (NMR RMSF) and model-free order parameters (S²) ^15,16 using the following expressions.

S² = 1 - 0.5 ln (1 + RCI * 10.0) (2)

RMSF (MD) = RCI * 23.6 Å (3)

RMSF (NMR) = RCI * 12.7 Å (4)

CRITICAL STEP While these relationships are based on the strong correlation between RCI and motional amplitudes ¹², we stress that these expressions are empirical. When interpreting RCI in terms of MD RMSF, NMR RMSF, and S², one should realize that the motions described by these parameters may have quite different time-scales. S² and MD RMSF represent protein dynamics on the ps-ns time-scale, while RCI is expected to be mostly affected by fast chemical exchange from ps to µs-ms (the coalescence points of the nuclei involved in RCI calculations). More precise identification of motional time-scales using the RCI method is not possible. The time-scale of NMR ensembles is hard to estimate. The structural distribution in NMR ensembles often originates from enhanced conformational sampling at the high-temperature steps of MD-based structure calculation protocols and depends heavily on the balance between NMR restraints and the contributions of non-restraint terms of NMR force-fields. Regions of NMR ensembles calculated with a small number of restraints may demonstrate unrealistic structural diversity and, as a result, low agreement with RCI due to simplifications in the non-bonded energy terms of the NMR force-field. However, the high correlation observed between RCI and the aforementioned motional amplitudes ¹² suggests that, despite the possible differences in time-scales, these parameters describe dynamic phenomena with a common determinant (e.g. protein atomic density) and agree in identifying "hot" kinetic spots in protein structures.
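Equations 2-4 translate directly into code. The sketch below is a plain transcription with hypothetical function names; it assumes that the natural logarithm is intended in equation 2 and that the RMSF values are returned in Å.

```python
import math


def order_parameter(rci):
    """Equation 2: predicted model-free order parameter S2."""
    return 1.0 - 0.5 * math.log(1.0 + rci * 10.0)


def rmsf_md(rci):
    """Equation 3: predicted RMSF of an MD ensemble, in angstroms."""
    return rci * 23.6


def rmsf_nmr(rci):
    """Equation 4: predicted RMSF of an NMR ensemble, in angstroms."""
    return rci * 12.7


if __name__ == "__main__":
    for value in (0.02, 0.1, 0.6):
        print(value, order_parameter(value), rmsf_md(value), rmsf_nmr(value))
```

With the ceiling value of 0.6, the predicted order parameter is close to zero and the predicted MD RMSF is about 14 Å, which bounds the range of values these expressions can return.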

TIMELINE

When backbone assignments are available, chemical shifts can be re-referenced and protein flexibility can be predicted from the RCI protocol using a spreadsheet program within 1-2 hours. Users may also code the protocol into a spreadsheet program so that subsequent analyses for new proteins can be performed in just seconds.

ANTICIPATED RESULTS

RCI correlates with the conventional measures of protein motional amplitudes with an average correlation coefficient of ~0.8 ¹² (Table 3). Figure 1 shows examples of predicted MD RMSF, NMR RMSF and S² and their comparison with values obtained by standard methods.

Table 3: Mean coefficients of correlation between RCI and conventional measures of motional amplitudes ¹².

Parameter	# of proteins	# of residues^b	Correlation^c
MD RMSF	18	2187	0.82
NMR RMSF	17	2-15	0.81
Order parameter^a	18	2187	0.77

^a Order parameters predicted with a contact model ²⁴ were used whenever experimentally obtained order parameters were not available. ^b Number of residues in PDB files. ^c Mean correlation coefficient.

More detailed correlation statistics and 19 additional plots with examples of excellent correlation between RCI and these parameters can be found in the first report about the RCI method ¹².

In general, it is not necessary to convert RCI to MD RMSF, NMR RMSF, and S² to effectively identify regions with elevated mobility in proteins. The information content of RCI and the quality of the flexibility predictions do not change upon such a conversion. However, RCI-derived traditional measures of motional amplitudes may be easier to relate to models of protein motions and to compare with the corresponding parameters obtained with standard methods. In such cases, equations 2, 3 and 4 can be used.

REFERENCES

1. Carugo, O. & Argos, P. Reliability of atomic displacement parameters in protein crystal structures. Acta Crystallogr D Biol Crystallogr 55 ( Pt 2), 473-8 (1999).

2. Petsko, G. A. & Ringe, D. Fluctuations in protein structure from X-ray diffraction. Annu Rev Biophys Bioeng 13, 331-71 (1984).

3. Hansson, T., Oostenbrink, C. & van Gunsteren, W. Molecular dynamics simulations. Curr Opin Struct Biol 12, 190-6 (2002).

4. Elofsson, A. & Nilsson, L. How Consistent Are Molecular-Dynamics Simulations - Comparing Structure and Dynamics in Reduced and Oxidized Escherichia-Coli Thioredoxin. Journal of Molecular Biology 233, 766-780 (1993).

5. Kay, L. E. Protein dynamics from NMR. Nat Struct Biol 5 Suppl, 513-7 (1998).

6. Ishima, R. & Torchia, D. A. Protein dynamics from NMR. Nat Struct Biol 7, 740-3 (2000).

7. Palmer, A. G., 3rd. Nmr probes of molecular dynamics: overview and comparison with other techniques. Annu Rev Biophys Biomol Struct 30, 129-55 (2001).

8. Lacroix, E., Bruix, M., Lopez-Hernandez, E., Serrano, L. & Rico, M. Amide hydrogen exchange and internal dynamics in the chemotactic protein CheY from Escherichia coli. J Mol Biol 271, 472-87 (1997).

9. Korzhnev, D. M., Orekhov, V. Y. & Arseniev, A. S. Model-free approach beyond the borders of its applicability. J Magn Reson 127, 184-91 (1997).

10. Palmer, A. G., 3rd, Kroenke, C. D. & Loria, J. P. Nuclear magnetic resonance methods for quantifying microsecond-to-millisecond motions in biological macromolecules. Methods Enzymol 339, 204-38 (2001).

11. Fushman, D., Cahill, S. & Cowburn, D. The Main-Chain Dynamics of the Dynamin Pleckstrin Homology (Ph) Domain in Solution - Analysis of N-15 Relaxation With Monomer/Dimer Equilibration. Journal of Molecular Biology 266, 173-194 (1997).

12. Berjanskii, M. V. & Wishart, D. S. A simple method to predict protein flexibility using secondary chemical shifts. J Am Chem Soc 127, 14970-1 (2005).

13. Wishart, D. S., Sykes, B. D. & Richards, F. M. The chemical shift index: a fast and simple method for the assignment of protein secondary structure through NMR spectroscopy. Biochemistry 31, 1647-51 (1992).

14. Wishart, D. S. & Sykes, B. D. The 13C chemical-shift index: a simple method for the identification of protein secondary structure using 13C chemical-shift data. J Biomol NMR 4, 171-80 (1994).

15. Lipari, G. & Szabo, A. Model-Free Approach to the Interpretation of Nuclear Magnetic Resonance Relaxation in Macromolecules. 1. Theory and Range of Validity. Journal of the American Chemical Society 104, 4546-4559 (1982).

16. Clore, G. M. et al. Deviations from the simple two-parameter model-free approach to the interpretation of nitrogen-15 nuclear magnetic relaxation of proteins. Journal of the American Chemical Society 112, 4989-4991 (1990).

17. Markley, J. L. et al. Recommendations for the presentation of NMR structures of proteins and nucleic acids. IUPAC-IUBMB-IUPAB Inter-Union Task Group on the Standardization of Data Bases of Protein and Nucleic Acid Structures Determined by NMR Spectroscopy. J Biomol NMR 12, 1-23 (1998).

18. Wishart, D. S. et al. 1H, 13C and 15N chemical shift referencing in biomolecular NMR. J Biomol NMR 6, 135-40 (1995).

19. Zhang, H., Neal, S. & Wishart, D. S. RefDB: a database of uniformly referenced protein chemical shifts. J Biomol NMR 25, 173-95 (2003).

20. Schwarzinger, S., Kroon, G. J., Foss, T. R., Wright, P. E. & Dyson, H. J. Random coil chemical shifts in acidic 8 M urea: implementation of random coil shift data in NMRView. J Biomol NMR 18, 43-8 (2000).

21. Schwarzinger, S. et al. Sequence-dependent correction of random coil NMR chemical shifts. Journal of the American Chemical Society 123, 2970-2978 (2001).

22. Wang, Y. & Jardetzky, O. Probability-based protein secondary structure identification using combined NMR chemical-shift data. Protein Sci 11, 852-61 (2002).

23. Iwahara, J., Peterson, R. D. & Clubb, R. T. Compensating increases in protein backbone flexibility occur when the Dead ringer AT-rich interaction domain (ARID) binds DNA: a nitrogen-15 relaxation study. Protein Sci 14, 1140-50 (2005).

24. Zhang, F. & Bruschweiler, R. Contact model for the prediction of NMR N-H order parameters in globular proteins. J Am Chem Soc 124, 12654-5 (2002).

Supplementary table 1. Random coil reference chemical shifts ²⁰.

aa	N	CO	Ca	Cb	NH	Ha
A	125	178.5	52.8	19.3	8.35	4.35
C	118.7	175.5	55.6	41.2	8.54	4.76
B	118.8	175.3	58.6	28.3	8.44	4.59
D	119.1	175.9	53	38.3	8.56	4.82
E	120.2	176.8	56.1	29.9	8.4	4.42
F	120.7	176.6	58.1	39.8	8.31	4.65
G	107.5	174.9	45.4	-	8.41	4.02
H	118.1	175.1	55.4	29.1	8.56	4.79
I	120.4	177.1	61.6	38.9	8.17	4.21
K	121.6	177.4	56.7	33.2	8.36	4.36
L	122.4	178.2	55.5	42.5	8.28	4.38
M	120.3	177.1	55.8	32.9	8.42	4.52
N	119	176.1	53.3	39.1	8.51	4.79
P	-	177.8	63.7	32.2	-	4.45
Q	120.5	176.8	56.2	29.5	8.44	4.38
R	121.2	177.1	56.5	30.9	8.39	4.38
S	115.5	175.4	58.7	64.1	8.43	4.51
T	112	175.6	62	70	8.25	4.43
V	119.3	177	62.6	31.8	8.16	4.16
W	122.1	177.1	57.6	29.8	8.22	4.7
Y	120.9	176.7	58.3	38.9	8.26	4.58

aa – amino acid one-letter code. B – reduced cysteine.

Supplementary table 2. Neighboring residue corrections ²¹.

Part A: i+1 and i-1 corrections.

aa	i-1						i+1
	¹⁵N	¹³C’	¹³Ca	¹³Cb	¹H_N	¹Ha	¹⁵N	¹³C’	¹³Ca	¹³Cb	¹H_N	¹Ha
A	-0.57	-0.07	0.06	0	0.07	-0.03	-0.33	-0.77	-0.17	0	-0.05	-0.03
R	1.62	-0.19	-0.01	0	0.15	-0.02	-0.14	-0.49	-0.07	0	-0.02	-0.02
N	0.87	-0.1	0.23	0	0.13	-0.02	-0.26	-0.66	-0.03	0	-0.03	-0.01
D	0.86	-0.13	0.25	0	0.14	-0.02	-0.2	-0.58	0	0	-0.03	-0.01
C^a	3.07	-0.28	0.1	0	0.2	0	-0.26	-0.51	-0.07	0	-0.02	0.02
Q	1.62	-0.18	0.04	0	0.15	-0.01	-0.14	-0.48	-0.06	0	-0.02	-0.02
E	1.51	-0.2	0.05	0	0.15	-0.02	-0.2	-0.48	-0.08	0	-0.03	-0.02
G	0	0	0	0	0	0	0	0	0	0	0	0
H	1.68	-0.22	0.02	0	0.2	0.01	-0.55	-0.65	-0.09	0	-0.04	-0.06
I	4.87	-0.18	-0.01	0	0.17	-0.02	-0.14	-0.58	0.2	0	-0.06	-0.02
L	1.05	-0.13	0.03	0	0.14	-0.05	-0.14	-0.5	-0.1	0	-0.03	-0.03
K	1.57	-0.18	-0.02	0	0.14	-0.01	-0.2	-0.5	-0.11	0	-0.03	-0.02
M	1.57	-0.18	-0.06	0	0.14	-0.01	-0.2	-0.41	0.1	0	-0.02	-0.01
F	2.78	-0.25	0.06	0	0.15	-0.08	-0.49	-0.83	-0.23	0	-0.12	-0.09
P	0.87	-0.09	0.02	0	0.1	-0.03	-0.32	-2.84	-2	0	-0.18	0.11
S	2.55	-0.15	0.13	0	0.19	0	-0.03	-0.4	-0.08	0	-0.03	0.02
T	2.78	-0.13	0.12	0	0.14	0	-0.03	-0.19	-0.04	0	0	0.05
W	3.19	-0.3	0.03	0	0.04	-0.15	-0.26	-0.85	-0.17	0	-0.13	-0.1
Y	3.01	-0.24	0.06	0	0.09	-0.08	-0.43	-0.85	-0.22	0	-0.11	-0.1
V	4.34	-0.18	-0.02	0	0.17	-0.02	-0.14	-0.57	-0.21	0	-0.05	-0.01

Part B: i+2 and i-2 corrections.

	i-2						i+2
aa	¹⁵N	¹³C’	¹³Ca	¹³Cb	¹H_N	¹Ha	¹⁵N	¹³C’	¹³Ca	¹³Cb	¹H_N	¹Ha
A	-0.15	-0.02	0.01	0	-0.1	0	-0.12	-0.11	-0.02	0	-0.01	-0.02
R	-0.06	-0.03	0.02	0	-0.06	0	-0.06	-0.06	0	0	0	-0.02
N	-0.17	-0.03	0.01	0	-0.07	-0.01	-0.18	-0.09	-0.06	0	-0.01	-0.01
D	-0.29	-0.04	-0.01	0	-0.11	-0.01	-0.12	-0.08	-0.03	0	-0.02	-0.02
C^a	0	-0.07	-0.01	0	-0.07	0	-0.06	-0.08	-0.03	0	0	-0.01
B	0	-0.07	-0.01	0	-0.07	0	-0.06	-0.08	-0.03	0	0	-0.01
Q	-0.06	-0.03	0.01	0	-0.06	0	-0.06	-0.05	-0.02	0	-0.01	-0.01
E	-0.12	-0.03	0.01	0	-0.07	0	-0.06	-0.09	-0.01	0	-0.01	-0.02
G	0	0	0	0	0	0	0	0	0	0	0	0
H	0.17	-0.07	0.01	0	0	0.01	-0.12	-0.1	-0.05	0	-0.01	-0.03
I	0	-0.02	0.02	0	-0.09	-0.01	-0.18	-0.2	-0.07	0	-0.01	-0.03
L	-0.06	-0.01	0.02	0	-0.08	-0.01	-0.06	-0.13	-0.01	0	0	-0.04
K	-0.06	-0.03	0.02	0	-0.06	0	-0.06	-0.08	-0.01	0	0	-0.02
M	-0.06	-0.02	0.01	0	-0.06	0	-0.06	-0.08	0	0	0	-0.02
F	-0.46	-0.1	0.01	0	-0.37	-0.04	-0.18	-0.27	-0.07	0	-0.03	-0.06
P	-0.17	-0.02	0.04	0	-0.12	-0.01	-0.18	-0.47	-0.22	0	-0.04	-0.01
S	-0.17	-0.06	0	0	-0.08	-0.01	-0.06	-0.08	0	0	0	-0.01
T	-0.12	-0.05	0	0	-0.06	-0.01	-0.06	-0.08	-0.01	0	0.01	-0.01
W	-0.64	-0.17	-0.08	0	-0.62	-0.16	0	-0.26	-0.02	0	-0.08	-0.08
Y	-0.52	-0.13	-0.01	0	-0.42	-0.04	-0.24	-0.28	-0.07	0	-0.04	-0.05
V	-0.06	-0.03	0.01	0	-0.08	-0.01	-0.24	-0.2	-0.07	0	-0.01	-0.02

^a The same correction factors are used for adjacent cysteine residues in reduced and oxidized forms.

Supplementary table 3. Averaged chemical shift (in ppm) observed in b-strands and a-helices ²².

	¹³Ca		¹³Cb		¹³C’		¹H_N		¹Ha		¹⁵N
aa	b	a	b	a	b	a	b	a	b	a	b	a
A	50.86	54.86	21.72	18.27	175.3	179.58	8.59	7.99	4.87	4.03	125.57	121.65
R	54.63	59.05	32.36	30	175.04	178.11	8.57	8.03	4.85	4	122.6	118.99
N	52.48	55.67	40.43	38.28	174.55	176.74	8.7	8.2	5.26	4.45	122.7	117.6
D	53.4	57.04	42.78	40.5	175.15	178.07	8.56	8.05	5.01	4.44	123.82	119.9
Q	54.33	58.61	31.92	28.33	174.5	178.35	8.51	8.11	4.97	4.03	123.14	118.59
E	55.55	59.3	32.45	29.2	175.01	178.46	8.66	8.32	4.76	3.99	123.52	119.89
G	45.08	47.02	-	-	173.01	176.31	8.27	8.23	4.09	3.84	110.19	107.34
H	54.8	59.62	32.2	29.91	173.8	176.83	8.76	8.03	5.07	4.06	121.65	118.09
I	60	64.68	40.09	37.59	174.79	177.49	8.74	8.06	4.72	3.66	124.12	120.22
L	53.94	57.54	44.02	41.4	175.16	178.42	8.63	8.02	4.85	4	125.69	120.18
K	55.01	59.11	34.86	32.31	174.93	177.79	8.54	8.04	4.96	3.98	123.29	119.9
M	54.1	58.45	34.34	31.7	174.64	177.76	8.43	8.05	4.94	4.03	121.67	118.69
F	56.33	60.74	41.64	38.91	174.15	176.42	8.8	8.21	5.1	4.11	121.95	119.12
P	62.79	65.52	32.45	31.08	176.41	178.34	-	-	4.72	4.13	-	-
S	57.14	60.86	65.39	62.81	173.52	176.51	8.57	8.11	5.08	4.2	117.44	114.78
T	61.1	65.89	70.82	68.64	173.47	176.62	8.5	8.1	4.81	4.02	118.09	115.3
W	56.28	60.03	31.78	28.74	175.1	177.81	8.83	8.24	5.24	4.35	124.04	120.48
Y	56.56	61.07	40.79	38.38	174.65	177.05	8.69	8.1	5	4.14	122.55	119.67
V	60.72	65.96	33.81	31.41	174.66	177.75	8.73	7.99	4.66	3.5	123.27	119.53
B	57.64	62.86	29.48	26.99	173.86	177.42	9	8.22	5.18	4.16	123.27	117.4
C	54.19	58.57	43.79	40.02	172.73	176.84	8.68	8.58	5.21	4.53	121.81	119.51

aa – amino acid one-letter code. B – reduced cysteine. b – beta-strand. a – alpha-helix.

Supplementary Table 4. Weighting coefficients for RCI calculation (equation 1) with different sets of chemical shifts.

Nuclei included	Weighting coefficients
Nuclei included	Ca	CO	Cb	N	Ha	NH
Ca, Cb, CO, N, Ha, NH	0.74	0.72	0.13	0.38	0.91	0.15
Ca, Cb, CO, Ha, NH	0.72	0.68	0.1	0	0.91	0.24
Ca, Cb, CO, N, NH	0.85	0.82	0.35	0.32	0	0.21
Ca, Cb, N, Ha, NH	0.8	0	0.16	0.34	0.88	0.09
Ca, CO, N, Ha, NH	0.68	0.68	0	0.33	0.89	0.13
Cb, CO, N, Ha, NH	0	0.79	0.05	0.33	0.89	0.08
Ca, Cb, CO, N, Ha	0.71	0.67	0.13	0.43	0.88	0
Ca, Cb, CO, Ha	0.64	0.6	0.1	0	0.82	0
Ca, Cb, CO, NH	0.81	0.78	0.29	0	0	0.27
Ca, Cb, CO, N	0.82	0.8	0.39	0.4	0	0
Ca, Cb, Ha, NH	0.77	0	0.13	0	0.87	0.15
Ca, Cb, N, Ha	0.77	0	0.16	0.36	0.84	0
Ca, Cb, N, NH	0.86	0	0.37	0.28	0	0.11
Ca, CO, Ha, NH	0.68	0.67	0	0	0.9	0.23
Ca, CO, N, Ha	0.64	0.62	0	0.38	0.85	0
Ca, CO, N, NH	0.82	0.79	0	0.22	0	0.28
Ca, N, Ha, NH	0.75	0	0	0.3	0.87	0.1
Cb, CO, Ha, NH	0	0.71	0.03	0	0.82	0.14
Cb, CO, N, Ha	0	0.76	0.04	0.35	0.85	0
Cb, CO, N, NH	0	0.96	0.28	0.28	0	0.2
Cb, N, Ha, NH	0	0	0	0.27	0.77	0
CO, N, Ha, NH	0	0.75	0	0.3	0.86	0.07
CO, N, Ha	0	0.71	0	0.32	0.83	0
CO, N, NH	0	0.93	0	0.27	0	0.2
N, Ha, NH	0	0	0	0.27	0.77	0
Cb, N, Ha	0	0	0	0.27	0.77	0
Cb, N, NH	0	0	0.73	0.78	0	0.25
CO, Ha, NH	0	0.67	0	0	0.78	0.13
Cb, CO, N	0	0.84	0.28	0.32	0	0
Cb, Ha, NH	0	0	0	0	0.6	0
Cb, CO, Ha	0	0.61	0.05	0	0.76	0
Cb, CO, NH	0	0.85	0.25	0	0	0.2
Ca, N, Ha	0.71	0	0	0.32	0.83	0
Ca, N, NH	0.83	0	0	0.17	0	0.23
Ca, CO, N	0.78	0.76	0	0.35	0	0
Ca, Ha, NH	0.71	0	0	0	0.85	0.13
Ca, CO, Ha	0.56	0.56	0	0	0.78	0
Ca, CO, NH	0.78	0.76	0	0	0	0.3
Ca, Cb, N	0.83	0	0.38	0.32	0	0
Ca, Cb, Ha	0.65	0	0.12	0	0.76	0
Ca, Cb, NH	0.89	0	0.34	0	0	0.23
Ca, Cb, CO	0.79	0.74	0.31	0	0	0
N, NH	0	0	0	0.71	0	0.42
N, Ha	0	0	0	0.27	0.77	0
CO, N	0	0.77	0	0.27	0	0
Ha, NH	0	0	0	0	0.6	0
CO, Ha	0	0.55	0	0	0.69	0
CO, NH	0	0.8	0	0	0	0.2
Cb, N	0	0	0.65	0.68	0	0
Cb, Ha	0	0	0	0	0.6	0
Cb, NH	0	0	0.71	0	0	0.51
Cb, CO	0	0.85	0.25	0	0	0
Ca, N	0.77	0	0	0.27	0	0
Ca, Ha	0.55	0	0	0	0.69	0
Ca, NH	0.85	0	0	0	0	0.25
Ca, CO	0.68	0.65	0	0	0	0
Ca, Cb	0.74	0	0.34	0	0	0
Ca	1	0	0	0	0	0
CO	0	1	0	0	0	0
N	0	0	0	1	0	0
Cb	0	0	1	0	0	0
NH	0	0	0	0	0	1
Ha	0	0	0	0	1	0