Engagement of intrinsic disordered proteins in protein–protein interaction

Intrinsically disordered proteins (IDPs) attract the attention of many researchers engaged in protein structure analysis. The main criteria used in their identification are a lack of secondary structure and significant structural variability. This variability takes forms that cannot be resolved by X-ray crystallography. In the present study, different criteria were used to assess the status of IDPs and of their fragments recognized as intrinsically disordered regions (IDRs). The status of the hydrophobic core in proteins identified as IDPs and in their complexes was assessed. The status of IDRs as components of the ordered structure resulting from the construction of the hydrophobic core was also assessed. The hydrophobic core is understood as a structure encompassing the entire molecule: a centrally located high concentration of hydrophobicity surrounded by a shell in which the level of hydrophobicity decreases gradually until it is close to zero on the protein surface. The model assumes that the protein folding process follows a micellization pattern that exposes polar residues on the surface while isolating hydrophobic amino acids from the polar aqueous environment. Describing the hydrophobicity distribution as a 3D Gaussian distribution spanning the protein molecule makes it possible to assess the degree of similarity to the assumed micelle-like distribution and to identify deviations of the actual distribution from the idealized one. The FOD (fuzzy oil drop) model and its modified FOD-M version allow for the quantitative assessment of these differences and of the relationship of the deviating areas to protein function. In the present work, the IDR sections in protein complexes classified as IDPs are analyzed.
The classification “disordered” in the structural sense (lack of secondary structure or high flexibility) does not always entail a mismatch with the structure of the hydrophobic core. In particular, the interface area, which often consists of IDRs, shows in many of the analyzed complexes a hydrophobicity distribution compliant with the idealized distribution, which shows that matching the structure of the hydrophobic core does not require secondary structure ordering.

A short presentation of the model is given here to ease the interpretation of the results presented in this paper [S1].
The majority of proteins are active in a water environment. Treating amino acids as bipolar molecules makes it possible to assume that they tend to build a micelle-like structural form. The hydrophobic parts of the side chains tend to concentrate in the central part, with the polar parts of the amino acids directed toward the surface. Thus the distribution of hydrophobicity may be expressed by a 3D Gauss function spread over the whole protein molecule, where σx, σy and σz depend on the size and shape of the molecule. The value Ti represents the idealized hydrophobicity level at the position of the effective atom (the averaged position of the atoms belonging to a particular amino acid).
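The idealized profile can be illustrated with a short Python sketch. It is not the authors' implementation: the name idealized_T is hypothetical, the coordinates are assumed to be already oriented along the principal axes of the molecule, and each σ is assumed to be one third of the largest centered coordinate along its axis, since the text states only that σx, σy and σz depend on the size and shape of the molecule.

```python
import math

def idealized_T(coords):
    """Normalized 3D Gaussian hydrophobicity T_i at each effective atom.

    coords: list of (x, y, z) effective-atom positions, assumed to be
    expressed in a frame aligned with the molecule's principal axes.
    Assumed form: T_i ~ exp(-dx^2/(2 sx^2)) * exp(-dy^2/(2 sy^2))
    * exp(-dz^2/(2 sz^2)), divided by the sum over all residues.
    """
    n = len(coords)
    center = [sum(c[k] for c in coords) / n for k in range(3)]
    centered = [[c[k] - center[k] for k in range(3)] for c in coords]
    # Assumption: three-sigma rule per axis; fall back to 1.0 for a flat axis.
    sigma = [max(abs(p[k]) for p in centered) / 3.0 or 1.0 for k in range(3)]
    raw = [
        math.exp(-sum(p[k] ** 2 / (2.0 * sigma[k] ** 2) for k in range(3)))
        for p in centered
    ]
    total = sum(raw)  # normalization so that the T_i sum to 1
    return [v / total for v in raw]
```

With this choice the residue nearest the geometric center receives the highest idealized value and the values fall off toward the surface.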
The observed hydrophobicity distribution, however, is the effect of inter-residue interactions expressed by Levitt's function [S2], where Oi represents the observed level of hydrophobicity at the position of the effective atom, resulting from the local collection of hydrophobicity from neighboring residues. Any intrinsic hydrophobicity scale (Hr) of amino acids can be applied.
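A minimal sketch of the observed profile follows. It assumes the polynomial form of Levitt's function commonly quoted in the FOD literature, a cutoff distance of 9 Å, inclusion of the self term, and a non-negative hydrophobicity scale; none of these details is given in the text above, and the name observed_O is hypothetical.

```python
import math

def observed_O(coords, hydrophobicity, cutoff=9.0):
    """Normalized observed hydrophobicity O_i at each effective atom.

    Assumed pair weight for r <= cutoff, with x = r / cutoff:
        1 - 0.5 * (7 x^2 - 9 x^4 + 5 x^6 - x^8)
    and 0 beyond the cutoff. Each pair contributes (H_i + H_j) * weight.
    The cutoff value and the inclusion of the self term (r = 0) are
    assumptions of this sketch.
    """
    n = len(coords)
    raw = [0.0] * n
    for i in range(n):
        for j in range(n):
            r = math.dist(coords[i], coords[j])
            if r <= cutoff:
                x2 = (r / cutoff) ** 2
                weight = 1.0 - 0.5 * (7 * x2 - 9 * x2**2 + 5 * x2**3 - x2**4)
                raw[i] += (hydrophobicity[i] + hydrophobicity[j]) * weight
    total = sum(raw)  # normalization so that the O_i sum to 1
    return [v / total for v in raw]
```

The weight equals 1 at zero distance and falls smoothly to 0 at the cutoff, so a residue surrounded by close hydrophobic neighbors collects a higher Oi than an isolated one.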
The two distributions can be compared, to measure the degree of similarity or difference between them, using the divergence entropy introduced by Kullback and Leibler [S3]. The target (observed) distribution Oi and the reference distribution Ti must be normalized before DKL can be calculated (the first component of eq. 1 and 2).
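The comparison can be sketched as below, assuming the standard Kullback-Leibler form with a base-2 logarithm (the base is an assumption here). The function name kl_divergence is hypothetical.

```python
import math

def kl_divergence(p, q):
    """D_KL(p|q) = sum_i p_i * log2(p_i / q_i).

    p is the target distribution, q the reference; both are assumed to be
    normalized and q_i > 0 wherever p_i > 0. Terms with p_i = 0 contribute 0.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

The value is zero for identical distributions and grows as the target departs from the reference.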
The value of DKL, being an entropy, cannot be interpreted on its own. This is why a second reference distribution is introduced: Ri = 1/N, where N is the number of residues in the polypeptide chain. The R distribution is the opposite of the T distribution, as it represents a protein deprived of any form of hydrophobic core.
Substituting the R distribution for the T distribution in eq. 3 characterizes the difference between the target distribution O and the reference distribution R.

The relation DKL(O|T) < DKL(O|R) is interpreted as the O distribution being similar to the T distribution; the protein is then classified as constructed according to a micelle-like hydrophobicity distribution, with a hydrophobic core present.
To avoid operating with two parameters, the Relative Distance (RD) is introduced. RD < 0.5 identifies a protein with a hydrophobic core.
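A minimal sketch of the RD calculation, assuming the ratio RD = DKL(O|T) / (DKL(O|T) + DKL(O|R)), which is consistent with the 0.5 threshold stated above. The helper names are hypothetical and the logarithm base is an assumption (it cancels in the ratio).

```python
import math

def kl(p, q):
    """D_KL(p|q) with a base-2 logarithm; p and q must be normalized."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)

def relative_distance(O, T):
    """RD of the observed profile O between the references T and R.

    R is the uniform distribution R_i = 1/N. RD < 0.5 means O is closer
    to T than to R. The ratio is undefined when O equals both T and R.
    """
    n = len(O)
    R = [1.0 / n] * n
    d_ot = kl(O, T)
    d_or = kl(O, R)
    return d_ot / (d_ot + d_or)
```

An observed profile identical to T gives RD = 0, and a perfectly uniform observed profile gives RD = 1.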
examples the 3D Gauss function is generated.
The status of selected fragments (a chain in a complex, a domain in a chain, or even a selected fragment of a chain) can be assessed in the context of the distribution of the selected unit. In this case the values of Ti, Oi and Ri belonging to the selected fragment must be normalized before RD is calculated. RD < 0.5 then identifies the selected fragment as participating in the construction of the hydrophobic core, whereas RD > 0.5 identifies it as introducing local disorder.
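The fragment assessment can be sketched as follows, assuming that O and T were computed for the whole unit and that RD takes the ratio form DKL(O|T) / (DKL(O|T) + DKL(O|R)). The name fragment_rd and the index-list interface are hypothetical.

```python
import math

def kl(p, q):
    """D_KL(p|q) with a base-2 logarithm; p and q must be normalized."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)

def fragment_rd(O, T, indices):
    """RD for the residues listed in `indices`.

    O and T are profiles of the whole unit (for example a complex).
    The fragment's values are renormalized to sum to 1; the uniform
    R profile renormalized over the fragment is 1/len(indices).
    """
    def renormalize(profile):
        part = [profile[i] for i in indices]
        total = sum(part)
        return [v / total for v in part]

    o, t = renormalize(O), renormalize(T)
    r = [1.0 / len(indices)] * len(indices)
    d_ot, d_or = kl(o, t), kl(o, r)
    return d_ot / (d_ot + d_or)
```

Because the Gaussian is built for the whole unit, the same fragment can score differently when assessed within a complex than within its isolated chain.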
During the folding process in a water environment, proteins adopt a conformation that directs hydrophobic residues toward the center of the molecule and exposes hydrophilic residues on the surface. Some amino acid sequences allow micelle-like structuralization with high accordance. Other sequences allow the construction of a micelle with local discordance with respect to the micelle-like organization. These discordances appear to mark the location of protein specificity. Local exposure of hydrophobicity on the surface is very often used to form a complex with another protein [S4]. A local hydrophobicity deficiency appears to represent a cavity, very often a substrate-binding cavity [S5].
The FOD model may also be applied to proteins acting in environments other than water, for example in a membrane [S6-S9]. The distribution of hydrophobicity in membrane proteins is opposite to that in a water environment, which can be expressed by a function in which the index "n" denotes normalization. In practice, a specific form is applied to calculate the opposite distribution. The analysis of numerous membrane proteins suggests a function for describing the hydrophobicity distribution in membrane proteins, in which the index "n" again denotes normalization.

Supplementary Material 4
The interpretation of eq. 7 is as follows: the idealized distribution characteristic of a water environment (Ti) is modified by the opposite distribution (TMAX - Ti) to a degree measured by the parameter K. This parameter expresses the degree to which the water-directed force field is modified by the opposite, membrane-based force field.
The value of DKL(O|M) assesses the proximity of the O distribution to M. The K value for which DKL(O|M) is minimal defines the best fit of the O and M distributions. The term K*(TMAX - Ti) expresses the degree to which the water-derived force field is modified by an external factor, in particular the membrane.
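The search for K can be sketched as below. The sketch assumes the form Mi = [Ti + K*(TMAX - Ti)n]n, with "n" denoting normalization, as suggested by the interpretation above; the grid of K values, the helper names, and the base-2 logarithm are assumptions. T must not be uniform, otherwise the opposite distribution cannot be normalized.

```python
import math

def normalize(values):
    total = sum(values)
    return [v / total for v in values]

def kl(p, q):
    """D_KL(p|q) with a base-2 logarithm; p and q must be normalized."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)

def modified_M(T, K):
    """Assumed FOD-M profile: T modified by the normalized opposite profile."""
    t_max = max(T)
    opposite = normalize([t_max - t for t in T])
    return normalize([t + K * o for t, o in zip(T, opposite)])

def best_K(O, T, k_max=5.0, step=0.1):
    """Grid search (an assumption) for the K that minimizes D_KL(O|M)."""
    n_steps = int(round(k_max / step))
    candidates = [round(i * step, 10) for i in range(n_steps + 1)]
    return min(candidates, key=lambda K: kl(O, modified_M(T, K)))
```

For an observed profile that already matches T, the search returns K = 0; a larger K indicates a stronger contribution of the non-aqueous environment.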
The graphic presentation of the model in Fig. S1 visualizes the basic assumptions of the FOD and FOD-M models.

Status of individual unit assessment
The individual unit may be defined arbitrarily. The 3D Gauss function may be constructed for one protein molecule, for one chain isolated from a complex, or for the complex as a whole. The parameters RD and K may be calculated for any structural unit.

Status of selected part of larger construction
The status of a chain as part of a complex can be assessed in two ways. In the first, the chain is treated as an integral part of the complex, with the 3D Gauss function constructed for all chains. This interpretation focuses on the analysis of the selected fragment of the O profile with respect to the T and R profiles. The status of the chain in the complex can thus be estimated as an integral part of the T, O and R profiles, as well as of the M profile.
In the second, the status of the chain is treated as expressing its participation in the generation of the hydrophobic core or its introduction of local disorder (in the FOD interpretation, a discordance of the Oi status versus the Ti status). To perform this analysis, the selected fragments (for example a chain in a complex) of the T, O and R profiles must be normalized. The normalized Ti, Oi and Ri for the selected fragment can then be used to calculate the RD expressing the status of the selected fragment (the chain in the complex). This operation can be applied to any fragment of a chain chosen in a particular construction. It reveals whether a certain part of the whole structure participates in core formation or introduces local disorder.
The interpretation of the RD and K parameters is as follows:
1. The RD value characterizes the order of the hydrophobicity distribution with respect to the idealized distribution. The higher the RD, the weaker the presence of a hydrophobic core.
2. The K value assesses the degree to which factors other than water took part in directing the folding process. It is assumed that the environment directs this process toward an organization accordant with the external force field. K = 0 characterizes a protein with a micelle-like organization in water. Such proteins have been identified: downhill proteins, fast-folding proteins, ultra-fast-folding proteins and type II antifreeze proteins [S10].