Molecular Basis of Class B GPCR Selectivity for the Neuropeptides PACAP and VIP

The related neuropeptides PACAP and VIP, and their shared PAC1, VPAC1 and VPAC2 receptors, regulate a large array of physiological activities in the central and peripheral nervous systems. However, the lack of comparative and molecular mechanistic investigations hinder further understanding of their preferred binding selectivity and function. PACAP and VIP have comparable affinity at the VPAC1 and VPAC2 receptor, but PACAP is 400–1,000 fold more potent than VIP at the PAC1 receptor. A molecular understanding of the differing neuropeptide-receptor interactions and the details underlying the receptor transitions leading to receptor activation are much needed for the rational design of selective ligands. To these ends, we have combined structural information and advanced simulation techniques to study PACAP/VIP binding selectivity, full-length receptor conformation ensembles and transitions of the PACAP/VIP receptor variants and subtypes, and a few key interactions in the orthosteric-binding pocket. Our results reveal differential peptide-receptor interactions (at the atomistic detail) important for PAC1, VPAC1 and VPAC2 receptor ligand selectivity. Using microsecond-long molecular dynamics simulations and the Markov State Models, we have also identified diverse receptor conformational ensembles and microstate transition paths for each receptor, the potential mechanisms underlying receptor open and closed states, and the interactions and dynamics at the transmembrane orthosteric pocket for receptor activation. These analyses reveal important features in class B GPCR structure-dynamics-function relationships, which provide novel insights for structure-based drug discovery.

Supplementary Table 3 Transition time range of first and second shortest pathways.

Construction, validation, and analyses of Markov state model
Markov state models of molecular kinetics (MSMs) were applied on the collection of ligand-free MD trajectories, in which we could identify the transition pathways and estimate the long-time statistical dynamics between the conformational states of interest. We used the MSMBuilder3.8.0 program [2,3] to construct the transition matrices of MSM and calculated the transition timescale based on the transition-path theory. [4][5][6][7] A general scheme is displayed below.
Supplementary Figure M1. General scheme for MSM constructions in present study.
(1) Trajectories preparation. We first prepared the trajectory dataset with atoms indices by saving the Cα coordinates of a GPCR protein excluding the first and last five terminal residues which are usually dynamic. We first tried to build a MSM on full trajectories (Test1 in Supplementary Fig M1), it turns out data population in use were as low as 50% since finals states take up large population in full trajectories. Therefore, we trimmed our trajectories based on RMSD plots ( Supplementary Fig M2) and visually checking the trimmed trajectories to make sure trajectories contain final conformation population and also pass the initial test. In general, we have 4,519~14,281 configurations.
(2) Clustering. We grouped the conformation dataset into a set of clusters, called microstates, based on structural similarities. We used the k-centers algorithm [2] to group the dataset into 33~55 clusters by RMSD metric and make sure mean distance and maximum distance were not too large, within ~0.35 Å and ~0.59 Å, respectively. (3) A MSM construction. With the set of microstates discretization, a series of microstate transition matrices in the evolution of the observation interval or lag time at τ, 2τ, …, nτ, were constructed, on which the implied timescales and the Chapman-Kolmogorov test [8] were carried out to examine if a microstate transition matrix with the microstate discretization and a chosen lag time is Markovian. [5,8,9] We used maximum likelihood estimation to build the transition matrices at series of lag times, in which increasing the lag time means that states can get larger and more coarse grained as the longer lag time the fewer states are kinetically relevant (kinetically reach each other on timescales faster than the lag time). [10] (4) Implied timescales scanning. The implied timescales as a function of the lag time are shown in Supplementary Fig M3. Plots with gaps between gradually level out, from which we give initial guess for lag time (marked as t') that is Markovian. We constructed MSM with lag time = t', from which we collected these microstates (labels) that include the final conformations to calculate the short pathways connecting these stable conformational states in a transition matrix.

Supplementary
Supplementary Figure M3. Implied timescales as a function of the lag time. The macrostate partitions become robust along lag time series.
(5) Shortest transition pathways. Using the transition-path theory (TPT), [4][5][6][7] we calculated the top five shortest pathways connecting these stable conformational states in a transition matrix, each with the minimum transition net flux in a pathway, given lag time = t'. We collected two sets of microstates that are in the shortest and second shortest pathways to go through the Chapman-Kolmogorov test.
(6) The Chapman-Kolmogorov test [5,8] were applied on set of states that constituting the first and second shortest paths between stable conformational states as shown in Supplementary Fig M4.The detailed implementation of Chapman-Kolmogorov test was described in a specific chapter [11]. We tested if the lag time = t' holds the test, i.e., the transition probabilities from MSM agree well with the probabilities in observed trajectory within statistical uncertainty at lag time = t'. If not, we go back to last step with a new lag time again. If yes, we build up a macrostates MSM with lag time = t'.
Supplementary Figure M4. Groups of microstates that constitute the shortest and second shortest transition pathways between closed and open states were examined by Chapman-Kolmogorov test. The transition probabilities from MSM agree well with the probabilities in the observed trajectories within statistical uncertainty.
(7) The eigenvalues of the transition matrix are given with lag time = t', shown in Supplementary Fig  M5. The division of macrostates were calculated from the eigenfunction structure using the Perron Cluster Cluster Analysis (PCCA) method. [12]While the macrostates number can be visually estimated by the number of the major gaps in the implied timescales, it is directly determined here by the number of eigenvalues that are close to 1. [12] Supplementary Figure M5. Eigenvalues of the transition matrix at lag time of 1.2, 1.68, 2.4 ns for PAC1s, VPAC1 and VPAC2 systems. Only the first sixteen data are shown.