The
pre-NPEBC participants have already performed studies in this direction. See the
last column in Table A.2. At the lower level, a fruitful approach to extending
the scale of simulations has been through the combination of continuum
representations of the solvent with explicit representations of protein and/or
nucleic acid (56). This has enabled
computationally less demanding studies of RNA-protein complexes (59) and ligand binding to
avidin and streptavidin (60). It is encouraging that
these studies yielded reasonable results for binding free energies even though
there were significant conformational changes. Also
relevant in this context are a novel method for optimizing solvation
parameters for continuum models and efficient methods for conformational
search of protein loops (61-66). The pre-NPEBC participants have also developed BD and Monte
Carlo (MC) algorithms and PCA-based tools for higher (but molecular) level
simulations and trajectory analysis (Table A.2).
The Gaussian Network Model (GNM,
Table A.2) was introduced by Bahar and coworkers (31;67;68) as an efficient tool for exploring the
dynamics of large biomolecular
structures or complexes. The GNM closely resembles classical normal mode
analysis (NMA) (69-72), but it is significantly simpler in that it
requires no a priori knowledge of energy parameters, following the original
proposition of Tirion (73), and, most
importantly, it lends itself to a closed-form mathematical solution. Its roots are
well founded in fundamental statistical mechanical theories of polymer networks
(74;75). The major
assumption in the GNM is the representation of the structure as a network of N
interaction sites (Fig A.1). The
pairs of sites (or nodes) closer than a cutoff distance r_c,
representative of the radius of a first coordination shell, are connected by identical
springs (dashed lines), which form the connectors of the network. The dynamics
of this network is fully controlled by the Kirchhoff matrix of contacts (Γ), which
gives a complete description of the connectivity
of the network. Thermodynamic characteristics are found using the Hamiltonian H = (γ/2) ΔR Γ ΔR^T
of the system, where γ is the force constant of the springs and ΔR
is the N-dimensional vector of fluctuations of the individual sites.
An important feature of the GNM is
the possibility of dissecting the observed motion into a collection of modes
by eigenvalue decomposition of Γ, and of focusing on the slowest modes. These modes usually provide
us with information on the molecular mechanisms relevant to biological function
(76-80). Several studies (81-87), including comparisons
with H/D exchange (88) and NMR relaxation (89) data, have demonstrated the
utility of the GNM for understanding the machinery of proteins and their complexes.
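The network construction and mode decomposition described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the cited studies: the function names, the three-site toy coordinates, and the cutoff value are hypothetical, and a textbook Jacobi rotation scheme stands in for the eigen-solver only to keep the example free of external libraries (a library routine would normally be used for real structures).

```python
import math

def kirchhoff(coords, cutoff):
    """Kirchhoff (connectivity) matrix of the network: -1 for every pair of
    sites closer than `cutoff`, contact count of each site on the diagonal."""
    n = len(coords)
    gamma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) < cutoff:
                gamma[i][j] = gamma[j][i] = -1.0
                gamma[i][i] += 1.0
                gamma[j][j] += 1.0
    return gamma

def jacobi_eigh(matrix, max_sweeps=100, tol=1e-20):
    """Eigen-decomposition of a symmetric matrix by cyclic Jacobi rotations.
    Returns (values, vectors) sorted by ascending eigenvalue; vectors[k] is
    the eigenvector belonging to values[k]."""
    n = len(matrix)
    a = [row[:] for row in matrix]                      # working copy
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
        if off < tol:                                   # effectively diagonal
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the (p, q) element.
                if a[p][p] == a[q][q]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for i in range(n):                      # rotate columns p, q
                    a[i][p], a[i][q] = (c * a[i][p] - s * a[i][q],
                                        s * a[i][p] + c * a[i][q])
                    v[i][p], v[i][q] = (c * v[i][p] - s * v[i][q],
                                        s * v[i][p] + c * v[i][q])
                for j in range(n):                      # rotate rows p, q
                    a[p][j], a[q][j] = (c * a[p][j] - s * a[q][j],
                                        s * a[p][j] + c * a[q][j])
    order = sorted(range(n), key=lambda k: a[k][k])
    values = [a[k][k] for k in order]
    vectors = [[v[i][k] for i in range(n)] for k in order]
    return values, vectors

# Hypothetical toy network: three sites on a line, 1.0 apart; the cutoff
# links nearest neighbors only.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
gamma = kirchhoff(coords, cutoff=1.5)
values, vectors = jacobi_eigh(gamma)
# values[0] is the trivial zero mode (uniform displacement of all sites);
# the slowest informative mode is the next one.
slowest_value, slowest_mode = values[1], vectors[1]
```

In this toy chain the slowest nonzero mode moves the two end sites in opposite directions while the central site stays fixed, which is the kind of hinge-like behavior that the slow modes are used to identify in real structures.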
New collaborations among team members (Bahar & Gordon; Bahar & Ho) for
interpreting NMR relaxation data also indicate the potential utility of
combining and comparing results from the GNM or other computational approaches with
NMR data, both for constructing more accurate models and for understanding the
conformational dynamics of proteins (see the support letter of Dr. Ho). A close
cooperation between computational molecular biologists and NMR experimentalists
is thus anticipated within aim 1(i) activities, both for improving molecular
models and suggesting new experiments.
Finally,
recent studies (90-92) show promise for extending GNM-based
methods to multimolecular assemblies. The idea presented in ref (90) is to represent the
structure at hierarchical levels of detail and to perform the GNM
analysis repeatedly, with suitable renormalization of the network parameters. The dominant
motions of influenza virus hemagglutinin A could be accurately reproduced by
this method, even when adopting a very coarse-grained model of one representative
site per 40 residues. The major
advantage of this approach is an increase of more than three orders of magnitude in
computational speed, which permits us to explore the dynamics of multiprotein
assemblies of the order of 10^4 residues within ‘minutes’ using, for
example, an R10000 SGI workstation. Caspase 8 (93), RNAP II (94), or protein 14-3-3ζ (95) could be potential targets
for this approach, within the scope of the respective projects DP1, DP2 and DP3
(see § C.1-3).
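The hierarchical coarse-graining idea described above can be sketched as follows. This is a minimal illustration, assuming that one representative site (here the middle residue) is kept per block of m consecutive residues and that the cutoff is simply enlarged for the coarser network. The function names, the cutoff value, and the helical toy coordinates are hypothetical, and the actual renormalization of network parameters used in ref (90) is not reproduced here.

```python
import math

def coarse_grain(coords, m):
    """Keep one representative site per block of m consecutive residues
    (the middle residue of each block; this choice is an assumption)."""
    reps = []
    for start in range(0, len(coords), m):
        block = coords[start:start + m]
        reps.append(block[len(block) // 2])
    return reps

def kirchhoff(coords, cutoff):
    """Kirchhoff (connectivity) matrix: -1 for every pair of sites closer
    than `cutoff`, contact count of each site on the diagonal."""
    n = len(coords)
    gamma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) < cutoff:
                gamma[i][j] = gamma[j][i] = -1.0
                gamma[i][i] += 1.0
                gamma[j][j] += 1.0
    return gamma

# Hypothetical toy chain: 400 sites on an ideal helix (not a real structure).
fine = [(2.3 * math.cos(math.radians(100 * i)),
         2.3 * math.sin(math.radians(100 * i)),
         1.5 * i) for i in range(400)]

coarse = coarse_grain(fine, m=40)               # 10 representative sites
gamma_coarse = kirchhoff(coarse, cutoff=70.0)   # enlarged, illustrative cutoff

# Rough cost ratio, assuming a dense eigen-decomposition scales as N**3.
speedup_estimate = (len(fine) / len(coarse)) ** 3
```

The cubic scaling used for the cost estimate is a general property of dense eigen-solvers assumed for this sketch, not a figure taken from the cited work; the coarse network would then be analyzed in the same way as the residue-level one.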
The major strategy in this group
of studies will be to assess the minimal level of complexity (geometry and
energetics) to be adopted in simulations in order to capture the biological
function of interest. The expertise already accumulated at Pitt over a broad
range of molecular computations (Table A.2), together with the possibility of
testing simulations and predictions for a given molecular mechanism with
a diversity of methods in different groups and of comparing and combining the results,
offers significant opportunities that will be exploited to this end.