Initial version

This commit is contained in:
sebastiankloth
2022-04-20 14:08:38 +02:00
commit 68b8e1a305
48 changed files with 5133 additions and 0 deletions

93
doc/special/autosave.rst Normal file
View File

@ -0,0 +1,93 @@
Automatic Saving of Analysis Data
=================================
Mdevaluate provides a functionality to save the result of analysis functions automatically.
The data is saved to a file after it was computed.
If an analysis was done in the exact same way before, the result is loaded from this file.
This function may be activated through calling :func:`mdevaluate.autosave.enable`, which takes a directory as input.
If this directory is a relative path (e.g. no trailing slash) the results will be saved in a location
relative to the directory of the trajectory file.
If the output files of your simulations are located in a subdirectory, like ``/path/to/sim/Output`` it is possible
to specify the auto save location as ``../data`` such that the result files will be placed under ``/path/to/sim/data``.
At the moment the two functions which use this behavior are:
- :func:`~mdevaluate.correlation.shifted_correlation`
- :func:`~mdevaluate.distribution.time_average`
Any other function can make use of the autosave mechanism by decorating it with :func:`mdevaluate.autosave.autosave_data`.
A full example
--------------
This is how it works, for a detailed explanation see below::
import mdevaluate as md
md.autosave.enable('data')
water = md.open('/path/to/sim').subset(atom_name='OW')
md.correlation.shifted_correlation(
md.correlation.msd,
water,
description='test'
)
# The result will be saved to the file:
# /path/to/sim/data/shifted_correlation_msd_OW_test.npz
Checksum of the Analysis Call
-----------------------------
The autosave module calculates a checksum for each call of an analysis function,
which is used to validate a present the data file.
This way the result should only be loaded from file if the analysis is exactly the same.
This includes the function code that is evaluated, so the result will be recomputed if any bit of the code changes.
But there is always the possibility that checksums coincide accidentally,
by chance or due to a bug in the code, which should be kept in mind when using this functionality.
Special Keyword Arguments
-------------------------
The autosave module introduces two special keyword arguments to the decorated functions:
- ``autoload``: This prevents the loading of previously calculated data even if a valid file was found.
- ``description``: A descriptive string of the specific analysis, see below.
Those keywords may be passed to those function (shifted_correlation, time_average) like any other keyword argument.
If autosave was not enabled, they will be ignored.
File names and Analysis Descriptions
------------------------------------
The evaluated data is saved to human readable files, whose name is derived from the function call
and the automatic description of the subset.
The latter one is assigned based on the ``atom_name`` and ``residue_name`` of the :func:`~mdevaluate.atoms.AtomSubset.subset` method.
In some cases this is not enough, for example if the same subset is analyzed spatially resolved,
which would lead to identical filenames that would be overwritten.
Therefore a more detailed description of each specific analysis call needs to be provided.
For this reason the autosave module introduces the before mentioned keyword argument ``description``.
The value of this keyword is appended to the filename and in addition if any of
the other arguments of the function call has a attribute description, this will appended as well.
For example this (pseudo) code will lead to the filename ``shifted_correlation_isf_OW_1-2nm_nice.npz``::
OW = traj.subset(atom_name='OW')
corr = subensemble_correlation(spatial_selector)
corr.description = '1-2nm'
shifted_correlation(
isf,
OW,
correlation=corr,
description='nice'
)
Reusing the autosaved data
--------------------------
The results of the functions are saved in NumPy's npz format, see :func:`numpy.savez`.
If the result should be used in a different place, it can either be loaded with
:func:`numpy.load` or :func:`mdevaluate.autosave.load_data`.
The latter function will return the result of the function call directly, the former
returns a dict with the keys ``checksum`` and ``data``, the latter yielding the results data.

18
doc/special/energies.rst Normal file
View File

@ -0,0 +1,18 @@
Gromacs Energy Files
====================
It is possible to read the energy files (.edr) written out by Gromacs with mdevaluate.
Those files contain thermodynamic properties of the system, like temperature or pressure.
The exact contents of an energy file depend on the type of ensemble that was simulated,
an NVT simulation's energy file for example will not contain information about the box size.
To open these files use the function :func:`mdevaluate.open_energy`, which takes the filename of an energy file.
The types of energies stored in the file can be shown with the :attr:`types` attribute of the class :class:`~mdevaluate.reader.EnergyReader`,
the :attr:`units` attribute gives the units of these energy types.
The timesteps at which those energies were written out are accessible through the :attr:`~mdevaluate.reader.EnergyReader.time` property.
The time series of one of these energies can be accessed through the named index, comparable to python dictionaries.
::
import mdevaluate as md
edr = md.open_energy('/path/to/energy.edr')
# plot the evolution of temperature
plot(edr.time, edr['Temperature'])

76
doc/special/overlap.rst Normal file
View File

@ -0,0 +1,76 @@
Computing the Overlap Function
==============================
The overlap function is defined as the portion of particles of a given set,
whose positions *overlap* after a given time :math:`t` with the reference configuration at :math:`t=0`.
This is calculated as follows:
The Initial positions define spheres of a given radius :math:`r` which then are used
to test how many of the particles at a later time are found within those spheres.
Normalized by the number of spheres this gives the correlation of the configurational overlap.
.. math::
Q(t) = \frac{1}{N} \left\langle \sum\limits_{i=1}^N n_i(t) \right\rangle
Where :math:`n_i(t)` defines the :math:`N` spheres, with :math:`n_i(t)=1` if a particle
is found within this sphere at time :math:`t` and :math:`n_i(0) = 1` for :math:`1\leq i \leq N`.
Evaluation with mdevaluate
--------------------------
Computation of the overlap requires the relatively expensive computation of next neighbor distances,
which scales with the order of :math:`\mathcal{O}(N^2)`.
There are more efficient ways for the solution of this problem, the one used here is
the so called :class:`~scipy.spatial.cKDTree`.
This is much more efficient and allows to compute the overlap relatively fast::
OW = md.open('/path/to/sim').subset(atom_name='OW')
tree = md.coordinates.CoordinatesKDTree(OW)
Qol = md.correlation.shifted_correlation(
partial(md.correlation.overlap, crds_tree=tree, radius=0.11),
OW
)
As seen above, mdevaluate provides the function :func:`~mdevaluate.correlation.overlap`
for this evaluation, which uses a special object of type :class:`~mdevaluate.coordinates.CoordinatesKDTree`
for the neighbor search.
The latter provides two features, necessary for the computation:
First it computes a :class:`~scipy.spatial.cKDTree` for each necessary frame of the trajectory;
second it caches those trees, since assembly of KDTrees is expensive.
The size of the cache can be controlled with the keyword argument ``maxsize`` of the CoordinatesKDTree initialization.
Note that this class uses the C version (hence the lowercase C) rather than
the pure Python version :class:`~scipy.spatial.KDTree` since the latter is significantly slower.
The only downside is, that the C version had a memory leak before SciPy 0.17,
but as long as a recent version of SciPy is used, this shouldn't be a problem.
Overlap of a Subsystem
----------------------
In many cases the overlap of a subsystem, e.g. a spatial region, should be computed.
This is done by selecting a subset of the initial configuration before defining the spheres.
The overlap is then probed with the whole system.
This has two benefits:
1. It yields the correct results
2. The KDTree structures are smaller and thereby less computation and memory expensive
An example of a spatial resolved analysis, where ``OW`` is loaded as above::
selector = partial(
md.coordinates.spatial_selector,
transform=md.coordinates.spherical_radius,
rmin=1.0,
rmax=1.5
)
tree = md.coordinates.CoordinatesKDTree(OW, selector=selector)
Qol = md.correlation.shifted_correlation(
partial(md.correlation.overlap, crds_tree=tree, radius=0.11),
OW
)
This computes the overlap of OW atoms in the region :math:`1.0 \leq r \leq 1.5`.
This method can of course be used to probe the overlap of any subsystem, which is selected by the given selector function.
It should return a viable index for a (m, 3) sized NumPy array when called with original frame of size (N, 3)::
subset = frame[selector(frame)]

38
doc/special/spatial.rst Normal file
View File

@ -0,0 +1,38 @@
Spatial Resolved Analysis
=========================
This section describes how spatially resolved correlation can be analyzed with mdevaluate.
This guide assumes that the variable ``traj`` holds a trajectory where the subset of atoms that should be analyzed are selected.
For example::
traj = md.open('/path/to/sim', cached=1000).subset(atom_name='OW')
Which would load a simulation from the directory ``/path/to/sim`` and select all ``OW`` atoms.
Note that for this use case, the caching is quite useful since it enables us to iterate over spatial regions
without significant time penalty.
Moving on let's calculate the ISF of water oxygens with spherical radius between 0.5 and 0.7 nm::
from functools import partial
func = partial(md.correlation.isf, q=22.7)
selector = partial(
md.coordinates.spatial_selector,
transform=md.coordinates.spherical_radius,
rmin=0.5, rmax=0.7
)
t, S = md.correlation.shifted_correlation(
func, traj,
correlation=md.correlation.subensemble_correlation(selector)
)
To explain how this works, let's go through the code from bottom to top.
The spatial filtering is done inside the shifted_correlation by the function
:func:`mdevaluate.correlation.subensemble_correlation`.
This function takes a selector function as argument that should take a frame as input
and return the selection of the coordinates that should be selected.
A new selection is taken for the starting frame of each shifted time segment.
In this case the selection is done with the function :func:`mdevaluate.coordinates.spatial_selector`.
This function takes four arguments, the first being the frame of coordinates which is handed by :func:`subensemble_correlation`.
The second argument is a transformation function, which transforms the input coordinates to the coordinate which will be filtered,
in this case the spherical radius.
The two last arguments define the minimum and maximum value of this quantity.