Advanced features and developer’s tutorial

This tutorial describes more advanced features that can be useful for people developing their software using or extending OpenMMTools.

Using and implementing integrator and force objects

Copy and serialization utilities

The integrators and forces in OpenMMTools are usually implemented by extending custom classes in OpenMM. For example, the declaration of our LangevinIntegrator and the HarmonicRestraintForce goes boils down to

class LangevinIntegrator(mmtools.utils.RestorableOpenMMObject, openmm.CustomIntegrator):
    pass

class HarmonicRestraintForce(mmtools.utils.RestorableOpenMMObject, openmm.CustomCentroidBondForce):
    pass

The purpose of inheriting from openmmtools.utils.RestorableOpenMMObject class has to do with copies and serialization. Without RestorableOpenMMObject, these objects can still be copied and go through the standard serialization and deserialization in OpenMM without errors

from simtk import openmm, unit

class VelocityVerlet(openmm.CustomIntegrator):

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.addComputePerDof("x", "x+dt*v")
        self.addComputePerDof("v", "v+0.5*dt*f/m")

    def my_method(self):
        return 0.0

integrator = VelocityVerlet(1*unit.femtosecond)
copied_integrator = copy.deepcopy(integrator)
integrator_serialization = openmm.XmlSerializer.serialize(integrator)
deserialized_integrator = openmm.XmlSerializer.deserialize(integrator_serialization)

However, copies and serializations in OpenMM are performed at the C++ level, and thus they don’t keep track of the Python class and methods.

>>> print(type(copied_integrator))
<class 'simtk.openmm.openmm.CustomIntegrator'>

>>> deserialized_integrator.my_method()
Traceback (most recent call last):
...
AttributeError: type object 'object' has no attribute '__getattr__'

Inheriting from openmmtools.utils.RestorableOpenMMObject, allows you to easily recover the original interface after copying or deserializing. This happens automatically for copies, but you’ll have to use RestorableOpenMMObject.restore_interface() after deserialization.

from openmmtools import utils

class VelocityVerlet(utils.RestorableOpenMMObject, openmm.CustomIntegrator):

    def __init(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.addComputePerDof("x", "x+dt*v")
        self.addComputePerDof("v", "v+0.5*dt*f/m")

    def my_method(self):
        return 0.0

integrator = VelocityVerlet(1*unit.femtosecond)
>>> copied_integrator = copy.deepcopy(integrator)
>>> isinstance(copied_integrator, VelocityVerlet)
True
>>> integrator_serialization = openmm.XmlSerializer.serialize(integrator)
>>> deserialized_integrator = openmm.XmlSerializer.deserialize(integrator_serialization)
>>> utils.RestorableOpenMMObject.restore_interface(deserialized_integrator)
True
>>> deserialized_integrator.my_method()
0.0

For forces, the function openmmtools.forces.find_forces(system) automatically calls RestorableOpenMMObject.restore_interface() on all system forces so there’s usually no need to perform that call after deserialization.

Integrators coupled to a heat bath

If you implement an integrator coupled to a heat bath, you have to expose getTemperature and setTemperature methods or ThermodynamicState won’t have any way to recognize it, and it will add an AndersenThermostat force when initializing the OpenMM Context object.

The base class openmmtools.integrators.ThermostatedIntegrator is a convenience class implemented for this purpose. Inheriting from ThermostatedIntegrator will implicitly add the RestorableOpenMMObject functionalities as well.

>>> from openmmtools import integrators

>>> class MyIntegrator(integrators.ThermostatedIntegrator):
...     def __init__(self, temperature=298.0*unit.kelvin, timestep=1.0*unit.femtoseconds):
...         super().__init__(temperature, timestep)
...
>>> integrator = MyIntegrator(temperature=350*unit.kelvin)
>>> integrator.getTemperature()
Quantity(value=350.0, unit=kelvin)
>>> integrator.setTemperature(380.0*unit.kelvin)

Using standard Python attribute in custom integrators and forces

You should avoid having pure Python attributes when inheriting from custom OpenMM integrators and forces and instead favor using properties that read that attribute from the underlying OpenMM object as, for example, a global variable.

For example, an integrator exposing the temperature should not hold a simple temperature Python attribute internally such as

class INCORRECTIntegrator(openmm.CustomIntegrator):

    def __init__(self, *args, temperature=298.15*unit.kelvin, **kwargs):
        super().__init__(*args, **kwargs)
        self.temperature = temperature

but it expose it as a getter or a property similarly to the follow.

class CorrectIntegrator(openmm.CustomIntegrator):

    def __init__(self, *args, temperature=298.15*unit.kelvin, **kwargs):
        super().__init__(*args, **kwargs)
        self.addGlobalVariable('temperature', temperature)

    @property
    def temperature(self):
        return self.getGlobalVariableByName('temperature') * unit.kelvin

This is because:

  1. If the parameter doesn’t affect serialization ContextCache won’t be able to distinguish between two integrators that differ by that parameter, and it may return an incorrect integrator.

  2. Python attribute cannot be restored by RestorableOpenMMObject since there’s no information about them in the XML string, and thus they will be lost with serialization.


Handling (compound) thermodynamic states

In the examples that follow, we’ll use a simple ThermodynamicState, but everything applies to CompoundThermodynamicState as well as CompoundThermodynamicState is a subclass of ThermodynamicState.

Modifying a System object buried in a ThermodynamicState

Setting a thermodynamic parameter in ThermodynamicState is practically instantaneous, but modifying anything else involves the copy of the internal System object so it can be very slow.

from openmmtools import states
from openmmtools import testsystems

system = testsystems.TolueneVacuum().system
thermo_state = states.ThermodynamicState(system, temperature=300*unit.kelvin)

# This is very fast.
thermo_state.temperature = 400.0*unit.kelvin

system = thermo_state.system  # This is a copy! Changes to this System won't affect thermo_state.
# Make your changes to system.
thermo_state.system = system  # This involves another System copy.

The copies are there to ensure the consistency of ThermodynamicState internal state. If you need to consistently modifying part of the systems during the simulation consider implementing a composable state that handle those degrees of freedom (see section Implementing a new ComposableState).

Another thing to keep in mind is that by default the property ThermodynamicState.system will return a System containing an AndersenThermostat force. If you only use ThermodynaicState.create_context() or the ContextCache class to create OpenMM Context objects, this shouldn’t cause issues, but if for any reason you don’t want that thermostat you can use the getter instead of the property.

system = thermo_state.get_system(remove_thermostat=True)

Implementation details of compatibility checks

Internally, ThermodynamicState associates a unique hash to a System in a particular ensemble, and it compares this hash to check for compatibility. The function that performs this task looks like this:

@classmethod
def _standardize_and_hash(cls, system):
    """Standardize the system and return its hash."""
    cls._standardize_system(system)
    system_serialization = openmm.XmlSerializer.serialize(system)
    return system_serialization.__hash__()

The _standardize_system() functions sets the thermodynamic parameters controlled by the ThermodynamicState to a standard value so that System that differ by only those parameters will have identical XML serialized strings, and thus identical hashes.

The section Implementing a new ComposableState has information on how the composable states expand the concept of compatibility to thermodynamic parameters other than temperature and pressure.

Note

As a consequence of how the compatibility hash is computed, two ThermodynamicStates to be compatible must have Systems with the same particles and forces in the same order, or the XML serialization will be different.

Copying and initialization of multiple thermodynamic states

Because of some memory optimizations, copying a ThermodynamicState or a CompoundThermodynamicState does not copy the internal System so it is practically instantaneous. On the other hand, initializing a new ThermodynamicState or a CompoundThermodynamicState object does involve a System copy.

thermo_state1 = states.ThermodynamicState(system, temperature=300*unit.kelvin)

# Very fast.
thermo_state2 = copy.deepcopy(thermo_state)
thermo_state2.temperature = 350*unit.kelvin

# Slow.
thermo_state2 = states.ThermodynamicState(system, temperature=350*unit.kelvin)

The function openmmtools.states.create_thermodynamic_state_protocol takes advantage of this to make it easy to instantiate a list of ThermodynamicState or CompoundThermodynamicState objects that differ only by the controlled parameters.


Implementing a new ComposableState

The IComposableInterface

Composable states allow to control thermodynamic parameters of the simulation while masking their implementation details. There are no restrictions on the implementation details, but the class must implement the openmmtools.states.IComposableState interface. You can see the API docs for contract details, but here is a list of the methods.

class IComposableState:

    def apply_to_system(self, system):
        """Modify an OpenMM System to be in this thermodynamic state."""

    def check_system_consistency(self, system):
        """Raise AlchemicalStateError if system has different parameters."""

    def apply_to_context(self, context):
        """Modify an OpenMM Context to be in this thermodynamic state."""

    def _standardize_system(cls, system):
        """Modify the System to be in the standard thermodynamic state."""

    def _on_setattr(self, standard_system, attribute_name, old_attribute_value):
        """Callback that checks if standard system needs to be updated after a state attribute is set."""

    def _find_force_groups_to_update(self, context, current_context_state, memo)
        """Find the force groups whose energy must be recomputed after apply_to_context."""
        # Optional. This is used only for optimizations.

The _standardize_system method effectively determines which other states will be compatible (see also section Implementation details of compatibility checks). The purpose of _standardize_system is to set the parameters of the System that can be manipulated in the Context to the same value so that their XML serialization string and their hash will be identical. Systems that after standardization are identical are assigned to the same Context by ContextCache.get_context().

Relatedly, the callback _on_setattr() is called by CompoundThermodynamicState after a thermodynamic parameter has been set. The method must return True if the change in the thermodynamic parameter has caused the standard system to have a different hash. For example, in the basic ThermodynamicState class this happens when the pressure parameter goes from None to any valid value because states in NVT and NPT are not compatible.

The method _find_force_groups_to_update is optional and related to the optimization described in Computing the reduced potential of one configuration at multiple thermodynamic states.

ComposableStates that control a Force global parameter

Often, a thermodynamic parameter can be implemented with OpenMM as a global parameter added to a custom force. For example, to alchemically soften torsions, alchemy.AbsoluteAlchemicalFactory substitute some of the torsion potential terms using a openmm.CustomTorsionForce whose energy is multiplied by a global parameter called lambda_torsions.

energy_function = "lambda_torsions * k*(1+cos(periodicity*theta-phase))"
custom_force = openmm.CustomTorsionForce(energy_function)
custom_force.addGlobalParameter('lambda_torsions', 1.0)
# Other force configurations.
system.addForce(custom_force)

When this is the case, the base class openmmtools.states.GlobalParameterState can be used to create a composable state very quickly.

from openmmtools.states import GlobalParameterState

class MyComposableState(GlobalParameterState):

    lambda_torsions = GlobalParameterState.GlobalParameter('lambda_torsions', standard_value=1.0)

It is possible to perform checks on the assigned value by adding a validator.

class MyComposableState(GlobalParameterState):

    lambda_torsions = GlobalParameterState.GlobalParameter('lambda_torsions', standard_value=1.0)

    @lambda_torsions.validator
        def lambda_torsions(self, instance, new_value):
            if new_value is not None and not (0.0 <= new_value <= 1.0):
                raise ValueError('lambda_torsions must be between 0.0 and 1.0')
            return new_value

The example above allows only values between 0.0 and 1.0 for lambda_torsions.

Computing the reduced potential of one configuration at multiple thermodynamic states

When computing the potential energy of a single configuration at multiple thermodynamic states, it is often unnecessary to compute the whole Hamiltonian multiple times but just the terms of the Hamiltonian that change from one state to another. OpenMM makes this possible to compute only the energy of a subset of forces through the force groups mechanism.

force = openmm.CustomBondForce('(K/2)*(r-r0)^2;')
force.setForceGroup(5)

The utility function openmmtools.states.reduced_potential_at_states() takes advantage of forces separated in different groups to efficiently compute the reduced potentials at the thermodynamic states.

from openmmtools import alchemy
from openmmtools import cache

alanine = testsystems.AlchemicalAlanineDipeptide()
protocol = {'lambda_sterics': [1.0, 0.5, 0.0],
            'lambda_electrostatics': [1.0, 0.5, 0.0]}
constants = {'temperature': 300*unit.kelvin}
composable_states = [alchemy.AlchemicalState.from_system(alanine.system)]
compound_states = states.create_thermodynamic_state_protocol(alanine.system, protocol,
                                                             constants, composable_states)

sampler_state = states.SamplerState(positions=alanine.positions)
reduced_potentials = states.reduced_potential_at_states(sampler_state, compound_states,
                                                        cache.global_context_cache)

In order for the optimization to take effect, the composable states must implement the method _find_force_groups_to_update(self, context, current_context_state, memo). This method inspects the System associated to the context and return the force groups that will have an updated energy after the state will be changed from current_context_state to self. The memo dictionary can be use to store the force groups to inspect in subsequent calls of the method within a reduced_potential_at_states execution so that the System must be parsed only the first time.


Implementing a new MCMCMove

The MCMCMove interface

An MCMCMove requires exclusively the implementation of an apply method with the following signature (see the API documentation for more details.

class MCMCMove(SubhookedABCMeta):

    def apply(self, thermodynamic_state, sampler_state):
        pass

Anything can happen inside apply as long as thermodynamic_state and sampler_state are updated correctly. It is usually a good idea to include in the constructor a context_cache argument to let the user specify how the Context should be created and on which platform.

OpenMM integrators that modify the thermodnamic state

Custom OpenMM integrators can modify global variables that effectively change the thermodynamic state of the Context.

Important

Remember to update the thermodynamic_state object correctly at the end of apply if the integrator changes the thermodynamic state of the simulation.

When this is the case, it’s not possible to cast your integrator into an MCMCMove with IntegratorMove. Nevertheless, it’s still possible to take advantage of the extra features already offered by IntegratorMove by subclassing the openmmtools.mcmc.BaseIntegratorMove <mcmc> class. IntegratorMove inherits from this base class. An implementation would look more or less like this (see the API documentation for the details).

class MyMove(BaseIntegratorMove):
    def __init__(self, timestep, n_steps, **kwargs):
        super(MyMove, self).__init__(n_steps, **kwargs)
        self.timestep = timestep

    def _get_integrator(self, thermodynamic_state):
        return MyIntegrator(self.timestep, thermodynamic_state.temperature)

    def _before_integration(self, context, thermodynamic_state):
        # Optional: Any operation performed after the context
        # was created but before integration.

    def _after_integration(self, context, thermodynamic_state):
        # Update thermodynamic_state from context parameters.
        # Optional: Read statistics from context.getIntegrator() parameters.

Metropolized MCMCMoves

The mcmc module contains a base class for Metropolized moves as well. The following class implement an example that simply adds the unit vector to the initial coordinates.

from openmmtools import mcmc

class AddOneVector(mcmc.MetropolizedMove):
    def _propose_positions(self, initial_positions):
        print('Propose new positions')
        displacement = numpy.array([1.0, 1.0, 1.0]) * unit.angstrom
        return initial_positions + displacement

The parent class will take care of implementing the Metropolis acceptance criteria, collecting acceptance statistics, and updating the SamplerState correctly. The constructor accepts an optional atom_subset to limit the move to certain atoms. In this case, the initial_positions will be the positions of the atom subset only.

>>> alanine = testsystems.AlanineDipeptideVacuum()
>>> sampler_state = states.SamplerState(alanine.positions)
>>> thermodynamic_state = states.ThermodynamicState(alanine.system, 300*unit.kelvin)
>>> move = AddOneVector(atom_subset=list(range(sampler_state.n_particles)))
>>> move.apply(thermodynamic_state, sampler_state)
Propose new positions
>>> move.n_accepted
1
>>> move.n_proposed
1

Real time analysis output

The MultiStateReporter outputs a file with analysis statistics every few interactions dictated by the online_analysis_interval option from the MultiStateSampler object. The format of this file consists of several entries as the following:

- iteration: 4
  percent_complete: 25.0
  mbar_analysis:
    free_energy_in_kT: 20.747354877367165
    standard_error_in_kT: 0.7055846807301683
    number_of_uncorrelated_samples: 5.0
  timing_data:
    iteration_seconds: 0.04654264450073242
    average_seconds_per_iteration: 0.05740888913472494
    estimated_time_remaining: '0:00:00.746316'
    estimated_localtime_finish_date: 2022-Mar-24-20:07:21
    estimated_total_time: '0:00:00.918542'
    ns_per_day: 150.49934130799824

This is intended to be used for monitoring performance of currently running simulations.