swiftpol

A Python package for building in silico polymer systems

swiftpol.build

swiftpol.build.build_polymer(sequence, monomer_list, reaction, terminal='hydrogen', chain_num=1)[source]

Constructs a polymer from a given sequence of monomers.

Parameters:
  • sequence (str) – A string representing the sequence of monomers (e.g., ‘ABAB’).

  • monomer_list (list) – A list of SMILES strings representing the monomers.

  • reaction (rdkit.Chem.rdChemReactions.ChemicalReaction) – An RDKit reaction object used to link monomers.

  • terminal (str, optional) – The terminal group to be added to the polymer. Options are ‘hydrogen’, ‘carboxyl’, ‘ester’, or a canonical smiles string to insert as the terminal. Default is ‘hydrogen’.

  • chain_number (int, optional) – The number of polymer chains to construct. Default is 1. Input used for ensemble build.

Returns:

The constructed polymer as an RDKit molecule object.

Return type:

rdkit.Chem.rdchem.Mol

swiftpol.build.build_linear_copolymer(sequence, monomer_a_smiles, monomer_b_smiles, reaction=<rdkit.Chem.rdChemReactions.ChemicalReaction object>)[source]

Constructs a linear co-polymer from the provided sequence of monomers.

This function takes a sequence of monomers represented as ‘A’ and ‘B’, and the SMILES strings of two monomers. It constructs a co-polymer based on the sequence, using the provided reaction SMARTS for joining the monomers. The function returns the sanitized polymer and the percentage composition of each monomer in the polymer.

Parameters:
  • sequence (str) – A string representing the sequence of monomers. ‘A’ represents monomer_a and ‘B’ represents monomer_b.

  • monomer_a_smiles (str) – The SMILES string of monomer A.

  • monomer_b_smiles (str) – The SMILES string of monomer B.

  • reaction (rdkit.Chem.rdChemReactions.ChemicalReaction, optional) – The reaction SMARTS used for joining the monomers. Defaults to ‘[C:1][HO:2].[HO:3][C:4]>>[C:1][O:2][C:4].[O:3]’, representing a condensation polymerisation.

Returns:

A tuple containing the following elements: - sanitized_polymer (rdkit.Chem.rdchem.Mol): The constructed and sanitized polymer. - percentage_monomer_a (float): The percentage composition of monomer A in the polymer. - percentage_monomer_b (float): The percentage composition of monomer B in the polymer.

Return type:

tuple

swiftpol.build.PDI(chains)[source]

Calculates the Polydispersity Index (PDI), number-average molecular weight (Mn), and weight-average molecular weight (Mw) of a list of chains.

Parameters:

chains (list) – A list of molecular chains. Each chain is represented as an RDKit molecule object.

Returns:

A tuple containing the following elements: - PDI (float): The Polydispersity Index, which is the ratio of Mw to Mn. - Mn (float): The number-average molecular weight. - Mw (float): The weight-average molecular weight.

Return type:

tuple

swiftpol.build.blockiness_gen(sequence, wrt='A')[source]

Calculate the blockiness and average block length of a co-polymer sequence.

This function calculates the blockiness of a co-polymer sequence by counting the occurrences of ‘BB’ and ‘BA’ or ‘AB’ in the sequence. It also calculates the average block length of ‘A’ and ‘B’ monomers in the sequence.

Parameters:

sequence (str) – A string representing the co-polymer sequence. ‘A’ represents one type of monomer and ‘B’ represents another type.

Returns:

A tuple containing the following elements: - blockiness (float): The blockiness of the co-polymer sequence. Calculated as the ratio of ‘BB’ to ‘BA’ or ‘AB’. - block_length_A (float): The average block length of ‘A’ in the sequence. - block_length_B (float): The average block length of ‘B’ in the sequence.

Return type:

tuple

Notes

If the sequence does not contain both ‘A’ and ‘B’, the function returns a string indicating that the molecule is not a co-polymer.

swiftpol.build.calculate_box_components(chains, monomers, sequence, salt_concentration=<Quantity(0.0, 'mole / liter')>, residual_monomer=0.0)[source]

Calculates the components required to construct a simulation box for a given set of molecular chains.

This function determines the quantity of each molecule type required, considering the salt concentration and residual monomer concentration. It is adapted from the OpenFF Toolkit Packmol wrapper’s solvate_topology function.

Parameters:

chainslist

A list of molecular chains to be included in the simulation box.

sequencestr

A string representing the sequence of the molecular chains. ‘G’ and ‘L’ represent different types of monomers.

salt_concentrationfloat, optional

The desired salt concentration in the simulation box. Defaults to 0 M.

residual_monomerfloat, optional

The desired residual monomer concentration in the simulation box. Defaults to 0.00%.

Returns:

tuple

A tuple containing the following elements: - molecules (list): A list of molecules to be included in the simulation box. - number_of_copies (list): A list indicating the quantity of each molecule to be included in the simulation box. - topology (openff.toolkit.topology.Topology): The topology of the simulation box. - box_vectors (numpy.ndarray): The vectors defining the dimensions of the simulation box.

Notes:

This function is adapted from the OpenFF Toolkit Packmol wrapper’s solvate_topology function.

swiftpol.build.introduce_stereoisomers(stereo_monomer, instance, seq)[source]

Introduce stereoisomers into a polymer sequence by replacing a specified percentage of ‘A’ monomers with ‘S’.

This function replaces a specified percentage of ‘A’ monomers with ‘S’ in the given sequence. The replacements are made in pairs of ‘A’ monomers to ensure stereoisomerism.

Parameters:
  • stereo_monomer (str) – The monomer to be replaced with its stereoisomer (e.g., ‘A’).

  • instance (float) – The fraction of ‘stereo_monomer’ monomers to be replaced with stereoisomer (e.g., 0.5 for 50%).

  • seq (str) – The original polymer sequence.

Returns:

The modified sequence with stereoisomers introduced.

Return type:

str

swiftpol.build.polymer_system

class swiftpol.build.polymer_system(monomer_list, reaction, length_target, num_chains, stereoisomerism_input=None, terminals='hydrogen', perc_A_target=100, blockiness_target=[1.0, 'A'], copolymer=False, acceptance=10)[source]

Bases: object

__init__(monomer_list, reaction, length_target, num_chains, stereoisomerism_input=None, terminals='hydrogen', perc_A_target=100, blockiness_target=[1.0, 'A'], copolymer=False, acceptance=10)[source]

Initialize the polymer system and build the polymer chains.

Parameters:

monomer_list (list): List of monomers to be used in the polymerization.

reaction (str): The type of reaction to be used for polymerization.

length_target (float): The target length of the polymer chains.

num_chains (int): The number of polymer chains to be generated.

stereoisomerism_input (tuple, optional): A tuple containing the monomer, instance fraction (e.g. 0.5 for 50% stereoisomer), and SMILES string of the stereoisomer to be introduced. Default is None.

terminals (str, optional): The type of terminal groups to be used. Default is ‘hydrogen’, adds a hydrogen atom.

perc_A_target (float, optional): The target percentage of monomer A in the copolymer. Default is 100.

blockiness_target (float, optional): The target blockiness of the copolymer, and indication of method calculation. Default is 1.0, with reference to ‘A’ monomer linkages.

copolymer (bool, optional): Flag to indicate if the system is a copolymer. Default is False.

acceptance = % deviation of blockiness and A percentage from target values. Default is 10%

Attributes:

length_target (float): The target length of the polymer chains.

terminals (str): The type of terminal groups used.

blockiness_target (float): The target blockiness of the copolymer.

A_target (float): The target percentage of monomer A in the copolymer.

chains (list): List of polymer chains as OpenFF Molecule objects.

chain_rdkit (list): List of polymer chains as RDKit molecule objects.

lengths (list): List of lengths of the polymer chains.

perc_A_actual (list): List of actual percentages of monomer A in the polymer chains.

B_block_length (float): The average block length of monomer B in the copolymer.

A_block_length (float): The average block length of monomer A in the copolymer.

blockiness_list (list): List of blockiness values for the polymer chains.

mean_blockiness (float): The mean blockiness of the polymer chains.

mol_weight_average (float): The average molecular weight of the polymer chains.

PDI (float): The polydispersity index of the polymer chains.

Mn (float): The number-average molecular weight of the polymer chains.

Mw (float): The weight-average molecular weight of the polymer chains.

num_chains (int): The number of polymer chains generated.

length_average (float): The average length of the polymer chains.

min_length (float): The minimum length of the polymer chains.

max_length (float): The maximum length of the polymer chains.

generate_conformers()[source]

Generate conformers for each polymer chain in the system.

This method uses the OpenFF toolkit OpenEye Wrapper to generate conformers for each polymer chain in the system. It first checks if the OpenEye toolkit is licensed and available. If it is, it uses the OpenEyeToolkitWrapper to generate conformers. Otherwise, it falls back to using the RDKitToolkitWrapper. Each chain is processed to generate a single conformer, and unique atom names are assigned to each chain.

Raises:

ImportError – If neither RDKit nor OpenEye toolkits are available.

charge_system(charge_scheme)[source]

Assign partial charges to each polymer chain in the system.

This method uses one of AM1-BCC, Espaloma, or OpenFF NAGL to assign partial charges to each polymer chain in the system. It iterates over each chain in the self.chains list and assigns partial charges to the chain.

Parameters:

charge_scheme (str) – The charge assignment scheme to use. Options are ‘AM1_BCC’, ‘espaloma’, or ‘NAGL’.

Raises:

ImportError – If the selected toolkit is not available.

export_to_csv(filename)[source]

Export all the instances in polymer_system.__init__() into a pandas DataFrame and save it as a CSV file.

Parameters:

filename (str) – The name of the CSV file to save the data.

pack_solvated_system(salt_concentration=<Quantity(0.0, 'mole / liter')>, residual_monomer=0.0)[source]

Pack a solvated system using Packmol functions, and the OpenFF Packmol wrapper.

This method uses Packmol to build a solvated system by packing molecules into a simulation box. It considers the salt concentration and residual monomer concentration to determine the quantity of each molecule type required.

Parameters:
  • salt_concentration (openff.units.Quantity, optional) – The desired salt concentration in the simulation box. Default is 0.0 M.

  • residual_monomer (float, optional) – The desired residual monomer concentration in the simulation box. Default is 0.00.

Returns:

An Interchange object representing the packed solvated system.

Return type:

openff.toolkit.topology.topology.Topology

Notes

This function uses the OpenFF Interchange Packmol wrapper to pack the molecules into the simulation box. It removes any molecules with zero copies before packing, to avoid packmol errors. Assigns % residual monomer value to ensemble under self.residual_monomer_actual

generate_polyply_files(residual_monomer=0.0, residual_oligomer=0.0)[source]

Generate input files for Polyply from the system.

This method generates the input files required for Polyply (https://github.com/marrink-lab/polyply_1.0/) from the system.

Parameters:
  • residual_monomer (float, optional) – The desired residual monomer concentration in the simulation box. Default is 0.00.

  • residual_oligomer (float, optional) – The desired residual oligomer concentration in the simulation box. Default is 0.00.

Returns:

A tuple containing the paths to the .gro, .top, and .pdb files generated for Polyply and GROMACS insert-molecules input.

Return type:

tuple

Notes

This function uses OpenFF Interchange to generate the input files for Polyply.

Raises:

UserWarning – If partial charges are not assigned to the system, processing large systems may raise errors from OpenFF-Interchange.

calculate_residuals(residual_monomer=0, residual_oligomer=0, return_rdkit=False)[source]

Generate residual monomer and oligomer molecules, and molecule counts.

Parameters:
  • residual_monomer (float) – The desired residual monomer concentration in the simulation box. Default is 0.00.

  • residual_oligomer (float) – The desired residual oligomer concentration in the simulation box. Default is 0.00.

  • return_rdkit (bool, optional) – If True, return RDKit molecule objects instead of OpenFF Molecule objects. Default is False.

Returns:

  • tuple

  • A tuple containing the following elements

    • A list of OpenFF Molecule objects representing the residual monomer and oligomer molecules.

    • A list of molecule counts, corresponding with the molecule list

    • Actual value for % residual monomer

    • Actual value for % residual oligomer

Notes

EXPERIMENTAL CAPABILITY. Proceed with caution

This function works independently of, and is unrelated to, the build.calculate_box_components function.

Raises:

UserWarning - If the residual monomer concentration is close to or lower than the weight of a single monomer.

swiftpol.build.polymer_system_from_PDI

class swiftpol.build.polymer_system_from_PDI(monomer_list, reaction, length_target, num_chains, PDI_target, stereoisomerism_input=None, terminals='hydrogen', perc_A_target=100, blockiness_target=[1.0, 'A'], copolymer=False, acceptance=10)[source]

Bases: object

Build a polymer system based on a target polydispersity index (PDI).

This class generates a polymer system with a specified number of chains, target chain length, and target PDI. It supports the generation of copolymers, stereoisomers, and blockiness control. The system is built using OpenFF and RDKit toolkits, with optional support for OpenEye toolkits if installed.

Note: This capability is under development and may change significantly in the next version of SwiftPol. Proceed with caution.

Parameters:
  • monomer_list (list of str) – A list of monomer SMILES strings to be used in the polymer system.

  • reaction (str) – A SMARTS string defining the polymerization reaction.

  • length_target (int) – The target average chain length for the polymer system.

  • num_chains (int) – The number of polymer chains to generate.

  • PDI_target (float) – The target polydispersity index (PDI) for the polymer system.

  • stereoisomerism_input (tuple, optional) – A tuple containing the stereoisomer monomer, the fraction of stereoisomers to introduce, and the SMILES string of the stereoisomer. Default is None.

  • terminals (str, optional) – The type of terminal groups to use for the polymer chains. Default is ‘hydrogen’.

  • perc_A_target (float, optional) – The target percentage of monomer ‘A’ in the copolymer. Default is 100.

  • blockiness_target (list, optional) – A list containing the target blockiness value and the monomer to control blockiness for. Default is [1.0, ‘A’].

  • copolymer (bool, optional) – Whether to generate a copolymer. Default is False.

  • acceptance (float, optional) – The acceptance margin (in percentage) for the target blockiness and monomer percentage. Default is 10.

Variables:
  • chains (list of openff.toolkit.topology.Molecule) – A list of OpenFF Molecule objects representing the polymer chains.

  • chain_rdkit (list of rdkit.Chem.Mol) – A list of RDKit Molecule objects representing the polymer chains.

  • sequence (str) – The sequence of monomers in the polymer system.

  • mol_weight_average (float) – The average molecular weight of the polymer chains.

  • PDI (float) – The polydispersity index of the polymer system.

  • Mn (float) – The number-average molecular weight of the polymer system.

  • Mw (float) – The weight-average molecular weight of the polymer system.

  • num_chains (int) – The number of polymer chains in the system.

  • length_average (float) – The average chain length of the polymer system.

  • lengths (list of int) – A list of chain lengths for the polymer chains.

  • min_length (int) – The minimum chain length in the polymer system.

  • max_length (int) – The maximum chain length in the polymer system.

  • A_actual (float or None) – The actual percentage of monomer ‘A’ in the copolymer. None if not a copolymer.

  • perc_A_actual (list of float or None) – A list of the actual percentages of monomer ‘A’ in each chain. None if not a copolymer.

  • B_block_length (float or None) – The average block length of monomer ‘B’. None if not a copolymer.

  • A_block_length (float or None) – The average block length of monomer ‘A’. None if not a copolymer.

  • blockiness_list (list of float or None) – A list of blockiness values for each chain. None if not a copolymer.

  • mean_blockiness (float or None) – The mean blockiness value for the polymer system. None if not a copolymer.

Notes

  • If OpenEye toolkits are not installed, RDKit will be used for conformer generation.

  • The system is built using a log-normal distribution of chain lengths to achieve the target PDI.

  • For small systems, PDI may not be close to the target value.

Examples

>>> from swiftpol.build import polymer_system_from_PDI
>>> monomer_list=['OC(=O)COI']
>>> reaction = '[C:1][O:2][H:3].[I:4][O:5][C:6]>>[C:1][O:2][C:6].[H:3][O:5][I:4]'
>>> system = polymer_system_from_PDI(
...     monomer_list=monomer_list,
...     reaction=reaction,
...     length_target=50,
...     num_chains=50,
...     PDI_target=1.7,
...     copolymer=False,
... )
>>> print(system)
SwiftPol ensemble of size 50, average chain length = 10.6-mers, PDI = 1.7006109664210667
__init__(monomer_list, reaction, length_target, num_chains, PDI_target, stereoisomerism_input=None, terminals='hydrogen', perc_A_target=100, blockiness_target=[1.0, 'A'], copolymer=False, acceptance=10)[source]
generate_conformers()[source]

Generate conformers for each polymer chain in the system.

This method uses the OpenFF toolkit OpenEye Wrapper to generate conformers for each polymer chain in the system. It first checks if the OpenEye toolkit is licensed and available. If it is, it uses the OpenEyeToolkitWrapper to generate conformers. Otherwise, it falls back to using the RDKitToolkitWrapper. Each chain is processed to generate a single conformer, and unique atom names are assigned to each chain.

Raises:

ImportError – If neither RDKit nor OpenEye toolkits are available.

charge_system(charge_scheme)[source]

Assign partial charges to each polymer chain in the system.

This method uses one of AM1-BCC, Espaloma, or OpenFF NAGL to assign partial charges to each polymer chain in the system. It iterates over each chain in the self.chains list and assigns partial charges to the chain.

Parameters:

charge_scheme (str) – The charge assignment scheme to use. Options are ‘AM1_BCC’, ‘espaloma’, or ‘NAGL’.

Raises:

ImportError – If the selected toolkit is not available.

export_to_csv(filename)[source]

Export all the instances in polymer_system.__init__() into a pandas DataFrame and save it as a CSV file.

Parameters:

filename (str) – The name of the CSV file to save the data.

pack_solvated_system(salt_concentration=0.0, residual_monomer=0.0)[source]

Pack a solvated system using Packmol functions, and the OpenFF Packmol wrapper.

This method uses Packmol to build a solvated system by packing molecules into a simulation box with water. It considers the salt concentration and residual monomer concentration to determine the quantity of each molecule type required.

Parameters:
  • salt_concentration (openff.units.Quantity, optional) – The desired salt concentration in the simulation box. Default is 0.0 M.

  • residual_monomer (float, optional) – The desired residual monomer concentration in the simulation box. Default is 0.00.

Returns:

An Interchange object representing the packed solvated system.

Return type:

openff.toolkit.topology.topology.Topology

Notes

This function uses the OpenFF Interchange Packmol wrapper to pack the molecules into the simulation box. It removes any molecules with zero copies before packing, to avoid packmol errors. Assigns % residual monomer value to ensemble under self.residual_monomer_actual

generate_polyply_files(residual_monomer=0.0, residual_oligomer=0.0)[source]

Generate input files for Polyply from the system.

This method generates the input files required for Polyply (https://github.com/marrink-lab/polyply_1.0/) from the system.

Parameters:
  • residual_monomer (float, optional) – The desired residual monomer concentration in the simulation box. Default is 0.00.

  • residual_oligomer (float, optional) – The desired residual oligomer concentration in the simulation box. Default is 0.00.

Returns:

A tuple containing the paths to the .gro, .top, and .pdb files generated for Polyply and GROMACS insert-molecules input.

Return type:

tuple

Notes

This function uses OpenFF Interchange to generate the input files for Polyply.

Raises:

UserWarning – If partial charges are not assigned to the system, processing large systems may raise errors from OpenFF-Interchange.

calculate_residuals(residual_monomer=0, residual_oligomer=0)[source]

Generate residual monomer and oligomer molecules, and molecule counts.

Parameters:
  • residual_monomer (float) – The desired residual monomer concentration in the simulation box. Default is 0.00.

  • residual_oligomer (float) – The desired residual oligomer concentration in the simulation box. Default is 0.00.

Returns:

  • tuple

  • A tuple containing the following elements

    • A list of OpenFF Molecule objects representing the residual monomer and oligomer molecules.

    • A list of molecule counts, corresponding with the molecule list

    • Actual value for % residual monomer

    • Actual value for % residual oligomer

Notes

EXPERIMENTAL CAPABILITY. Proceed with caution

This function works independently of, and is unrelated to, the build.calculate_box_components function.

Residual oligomers are polydisperse and are generated based on 10% of the target chain length used in ensemble initiation.

Raises:

UserWarning - If the residual monomer concentration is close to or lower than the weight of a single monomer.

swiftpol.parameterize

swiftpol.parameterize.charge_polymer(polymer, charge_scheme)[source]

Calculate and return the partial charges of a polymer chain based on the specified charge scheme.

Parameters:
  • polymer (rdkit.Chem.rdchem.Mol) – A polymer chain for which the charges are to be calculated.

  • charge_scheme (str) – A string that specifies the charge scheme to be used. It can be either ‘AM1_BCC’, ‘espaloma’, or ‘NAGL’.

Returns:

The partial charges of the polymer chain according to the specified charge scheme.

Return type:

numpy.ndarray

Raises:

AttributeError – If the charge_scheme input is not ‘AM1_BCC’, ‘NAGL’, or ‘espaloma’.

swiftpol.parameterize.charge_openff_polymer(openff_chain, charge_scheme, overwrite=True)[source]

Assign partial charges to a polymer chain based on the specified charge scheme.

This function assigns partial charges to a polymer chain using one of the following charge schemes: ‘AM1_BCC’, ‘NAGL’, or ‘espaloma’. It can optionally overwrite existing charges.

Parameters:
  • openff_chain (openff.toolkit.topology.Molecule) – The original polymer chain to which partial charges will be assigned.

  • charge_scheme (str) – The charge scheme to use for assigning partial charges. Options are ‘AM1_BCC’, ‘NAGL’, or ‘espaloma’.

  • overwrite (bool, optional) – Whether to overwrite existing partial charges. Default is True.

Returns:

The partial charges assigned to the polymer chain.

Return type:

openff.toolkit.topology.molecule.PartialChargeArray

Raises:
  • ModuleNotFoundError – If the installed version of openff-toolkit is below what is required to use NAGL.

  • ImportError – If the required package for the specified charge scheme is not installed.

  • AttributeError – If the charge_scheme input is not ‘AM1_BCC’, ‘NAGL’, or ‘espaloma’.

swiftpol.parameterize.forcefield_with_charge_handler(molecule, charge_method, forcefield='openff-2.2.0.offxml', ensemble=False)[source]

Create a forcefield with a charge handler for a given molecule and charge method.

The function can handle both individual molecules and ensembles of molecules.

Parameters:
  • molecule (rdkit.Chem.rdchem.Mol or list of rdkit.Chem.rdchem.Mol) – An RDKit molecule object or a list of RDKit molecule objects for which the forcefield is to be created.

  • charge_method (str) – A string that specifies the charge method to be used for the molecule.

  • forcefield (str, optional) – A string that specifies the forcefield to be used. Default is “openff-2.2.0.offxml”.

  • ensemble (bool, optional) – A boolean that specifies whether the input molecule is an ensemble of molecules. Default is False.

Returns:

An OpenFF ForceField object with the specified molecule’s charges added to the LibraryCharges parameter.

Return type:

openff.toolkit.typing.engines.smirnoff.ForceField

Raises:

Exception – If the charge method is not supported.