QCflow Documentation

QCflow.load_gaussian

QCflow.load_gaussian.load_data(mol_name, job_name)[source]

Opens and reads the .log data file.

Parameters:
  • mol_name (str) (Name of molecule.)

  • job_name (str) (The type of job run. Possible runs:) –

    • Single Point neutral → sp

    • Optimisation neutral → opt

    • Torsional scan neutral → tor

    • Optimisation neutral + Population analysis → pop_opt_n

    • Single point anion → sp_a

    • Single point cation → sp_c

    • Optimisation anion → opt_a

    • Optimisation cation → opt_c

    • neutral charge, optimised anion geometry → n_a_geo

    • neutral charge, optimised cation geometry → n_c_geo

    • Single Point Hirshfeld → sp_hirsh

Returns:

cclib.parser.data.ccData

Return type:

Parsed data from the .log file.

QCflow.load_gaussian.save_dictionary(mol_dic, name_of_dic)[source]

Saves the created dictionary as a .json file.

Parameters:
  • mol_dic (dict) (Dictionary of oligomers where the key is the number of the oligomer and the value is the SMILES string.)

  • name_of_dic (str) (Desired name of the json file (without the .json extension).)

Return type:

None

QCflow.load_gaussian.open_dictionary(dictionary_file)[source]

Opens a saved dictionary file.

Parameters:

dictionary_file (str) (The name of the .json file in which the dictionary is saved.)

Returns:

dict

Return type:

The dictionary loaded from the specified .json file.
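
The save/open round trip described above can be sketched with only the standard json module. The helper names save_dic_sketch and open_dic_sketch, and the folder argument, are hypothetical stand-ins and not QCflow's implementation. One consequence of the JSON format itself is worth noting: integer keys come back as strings.

```python
import json
import os
import tempfile


def save_dic_sketch(mol_dic, name_of_dic, folder="."):
    """Write mol_dic to <folder>/<name_of_dic>.json and return the path."""
    path = os.path.join(folder, f"{name_of_dic}.json")
    with open(path, "w") as handle:
        json.dump(mol_dic, handle, indent=2)
    return path


def open_dic_sketch(dictionary_file):
    """Load a dictionary back from a .json file."""
    with open(dictionary_file) as handle:
        return json.load(handle)


# Illustrative data: oligomer number -> SMILES string.
oligomers = {0: "c1ccsc1", 1: "c1ccoc1"}

with tempfile.TemporaryDirectory() as tmp:
    saved_path = save_dic_sketch(oligomers, "oligomers", folder=tmp)
    loaded = open_dic_sketch(saved_path)

# JSON object keys are always strings, so 0 and 1 come back as "0" and "1".
print(loaded)
```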

QCflow.load_gaussian.data_dic(mol_dic, job_type)[source]

Takes a dictionary of oligomers and the calculation performed, and returns a dictionary of parsed cclib objects.

Parameters:
  • mol_dic (dict) (A dictionary where keys are the names of oligomers and values are their SMILES strings.)

  • job_type (str) (The type of calculation performed.)

Returns:

dict

Return type:

A dictionary where keys are the names of the oligomers and values are the parsed cclib objects.

QCflow.write_gaussian

QCflow.write_gaussian.write_gaussian(job_name, mol_name, smile, functional='B3LYP', basis_set='6-31G*', mol=None, torsion=None, conformer=None, old_chk=None)[source]

Generates a Gaussian input file based on the provided parameters.

Parameters:
  • job_name (str) (The type of job to run. Possible values:) –

    • ‘sp’: Single Point neutral

    • ‘opt’: Optimisation neutral

    • ‘tor’: Torsional scan neutral

    • ‘pop_opt_n’: Optimisation neutral + Population analysis

    • ‘sp_a’: Single point anion

    • ‘sp_c’: Single point cation

    • ‘opt_a’: Optimisation anion

    • ‘opt_c’: Optimisation cation

    • ‘n_a_geo’: Neutral charge, optimised anion geometry

    • ‘n_c_geo’: Neutral charge, optimised cation geometry

    • ‘sp_hirsh’: Single Point Hirshfeld

  • mol_name (str) (The name of the molecule.)

  • smile (str) (The SMILES string of the molecule.)

  • functional (str, optional) (The functional to use. Default is ‘B3LYP’.)

  • basis_set (str, optional) (The basis set to use. Default is ‘6-31G*’.)

  • mol (rdkit.Chem.rdchem.Mol, optional) (The RDKit embedded molecule object.)

  • torsion (tuple, optional) (The torsion information for the rotatable bond.)

  • conformer (rdkit.Chem.rdchem.Conformer, optional) (The RDKit conformer of the molecule.)

  • old_chk (str, optional) (The previous checkpoint file used to guess geometry/MOs.) – Example: ‘mol_name_opt_321G.chk’

Returns:

None

Return type:

Writes the Gaussian input file to disk.
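
For orientation, the general layout of a Gaussian input file (checkpoint line, route line, title, charge and multiplicity, geometry) can be sketched as below. This is a minimal illustration and not QCflow's writer: the function name, the checkpoint file name, and the mapping from job code to route keyword are assumptions, and only three job codes are shown. The multiplicity of 2 for the cation assumes a closed-shell neutral molecule.

```python
# Assumed mapping from job code to (route keyword, charge, multiplicity).
# QCflow's actual route lines may differ.
JOB_SETTINGS = {
    "sp": ("SP", 0, 1),
    "opt": ("Opt", 0, 1),
    "opt_c": ("Opt", 1, 2),
}


def gaussian_input_sketch(job_name, mol_name, atoms,
                          functional="B3LYP", basis_set="6-31G*"):
    """Return the text of a simple Gaussian input file."""
    keyword, charge, multiplicity = JOB_SETTINGS[job_name]
    lines = [
        f"%chk={mol_name}_{job_name}.chk",        # hypothetical checkpoint name
        f"# {functional}/{basis_set} {keyword}",  # route line
        "",
        f"{mol_name} {job_name}",                 # title line
        "",
        f"{charge} {multiplicity}",
    ]
    for symbol, x, y, z in atoms:
        lines.append(f"{symbol} {x:.6f} {y:.6f} {z:.6f}")
    lines.append("")  # Gaussian expects a blank line after the geometry
    return "\n".join(lines) + "\n"


# Illustrative geometry in Angstrom.
water = [("O", 0.0, 0.0, 0.117), ("H", 0.0, 0.757, -0.469), ("H", 0.0, -0.757, -0.469)]
gaussian_text = gaussian_input_sketch("opt", "water", water)
print(gaussian_text)
```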

QCflow.write_psi4

QCflow.write_psi4.write_psi4(job_name, mol_name, smile, functional='b3lyp', basis_set='6-31g*', mol=None, conformer=None)[source]

Generates a psi4 input file based on the provided parameters for single point or geometry optimisation calculations.

Parameters:
  • job_name (str) (The type of job to run. Possible values:) –

    • ‘sp’: Single Point neutral

    • ‘opt’: Optimisation neutral

  • mol_name (str) (The name of the molecule.)

  • smile (str) (The SMILES string of the molecule.)

  • functional (str, optional) (The functional to use. Default is ‘b3lyp’.)

  • basis_set (str, optional) (The basis set to use. Default is ‘6-31g*’.)

  • mol (rdkit.Chem.rdchem.Mol, optional) (The RDKit embedded molecule object.)

  • conformer (rdkit.Chem.rdchem.Conformer, optional) (The RDKit conformer of the molecule.)

Returns:

None

Return type:

Writes the psi4 input file to disk.
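
As a rough picture of what a Psi4 input for the two job types looks like, the sketch below builds a small Psithon-style input as a string. The function name and the molecule label are hypothetical, the neutral singlet "0 1" line is an assumption, and the file QCflow actually writes may contain additional settings.

```python
def psi4_input_sketch(job_name, atoms, functional="b3lyp", basis_set="6-31g*"):
    """Return the text of a minimal Psi4 input for 'sp' or 'opt'."""
    # 'sp' maps to an energy() call, 'opt' to an optimize() call.
    call = {"sp": "energy", "opt": "optimize"}[job_name]
    geometry = "\n".join(f"{s} {x:.6f} {y:.6f} {z:.6f}" for s, x, y, z in atoms)
    return (
        "molecule mol {\n"
        "0 1\n"                      # charge and multiplicity (neutral singlet)
        f"{geometry}\n"
        "}\n\n"
        f"set basis {basis_set}\n"
        f"{call}('{functional}')\n"
    )


# Illustrative geometry in Angstrom.
water = [("O", 0.0, 0.0, 0.117), ("H", 0.0, 0.757, -0.469), ("H", 0.0, -0.757, -0.469)]
psi4_text = psi4_input_sketch("opt", water)
print(psi4_text)
```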

QCflow.write_psi4.write_psi4_reorg(job_name, mol_name, functional='b3lyp', basis_set='6-31g*')[source]

Generates a psi4 input file based on the provided parameters for reorganisation calculations.

Parameters:
  • job_name (str) (The type of job to run. Possible values:) –

    • ‘cation’: Geometry optimisation of the cation (opt_c) and single point of the neutral charge at the cation geometry (n_c_geo)

    • ‘anion’: Geometry optimisation of the anion (opt_a) and single point of the neutral charge at the anion geometry (n_a_geo)

    • ‘sp_c’: Single point calculation of neutral geometry at cation charge

    • ‘sp_a’: Single point calculation of neutral geometry at anion charge

  • mol_name (str) (The name of the molecule.)

  • functional (str, optional) (The functional to use. Default is ‘b3lyp’.)

  • basis_set (str, optional) (The basis set to use. Default is ‘6-31g*’.)

Returns:

None

Return type:

Writes the psi4 input file to disk.

QCflow.slurm

QCflow.slurm.write_slurm(job_name, mol_name, cpus=10)[source]

Writes a SLURM batch script for a specified job type and molecule name.

Parameters:
  • job_name (str) (The type of job to run. Possible values:) –

    • ‘sp’: Single Point neutral

    • ‘opt’: Optimisation neutral

    • ‘tor’: Torsional scan neutral

    • ‘pop_opt_n’: Optimisation neutral + Population analysis

    • ‘sp_a’: Single point anion

    • ‘sp_c’: Single point cation

    • ‘opt_a’: Optimisation anion

    • ‘opt_c’: Optimisation cation

    • ‘n_a_geo’: Neutral charge, optimised anion geometry

    • ‘n_c_geo’: Neutral charge, optimised cation geometry

    • ‘sp_hirsh’: Single Point Hirshfeld

  • mol_name (str) (The name of the dimer from the dictionary, e.g., if fragment 0 was attached to fragment 1, then the dimer name is ‘0_1’.)

  • cpus (int, optional) (The number of CPUs to allocate for the job. Default is 10.)

Notes

The function generates a SLURM batch script file named ‘{mol_name}_{job_name}.sh’ with appropriate configurations based on the job type and molecule name. The script includes settings for job name, output and error files, partition, number of tasks, nodes, CPUs per task, memory per CPU, and time limit. It also sets up the environment and execution line for running Gaussian 16 (g16) with the specified input and output files.
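
The settings listed in the note above map onto #SBATCH directives roughly as in the following sketch. It only builds the script text; the function name, the partition name, the memory value, the time limit, the .com/.log file extensions, and the omission of any module-loading lines are all assumptions for illustration, not values taken from QCflow.

```python
def slurm_script_sketch(job_name, mol_name, cpus=10,
                        partition="cpu", mem_per_cpu="2G", hours=24):
    """Return the text of a SLURM batch script for a Gaussian 16 job."""
    stem = f"{mol_name}_{job_name}"
    return "\n".join([
        "#!/bin/bash",
        f"#SBATCH --job-name={stem}",
        f"#SBATCH --output={stem}.out",
        f"#SBATCH --error={stem}.err",
        f"#SBATCH --partition={partition}",      # hypothetical partition name
        "#SBATCH --ntasks=1",
        "#SBATCH --nodes=1",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem-per-cpu={mem_per_cpu}",  # hypothetical memory request
        f"#SBATCH --time={hours}:00:00",
        "",
        # Execution line: read the input file, write the log file.
        f"g16 < {stem}.com > {stem}.log",
        "",
    ])


script = slurm_script_sketch("opt", "0_1")
print(script)
```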

QCflow.slurm.write_slurm_psi4(job_name, mol_name, time=24, cpus=10)[source]

Writes a SLURM batch script for a specified job type and molecule name.

Parameters:
  • job_name (str) (The type of job to run. Possible values:) –

    • ‘sp’: Single Point neutral

    • ‘opt’: Optimisation neutral

    • ‘cation’: Geometry optimisation of the cation (opt_c) and single point of the neutral charge at the cation geometry (n_c_geo)

    • ‘anion’: Geometry optimisation of the anion (opt_a) and single point of the neutral charge at the anion geometry (n_a_geo)

    • ‘sp_c’: Single point calculation of neutral geometry at cation charge

    • ‘sp_a’: Single point calculation of neutral geometry at anion charge

  • mol_name (str) (The name of the dimer from the dictionary, e.g., if fragment 0 was attached to fragment 1, then the dimer name is ‘0_1’.)

  • time (int, optional) (The time limit for the job in hours. Default is 24. (Max is 48))

  • cpus (int, optional) (The number of CPUs to allocate for the job. Default is 10.)

Notes

The function generates a SLURM batch script file named ‘{mol_name}_{job_name}.sh’ with appropriate configurations based on the job type and molecule name. The script includes settings for job name, output and error files, partition, number of tasks, nodes, CPUs per task, memory per CPU, and time limit. It also sets up the environment and execution line for running Psi4 job with the specified input and output files.

QCflow.slurm.is_job_in_queue(submission_name)[source]

Checks whether the job has been successfully submitted to the queue.

Parameters:

submission_name (str) (The name of the job that has been submitted to the HPC)

Return type:

True or False

QCflow.slurm.submit_slurm_job(job_name, mol_name, max_retries=5, wait_seconds=30)[source]

Submits a SLURM job using the specified job name and molecule name. Works on the KCL CREATE HPC.

Parameters:
  • job_name (str) (The type of job to run. Possible values:) –

    • ‘sp’: Single Point neutral

    • ‘opt’: Optimisation neutral

    • ‘tor’: Torsional scan neutral

    • ‘pop_opt_n’: Optimisation neutral + Population analysis

    • ‘sp_a’: Single point anion

    • ‘sp_c’: Single point cation

    • ‘opt_a’: Optimisation anion

    • ‘opt_c’: Optimisation cation

    • ‘n_a_geo’: Neutral charge, optimised anion geometry

    • ‘n_c_geo’: Neutral charge, optimised cation geometry

    • ‘sp_hirsh’: Single Point Hirshfeld

  • mol_name (str) (The name of the dimer from the dictionary. For example, if fragment 0 was attached to fragment 1, then the dimer name would be ‘0_1’.)

  • max_retries (int) (The maximum number of times the job submission will be attempted. Default is 5.)

  • wait_seconds (int) (How long, in seconds, Python sleeps between submission attempts. Default is 30.)

Returns:

bytes

Return type:

The standard output from the SLURM job submission command.
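
The retry behaviour implied by max_retries and wait_seconds can be sketched as a small loop. This is an illustration under assumptions, not QCflow's code: the function name is hypothetical, the submit step is passed in as a callable so the example runs without a scheduler, and treating RuntimeError as the failure signal is an arbitrary choice. On a cluster the callable would presumably wrap an sbatch call, whose captured standard output is bytes.

```python
import time


def submit_with_retries(submit, max_retries=5, wait_seconds=30, sleep=time.sleep):
    """Call submit() until it succeeds or max_retries attempts have been made."""
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            return submit()
        except RuntimeError as error:
            last_error = error
            if attempt < max_retries:
                sleep(wait_seconds)  # pause before the next attempt
    raise RuntimeError(f"submission failed after {max_retries} attempts") from last_error


# Fake submitter that fails twice, then returns sbatch-like output as bytes.
attempts = []


def fake_submit():
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("queue busy")
    return b"Submitted batch job 12345\n"


output = submit_with_retries(fake_submit, wait_seconds=0, sleep=lambda seconds: None)
print(output)
```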

QCflow.run_gaussian

QCflow.run_gaussian.rdkit_predict_conf(mol_smiles, num_of_conformer=100, max_iter=500, min_energy_MMFF=10000, min_energy_index_MMFF=0)[source]

Generates conformers for a given molecule using RDKit and returns the lowest energy conformer.

Parameters:
  • mol_smiles (str) (The SMILES representation of the molecule.)

  • num_of_conformer (int) (The number of conformers to be generated (default: 100).)

  • max_iter (int) (The maximum number of iterations for conformer optimization (default: 500).)

  • min_energy_MMFF (float) (The minimum energy threshold for selecting the lowest energy conformer (default: 10000).)

  • min_energy_index_MMFF (int) (The index of the lowest energy conformer (default: 0).)

Returns:

conf (Chem.Conformer)

Return type:

The lowest energy conformer of the molecule.

QCflow.run_gaussian.run_calc(job_name, mol_name, mol_smile, functional='B3LYP', basis_set='6-31G*')[source]

Submits a Gaussian calculation to CREATE HPC.

Parameters:
  • job_name (str) (The type of job to run. Possible values:) –

    • ‘sp’: Single Point neutral

    • ‘opt’: Optimisation neutral

    • ‘tor’: Torsional scan neutral

    • ‘pop_opt_n’: Optimisation neutral + Population analysis

    • ‘sp_a’: Single point anion

    • ‘sp_c’: Single point cation

    • ‘opt_a’: Optimisation anion

    • ‘opt_c’: Optimisation cation

    • ‘n_a_geo’: Neutral charge, optimised anion geometry

    • ‘n_c_geo’: Neutral charge, optimised cation geometry

    • ‘sp_hirsh’: Single Point Hirshfeld

  • mol_name (str) – Name of the oligomer.

  • mol_smile (str) – SMILES string of the oligomer.

  • functional (str, optional) – Quantum chemistry functional to be used (default is ‘B3LYP’).

  • basis_set (str, optional) – Basis set to be used (default is ‘6-31G*’).

Returns:

  • For ‘sp_a’, ‘sp_c’, ‘opt_a’, ‘opt_c’, ‘n_a_geo’, ‘n_c_geo’, and ‘sp_hirsh’:

    • Writes a Gaussian input file directly.

  • For ‘opt’, ‘sp’, and ‘pop_opt_n’:

    • Converts the SMILES string to an RDKit molecule object.

    • Generates 3D coordinates for the molecule.

    • Predicts the conformer geometry.

    • Writes a Gaussian input file with the conformer geometry.

  • For ‘tor’:

    • Converts the SMILES string to an RDKit molecule object.

    • Identifies the bond for torsional scan.

    • Determines the torsion angle.

    • Generates 3D coordinates for the molecule.

    • Writes a Gaussian input file with the torsion angle.

  • Finally, the function writes a SLURM script, submits the job, and returns to the previous directory.

QCflow.run_gaussian.staging_opt(job_name, mol_name, mol_smile, mol_dic, functional, basis_set)[source]

Checks whether the calculation has completed successfully. If it has, the oligomer is added to a dictionary recording this. If the calculations have not been run all the way through, they are run. If the calculation has failed, the oligomer is added to a dictionary of failures.

Parameters:
  • job_name (str) (The type of job to run. Possible values:) –

    • ‘sp’: Single Point neutral

    • ‘opt’: Optimisation neutral

    • ‘tor’: Torsional scan neutral

    • ‘pop_opt_n’: Optimisation neutral + Population analysis

    • ‘sp_a’: Single point anion

    • ‘sp_c’: Single point cation

    • ‘opt_a’: Optimisation anion

    • ‘opt_c’: Optimisation cation

    • ‘n_a_geo’: Neutral charge, optimised anion geometry

    • ‘n_c_geo’: Neutral charge, optimised cation geometry

    • ‘sp_hirsh’: Single Point Hirshfeld

  • mol_name (str) (The name of the oligomer as seen in the dictionary i.e. if melanin fragment (b) is combined)

  • mol_smile (str) (SMILES string of the oligomer)

  • mol_dic (dict) (Dictionary of oligomers where key is the name of the oligomer and value is the SMILES string)

  • functional (str) (Functional used in calculations (e.g. B3LYP))

  • basis_set (str) (Basis set used (e.g. 6-31G*))

Returns:

tuple

Return type:

A tuple containing three dictionaries:

  • fully_complete (dict): Oligomers that have been calculated at the highest basis set.

  • not_complete (dict): Oligomers that failed and need manual assessment (shows the basis set they failed at).

  • in_progress (dict): Oligomers that are still in progress (shows the basis set they are currently being run at).

QCflow.run_psi4

QCflow.run_psi4.run_psi4(job_name, mol_name, mol_smile, time=24, cpus=10, functional='b3lyp', basis_set='6-31g*')[source]

Submits a psi4 calculation to CREATE HPC.

Parameters:
  • job_name (str) (The type of job to run. Possible values:) –

    • ‘sp’: Single Point neutral

    • ‘opt’: Optimisation neutral

    • ‘cation’: Geometry optimisation of the cation (opt_c) and single point of the neutral charge at the cation geometry (n_c_geo)

    • ‘anion’: Geometry optimisation of the anion (opt_a) and single point of the neutral charge at the anion geometry (n_a_geo)

    • ‘sp_c’: Single point calculation of neutral geometry at cation charge

    • ‘sp_a’: Single point calculation of neutral geometry at anion charge

  • mol_name (str) – Name of the oligomer.

  • mol_smile (str) – SMILES string of the oligomer.

  • time (int, optional) (The time limit for the job in hours. Default is 24. (Max is 48))

  • cpus (int, optional) (The number of CPUs to allocate for the job. Default is 10.)

  • functional (str, optional) – Quantum chemistry functional to be used (default is ‘b3lyp’).

  • basis_set (str, optional) – Basis set to be used (default is ‘6-31g*’).

Returns:

  • For ‘opt’ and ‘sp’:

    • Converts the SMILES string to an RDKit molecule object.

    • Generates 3D coordinates for the molecule.

    • Predicts the conformer geometry.

    • Writes a psi4 input file with the conformer geometry.

  • For ‘cation’, ‘anion’, ‘sp_c’, and ‘sp_a’:

    • Writes a psi4 input file.

  • Finally, the function writes a SLURM script, submits the job, and returns to the previous directory.

QCflow.testing_data

QCflow.testing_data.success_test(parsed_dic)[source]

Tests if the calculation has been successful for each cclib parsed object.

Parameters:

parsed_dic (dict) (A dictionary where the key is the name of the oligomer and the value is the cclib object.)

Returns:

tuple

Return type:

Two lists - the first list contains the names of oligomers that passed, and the second list contains the names of oligomers that failed.
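
The pass/fail split returned here can be pictured with the sketch below. The success criterion is an assumption: it reads a "success" flag from each object's metadata dictionary, which cclib populates for parsed files, but QCflow may test something else. SimpleNamespace objects stand in for parsed cclib data so the example runs without cclib, and the function name is hypothetical.

```python
from types import SimpleNamespace


def success_partition_sketch(parsed_dic):
    """Split oligomer names into (passed, failed) lists."""
    passed, failed = [], []
    for name, data in parsed_dic.items():
        # Assumed criterion: a truthy metadata["success"] flag.
        ok = getattr(data, "metadata", {}).get("success", False)
        (passed if ok else failed).append(name)
    return passed, failed


# Stand-ins for parsed cclib objects.
parsed = {
    "0_1": SimpleNamespace(metadata={"success": True}),
    "0_2": SimpleNamespace(metadata={"success": False}),
    "1_2": SimpleNamespace(metadata={}),  # flag missing: treated as failed
}
passed, failed = success_partition_sketch(parsed)
print(passed, failed)
```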

QCflow.testing_data.functional_test(parsed_dic, functional)[source]

Tests if the calculations in the parsed cclib objects were done using the specified functional.

Parameters:
  • parsed_dic (dict) (A dictionary where the key is the name of the oligomer and the value is the cclib object.)

  • functional (str) (The desired functional of the calculation.)

Returns:

tuple

Return type:

Two lists - the first list contains the names of oligomers that passed the test, and the second list contains the names of oligomers that failed the test.

QCflow.testing_data.basis_set_test(parsed_dic, basis_set)[source]

Tests if the calculations in the parsed dictionary have been done using the correct basis set.

Parameters:
  • parsed_dic (dict) (A dictionary of parsed cclib objects, with the key being the name of the oligomer and the value being the cclib object.)

  • basis_set (str) (The desired basis set for the calculation.)

Returns:

tuple

Return type:

Two lists - the first list contains the names of oligomers that passed the basis set check, and the second list contains the names of oligomers that failed the basis set check.

QCflow.energy_calculations

QCflow.energy_calculations.extract_data_from_txt(file_path)[source]

Extracts data from a .txt file with energy, HOMO, LUMO, and energy gap information for Psi4 calculations.

Parameters:

file_path (str) (Path to the .txt file.)

Returns:

dict

Return type:

A dictionary with extracted values.

QCflow.energy_calculations.cal_reorg(opt_n, sp_c, opt_c, n_c_geo, calculation_software='Gaussian')[source]

Calculate the reorganization energy. Can calculate it for both Gaussian 16 and Psi4. Gaussian is default.

Parameters:
  • calculation_software (str) (The computational chemistry software used to perform the calculations. Default is ‘Gaussian’. Can be ‘Gaussian’ or ‘Psi4’.)

  • For Gaussian

  • opt_n (cclib.io.ccread) (cclib object for the neutral population optimization analysis.)

  • sp_c (cclib.io.ccread) (cclib object for the vertical anion or cation.)

  • opt_c (cclib.io.ccread) (cclib object for the optimized anion or cation.)

  • n_c_geo (cclib.io.ccread) (cclib object for the neutral molecule at the anion or cation geometry.)

  • For Psi4

  • opt_n (dict) (Dictionary containing the optimized energy of the neutral population optimization analysis.)

  • sp_c (dict) (Dictionary containing the single point energy of the vertical anion or cation.)

  • opt_c (dict) (Dictionary containing the optimized energy of the anion or cation.)

  • n_c_geo (dict) (Dictionary containing the single point energy of the neutral molecule at the anion or cation geometry.)

Returns:

  • float (The reorganization energy in eV.)

  • The reorganization energy is calculated using the following formula

  • reorg_en = (EcN - EnN) + (EnC - EcC)

    • EnN is the SCF energy of the neutral population optimization.

    • EcN is the SCF energy of the vertical anion or cation.

    • EcC is the SCF energy of the optimized anion or cation.

    • EnC is the SCF energy of the neutral charge state at the anion or cation geometry.
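
The formula above can be checked with a small worked example. The four energies are made-up numbers in eV, chosen only to make the arithmetic easy to follow, and the function name is hypothetical. Rearranged, the same expression is (EcN - EcC) + (EnC - EnN): the relaxation energy of the ion plus the relaxation energy of the neutral molecule.

```python
def reorg_energy_sketch(EnN, EcN, EcC, EnC):
    """reorg_en = (EcN - EnN) + (EnC - EcC), all energies in eV."""
    return (EcN - EnN) + (EnC - EcC)


# Made-up energies in eV (illustrative only).
EnN = -1000.00  # neutral charge, neutral geometry
EcN = -993.50   # ion charge, neutral geometry (vertical)
EcC = -993.70   # ion charge, ion geometry (optimised)
EnC = -999.85   # neutral charge, ion geometry

reorg_en = reorg_energy_sketch(EnN, EcN, EcC, EnC)

# Same value from the rearranged form: ion relaxation + neutral relaxation.
relaxation_sum = (EcN - EcC) + (EnC - EnN)
print(round(reorg_en, 6), round(relaxation_sum, 6))
```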

QCflow.energy_calculations.cal_HOMO(opt)[source]

Calculate the Highest Occupied Molecular Orbital (HOMO) energy.

This function takes a cclib object representing the optimized neutral population and returns the HOMO energy in electron volts (eV).

Parameters:

opt (cclib.parser.data.ccData) (A cclib object parsed from a .log file containing the optimization analysis of the neutral population.)

Returns:

float

Return type:

The HOMO energy in electron volts (eV).

QCflow.energy_calculations.cal_LUMO(opt)[source]

Calculate the LUMO energy from the optimized neutral population.

Parameters:

opt (cclib.parser.data.ccData_optdone) (The cclib object containing the parsed .log file for neutral population optimization analysis.)

Returns:

float

Return type:

The LUMO energy in electron volts (eV).

QCflow.energy_calculations.cal_gap(opt)[source]

Calculate the HOMO-LUMO gap for an optimized neutral population.

This function computes the energy difference between the Highest Occupied Molecular Orbital (HOMO) and the Lowest Unoccupied Molecular Orbital (LUMO) in electron volts (eV).

Parameters:

opt (cclib.parser.ccData) (A cclib object representing the parsed .log file for the neutral population optimization analysis.)

Returns:

float

Return type:

The energy difference between the HOMO and LUMO (HOMO-LUMO gap) in eV.
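
A sketch of how the HOMO, LUMO, and gap relate, using the cclib attribute names homos (index of the highest occupied orbital per spin channel) and moenergies (orbital energies per spin channel). A SimpleNamespace with made-up values stands in for a parsed object so the example runs without cclib. The energies are assumed to be in eV already, and the function name is hypothetical.

```python
from types import SimpleNamespace


def homo_lumo_sketch(data):
    """Return (HOMO, LUMO, gap) from cclib-style attributes."""
    homo_index = data.homos[0]      # HOMO index for the first spin channel
    energies = data.moenergies[0]   # orbital energies for that channel
    homo = energies[homo_index]
    lumo = energies[homo_index + 1]  # LUMO is the next orbital up
    return homo, lumo, lumo - homo


# Stand-in for a parsed object: five orbitals, the third (index 2) is the HOMO.
fake_opt = SimpleNamespace(homos=[2], moenergies=[[-10.2, -7.9, -5.4, -1.9, 0.6]])
homo, lumo, gap = homo_lumo_sketch(fake_opt)
print(homo, lumo, round(gap, 6))
```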

QCflow.energy_calculations.cal_IP(opt_n, cation, IP_type)[source]

Calculate the ionization potential (IP) given the cclib objects for the optimized neutral and cation populations.

Parameters:
  • opt_n (cclib.parser.data.ccData_optdone) (cclib object for the optimized neutral population.)

  • cation (cclib.parser.data.ccData_optdone) (cclib object for the optimized or vertical cation.)

  • IP_type (str) (Type of ionization potential to calculate. Can be ‘adiabatic’ or ‘vertical’.)

Returns:

float

Return type:

The ionization potential (IP) calculated as the difference between the SCF energies of the optimized cation and neutral populations (eV).
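
The difference between the two IP_type options comes down to which cation energy is used: the vertical IP takes the cation at the neutral geometry, while the adiabatic IP takes the optimised cation. A minimal sketch with made-up SCF energies in eV follows; the function name is hypothetical and plain floats replace the cclib objects.

```python
def ip_sketch(e_neutral, e_cation):
    """IP as the cation energy minus the optimised neutral energy (eV)."""
    return e_cation - e_neutral


# Made-up SCF energies in eV (illustrative only).
e_opt_neutral = -1000.00
e_vertical_cation = -993.50   # cation single point at the neutral geometry
e_optimised_cation = -993.70  # cation at its own optimised geometry

vertical_ip = ip_sketch(e_opt_neutral, e_vertical_cation)
adiabatic_ip = ip_sketch(e_opt_neutral, e_optimised_cation)
print(round(vertical_ip, 6), round(adiabatic_ip, 6))
```

Because relaxing the cation geometry can only lower its energy, the adiabatic value is never larger than the vertical one.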

QCflow.energy_calculations.cal_EA(opt_n, anion, EA_type)[source]

Calculate the Electron Affinity (EA) given the optimized neutral population and optimized anion.

Parameters:
  • opt_n (cclib.io.ccread) (Parsed cclib object for neutral population optimization analysis.)

  • anion (cclib.io.ccread) (Parsed cclib object for optimized or vertical anion.)

  • EA_type (str) (Type of electron affinity to calculate. Can be ‘adiabatic’ or ‘vertical’.)

Returns:

float

Return type:

The calculated Electron Affinity (EA).

Notes

  • The function assumes that the SCF energies are available in the scfenergies attribute of the cclib objects.

  • The function uses the first SCF energy where the optimization status is 4 (indicating convergence).

QCflow.find_torsion

QCflow.find_torsion.getBond(mol)[source]

Finds the rotatable bonds in a given RDKit molecule.

Parameters:

mol (rdkit.Chem.Mol) (An RDKit molecule object.)

Returns:

tuple

Return type:

A tuple of tuples, where each inner tuple contains the indices of atoms that form a rotatable bond.

QCflow.find_torsion.getTorsion(mol, bond)[source]

Gets the torsion for the torsional scan.

Parameters:
  • mol (rdkit.Chem.Mol) (RDKit molecule object representing the oligomer.)

  • bond (tuple) (Tuple of two integers representing the indices of the rotatable bond.)

Returns:

tuple – The format is (first_atom, bond_atom1, bond_atom2, last_atom). ‘first_atom’ is the neighbor of ‘bond_atom1’ with the highest priority (N, S, O > C). ‘last_atom’ is the neighbor of ‘bond_atom2’ with the highest priority (N, S, O > C).

Return type:

A tuple of four integers representing the indices of the atoms involved in the torsion.

QCflow.find_torsion.embed_molecule(mol)[source]

Generates the 3D rdkit.Chem.Mol object of the given RDKit molecule and adds hydrogen atoms.

Parameters:

mol (rdkit.Chem.Mol) (An RDKit molecule object.)

Returns:

rdkit.Chem.Mol

Return type:

The RDKit molecule object with embedded 3D coordinates and added hydrogen atoms.

QCflow.torsion_parser

QCflow.torsion_parser.find_min_energy_index(data)[source]

Finds the index of the minimum energy from the torsional scan data.

Parameters:

data (cclib.parser.data.ccData_optdone_bool) (Loaded cclib data from the .log file containing the torsional information.)

Returns:

int

Return type:

The index corresponding to the minimum energy value in the torsional scan data.
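
Locating the minimum of a torsional scan amounts to an argmin over the scan energies. The sketch below uses plain lists with made-up values in place of parsed cclib data. The function name is hypothetical, and pairing the index with a list of scan angles is an assumption about how the result is used.

```python
def min_energy_index_sketch(scan_energies):
    """Return the index of the lowest energy in the scan."""
    return min(range(len(scan_energies)), key=scan_energies.__getitem__)


# Made-up scan: dihedral angles in degrees and relative energies in eV.
scan_angles = [0, 30, 60, 90, 120, 150, 180]
scan_energies = [0.12, 0.05, 0.21, 0.40, 0.19, 0.02, 0.09]

idx = min_energy_index_sketch(scan_energies)
best_angle = scan_angles[idx]  # angle at the minimum-energy scan point
print(idx, best_angle)
```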

QCflow.torsion_parser.min_angle(data)[source]

Finds the angle of the minimum energy torsion.

Parameters:

data (cclib.parser.data.ccData) (Loaded cclib data from the .log file containing the torsional information.)

Returns:

float

Return type:

The dihedral angle corresponding to the minimum energy torsion.

QCflow.torsion_parser.torsion_parser(mol_name, mol_smi)[source]

Parses the torsional profile of a molecule and returns the optimized geometry at the lowest energy minimum.

Parameters:
  • mol_name (str) (The name of the molecule (dimer or trimer).)

  • mol_smi (str) (SMILES string of the molecule.)

Returns:

  • rdkit.Chem.rdchem.Conformer (The optimized geometry at the lowest energy minimum.)

  • The function performs the following steps

  • 1. Converts the SMILES string into an RDKit molecule object.

  • 2. Adds hydrogen atoms to the molecule.

  • 3. Embeds the molecule in 3D space.

  • 4. Loads the torsional data from a .log file.

  • 5. Finds the minimum energy torsion.

  • 6. Sets the atom positions to the geometry corresponding to the minimum energy torsion.

QCflow.torsion_parser.setting_dihedral(mol_smile, deg)[source]

Sets the dihedral angle of the torsion for individual scans.

Parameters:
  • mol_smile (str) (SMILES string of the dimer.)

  • deg (float) (Desired dihedral angle (0 or 180 for planar).)

Returns:

  • rdkit.Chem.rdchem.Conformer (A conformer of the dimer with the specified dihedral angle.)

  • The function performs the following steps

  • 1. Converts the SMILES string into an RDKit molecule object.

  • 2. Embeds the molecule to get estimated 3D coordinates.

  • 3. Retrieves the conformer of the molecule.

  • 4. Identifies the bond and torsion angles between fragments.

  • 5. Sets the specified dihedral angle for the torsion.

  • 6. Applies MMFF force field and minimizes the energy of the molecule.

QCflow.torsion_parser.finding_dihedral_opt(mol_smiles, log_data)[source]

Calculates the dihedral angle of a molecule given its SMILES string and log data.

Parameters:
  • mol_smiles (str) (The SMILES string of the molecule.)

  • log_data (object) (An object containing log data, including converged geometries.)

Returns:

float

Return type:

The dihedral angle in degrees.

QCflow.torsion_parser.find_planarity(angle)[source]

Calculate the planarity of a given angle. This function computes the planarity of an angle by taking the absolute value of the cosine of the angle converted to radians.

Parameters:

angle (float) (The angle in degrees for which the planarity is to be calculated.)

Returns:

float

Return type:

The planarity value.
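
The definition above translates directly into one line of Python. The function name is a hypothetical stand-in. A value of 1 corresponds to a planar arrangement (0 or 180 degrees) and a value of 0 to a perpendicular one (90 degrees).

```python
import math


def planarity_sketch(angle):
    """Absolute cosine of an angle given in degrees."""
    return abs(math.cos(math.radians(angle)))


for angle in (0, 60, 90, 180):
    print(angle, round(planarity_sketch(angle), 6))
```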

QCflow.torsion_parser.getBondLinkers(mol, linker_type)[source]

From an RDKit molecule, finds the two atoms involved in the specified type of rotatable bond.

Parameters:
  • mol (rdkit.Chem.Mol) (RDKit molecule object.)

  • linker_type (str) (Type of linker to search for. Options are ‘single’, ‘double’, ‘imine’, ‘thio’ or ‘triple’.)

Returns:

tuple

Return type:

A tuple containing the indices of the atoms involved in the matching bonds.

QCflow.torsion_parser.getTorsion_one(mol, bond)[source]

Gets the first torsion of a multi torsion molecule. Works for triple, imine and double bonds.

Parameters:
  • mol (rdkit.Chem.Mol) (RDKit molecule object representing the oligomer.)

  • bond (tuple) (Tuple of atom indices representing the rotatable bond.)

Returns:

tuple

Return type:

A tuple containing the indices of the four atoms defining the torsion angle.

QCflow.torsion_parser.getTorsion_two(mol, bond)[source]

Gets the second torsion of a multi torsion molecule. Works for triple, imine and double bonds.

Parameters:
  • mol (rdkit.Chem.Mol) (RDKit molecule object representing the oligomer.)

  • bond (tuple) (Tuple of atom indices representing the rotatable bond.)

Returns:

tuple

Return type:

A tuple containing the indices of the four atoms defining the torsion angle.

QCflow.torsion_parser.finding_multi_planairty(mol_name, mol_smiles, linker_type)[source]

Determines the average planarity of a molecule based on its torsion angles.

Parameters:
  • mol_name (str) (The name of the molecule, used to locate the optimization log file.)

  • mol_smiles (str) (The SMILES representation of the molecule.)

  • linker_type (str) (The type of linker in the molecule. Can be ‘thio’, ‘triple’, ‘double’, or ‘imine’.)

Returns:

float

Return type:

The average planarity of the molecule.

QCflow.torsion_parser.update_conformer_from_xyz(mol, xyz_file)[source]

Updates the conformer of an RDKit molecule using coordinates from an XYZ file.

Parameters:
  • mol (rdkit.Chem.Mol) (The RDKit molecule.)

  • xyz_file (str) (Path to the XYZ file containing new coordinates.)

Returns:

rdkit.Chem.Mol

Return type:

The molecule with updated conformer.

QCflow.torsion_parser.finding_planairty_psi4(mol_name, mol_smiles, linker_type, job_name)[source]

Determines the average planarity of a molecule based on its torsion angles.

Parameters:
  • mol_name (str) (The name of the molecule, used to locate the optimization xyz file.)

  • mol_smiles (str) (The SMILES representation of the molecule.)

  • linker_type (str) (The type of linker in the molecule. Can be ‘single’, ‘thio’, ‘triple’, ‘double’, or ‘imine’.)

  • job_name (str) (The type of job to run. Possible values:) –

    • ‘sp’: Single Point neutral

    • ‘opt’: Optimisation neutral

Returns:

float

Return type:

The average planarity of the molecule.

QCflow.fragments

QCflow.fragments.adding_attach(smi, find='[cH;^2]', get_rid='C([I])')[source]

Adds an attachment point to a given fragment.

Parameters:
  • smi (str) (The SMILES string of the fragment.)

  • find (str) (The SMARTS pattern of the possible attachment point. Default is ‘[cH;^2]’.)

  • get_rid (str) (The replacement pattern for the attachment point. Default is ‘C([I])’.)

Returns:

list

Return type:

A list of unique SMILES strings representing the fragment with the added attachment point.

QCflow.fragments.generate_attachment_points(frag_dic, find='[cH;^2]', get_rid='C([I])')[source]

This function takes a dictionary of fragments and generates attachment points for each fragment. The attachment points are determined by using the ‘adding_attach’ function. The attachment points are represented as alphabetical characters (‘A’, ‘B’, …) and are appended to the identifiers of the fragments. The resulting attachment points dictionary is returned.

Parameters:
  • frag_dic (dict) (A dictionary containing fragment information with keys as identifiers and values as SMILES strings.)

  • find (str) (The SMARTS pattern of the possible attachment point. Default is ‘[cH;^2]’.)

  • get_rid (str) (The replacement pattern for the attachment point. Default is ‘C([I])’.)

Returns:

dict

Return type:

A dictionary containing attachment points. Keys are composed of identifiers followed by alphabetical characters (‘A’, ‘B’, …) representing different attachment points for each fragment. Values are the corresponding attachment points obtained from the ‘adding_attach’ function. Each value is an RDKit mol object.
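
The key naming scheme described above (fragment identifier followed by a letter) can be sketched as follows. The function name is hypothetical. The attachment-point finder is passed in as a callable, and the fake finder used here returns placeholder strings rather than RDKit mol objects, so the example runs without RDKit. Because zip stops at the shorter input, this sketch would silently drop any attachment points beyond the 26th.

```python
import string


def label_attachment_points_sketch(frag_dic, find_points):
    """Build {identifier + letter: attachment point} for every fragment."""
    labelled = {}
    for identifier, smiles in frag_dic.items():
        # Pair each attachment point with 'A', 'B', 'C', ... in order.
        for letter, point in zip(string.ascii_uppercase, find_points(smiles)):
            labelled[f"{identifier}{letter}"] = point
    return labelled


def fake_find_points(smiles):
    """Stand-in for adding_attach: two placeholder attachment points."""
    return [f"{smiles}|site{i}" for i in range(2)]


labels = label_attachment_points_sketch({"0": "c1ccsc1", "1": "c1ccoc1"}, fake_find_points)
print(list(labels))
```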

QCflow.fragments.combine_structure(molecule_a, molecule_b, attachment_point='I')[source]

Combines two molecules together at a specified attachment point.

Parameters:
  • molecule_a (object) (The first molecule to be combined.)

  • molecule_b (object) (The second molecule to be combined.)

  • attachment_point (str) (The atom label at which the two molecules will be combined. Default is ‘I’.)

Returns:

object

Return type:

The combined molecule.

QCflow.fragments.make_molecule_dic_from_2_dic(fragment_dic_1, fragment_dic_2)[source]

Generates a dictionary of molecules by combining two different fragment dictionaries.

Parameters:
  • fragment_dic_1 (dict) (The first fragment dictionary.)

  • fragment_dic_2 (dict) (The second fragment dictionary.)

Returns:

dict

Return type:

A dictionary of dimers, where the key is the name of the molecule and the value is the SMILES string.

QCflow.sa_score

QCflow.sa_score.sa_scorer(smile)[source]

Calculates the synthetic accessibility (SA) score of a molecule. The paper describing the method can be found here: https://doi.org/10.1186/1758-2946-1-8

Parameters:

smile (str) (The SMILES string of the molecule to be scored)

Returns:

sa_score_val (float)

Return type:

The SA score of the molecule