pyCADD.Dock.schrodinger package

Submodules

pyCADD.Dock.schrodinger.api module

class pyCADD.Dock.schrodinger.api.DockControl(pdbid: str = None, save_path: str = None)[source]

Bases: object

Ligand Dock Control Class

__init__(pdbid: str = None, save_path: str = None) None[source]

Ligand Dock Control Class

Parameters:
  • pdbid (str, optional) – PDB ID from RCSB. Defaults to None.

  • save_path (str, optional) – directory to save the result files. Defaults to None.

Raises:

ValueError – either protein_file or pdbid must be provided

split(structure_file: str = None, ligand_resname: str = None, ligand_atom_indexes: list = None, ligand_asl: str = None) tuple[str, str][source]

Split structure to protein and ligand.

Parameters:
  • structure_file (str, optional) – structure file. Defaults to None.

  • ligand_resname (str, optional) – residue name of the ligand to be split out. Defaults to None.

  • ligand_atom_indexes (list, optional) – atom indexes to define the ligand molecule. Defaults to None.

  • ligand_asl (str, optional) – ASL to define the ligand molecule. Defaults to None.

Returns:

protein_file_path, ligand_file_path

Return type:

tuple[str, str]

minimize(structure_file: str | MaestroFile = None, ph: float = 7.4, force_field: Literal['OPLS4', 'OPLS3e', 'OPLS3', 'OPLS_2005'] = 'OPLS4', fill_side_chain: bool = True, add_missing_loop: bool = True, del_water: bool = True, watdist: float = 5.0, rmsd_cutoff: float = 0.3, overwrite: bool = False) MaestroFile[source]

Minimize the protein structure.

Parameters:
  • structure_file (MaestroFile | str, optional) – structure to minimize. If not specified, use the initial protein structure. Defaults to None.

  • ph (float, optional) – pH value to calculate protonation states. Defaults to 7.4.

  • force_field (str) – force field to use. Defaults to ‘OPLS4’.

  • fill_side_chain (bool, optional) – whether to fill side chain. Defaults to True.

  • add_missing_loop (bool, optional) – whether to add missing loop. Defaults to True.

  • del_water (bool, optional) – whether to delete water molecules. Defaults to True.

  • watdist (float, optional) – how far from the ligand to delete water molecules. Set to 0.0 to delete all water molecules. Defaults to 5.0.

  • rmsd_cutoff (float, optional) – RMSD cutoff for minimization. Defaults to 0.3.

  • overwrite (bool, optional) – whether to overwrite existing result files. Defaults to False.

Returns:

Minimized structure file.

Return type:

MaestroFile

grid_generate(structure_file: str | MaestroFile = None, box_center: tuple[float, float, float] = None, box_center_molnum: int = None, box_center_resname: str = None, box_size: int = 20, force_field: Literal['OPLS4', 'OPLS3e', 'OPLS3', 'OPLS_2005'] = 'OPLS4', overwrite: bool = False) GridFile[source]

Generate grid file for docking.

Parameters:
  • structure_file (MaestroFile | str) – structure to generate grid file. If not specified, use the minimized structure.

  • box_center (tuple[float, float, float], optional) – center XYZ of the grid box. Defaults to None.

  • box_center_molnum (int, optional) – the molecule number of the molecule that is set as the center. This molecule will be removed during the grid box generation process, and its centroid will be used as the center of the box. Will be ignored when box_center is set. Defaults to None.

  • box_center_resname (str, optional) – the residue name of the molecule that is set as the center. This molecule will be removed during the grid box generation process, and its centroid will be used as the center of the box. Will be ignored when box_center is set. Defaults to None.

  • box_size (int, optional) – box size of grid. Defaults to 20.

  • force_field (str, optional) – force field to use. Defaults to ‘OPLS4’.

  • overwrite (bool, optional) – whether to overwrite existing files. Defaults to False.

Returns:

generated grid file.

Return type:

GridFile
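
The precedence rule for the box center described above can be sketched as follows. This is a minimal illustration, not pyCADD's implementation: the helper name resolve_box_center is hypothetical, a plain list of atom coordinates stands in for the molecule selected by box_center_molnum or box_center_resname, and the centroid is assumed to be the unweighted mean of those coordinates.

```python
from statistics import fmean


def resolve_box_center(box_center=None, molecule_coords=None):
    """Hypothetical helper: choose the grid box center.

    box_center: explicit (x, y, z); takes precedence when given.
    molecule_coords: list of (x, y, z) atom coordinates of the molecule
        selected by box_center_molnum or box_center_resname.
    """
    if box_center is not None:
        # An explicit center overrides any molecule-based selection.
        return tuple(float(v) for v in box_center)
    if not molecule_coords:
        raise ValueError("either box_center or a center molecule is required")
    # Assumption: centroid = unweighted mean of the atom coordinates.
    xs, ys, zs = zip(*molecule_coords)
    return (fmean(xs), fmean(ys), fmean(zs))
```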

dock(ligand_file: str | MaestroFile, grid_file: GridFile = None, force_field: Literal['OPLS4', 'OPLS3e', 'OPLS3', 'OPLS_2005'] = 'OPLS4', precision: Literal['SP', 'XP', 'HTVS'] = 'SP', calc_rmsd: bool = False, include_receptor: bool = False, overwrite: bool = False) DockResultFile[source]

Perform molecule docking.

Parameters:
  • ligand_file (MaestroFile | str) – Ligand file object or path.

  • grid_file (GridFile | str) – grid file object or path. If not specified, use the grid file generated before.

  • force_field (str, optional) – force field to use. Defaults to ‘OPLS4’.

  • precision (str, optional) – docking precision. Defaults to ‘SP’.

  • calc_rmsd (bool, optional) – Whether to calculate RMSD with co-crystal ligand. If True, grid file must be generated from a complex. Defaults to False.

  • include_receptor (bool, optional) – Whether to include receptor structure in the output file. Defaults to False.

  • overwrite (bool, optional) – whether to overwrite existing files. Defaults to False.

Returns:

Docking result file.

Return type:

DockResultFile
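
The DockControl methods above are typically chained as minimize, grid_generate, dock. The sketch below shows one possible ordering using only the documented signatures. The function name run_single_docking is hypothetical, and the class is passed in as an argument (expected to be pyCADD.Dock.schrodinger.api.DockControl) so the sketch can be read without a Schrodinger installation. Whether additional setup, such as calling split first, is required is not stated in this reference.

```python
def run_single_docking(dock_control_cls, pdbid, ligand_file, ligand_resname,
                       save_path=None):
    """Chain the documented DockControl methods for a single receptor.

    dock_control_cls is expected to be
    pyCADD.Dock.schrodinger.api.DockControl (assumption: injected here
    only to keep the sketch free of a hard import).
    """
    control = dock_control_cls(pdbid=pdbid, save_path=save_path)
    # Prepare and minimize the initial protein structure (documented defaults).
    minimized = control.minimize(ph=7.4, force_field="OPLS4")
    # Center the grid box on the molecule with the given residue name.
    grid = control.grid_generate(
        structure_file=minimized,
        box_center_resname=ligand_resname,
        box_size=20,
    )
    # Dock the ligand with standard precision.
    return control.dock(ligand_file=ligand_file, grid_file=grid, precision="SP")
```

With pyCADD installed, the call would look like run_single_docking(DockControl, '1ABC', 'ligand.mae', 'LIG'), where the PDB ID, file name and residue name are placeholders.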

class pyCADD.Dock.schrodinger.api.DockEnsemble(input_file: str | EnsembleInputFile, save_path: str = None, cpu_num: int = None)[source]

Bases: object

Docking Control for Ensemble Docking

__init__(input_file: str | EnsembleInputFile, save_path: str = None, cpu_num: int = None) None[source]

Docking Control for Ensemble Docking

Parameters:
  • input_file (str | EnsembleInputFile) – Input file object or path for ensemble docking.

  • save_path (str, optional) – directory to save the result files. Defaults to None.

  • cpu_num (int, optional) – number of CPU cores to use in parallel. Defaults to 3/4 of available cores.
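
A default of three quarters of the available cores could be computed as in the sketch below. The helper name default_cpu_num is hypothetical, and the rounding (round down, never below one) is an assumption; this reference does not state how pyCADD rounds.

```python
import os


def default_cpu_num(fraction=0.75):
    """Hypothetical helper: worker count to use when cpu_num is None."""
    total = os.cpu_count() or 1  # os.cpu_count() may return None
    # Assumption: round down, but always keep at least one worker.
    return max(1, int(total * fraction))
```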

load_library(ligand_file: str | MaestroFile, overwrite: bool = False) list[MaestroFile][source]

Load the compound/ligand library and split them into multiple single structure files for ensemble docking.

Parameters:
  • ligand_file (str | MaestroFile) – library file object or path.

  • overwrite (bool, optional) – Whether to overwrite existing files. Defaults to False.

Returns:

List of split ligand files

Return type:

list[MaestroFile]

keep_single_chain(overwrite: bool = False)[source]

Keep the single chain in the structure file for ensemble docking.

Parameters:

overwrite (bool, optional) – Whether to overwrite existing files. Defaults to False.

minimize(ph: float = 7.4, force_field: Literal['OPLS4', 'OPLS3e', 'OPLS3', 'OPLS_2005'] = 'OPLS4', fill_side_chain: bool = True, add_missing_loop: bool = True, del_water: bool = True, watdist: float = 5.0, rmsd_cutoff: float = 0.3, overwrite: bool = False) list[MaestroFile][source]

Minimize the structure file for ensemble docking.

Parameters:
  • ph (float, optional) – pH value for the structure file. Defaults to 7.4.

  • force_field (str, optional) – Force field for the structure file. Defaults to “OPLS4”.

  • fill_side_chain (bool, optional) – Whether to fill side chain. Defaults to True.

  • add_missing_loop (bool, optional) – Whether to add missing loop. Defaults to True.

  • del_water (bool, optional) – Whether to delete water. Defaults to True.

  • watdist (float, optional) – Water distance cutoff. Defaults to 5.0.

  • rmsd_cutoff (float, optional) – RMSD cutoff. Defaults to 0.3.

  • overwrite (bool, optional) – Whether to overwrite existing files. Defaults to False.

Returns:

Minimized structure files list

Return type:

list[MaestroFile]

grid_generate(box_size: int = 20, force_field: Literal['OPLS4', 'OPLS3e', 'OPLS3', 'OPLS_2005'] = 'OPLS4', overwrite: bool = False) list[GridFile][source]

Generate grid files for docking.

Parameters:
  • box_size (int, optional) – Box size of the grid. Defaults to 20.

  • force_field (str, optional) – Force field for the grid. Defaults to ‘OPLS4’.

  • overwrite (bool, optional) – Whether to overwrite existing files. Defaults to False.

Raises:

ValueError – No minimized structure files found.

Returns:

Grid files list

Return type:

list[GridFile]

dock(retrospective: bool = False, force_field: Literal['OPLS4', 'OPLS3e', 'OPLS3', 'OPLS_2005'] = 'OPLS4', precision: Literal['SP', 'XP', 'HTVS'] = 'SP', calc_rmsd: bool = False, include_receptor: bool = False, overwrite: bool = False) list[dict][source]

Perform ensemble docking.

Parameters:
  • retrospective (bool, optional) – Whether to add cocrystal molecules to ligands during ensemble docking. Defaults to False.

  • force_field (str, optional) – Force field for the docking. Defaults to ‘OPLS4’.

  • precision (str, optional) – Docking precision, one of ‘SP’, ‘XP’ or ‘HTVS’. Defaults to ‘SP’.

  • calc_rmsd (bool, optional) – Whether to calculate RMSD. Defaults to False.

  • include_receptor (bool, optional) – Whether to include receptor in the docking. Defaults to False.

  • overwrite (bool, optional) – Whether to overwrite existing files. Defaults to False.

Raises:

ValueError – No grid files found.

Returns:

Docking results

Return type:

list[dict]

extract_data() list[dict][source]

Extract docking result data from ensemble docking results.

Returns:

Docking results

Return type:

list[dict]
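
The DockEnsemble methods above depend on one another: grid_generate raises ValueError when no minimized structures exist, and dock raises ValueError when no grid files exist. A minimal sketch of one workable call order follows. The function name run_ensemble_docking is hypothetical, the class is passed in (expected to be pyCADD.Dock.schrodinger.api.DockEnsemble), and running load_library and keep_single_chain before minimize is an assumption rather than a documented requirement.

```python
def run_ensemble_docking(dock_ensemble_cls, input_file, library_file,
                         save_path=None, cpu_num=None):
    """Run the documented DockEnsemble steps in dependency order.

    dock_ensemble_cls is expected to be
    pyCADD.Dock.schrodinger.api.DockEnsemble (injected to avoid a hard import).
    """
    ensemble = dock_ensemble_cls(input_file, save_path=save_path, cpu_num=cpu_num)
    # Assumption: the library is loaded and chains are trimmed first.
    ensemble.load_library(library_file)
    ensemble.keep_single_chain()
    # grid_generate needs minimized structures; dock needs grid files.
    ensemble.minimize()
    ensemble.grid_generate(box_size=20)
    ensemble.dock(precision="SP")
    # Collect the per-pose result dictionaries.
    return ensemble.extract_data()
```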

pyCADD.Dock.schrodinger.common module

class pyCADD.Dock.schrodinger.common.MetaData(pdbid: str = '', ligand_name: str = '', action: str = '', docking_ligand_name: str = '', precision: str = '')[source]

Bases: object

Data class for metadata

pdbid: str = ''
ligand_name: str = ''
action: str = ''
docking_ligand_name: str = ''
precision: str = ''
property internal_ligand_name: str
classmethod parse_from_filename(file_name: str, sep: str = '_', parse_dict: dict = None) MetaData[source]

Parse metadata from the filename

Parameters:
  • file_name (str) – file name or path

  • sep (str, optional) – separator. Defaults to ‘_’.

  • parse_dict (dict, optional) – mapping from each metadata attribute to its position index in the file name after splitting on sep. Defaults to {‘pdbid’: 0, ‘ligand_name’: 1, ‘action’: 2, ‘docking_ligand_name’: 3, ‘precision’: 4}.

Returns:

metadata object

Return type:

MetaData
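
Position-based parsing of this kind can be sketched in plain Python. The following is an illustration only, not the library's code: MetaDataSketch is a hypothetical stand-in for MetaData, and both stripping the directory and extension and leaving missing positions at their defaults are assumptions.

```python
import os
from dataclasses import dataclass

DEFAULT_PARSE_DICT = {
    "pdbid": 0, "ligand_name": 1, "action": 2,
    "docking_ligand_name": 3, "precision": 4,
}


@dataclass
class MetaDataSketch:
    """Hypothetical stand-in for MetaData with the same five fields."""
    pdbid: str = ""
    ligand_name: str = ""
    action: str = ""
    docking_ligand_name: str = ""
    precision: str = ""


def parse_from_filename(file_name, sep="_", parse_dict=None):
    parse_dict = DEFAULT_PARSE_DICT if parse_dict is None else parse_dict
    # Assumption: drop the directory and the last extension before splitting.
    stem = os.path.splitext(os.path.basename(file_name))[0]
    fields = stem.split(sep)
    # Assumption: positions beyond the available fields keep their defaults.
    values = {attr: fields[i] for attr, i in parse_dict.items() if i < len(fields)}
    return MetaDataSketch(**values)
```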

classmethod parse_from_dict(metadata_dict: dict) MetaData[source]

Parse metadata from the dictionary

Parameters:

metadata_dict (dict) – metadata dictionary

Returns:

metadata object

Return type:

MetaData

classmethod parse_from_metadata(metadata: MetaData) MetaData[source]

Parse metadata from the metadata object

Parameters:

metadata (MetaData) – metadata object

Returns:

metadata object

Return type:

MetaData

generate_file_name(attributes: list = ['pdbid', 'ligand_name', 'action', 'docking_ligand_name', 'precision'], sep='_') str[source]

Generate file name from metadata

Parameters:
  • attributes (list, optional) – attributes to be included in the file name. Defaults to [‘pdbid’, ‘ligand_name’, ‘action’, ‘docking_ligand_name’, ‘precision’].

  • sep (str, optional) – separator. Defaults to ‘_’.

Returns:

file name without extension

Return type:

str

Example

>>> metadata = MetaData(pdbid='1ABC', ligand_name='LIG', action='glide-dock', docking_ligand_name='LIG', precision='SP')
>>> metadata.generate_file_name()
'1ABC_LIG_glide-dock_LIG_SP'
>>> metadata.generate_file_name(attributes=['pdbid', 'ligand_name', 'action'], sep='-')
'1ABC-LIG-glide-dock'
copy() MetaData[source]

Get a deep copy of the metadata

Returns:

metadata object

Return type:

MetaData

set(attr, value) None[source]

Set the attribute value

Parameters:
  • attr (str) – attribute name

  • value (any) – attribute value

class NoDefault[source]

Bases: object

get(attr, default=<class 'pyCADD.Dock.schrodinger.common.MetaData.NoDefault'>)[source]

Get the attribute value

Parameters:
  • attr (str) – attribute name

  • default (any) – default value that will be returned if the attribute is not found; without it, an AttributeError will be raised in this case.

Returns:

attribute value. If the attribute is not found, return provided default value.

Return type:

Any
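
The NoDefault class in the signature acts as a sentinel: it lets get distinguish "no default supplied" from an explicit default of None. A minimal sketch of that pattern follows; the Record class is a hypothetical stand-in for MetaData, and the body is an assumption based on the documented behavior.

```python
class NoDefault:
    """Sentinel type meaning 'the caller supplied no default'."""


class Record:
    """Hypothetical stand-in for an attribute container such as MetaData."""

    def __init__(self, **attrs):
        self.__dict__.update(attrs)

    def get(self, attr, default=NoDefault):
        try:
            return getattr(self, attr)
        except AttributeError:
            if default is NoDefault:
                # No default given: propagate the AttributeError.
                raise
            return default
```

Because the sentinel is a dedicated class rather than None, a caller can still pass default=None and get None back for a missing attribute.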

delete(attr) None[source]

Delete the attribute

Parameters:

attr (str) – attribute name

__init__(pdbid: str = '', ligand_name: str = '', action: str = '', docking_ligand_name: str = '', precision: str = '') None
class pyCADD.Dock.schrodinger.common.BaseMaestroFile(path: str, metadata: MetaData = None, **kwargs)[source]

Bases: File

__init__(path: str, metadata: MetaData = None, **kwargs) None[source]

Base Maestro file class

Parameters:
  • path (str) – file path

  • metadata (MetaData, optional) – metadata object. Defaults to None.

class pyCADD.Dock.schrodinger.common.GridFile(path: str, metadata: MetaData = None, **kwargs)[source]

Bases: BaseMaestroFile

__init__(path: str, metadata: MetaData = None, **kwargs) None[source]

Maestro grid file class

Parameters:
  • path (str) – file path

  • metadata (MetaData, optional) – metadata object. Defaults to None.

class pyCADD.Dock.schrodinger.common.LigandSearched(ligand_obj: Ligand)[source]

Bases: Ligand

__init__(ligand_obj: Ligand)[source]
Parameters:
  • st (Structure) – Original complex structure.

  • st – Ligand structure.

  • mol_num (int) – Molecular index identifier. Typically, the mol.n from the original structure from whence this ligand structure was derived. Note, depending on the nature of the ligand and the treatment of the original structure this mol.n index may not be valid.

  • atom_indexes (list) – Atom index identifiers. Typically, the at.n from the original structure from whence this ligand structure was derived.

  • lig_asl (str) – ASL identifier. Typically, the expression is defined in terms of the original structure from whence this ligand structure was derived.

Deprecated is_covalently_bound:

Whether this ligand is bonded to other atoms (including zero-order bonds). Will be False if the ligand spans a whole molecule.

property pdbres: str

PDB residue name identifier. If the ligand is composed of multiple residues then the names are joined with a ‘-’ separator.

property chain: str
property resnum: int
property molnum: int
class pyCADD.Dock.schrodinger.common.MaestroFile(path: str, metadata: MetaData = None, **kwargs)[source]

Bases: BaseMaestroFile

__init__(path: str, metadata: MetaData = None, **kwargs) None[source]

Maestro file class

Parameters:
  • path (str) – file path

  • metadata (MetaData, optional) – metadata object. Defaults to None.

property st_reader: StructureReader
property structures: list[Structure]
static get_structure(file_path: str, index: int = 0) Structure[source]

Get the structure from the file

Parameters:
  • file_path (str) – maestro file path

  • index (int, optional) – index of the structure to get. Defaults to 0.

Returns:

Maestro structure object

Return type:

Structure

get_chain_structure(chain_id: str, structure_index: int = 0) Structure[source]

Get the chain structure from the structure file

Parameters:
  • chain_id (str) – chain ID

  • structure_index (int) – index of the structure

Returns:

Maestro structure object

Return type:

Structure

get_residue(resnum: int, structure_index: int = 0) Residue[source]

Get the Residue object from the structure

Parameters:
  • resnum (int) – query molecule residue number

  • structure_index (int) – index of the structure

Returns:

Maestro Residue object

Return type:

struc.Residue

get_molecule_by_res(resnum: int = None, structure_index: int = 0) _Molecule[source]

Get the Molecule object from the structure by residue number.

Parameters:
  • resnum (int) – query molecule residue number

  • structure_index (int) – index of the structure

Returns:

Maestro Molecule object

Return type:

struc._Molecule

get_covalent_bond(resnum: int, structure_index: int = 0) list[source]

Get list of covalent bond(s) between query molecule and other residues. Return None if no covalent bond is found.

Parameters:
  • resnum (int) – query residue number of the molecule

  • structure_index (int) – index of the structure

Returns:

list of covalent bond(s) between query molecule and other residues

Return type:

list

get_molnum_by_res(mol_resnum: int, structure_index: int = 0) int[source]

Get the molecule number from the structure by residue number.

Parameters:
  • mol_resnum (int) – query residue number.

  • structure_index (int) – index of the structure

Returns:

Molecule number

Return type:

int

find_ligands(specified_names: list = None, included_names: list = None, excluded_names: list = None, min_heavy_atom_count: int = None, max_atom_count: int = None, allow_amino_acid_only_molecules: bool = False, allow_ion_only_molecules: bool = False, structure_index: int = 0) list[LigandSearched][source]

Find all ligands from the structure file

Parameters:
  • specified_names (list, optional) – PDB residue names which will be considered as ligands. Any found residues not in this list will be filtered out if this list is not None. Defaults to None.

  • included_names (list, optional) – PDB residue names which will always be considered as ligands. Defaults to None.

  • excluded_names (list, optional) – PDB residue names which will never be considered as ligands. Defaults to None.

  • min_heavy_atom_count (int, optional) – Minimum number of heavy atoms required in each ligand molecule. Defaults to None.

  • max_atom_count (int, optional) – Maximum number of heavy atoms for a ligand molecule (does not include hydrogens). Defaults to None.

  • allow_amino_acid_only_molecules (bool, optional) – If True, consider small molecules containing only amino acids to be ligands. Defaults to False.

  • allow_ion_only_molecules (bool, optional) – If True, consider charged molecules to be ligands. Defaults to False.

  • structure_index (int) – Index of the structure

Returns:

List of found Ligand objects.

Useful properties: pdbres, chain, resnum, mol_num, centroid, st, atom_indexes

Return type:

list[LigandSearched]

class pyCADD.Dock.schrodinger.common.DockResultFile(path: str, metadata: MetaData = None, include_receptor: bool = None, **kwargs)[source]

Bases: MaestroFile

__init__(path: str, metadata: MetaData = None, include_receptor: bool = None, **kwargs) None[source]

Docking result file class

Parameters:
  • path (str) – file path

  • metadata (MetaData, optional) – metadata object. Defaults to None.

  • include_receptor (bool) – True if the file contains receptor structure.

get_receptor_structure() Structure[source]

Get the receptor structure from the docking result file

Returns:

Maestro structure object

Return type:

Structure

get_ligand_structure() Structure[source]

Get the docking ligand structure from the docking result file

Returns:

Maestro structure object

Return type:

Structure

get_raw_results() list[dict][source]

Get the raw docking result information of all structures

Returns:

raw docking result information

Return type:

list[dict]

get_results() list[dict][source]

Get the docking result information of all structures

Returns:

docking result information

Return type:

list[dict]

pyCADD.Dock.schrodinger.config module

class pyCADD.Dock.schrodinger.config.DefaultDataConfig[source]

Bases: object

__init__() None[source]
items()[source]
keys()[source]
values()[source]
get(key)[source]
class pyCADD.Dock.schrodinger.config.SPConfig[source]

Bases: DefaultDataConfig

__init__() None[source]
class pyCADD.Dock.schrodinger.config.XPConfig[source]

Bases: DefaultDataConfig

__init__() None[source]
class pyCADD.Dock.schrodinger.config.DataConfig(precision: str = None, properties: dict = None)[source]

Bases: DefaultDataConfig

__init__(precision: str = None, properties: dict = None) None[source]

pyCADD.Dock.schrodinger.core module

pyCADD.Dock.schrodinger.core.split_complex(file: MaestroFile | str, ligand_atom_indexes: list = None, ligand_asl: str = None, structure_index: int = 0) tuple[Structure, Structure][source]

Split the complex structure into receptor and ligand.

Parameters:
  • file (MaestroFile | str) – Complex structure file object or path.

  • ligand_atom_indexes (list, optional) – List of ligand atom indexes. Defaults to None.

  • ligand_asl (str, optional) – ASL to identify ligand. Defaults to None.

  • structure_index (int, optional) – Index of the structure to split. Defaults to 0.

Raises:
  • ValueError – Ligand_atom_indexes and ligand_asl cannot be both None.

  • ValueError – Atom indexes or ASL identified multiple molecules or no atom.

Returns:

Receptor and Ligand structures.

Return type:

tuple[Structure, Structure]

pyCADD.Dock.schrodinger.core.minimize(file: MaestroFile | str, ph: float = 7.4, force_field: Literal['OPLS4', 'OPLS3e', 'OPLS3', 'OPLS_2005'] = 'OPLS4', fill_side_chain: bool = True, add_missing_loop: bool = True, del_water: bool = True, watdist: float = 5.0, rmsd_cutoff: float = 0.3, save_dir: str = None, overwrite: bool = False) MaestroFile[source]

Prepare the structure and run minimization using Schrodinger’s prepwizard.

Parameters:
  • file (MaestroFile | str) – file path or MaestroFile object to be prepared and minimized.

  • ph (float, optional) – pH value to calculate protonation states. Defaults to 7.4.

  • force_field (str) – force field to use. Defaults to ‘OPLS4’.

  • fill_side_chain (bool, optional) – whether to fill side chain. Defaults to True.

  • add_missing_loop (bool, optional) – whether to add missing loop. Defaults to True.

  • del_water (bool, optional) – whether to delete water molecules. Defaults to True.

  • watdist (float, optional) – how far from the ligand to delete water molecules. Set to 0.0 to delete all water molecules. Defaults to 5.0.

  • rmsd_cutoff (float, optional) – RMSD cutoff for minimization. Defaults to 0.3.

  • save_dir (str, optional) – directory to save the results. Defaults to None.

  • overwrite (bool, optional) – whether to overwrite existing result files. Defaults to False.

Raises:

RuntimeError – raise if the minimization process failed.

Returns:

minimized structure file

Return type:

MaestroFile

pyCADD.Dock.schrodinger.core.grid_generate(file: MaestroFile | str, box_center: tuple[float, float, float] = None, box_center_molnum: int = None, box_size: int = 20, force_field: Literal['OPLS4', 'OPLS3e', 'OPLS3', 'OPLS_2005'] = 'OPLS4', save_dir: str = None, overwrite: bool = False) GridFile[source]

Generate grid file for Glide docking.

Parameters:
  • file (MaestroFile | str) – structure to generate grid file.

  • box_center (tuple[float, float, float], optional) – center XYZ of the grid box. Defaults to None.

  • box_center_molnum (int, optional) – the molecule number of the molecule that is set as the center. This molecule will be removed during the grid box generation process, and its centroid will be used as the center of the box. Will be ignored when box_center is set. Defaults to None.

  • box_size (int, optional) – box size of grid. Defaults to 20.

  • force_field (str, optional) – force field to use. Defaults to ‘OPLS4’.

  • save_dir (str, optional) – directory to save the results. Defaults to None.

  • overwrite (bool, optional) – whether to overwrite existing files. Defaults to False.

Raises:

RuntimeError – raise if the grid generation process failed.

Returns:

grid file

Return type:

GridFile

pyCADD.Dock.schrodinger.core.dock(grid_file: GridFile | str, ligand_file: MaestroFile | str, force_field: Literal['OPLS4', 'OPLS3e', 'OPLS3', 'OPLS_2005'] = 'OPLS4', precision: Literal['SP', 'XP', 'HTVS'] = 'SP', calc_rmsd: bool = False, include_receptor: bool = False, save_dir: str = None, overwrite: bool = False) DockResultFile[source]

Perform Glide ligand docking

Parameters:
  • grid_file (GridFile | str) – grid file object or path.

  • ligand_file (MaestroFile | str) – Ligand file object or path.

  • force_field (str, optional) – force field to use. Defaults to ‘OPLS4’.

  • precision (str, optional) – docking precision. Defaults to ‘SP’.

  • calc_rmsd (bool, optional) – Whether to calculate RMSD with co-crystal ligand. If True, grid file must be generated from a complex. Defaults to False.

  • include_receptor (bool, optional) – Whether to include receptor structure in the output file. Defaults to False.

  • save_dir (str, optional) – directory to save the results. Results are saved in a subdirectory named after the PDB ID; if no PDB ID is available, save_dir is used directly. Defaults to current working directory.

  • overwrite (bool, optional) – whether to overwrite existing files. Defaults to False.

Raises:

RuntimeError – raise if the docking process failed.

Returns:

docking result file.

Return type:

DockResultFile

pyCADD.Dock.schrodinger.core.keep_single_chain(structure_file: MaestroFile | str, by_resname: str = None, chain_id: str = None, save_dir: str = None, overwrite: bool = False) MaestroFile[source]

Keep single chain from a structure file.

Parameters:
  • structure_file (MaestroFile | str) – Structure file object or path.

  • by_resname (str, optional) – Residue or ligand name from the chain to be kept. Defaults to None.

  • chain_id (str, optional) – Chain ID to be kept. Defaults to None.

  • save_dir (str, optional) – Directory to save the results. Defaults to None.

  • overwrite (bool, optional) – Whether to overwrite existing files. Defaults to False.

Raises:

ValueError – Either chain_id or by_resname should be specified.

Returns:

The structure file with single chain.

Return type:

MaestroFile

pyCADD.Dock.schrodinger.data module

pyCADD.Dock.schrodinger.data.extract_docking_data(docking_result_file: DockResultFile | str, data_config: DataConfig = None) list[dict][source]

Extract docking data from a docking result file.

Parameters:
  • docking_result_file (DockResultFile | str) – docking result file or path to the file.

  • data_config (DataConfig, optional) – data extracting config. By default, the configuration corresponding to the docking precision is used.

Returns:

docking result data list.

Return type:

list[dict]

pyCADD.Dock.schrodinger.data.save_docking_data(docking_result_file: DockResultFile | str, data_config: DataConfig = None, save_dir: str = None, overwrite: bool = False) None[source]

Save docking data to a CSV file. The file name is the same as the docking result file.

Parameters:
  • docking_result_file (DockResultFile | str) – docking result file or path to the file.

  • data_config (DataConfig, optional) – data extracting config. By default, the configuration corresponding to the docking precision is used.

  • save_dir (str, optional) – directory to save the data. Defaults to None.

  • overwrite (bool, optional) – whether to overwrite the existing file. Defaults to False.

Raises:

FileExistsError – raise if the output file already exists.
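
Since extract_docking_data returns a list of dictionaries, writing it to CSV with an overwrite guard can be sketched with the standard library alone. This is an illustration under stated assumptions, not pyCADD's implementation: the helper name save_rows_to_csv is hypothetical, and the column order (union of keys in first-seen order) is an assumption.

```python
import csv
import os


def save_rows_to_csv(rows, csv_path, overwrite=False):
    """Hypothetical helper: write a list of dicts to a CSV file."""
    if os.path.exists(csv_path) and not overwrite:
        # Mirrors the documented FileExistsError when the output exists.
        raise FileExistsError(csv_path)
    # Union of keys in first-seen order, so rows with different fields fit.
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    with open(csv_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)  # missing keys are written as empty cells
    return csv_path
```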

pyCADD.Dock.schrodinger.ensemble module

pyCADD.Dock.schrodinger.ensemble.split_structure(multi_structure_file: str | File, save_dir: str = None, overwrite: bool = False, cpu_num: int = None) list[MaestroFile][source]

Split a multi-structure file to single structure files.

Parameters:
  • multi_structure_file (str | File) – path or File object of structure file containing multiple structures.

  • save_dir (str, optional) – directory to save the split structure files. Defaults to None.

  • overwrite (bool, optional) – Whether to overwrite existing files. Defaults to False.

  • cpu_num (int, optional) – cpu core number used to split structures. Defaults to 3/4 of available cores.

Returns:

list of split structure files.

Return type:

list[MaestroFile]

pyCADD.Dock.schrodinger.ensemble.multi_keep_single_chain(structure_files: list[str | MaestroFile], ligand_name_list: list[str] = None, chain_id_list: list[str] = None, save_dir: str = None, overwrite: bool = False, cpu_num: int = None) list[MaestroFile][source]

Keep the single chain for multiple structures.

Parameters:
  • structure_files (list[str | MaestroFile]) – Structure files to be processed.

  • ligand_name_list (list[str], optional) – Ligand name list for each structure. Only the chain where the ligand is located will be retained. Should have the same length as structure_files. Defaults to None.

  • chain_id_list (list[str], optional) – Chain ID list for each structure to keep. Should have the same length as structure_files. Will be ignored if ligand_name_list is provided. Defaults to None.

  • save_dir (str, optional) – directory to save the structure files. Defaults to None.

  • overwrite (bool, optional) – Whether to overwrite existing files. Defaults to False.

  • cpu_num (int, optional) – cpu core number used to split structures. Defaults to 3/4 of available cores.

Raises:
  • ValueError – Single-chain structure cannot be kept without ligand name.

  • ValueError – The number of ligand name list is not equal to the number of structure files.

Returns:

list of structure files with single chain.

Return type:

list[MaestroFile]

pyCADD.Dock.schrodinger.ensemble.multi_minimize(structure_files: list[str | File], ph: float = 7.4, force_field: Literal['OPLS4', 'OPLS3e', 'OPLS3', 'OPLS_2005'] = 'OPLS4', fill_side_chain: bool = True, add_missing_loop: bool = True, del_water: bool = True, watdist: float = 5.0, rmsd_cutoff: float = 0.3, save_dir: str = None, overwrite: bool = False, cpu_num: int = None) list[MaestroFile][source]

Minimize multiple structures.

Parameters:
  • structure_files (list[str | File]) – list of structure file paths or File objects.

  • ph (float, optional) – pH value to calculate protonation states. Defaults to 7.4.

  • force_field (str) – force field to use. Defaults to ‘OPLS4’.

  • fill_side_chain (bool, optional) – whether to fill side chain. Defaults to True.

  • add_missing_loop (bool, optional) – whether to add missing loop. Defaults to True.

  • del_water (bool, optional) – whether to delete water molecules. Defaults to True.

  • watdist (float, optional) – how far from the ligand to delete water molecules. Set to 0.0 to delete all water molecules. Defaults to 5.0.

  • rmsd_cutoff (float, optional) – RMSD cutoff for minimization. Defaults to 0.3.

  • save_dir (str, optional) – directory to save the minimized structures. Defaults to None.

  • overwrite (bool, optional) – Whether to overwrite existing files. Defaults to False.

  • cpu_num (int, optional) – cpu core number used to minimize structures. Defaults to 3/4 of available cores.

Returns:

list of minimized structure files.

Return type:

list[MaestroFile]

pyCADD.Dock.schrodinger.ensemble.multi_grid_generate(structure_files: list[str | File], box_center_list: list[tuple[float, float, float]] = None, box_center_molnum_list: list[int] = None, box_size_list: list[float] = None, force_field: Literal['OPLS4', 'OPLS3e', 'OPLS3', 'OPLS_2005'] = 'OPLS4', save_dir: str = None, overwrite: bool = False, cpu_num: int = None) list[GridFile][source]

Generate multiple grid files.

Parameters:
  • structure_files (list[str | File]) – list of structure file paths or File objects.

  • box_center_list (list[tuple[float, float, float]], optional) – list of box centers. Should be the same length as structure_files. Defaults to None.

  • box_center_molnum_list (list[int], optional) – list of molecule numbers for box centers. Should be the same length as structure_files. Defaults to None.

  • box_size_list (list[float], optional) – list of box sizes. Should be the same length as structure_files. Defaults to [20, 20, …] with the length of structure_files.

  • force_field (str, optional) – force field to use. Defaults to ‘OPLS4’.

  • save_dir (str, optional) – directory to save the grid files. Defaults to None.

  • overwrite (bool, optional) – Whether to overwrite existing files. Defaults to False.

  • cpu_num (int, optional) – cpu core number used to generate grid files. Defaults to 3/4 of available cores.

Returns:

list of grid files.

Return type:

list[GridFile]
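
The per-structure list arguments above must line up with structure_files. A minimal sketch of how box_size_list could be defaulted and checked follows. The helper name normalize_box_sizes is hypothetical, and raising ValueError on a length mismatch is an assumption; this reference only says the lengths should match.

```python
def normalize_box_sizes(structure_files, box_size_list=None, default_size=20):
    """Hypothetical helper: return one box size per structure file."""
    if box_size_list is None:
        # Documented default: [20, 20, ...] with the length of structure_files.
        return [default_size] * len(structure_files)
    if len(box_size_list) != len(structure_files):
        # Assumption: a mismatch is treated as an error.
        raise ValueError(
            f"expected {len(structure_files)} box sizes, got {len(box_size_list)}"
        )
    return list(box_size_list)
```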

pyCADD.Dock.schrodinger.ensemble.get_docking_pairs(grid_files: list[str | File], ligand_files: list[str | File]) list[source]

Get all possible docking pairs from grid and ligand files, which will establish a mapping relationship between each provided receptor and each provided ligand.

Parameters:
  • grid_files (list[str | File]) – Grid files.

  • ligand_files (list[str | File]) – Ligand files.

Returns:

a list of tuples, each containing a grid file and a ligand file.

Return type:

list
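
Pairing every grid with every ligand amounts to a Cartesian product. The sketch below illustrates that idea with itertools.product; it assumes the pairs are (grid, ligand) tuples in grid-major order, which the description suggests but does not guarantee. The function name and file names are hypothetical.

```python
from itertools import product


def all_docking_pairs(grid_files, ligand_files):
    """Return every (grid, ligand) combination as a list of tuples."""
    return list(product(grid_files, ligand_files))


# Hypothetical file names, used only for illustration.
pairs = all_docking_pairs(
    ["1FBY_grid.zip", "3OAP_grid.zip"],
    ["lig_a.mae", "lig_b.mae", "lig_c.mae"],
)
```

With 2 grids and 3 ligands this yields 6 pairs, so the number of docking jobs grows as the product of the two list lengths.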

pyCADD.Dock.schrodinger.ensemble.multi_dock(docking_pairs: list[tuple[GridFile, File]], force_field: Literal['OPLS4', 'OPLS3e', 'OPLS3', 'OPLS_2005'] = 'OPLS4', precision: Literal['SP', 'XP', 'HTVS'] = 'SP', calc_rmsd: bool = False, include_receptor: bool = False, save_dir: str = None, overwrite: bool = False, cpu_num: int = None) list[DockResultFile][source]

Dock multiple ligands to multiple grids.

Parameters:
  • docking_pairs (list[tuple[GridFile, File]]) – list of tuples containing grid and ligand file paths or File objects. e.g. [(grid1, ligand1), (grid2, ligand2)]

  • force_field (str, optional) – force field to use. Defaults to ‘OPLS4’.

  • precision (str, optional) – docking precision. Defaults to ‘SP’.

  • calc_rmsd (bool, optional) – Whether to calculate RMSD with co-crystal ligand. If True, grid file must be generated from a complex. Defaults to False.

  • include_receptor (bool, optional) – Whether to include receptor structure in the output file. Defaults to False.

  • save_dir (str, optional) – directory to save the docked results. Results will be further saved in separate subdirectories named after the PDB IDs. Defaults to None.

  • overwrite (bool, optional) – Whether to overwrite an existing file. Defaults to False.

  • cpu_num (int, optional) – number of CPU cores used to dock ligands. Defaults to 3/4 of available cores.

Returns:

list of docked result files.

Return type:

list[DockResultFile]

pyCADD.Dock.schrodinger.ensemble.multi_extract_data(dock_result_files: list[str | File], data_config: DataConfig = None, cpu_num: int = None) list[dict][source]

Extract data from multiple docked result files.

Parameters:
  • dock_result_files (list[str | File]) – list of docked result file paths or File objects.

  • data_config (DataConfig, optional) – configuration of the data to extract. Defaults to None.

  • cpu_num (int, optional) – number of CPU cores used to extract data. Defaults to 3/4 of available cores.

Returns:

list of extracted data dicts, which can be converted to a DataFrame directly.

Return type:

list[dict]
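
Because the return value is a plain list of dicts, it maps onto a table with one row per docking result. The sketch below writes such a list to CSV text using only the standard library. The record keys (pdbid, docking_score, rmsd) are hypothetical placeholders, not the keys pyCADD actually emits.

```python
import csv
import io


def records_to_csv(records):
    """Serialize a list of dicts to CSV text, one row per dict.

    Columns are the union of all keys in first-seen order; a key missing
    from a record is written as an empty cell.
    """
    if not records:
        return ""
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical records standing in for extracted docking data.
example_records = [
    {"pdbid": "1FBY", "docking_score": -9.5},
    {"pdbid": "3OAP", "docking_score": -8.1, "rmsd": 1.2},
]
csv_text = records_to_csv(example_records)
```

If pandas is installed, passing the same list to pandas.DataFrame builds the table in one call.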

pyCADD.Dock.schrodinger.utils module

pyCADD.Dock.schrodinger.utils.launch(cmd: str, timeout: int = 0) Job[source]

Launch a Schrodinger job.

Parameters:
  • cmd (str) – command to be executed

  • timeout (int, optional) – timeout in seconds. Defaults to 0.

Returns:

Schrodinger Job object

Return type:

schrodinger.job.Job

pyCADD.Dock.schrodinger.utils.collect_structures(list_file: str, output_file: str = None, data_dir: str = None, required_file_ext: str = 'maegz') None[source]

Collect and extract structures from the data directory according to a list file of required structure names, and save them to another file. The structure names (without the extension) should be in the first column, below a header row.

Parameters:
  • list_file (str) – list file path

  • output_file (str, optional) – output file path. Defaults to a .mae file with the same name as the list file.

  • data_dir (str, optional) – directory containing structure files. Defaults to current working directory.

  • required_file_ext (str, optional) – file extension of the required structure files. Defaults to ‘maegz’.
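
The list-file convention (names in the first column, below a header row) can be illustrated with a small parser that turns the names into expected file names. This is a sketch under assumptions: the list file is taken to be comma-separated, and the helper name expected_structure_files is hypothetical rather than part of pyCADD.

```python
import csv
import io


def expected_structure_files(list_text, required_file_ext="maegz"):
    """Map first-column names in a list file to structure file names.

    Assumes comma-separated text whose first row is a header. Blank rows
    and rows with an empty first column are skipped.
    """
    reader = csv.reader(io.StringIO(list_text))
    next(reader, None)  # skip the header row
    return [
        f"{row[0].strip()}.{required_file_ext}"
        for row in reader
        if row and row[0].strip()
    ]


# Hypothetical list file content.
example_list = "name,score\nlig1,-9.1\nlig2,-8.4\n"
wanted = expected_structure_files(example_list)
```

Each resulting name would then be looked up inside data_dir before the matching structures are written to the output file.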

pyCADD.Dock.schrodinger.utils.convert_format(file_path: str, output_file: str, to_format: str = None, save_dir: str = None, overwrite: bool = False) str[source]

Convert the file to another format using Schrodinger API.

Parameters:
  • file_path (str) – input file path

  • output_file (str) – output file path

  • to_format (str, optional) – target format. If None, the format will be inferred from the output file extension. Defaults to None.

  • save_dir (str, optional) – directory to save the converted file. Defaults to current working directory.

  • overwrite (bool, optional) – overwrite the existing file. Defaults to False.

Raises:
  • ValueError – Unsupported format

  • FileNotFoundError – File to convert is not found

  • FileExistsError – File already exists

Returns:

converted file path

Return type:

str
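
The rule that to_format falls back to the output file extension can be sketched as below. The helper name resolve_target_format and the SUPPORTED_FORMATS set are assumptions made for illustration; the formats that convert_format really accepts are not listed in this reference.

```python
from pathlib import Path

# Assumed set of formats, for illustration only.
SUPPORTED_FORMATS = {"pdb", "mae", "maegz", "sdf", "mol2"}


def resolve_target_format(output_file, to_format=None):
    """Pick the target format, inferring it from the extension if needed."""
    if to_format is not None:
        fmt = to_format
    else:
        # Path.suffix keeps the leading dot, e.g. ".sdf"
        fmt = Path(output_file).suffix.lstrip(".")
    fmt = fmt.lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported format: {fmt!r}")
    return fmt
```

An explicit to_format takes precedence over the extension, which matches the documented behavior of inferring the format only when to_format is None.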

pyCADD.Dock.schrodinger.utils.get_centroid(file: list[MaestroFile, Structure, str], structure_index: int = 0) tuple[float, float, float][source]

Get the centroid of a structure file.

Parameters:
  • file (MaestroFile | str | Structure) – file, maestro structure, or file path to read

  • structure_index (int, optional) – Index of the structure to calculate the centroid. Defaults to 0.

Returns:

centroid coordinates

Return type:

tuple[float, float, float]
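
For reference, a centroid in the usual geometric sense is the unweighted mean of the atom coordinates. The sketch below computes it in plain Python; it assumes no mass weighting and does not use the Schrodinger API, so it may differ from how get_centroid is implemented.

```python
def centroid(coords):
    """Return the unweighted mean (x, y, z) of a sequence of 3D points."""
    coords = list(coords)
    if not coords:
        raise ValueError("at least one atom coordinate is required")
    n = len(coords)
    x = sum(c[0] for c in coords) / n
    y = sum(c[1] for c in coords) / n
    z = sum(c[2] for c in coords) / n
    return (x, y, z)


# Two hypothetical atom positions; their centroid is the midpoint.
center = centroid([(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)])
```

A tuple of this shape is what box_center_list in multi_grid_generate expects for each structure.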

Module contents