This module defines the Selection class for handling arbitrary subsets of atoms.
Before reading this section, familiarity with Atom Selections can be helpful.
Let’s import all classes and functions from ProDy and parse coordinates from the PDB structure 1p38:
>>> from prody import *
>>> prot = parsePDB('1p38')
We will use the select() method to get Selection instances as follows:
>>> water = prot.select('water')
Let’s select β-carbon atoms for non-GLY amino acid residues, and α-carbons for GLYs in two steps:
>>> betas = prot.select('name CB and protein')
>>> print( len(betas) )
336
>>> gly_alphas = prot.select('name CA and resname GLY')
>>> print( len(gly_alphas) )
15
The above shows that the p38 structure contains 15 GLY residues.
These two selections can be combined as follows:
>>> betas_gly_alphas = betas | gly_alphas
>>> print( betas_gly_alphas )
Selection '(name CB and pr...nd resname GLY)'
>>> print( len(betas_gly_alphas) )
351
The selection string for the union of selections becomes:
>>> print( betas_gly_alphas.getSelstr() )
(name CB and protein) or (name CA and resname GLY)
Note that the same selection can also be obtained directly with the selection string '(name CB and protein) or (name CA and resname GLY)'.
Getting the intersection of two selections is just as easy. Let’s find charged and medium-sized residues in the protein:
>>> charged = prot.select('charged')
>>> print( charged )
Selection 'charged'
>>> medium = prot.select('medium')
>>> print( medium )
Selection 'medium'
>>> medium_charged = medium & charged
>>> print( medium_charged )
Selection '(medium) and (charged)'
>>> print( medium_charged.getSelstr() )
(medium) and (charged)
Let’s see which amino acids are considered charged and medium:
>>> print( set(medium_charged.getResnames()) )
set(['ASP'])
What about amino acids that are medium or charged?
>>> print( set((medium | charged).getResnames()) )
set(['CYS', 'ASP', 'VAL', 'LYS', 'PRO', 'THR', 'GLU', 'HIS', 'ARG', 'ASN'])
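The `|` and `&` operators can be pictured as set operations on atom indices. The following is a minimal sketch in plain Python, not ProDy's implementation; the helper names `union` and `intersection` and the index values are hypothetical and chosen only for illustration.

```python
# Toy model: a selection is a (sorted atom indices, selection string) pair.
# Helper names and index values are made up for illustration.

def union(a, b):
    """Combine two selections, mirroring what `a | b` is described to do."""
    indices = sorted(set(a[0]) | set(b[0]))
    return indices, '({0}) or ({1})'.format(a[1], b[1])

def intersection(a, b):
    """Keep only atoms present in both selections, mirroring `a & b`."""
    indices = sorted(set(a[0]) & set(b[0]))
    return indices, '({0}) and ({1})'.format(a[1], b[1])

medium = ([2, 3, 5, 8], 'medium')
charged = ([3, 8, 13], 'charged')

both = intersection(medium, charged)
either = union(medium, charged)
print(both)    # ([3, 8], '(medium) and (charged)')
print(either)  # ([2, 3, 5, 8, 13], '(medium) or (charged)')
```

The combined selection strings follow the same pattern as the getSelstr() output shown earlier.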
It is also possible to invert a selection:
>>> only_protein = prot.select('protein')
>>> print( only_protein )
Selection 'protein'
>>> only_non_protein = ~only_protein
>>> print( only_non_protein )
Selection 'not (protein)'
>>> water = prot.select('water')
>>> print( water )
Selection 'water'
The non-protein selection refers to the same atoms as the water selection, which shows that 1p38 does not contain any non-water hetero atoms.
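Inversion can be thought of as taking the complement of a selection within its atom group. This is a simplified sketch with made-up indices, not the library's code; `invert` is a hypothetical helper.

```python
# Toy atom group with 10 atoms, indexed 0..9 (hypothetical numbers).
n_atoms = 10
protein = [0, 1, 2, 3, 4, 5, 6]   # hypothetical protein atom indices
water = [7, 8, 9]                 # hypothetical water atom indices

def invert(indices, n_atoms):
    """Return the indices of all atoms in the group that are NOT selected."""
    selected = set(indices)
    return [i for i in range(n_atoms) if i not in selected]

non_protein = invert(protein, n_atoms)
print(non_protein)           # [7, 8, 9]
print(non_protein == water)  # True: everything that is not protein is water
```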
Another operation defined on the Selection object is addition (it is also defined on other AtomPointer derived classes).
This may be useful if you want to yield atoms in an AtomGroup in a specific order. Let’s think of a simple case, where we want to output atoms in 1p38 in a specific order:
>>> protein = prot.select('protein')
>>> water = prot.select('water')
>>> water_protein = water + protein
>>> writePDB('1p38_water_protein.pdb', water_protein)
'1p38_water_protein.pdb'
In the resulting file, the water atoms will precede the protein atoms.
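The difference between addition and union is easiest to see on plain index lists. The sketch below assumes that a union keeps atoms in their original atom group order while addition keeps the order of the operands; `add` is a hypothetical helper and the indices are invented.

```python
# Hypothetical atom indices: protein atoms come first in the file, water last.
protein = [0, 1, 2, 3]
water = [4, 5]

def add(a, b):
    """Concatenate two selections, keeping the order of the operands."""
    return list(a) + list(b)

water_protein = add(water, protein)
print(water_protein)                      # [4, 5, 0, 1, 2, 3]

# A union, by contrast, is modeled here as a sorted set of indices.
print(sorted(set(water) | set(protein)))  # [0, 1, 2, 3, 4, 5]
```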
Selections also allow membership test operations:
>>> backbone = prot.select('backbone')
>>> calpha = prot.select('calpha')
Is calpha a subset of backbone?
>>> calpha in backbone
True
Or, is water in the protein selection?
>>> water in protein
False
Other tests include:
>>> protein in prot
True
>>> backbone in prot
True
>>> prot in prot
True
>>> calpha in calpha
True
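The membership tests above behave like subset checks on atom indices. Here is a small model in plain Python, assuming that `a in b` means every atom of `a` is also in `b`; the `contains` helper and all index values are hypothetical.

```python
# Hypothetical atom indices for a tiny structure with 8 atoms.
prot = list(range(8))          # the whole atom group
protein = [0, 1, 2, 3, 4, 5]
backbone = [0, 1, 2, 4]
calpha = [1, 4]
water = [6, 7]

def contains(container, item):
    """Model `item in container` as a subset test on atom indices."""
    return set(item) <= set(container)

print(contains(backbone, calpha))  # True
print(contains(protein, water))    # False
print(contains(prot, prot))        # True: a set is a subset of itself
```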
You can also check the equality of selections. Comparison will return True if both selections refer to the same atoms.
>>> calpha = prot.select('protein and name CA')
>>> calpha2 = prot.select('calpha')
>>> calpha == calpha2
True
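Equality depends on the atoms referred to, not on the selection string used to obtain them. The following sketch models that idea with dictionaries; `same_atoms` and the index values are hypothetical, and this is not how the library stores selections.

```python
# Two toy selections built from different strings but covering the same atoms.
calpha = {'selstr': 'protein and name CA', 'indices': [1, 9, 17]}
calpha2 = {'selstr': 'calpha', 'indices': [1, 9, 17]}
backbone = {'selstr': 'backbone', 'indices': [0, 1, 2, 3, 8, 9, 10, 11]}

def same_atoms(a, b):
    """Compare selections by the atoms they refer to, ignoring the strings."""
    return a['indices'] == b['indices']

print(same_atoms(calpha, calpha2))   # True
print(same_atoms(calpha, backbone))  # False
```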
A class for accessing and manipulating attributes of a selection of atoms in an AtomGroup instance. Instances can be generated using the select() method. The following built-in functions are customized for this class:
Return index of the coordinate set.
Return active coordinate set label.
Return a copy of alternate location indicators. Alternate location indicators can be used in atom selections, e.g. 'altloc A B', 'altloc _'.
Return a copy of anisotropic temperature factors.
Return a copy of standard deviations for anisotropic temperature factors.
Return associated atom group.
Return a copy of β-values (or temperature factors). β-values can be used in atom selections, e.g. 'beta 555.55', 'beta 0 to 500', 'beta 0:500', 'beta < 500'.
Return coordinate set labels.
Return a copy of partial charges. Partial charges can be used in atom selections, e.g. 'charge 1', 'abs(charge) == 1', 'charge < 0'.
Return a copy of chain identifiers. Chain identifiers can be used in atom selections, e.g. 'chain A', 'chid A B C', 'chain _'. Note that chid is a synonym for chain.
Return a copy of chain indices. Chain indices are assigned to subsets of atoms with distinct pairs of chain identifier and segment name. Chain indices start from zero, are incremented by one, and are assigned in the order of appearance in the AtomGroup instance. Chain indices can be used in atom selections, e.g. 'chindex 0'.
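The assignment rule for chain indices can be illustrated with a short function. This is a sketch of the rule as stated, not the library's implementation; `assign_chain_indices` and the input lists are hypothetical.

```python
def assign_chain_indices(chids, segnames):
    """Give each distinct (chain identifier, segment name) pair a
    zero-based index, numbered in order of first appearance."""
    seen = {}
    indices = []
    for pair in zip(chids, segnames):
        if pair not in seen:
            seen[pair] = len(seen)   # next unused index
        indices.append(seen[pair])
    return indices

# Five hypothetical atoms: chain A appears under two segment names.
chids = ['A', 'A', 'B', 'B', 'A']
segnames = ['P1', 'P1', 'P1', 'P1', 'P2']
chindices = assign_chain_indices(chids, segnames)
print(chindices)  # [0, 0, 1, 1, 2]
```

The last atom gets a new index because its segment name differs, even though its chain identifier has been seen before.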
Return a copy of coordinates from the active coordinate set.
Return coordinate set(s) at given indices, which may be an integer or a list/array of integers.
Return a copy of data associated with label, if it is present.
Return data labels. For which='user', return only labels of user provided data.
Return type of the data (i.e. data.dtype) associated with label, or None if label is not used.
Return a copy of element symbols. Element symbols can be used in atom selections, e.g. 'element C O N'.
Return flag labels. For which='user', return labels of flags provided by the user or the parser (e.g. hetatm); for which='all', return all possible Atom flag labels in addition to those present in the instance.
Return a copy of atom flags for given label, or None when flags for label are not set.
Return a copy of fragment indices. Fragment indices are assigned to connected subsets of atoms. Bonds need to be set using the AtomGroup.setBonds() method. Fragment indices start from zero, are incremented by one, and are assigned in the order of appearance in the AtomGroup instance. Fragment indices can be used in atom selections, e.g. 'fragindex 0', 'fragment 1'. Note that fragment is a synonym for fragindex.
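Fragment indices correspond to connected components of the bond graph. The sketch below shows one way to compute them with a depth-first search; it is an illustration of the idea, not the library's algorithm, and `fragment_indices` and the bond list are hypothetical.

```python
def fragment_indices(n_atoms, bonds):
    """Label each atom with the index of its connected fragment.
    Fragments are numbered from zero in order of first appearance."""
    neighbors = [[] for _ in range(n_atoms)]
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    fragindex = [None] * n_atoms
    next_index = 0
    for start in range(n_atoms):
        if fragindex[start] is not None:
            continue                      # already part of a fragment
        fragindex[start] = next_index
        stack = [start]
        while stack:                      # depth-first walk over bonds
            atom = stack.pop()
            for nb in neighbors[atom]:
                if fragindex[nb] is None:
                    fragindex[nb] = next_index
                    stack.append(nb)
        next_index += 1
    return fragindex

# Six hypothetical atoms: 0-1-2 bonded, 3-4 bonded, 5 isolated.
bonds = [(0, 1), (1, 2), (3, 4)]
frags = fragment_indices(6, bonds)
print(frags)  # [0, 0, 0, 1, 1, 2]
```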
Return a copy of insertion codes. Insertion codes can be used in atom selections, e.g. 'icode A', 'icode _'.
Return a copy of the indices of atoms.
Return a copy of masses. Masses can be used in atom selections, e.g. '12 <= mass <= 13.5'.
Return a copy of names. Names can be used in atom selections, e.g. 'name CA CB'.
Return a copy of occupancy values. Occupancy values can be used in atom selections, e.g. 'occupancy 1', 'occupancy > 0'.
Return a copy of radii. Radii can be used in atom selections, e.g. 'radii < 1.5', 'radii ** 2 < 2.3'.
Return a copy of residue indices. Residue indices are assigned to subsets of atoms with distinct sequences of residue number, insertion code, chain identifier, and segment name. Residue indices start from zero, are incremented by one, and are assigned in the order of appearance in the AtomGroup instance. Residue indices can be used in atom selections, e.g. 'resindex 0'.
Return a copy of residue names. Residue names can be used in atom selections, e.g. 'resname ALA GLY'.
Return a copy of residue numbers. Residue numbers can be used in atom selections, e.g. 'resnum 1 2 3', 'resnum 120A 120B', 'resnum 10 to 20', 'resnum 10:20:2', 'resnum < 10'. Note that resid is a synonym for resnum.
Return a copy of secondary structure assignments. Secondary structure assignments can be used in atom selections, e.g. 'secondary H E', 'secstr H E'. Note that secstr is a synonym for secondary.
Return a copy of segment indices. Segment indices are assigned to subsets of atoms with distinct segment names. Segment indices start from zero, are incremented by one, and are assigned in the order of appearance in the AtomGroup instance. Segment indices can be used in atom selections, e.g. 'segindex 0'.
Return a copy of segment names. Segment names can be used in atom selections, e.g. 'segment PROT', 'segname PROT'. Note that segname is a synonym for segment.
Return a copy of serial numbers (from file). Serial numbers can be used in atom selections, e.g. 'serial 1 2 3', 'serial 1 to 10', 'serial 1:10:2', 'serial < 10'.
Return a copy of types. Types can be used in atom selections, e.g. 'type CT1 CT2 CT3'.
Return True if data associated with label is present.
Return True if flags associated with label are present.
Yield atoms.
Yield copies of coordinate sets.
Return number of atoms, or number of atoms with given flag.
Return number of coordinate sets.
Return atoms matching selstr criteria. See select module documentation for details and usage examples.
Make the coordinate set at the given index the active one.
Set alternate location indicators. Alternate location indicators can be used in atom selections, e.g. 'altloc A B', 'altloc _'.
Set anisotropic temperature factors.
Set standard deviations for anisotropic temperature factors.
Set β-values (or temperature factors). β-values can be used in atom selections, e.g. 'beta 555.55', 'beta 0 to 500', 'beta 0:500', 'beta < 500'.
Set partial charges. Partial charges can be used in atom selections, e.g. 'charge 1', 'abs(charge) == 1', 'charge < 0'.
Set chain identifiers. Chain identifiers can be used in atom selections, e.g. 'chain A', 'chid A B C', 'chain _'. Note that chid is a synonym for chain.
Set coordinates in the active coordinate set.
Update data associated with label.
Raises AttributeError when label is not in use or is read-only.
Set element symbols. Element symbols can be used in atom selections, e.g. 'element C O N'.
Update flag associated with label.
Raises AttributeError when label is not in use or is read-only.
Set insertion codes. Insertion codes can be used in atom selections, e.g. 'icode A', 'icode _'.
Set masses. Masses can be used in atom selections, e.g. '12 <= mass <= 13.5'.
Set names. Names can be used in atom selections, e.g. 'name CA CB'.
Set occupancy values. Occupancy values can be used in atom selections, e.g. 'occupancy 1', 'occupancy > 0'.
Set radii. Radii can be used in atom selections, e.g. 'radii < 1.5', 'radii ** 2 < 2.3'.
Set residue names. Residue names can be used in atom selections, e.g. 'resname ALA GLY'.
Set residue numbers. Residue numbers can be used in atom selections, e.g. 'resnum 1 2 3', 'resnum 120A 120B', 'resnum 10 to 20', 'resnum 10:20:2', 'resnum < 10'. Note that resid is a synonym for resnum.
Set secondary structure assignments. Secondary structure assignments can be used in atom selections, e.g. 'secondary H E', 'secstr H E'. Note that secstr is a synonym for secondary.
Set segment names. Segment names can be used in atom selections, e.g. 'segment PROT', 'segname PROT'. Note that segname is a synonym for segment.
Set serial numbers (from file). Serial numbers can be used in atom selections, e.g. 'serial 1 2 3', 'serial 1 to 10', 'serial 1:10:2', 'serial < 10'.
Set types. Types can be used in atom selections, e.g. 'type CT1 CT2 CT3'.
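The getters above return copies, while the setters change attributes of the selected atoms. The following toy classes sketch that behavior under the assumption that a setter writes through to the parent atom group; `ToyGroup` and `ToySelection` are hypothetical stand-ins, not ProDy classes.

```python
class ToyGroup:
    """Hypothetical stand-in for an atom group holding atom names."""
    def __init__(self, names):
        self._names = list(names)

class ToySelection:
    """Hypothetical stand-in for a selection: a group plus atom indices."""
    def __init__(self, group, indices):
        self._group = group
        self._indices = list(indices)

    def getNames(self):
        # A new list is built on every call, so the caller gets a copy.
        return [self._group._names[i] for i in self._indices]

    def setNames(self, names):
        if len(names) != len(self._indices):
            raise ValueError('one name per selected atom is required')
        # Assumption: setters write through to the parent group.
        for i, name in zip(self._indices, names):
            self._group._names[i] = name

group = ToyGroup(['N', 'CA', 'C', 'O'])
sel = ToySelection(group, [1, 3])

names = sel.getNames()
names[0] = 'XX'                # editing the copy...
print(sel.getNames())          # ['CA', 'O'] ...leaves the group unchanged

sel.setNames(['CB', 'OXT'])    # the setter changes only the selected atoms
print(group._names)            # ['N', 'CB', 'C', 'OXT']
```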