wnprpc Example: water
Usage:
  water [options...] files...

Description:
  Produces a list of problematic water molecules for each
  crystallographic PDB file in the 'files...' list. A water molecule is
  considered 'potentially problematic' if
  - the closest distance to protein hetero atom is > 3.6 A or < 2.2 A,
  - or the closest distance to a protein hetero atom is more than 0.3 A
    shorter than the closest contact to a protein hetero atom in the
    asymmetric unit,
  - or if the water molecule is in close (<2.2A) contact to another
    water molecule.

Options:
  --version   show program's version number and exit
  -h, --help  show this help message and exit
  -a, --all   show all waters (default: show only problematic waters)
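
The three criteria from the description can be written as a single
predicate. The following is only a sketch of one reading of those
criteria, not the script's own code: the function name, argument names
and keyword defaults are assumptions made for illustration, and the
three distances are taken to be the values the script reports for each
water oxygen.

```python
def is_problematic(d_target, d_target_asym, d_water,
                   lo=2.2, hi=3.6, max_gap=0.3):
    """Return True if a water matches any of the three criteria.

    d_target      -- closest distance to a protein hetero atom (with packing)
    d_target_asym -- closest distance to a protein hetero atom in the
                     asymmetric unit only
    d_water       -- closest distance to another water oxygen
    """
    # Criterion 1: closest protein contact is too long or too short.
    if d_target > hi or d_target < lo:
        return True
    # Criterion 2: a symmetry mate is more than max_gap closer than the
    # nearest protein atom inside the asymmetric unit.
    if d_target_asym - d_target > max_gap:
        return True
    # Criterion 3: close contact to another water molecule.
    return d_water < lo
```

With the values of the first sample row (2.52, 2.91, 3.15), the second
criterion applies, because 2.91 - 2.52 is larger than 0.3.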
Line 20:
  from wnprpc import wnpRPCserver
Lines 24-48:
  def readFile(pdb):
Definition of a function that reads a file and returns the file content
as a string. If the file is gzip compressed, it is inflated on the fly.
The 'pdb' argument can be an open file or a string; a string is
interpreted as a filename (standard input, if the string is "-").
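
The behaviour described for readFile can be approximated as follows.
This is a Python 3 sketch under stated assumptions, not the code from
lines 24-48: the name read_file, the detection of gzip data by its
two-byte magic number, and the ASCII decoding are all choices made for
this illustration.

```python
import gzip
import io
import sys


def read_file(pdb):
    """Return the content of 'pdb' as a string, inflating gzip data."""
    if isinstance(pdb, str):
        if pdb == "-":
            # "-" stands for standard input.
            data = sys.stdin.buffer.read()
        else:
            with open(pdb, "rb") as handle:
                data = handle.read()
    else:
        # Anything else is treated as an open file object.
        data = pdb.read()
        if isinstance(data, str):
            return data  # text-mode file: nothing to inflate or decode
    # Assumption: gzip data is recognised by its magic number 1f 8b.
    if data[:2] == b"\x1f\x8b":
        data = gzip.decompress(data)
    return data.decode("ascii", errors="replace")


# Usage: an in-memory gzip stream is inflated transparently.
demo = read_file(io.BytesIO(gzip.compress(b"HETATM\n")))
```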
Lines 52-137:
  def analyze(W,pdb,code):
Definition of the function that does the bulk of the work. The
loadMolTypes method of the wnpRPCserver object W is used to load the
content 'pdb' of a file read by the readFile function into a Wit!P
session. Using the cmd method of the wnpRPCserver object W, the
'crystal' command is sent to the Wit!P session, and an error message is
issued if the execution led to an error (s>0 in line 71). Next, three
atom sets are created in the Wit!P session:
- water: all oxygen atoms in water molecules,
- hetero: all hetero (non-H, non-C) atoms,
- target: all hetero atoms that are not in the water atom set.
The number of water oxygens is extracted from the response text
generated by the commands that defined the sets. If the water count is
zero, an empty list is returned. With the Wit!P session still in
'crystal' mode, the crystal packing is generated to +-0.75 unit cells
in each direction around the center of the asymmetric unit, and the
shortest distances between atoms in the water set, restricted to the
asymmetric unit, and atoms in the target set are measured using a
"measure autopair nearest" command. The response text from this command
is converted into a pair of dictionaries:
- neighb[atomname]: name of the target atom closest to the water
  oxygen atomname,
- dist[atomname]: distance between atomname and neighb[atomname].
The list of water oxygen atom names is extracted from the neighb
dictionary and sorted in ascending order of dist[atomname] (line 107).
In a similar way, dictionary pairs are generated for the shortest water
(asymm. unit) / water distances (lines 111-119), and for the shortest
water (asymm. unit) / target (asymm. unit) distances (lines 123-131).
Finally, the three dictionary pairs are converted into a list of
4-tuples
  (atomname,
   (closest target atom, distance),
   (closest target atom in asymm. unit, distance),
   (closest water oxygen, distance))
with one entry for each water oxygen atom in the asymmetric unit. This
list is returned to the caller.
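
The final step, merging three (neighb, dist) dictionary pairs into a
sorted list of 4-tuples, can be sketched in plain Python. The function
name combine and its argument names are hypothetical, and the Wit!P
calls that produce the dictionaries are left out; the example data are
the first two rows of the sample output.

```python
def combine(target, target_asym, water):
    """Merge three (neighb, dist) pairs into a list of 4-tuples.

    Each argument is a pair of dictionaries keyed by water oxygen name:
    neighb maps to the closest atom, dist to the distance.
    """
    neighb, dist = target
    # Water oxygens sorted by ascending distance to the closest target atom.
    names = sorted(neighb, key=lambda name: dist[name])
    result = []
    for name in names:
        entry = [name]
        for nb, d in (target, target_asym, water):
            # .get() yields None if a water has no entry in one of the pairs.
            entry.append((nb.get(name), d.get(name)))
        result.append(tuple(entry))
    return result


# Example data taken from the first two rows of the sample output.
target = ({"A/E/HOH_676/O": "B_120/E/THR_184/OG1",
           "A/E/HOH_591/O": "C_912/E/GLY_101/O"},
          {"A/E/HOH_676/O": 2.53, "A/E/HOH_591/O": 2.52})
target_asym = ({"A/E/HOH_676/O": "A/E/LYS_108/NZ",
                "A/E/HOH_591/O": "A/E/SER_176/OG"},
               {"A/E/HOH_676/O": 19.07, "A/E/HOH_591/O": 2.91})
water = ({"A/E/HOH_676/O": "C_022/E/HOH_537/O",
          "A/E/HOH_591/O": "C_912/E/HOH_501/O"},
         {"A/E/HOH_676/O": 3.42, "A/E/HOH_591/O": 3.15})
rows = combine(target, target_asym, water)
```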
Lines 141-157:
  def distList(monitor):
Converts the output from "measure autopairs ..." into a pair of
dictionaries. 'monitor' is a response text generated by the cmd method
of a wnpRPCserver object. The distList function processes records with
5 fields, where the second and fourth fields are '--'. For these
records, the first and third fields are atom names, and the last field
is a distance. The atom names are full names (including molecule name,
symop, chain name, residue name, separated by '/'). A split-and-join
operation is applied to these names to remove the redundant molecule
name.
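
The record filtering and the split-and-join step could look roughly
like this. It is a sketch, not the code from lines 141-157: the name
dist_list is hypothetical, the molecule name is assumed to be the first
'/'-separated component, and the sample monitor text (including the
molecule name "6apr" and the non-matching header line) is invented to
fit the record layout described above.

```python
def dist_list(monitor):
    """Parse 'name1 -- name2 -- distance' records into (neighb, dist)."""
    neighb, dist = {}, {}
    for line in monitor.splitlines():
        fields = line.split()
        # Only records with 5 fields and '--' in positions 2 and 4 count.
        if len(fields) != 5 or fields[1] != "--" or fields[3] != "--":
            continue
        try:
            distance = float(fields[4])
        except ValueError:
            continue  # last field is not a number: skip the record
        # Split-and-join: drop the leading molecule name from both names.
        first = "/".join(fields[0].split("/")[1:])
        second = "/".join(fields[2].split("/")[1:])
        neighb[first] = second
        dist[first] = distance
    return neighb, dist


# Invented monitor text; the real Wit!P response may look different.
sample = (
    "some header line that is ignored\n"
    "6apr/A/E/HOH_591/O -- 6apr/C_912/E/GLY_101/O -- 2.52\n"
    "6apr/A/E/HOH_676/O -- 6apr/B_120/E/THR_184/OG1 -- 2.53\n"
)
neighb, dist = dist_list(sample)
```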
Lines 161-200:
Execution of the script starts at line 161. After parsing the command
line using the OptionParser class (optparse module), a wnpRPCserver
object is created with a Wit!P XML/RPC server listening on the first
available port in the range 19910...19930. If W.server is None, the
XML/RPC server could not be started, and the script terminates with an
error message. The W.cmd method is used to set a couple of "measure
monitor" options in the Wit!P session to ensure the proper functioning
of the 'analyze' function defined in lines 52-137. The script then
loops through the list of filenames specified on the command line,
reading and analyzing each file in turn. Problematic water molecules
(all water molecules, if the -a/--all option is used) are listed on
standard output, with the following seven items:
- water oxygen atom name,
- distance to closest hetero atom (excl. water), name of closest
  hetero atom,
- distance to closest hetero atom (excl. water) in asymm. unit, name
  of closest hetero atom in asymmetric unit,
- distance to closest water oxygen atom, name of closest water oxygen
  atom.
At the end of each loop iteration, the molecule in the Wit!P session is
deleted.
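
The command-line handling described above matches what the standard
optparse module provides. The following sketch rebuilds only the
options shown in the help text; the function name build_parser, the
destination name show_all and the version string are placeholders, and
the wnpRPCserver setup and the analysis loop are deliberately left out.

```python
from optparse import OptionParser


def build_parser():
    """Build a parser with the options listed in the help text."""
    parser = OptionParser(
        usage="%prog [options...] files...",
        version="%prog 0.0",  # placeholder: the real version is not given
    )
    # --version and -h/--help are added by OptionParser itself.
    parser.add_option(
        "-a", "--all",
        action="store_true", dest="show_all", default=False,
        help="show all waters (default: show only problematic waters)",
    )
    return parser


# Usage: parse an explicit argument list instead of sys.argv.
options, files = build_parser().parse_args(["-a", "/db/pdb/6apr.pdb"])
```

The remaining positional arguments in 'files' are the PDB files to loop
over, and options.show_all decides whether unproblematic waters are
listed as well.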
Sample output:
AW on camm7 836> ./water /db/pdb/6apr.pdb
Analysis of file /db/pdb/6apr.pdb (6apr)
number of water molecules: 222
A/E/HOH_591/O   2.52  C_912/E/GLY_101/O     2.91  A/E/SER_176/OG    3.15  C_912/E/HOH_501/O
A/E/HOH_676/O   2.53  B_120/E/THR_184/OG1  19.07  A/E/LYS_108/NZ    3.42  C_022/E/HOH_537/O
A/E/HOH_631/O   2.68  D_101/E/THR_184/OG1   4.12  A/E/ALA_237/O     2.89  A/E/HOH_638/O
A/E/HOH_623/O   2.70  C_912/E/GLY_135/N     3.18  A/E/VAL_3/O       2.56  C_912/E/HOH_529/O
A/E/HOH_542/O   2.73  C_912/E/VAL_136/O     5.02  A/E/GLY_2/N       2.73  A/E/HOH_623/O
A/E/HOH_522/O   2.73  B_120/E/SER_233/OG    3.11  A/E/ALA_68/N      2.64  A/E/HOH_514/O
A/E/HOH_630/O   2.76  B_129/E/LYS_108/NZ    3.13  A/E/THR_208/OG1   3.25  A/E/HOH_785/O
A/E/HOH_624/O   2.84  B_120/E/ASN_229/N     4.34  A/E/ASN_61/O      2.62  A/E/HOH_683/O
A/E/HOH_626/O   2.86  C_912/E/SER_145/OG    5.06  A/E/THR_177/OG1   2.77  A/E/HOH_695/O
A/E/HOH_627/O   2.91  D_101/E/LEU_183/N     4.72  A/E/SER_207/OG    2.62  A/E/HOH_688/O
A/E/HOH_538/O   2.91  C_922/E/SER_84/OG     4.04  A/E/GLY_283/N     2.71  A/E/HOH_547/O
A/E/HOH_580/O   2.91  C_912/E/ASP_141/O     7.84  A/E/GLU_168/OE1   2.78  A/E/HOH_695/O
A/E/HOH_514/O   2.92  B_120/E/SER_233/O     3.99  A/E/GLN_67/NE2    2.64  A/E/HOH_522/O
A/E/HOH_585/O   3.01  B_120/E/ASP_214/N     6.02  A/E/ASN_61/ND2    3.32  A/E/HOH_791/O
A/E/HOH_727/O   3.16  C_922/E/SER_74/OG     4.91  A/E/GLN_282/N     2.85  A/E/HOH_538/O
A/E/HOH_792/O   3.21  D_101/E/THR_185/N     5.34  A/E/ALA_237/O     3.49  A/E/HOH_741/O
A/E/HOH_537/O   3.32  D_101/E/SER_268/O     7.29  A/E/ARG_236/O     3.42  C_922/E/HOH_676/O
A/E/HOH_811/O   3.58  B_120/E/SER_212/O     4.77  A/E/ASP_59/OD2    3.37  A/E/HOH_585/O
A/E/HOH_598/O   3.58  B_129/E/GLY_27/O     12.27  A/E/ASN_229/ND2   5.89  B_129/E/HOH_784/O
A/E/HOH_561/O   3.60  B_129/E/LYS_108/NZ    4.90  A/E/THR_206/O     2.59  A/E/HOH_826/O
A/E/HOH_607/O   3.65  C_012/E/SER_268/N    11.59  A/E/GLN_67/OE1    4.28  B_120/E/HOH_537/O
A/E/HOH_823/O   3.71  B_120/E/SER_212/OG    6.80  A/E/THR_46/O      4.25  B_120/E/HOH_787/O
A/E/HOH_625/O   3.72  A/E/PRO_118/N         3.72  A/E/PRO_118/N     6.16  A/E/HOH_629/O
A/E/HOH_672/O   4.03  A/E/VAL_315/O         4.03  A/E/VAL_315/O     4.33  A/E/HOH_739/O
A/E/HOH_796/O   4.15  A/E/SER_113/N         4.15  A/E/SER_113/N     5.30  A/E/HOH_853/O
A/E/HOH_695/O   4.34  C_912/E/ASP_141/O     5.50  A/E/LYS_178/NZ    2.77  A/E/HOH_626/O
A/E/HOH_857/O   4.54  D_191/E/LYS_258/NZ   12.14  A/E/GLU_317/OE1   4.07  D_191/E/HOH_684/O
A/E/HOH_812/O  11.11  D_291/E/ARG_192/NH1  26.49  A/E/ASN_265/OD1  14.18  C_012/E/HOH_609/O
A.Widmer, NIBR/CPC/CSG-SB