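First, let me say up front that I am doing this as a Python exercise, and I am not allowed to use Biopython.
I am writing a script that will help me parse any .pdb file generated from a trajectory. I am trying to create a dictionary that links the chain variable to the resnumber variable. I have solved this for one specific .pdb file that has only 2 chains, but I would like the script to work for any .pdb file, regardless of the number of chains. Here is what I wrote: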
import sys

pdbTraj = open('md20_aligned_3frames.pdb', 'r')
pdbTraj_line = pdbTraj.readlines()
newFile = open('newfile.txt', 'w')
pdbDict = {}
resnumberList1 = []
resnumberList2 = []
chainTry = "A"
for line in pdbTraj_line:
    # startswith() takes a tuple of prefixes; ("ATOM" or "HetaTM") would
    # evaluate to just "ATOM", so HETATM records would never match.
    if line.startswith(("ATOM", "HETATM")):
        # Fixed-width columns of a PDB ATOM/HETATM record
        atomType = line[0:6]
        atomSerialNumber = line[6:11]
        atomName = line[12:16]
        resname = line[17:20]
        chain = line[21]
        resnumber = line[22:26]
        coorX = line[30:38]
        coorY = line[38:46]
        coorZ = line[46:54]
        occupancy = line[54:60]
        temperatureFact = line[60:66]
        segmentIdentifier = line[72:76]
        elementSymbol = line[76:78]
        # Only two chains are handled: chain "A" and "everything else"
        if chain == chainTry:
            resnumberList1.append(resnumber)
            # dict.fromkeys() removes duplicates while keeping order
            pdbDict[chain] = list(dict.fromkeys(resnumberList1))
        else:
            resnumberList2.append(resnumber)
            pdbDict[chain] = list(dict.fromkeys(resnumberList2))
print(pdbDict)
This is the result I get:
{'A': [' 1',' 2',' 3',' 4',' 5',' 6',' 7',' 8',' 9',' 10',' 11',' 12',' 13',' 14',' 15',' 16',' 17'],'B': [' 19',' 20',' 21',' 22',' 23',' 24',' 25',' 26',' 27',' 28',' 29',' 30',' 31',' 32',' 33',' 34',' 35',' 36',' 37',' 38',' 39',' 40',' 41',' 42',' 43',' 44',' 45',' 46',' 47',' 48',' 49',' 50',' 51',' 52',' 53',' 54',' 55',' 56',' 57',' 58',' 59',' 60',' 61',' 62',' 63',' 64',' 65',' 66',' 67',' 68',' 69',' 70',' 71',' 72',' 73',' 74',' 75',' 76',' 77',' 78',' 79',' 80',' 81',' 82',' 83',' 84',' 85',' 86',' 87',' 88',' 89',' 90',' 91',' 92',' 93',' 94',' 95',' 96',' 97',' 98',' 99',' 100',' 101',' 102',' 103',' 104',' 105',' 106',' 107',' 108',' 109',' 110',' 111',' 112',' 113',' 114',' 115',' 116',' 117',' 118',' 119',' 120',' 121',' 122',' 123',' 124',' 125',' 126',' 127',' 128',' 129',' 130',' 131',' 132',' 133',' 134',' 135',' 136',' 137',' 138',' 139',' 140',' 141',' 142',' 143',' 144',' 145',' 146',' 147',' 148',' 149',' 150',' 151',' 152',' 153',' 154',' 155',' 156',' 157',' 158',' 159',' 160',' 161',' 162',' 163',' 164',' 165',' 166',' 167',' 168',' 169',' 170',' 171',' 172',' 173',' 174',' 175',' 176',' 177',' 178',' 179',' 180',' 181',' 182',' 183',' 184',' 185',' 186',' 187',' 188',' 189',' 190',' 191',' 192',' 193',' 194',' 195',' 196',' 197',' 198',' 199',' 200',' 201',' 202',' 203',' 204',' 205',' 206',' 207',' 208',' 209',' 210',' 211',' 212',' 213',' 214',' 215',' 216',' 217',' 218',' 219',' 220',' 221',' 222',' 223',' 224',' 225',' 226',' 227',' 228',' 229',' 230',' 231',' 232',' 233',' 234',' 235',' 236',' 237',' 238',' 239',' 240',' 241',' 242',' 243',' 244',' 245',' 246',' 247',' 248',' 249',' 250',' 251',' 252',' 253',' 254',' 255',' 256',' 257',' 258',' 259',' 260',' 261',' 262',' 263',' 264',' 265',' 266',' 267',' 268',' 269',' 270',' 271',' 272',' 273',' 274',' 275',' 276',' 277',' 278',' 279',' 280',' 281',' 282',' 283',' 284',' 285',' 286',' 287',' 288',' 289',' 290',' 291',' 292',' 293',' 294',' 295',' 296',' 297',' 298',' 299',' 300',' 301',' 302',' 303',' 304',' 305',' 306',' 307',' 308',' 309',' 310',' 311',' 312',' 313',' 314',' 315',' 316',' 317',' 318',' 319',' 320',' 321',' 322',' 323',' 324',' 325',' 326',' 327',' 328',' 329',' 330',' 331',' 332',' 333',' 334',' 335',' 336',' 337',' 338',' 339',' 340',' 341',' 342',' 343',' 344',' 345',' 346',' 347',' 348',' 349',' 350',' 351',' 352',' 353',' 354',' 355',' 356',' 357',' 358',' 359',' 360',' 361',' 362',' 363',' 364',' 365',' 366',' 367',' 368',' 369',' 370',' 371']}
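So there are 2 keys (chain A and chain B) and 2 lists (the resnumber values of chain A and those of chain B).
Could you help me generalize this script so that it works for any .pdb file? Thanks!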
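To make the goal concrete, here is a minimal sketch of one way the chain-to-residue dictionary could be built without hard-coding chain "A". It is only an illustration under stated assumptions: the function name `chain_residues` and the three sample lines are hypothetical, the chain ID is assumed to sit at index 21 and the residue number at indices 22 to 25 (the same slices used in the script), and the residue numbers are stripped of padding, unlike the output shown above.

```python
from collections import defaultdict


def chain_residues(lines):
    """Map each chain ID to its residue numbers, in order of first appearance.

    Hypothetical helper: one inner dict per chain is created on demand,
    so the number of chains does not need to be known in advance.
    """
    # chain -> {resnumber: None}; a dict keeps insertion order and drops duplicates
    seen = defaultdict(dict)
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            chain = line[21]
            resnumber = line[22:26].strip()
            seen[chain][resnumber] = None
    return {chain: list(residues) for chain, residues in seen.items()}


# Hypothetical sample records laid out in fixed-width PDB columns
sample = [
    "ATOM      1  N   LYS A   1      10.246  29.908   8.932  0.00  0.00      A",
    "ATOM      5  CA  LYS A   1       9.010  29.017   8.844  0.00  0.00      A",
    "ATOM   3802  N   TYR B 240      -9.050 -41.325  16.074  0.00  0.00      B",
]
print(chain_residues(sample))
```

Because the dictionary is keyed by whatever chain ID appears on each line, a third or fourth chain would get its own list automatically. Residues repeated across trajectory frames are recorded only once per chain.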
The first few lines of the .pdb file look like this:
CRYST1   91.372  118.560   70.786  90.00  90.00  90.00 P 1           1
ATOM      1  N   LYS A   1      10.246  29.908   8.932  0.00  0.00      A
ATOM      2  HT1 LYS A   1      11.053  29.331   8.619  0.00  0.00      A
ATOM      3  HT2 LYS A   1      10.405  30.386   9.842  0.00  0.00      A
ATOM      4  HT3 LYS A   1      10.211  30.643   8.197  0.00  0.00      A
ATOM      5  CA  LYS A   1       9.010  29.017   8.844  0.00  0.00      A
ATOM      6  HA  LYS A   1       9.395  28.160   8.311  0.00  0.00      A
ATOM      7  CB  LYS A   1       8.484  28.723  10.313  0.00  0.00      A
ATOM      8  HB1 LYS A   1       9.376  28.807  10.970  0.00  0.00      A
ATOM      9  HB2 LYS A   1       7.797  29.544  10.609  0.00  0.00      A
ATOM     10  CG  LYS A   1       7.855  27.321  10.494  0.00  0.00      A
ATOM     11  HG1 LYS A   1       7.016  27.501  11.199  0.00  0.00      A
ATOM     12  HG2 LYS A   1       7.294  26.942   9.613  0.00  0.00      A
ATOM     13  CD  LYS A   1       8.769  26.282  10.991  0.00  0.00      A
ATOM     14  HD1 LYS A   1       9.376  26.065  10.088  0.00  0.00      A
ATOM     15  HD2 LYS A   1       9.476  26.682  11.750  0.00  0.00      A
ATOM     16  CE  LYS A   1       7.894  25.110  11.592  0.00  0.00      A
ATOM     17  HE1 LYS A   1       7.347  25.505  12.475  0.00  0.00      A
Or, so that you can also see chain B:
ATOM   3802  N   TYR B 240      -9.050 -41.325  16.074  0.00  0.00      B
ATOM   3803  HN  TYR B 240      -8.672 -40.404  16.021  0.00  0.00      B
ATOM   3804  CA  TYR B 240     -10.166 -41.491  15.204  0.00  0.00      B
ATOM   3805  HA  TYR B 240      -9.685 -41.605  14.243  0.00  0.00      B
ATOM   3806  CB  TYR B 240     -10.940 -42.818  15.365  0.00  0.00      B
ATOM   3807  HB1 TYR B 240     -10.241 -43.631  15.078  0.00  0.00      B
ATOM   3808  HB2 TYR B 240     -11.241 -43.061  16.407  0.00  0.00      B
ATOM   3809  CG  TYR B 240     -12.233 -42.972  14.454  0.00  0.00      B
ATOM   3810  CD1 TYR B 240     -12.102 -43.272  13.086  0.00  0.00      B
ATOM   3811  HD1 TYR B 240     -11.100 -43.348  12.692  0.00  0.00      B
ATOM   3812  CE1 TYR B 240     -13.248 -43.404  12.343  0.00  0.00      B
ATOM   3813  HE1 TYR B 240     -13.093 -43.818  11.358  0.00  0.00      B
If you need more information about the .pdb file format, here is a link.
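Since the format is fixed-width, the column positions matter more than the whitespace between values. As a rough sketch of that idea, the slices from the script can be collected in one table and applied to a single record. The names `ATOM_FIELDS` and `parse_atom_line` are hypothetical, the sample line is an assumed fixed-width rendering of the first ATOM record, and the numeric conversions assume plain decimal serial and residue numbers.

```python
# Column slices (0-based, end-exclusive), copied from the script above.
ATOM_FIELDS = {
    "record": (0, 6),
    "serial": (6, 11),
    "name": (12, 16),
    "resname": (17, 20),
    "chain": (21, 22),
    "resnumber": (22, 26),
    "x": (30, 38),
    "y": (38, 46),
    "z": (46, 54),
    "occupancy": (54, 60),
    "temp_factor": (60, 66),
    "segment": (72, 76),
}


def parse_atom_line(line):
    """Split one ATOM/HETATM record into a dict of typed fields (hypothetical helper)."""
    fields = {key: line[start:end].strip() for key, (start, end) in ATOM_FIELDS.items()}
    # Assumption: serial and residue numbers are plain decimal integers
    for key in ("serial", "resnumber"):
        fields[key] = int(fields[key])
    for key in ("x", "y", "z", "occupancy", "temp_factor"):
        fields[key] = float(fields[key])
    return fields


# Assumed fixed-width layout of the first ATOM record
sample_line = "ATOM      1  N   LYS A   1      10.246  29.908   8.932  0.00  0.00      A"
atom = parse_atom_line(sample_line)
print(atom["chain"], atom["resnumber"], atom["x"], atom["y"], atom["z"])
```

Slicing by position rather than calling `split()` keeps the fields apart even when two values touch, for example a large negative coordinate directly after another number.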