File: daca-notes


Fragments:

convert an mmdb::Model *model_p to std::vector<std::pair<mmdb::Atom *, std::string> > model_typed_atoms
Every atom has an associated CCP4 atom type.

calculate_daca(reference_residue_p, std::vector<std::pair<mmdb::Atom *, std::string> > model_typed_atoms)

Get the fragments (atom sets) of the reference residue.

For each fragment

   get the rtop that rotates this fragment onto the reference fragment (at the origin)

   for every atom in model_typed_atoms, find the distance to the centre of the fragment
      If less than limit
         find the box for this atom:
         apply the rtop to this atom - that gives us 3D coordinates near the origin
         convert that coordinate to box indices.
         Add one to that box for that fragment of that residue type
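
A minimal Python sketch of the inner loop above, not the coot-daca code itself.
Assumptions: the rtop is given as a 3x3 rotation plus a translation, boxes are
cubes of width box_width, and the count is keyed by atom type as well as by
residue type and fragment. The names apply_rtop, box_indices and accumulate,
and the default limits, are hypothetical.

```python
import math
from collections import defaultdict


def apply_rtop(rot, trans, p):
    """Rotate point p by the 3x3 matrix rot, then add the translation trans."""
    return tuple(
        sum(rot[i][j] * p[j] for j in range(3)) + trans[i] for i in range(3)
    )


def box_indices(p, box_width=1.0):
    """Map a 3D coordinate near the origin to integer box indices."""
    return tuple(math.floor(c / box_width) for c in p)


def accumulate(counts, residue_type, frag_index, rot, trans, frag_centre,
               typed_atoms, dist_limit=8.0, box_width=1.0):
    """Add one to the box of every typed atom within dist_limit of the fragment.

    typed_atoms is a list of ((x, y, z), atom_type) pairs.
    dist_limit and box_width are illustrative values only.
    """
    for pos, atom_type in typed_atoms:
        if math.dist(pos, frag_centre) < dist_limit:
            key = box_indices(apply_rtop(rot, trans, pos), box_width)
            counts[(residue_type, frag_index, atom_type, key)] += 1


counts = defaultdict(int)
```

Flooring (rather than truncating) keeps boxes on either side of the origin the
same size.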

  

---------------------------

Robbie suggests the use of CCP4 Energy types

Use Symmetry-related molecules when generating the table

Robbie: Whatif packing score, fine and coarse, to monitor refinement progress.

Ligand validation: break all ligands into rigid groups and look at the distribution of side chain groups.

Check the chirality in SMARTS - can it be used to type ligand fragments?

Internal symmetry: say for a PHE side chain - how does that work?

partition on beta-strand also (as well as helices)

What happens about water? check the code and the paper? Just leave it out?

Tuesday next week? yes.



-----
Wed  1 Jul 19:38:53 BST 2020

   make something like figure 5 tryptophans.

   NT3s for the top hits.

   Mg around the backbone of nucleic acid - in particular positions.

   Asn side chains find the peak for C1 of sugars? - another time.

   Bug in ARG fragments - probably everywhere.


----
Wed  8 Jul 10:41:50 BST 2020

sub.sh multi-daca.sh
[wait]
./coot-daca consolidate splits/table* consolidated [takes about a minute, 29 table directories each of ~10m lines]


---
Wed  8 Jul 19:46:04 BST 2020


4 -rw------- 1 paule paule  36 Jul  8 13:48 daca-analyse-final.sh
4 -rw------- 1 paule paule  38 Jul  8 13:47 daca-analyse-cycle-0.sh

use plot-2.r to plot the distributions

----
Thu  9 Jul 23:17:30 BST 2020


grep residue_number daca-analyse-A-chain-0-cycle.log > cycle-0.tab
grep residue_number daca-analyse-A-chain-final.log > final.tab


a = read.table('final.tab')
b = read.table('cycle-0.tab')
plot(a$V2, a$V4, xlab="Residue Number", ylab="DACA Score Sum", main="DACA Score Sum for 1eoi, A chain")
points(b$V2, b$V4, pch=19)

nct = c('Final', 'Cycle 0')
legend(150, 2000000, nct, pch=c(1,19))



----
Thu  9 Jul 23:22:56 BST 2020

   need to normalize:

   so sum of values for each atom type for each box_key needs to be 1.0 (or 1 million)?
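
   A rough sketch of that normalization in Python, for one fragment's table.
   Assumption: the table is a dict keyed by (atom_type, box_key), and the
   function name normalize_over_boxes is made up for illustration. Each atom
   type is scaled so that its values sum to `total` over all boxes.

```python
from collections import defaultdict


def normalize_over_boxes(counts, total=1.0):
    """Scale counts so that, for each atom type, the sum over boxes is total.

    counts maps (atom_type, box_key) -> count for a single fragment.
    """
    sums = defaultdict(float)
    for (atom_type, _box_key), n in counts.items():
        sums[atom_type] += n
    return {
        key: total * n / sums[key[0]]
        for key, n in counts.items()
        if sums[key[0]] > 0
    }
```

   Passing total=1e6 gives the "1 million" variant with the same relative values.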


----
Wed 22 Jul 09:46:13 BST 2020

awk -f compare-whatif-and-coot-daca.awk tm-A-chain-out-of-register-helix.tab > compare-whatif-and-coot-daca-tm-with-out-of-register-helix.table

> a = read.table('compare-whatif-and-coot-daca-tm-with-out-of-register-helix.table')
> plot(a$V8, a$V4, xlab="Whatif DACA", ylab="Coot DACA", main="Comparison of DACA Scores")

-> No correlation. Sigh.
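
To put a number on "no correlation", a Pearson coefficient over the two score
columns would do. A small stand-alone sketch (the function name pearson_r is
just for illustration; the inputs would be the Whatif and Coot columns of the
table above):

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return sxy / math.sqrt(sxx * syy)
```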



----
Wed 22 Jul 21:21:23 BST 2020

  Robbie Chat:

  [Read the paper again :-)]
  calculate the solvent exposure of the residue (does this match Whatif?)
  What is the relation of the DACA score to the solvent-exposedness?

  Check the normalization is doing what you think it is - helices are scoring high?
  Check a few more structures to see if it is consistent?

  Try normalizing by fragment-voxel normalizing over atom types. Currently we have
  normalization by fragment-atom type normalizing over voxels. Normalization should
  happen after smoothing.
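
  Sketch of the alternative ordering in Python: smooth first, then normalize
  each voxel over atom types. Assumptions: one fragment's table is a dict keyed
  by (atom_type, box_key); the smoothing here is a simple 6-neighbour spread
  used as a stand-in, not the blur actually used in coot-daca; the names smooth
  and normalize_over_atom_types are hypothetical.

```python
from collections import defaultdict

# Offsets to the six face-sharing neighbours of a voxel.
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def smooth(counts, w=0.1):
    """Move a fraction w of each count into each of the 6 neighbouring voxels."""
    out = defaultdict(float)
    for (atom_type, (i, j, k)), n in counts.items():
        out[(atom_type, (i, j, k))] += n * (1.0 - 6 * w)
        for di, dj, dk in NEIGHBOURS:
            out[(atom_type, (i + di, j + dj, k + dk))] += n * w
    return dict(out)


def normalize_over_atom_types(counts):
    """Scale so that, in each voxel, the values over all atom types sum to 1."""
    sums = defaultdict(float)
    for (_atom_type, box_key), n in counts.items():
        sums[box_key] += n
    return {
        key: n / sums[key[1]]
        for key, n in counts.items()
        if sums[key[1]] > 0
    }


def smoothed_voxel_table(counts, w=0.1):
    """Smoothing first, normalization second, as suggested in the note above."""
    return normalize_over_atom_types(smooth(counts, w))
```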
  

---
Thu  3 Sep 20:37:49 BST 2020

  To extract the solvent content used by Whatif, use the output of dssp and parse it
  like this:

$ awk -f dssp-out-to-ACC.awk 1eoi.dssp > 1eoi.ACC.tab
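
  For reference, roughly the same extraction in Python. This is not a port of
  dssp-out-to-ACC.awk (not shown here); it is a sketch that assumes the classic
  fixed-column DSSP layout: residue number in columns 6-10, chain in column 12,
  amino acid in column 14 and ACC in columns 35-38, with the residue table
  starting after the "  #  RESIDUE" header line. Check those offsets against
  the dssp version in use.

```python
def parse_dssp_acc(lines):
    """Yield (chain, residue_number, amino_acid, acc) from DSSP output lines.

    Column offsets are an assumption about the classic DSSP text format.
    """
    in_table = False
    for line in lines:
        if line.startswith("  #  RESIDUE"):
            in_table = True  # residue records follow this header line
            continue
        if not in_table or len(line) < 38:
            continue
        if line[13] == "!" or not line[5:10].strip():
            continue  # chain-break marker, not a residue
        yield line[11], int(line[5:10]), line[13], int(line[34:38])


def write_acc_table(dssp_path, out_path):
    """Write one 'chain resno aa acc' line per residue."""
    with open(dssp_path) as f_in, open(out_path, "w") as f_out:
        for chain, resno, aa, acc in parse_dssp_acc(f_in):
            f_out.write(f"{chain} {resno} {aa} {acc}\n")
```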

---
Sun  6 Sep 14:28:55 BST 2020

  Use zip-coot-and-whatif-se-v3.awk to combine coot-daca solvent content with
  dssp whatif (e.g. 1eoi.ACC.tab) to make "combi.tab"

a = read.table('combi.tab')
plot(a$V4, a$V6, main="Solvent Content Comparison: 1eoi",
     xlab="DSSP Solvent Content Score",
     ylab="Coot Solvent Score")
abline(lm(V6 ~ V4, data=a))

----------------
Wed 23 Sep 20:32:40 BST 2020

Split the solvent content vs DACA score into Helix and STC (strand/turn/coil)

try using the new dssp to get the secondary structure assignment.

Try without sidechain masking

Try with different blur factors in the "cook" function.

use blast for multiple sequence alignment input to clustalw?
Getting sequences is not that hard any more.
Perhaps even search against the PDB?
Try the API from the PDB.


---
Tue 29 Sep 18:06:53 BST 2020

    read this paper: https://onlinelibrary.wiley.com/doi/pdf/10.1002/prot.340200307

    Polar and nonpolar atomic environments in the protein core: Implications for folding and binding

    Abstract

    Hydrophobic interactions are believed to play an important role in
    protein folding and stability. Semi‐empirical attempts to estimate
    these interactions are usually based on a model of solvation,
    whose contribution to the stability of proteins is assumed to be
    proportional to the surface area buried upon folding. Here we
    propose an extension of this idea by defining an environment free
    energy that characterizes the environment of each atom of the
    protein, including solvent, polar or nonpolar atoms of the same
    protein or of another molecule that interacts with the protein. In
    our model, the difference of this environment free energy between
    the folded state and the unfolded (extended) state of a protein is
    shown to be proportional to the area buried by nonpolar atoms upon
    folding. General properties of this environment free energy are
    derived from statistical studies on a database of 82 well‐refined
    protein structures. This free energy is shown to be able to
    discriminate misfolded from correct structural models, to provide
    an estimate of the stabilization due to oligomerization, and to
    predict the stability of mutants in which hydrophobic residues
    have been substituted by site‐directed mutagenesis, provided that
    no large structural modifications occur.




----
Wed  2 Dec 19:16:08 GMT 2020


  take a ubiquitin-like domain (76 residues) and make an out-of-register error model. Make 10 of them or so.

  Send the models over to Robbie for Whatcheck analysis.

  Do that by Monday!


-

 Can you add a button to close the docked Sequence View?

----

ray traced breath of the wild

---

Ubuntu 18.04

----

occupancy-weighted B-factor   (show B/occ)