File: doc_dump2data.md

dump2data.py
===========

## Description

**dump2data.py** is a tool to extract coordinates from LAMMPS "dump" (trajectory) files.  It was originally designed to convert snapshots from trajectory files into LAMMPS DATA format (for restarting a simulation from where it left off).  However, it also reads and writes .XYZ and .RAW (simple 3-column text format) coordinate files.
Although it was written in python, **dump2data.py** is a stand-alone executable intended to be run from the terminal (shell).  *It has not been optimized for speed.*

## Comparison with pizza.py
**dump2data.py** duplicates some of the tools in
[pizza.py](http://pizza.sandia.gov/doc/Manual.html).
If you are willing to learn python, **pizza.py** can handle some dump files which might cause **dump2data.py** to crash.  **pizza.py** also supports a wider variety of atom styles (e.g. "atom_style tri").  **dump2data.py** is maintained by the moltemplate developers. **pizza.py** is maintained by the LAMMPS developers.  **pizza.py** may be faster than **dump2data.py** (dump2data.py has not been optimized for speed).


## Arguments

```
   dump2data.py [old_data_file]           \
                [-raw]                    \
                [-xyz]                    \
                [-t time]                 \
                [-tstart ta] [-tstop tb]  \
                [-last]                   \
                [-interval n]             \
                [-type atom_types]        \
                [-id atom_ids]            \
                [-mol mol_ids]            \
                [-multi]                  \
                [-center]                 \
                [-scale x]                \
                [-atomstyle style]        \
                [-xyz-id]                 \
                [-xyz-mol]                \
                [-xyz-type-mol]           \
                < DUMP_FILE > OUTPUT_FILE
```


## Examples

### Example creating RAW (3-column ascii text) coordinate files

   If your LAMMPS dump file is named "traj.lammpstrj", you can
extract the coordinates this way:
```
dump2data.py -raw < traj.lammpstrj > traj.raw
```
The resulting file ("traj.raw") will look like this:
```
-122.28 -19.2293 -7.93705
-121.89 -19.2417 -8.85591
-121.6 -19.2954 -7.20586
  :       :        :

-121.59 -20.3273 -2.0079
-122.2 -19.8527 -2.64669
-120.83 -19.7342 -2.2393
  :       :        :
```
(Note: Blank lines are used to delimit different frames in the trajectory.  If you only want a single frame from the trajectory, you can specify it using the -t or -last arguments.  Alternatively, you can use the *head* and *tail* unix commands beforehand to extract the portion of the trajectory file containing the frame you are interested in.)
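
If you want to post-process a multi-frame RAW file yourself, the blank-line convention described above is easy to handle.  Here is a minimal sketch in python (the function name `split_raw_frames` is hypothetical and is not part of moltemplate; it assumes every non-blank line holds three numbers):

```python
def split_raw_frames(text):
    """Split RAW-format text into frames; each frame is a list of (x, y, z)."""
    frames, current = [], []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            # A blank line ends the current frame (if it contains any atoms).
            if current:
                frames.append(current)
                current = []
            continue
        x, y, z = (float(f) for f in fields[:3])
        current.append((x, y, z))
    if current:
        frames.append(current)  # last frame may lack a trailing blank line
    return frames


sample = """-122.28 -19.2293 -7.93705
-121.89 -19.2417 -8.85591

-121.59 -20.3273 -2.0079
-122.2 -19.8527 -2.64669
"""
frames = split_raw_frames(sample)
print(len(frames), "frames,", len(frames[0]), "atoms in the first frame")
```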

   To limit the output to atoms of certain types (for example,
atom types 1, 3, 5, 6, and 7), you can use the *"-type LIST"* argument (for example
"-type 1,3,5-7").  You can also restrict the output to atoms or molecules
with a certain range of ID numbers using the "-id" and "-mol" arguments.
*(These arguments have no effect if you are creating a ".data" file.
They only work if you are using the '-raw' or '-xyz' arguments.)*
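
To make the LIST syntax concrete, here is a small python sketch showing how a string like "1,3,5-7" expands into individual numbers.  (This is only an illustration of the notation.  The helper `expand_id_list` is hypothetical, and I am not claiming dump2data.py parses the list this way internally.)

```python
def expand_id_list(spec):
    """Expand a string such as "1,3,5-7" into a sorted list of integers."""
    ids = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # "5-7" means every integer from 5 through 7, inclusive.
            first, last = part.split("-", 1)
            ids.update(range(int(first), int(last) + 1))
        else:
            ids.add(int(part))
    return sorted(ids)


wanted = expand_id_list("1,3,5-7")
print(wanted)  # [1, 3, 5, 6, 7]
```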


### Example creating XYZ (4-column ascii text) coordinate files

You can extract the coordinates using the .XYZ format this way:
```
dump2data.py -xyz < traj.lammpstrj > traj.xyz
```
This generates a 4-column text file with one line per atom (sorted by atom ID), containing the atom type (first column) followed by the x, y, z coordinates.  (If you prefer the first column to be something else, you can use the "-xyz-id", "-xyz-mol", and "-xyz-type-mol" arguments instead.)  If there are multiple frames in the trajectory file, it will concatenate them together this way:
```
8192
LAMMPS data from timestep 50000
5 -122.28 -19.2293 -7.93705
3 -121.89 -19.2417 -8.85591
7 -121.6 -19.2954 -7.20586
:   :          :            :
8192
LAMMPS data from timestep 100000
5 -121.59 -20.3273 -2.0079
3 -122.2 -19.8527 -2.64669
7 -120.83 -19.7342 -2.2393
```
(Note: If you want the atom-ID to appear in the first column, use "-xyz-id".
       If you want the molecule-ID to appear in the first column, use "-xyz-mol".
       If you want the atom-type AND molecule-ID to appear, use "-xyz-type-mol".)
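
If you need to read such a concatenated XYZ file back in, each frame can be located from its atom-count line.  A minimal python sketch follows (the name `read_xyz_frames` is hypothetical; it assumes the default 4-column layout shown above and a file that is not truncated mid-frame):

```python
def read_xyz_frames(lines):
    """Yield (comment, atoms) per frame; atoms is a list of (label, x, y, z)."""
    it = iter(lines)
    for line in it:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between frames
        natoms = int(line)             # first line of a frame: number of atoms
        comment = next(it).rstrip("\n")  # second line: free-form comment
        atoms = []
        for _ in range(natoms):
            label, x, y, z = next(it).split()[:4]
            atoms.append((label, float(x), float(y), float(z)))
        yield comment, atoms


xyz_text = """2
LAMMPS data from timestep 50000
5 -122.28 -19.2293 -7.93705
3 -121.89 -19.2417 -8.85591
2
LAMMPS data from timestep 100000
5 -121.59 -20.3273 -2.0079
3 -122.2 -19.8527 -2.64669
"""
xyz_frames = list(read_xyz_frames(xyz_text.splitlines()))
for comment, atoms in xyz_frames:
    print(comment, "->", len(atoms), "atoms")
```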


## Examples creating DATA files

"dump2data.py" (and "raw2data.py") can also create LAMMPS DATA files.  You must supply them with an existing DATA file containing the correct number of atoms and topology information, and a file containing the coordinates of the atoms.

If your coordinates are stored in an ordinary 3-column text file ("RAW" file),
you can create the new DATA file this way:
```
raw2data.py -atomstyle ATOM_STYLE data_file < coords.raw  > new_data_file
```
where ATOM_STYLE is a quoted string, such as "full" or "hybrid sphere dipole".
The "-atomstyle ATOM_STYLE" argument is optional.
The default atom_style is "full".
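
Since the existing DATA file must contain the correct number of atoms, it can be worth checking this before running the conversion.  The python sketch below compares the "N atoms" header line of a DATA file with the count stored in the first frame of a text dump file.  (Both helper names are hypothetical.  The sketch assumes the usual LAMMPS text formats and only looks at the first frame.)

```python
def atoms_in_data_file(lines):
    """Return N from the "N atoms" header line of a LAMMPS DATA file, or None."""
    for line in lines:
        fields = line.split("#", 1)[0].split()  # ignore trailing comments
        if len(fields) == 2 and fields[1] == "atoms":
            return int(fields[0])
    return None


def atoms_in_first_dump_frame(lines):
    """Return the count that follows the first "ITEM: NUMBER OF ATOMS" line."""
    it = iter(lines)
    for line in it:
        if line.startswith("ITEM: NUMBER OF ATOMS"):
            return int(next(it).split()[0])
    return None


# Tiny in-memory stand-ins for the tops of a DATA file and a dump file.
data_header = ["LAMMPS data file", "", "8192 atoms", "3 atom types"]
dump_head = ["ITEM: TIMESTEP", "50000", "ITEM: NUMBER OF ATOMS", "8192"]
counts_match = (atoms_in_data_file(data_header)
                == atoms_in_first_dump_frame(dump_head))
print("atom counts match:", counts_match)
```

With real files you would pass open file objects instead of the lists used here.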

If your coordinates are stored in a DUMP file (e.g. "traj.lammpstrj"),
you can create a new data file this way:
```
dump2data.py -t 10000 data_file < traj.lammpstrj > new_data_file
```
In this example, "10000" is the timestep for the frame you have selected.  You can use *-last* to select the last frame.  If you do not specify the frame you want, multiple data files may be created.  **WARNING: dump2data.py is slow**.  (If you have a long trajectory file, I recommend using the *tail* and *head* unix commands to extract the portion of the trajectory file containing the frame you want before reading it with dump2data.py.  This will be much faster than using the *-t* or *-last* arguments.)
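
If you would rather not work out line offsets for *head* and *tail*, the same pre-trimming can be done with a few lines of python.  The sketch below copies out only the frame with the requested timestep.  (The function `extract_frame` is hypothetical.  It assumes a text dump file in which every frame begins with an "ITEM: TIMESTEP" line followed by the timestep number.)

```python
def extract_frame(lines, timestep):
    """Return the lines of the frame with this timestep, or [] if not found."""
    frame, keep = [], False
    it = iter(lines)
    for line in it:
        if line.startswith("ITEM: TIMESTEP"):
            if keep:
                break  # next frame reached: the requested frame is complete
            value = next(it, None)
            if value is None:
                break  # file ended right after the header line
            keep = int(value.split()[0]) == timestep
            if keep:
                frame = [line, value]
            continue
        if keep:
            frame.append(line)
    return frame


dump = """ITEM: TIMESTEP
5000
ITEM: NUMBER OF ATOMS
1
ITEM: BOX BOUNDS pp pp pp
0 10
0 10
0 10
ITEM: ATOMS id type x y z
1 1 1.0 2.0 3.0
ITEM: TIMESTEP
10000
ITEM: NUMBER OF ATOMS
1
ITEM: BOX BOUNDS pp pp pp
0 10
0 10
0 10
ITEM: ATOMS id type x y z
1 1 1.5 2.5 3.5
""".splitlines()

frame = extract_frame(dump, 10000)
print("\n".join(frame))
```

The resulting single-frame file can then be passed to dump2data.py in place of the full trajectory.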

(You can use the "-atomstyle" argument with *dump2data.py* as well.)

Creating multiple data files:
The "-multi" command line argument tells "dump2data.py" to generate a new data file for each frame in the trajectory/dump-file.  Those files will have names ending in ".1", ".2", ".3", ...  (If you use the *-interval* argument, frames in the trajectory whose timestep is not a multiple of the interval will be discarded.)  This (probably) occurs automatically whenever the trajectory file contains multiple frames, unless you have specified the frame you want (using the *-t* or *-last* arguments).


### Examples using optional command line arguments

If you want to select a particular frame from the trajectory, use:
```
dump2data.py -xyz -t 10000 < traj.lammpstrj > coords.xyz
```
To select the most recent (complete) frame, use:
```
dump2data.py -xyz -last < traj.lammpstrj > coords.xyz
```
(If the last frame is incomplete, this script will attempt to use the previous frame.)

If you want to select multiple frames, but there are too many frames in your trajectory, you can run dump2data.py this way...
```
dump2data.py -xyz -interval 10000 < traj.lammpstrj > traj.xyz
```
...to indicate the desired interval between frames (it must be a multiple of
the save interval).  You can also use the "-tstart 500000" and "-tstop 1000000" arguments to limit the output to a particular range of timesteps (500000-1000000 in this example).
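
The combined effect of these three arguments can be pictured as a simple filter on the timestep of each frame.  Here is a python sketch of that idea.  (The function `keep_frame` is hypothetical.  Treating both ends of the tstart/tstop range as inclusive is my assumption for this illustration, so check the behavior of dump2data.py itself if the end points matter to you.)

```python
def keep_frame(timestep, interval=None, tstart=None, tstop=None):
    """Return True if a frame with this timestep passes all of the filters."""
    if tstart is not None and timestep < tstart:
        return False
    if tstop is not None and timestep > tstop:  # inclusive upper bound (assumed)
        return False
    if interval is not None and timestep % interval != 0:
        return False  # timestep is not a multiple of the interval
    return True


saved = [0, 250000, 500000, 750000, 1000000, 1250000]
selected = [t for t in saved
            if keep_frame(t, interval=500000, tstart=500000, tstop=1000000)]
print(selected)  # [500000, 1000000]
```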

### Arguments for scaling and centering coordinates

#### -center

This will center the coordinates around the geometric center, so that the average position of the atoms in each frame is located at the origin.  (This script attempts to pay attention to the periodic image flags.  As such, I think this script works with triclinic cells, but I have not tested that feature carefully.)
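
For a single frame, the arithmetic amounts to subtracting the average position from every atom.  A minimal python sketch follows.  (The function `center_frame` is hypothetical, and unlike the script this sketch ignores periodic image flags entirely.)

```python
def center_frame(coords):
    """Shift a list of (x, y, z) tuples so their average position is the origin."""
    n = len(coords)
    cx = sum(x for x, _, _ in coords) / n
    cy = sum(y for _, y, _ in coords) / n
    cz = sum(z for _, _, z in coords) / n
    return [(x - cx, y - cy, z - cz) for x, y, z in coords]


centered = center_frame([(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)])
print(centered)  # [(-1.0, -2.0, -3.0), (1.0, 2.0, 3.0)]
```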

#### -scale 1.6

This will multiply the coordinates by a constant (e.g. "1.6").  *(Warning: This argument has not been tested with trajectory files containing periodic image flags: ix iy iz)*

## Limitations

### Speed
The program is slow.  If speed is important to you, you probably should write your own custom script or use pizza.py which might be faster.  (Again, alternatively, you can use the unix *head* and *tail* commands to extract the portion of the trajectory file you are interested in beforehand.)

### Triclinic cells
Support for triclinic cells has been added, but not tested.

### Exotic atom_styles

This script was designed to work with point-like atoms.  It extracts the
x,y,z coordinates (and, if present, the vx,vy,vz velocities)
and (by default) copies them to the new data file being created by this script.

By default, this script assumes you are using "atom_style full".
If you are using some other atom style (e.g. "hybrid bond dipole"), then you can try to run it this way:
```
dump2data.py -t 10000 \
  -atomstyle "hybrid bond dipole" \
  old_data_file < traj.lammpstrj > new_data_file
```
In general, the -atomstyle argument can be any of the atom styles listed in the
table at:
https://docs.lammps.org/atom_style.html
...such as "angle", "bond", "charge", "full", "molecular", "dipole",
"ellipsoid", or any hybrid combination of these styles.
(When using hybrid atom styles, you must enclose the argument in quotes,
for example: "hybrid sphere dipole")

*Warning: I have not tested using dump2data.py with exotic (non-point-like)
atom styles.
I suspect that the script will not crash, but the dipole or ellipsoid
orientations might not be updated and may remain pointing in their
initial directions.
I suspect that "tri", "template", and "body" atom styles will not work at all.*

You can also customize the order of the columns you want to appear in that file using
-atomstyle "molid x y z atomid atomtype mux muy muz".
*(But again, I worry that the mux, muy, muz information in the new data
file might be out of date.)*
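
To illustrate what such a column list means, the python sketch below uses the same string to look up fields in one line of an "Atoms" section.  (The helper `column_indices` and the sample line are made up for illustration.  This is not how dump2data.py is implemented.)

```python
def column_indices(column_names):
    """Map each column name in a space-separated list to its position."""
    return {name: i for i, name in enumerate(column_names.split())}


cols = column_indices("molid x y z atomid atomtype mux muy muz")

# One hypothetical line from an "Atoms" section laid out in that column order.
fields = "1 0.5 1.5 2.5 42 3 0.0 0.0 1.0".split()
x, y, z = (float(fields[cols[k]]) for k in ("x", "y", "z"))
atomid = int(fields[cols["atomid"]])
print(atomid, x, y, z)  # 42 0.5 1.5 2.5
```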

Again, try using pizza.py if you are simulating systems with exotic data types.
http://pizza.sandia.gov/doc/Manual.html

I hope this is useful to someone.