<link rel="stylesheet" href="socnetv.css" type="text/css" />

<h4><a name="Formats" id="Formats"></a>Supported Formats</h4>
<div class="text">

       SocNetV supports many network formats: 
	<ul>
		<li>GraphML (.graphml)</li>
		<li>GraphViz (.dot)</li>
		<li>Adjacency matrix (.net, .txt)</li>
		<li>Pajek-like (.net)</li>
		<li>UCINET's Data Language (.dl)</li>

	</ul>

You can load these kinds of files from the menu File &gt; Load or by specifying them explicitly on the command line.
SocNetV uses simple inspection routines to check the format of the given file.
In most cases, it will load the file regardless of its extension.

</div>
<div class="text">
<br />
<strong>WARNING: </strong>The default file format of SocNetV is GraphML. If you create a new network and press Ctrl+S to save it, then by default SocNetV will save it in GraphML format.


</div>
<h4><a name="GraphML" id="GraphML"></a>GraphML files</h4>
<p class="text">
Each GraphML document is written in a special form of XML and defines a graph. For instance, the code below defines a graph with 11 nodes and 12 edges:
<p class="code">
&lt;?xml version="1.0" encoding="UTF-8"?&gt;<br />
&lt;graphml xmlns="http://graphml.graphdrawing.org/xmlns"  <br />
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"<br />
    xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns<br />
     http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd"&gt;<br />
  &lt;graph id="G" edgedefault="undirected"&gt;<br />
    &lt;node id="n0"/&gt;<br />
    &lt;node id="n1"/&gt;<br />
    &lt;node id="n2"/&gt;<br />
    &lt;node id="n3"/&gt;<br />
    &lt;node id="n4"/&gt;<br />
    &lt;node id="n5"/&gt;<br />
    &lt;node id="n6"/&gt;<br />
    &lt;node id="n7"/&gt;<br />
    &lt;node id="n8"/&gt;<br />
    &lt;node id="n9"/&gt;<br />
    &lt;node id="n10"/&gt;<br />
    &lt;edge source="n0" target="n2"/&gt;<br />
    &lt;edge source="n1" target="n2"/&gt;<br />
    &lt;edge source="n2" target="n3"/&gt;<br />
    &lt;edge source="n3" target="n5"/&gt;<br />
    &lt;edge source="n3" target="n4"/&gt;<br />
    &lt;edge source="n4" target="n6"/&gt;<br />
    &lt;edge source="n6" target="n5"/&gt;<br />
    &lt;edge source="n5" target="n7"/&gt;<br />
    &lt;edge source="n6" target="n8"/&gt;<br />
    &lt;edge source="n8" target="n7"/&gt;<br />
    &lt;edge source="n8" target="n9"/&gt;<br />
    &lt;edge source="n8" target="n10"/&gt;<br />
  &lt;/graph&gt;<br />
&lt;/graphml&gt;<br />
</p>

<p class="text"> 
All GraphML files consist of a graphml element and a variety of subelements: graph, node, edge, key.
SocNetV understands all of them. 
</p>

<p class="text">
Nodes are defined by the &lt;node id="n1" /&gt; element, where id is a unique node identification string. This id is used in the edge declarations that follow.
<br /> 
Edges are defined by the &lt;edge source="n1" target="n2" /&gt; element, where source and target must be equal to existing node ids. 
 


</p>
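<p class="text">
As a rough illustration of the node and edge declarations described above, the following Python sketch writes a small GraphML document with the standard library and reads it back. It is not SocNetV's own parser; the helper names build_graphml and read_graphml are hypothetical and exist only for this example.
</p>
<div class="code">
<pre>
```python
import xml.etree.ElementTree as ET

# GraphML namespace, as declared in the example document above.
GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def build_graphml(node_ids, edges):
    """Return GraphML text for an undirected graph (hypothetical helper)."""
    root = ET.Element("graphml", {"xmlns": GRAPHML_NS})
    graph = ET.SubElement(root, "graph", {"id": "G", "edgedefault": "undirected"})
    for node_id in node_ids:
        ET.SubElement(graph, "node", {"id": node_id})
    for source, target in edges:
        ET.SubElement(graph, "edge", {"source": source, "target": target})
    return ET.tostring(root, encoding="unicode")


def read_graphml(text):
    """Return (node_ids, edges) read from the first graph element."""
    ns = {"g": GRAPHML_NS}
    graph = ET.fromstring(text).find("g:graph", ns)
    node_ids = [node.get("id") for node in graph.findall("g:node", ns)]
    edges = [(e.get("source"), e.get("target")) for e in graph.findall("g:edge", ns)]
    # Edge endpoints must be ids of declared nodes.
    for source, target in edges:
        if source not in node_ids or target not in node_ids:
            raise ValueError("edge refers to an undeclared node id")
    return node_ids, edges


text = build_graphml(["n0", "n1", "n2"], [("n0", "n2"), ("n1", "n2")])
nodes, edges = read_graphml(text)
print(nodes, edges)
```
</pre>
</div>
<p class="text">
The check inside read_graphml mirrors the rule that source and target must match existing node ids.
</p>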


<h4><a name="GraphViz" id="GraphViz"></a>GraphViz files</h4>
<p class="text">
This is the file format of the Graphviz layout package. Unfortunately, I have not yet managed to implement the whole specification of this nice format. The features that SocNetV recognizes are shown in the following example:
</p>
<p class="code">
digraph mydot { <br />
node [color=red, shape=box]; <br />
a -> b -> c ->d <br />
node [color=pink, shape=circle]; <br />
d->e->a->f->j->k->l->o <br />
[weight=1, color=black]; <br />
}
</p>
<p class="text"> 
 Nodes are defined by the "node" declaration, which sets the color and the shape of the nodes that follow it. Each link between node labels is denoted by "->" in directed graphs (digraphs) and by "--" in undirected graphs (graphs). For instance, "a -> b" means a directed edge from a to b. Moreover, links can have weights and colours.
</p>
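<p class="text">
To make the chained notation concrete, here is a minimal Python sketch that splits a directed statement such as "a -> b -> c" into separate arcs. It only handles plain "->" chains without attributes, and the function name chain_to_arcs is made up for this example; it says nothing about how SocNetV itself parses the file.
</p>
<div class="code">
<pre>
```python
def chain_to_arcs(statement):
    """Split a statement such as 'a -> b -> c' into (source, target) pairs."""
    names = [name.strip() for name in statement.split("->")]
    if "" in names:
        raise ValueError("empty node label in edge statement")
    # Pair every label with the one that follows it in the chain.
    return list(zip(names, names[1:]))


arcs = chain_to_arcs("a -> b -> c ->d")
print(arcs)
```
</pre>
</div>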

<h4><a name="Adjacency" id="Adjacency"></a>Adjacency Matrix files</h4>
<p class="text">
The adjacency sociomatrix format is a very simple one. <br />
It describes one-mode networks and consists of a plain NxN matrix, where N is the number of nodes. Each (i,j) element is a number. <br />
If (i,j)=0 then there is no arc from node i to node j. <br />
If (i,j)=x, where x is a non-zero number, then there will be an arc from node i to node j with weight x. <br />
Negative weights are allowed. They are depicted as dashed lines when the network is visualised on the canvas. 
</p>
<p class="text"> 
 This is an example of a network in adjacency sociomatrix format:

<div class="code">
<pre> 
0000000011
0000101100
0001100000
0010000010
0110001000
0000001100
0100110001
0100010000
1001000000
1000001000 
</pre>
</div>
</p>
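<p class="text">
The rules above can be sketched in a few lines of Python. This is only an illustration, under the assumption that the cells of each row are separated by whitespace; the function name matrix_to_arcs and the 3x3 sample are invented for the example.
</p>
<div class="code">
<pre>
```python
def matrix_to_arcs(text):
    """Return (i, j, weight) for every non-zero cell; nodes are counted from 1."""
    rows = [[float(cell) for cell in line.split()]
            for line in text.splitlines() if line.strip()]
    size = len(rows)
    if any(len(row) != size for row in rows):
        raise ValueError("an adjacency matrix must be square (NxN)")
    return [(i + 1, j + 1, weight)
            for i, row in enumerate(rows)
            for j, weight in enumerate(row)
            if weight != 0]


SAMPLE = """
0 1 0
0 0 -2
1 0 0
"""

arcs = matrix_to_arcs(SAMPLE)
for source, target, weight in arcs:
    # Negative weights are the ones drawn as dashed lines.
    style = "solid" if weight > 0 else "dashed"
    print(source, target, weight, style)
```
</pre>
</div>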


<h4><a name="Two-mode Sociomatrix" id="TwoModeSM"></a>Two-mode Sociomatrix files</h4>
<p class="text">
One-mode networks describe direct links between actors of the same type, but networks can be two-mode as well. Two-mode networks describe either two sets of actors or a set of actors and a set of associated events. <br /> <br />

In the first case, usually called a dyadic two-mode network, there are two sets of actors. The sociomatrix codifies the relations between actors in the first set and actors in the second set. <br /> <br />

In the second case, usually called an affiliation network, there is a set of actors and a set of events or organizations. The sociomatrix records the attendance or affiliation of each actor (first mode) with a particular event or organization (second mode). <br /> <br />

Two-mode networks are described by affiliation network matrices, where A(i,j) codes the events/organizations with which each actor is affiliated. <br />

A two-mode sociomatrix is an NxM matrix, where N is the number of actors and M is the number of events. Each (i,j) element can be 0 or 1. <br /> <br />
If A(i,j)=1 then actor i is affiliated with event j. <br />
</p>
<p class="text">
 This is an example of a network in two-mode sociomatrix format:

<div class="code">
<pre>
0 0 1 1 0 0 0 0 1 
0 0 1 0 1 0 1 0 0 
0 0 1 0 0 0 0 0 0 
0 1 1 0 0 0 0 0 0 
0 0 1 0 0 0 0 0 0
0 1 1 0 0 0 0 0 0
0 0 1 1 0 0 0 0 0
0 0 0 1 0 0 1 0 0
1 0 0 1 0 0 0 1 0
0 0 1 0 0 0 0 0 1
0 1 1 0 0 0 0 0 1
0 0 0 1 0 0 1 0 0
0 0 1 1 1 0 0 0 1
</pre>
</div>
</p>
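<p class="text">
As a small, non-authoritative sketch of how such a matrix can be read, the Python snippet below lists the events each actor is affiliated with. The function name affiliations is hypothetical, and the sample consists of the first three rows of the matrix above.
</p>
<div class="code">
<pre>
```python
def affiliations(text):
    """Map each actor (row, from 1) to the events (columns, from 1) marked with 1."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    result = {}
    for actor, cells in enumerate(rows, start=1):
        result[actor] = [event for event, cell in enumerate(cells, start=1)
                         if cell == "1"]
    return result


SAMPLE = """
0 0 1 1 0 0 0 0 1
0 0 1 0 1 0 1 0 0
0 0 1 0 0 0 0 0 0
"""

events_by_actor = affiliations(SAMPLE)
print(events_by_actor)
```
</pre>
</div>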


<h4><a name="Pajek" id="Pajek"></a>Pajek-like formatted files</h4>
<p class="text">
Note the 'Pajek-like' part. This is because real Pajek files can be much more complicated than the ones recognised by SocNetV. To be more precise, here is an example of the Pajek-like form that SocNetV understands. The numbers on the left only indicate line numbers; they are not part of the file.
</p>
<div class="code">
<pre>
 1) *Network 
 2) *Vertices 6
 3) 1 "pe0" ic LightGreen 0.5 0.5 box
 4) 2 "pe1" ic LightYellow 0.8473 0.4981 ellipse
 5) 3 "pe2" ic LightYellow 0.6112 0.8387 triangle
 6) 4 "pe3" ic LightYellow 0.201 0.7205 diamond
 7) 5 "pe4" ic LightYellow 0.2216 0.2977 ellipse
 8) 6 "pe5" ic LightYellow 0.612 0.1552 circle
 9) *Arcs 
 10) 1 2 1 c black
 11) 1 3 -1 c red
 12) 2 4 1 c black
 13) 3 5 1 c black
 14) *Edges 
 15) 6 4 1 c black 
 16) 5 6 1 c yellow
</pre>
</div>
</p>
 <p class="text">
 Let me analyse this a little bit:
</p>
 <p class="text">
The first line (*Network) declares that this is a Pajek network. 
</p>
 <p class="text">
The second line (*Vertices 6) declares the number of vertices of the network and indicates that the following lines describe node properties. 
</p>
<p class="text">
Each of the following 6 lines (3-8) constructs one node. Each node line has 7 columns (properties): <br />
Column 1 denotes the node's number.<br />
Column 2 denotes the node's label. <br />
Column 3 (the keyword ic) indicates that the next column carries the colour of the node's shape.<br />
Column 4 denotes the colour of the node's shape. <br />
Column 5 denotes the proportional X coordinate of the specific node on the canvas.<br />
Column 6 denotes the proportional Y coordinate of the specific node on the canvas. <br />
Column 7 denotes the node's shape. <br />
</p>
<p class="text">
Line 9 (*Arcs) indicates that the following lines describe arcs from one node to another. 
Each of the lines 10-13 constructs one arc. For instance, Line 10 constructs an arc from node 1 to node 2 with weight 1 and black colour. 
</p>
<p class="text">
Line 14 (*Edges) indicates that the following lines describe edges (double arcs) between nodes.
Each of the lines 15-16 constructs one edge. For instance, Line 15 constructs an edge between node 6 and node 4 with weight 1 and black colour. 
</p>

<p class="text">
Note that it is legal to have mixed columns in a Pajek-like network file. For instance, you can have a node specification line like this:

<p class="code">
 4 "label" 0.201 0.7205 ic LightYellow diamond
</p>
</p>
 <p class="text">
 Also, it is not necessary to declare X and Y coordinates or colours and shapes. In that case SocNetV will use the defaults, that is, red diamonds scattered randomly across the canvas. Nevertheless, the first two columns must be a valid node number and label. 
</p>
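<p class="text">
The following Python sketch shows one way a node line with mixed or missing columns could be read. It is an assumption-laden illustration, not the routine SocNetV uses: the function name parse_vertex is hypothetical, the first number after the label is taken as X and the second as Y, and any other word is taken as the shape. Missing properties are left as None so that the application defaults can be applied afterwards.
</p>
<div class="code">
<pre>
```python
import shlex


def parse_vertex(line):
    """Parse one Pajek-like node line into a dict (illustrative only)."""
    tokens = iter(shlex.split(line))      # keeps a quoted label as one token
    vertex = {"number": int(next(tokens)), "label": next(tokens),
              "colour": None, "x": None, "y": None, "shape": None}
    for token in tokens:
        if token == "ic":                 # "ic" announces the colour column
            vertex["colour"] = next(tokens, None)
            continue
        try:
            value = float(token)
        except ValueError:
            vertex["shape"] = token       # any other word is taken as the shape
        else:
            # The first number found is X, the second one is Y.
            key = "x" if vertex["x"] is None else "y"
            vertex[key] = value
    return vertex


ordered = parse_vertex('4 "pe3" ic LightYellow 0.201 0.7205 diamond')
mixed = parse_vertex('4 "pe3" 0.201 0.7205 ic LightYellow diamond')
minimal = parse_vertex('4 "pe3"')
print(ordered)
print(ordered == mixed)
```
</pre>
</div>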
 <p class="text">
 Note also that weights might be negative as in line 11. Negative weights are depicted as dashed lines on the canvas.
</p>
 <p class="text">
 Colour names are not arbitrary. Valid colour names for nodes and arcs/edges are those specified in the X11 file /usr/X11R6/lib/X11/rgb.txt, e.g. red, gray, violet, navy and green. You can change the colours of all network elements from inside SocNetV.
</p>

<p class="text">
SocNetV also supports Pajek files which declare edges/arcs in matrices, like this:
<div class="code">
<pre>
*Vertices     11
     1 "minister1"                              0.2912    0.2004 ellipse
     2 "pminister"                              0.4875    0.0153 diamond
     3 "minister2"                              0.3537    0.3416 ellipse
     4 "minister3"                              0.4225    0.5477 ellipse
     5 "minister4"                              0.4538    0.1603 ellipse
     6 "minister5"                              0.4900    0.3836 ellipse
     7 "minister6"                              0.6212    0.5038 ellipse
     8 "minister7"                              0.6450    0.2023 ellipse
     9 "advisor1"                               0.6488    0.6031 box
    10 "advisor2"                               0.3212    0.5515 box
    11 "advisor3"                               0.7188    0.4218 box
*Matrix
 0 1 1 0 0 1 0 0 0 0 0
 0 0 0 0 0 0 0 1 0 0 0
 1 1 0 1 0 1 1 1 0 0 0
 0 0 0 0 0 0 1 1 0 0 0
 0 1 0 1 0 1 1 1 0 0 0
 0 1 0 1 1 0 1 1 0 0 0
 0 0 0 1 0 0 0 1 1 0 1
 0 1 0 1 0 0 1 0 0 0 1
 0 0 0 1 0 0 1 1 0 0 1
 1 0 1 1 1 0 0 0 0 0 0
 0 0 0 0 0 1 0 1 1 0 0
</pre>
</div>
<p class="text">
Here, the *Matrix tag replaces *Arcs or *Edges. An ordinary adjacency matrix follows describing all links.
</p>

<p class="text">
Another possibility is the *Arcslist tag. When SocNetV finds that tag in a Pajek file, it expects each line to start with a node number followed by the list of nodes it links to. Here is an example:
</p>
<div class="code">
<pre>
*Vertices 9
1
2
3
4
5
6
7
8
9
*Arcslist
2 1 3 9
1 3 4 5
3 1 4 7
4 1 2 3
5 1 3 4
7 2 8 9
</pre>
</div>

<p class="text"> For instance, the first line after *Arcslist means: "node 2 is connected to nodes 1, 3 and 9". It is very simple.</p>
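<p class="text">
The expansion of such lines into individual arcs can be sketched as follows. The function name expand_arcslist is made up for this illustration, and the input is assumed to contain only the lines that follow the *Arcslist tag.
</p>
<div class="code">
<pre>
```python
def expand_arcslist(lines):
    """Turn lines like '2 1 3 9' into (source, target) arcs."""
    arcs = []
    for line in lines:
        numbers = [int(token) for token in line.split()]
        if not numbers:
            continue                      # skip blank lines
        source, targets = numbers[0], numbers[1:]
        arcs.extend((source, target) for target in targets)
    return arcs


arcs = expand_arcslist(["2 1 3 9", "1 3 4 5"])
print(arcs)
```
</pre>
</div>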


<h4><a name="DL" id="DL"></a>UCINET's DL files</h4>
<p class="text">



UCINET's DL format is one of the easiest to understand. For the moment, we support only FULL MATRIX mode. Each file starts with the "DL" mark, followed by the number N of nodes and the format (e.g. whether a diagonal is present or not). Then, after the "LABELS:" mark, the labels of the nodes are read line by line, one label per line. That is, if N is 100 then 100 labels are expected. Finally, a DL file declares the network data after the "DATA:" mark, which is simply the matrix of edge weights. For instance, the network below contains 4 nodes and 7 arcs/edges: 
<div class="code">
<pre>
DL
N=4
FORMAT = FULLMATRIX DIAGONAL PRESENT
LABELS:
On the normalization and visualization of author co-citation data: Salton's cosine versus the Jaccard index
Caveats for the use of citation indicators in research and journal evaluations
Should co-occurrence data be normalized? A rejoinder
Home on the range - What and where is the middle in science and technology studies?
DATA:
0 0 0.158114 0
0.201234 0 1 0
1 0 0 0
0.1 1 1 0
</pre>
</div>
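<p class="text">
As a closing illustration, here is a minimal Python sketch of reading a full-matrix DL file laid out exactly like the example above (the "DL" mark, then N, the format line, the labels and the data). The function name parse_dl and the short labels A to D are assumptions made for the example; real DL files may be laid out differently.
</p>
<div class="code">
<pre>
```python
def parse_dl(text):
    """Return (labels, matrix) from a full-matrix DL text (illustrative only)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if lines[0].upper() != "DL":
        raise ValueError("a DL file must start with the DL mark")
    n = int(lines[1].split("=")[1])       # the line "N=4" gives 4
    labels_at = lines.index("LABELS:")
    data_at = lines.index("DATA:")
    labels = lines[labels_at + 1:data_at]
    matrix = [[float(cell) for cell in row.split()]
              for row in lines[data_at + 1:]]
    if len(labels) != n or len(matrix) != n:
        raise ValueError("expected N labels and N matrix rows")
    return labels, matrix


SAMPLE = """DL
N=4
FORMAT = FULLMATRIX DIAGONAL PRESENT
LABELS:
A
B
C
D
DATA:
0 0 0.158114 0
0.201234 0 1 0
1 0 0 0
0.1 1 1 0
"""

labels, matrix = parse_dl(SAMPLE)
# Every non-zero cell is one arc/edge.
links = sum(1 for row in matrix for cell in row if cell != 0)
print(labels, links)
```
</pre>
</div>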