File: tutorial_advanced.docbook

package info (click to toggle)
tulip 3.7.0dfsg-4
links: PTS, VCS
area: main
in suites: wheezy
size: 39,428 kB
sloc: cpp: 231,403; php: 11,023; python: 1,128; sh: 671; yacc: 522; makefile: 315; xml: 63; lex: 55
file content (390 lines) | stat: -rwxr-xr-x 24,713 bytes
<chapter id="tutorial_advanced"><title>Advanced tutorial: real use cases</title>
  <sect1 id="improve-layout"><title>Layouts</title>
    <sect2 id="improve-layout-intro"><title>Introduction</title>
    	<para>In this tutorial, we learn to use layouts algorithms and graph properties.
				We will first import a file system structure, then use the graph properties to find specific files. (*.cpp, *.hpp, ...)</para>
    </sect2>
    <sect2 id="improve-layout-import"><title>File system import</title>
    	<para>To create a graph representing a directory structure, click menu <code>File → Import → Misc → File System Directory</code>.
				Select a directory not too big (to avoid a too long loading time)
				Each folder and file is represented as a node, and an edge link each node to its parent folder.
    	</para>
    	<graphic fileref="images/tutorials_improvelayout_1.png" />
     	<remark>To follow the rest of this tutorial you need to download and open (<code>File → Open</code>) this tulip compressed graph file:
				<ulink url="http://www.labri.fr/perso/auber/projects/tulip/samples/filesystem.tlp.gz">Graph</ulink>.</remark>
    </sect2>
    <sect2 id="improve-layout-newlayout"><title>Layouts</title>
     	<sect3 id="tuto-Bubble-Tree-layout"><title><code>Bubble Tree Layout</code></title>
     		<para>This layout algorithm can be found in menu <code>Algorithm → Layout → Tree</code>.
     			The bubble tree layout is useful to notice directories that have similar structures, as we now demonstrate.</para>
				<para>First, have two drawing side by side:
					create a second drawing (a Node Link Diagram View) by clicking <code>View → Node Link Diagram View</code>.
					Then reorganize the windows by clicking <code>Windows → Tile</code></para>.
     		<para>Now that we can see both views at the same time, zoom in on the right window:
					select the zoom tool <inlinegraphic fileref="images/i_zoom.png"/> and draw a bounding box on the left side of the graph like this:</para>
     		<graphic fileref="images/tutorials_improvelayout_bubble_1.png" />
				<para>Two nodes looks very alike. With the <code>Get Information</code> tool <inlinegraphic fileref="images/i_select.png"/>,
						we can get their ids: node 1965 and node 2009.
					It makes sense that these two directories have the same structure, since they both contain plug-ins source files.</para>
     		<para>Now, recenter the view (<code>View → Center View</code> or <keycap>Ctrl+Shift+C</keycap>). </para>
				<para>Let's say that we now want to study our graph without the plug-in directory.
					To delete the whole directory (in the graph, not on disk), select it, then select all its sub directories, like this:</para>
				<itemizedlist>
					<listitem>
						<para>Select the plug-in directory: use the Find tool (<code>Edit → Find</code> or <keycap>Ctrl+F</keycap>),
							with parameters <code>Input property</code>:"name", filter: "= plugins".</para>
					</listitem>
					<listitem>
						<para>Select all its sub-directories: click <code>Algorithm → Selection → Reachable Sub-Graph</code> with distance set to 50. </para>
					</listitem>
					<listitem>
						<para>Delete the selected elements: press <keycap>Del</keycap></para>
					</listitem>
					<listitem>
						<para>Re-run the bubble layout: <code>Algorithm → Layout → Tree → Bubble Tree</code>.</para>
					</listitem>
				</itemizedlist>
				<para>The graph should look like this:</para>
				<graphic fileref="images/tutorials_improvelayout_bubble_2.png" />
     	</sect3>

     	<sect3 id="improve-layout-treemap"><title><code>Tree Map layout</code></title>
     		<para>This Layout algorithm can be found in <code>Algorithm → Layout → Tree</code>.
     		  The treemap layout is useful to notice large disk usage files.</para>
		    <para>First, make sure that <code>Options → Force Ratio</code> is not checked.</para>
     		<para>Apply the layout algorithm: <code>Algorithm → Layout → Tree → Squarified Tree Map</code> with the following parameters:
     		</para>
     		<itemizedlist>
     			<listitem><para>Metric: "size".
     			</para></listitem>
     			<listitem><para>Aspect Ratio: 1
     			</para></listitem>
     			<listitem><para>TreeMap Type?: checked
     			</para></listitem>
     		</itemizedlist>
     		<para>The picture should look like this:</para>
     		<graphic fileref="images/tutorials_improvelayout_treemap_1.png" />
     		<para>This drawing helps to see filesizes and notice larger files.
					It represent the hierarchy of directories as nested nodes.
					See <ulink url="http://en.wikipedia.org/wiki/Tree_map">Wikipedia: Tree Maps </ulink> for more information. </para>
     	</sect3>
     	<sect3 id="improve-layout-improvewalker"><title><code> Improved Walker layout </code></title>
     		<para>This is a hierarchical layout, useful to understand the tree structure of a file system.</para>
     		<para>Click <code>Algorithm → Layout → Tree → Improved Walker</code>.
					Use the following parameters:</para>
     		<itemizedlist>
     			<listitem><para>Node Size: viewSize
     			</para></listitem>
     			<listitem><para>Orientation: Left to right.
     			</para></listitem>
     			<listitem><para>Orthogonal: Checked.
     			</para></listitem>
     			<listitem><para>Layer Spacing: Default value.
     			</para></listitem>
     			<listitem><para>Node Spacing: Default Value.
     			</para></listitem>
     		</itemizedlist>
     		<para>The result should look like this:</para>
     		<graphic fileref="images/tutorials_improvelayout_2.png" />
     	</sect3>
    </sect2>

    <sect2 id="improve-layout-labels"><title>Labels</title>
			<para>Click <code>View Editor → Layer Manager</code> and check the <code>Nodes Label</code> box in the first column.
     		<para>To view labels, click <code>Graph Editor → Property → name</code> then <code>To labels</code>. 
					You should now see a graph looking like this:</para>
     		<graphic fileref="images/tutorials_improvelayout_3.png" />
     		<para>The labels don't fit in nodes, to fix this,
					go to the <code>Rendering Parameters</code> tab and check <code>Scale labels to fit node sizes</code>.</para>
				<para>The picture should look like this:</para>
     		<graphic fileref="images/tutorials_improvelayout_4.png" />
     		<para>By zooming in we can see the labels:</para>
     		<graphic fileref="images/tutorials_improvelayout_5.png" />
     		<para>If you want to display the big file (high disk space) with big nodes, use the <code>Algorithm → Size → Metric Mapping</code> algorithm with parameter "property" set to "size". </para>
			</para>
    </sect2>

    <sect2 id="improve-layout-ext"><title>Showing a specific kind of file</title>
     	<para>Now that our graph has a nice and clear layout, we would like to locate all our source files (*.cpp).
				To do so, use Edit → Find (or <keycap>Ctrl+F</keycap>) with the following parameters:</para>
     	<itemizedlist>
     		<listitem><para>Input Property: name
     		</para></listitem>
     		<listitem><para>Filter:  filter function set to "=" and filter value set to ".*cpp or .*hpp".
						The filter supports regular expressions:
						'.' means any character and '*' means any number of times, so '.*' means any character, any number of times.
						('*' should not be confused with a wildcard character.)
						For details on regular expressions, see <ulink url="http://en.wikipedia.org/wiki/Regular_expression">Wikipedia: Regexp </ulink>.
     		</para></listitem>
     		<listitem><para>Options: check <code>Replace</code> and select <code>On nodes</code>.
     		</para></listitem>
     	</itemizedlist>
     	<para>Now that the source files are selected, we apply a different color to them.</para>
     	<para>Click <code>Graph Editor → Property</code>, select the node property "viewColor", check <code>selected only</code>.
				Now that we only see the "viewColor" property of the selected nodes,
				click <code>Set all</code> and choose (for instance) green.</para>
     	<para>The picture should look like this:</para>
     	<graphic fileref="images/tutorials_improvelayout_6.png" />
     	<graphic fileref="images/tutorials_improvelayout_treemap_6.png" />
    </sect2>
  </sect1>

  <sect1 id="infovis"><title>People in InfoVis</title>
    <sect2 id="people-in-infovis"><title>Analyzing an author</title>
    	<para>For this tutorial, download and open this <ulink url="http://tulip.labri.fr/samples/GRCite.tlp.gz">tulip graph file</ulink>.
				This graph represent relations between authors, conferences and papers in the InfoVis community.
			</para>
			<para>
				To distinguish between paper, conferences and authors, let us color them:
    	<itemizedlist>
    		<listitem> <para>Select all papers: click <code>Edit → Find</code> or Ctrl+F, set property: "type", set the filter "= 0".</para>
    		</listitem>
    		<listitem> <para>Color the nodes in red: click <code>Graph Editor → Property</code>,
						select the 'viewColor' property, check <code>selected only</code>, click <code>Set all</code> and choose red.</para>
    		</listitem>
				<listitem>Repeat for authors (type = 2) and conferences (type = 1) with other colors (here blue and yellow):</listitem>
    	</itemizedlist>
			</para>
    	<graphic fileref="images/tutorials_peopleinfovis_1.png" />
    	<para>Let's look at a single author, for instance George Robertson..
				Select that node with the Find tool, with "titleshort" "=" "G.*Rob.*" (this is a regular expression).
				Temporarily moving the node away gives a rough estimate of how connected he is:</para>
    	<graphic fileref="images/tutorials_peopleinfovis_2.png" />
    	<para>Now, focus on the graph around this author:
				<itemizedlist>
					<listitem>click <code>Algorithm → Selection → Reachable Sub-Graph</code>,
					leaving direction to 0 (outgoing edges), startingNodes to viewSelection (nodes selected, in our case simply Mr. Robertson), and distance 1.
					</listitem>
					<listitem>Click <code>Edit → Create Subgraph</code> to save this selection for further manipulation, naming it "GR.1hop.outgoing".
						Select this new subgraph in the hierarchy tab.
						We can deselect the nodes now: use selection tool and click away from any node or edge (or Ctrl+Shift+a).</listitem>
					<listitem>To improve the layout, click <code>Algorithm → Size → Auto Sizing</code>,
						then change the layout for <code>Algorithm → Layout → Hierarchical → Hierarchical Graph</code>.
						There is an even better hierarchical layout on the plugin server:
						<code>Help → Plugins → LaBRI Universite Bordeaux 1 → Layout → Sugiyama (OGDF)</code>.</listitem>
				</itemizedlist>
		  </para>
    	<para>We see which papers were published first, as they are cited by the later ones.
				Robertson has 11 papers referenced in this database. 
				If we apply a coloring by the number of citations (<code>Algorithm → Color → Metric Mapping</code>, with field property set to "arityOut" ) 
				we clearly see that "Cone Trees" is his most influential work:</para>
    	<graphic fileref="images/tutorials_peopleinfovis_3.png" />
    	<para>Now, 
				<itemizedlist>
					<listitem>return to the main graph, and make sure the Robertson node is away from the others.</listitem>
					<listitem>Select all the papers nodes (<code>Edit→Find</code>, "property=0").</listitem>
					<listitem>We want to add Robertson and all the edges going from or to this node to our selection. To do so, we need the selection tool, and by pressing  <keycap>Ctrl</keycap> while drawing a rectangle encompassing the node and all of its edges, we add to our current selection.
						Save this selection to a new subgraph named GRCitesub:
    				<graphic fileref="images/tutorials_peopleinfovis_4.png" />
					</listitem>
					<listitem>In the new subgraph, select the Robertson node.
						Click <code>Algorithm → Selection → Reachable Sub-Graph</code> algorithm again, but with a depth of 2 to find all papers that cite a paper written by Robertson.</listitem>
					<listitem>Now we want to remove any node or edge that is not selected:
						invert the selection (<code>Edit → Invert Selection</code> or <keycap>Ctrl + I</keycap>), delete (<keycap>Del</keycap>).
						This leaves us with Robertson and all his papers, and the papers citing his papers.</listitem>
					<listitem>Finally, applying a hierarchical layout yields this picture:</listitem>
				</itemizedlist>
			</para>
    	<graphic fileref="images/tutorials_peopleinfovis_5.png" />
    </sect2>

    <sect2 id="sec"><title>What are the relationships between researchers?</title>
    	<para>You can now close this graph since we use another one for the next part.
  			The next graph is similar to the this one but, has coauthorship edges linking authors who write papers together.
  			Download <ulink url="http://tulip.labri.fr/samples/GRSC.tlp.gz">it</ulink> and open it.</para>
    	<sect3 id="FocusingonTwoAuthors"><title>Focusing on Two Authors</title>
    		<para>Here we focus on the relationship between two authors, in this case Robertson and Card.</para>
    		<para>
					<itemizedlist>
						<listitem>
							Select Robertson like before: find, "titleshort" "=" "G.*Rob.*".
							Now, add the second one: <code>Edit → Find</code>, check option <code>add</code>, enter the regexp ".*Card.*".
						</listitem>
						<listitem>
    					Move those selected nodes away: use the "selection move" tool <inlinegraphic fileref="images/i_move.png"/>
							(if you click on one node, you will be able to move both at the same time):
    		<graphic fileref="images/tutorials_peopleinfovis_6.png" />
				</listitem>
    		<listitem>Select all their neighbors with <code>Algorithm → Selection → Reachable Sub-Graph</code> with distance 1.
					We get the graph of all publications and coauthors of Card and/or Robertson.
				</listitem>
				<listitem>Create a new Subgraph, apply a hierarchical layout:</listitem>
				</itemizedlist>
    		<graphic fileref="images/tutorials_peopleinfovis_7.png" />
    		<para>One can repeat those operations for Robertson only and Card only:</para>
    		<graphic fileref="images/tutorials_peopleinfovis_8.png" />
    		The similarity of these final three images shows the very strong ties between these two authors.</para>
    	</sect3>

    	<sect3 id="auth-conf"><title>Central Authors and Conferences: Overview</title>
				<para>You can now close this graph.
  				The next graph is similar to the old one,
					but the links are from author to papers, and from papers to conferences.
  				Download it <ulink url="http://tulip.labri.fr/samples/ConfAuth.tlp.gz">here</ulink> and open it in tulip.</para>
    		We used the following to show the large-scale structure of this dataset:
				<itemizedlist>
						<listitem>(already done) color authors in green, conferences in blue.</listitem>
						<listitem>Consider only the author and conference nodes (i.e. not the paper nodes):
							Find "type &gt;= 1",
							<code>Algorithm → Selection → Induced subgraph</code>,
							Create subgraph.
						</listitem>
						<listitem>GEM layout (<code>Algorithm → Layout → Force Directed → GEM</code>)</listitem>
					</itemizedlist>
    			<graphic fileref="images/tutorials_peopleinfovis_9.png" />
    		<para>Here is a close-up:</para>
    		<graphic fileref="images/tutorials_peopleinfovis_10.png" />
    		<para>To select only the giant connected component (the largest connected set nodes),
					select a node in the center of the giant component and run <code>Algorithm → Selection → Reachable Subgraph</code>
					with distance 1000 and directed=2.</para>
    		<!-- <para>Finally, we color papers in white.</para> <graphic fileref="images/tutorials_peopleinfovis_11.png" /> -->
				On this subgraph,
				<itemizedlist>
					<listitem>Highlight the structure with <code>Algorithm → Measure → Graph → Eccentricity</code>,
						with "Closeness centrality" checked.</listitem>
					<listitem>To distinguish authors and conferences: select all authors (nodes of type 2) and set their viewLabelColor property to red.
					</listitem>
				</itemizedlist>
    		<graphic fileref="images/tutorials_peopleinfovis_12.png" />
    	</sect3>

    	<sect3 id="top-conf"><title>Central Authors and Conferences: Top Conferences</title>
    		<para>To get a approximation of top conferences, we can look for the ones where the number of authors is high.
					To do so, click <code>Algorithm → Measure → Graph → Degree </code>:</para>
    		<graphic fileref="images/tutorials_peopleinfovis_13.png" />
    	</sect3>

    	<sect3 id="top-auth"><title>Central Authors and Conferences: Top Authors</title>
    		<para>Run <code>Algorithm → Measure → Graph → Strahler</code> to see strong ties between conferences and authors:</para>
    		<graphic fileref="images/tutorials_peopleinfovis_14.png" />
    		<para>Then run <code>Algorithm → General → Convolution</code>,
					the value of the discretization parameter should be near 30 to obtain 5 clusters:
					<graphic fileref="images/tutorials_peopleinfovis_21.png" />
					One gets the following clusters:</para>
    		<graphic fileref="images/tutorials_peopleinfovis_15.png" />
    		<para>The Strahler-Convolution clustering yields five clusters, according to increasing centrality. The first cluster is mostly yellow, and contains most of the data. The second cluster contains a next tier of 26 authors that have had a relatively strong impact. The third cluster contains a group of 7 influential authors (Chi, Bederson, Eick, Rao, Pirolli, Ward, and Brown), and the fourth cluster (Roth, Robertson, Keim, and Stasko) is yet more central. The fifth cluster is the single node of Mackinlay, and the last one is Card and Shneiderman.
					Our automatic clustering method clearly yields plausible results in this case.
    		</para>
    	</sect3>

    	<sect3 id="interauthor-cons"><title>Hierarchical Structure of Inter-author Connections</title>
    		<para>In the last section, we use <ulink url="http://tulip.labri.fr/samples/ConfAuthRecSmallWorld.tlp.gz">this graph</ulink>.</para>
    		<para>To see the metanode labels,
					Click <code>View editor → Layers Manager</code>, check <code>metanode label</code>.</para>
    		<para>The clusters have been computed using <code>Algorithm → General → Strength</code>.</para>
    		<para>The overview image from the previous section, showing the graph of all authors and papers, is extremely cluttered.
					The previous section showed one way to extract information (find the most important nodes by convolution clustering).
					Another clustering approach, small-world clustering, allows instead to navigate through a hierarchical subdivision of the entire dataset.
					The simplified overview allows to understand the graph's high level structure.
					The strength metric computes the number of cycles of length 3 and 4 passing through each edge, normalized by the maximum possible value.
					The first image shows the clustered dataset.
					Small-world navigation is useful when exploring an unfamiliar graph to quickly find the structure of complex components.
					The eccentricity metric (<code>Algorithm → Measure → Graph → Eccentricity</code>),
					  which measures whether nodes are peripheral (here in yellow) or central (here in blue), guides us to complexity immediately.
					This metric is O(n^3), but the small-world decomposition simplifies the graph, making the computation tractable.
    		</para>
    		<graphic fileref="images/tutorials_peopleinfovis_16.png" />
    		<para>Here is a close-up of the node that has many blue lines leading to it:
    		</para>
    		<graphic fileref="images/tutorials_peopleinfovis_20.png" />
    		<para>We then open that cluster (right-click on the metanode, <code>go inside</code>), which is itself a small-world graph:
    		</para>
    		<graphic fileref="images/tutorials_peopleinfovis_17.png" />
    		<para>We once again zoom towards the most central node with many blue edges,
					where we see a cluster containing the InfoVis 96 conference and all the authors who only published at the infovis community in that year.
    		</para>
    		<graphic fileref="images/tutorials_peopleinfovis_18.png" />
    		<graphic fileref="images/tutorials_peopleinfovis_19.png" />
    		<para> We see this star shaped small-world decomposition only for the InfoVis conferences, because of the nature of this dataset:
					only the InfoVis conferences have a complete set of authors and papers available.</para>
    	</sect3>
    </sect2>
  </sect1>
  <sect1 id="tuto-csv-import">
      <title>Importing data from CSV files</title>
      <sect2><title>Importing new nodes from CSV file</title>
	<para>The example below represents graph with ten entities. We will see how to import this graph in Tulip with the CSV import.</para>
	<literallayout>
	<code>	  
	  node_id;second_major;gender;major_index;year;dorm;high_school;student_fac
	  0;0;2;205;2006;169;15903;1
	  1;0;2;207;2005;0;3029;2
	  2;0;1;208;0;0;3699;2
	  3;0;2;228;2006;169;17763;1
	  4;206;2;204;2006;0;2790;1
	  5;0;2;228;2005;169;50029;2
	  6;0;1;223;2006;169;3523;1
	  7;0;1;208;2007;169;2780;1
	  8;0;2;205;2006;170;5477;1
	  9;0;1;228;0;0;23675;1	
	</code>
	</literallayout>
	<orderedlist>
	  <listitem><para>Launch Tulip.</para></listitem>
	  <listitem><para>Click on "File->New"</para></listitem>
	  <listitem><para>Click on “Import CSV data.”</para></listitem>
	  <listitem><para>Edit the location field and select the file created from the example.</para></listitem>
	  <listitem><para>Choose the “;” character as separator.</para></listitem>
	  <listitem><para>Click on next.</para></listitem>
	  <listitem><para>Choose the columns you want to import (in our case all).</para></listitem>
	  <listitem><para>Click on next.</para></listitem>
	  <listitem><para>Select “New entities” in the “Import data on” list.</para></listitem>
	  <listitem><para>Click on the “Finish” button to launch the import.</para></listitem>
	  <listitem><para>You can now work with the imported graph.</para></listitem>
	</orderedlist>
	</sect2>
	<sect2><title>Importing new entities from CSV file</title>
	<para>The example below represents graph with ten entities and eighteen relations. We will see how to import this graph in Tulip with the CSV import.</para>
	<literallayout>
	<code>
	  "Source","Target","second_major","gender","major_index"
	  0,3,0,2,205
	  0,4,0,2,207
	  0,5,0,1,208
	  0,6,0,2,228
	  0,8,206,2,204
	  0,9,0,2,228
	  1,2,0,1,223
	  2,3,0,1,208
	  2,4,0,2,205
	  2,6,0,1,228
	  2,7,200,1,201
	  3,6,0,2,199
	  3,7,0,2,202
	  3,9,0,2,199
	  4,8,0,2,209
	  4,9,200,1,201
	  5,7,206,2,223
	  8,9,0,1,223
	  </code>
	</literallayout>
	<orderedlist>
	  <listitem><para>Launch Tulip.</para></listitem>
	  <listitem><para>Click on "File->New"</para></listitem>
	  <listitem><para>Click on “Import CSV data.”</para></listitem>
	  <listitem><para>Edit the location field and select the file created from the example.</para></listitem>
	  <listitem><para>Choose the “,” character as separator.</para></listitem>
	  <listitem><para>Choose the “ ” ” character as text delimiter.</para></listitem>	  
	  <listitem><para>Click on next.</para></listitem>
	  <listitem><para>Choose the columns you want to import (in our case all).</para></listitem>
	  <listitem><para>Click on next.</para></listitem>
	  <listitem><para>Select “New relations” in the “Import data on” list.</para></listitem>
	  <listitem><para>Choose the “Source” column in the “Choose CSV column containing source entities id”.</para></listitem>
	  <listitem><para>Choose the “Target” column in the “Choose CSV column containing target entities id”.</para></listitem>
	  <listitem><para>Choose the “viewLabel” property in the “ Choose the property containing existing entities ids”.</para></listitem>
	  <listitem><para>Check the Create missing entities to force the creation of missing entities.</para></listitem>
	  <listitem><para>Click on the “Finish” button to launch the import.</para></listitem>
	  <listitem><para>You can now work with the imported graph.</para></listitem>
	</orderedlist>
	</sect2>
    </sect1>
</chapter>

<!--  LocalWords:  GML TLP treemap TreeMap viewSize viewColor InfoVis subgraph
	-->
<!--  LocalWords:  titleshort startingNodes viewSelection interactor arityOut
	-->
<!--  LocalWords:  GRCitesub dataset Strahler viewLabelColor Bederson Eick Rao
	-->
<!--  LocalWords:  Pirolli Keim Stasko Mackinlay Shneiderman metanode infovis
	-->