1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320
|
\section{Introduction}
\label{sec:intro}
The \gviz\ package consists of a variety of software for drawing
attributed graphs. It implements a handful of common graph layout
algorithms. These are:
\begin{description}
\item[dot] A Sugiyama-style hierarchical layout \cite{stt,gknv:methods}.
\item[neato] A ``symmetric'' layout algorithm based on stress reduction.
This is a variation of multidimensional scaling \cite{kruskal,cohen}. The
default implementation uses stress majorization \cite{gkn}.
An alternate implementation uses the Kamada-Kawai algorithm \cite{kk}
\item[fdp] An implementation of the Fruchterman-Reingold force-directed
algorithm \cite{fr}
for ``symmetric'' layouts. This layout is similar to neato, but there
are performance and feature differences.
\item[sfdp] A multiscale force-directed layout using a spring-electrical
model \cite{sfdp}.
\item[twopi] A radial layout as described by Wills \cite{nicheworks}.
\item[circo] A circular layout combining aspects of the
work of Six and Tollis \cite{st,st2} and Kaufmann and Wiese \cite{kw}.
\item[patchwork] An implementation of squarified treemaps \cite{sqtreemap}.
\item[osage] A layout algorithm for clustered graphs based on user specifications.
\end{description}
In addition, \gviz\ provides an assortment of more general-purpose
graph algorithms, such as transitive reduction, which have proven useful in
the context of graph drawing.
The package was designed \cite{gviz}
to rely on the
``program-as-filter'' model of software, in which distinct graph
operations or transformations are embodied as programs. Graph drawing
and manipulation are achieved by using the output of one filter as
the input of another, with each filter recognizing a common, text-based
graph format.
One thus has an algebra of graphs, using a scripting language to provide
the base language with variables and function application and composition.
Despite the simplicity and utility of this approach, some
applications need or desire to use the software as a library with
bindings in a non-scripting language, rather than as primitives composed
using a scripting language. The \gviz\ software provides a variety of
ways to achieve
this, running a spectrum from very simple but somewhat inflexible to
fairly complex but offering a good deal of application control.
\subsection{String-based layouts}
The simplest mechanism for doing this consists of using the filter
approach in disguise. The application, perhaps using the \gviz\ \graph\
library, constructs a representation of a graph in the
\DOT\ language. The application can then invoke the desired layout
program, e.g., using {\tt system} or {\tt popen} on a Unix system,
passing the graph using an intermediate file or a pipe. The layout
program computes position information for the graph, attaches this
as attributes, and delivers the graph back to the application through
another file or pipe. The application can then read in the graph,
and apply the geometric information as necessary. This is the
approach used by many applications, e.g., dotty \cite{dotty} and
grappa \cite{grappa}, which
rely on \gviz.
There are several \gviz\ output formats which can be used in this
approach. As with all output formats, they are specified by
using a {\tt -T} flag when invoking the layout program.
The input to the programs must always be in the \DOT\ language.
\subsubsection{dot}
\label{sect:dot}
This format relies on the \DOT\ language to describe the graphs, with attributes
attached as name-value pairs.
The \graph\ library provides a parser for graphs represented
in \DOT. Using this, it is easy to read the graphs and query the
desired attributes using {\tt agget} or {\tt agxget}.
For more information on these functions, see Section~\ref{sec:attributes}.
The string representations of the various types referred to are
described in Appendix~\ref{sec:types}.
On output, the graph will have a {\tt bb} attribute of type {\tt rectangle},
specifying the bounding box of the drawing.
If the graph has a label, its position is specified by the {\tt lp}
attribute of type {\tt point}.
Each node gets {\tt pos}, {\tt width} and {\tt height} attributes.
The first has type {\tt point}, and indicates the center of the node
in points. The
{\tt width} and {\tt height} attributes are floating point numbers
giving the width and height, in inches, of the node's bounding box.
If the node has a record shape, the record rectangles are given in the
{\tt rects} attribute. This has the format of a space-separated list
of rectangles.
If the node is a polygon (including ellipses) and the {\tt vertices}
attribute is defined for nodes, this attribute will contain the
vertices of the node, in inches, as a space-separated list of {\tt pointf}
values.
For ellipses, the curve is sampled, the number of points used being
controlled by the {\tt samplepoints} attribute.
The points are given relative to the center of the node.
Note also that the points only give the node's basic shape; they do
not reflect any internal structure. If the node has
{\tt peripheries} greater than one,
or a shape like {\tt "Msquare"}, the {\tt vertices}
attribute does not represent the extra curves or lines.
Every edge is assigned a {\tt pos} attribute having {\tt splineType}
type. If the edge has a label, the label position is given in the
{\tt lp} of type {\tt point}.
\subsubsection{xdot}
\label{sect:xdot}
The {\tt xdot} format is a strict extension of the {\tt dot} format, in that it
provides the same attributes as {\tt dot} as well as additional drawing
attributes. These additional attributes specify how to draw each component of
the graph using primitive graphics operations. This can be particularly
helpful in dealing with node shapes and edge arrowheads.
Unlike the information provided by the {\tt vertices} attribute
described above,
the extra attributes in {\tt xdot} provide all geometric drawing information,
including the various types of arrowheads and multiline labels with
variations in alignment. In addition, all the parameters use the same
units.
There are six new attributes, listed in Table~\ref{table:xdot}.
These drawing attributes are only attached to nodes and edges.
\begin{table}[htb]
\centering
\begin{tabular}[t]{|ll|} \hline
{\tt \_draw\_} & General drawing operations \\
{\tt \_ldraw\_} & Label drawing operations \\
{\tt \_hdraw\_} & Head arrowhead \\
{\tt \_tdraw\_} & Tail arrowhead \\
{\tt \_hldraw\_} & Head label \\
{\tt \_tldraw\_} & Tail label \\
\hline
\end{tabular}
\caption{{\tt xdot} drawing attributes}
\label{table:xdot}
\end{table}
Clearly, the last four attributes are only attached to edges.
The value of these attributes are strings consisting of the concatenation of
some (multi-)set of the 7 drawing operations listed in Table~\ref{table:xops}.
The color, font name, and style values supplied in the $C$, $c$, $F$,
and $S$ operations have the same format and interpretation
as the {\tt color}, {\tt fontname}, and
{\tt style} attributes in the source graph.
\begin{table}[htb]
\centering
\begin{tabular}[t]{|lp{4.5in}|} \hline
${\tt E}\ x_0\ y_0\ w\ h$ & Filled ellipse with equation
$((x - x_0)/w)^2 + ((y - y_0)/h)^2 = 1$ \\
${\tt e}\ x_0\ y_0\ w\ h$ & Unfilled ellipse with equation
$((x - x_0)/w)^2 + ((y - y_0)/h)^2 = 1$ \\
${\tt P}\ n\ x_1\ y_1\ \ldots\ x_n\ y_n$ & Filled polygon with the
given $n$ vertices \\
${\tt p}\ n\ x_1\ y_1\ \ldots\ x_n\ y_n$ & Unfilled polygon with the
given $n$ vertices \\
${\tt L}\ n\ x_1\ y_1\ \ldots\ x_n\ y_n$ & Polyline with the
given $n$ vertices \\
${\tt B}\ n\ x_1\ y_1\ \ldots\ x_n\ y_n$ & B-spline with the
given $n$ control points. $n \equiv 1 mod 3$ and $n \geq 4$ \\
${\tt b}\ n\ x_1\ y_1\ \ldots\ x_n\ y_n$ & Filled B-spline with the
given $n$ control points. $n \equiv 1 mod 3$ and $n \geq 4$ \\
${\tt T}\ x\ y\ j\ w\ n\ {\tt -}c_{1}c_{2}\cdots c_{n}$ &
Text drawn using the baseline point $(x,y)$. The text consists of the
$n$ bytes following {\tt '-'}. The text should be left-aligned (centered,
right-aligned) on the point if $j$ is -1 (0, 1), respectively. The
value $w$ gives the width of the text as computed by the library. \\
${\tt t}\ f$ &
Set font characteristics. The integer f is the OR of BOLD=1, ITALIC=2, UNDERLINE=4, SUPERSCRIPT=8, SUBSCRIPT=16, and STRIKE-THROUGH=32. \\
${\tt C}\ n\ {\tt -}c_{1}c_{2}\cdots c_{n}$ & Set color used to fill closed
regions. The
color is specified by the $n$ characters following {\tt '-'}. \\
${\tt c}\ n\ {\tt -}c_{1}c_{2}\cdots c_{n}$ & Set pen color, the color used
for text and line drawing. The
color is specified by the $n$ characters following {\tt '-'}. \\
${\tt F}\ s\ n\ {\tt -}c_{1}c_{2}\cdots c_{n}$ & Set font. The font
size is $s$ points. The font name
is specified by the $n$ characters following {\tt '-'}. \\
${\tt S}\ n\ {\tt -}c_{1}c_{2}\cdots c_{n}$ & Set style attribute. The
style value is specified by the $n$ characters following {\tt '-'}. \\
${\tt I}\ x\ y\ j\ w\ n\ {\tt -}c_{1}c_{2}\cdots c_{n}$ &
Externally-specified image drawn in the box with lower left corner $(x,y)$
and upper right corner $(x+w,y+h)$. The name of the image consists of the $n$ bytes following '-'. This is usually a bitmap image. Note that the image size, even when converted from pixels to points, might be different from the required size $(w,h)$. It is assumed the renderer will perform the necessary scaling.\\
\hline
\end{tabular}
\caption{{\tt xdot} drawing operations}
\label{table:xops}
\end{table}
In handling alignment, the application may want to recompute the string
width using its own font drawing primitives.
The text operation is only used in the {\tt label} attributes.
Normally, the non-text graphics
operations are only used in the non-label attributes.
If, however, a node has {\tt shape="record"} or an HTML-like label is
involved, a label attribute may also contain various graphics operations.
In addition, if the {\tt decorate} attribute is set on an edge, its
{\tt label} attribute will also contain a polyline operation.
All coordinates and sizes are in points.
If an edge or
node is invisible, no drawing operations are attached to it.
\subsubsection{plain}
The {\tt plain} format is line-based and very simple to parse. This works
well for applications which need or wish to avoid using the \graph\
library. The price for this simplicity is that the format
encodes very little detailed layout information beyond basic position
information. If an application needs more than what is supplied in the
format, it should use the {\tt dot} or {\tt xdot} format.
There are four types of lines: {\tt graph}, {\tt node},
{\tt edge} and {\tt stop}. The output
consists of a single {\tt graph} line; a sequence of {\tt node} lines, one
for each node; a sequence of {\tt edge} lines, one for each edge; and
a single terminating {\tt stop} line. All units are in inches,
represented by a floating point number.
As noted, the statements have very simple formats.
\begin{tabbing}
\indent \indent \= {\em {\tt graph} scale width height} \\
\> {\em {\tt node} name x y width height label style shape color fillcolor} \\
\> {\em {\tt edge} tail head n $x_1$ $y_1$ ... $x_n$ $y_n$ [label xl yl] style color} \\
\> {\tt stop}
\end{tabbing}
We now describe the statements in more detail.
\begin{description}
\item[graph]
The {\em width} and {\em height} values give the
width and height of the drawing. The
lower left corner of the drawing is at the origin.
The {\em scale} value indicates
how the drawing should be scaled if a {\tt size} attribute was given and the
drawing needs to be scaled to conform to that size. If no scaling is
necessary, it will be set to 1.0. Note that all graph, node and edge
coordinates and lengths are given unscaled.
\item[node]
The {\em name} value is the name of the node, and
{\em x} and {\em y} give the node's position.
The {\em width} and {\em height} are the width and height of the node.
The {\em label}, {\em style},
{\em shape}, {\em color} and {\em fillcolor} values
give the node's label, style, shape, color and
fillcolor, respectively, using default attribute values where necessary.
If the node does not have a {\tt style} attribute, {\tt "solid"} is used.
\item[edge]
The {\em tail} and {\em head} values give the names of the head and tail nodes.
{\em n} is the number of control points defining the B-spline forming the edge.
This is followed by $2*n$ numbers giving the x and y coordinates of the
control points in order from tail to head. If the edge has a {\tt label}
attribute,
this comes next, followed by the x and y coordinates of the label's position.
The edge description is completed by the edge's style and color. As with
nodes, if a style is not defined, {\tt "solid"} is used.
\end{description}
\subsubsection{plain-ext}
The {\tt plain-ext} format is identical with the {\tt plain} format,
except that port names are attached to the node names in an edge,
when applicable. It uses the usual \DOT\ representation, where port
{\em p} of node {\em n} is given as {\tt {\em n}:{\em p}}.
\subsubsection{GXL and GML}
The GXL \cite{gxl} dialect of XML and GML \cite{gml} are a widely used standards for
representing attributed graphs as text, especially in the graph
drawing and software engineering communities. There
are many tools available for parsing and analyzing graphs represented
in these formats. And, as GXL is based on XML, it is amenable to the panoply of
XML tools.
Various graph drawing and manipulation packages either
use GXL or GML as their main graph language, or provide a translator.
In this, \gviz\ is no different. We supply the programs
{\tt gv2gxl}, {\tt gxl2gv}, {\tt gv2gml} and {\tt gml2gv}
for converting between the DOT and
these formats. Thus, if an application is XML-based, to use the
\gviz\ tools, it needs to insert these filters as appropriate between
its I/O and the \gviz\ layout programs.
\subsection{\gviz\ as a library}
The role of this document is to describe how an application can use the
\gviz\ software as a library rather than as a set of programs. It will
describe the intended API at various levels, concentrating on the purpose
of the functions from an application standpoint, and the way the
library functions should be used together, e.g., that one has to call
function A before function B. The intention is not to provide
detailed manual pages, partly because most of the functions have a high-level
interface, often just taking a graph pointer as the sole argument.
The real semantic details are embedded in the
attributes of the graph, which are described elsewhere.
The remainder of this manual describes how to build an application
using \gviz\ as a library in the usual sense.
The next section presents the basic technique for using the \gviz\ code. Since
the other approaches are merely ramifications and extensions of the
basic approach, the section also serves as an overview for all uses.
Section~\ref{sec:layouts} breaks each layout algorithm apart into
its individual steps.
With this information, the application has the option of eliminating
certain of the steps. For example, all of the layout algorithms can
layout edges as splines. If the application intends to draw all edges
as line segments, it would probably wish to avoid the spline computation,
especially as it is moderately expensive in terms of time.
Section~\ref{sec:layout_info} explains how an application can invoke the
\gviz\ renderers, thereby generating a drawing of a graph in
a concrete graphics format such as {\em png} or {\em PostScript}.
For an application intending to do its own rendering,
Section~\ref{sec:renderers} recommends a technique which allows the
\gviz\ library to handle all of the bookkeeping details related to
data structures and machine-dependent representations while the
application need only supply a few basic graphics functions.
Section~\ref{sec:unconnect}
discusses an auxiliary library for dealing with graphs containing
multiple connected components.
{\bf N.B. Using \gviz\ as a library is not thread-safe.}
|