\section{Basic graph drawing}
Figure~\ref{fig:basic} gives a template for the basic library use of \gviz,
in this instance using the \dot\ hierarchical layout.
(Appendix~\ref{sec:simple} provides the listing of the complete program.)
Basically, the program creates a graph using the \graph\
library, setting node and edge 
attributes to affect how the graph is
to be drawn; calls the layout code; and then uses the position information
attached to the nodes and edges to render the graph. The remainder of
this section explores these steps in more detail.

\begin{figure}[hbt]
\begin{verbatim}
    Agraph_t* G;
    GVC_t* gvc;

    gvc = gvContext();        /* library function */
    G = createGraph ();
    gvLayout (gvc, G, "dot"); /* library function */
    drawGraph (G);
    gvFreeLayout(gvc, G);     /* library function */
    agclose (G);              /* library function */
    gvFreeContext(gvc);
\end{verbatim}
\caption{Basic use}
\label{fig:basic}
\end{figure}

Here, we just note the {\tt gvc} parameter. This is a handle
to a {\em Graphviz context}, which contains drawing and rendering
information independent of the properties pertaining to a particular
graph as well as various state information. 
For the present, we view this as an abstract parameter required
for various \gviz\ functions. We will discuss it further in  
Section~\ref{sec:gvc}.

\subsection{Creating the graph}
The first step in drawing a graph is to create it. To use the \gviz\
layout software, the graph must be created using the \graph\ library. 

We can create a graph in one of two main ways, using {\tt agread} or {\tt agopen}.
The former function takes a {\tt FILE*} pointer to a file open for reading.
\begin{verbatim}
    FILE* fp;
    Agraph_t* G = agread(fp, 0);
\end{verbatim}
It is assumed the file contains the description of graphs using the
\DOT\ language. The {\tt agread} function parses one graph at a time, 
returning a pointer to an attributed graph generated from the input,
or {\tt NULL} if there are no more graphs or an error occurred.

The \gviz\ library provides several specialized variations of {\tt agread}. If
the \DOT\ representation of the graph is stored in memory at {\tt char* cp}, then
\begin{verbatim}
    Agraph_t* G = agmemread(cp);
\end{verbatim}
can be used to parse the representation. By default, the {\tt agread} function relies on the
standard {\tt FILE} structure and {\tt fgets} function of the stdio library. You
can supply your own data source {\tt dp} coupled with your own discipline {\tt disc} 
for reading the data to read a graph using
\begin{verbatim}
    Agraph_t* G = agread(dp, &disc);
\end{verbatim}
Further details on using {\tt agread} and disciplines can be found in the \graph\ library manual.
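
As a minimal sketch of the reading functions above, the following code
counts the graphs in an open file by calling {\tt agread} until it returns
{\tt NULL}, and parses a graph held in a string with {\tt agmemread}.
The helper names {\tt count\_graphs} and {\tt parse\_sample}, the sample
graph text and the {\tt graphviz/} include prefix are assumptions made for
illustration; the include path in particular depends on how the headers
are installed.
\begin{verbatim}
#include <stdio.h>
#include <graphviz/cgraph.h>

/* Hypothetical helper: read every graph from fp, one agread call each. */
int count_graphs(FILE *fp)
{
    Agraph_t *g;
    int count = 0;

    /* agread returns NULL at end of input or on an error */
    while ((g = agread(fp, NULL)) != NULL) {
        count++;
        agclose(g);          /* release each graph when done with it */
    }
    return count;
}

/* Hypothetical helper: parse a graph whose DOT text is in memory. */
Agraph_t *parse_sample(void)
{
    return agmemread("digraph G { a -> b; b -> c; }");
}
\end{verbatim}
Since {\tt agread} does not distinguish the end of the input from an error,
an application that needs to tell them apart would have to check the state
of the file itself.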

The alternative technique is to call {\tt agopen}. 
\begin{verbatim}
    Agraph_t* G = agopen(name, type, &disc);
\end{verbatim}
The first argument is a {\tt char*} giving the name of the graph; 
the second argument is an {\tt Agdesc\_t} value describing the type of graph
to be created. A graph can be directed or undirected. In
addition, a graph can be strict, i.e., have at most one edge between
any pair of nodes, or non-strict, allowing an arbitrary number of edges
between two nodes. If the graph is directed, the pair of nodes is ordered,
so the graph can have edges from node {\tt A} to node {\tt B} as well
as edges from {\tt B} to {\tt A}. These four combinations are specified
by the values in Table~\ref{table:types}.
The return value is a new graph, with no nodes or edges.
\begin{table*}[h]
\centering
\begin{tabular}{|l|l|} \hline
Graph Type & Description \\ \hline
Agundirected & Non-strict, undirected graph \\
Agstrictundirected & Strict, undirected graph \\
Agdirected & Non-strict, directed graph \\
Agstrictdirected & Strict, directed graph \\ \hline
\end{tabular}
\caption{Graph types}
\label{table:types}
\end{table*}
So, to open a graph named {\tt "network"} that is directed but not strict, one would use
\begin{verbatim}
    Agraph_t* G = agopen("network", Agdirected, 0);
\end{verbatim}
The third argument is a pointer to a discipline of functions used for reading, memory, etc.
If the value of 0 or NULL is used, the library uses a default discipline.

Nodes and edges are created by the functions {\tt agnode} and
{\tt agedge}, respectively.
\begin{verbatim}
    Agnode_t *agnode(Agraph_t*, char*, int);
    Agedge_t *agedge(Agraph_t*,  Agnode_t*,  Agnode_t*, char*, int);
\end{verbatim}
The first argument is the graph containing the node or edge. Note
that if this is a subgraph, the node or edge will also belong to
all containing graphs. The second argument to {\tt agnode} is the
node's name. This is a key for the node within the graph. If
{\tt agnode} is called twice with the same name, the second invocation
will not create a new node but simply return a pointer to the previously
created node with the given name. The third argument specifies whether or not
a node of the given name should be created if it does not already exist.

Edges are created using {\tt agedge} by passing in the edge's two nodes.
If the graph is not strict, additional calls to {\tt agedge} with
the same arguments will create additional edges between the two nodes.
The string argument allows you to supply a further name to distinguish between
edges with the same head and tail.
If the graph is strict, extra calls will simply return the already existing
edge.
For directed graphs, the first and
second node arguments are taken to be the tail and head
nodes, respectively. For undirected graphs, they still play this role
for the functions {\tt agfstout} and {\tt agfstin}, but when checking
if an edge exists with {\tt agedge} or {\tt agfindedge}, the order is
irrelevant. As with {\tt agnode}, the final argument specifies whether or not
the edge should be created if it does not yet exist.
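
The behavior described above can be checked with a short sketch. The
hypothetical function {\tt edges\_after\_duplicates} opens a graph of the
given type, asks for the node {\tt A} twice and for an unnamed edge from
{\tt A} to {\tt B} twice, and reports how many edges the graph ends up with.
It uses {\tt agnedges}, a \graph\ library function not otherwise discussed
in this section, and assumes the headers are installed under
{\tt graphviz/}.
\begin{verbatim}
#include <stddef.h>
#include <graphviz/cgraph.h>

/* Hypothetical helper: request the same node and the same unnamed edge
 * twice, then report how many edges the graph actually holds. */
int edges_after_duplicates(Agdesc_t kind)
{
    Agraph_t *g = agopen("network", kind, NULL);
    Agnode_t *a = agnode(g, "A", 1);      /* created */
    Agnode_t *b = agnode(g, "B", 1);      /* created */
    Agnode_t *a2 = agnode(g, "A", 1);     /* existing node is returned */
    int edges;

    agedge(g, a, b, NULL, 1);
    agedge(g, a2, b, NULL, 1);            /* new edge only if non-strict */
    edges = agnedges(g);
    agclose(g);
    return edges;
}
\end{verbatim}
With the graph types of Table~\ref{table:types}, one would expect a result
of 2 for the non-strict types and 1 for the strict types.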

As suggested above, a graph can also contain subgraphs. These are
created using {\tt agsubg}:
\begin{verbatim}
    Agraph_t *agsubg(Agraph_t*,  char*, int);
\end{verbatim}
The first argument is the immediate parent graph; the second argument
is the name of the subgraph; the final argument indicates if the subgraph should
be created. 

Subgraphs play three roles in \gviz.
First, a subgraph can be used to represent graph structure, indicating that
certain nodes and edges should be grouped together. This is the usual
role for subgraphs and typically specifies semantic information about
the graph components. In this generality, the drawing software makes
no use of subgraphs, but maintains the structure for use elsewhere
within an application. 

In the second role, a subgraph can provide a context for setting 
attributes. In \gviz, these are often attributes used by the layout
and rendering functions.
For example, the application could specify that {\tt blue}
is the default color for nodes. Then, every node within the subgraph will
have color blue. In the context of graph drawing, a more interesting
example is:
\begin{verbatim}
    subgraph {
      rank = same; A; B; C;
    }
\end{verbatim}
This (anonymous) subgraph specifies that the nodes {\tt A}, {\tt B} and {\tt C} 
should all be placed on the same rank if drawn using \dot.

The third role for subgraphs combines the previous two. If the name of
the subgraph begins with {\tt "cluster"}, \gviz\ identifies the subgraph
as a special {\em cluster} subgraph. The drawing software\footnote{if
supported} will do the layout of the graph so that the nodes belonging
to the cluster are drawn together, with the entire drawing of the cluster
contained within a bounding rectangle.
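
The second and third roles might be set up from C roughly as follows.
This is only a sketch: the function name {\tt build\_grouped}, the subgraph
names and the node names are invented for the example, and the call to
{\tt agsafeset} anticipates the discussion of attributes in
Section~\ref{sec:attributes}.
\begin{verbatim}
#include <stddef.h>
#include <graphviz/cgraph.h>

/* Hypothetical helper: one cluster and one "rank = same" subgraph. */
Agraph_t *build_grouped(void)
{
    Agraph_t *g = agopen("G", Agdirected, NULL);
    /* a name starting with "cluster" marks a cluster subgraph */
    Agraph_t *cluster = agsubg(g, "cluster_0", 1);
    Agraph_t *level = agsubg(g, "level", 1);   /* ordinary subgraph */

    /* nodes created in a subgraph also belong to the root graph */
    agnode(cluster, "A", 1);
    agnode(cluster, "B", 1);
    agnode(level, "C", 1);
    agnode(level, "D", 1);

    /* counterpart of "rank = same" inside the subgraph; the last
     * argument is the default used if "rank" is not yet declared */
    agsafeset(level, "rank", "same", "");
    return g;
}
\end{verbatim}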

We note here some important fields used in nodes, edges and graphs.
If {\tt np}, {\tt ep} and {\tt gp} are pointers to a node, edge and
graph, respectively, {\tt agnameof(np)} and {\tt agraphof(np)} give the 
name of the node and the root graph containing it, 
{\tt agtail(ep)} and {\tt aghead(ep)} give the tail and head nodes of the
edge, and {\tt agroot(gp)} gives the root graph containing the subgraph.
For the root graph, this field will point to itself. 
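
To illustrate these accessors, the following sketch walks over a graph and
prints each edge together with the name of the root graph. The function name
{\tt list\_edges} is hypothetical, and the traversal functions
{\tt agfstnode}, {\tt agnxtnode} and {\tt agnxtout} come from the \graph\
library rather than from the text above.
\begin{verbatim}
#include <stdio.h>
#include <graphviz/cgraph.h>

/* Hypothetical helper: print every edge as "graph: tail -> head" and
 * return the number of edges visited. */
int list_edges(Agraph_t *g, FILE *out)
{
    Agnode_t *n;
    Agedge_t *e;
    int count = 0;

    for (n = agfstnode(g); n != NULL; n = agnxtnode(g, n)) {
        /* visiting only out-edges reaches each edge exactly once */
        for (e = agfstout(g, n); e != NULL; e = agnxtout(g, e)) {
            fprintf(out, "%s: %s -> %s\n", agnameof(agroot(g)),
                    agnameof(agtail(e)), agnameof(aghead(e)));
            count++;
        }
    }
    return count;
}
\end{verbatim}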

\subsubsection{Attributes}
\label{sec:attributes}
In addition to the abstract graph structure provided by nodes, edges and
subgraphs, the \gviz\ libraries also support graph attributes. These
are simply string-valued name/value pairs. Attributes are used to specify
any additional information which cannot be encoded in the abstract graph.
In particular, the attributes are heavily used by the drawing software to
tailor the various geometric and visual aspects of the drawing.

Reading attributes is easily done. The function {\tt agget} takes a pointer
to a graph component (node, edge or graph) and an attribute name, and
returns the value of the attribute for the given component. Note that the
function may return either {\tt NULL} or a pointer to the empty string.
The first value indicates that the given attribute has not been defined for
any component in the graph of the given kind. Thus, if {\tt abc} is a
pointer to a node and {\tt agget(abc,"color")} returns {\tt NULL}, then
no node in the root graph has a color attribute. If the function returns
the empty string, this usually indicates that the attribute has been 
defined but the attribute value associated with the specified object is the
default for the application. So, if {\tt agget(abc,"color")} 
now returns {\tt ""}, the node is taken to have the default color. In
practical terms, these two cases are very similar. Using our example,
whether the attribute value is {\tt NULL} or {\tt ""}, the drawing code
will still need to pick a color for drawing and will probably use the
default in both cases.
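
A drawing routine that treats the two cases alike could wrap {\tt agget}
in a small helper such as the one sketched below. The name
{\tt attr\_or\_default} and its fallback parameter are assumptions made for
this example and are not part of the library.
\begin{verbatim}
#include <stddef.h>
#include <graphviz/cgraph.h>

/* Hypothetical helper: return the attribute value, or fallback when the
 * attribute is undeclared (NULL) or left at the empty string. */
const char *attr_or_default(void *obj, char *name, const char *fallback)
{
    char *value = agget(obj, name);

    if (value == NULL || value[0] == '\0')
        return fallback;
    return value;
}
\end{verbatim}
For instance, {\tt attr\_or\_default(abc, "color", "black")} yields
{\tt "black"} unless a non-empty color has been set for the node.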

Setting attributes is a bit more complex. Before attaching an attribute
to a graph component, the code must first set up the default case. This
is accomplished by a call to {\tt agattr}.
It takes a graph, an object type ({\tt AGRAPH, AGNODE, AGEDGE}), and
two strings as arguments, and returns a representation of the attribute.
The first string gives the name of the attribute; the second supplies
the default value.
The graph must be the root graph.

Once the attribute has been initialized, the attribute can be set for
a specific component by calling 
\begin{verbatim}
    agset (void*, char*, char*)
\end{verbatim}
with a pointer to the component,
the name of the attribute and the value to which it should be set.
For example, the call
\begin{verbatim}
    agset (np, "color", "blue");
\end{verbatim}
sets the color of node {\tt np} to {\tt "blue"}.
The attribute value must not be {\tt NULL}.

For simplicity, the \graph\ library provides the function
\begin{verbatim}
    agsafeset(void*, char*, char*, char*)
\end{verbatim}
the first three arguments
being the same as those of {\tt agset}. This function first checks that
the named attribute has been declared for the given graph component.
If it has not, it declares the attribute, using its last argument as
the required default value. It then sets the attribute value for the
specific component.
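
The three functions can be combined as in the following sketch. The function
name {\tt style\_nodes}, the node names and the chosen attribute values are
illustrative assumptions. The sketch also relies on {\tt agset} returning a
nonzero value when the attribute has not been declared.
\begin{verbatim}
#include <graphviz/cgraph.h>

/* Hypothetical helper: g must be a root graph. Returns the result of
 * calling agset on an attribute that was never declared. */
int style_nodes(Agraph_t *g)
{
    Agnode_t *a, *b;
    int undeclared;

    agattr(g, AGNODE, "color", "black");      /* declare, default "black" */
    a = agnode(g, "a", 1);
    b = agnode(g, "b", 1);

    agset(a, "color", "blue");                /* override for one node */
    undeclared = agset(a, "peripheries", "2"); /* fails: not declared */
    agsafeset(b, "shape", "box", "ellipse");  /* declare if needed, set */
    return undeclared;
}
\end{verbatim}
Afterwards, node {\tt b} still reports the default color {\tt "black"}, and
node {\tt a} reports the default shape {\tt "ellipse"} supplied through
{\tt agsafeset}.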

Note that some attributes are replicated in the graph, appearing once
as the usual string-valued attribute, and also in an internal machine
format such as an {\tt int}, {\tt double} or some more structured type.
An application should only set attributes using strings and {\tt agset}.
The implementation of the layout algorithm
may change the machine-level representation at any time.
Hence, the low-level
interface cannot be relied on by the application to supply the desired
input values. Also note that there
is not a one-to-one correspondence between string-valued
attributes and internal attributes. A given string attribute might be
encoded as part of some data structure, might be represented via 
multiple fields, or may have no internal representation at all. 

In order to expedite the reading and writing of attributes for large
graphs, \gviz\ provides a lower-level mechanism for manipulating attributes
which can avoid hashing a string.
Attributes have a representation of type \verb+Agsym_t+. This is basically the
value returned by the initialization function {\tt agattr}. (Passing {\tt NULL}
as the default value will cause {\tt agattr} to return the \verb+Agsym_t+ if
it exists, and {\tt NULL} otherwise.)
An attribute can also be obtained by a call to {\tt agattrsym}, which takes
a graph component and an attribute name. If the attribute has been defined,
the function returns a pointer to the corresponding \verb+Agsym_t+ value. 
This can be used to directly access the corresponding attribute value,
using the functions {\tt agxget} and {\tt agxset}. These are identical to 
{\tt agget} and {\tt agset}, respectively, except that instead of
taking the attribute name as the second argument, they use 
the \verb+Agsym_t+ value to access the attribute
value from an array.
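
As a sketch of this lower-level interface, the following function looks up
the \verb+Agsym_t+ for the node attribute {\tt color} once and then reuses
it for every node. The function name {\tt set\_all\_node\_colors} is
hypothetical, and the node traversal uses {\tt agfstnode} and
{\tt agnxtnode} from the \graph\ library.
\begin{verbatim}
#include <stddef.h>
#include <graphviz/cgraph.h>

/* Hypothetical helper: g must be a root graph. Sets "color" on every
 * node and returns the number of nodes visited. */
int set_all_node_colors(Agraph_t *g, const char *value)
{
    /* a NULL default only looks the attribute up */
    Agsym_t *sym = agattr(g, AGNODE, "color", NULL);
    Agnode_t *n;
    int count = 0;

    if (sym == NULL)
        sym = agattr(g, AGNODE, "color", "");  /* declare, empty default */
    for (n = agfstnode(g); n != NULL; n = agnxtnode(g, n)) {
        agxset(n, sym, value);   /* no lookup by name for each node */
        count++;
    }
    return count;
}
\end{verbatim}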

Due to the nature of the implementation of attributes in \gviz, an application 
should, if possible, attempt to define and initialize all
attributes before creating nodes and edges.

The drawing algorithms in \gviz\ use a large collection of attributes,
giving the application a great deal of control over the appearance of the
drawing. For more detailed and complete information on what the attributes mean, the
reader should consult the page \url{https://www.graphviz.org/doc/info/attrs.html}.

Here, we consider some of the more commonly used attributes.
We can divide the attributes into those that affect the placement
of nodes, edges and clusters in the layout and those, such as color, 
which do not. 
Table~\ref{tab:nattr_geom} gives the node attributes which have the potential
to change the layout. This is followed by Tables~\ref{tab:eattr_geom},
\ref{tab:gattr_geom} and \ref{tab:cattr_geom}, which do the same for 
edges, graphs, and clusters.
\begin{table}[htbp]\footnotesize
\centering
\begin{tabular}[t]{|l|l|p{2.5in}|} \hline
\multicolumn{1}{|c|}{Name} & \multicolumn{1}{c|}{Default} & \multicolumn{1}{c|}{Use} \\ \hline
{\tt distortion} & {\tt 0.0} & node distortion for {\tt shape=polygon} \\
{\tt fixedsize} & false & label text has no effect on node size \\
{\tt fontname} & {\tt Times-Roman} & font family \\
{\tt fontsize} & {\tt 14} & point size of label \\
{\tt group} &  & name of node's group \\
{\tt height} & {\tt .5} & height in inches \\
{\tt label} & node name & any string \\
{\tt margin} &  0.11,0.055 & space between node label and boundary \\
{\tt orientation} & {\tt 0.0} & node rotation angle \\
{\tt peripheries} & {\em shape dependent} & number of node boundaries \\
{\tt pin} & false & fix node at its {\tt pos} attribute \\
{\tt regular} & false & force polygon to be regular \\
{\tt root} & & indicates node should be used as root of a layout \\
{\tt shape} & {\tt ellipse} & node shape \\
{\tt shapefile} & & $\dagger$ external EPSF or SVG custom shape file\\
{\tt sides} & {\tt 4} & number of sides for {\tt shape=polygon} \\
{\tt skew} & {\tt 0.0} & skewing of node for {\tt shape=polygon} \\
{\tt width} & {\tt .75} & width in inches \\
{\tt z} & {\tt 0.0} & $\dagger$ z coordinate for VRML output \\
\hline
\end{tabular}
\caption{Geometric node attributes}
\label{tab:nattr_geom}
\end{table}
\begin{table}[htbp]\footnotesize
\centering
\begin{tabular}[t]{|l|l|p{2.5in}|} \hline
\multicolumn{1}{|c|}{Name} & \multicolumn{1}{c|}{Default} & \multicolumn{1}{c|}{Use} \\ \hline
{\tt constraint} & true & use edge to affect node ranking \\
{\tt fontname} & {\tt Times-Roman} & font family \\
{\tt fontsize} & {\tt 14} & point size of label \\
{\tt headclip} & true & clip head end to node boundary \\
{\tt headport} & center & position where edge attaches to head node \\
{\tt label} & & edge label \\
{\tt len} & 1.0/0.3 & preferred edge length \\
{\tt lhead} & & name of cluster to use as head of edge \\
{\tt ltail} & & name of cluster to use as tail of edge \\
{\tt minlen} & {\tt 1} & minimum rank distance between head and tail \\
{\tt samehead} & & tag for head node; edge heads with the same tag are merged
 onto the same port \\
{\tt sametail} & & tag for tail node; edge tails with the same tag are merged
 onto the same port \\
{\tt tailclip} & true & clip tail end to node boundary \\
{\tt tailport} & center & position where edge attaches to tail node \\
{\tt weight} & {\tt 1} & importance of edge \\
\hline
\end{tabular}
\caption{Geometric edge attributes}
\label{tab:eattr_geom}
\end{table}
\begin{table}[htbp]\footnotesize
\centering
\begin{tabular}[t]{|l|l|p{2.5in}|} \hline
\multicolumn{1}{|c|}{Name} & \multicolumn{1}{c|}{Default} & \multicolumn{1}{c|}{Use} \\ \hline
{\tt center} & false & $\dagger$ center drawing on {\tt page} \\ 
{\tt clusterrank} & {\tt local} & may be {\tt global} or {\tt none} \\
{\tt compound} & false & allow edges between clusters \\
{\tt concentrate} & false & enables edge concentrators  \\
{\tt defaultdist} & $1+(\sum_{e \in E} len)/|E|\sqrt{|V|}$  & separation between nodes in different components \\
{\tt dim} & $2$ & dimension of layout  \\
{\tt dpi} & $96/0$ & number of pixels per inch assumed for output \\
{\tt epsilon} & $.0001 |V|$ or $.0001$ & termination condition  \\
{\tt fontname} & {\tt Times-Roman} & font family \\
{\tt fontpath} &  & list of directories to search for fonts \\
{\tt fontsize} & $14$ & point size of label \\
{\tt label} & & $\dagger$ any string \\
{\tt margin} & & $\dagger$ space placed around drawing \\
{\tt maxiter} & {\em layout dependent} & bound on iterations in layout \\
{\tt mclimit} & $1.0$ & scale factor for mincross iterations \\
{\tt mindist} & $1.0$ & minimum distance between nodes \\
{\tt mode} & {\tt major} & variation of layout \\
{\tt model} & {\tt shortpath} & model used for distance matrix \\
{\tt nodesep} & $.25$ & separation between nodes, in inches \\
{\tt nslimit} & & if set to {\it f}, bounds network simplex iterations by {\it (f)(number of nodes)} when setting x-coordinates \\
{\tt ordering} &  & specify out or in edge ordering \\
{\tt orientation} & {\tt portrait} & $\dagger$ use landscape orientation if {\tt rotate} is not used and the value is {\tt landscape} \\
{\tt overlap} & true & specify if and how to remove node overlaps \\
{\tt pack} &  & do components separately, then pack \\
{\tt packmode} & {\tt node} & granularity of packing \\
{\tt page} & & $\dagger$ unit of pagination, {\it e.g.} {\tt "8.5,11"} \\
{\tt quantum} &  & if ${\tt quantum} > 0.0$, node label dimensions will be rounded to integral multiples of {\tt quantum} \\
{\tt rank} & & {\tt same}, {\tt min}, {\tt max}, {\tt source} or {\tt sink} \\
{\tt rankdir} & {\tt TB} & sense of layout, i.e., top to bottom, left to right, etc. \\
{\tt ranksep} & {\tt .75} & separation between ranks, in inches. \\
{\tt ratio} & & approximate aspect ratio desired, {\tt fill} or {\tt auto} \\
{\tt remincross} & & If true and there are multiple clusters, re-run crossing minimization \\
{\tt resolution} & & synonym for {\tt dpi}  \\
{\tt root} & & specifies node to be used as root of a layout \\ 
{\tt rotate} & & $\dagger$ If 90, set orientation to landscape \\
{\tt searchsize} & {\tt 30} & maximum edges with negative cut values to
check when looking for a minimum one during network simplex \\
{\tt sep} & $0.1$ & factor to increase nodes when removing overlap \\
{\tt size} & & maximum drawing size, in inches \\
{\tt splines} &  & render edges using splines \\
{\tt start} & {\tt random} & manner of initial node placement \\
{\tt voro\_margin} & $0.05$ & factor to increase bounding box when more space is necessary during Voronoi adjustment \\
{\tt viewport} & & $\dagger$ clipping window \\
\hline
\end{tabular}
\caption{Geometric graph attributes}
\label{tab:gattr_geom}
\end{table}
\begin{table}[htbp]\footnotesize
\centering
\begin{tabular}[t]{|l|l|p{2.5in}|} \hline
\multicolumn{1}{|c|}{Name} & \multicolumn{1}{c|}{Default} & \multicolumn{1}{c|}{Use} \\ \hline
{\tt fontname} & {\tt Times-Roman} & font family \\
{\tt fontsize} & {\tt 14} & point size of label \\
{\tt label} & & cluster label \\
{\tt peripheries} & $1$ & number of cluster boundaries \\
\hline
\end{tabular}
\caption{Geometric cluster attributes}
\label{tab:cattr_geom}
\end{table}
Note that in some cases, the effect is indirect. An example of this is the
{\tt nslimit} attribute, which potentially reduces the effort spent on
network simplex algorithms to position nodes, thereby changing the layout.
Some of these attributes affect the initial layout of the graph in universal
coordinates. Others only play a role if the application uses the \gviz\
renderers (cf. Section~\ref{sec:layout_info}), 
which map the drawing into device-specific coordinates related to a
concrete output format.
For example, \gviz\ only uses the {\tt center} attribute, which specifies
that the graph drawing should be centered within its page, when the library
generates a concrete representation.
The tables distinguish these device-specific attributes by
a $\dagger$ symbol at the start of the Use column.

Tables~\ref{tab:nattr_dec}, \ref{tab:eattr_dec}, \ref{tab:gattr_dec} and
\ref{tab:cattr_dec} list the node, edge, graph and cluster 
attributes, respectively,
that do not affect the placement of components.
Obviously, the values of these attributes are not reflected in the
position information of the graph after layout. If the application
handles the actual drawing of the graph, it must decide if it wishes
to use these attributes or not.
\begin{table}[htbp]\footnotesize
\centering
\begin{tabular}[t]{|l|l|p{2.5in}|} \hline
\multicolumn{1}{|c|}{Name} & \multicolumn{1}{c|}{Default} & \multicolumn{1}{c|}{Use} \\ \hline
{\tt color} & {\tt black} & node shape color \\
{\tt fillcolor} & {\tt lightgrey} & node fill color \\
{\tt fontcolor} & {\tt black} & text color \\
{\tt layer} & overlay range & {\tt all}, {\it id} or {\it id:id} \\
{\tt nojustify} & false & context for justifying multiple lines of text \\
{\tt style} & & style options, e.g. {\tt bold, dotted, filled} \\ 
\hline
\end{tabular}
\caption{Decorative node attributes}
\label{tab:nattr_dec}
\end{table}
\begin{table}[htbp]\footnotesize
\centering
\begin{tabular}[t]{|l|l|p{2.5in}|} \hline
\multicolumn{1}{|c|}{Name} & \multicolumn{1}{c|}{Default} & \multicolumn{1}{c|}{Use} \\ \hline
{\tt arrowhead} & normal & style of arrowhead at head end \\
{\tt arrowsize} & {\tt 1.0} & scaling factor for arrowheads \\
{\tt arrowtail} & normal & style of arrowhead at tail end \\
{\tt color} & {\tt black} & edge stroke color \\
{\tt decorate} & & if set, draws a line connecting labels with their edges \\
{\tt dir} & {\tt forward/none} & {\tt forward}, {\tt back}, {\tt both}, or {\tt none} \\ 
{\tt fontcolor} & {\tt black} & type face color \\
{\tt headlabel} & & label placed near head of edge \\
{\tt labelangle} & {\tt -25.0} & angle in degrees which head or tail label
is rotated off edge \\
{\tt labeldistance} & {\tt 1.0} & scaling factor for distance of head or tail label from node \\
{\tt labelfloat} & false & lessen constraints on edge label placement \\
{\tt labelfontcolor} & {\tt black} & type face color for head and tail labels\\
{\tt labelfontname} & {\tt Times-Roman} & font family for head and tail labels\\
{\tt labelfontsize} & {\tt 14} & point size for head and tail labels \\
{\tt layer} & overlay range & {\tt all}, {\it id} or {\it id:id} \\
{\tt nojustify} & false & context for justifying multiple lines of text \\
{\tt style} & & drawing attributes such as {\tt bold}, {\tt dotted}, or {\tt filled} \\ 
{\tt taillabel} & & label placed near tail of edge \\
\hline
\end{tabular}
\caption{Decorative edge attributes}
\label{tab:eattr_dec}
\end{table}
\begin{table}[htbp]\footnotesize
\centering
\begin{tabular}[t]{|l|l|p{2.5in}|} \hline
\multicolumn{1}{|c|}{Name} & \multicolumn{1}{c|}{Default} & \multicolumn{1}{c|}{Use} \\ \hline
{\tt bgcolor} &  & background color for drawing, plus initial fill color \\
{\tt charset} & {\tt UTF-8} & character encoding for text \\ 
{\tt fontcolor} & {\tt black} & type face color \\ 
{\tt labeljust} & centered & left, right or center alignment for graph labels \\
{\tt labelloc} & bottom & top or bottom location for graph labels \\
{\tt layers} & & names for output layers \\
{\tt layersep} & \verb+"\t :"+ & separator characters used in layer specification \\
{\tt nojustify} & false & context for justifying multiple lines of text \\
{\tt outputorder} & {\tt breadthfirst} & order in which to emit nodes and edges \\ 
{\tt pagedir} & {\tt BL} & traversal order of pages \\
{\tt samplepoints} & {\tt 8} & number of points used to represent ellipses
and circles on output \\
{\tt stylesheet} & & XML stylesheet \\
{\tt truecolor} & & determines truecolor or color map model for bitmap output \\
\hline
\end{tabular}
\caption{Decorative graph attributes}
\label{tab:gattr_dec}
\end{table}
\begin{table}[htbp]\footnotesize
\centering
\begin{tabular}[t]{|l|l|p{2.5in}|} \hline
\multicolumn{1}{|c|}{Name} & \multicolumn{1}{c|}{Default} & \multicolumn{1}{c|}{Use} \\ \hline
{\tt bgcolor} &  & background color for cluster \\
{\tt color} & {\tt black} & cluster boundary color \\
{\tt fillcolor} & {\tt black} & cluster fill color \\
{\tt fontcolor} & {\tt black} & text color \\
{\tt labeljust} & centered & left, right or center alignment for cluster labels \\
{\tt labelloc} & top & top or bottom location for cluster labels \\
{\tt nojustify} & false & context for justifying multiple lines of text \\
{\tt pencolor} & {\tt black} & cluster boundary color \\
{\tt style} & & style options, e.g. {\tt bold, dotted, filled} \\
\hline
\end{tabular}
\caption{Decorative cluster attributes}
\label{tab:cattr_dec}
\end{table}

Among these attributes, some are used more frequently than others.
A graph drawing typically needs to encode various application-dependent
properties in the representations of the nodes. This can be done 
with text, using the {\tt label}, {\tt fontname} and {\tt fontsize} 
attributes; with color, using the {\tt color}, {\tt fontcolor}, 
{\tt fillcolor} and {\tt bgcolor} attributes; or with
shapes, the most common attributes being {\tt shape}, {\tt height},
{\tt width}, {\tt style}, {\tt fixedsize}, {\tt peripheries} and {\tt regular}.

Edges often display additional semantic information with the
{\tt color} and {\tt style} attributes. If the edge is directed,
the {\tt arrowhead}, {\tt arrowsize}, {\tt arrowtail} and {\tt dir}
attributes can play a role. Using splines rather than line segments
for edges, as determined by the {\tt splines} attribute, is done
for aesthetics or clarity rather than to convey more information.

There are also a number of frequently used attributes which affect
the layout geometry of the nodes and edges. These include
{\tt compound}, {\tt len}, {\tt lhead}, {\tt ltail}, {\tt minlen},
{\tt nodesep}, {\tt pin}, {\tt pos}, {\tt rank}, {\tt rankdir}, {\tt ranksep}
and {\tt weight}. Within this category, we should also mention the
{\tt pack} and {\tt overlap} attributes, though they have a somewhat
different flavor.

The attributes described thus far are used as input to the layout algorithms.
There is a collection of attributes, displayed in Table~\ref{tab:write}, 
which, by convention, \gviz\ uses to specify the geometry of a layout. 
\begin{table}[htbp]\footnotesize
\centering
\begin{tabular}[t]{|l|p{3.5in}|} \hline
\multicolumn{1}{|c|}{Name} & \multicolumn{1}{c|}{Use} \\ \hline
{\tt bb} &  bounding box of drawing or cluster \\
{\tt lp} & position of graph, cluster or edge label \\
{\tt pos} & position of node or edge control points \\
{\tt rects} & rectangles used in records \\
{\tt vertices} & points defining node's boundary, if requested \\
\hline
\end{tabular}
\caption{Output position attributes}
\label{tab:write}
\end{table}
After an application has used \gviz\ to determine position information, 
if it wants to write out the graph in \DOT\ with this information, it should
use the same attributes.
 
In addition to the attributes described above which have visual effect,
there is a collection of attributes used to supply identification
information or web actions. Table~\ref{tab:web} lists these.
\begin{table}[htbp]\footnotesize
\centering
\begin{tabular}[t]{|l|p{3.5in}|} \hline
\multicolumn{1}{|c|}{Name} & \multicolumn{1}{c|}{Use} \\ \hline
{\tt URL} & hyperlink associated with node, edge, graph or cluster \\
{\tt comment} & comments inserted into output \\
{\tt headURL} & URL attached to head label \\
{\tt headhref} & synonym for {\tt headURL} \\
{\tt headtarget} & browser window associated with {\tt headURL} \\
{\tt headtooltip} & tooltip associated with {\tt headURL} \\
{\tt href} & synonym for {\tt URL} \\
{\tt tailURL} & URL attached to tail label \\
{\tt tailhref} & synonym for {\tt tailURL} \\
{\tt tailtarget} & browser window associated with {\tt tailURL} \\
{\tt tailtooltip} & tooltip associated with {\tt tailURL} \\
{\tt target} & browser window associated with {\tt URL} \\
{\tt tooltip} & tooltip associated with {\tt URL} \\
\hline
\end{tabular}
\caption{Miscellaneous attributes}
\label{tab:web}
\end{table}

\subsubsection{Attribute and HTML Strings}
\label{sec:attributes_strings}
When an attribute is assigned a value, the graph library replicates the
string. This means the application can use a temporary string as the
argument; it does not have to keep the string throughout the application.
Each node, edge, and graph maintains its own attribute 
values. Obviously, many of these are the same strings, so to save
memory, the graph library uses a reference counting mechanism to 
share strings. An application can employ this mechanism by using
the {\tt agstrdup()} function. If it does, it must also use the
{\tt agstrfree()} function if it wishes to release the string.

When using strings as labels, one can have some formatting control via
the various inline escape sequences such as \verb+"\n"+, \verb+"\l"+, 
\verb+"\N"+, etc., and attributes such as {\tt fontname} and {\tt fontcolor}.
To get a great deal more flexibility, one can use HTML-like labels. 
In the \DOT\ language, these strings are delimited by angle brackets
{\tt <...>} rather than double quotes in order to work seamlessly with
ordinary strings. Even at the library level, these strings are semantically
identical to ordinary strings except when used as labels. 
To create one of these, one uses {\tt agstrdup\_html()}
rather than {\tt agstrdup()}. The {\tt agstrfree()} is still used to
release the string. For example, one might use the following code to attach
an HTML string to a node:
\begin{verbatim}
Agnode_t* n;
char* l = agstrdup_html(agroot(n), "<B>some bold text</B>");
agset (n,"label",l);
agstrfree (l);
\end{verbatim}
In addition, the function  {\tt aghtmlstr()} 
can be used to query whether an attribute string is an HTML string.

\subsection{Laying out the graph}

Once the graph exists and the attributes are set, the application can
pass the graph to one of the \gviz\ layout functions by a call to
{\tt gvLayout}. As arguments, this function takes a pointer to a \gvc, a
pointer to the graph to be laid out, and the name of the desired layout
algorithm. The algorithm names are the same as those of the layout
programs listed in Section~\ref{sec:intro}. Thus, {\tt "dot"} is used
to invoke {\tt dot}, etc.\footnote{Usually, all of these algorithms
are available. It is possible, however, that an application can arrange
to have only a subset made available.}

The layout algorithm will do everything that the corresponding
program would do, given the graph and its attributes. This includes
assigning node positions,
representing edges as splines\footnote{
Line segments are represented as degenerate splines.
}, handling the special case of an unconnected
graph, plus dealing with various technical features such as preventing
node overlaps.

There are two special layout engines available in the library: 
{\tt "nop"} and {\tt "nop2"}. These correspond to running the
\neato\ command with the flags {\tt -n} and {\tt -n2}, respectively.
That is, they assume the input graph already has position information stored
for nodes and, in the latter case, some edges. They can be used to route
edges in the graph or perform other adjustments. Note that they expect
the position information to be stored as {\tt pos} attributes in the
nodes and edges. The application can do this itself, or use the
{\tt dot} renderer.

For example, if one wants to position the nodes of a graph using
a \dot\ layout, but wants edges drawn as line segments, one could
use the code shown in Figure~\ref{fig:nop}.
\begin{figure}[hbt]
\begin{verbatim}
    Agraph_t* G;
    GVC_t* gvc;

    /* 
     * Create gvc and graph 
     */

    gvLayout (gvc, G, "dot");
    gvRender (gvc, G, "dot", NULL);
    gvFreeLayout(gvc, G);
    gvLayout (gvc, G, "nop");
    gvRender (gvc, G, "png", stdout);
    gvFreeLayout(gvc, G);
    agclose (G);
\end{verbatim}
\caption{Using {\tt nop} to reroute edges after a {\tt dot} layout}
\label{fig:nop}
\end{figure}
The first call to {\tt gvLayout} lays out the graph using dot;
the first call to {\tt gvRender} attaches the computed position
information to the nodes and edges. 
The second call to {\tt gvLayout} adds straight-line edges to the
already positioned nodes; the second call to {\tt gvRender} outputs
the graph in {\tt png} format on {\tt stdout}.

\subsection{Rendering the graph}
\label{sec:layout_info}
Once the layout is done, the graph data structures contain
the position information for drawing the graph. 
The application needs to decide how to use this information.

To use the renderers supplied with the \gviz\ software, the application
can call one of the library functions 
\begin{verbatim}
    gvRender (GVC_t *gvc, Agraph_t* g, char *format, FILE *out);
    gvRenderFilename (GVC_t *gvc, Agraph_t* g, char *format, char *filename);
\end{verbatim}
The first and second arguments are a graphviz context handle and a pointer
to the graph to be rendered. The final argument gives, respectively,
a file stream open for writing or the name of a file to which the
graph should be written. The third argument names the renderer to
be used, such as {\tt "ps"}, {\tt "png"} or {\tt "dot"}.
The allowed strings are the same ones used with the {\tt -T} flag
when the layout program is invoked from a command shell.

After a graph has been laid out using {\tt gvLayout}, an application
can perform multiple calls to the rendering functions. A typical instance
might be  
\begin{verbatim}
    gvLayout (gvc, g, "dot");
    gvRenderFilename (gvc, g, "png", "out.png");
    gvRenderFilename (gvc, g, "cmap", "out.map");
\end{verbatim}
in which the graph is laid out using the \dot\ algorithm, followed by
PNG bitmap output and a corresponding map file which can be used in a web
browser.

As with reading, \gviz\ provides some specialized functions for rendering.
Of note is
\begin{verbatim}
    gvRenderData (GVC_t *gvc, Agraph_t* g, char *format, char **result, 
      unsigned int *length)
\end{verbatim}
which writes the output of the rendering into an allocated character buffer. A pointer
to this buffer is returned in {\tt *result} and the number of bytes written is stored in
{\tt *length}. After using the buffer, the memory should be freed by the application.
As the calling program may rely on a different run-time system than that used by
\gviz, the library provides the function 
\begin{verbatim}
    gvFreeRenderData (char *data); 
\end{verbatim}
which can be used to free the memory pointed to by {\tt *result}.

Sometimes, an application will decide to do its own rendering.
An application-supplied
drawing routine, such as {\tt drawGraph} in Figure~\ref{fig:basic},
can then read this information,
map it to display coordinates, and call routines to render the drawing.

One simple way to do this is to use the position and drawing information as
supplied by the {\tt dot} or {\tt xdot} format (see
Sections~\ref{sect:dot} and \ref{sect:xdot}). To get this, the application
can call the appropriate renderer, passing a NULL stream
pointer to {\tt gvRender}\footnote{
This convention only works, and only makes sense, with the {\tt dot} 
and {\tt xdot} renderers. For other renderers, a NULL stream will cause
output to be written on {\tt stdout}.} as in Figure~\ref{fig:nop}.
This will attach the information as string attributes. The application
can then use {\tt agget} to read the attributes.
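
As a rough illustration of this approach, the sketch below parses the value
of a node's {\tt pos} attribute, as it might be returned by
{\tt agget(n, "pos")}, into a pair of coordinates. The helper name
{\tt parse\_node\_pos} is hypothetical and not part of \gviz. It assumes the
string has the form {\tt x,y}, optionally followed by {\tt !}, and it does not
attempt to handle the more elaborate spline syntax used for edge positions.

```c
#include <stdio.h>

/* Hypothetical helper, not part of Graphviz: parse a node "pos" string
 * of the form "x,y" (an optional trailing '!' is ignored).
 * Returns 1 on success, 0 if the string is missing or malformed. */
int parse_node_pos(const char *pos, double *x, double *y)
{
    if (pos == NULL)
        return 0;
    /* sscanf returns the number of fields converted, or EOF on empty input */
    return sscanf(pos, "%lf,%lf", x, y) == 2;
}
```

An application would typically call such a helper once per node after the
NULL-stream call to {\tt gvRender}, and fall back to its own defaults when
the helper reports failure.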

On the other hand, an application may desire to read the primitive
data structures used by the algorithms to record the layout information.
In the remainder of this section,
we describe these data structures in reasonable detail.
An application can
use these values directly to guide its drawing. In some cases, for example,
with arrowheads attached to {\tt bezier} values or HTML-like labels, it
would be onerous for an application to fully interpret the data.
For this reason,
if an application wishes to provide all of the
graphics features while avoiding the low-level details of the data
structures, we suggest either using the {\tt xdot} approach, described above,
or supplying its own renderer plug-in as described in
Section~\ref{sec:renderers}.

The \gviz\ layout algorithms rely on a specific set of fields
to record position and drawing information.
Thus, the definitions
of the information fields are fixed by the layout library and
cannot be altered by the application.\footnote{This is a limitation
of the \graph\ library. We plan to remove this restriction by moving to
a mechanism which allows arbitrary dynamic extensions to the
node, edge and graph structures. Meanwhile, if the application requires
the addition of extra fields, it can define its own structures, which
should be extensions of the components of the information types, with
the additional fields attached at the end. Then, instead of calling
{\tt aginit()}, it can use the more general {\tt aginitlib()}, and
supply the sizes of its nodes, edges and graphs. This will ensure
that these components will have the correct sizes and alignments. 
The application can then cast the generic \graph\ types to the
types it defined, and access the additional fields.}

The fields should only be accessed using macro expressions provided for
this purpose.
Thus, if {\tt np} is a node pointer, the width field should
be read using \verb+ND_width(np)+.
Edge and graph attributes follow the same convention, with
prefixes \verb+ED_+ and \verb+GD_+, respectively.
A complete list of these macros is given in {\tt types.h}, 
along with various auxiliary types such as {\tt pointf} or 
{\tt bezier}.\footnote{We strongly deprecate accessing the fields directly, for the usual reasons
of good programming style. By using the macros, source code will not be
affected by any changes to how the value is provided.}

We now consider the principal fields providing position information.

Each node has {\tt ND\_coord}, {\tt ND\_width} and {\tt ND\_height} 
attributes. The value
of {\tt ND\_coord} gives the position of the center of the node, 
in points.\footnote{
The \neato\ and \fdp\ layouts allow the graph to specify fixed positions
for nodes. Unfortunately, some post-processing done in \gviz\ translates
the layout so that its lower-left corner is at the origin. To recover
the original coordinates, the application will need to translate all positions
by the vector $p_0 - p$, where $p_0$ and $p$ are the input position and
the final position of some node whose position was fixed.
} 
The {\tt ND\_width} and {\tt ND\_height} attributes specify the size of the
bounding box of the node, in inches.
Note that the {\tt width} and {\tt height} attributes provided in the input
graph are minimum values, so that the values stored in {\tt ND\_width} 
and {\tt ND\_height} may be larger.
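
Because the center is given in points while the size is given in inches, an
application has to convert units before combining them. The following sketch
computes a node's bounding box in points, using 72 points per inch. The types
{\tt point2d} and {\tt box2d} and the function {\tt node\_box\_in\_points} are
hypothetical stand-ins introduced for illustration; in a real application the
arguments would come from {\tt ND\_coord}, {\tt ND\_width} and
{\tt ND\_height}.

```c
#define POINTS_PER_INCH 72.0

/* Hypothetical stand-ins for the values an application would read with
 * ND_coord(n), ND_width(n) and ND_height(n). */
typedef struct { double x, y; } point2d;
typedef struct { point2d ll, ur; } box2d;   /* lower-left, upper-right */

/* Bounding box of a node in points, given its center in points and
 * its width and height in inches. */
box2d node_box_in_points(point2d center, double width_in, double height_in)
{
    double half_w = width_in * POINTS_PER_INCH / 2.0;
    double half_h = height_in * POINTS_PER_INCH / 2.0;
    box2d b;
    b.ll.x = center.x - half_w;
    b.ll.y = center.y - half_h;
    b.ur.x = center.x + half_w;
    b.ur.y = center.y + half_h;
    return b;
}
```

For instance, a node centered at $(100,50)$ with a width of 1 inch and a
height of 0.5 inches yields the box with corners $(64,32)$ and $(136,68)$.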

Edges, even those drawn as line segments, are represented as cubic B-splines or
piecewise Bezier curves.
The {\tt ED\_spl} attribute of the edge stores this spline information.
It has a pointer to an array of 1 or more {\tt bezier} structures. Each
of these describes a single piecewise Bezier curve as well as associated
arrowhead information. Normally, a single {\tt bezier} structure is
sufficient to represent an edge. In some cases, however, 
the edge may need multiple {\tt bezier} parts, as when the {\tt concentrate}
attribute is set, whereby mostly parallel edges are represented by a
shared spline.
Of course, the application always has the possibility of drawing a line
segment connecting the centers of the edge's nodes.

If a subgraph is specified as a cluster, the nodes of the cluster
will be drawn together and the entire subgraph is contained within
a rectangle containing no other nodes. The rectangle is specified by
the {\tt GD\_bb} attribute of the subgraph, with coordinates given in points in
the global coordinate system.

\subsubsection{Drawing nodes and edges}
\label{sec:nodes}

With the position and size information described above, an application 
can draw the nodes and edges of a graph. It could just use rectangles 
or circles for nodes, and represent edges as line segments or splines.
However, nodes and edges typically have a variety of other attributes,
such as color or line style, which an application can read from the
appropriate fields and use in its rendering.

Additional drawing information about the node depends mostly on the shape
of the node. 
For record-type nodes, where \verb+ND_shape(n)->name+ is {\tt "record"} or 
{\tt "Mrecord"}, the node consists of a packed collection
of rectangles. In this case, \verb+ND_shape_info(n)+ can be cast to
\verb+field_t*+, which describes the recursive partition of the node
into rectangles. The value {\tt b} of \verb+field_t+
gives the bounding rectangle of the field, in points in the coordinate
system of the node, i.e., where the center of the node is at the
origin. 

If \verb+ND_shape(n)->usershape+ is true, the shape
is specified by the user. Typically, this is format dependent, e.g.,
the node might be specified by a GIF image, and we ignore this case
for the present. 

The final node class consists of those with polygonal shape\footnote{This
is not quite true but close enough for now.}, which includes the limiting
cases of circles, ellipses, and none. In this case,
\verb+ND_shape_info(n)+ can be cast to \verb+polygon_t*+, which specifies
the many parameters (number of sides, skew and distortions, etc.)
used to describe polygons, as well as the points used as vertices. 
Note that the vertices are in inches, and are in the coordinate system
of the node, with the origin at the center of the node.

To handle a node's shape, an application has two basic choices. It
can implement the geometry for each of the different shapes.
Thus, it could see that \verb+ND_shape(n)->name+ is {\tt "box"}, and
use the {\tt ND\_coord}, {\tt ND\_width} and {\tt ND\_height} attributes to draw
a rectangle at the given position with the given width and height.
A second approach would be
to use the specification of the shape as stored internally in the
{\tt shape\_info} field of the node. For example, given a polygonal node,
its \verb+ND_shape_info(n)+ field contains a {\tt vertices} field,
mentioned above, which is an ordered list of all the vertices used
to draw the appropriate polygon, taking into account multiple peripheries. 
Again, if an application desires to be fully faithful in the rendering,
it may be preferable to use the {\tt xdot} information or to supply its
own renderer plugin.

For edges, each {\tt bezier} structure has a \verb+list+ field pointing to
an array containing the control points and a \verb+size+ field giving the 
number of points in \verb+list+, which will always have the form $3n+1$.
In addition, there are
fields for specifying arrowheads. If {\tt bp} points to
a {\tt bezier} structure and the {\tt bp->sflag} field is
true, there should be an arrowhead attached to the beginning of the bezier.
The field {\tt bp->sp} gives the point where the nominal tip of the arrowhead
would touch the tail node. (If there is no arrowhead, {\tt bp->list[0]} will
touch the node.) Thus, the length and direction of the arrowhead
are determined by the vector going from {\tt bp->list[0]} to {\tt bp->sp}.
The actual
shape and width of the arrowhead are determined by the {\tt arrowtail} and
{\tt arrowsize} attributes. Analogously, an arrowhead at the head node is
specified by {\tt bp->eflag} and the vector 
from {\tt bp->list[bp->size-1]} to {\tt bp->ep}.
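
To make the control-point layout concrete, the sketch below shows one way an
application might walk such a list: a list of $3n+1$ points describes $n$
cubic segments, with consecutive segments sharing an endpoint. The type
{\tt pt2} and the functions {\tt bezier\_segments} and {\tt bezier\_point} are
hypothetical and are not part of \gviz; they only mirror the shape of the
{\tt list} and {\tt size} fields, and arrowheads are ignored.

```c
#include <stddef.h>

/* Hypothetical stand-in for the control-point type (cf. pointf). */
typedef struct { double x, y; } pt2;

/* Number of cubic segments in a piecewise Bezier curve with `size`
 * control points, where size has the form 3n+1. Returns 0 otherwise. */
size_t bezier_segments(size_t size)
{
    if (size < 4 || (size - 1) % 3 != 0)
        return 0;
    return (size - 1) / 3;
}

/* Evaluate cubic segment `seg` (0-based) at parameter t in [0,1].
 * Segment k uses control points list[3k] .. list[3k+3]. */
pt2 bezier_point(const pt2 *list, size_t seg, double t)
{
    const pt2 *p = list + 3 * seg;
    double u = 1.0 - t;
    /* cubic Bernstein weights */
    double b0 = u * u * u;
    double b1 = 3.0 * u * u * t;
    double b2 = 3.0 * u * t * t;
    double b3 = t * t * t;
    pt2 r;
    r.x = b0 * p[0].x + b1 * p[1].x + b2 * p[2].x + b3 * p[3].x;
    r.y = b0 * p[0].y + b1 * p[1].y + b2 * p[2].y + b3 * p[3].y;
    return r;
}
```

An application could sample each segment at a handful of parameter values and
connect the results with line segments, or pass the four control points of
each segment directly to a graphics library that supports cubic curves.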

The label field (\verb+ND_label(n)+, \verb+ED_label(e)+,
\verb+GD_label(g)+) encodes any text label associated with a graph object.
Edges, graphs and clusters will occasionally have labels; nodes
almost always have a label, since the default label is the node's name.
The basic label string is stored in the {\tt text} field, while the
{\tt fontname}, {\tt fontcolor} and {\tt fontsize} fields describe
the basic font characteristics.
In many cases, the basic label string is further parsed, either into
multiple, justified text lines, or as a nested box structure for
HTML-like labels or nodes of record shape.
This information is available in other fields.

\subsection{Cleaning up a graph}
\label{sec:clean}

Once all layout information is obtained from the graph, the resources
should be reclaimed. To do this, the application should call
the cleanup routine associated with the layout algorithm used to
draw the graph. This is done by a call to {\tt gvFreeLayout}.

The example of Figure~\ref{fig:basic}
demonstrates the case where the application is drawing a single graph.
The example given in Appendix~\ref{sec:dot}
shows how cleanup might be done when processing multiple graphs.

The application can best determine when it should clean up. The example
in the appendix
performs this just before a new graph is drawn, but the application could
have done this much earlier, for example, immediately after the graph
is drawn using {\tt gvRender}. Note, though, that layout information
is destroyed during cleanup. If the application needs to reuse this
data, for example, to refresh the display, it should delay calling the
cleanup function, or arrange to copy the layout data elsewhere.
Also, in the simplest case where the application just draws one graph
and exits, there is no need to do cleanup at all, though this is sometimes
considered poor programming style.

A given graph can be laid out multiple times.
The application, however, must clean up the earlier layout's
information with a call to {\tt gvFreeLayout}
before invoking a new layout function.
An example of this was given in Figure~\ref{fig:nop}.

Note that if you render a graph into the {\tt dot} or {\tt xdot}
format, this attaches attributes onto the graph. Some of these attributes
are used during layout. For example, the {\tt neato} layout will use
the {\tt pos} attribute of the nodes for an initial layout, while the
{\tt twopi} layout may set the {\tt root} attribute, which will lock in
this attribute for any future layouts using {\tt twopi}.
To avoid having these attributes affecting another
layout of the graph, the user should set these attributes to the
empty string before calling {\tt gvLayout} again.

Once the application is totally done with a graph, it should call
{\tt agclose} to close the graph and reclaim the remaining resources associated
with it.