1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159
|
\documentclass[11pt,letterpaper]{article}
\usepackage{url}
\usepackage{graphicx}
\usepackage{newtxtext}
\usepackage{xcolor}
\usepackage[]{hyperref}
\usepackage[cmintegrals,cmbraces]{newtxmath} % times roman, the classic
\hypersetup{
colorlinks = true, %Colours links instead of ugly boxes
urlcolor = {blue!80!black}, %Colour for external hyperlinks
linkcolor = {blue!80!black}, %Colour of internal links
citecolor = red %Colour of citations
}
\usepackage{lgrind}
\usepackage{footnote}
%\usepackage[margin=0.5in]{geometry}
\addtolength{\oddsidemargin}{-.875in}
\addtolength{\evensidemargin}{-.875in}
\addtolength{\textwidth}{1.75in}
\renewcommand\floatpagefraction{.9}
\renewcommand\topfraction{.9}
\renewcommand\bottomfraction{.9}
\renewcommand\textfraction{.1}
%\setcounter{totalnumber}{50}
%\setcounter{topnumber}{50}
%\setcounter{bottomnumber}{50}
%\setlength{\textwidth}{8.0in}
%\oddsidemargin 0in
%\evensidemargin .5in
%\textwidth 7.5in
\author{Stephen C. North \\
{\small scnorth@gmail.com}
\and Emden R.~Gansner \\
{\small erg@graphviz.com}
}
\title{Cgraph Tutorial}
\date{\today}
\begin{document}
%\begin{titlepage}
\maketitle
\newpage
\tableofcontents
\newpage
%\thispagestyle{empty}
%\end{titlepage}
%\pagenumbering{arabic}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Introduction}
\label{sec:introduction}
Cgraph is a C library for graph programming.
It defines data types and operations for graphs comprised
of attributed nodes, edges and subgraphs.
Attributes may be string name-value pairs for convenient
file I/O, or internal C data structures for efficient algorithm
implementation.
Cgraph is aimed at graph representation; it is not an library of
higher-level algorithms such as shortest path or network flow.
We envision these as higher-level libraries written on top of
Cgraph. Efforts were made in Cgraph's design to strive
for time and space efficiency. The basic (unattributed) graph representation
takes 104 bytes per node and 64 bytes per edge, so storage of graphs with
millions of objects is reasonable. For attributed graphs, Cgraph also
maintains an internal shared string pool, so if all the nodes of a graph
have \verb'color=red', only one copy of \verb'"color"' and \verb'"red"'
are made. There are other tricks that experts can exploit for
flat-out coding efficiency. For example, there are ways to inline the
instructions for edge list traversal and internal data structure access.
Cgraph uses Phong Vo's dictionary library, libcdt, to store node
and edge sets. This library provides a uniform interface to hash
tables and splay trees, and its API is usable for general programming
(such as storing multisets, hash tables, lists and queues) in Cgraph
programs.
{\bf Notation} In the following, we use \verb"TRUE" to denote a non-zero
value, and \verb"FALSE" to denote zero.
\section{Graph Objects}
\label{sec:graphobjects}
Almost all Cgraph programming can be done with pointers to
these data types:
\begin{itemize}
\item \verb"Agraph_t": a graph or subgraph
\item \verb"Agnode_t": a node from a particular graph or subgraph
\item \verb"Agedge_t": an edge from a particular graph or subgraph
\item \verb"Agsym_t": a descriptor for a string-value pair attribute
\item \verb"Agrec_t": an internal C data record attribute of a graph object
\end{itemize}
Cgraph is responsible for its own memory management; allocation
and deallocation of Cgraph data structures is always done through
Cgraph calls.
\section{Graphs}
\label{sec:graphs}
A top-level graph (also called a root graph) defines a universe
of nodes, edges, subgraphs, data dictionaries and other information.
A graph has a name and two properties: whether it is directed or
undirected, and whether it is strict (multi-edges are
forbidden). \footnote{It is possible to specify that a graph is simple
(neither multi-edges nor loops), or can have multi-edges but not loops.}
Note that nodes, edges and subgraphs exists in exactly one root graph. They
cannot be used independently of that graph, or attached to another root graph.
The following examples use the convention that \verb"G" and
\verb"g" are \verb"Agraph_t*" (graph pointers), \verb"n", \verb"u",
\verb"v", \verb"w" are \verb"Agnode_t*" (node pointers), and
\verb"e", \verb"f" are \verb"Agedge_t*" (edge pointers).
To make a new, empty top-level directed graph:
\begin{verbatim}
Agraph_t *g;
g = agopen("G", Agdirected, NULL);
\end{verbatim}
The first argument to \verb"agopen" is any string, and is not
interpreted by Cgraph, except it is recorded and preserved when
the graph is written as a file.\footnote{An application
could, of course, maintain its own graph catalogue using graph names.}
The second argument indicates the type of graph, and should be one of
{\tt Agdirected}, {\tt Agstrictdirected}, {\tt Agundirected},
or {\tt Agstrictundirected}.
The third argument is an optional pointer to a collection of
methods for overriding certain default behaviors of Cgraph,
and in most situations can just be {\tt NULL}.
You can get the name of a graph by \verb"agnameof(g)", and
you can get its properties by the predicate functions
{\tt agisdirected(g)} and {\tt agisstrict(g)}.
You can also construct a new graph by reading a file:
\begin{verbatim}
g = agread(stdin,NULL);
\end{verbatim}
Here, the graph's name, type and contents including attributes depend
on the file contents. (The second argument is the same optional method
pointer mentioned above for \verb"agopen"). If the default I/O discipline
is used and the stream is wide-oriented, the behavior is undefined.
Sometimes it is convenient to have the graph represented concretely as a
character string \verb"str".
In this case, the graph can be created using:
\begin{verbatim}
g = agmemread (str);
\end{verbatim}
You can write a representation of a graph to a file:
\begin{verbatim}
g = agwrite(g,stdout);
\end{verbatim}
\verb"agwrite" creates an external representation of a graph's
contents and attributes (except for internal attributes),
that can later be reconstructed by calling \verb"agread"
on the same file.\footnote{It is the application programmer's job to
convert between internal attributes to external strings
when graphs are read and written, if desired.
This seemed better than inventing a complicated way
to automate this conversion.} If the default I/O discipline is used
and the stream is wide-oriented, the behavior is undefined.
\verb"agnnodes(g)", \verb"agnedges(g)" and \verb"agnsubg(g)" return the count of nodes,
edges and (immediate) subgraphs in a graph (or subgraph).
To delete a graph and its associated data structures,
freeing their memory, one uses:
\begin{verbatim}
agclose(g);
\end{verbatim}
Finally, there is an interesting if obscure function to concatenate
the contents of a graph file onto an existing graph, as shown here.
\begin{verbatim}
g = agconcat(g,stdin,NULL);
\end{verbatim}
\section{Nodes}
\label{sec:nodes}
In Cgraph, a node is usually identified by a unique string name
and a unique internal integer ID assigned by Cgraph.
(For convenience, you can also create "anonymous" nodes by
giving \verb"NULL" as the node name.) A node also has in- and out-edge sets,
even in undirected graphs.
Once you have a graph, you can create or search for nodes this way:
\begin{verbatim}
Agnode_t *n;
n = agnode(g,"node28",TRUE);
\end{verbatim}
The first argument is a graph or subgraph in which the node
is to be created. The second is the name (or NULL for anonymous nodes.)
When the third argument is \verb"TRUE", the node is created if it doesn't
already exist. When it's \verb"FALSE", as shown below, then Cgraph
searches to locate an existing node with the given name, returning NULL if
none is found.
\begin{verbatim}
n = agnode(g,"node28",FALSE);
\end{verbatim}
The function \verb"agdegree(g, n, in, out)" gives the degree
of a node in (sub)graph \verb"g", where \verb"in" and \verb"out" select the edge sets.
\begin{itemize}
\item \verb"agdegree(g,n,TRUE,FALSE)" returns the in-degree.
\item \verb"agdegree(g,n,FALSE,TRUE)" returns the out-degree.
\item \verb"agdegree(g,n,TRUE,TRUE)" returns their sum.
\end{itemize}
The function \verb"agcountuniqedges" is identical to {\tt agdegree}
except when the last two arguments are both \verb"TRUE". In this case, a loop is only
counted once.
\verb"agnameof(n)" returns the printable string name of a node.
Note that for various reasons this string may be a temporary
buffer that can be overwritten by subsequent calls.
Thus, the usage
\begin{verbatim}
printf("%s %s\n",agnameof(agtail(e)),agnameof(aghead(e)));
\end{verbatim}
is unsafe because the buffer may be overwritten when the
arguments to printf are being computed.
A node can be deleted from a graph or subgraph by \verb"agdelnode(g,n)".
\section{Edges}
\label{sec:edges}
An edge is a node pair: an ordered pair in a directed graph,
unordered in an undirected graph. For convenience there is
a common edge data structure for both kinds and the endpoints
are the fields \verb"tail" and \verb"head". \footnote{When an edge is created,
the first node will be used as the tail node, and the second node as the head.}
Because an edge is implemented as an edge pair, there are two valid pointers
to the same edge, so simple pointer comparison does not work for edge equality.
The function \verb"ageqedge(Agedge_t *e0, Agedge_t *e1)" evaluates to true if
the two pointers represent the same abstract edge, and should usually be used
for edge comparisons.
An edge is made using
\begin{verbatim}
Agnode_t *u, *v;
Agedge_t *e;
/* assume u and v are already defined */
e = agedge(g,u,v,"e28",TRUE);
\end{verbatim}
\verb"u" and \verb"v" must belong to the same graph or subgraph
for the operation to succeed. The ``name'' of an edge
(more correctly, identifier) is treated as a unique identifier for edges
between a particular node pair. That is, there can only be at most
one edge with name \verb"e28" between any given \verb"u" and \verb"v",
but there can be many other edges \verb"e28" between other nodes.
\verb"agtail(e)" and \verb"aghead(e)" return the endpoints of \verb"e".
If \verb"e" is created as in the call to \verb"agedge" above, \verb"u"
will be the tail node and \verb"v" will be the head node. This holds true
even for undirected graphs.
The value \verb"e->node" is the ``other'' endpoint
with respect to the node from which e was obtained.
That is, if \verb"e" is an out-edge of node \verb"n" (equivalently, \verb"n"
is the tail of \verb"e"), then \verb"e->node" is the head of \verb"e".
A common idiom is:
\begin{verbatim}
for (e = agfstout(g,n); e; e = agnxtout(g,e))
/* do something with e->node */
\end{verbatim}
\verb"agedge" can also search for edges:
\begin{verbatim}
/* finds any u,v edge */
e = agedge(g,u,v,NULL,FALSE);
/* finds a u,v edge with name "e8" */
e = agedge(g,u,v,"e8",FALSE);
\end{verbatim}
In an undirected graph, an edge search will consider the given vertices
as both tail and head nodes.
An edge can be deleted from a graph or subgraph by \verb"agdeledge(g,e)".
The \verb"agnameof" function can be used to get the ``name'' of an edge.
Note that this will be the identifier string provided at edge creation.
The names
of the head and tail nodes will not be part of the string. In addition, it
returns NULL for anonymous edges.
\section{Traversals}
\label{sec:traversals}
Cgraph has functions for iterating over graph objects.
For example, we can scan all the edges of a graph (directed
or undirected) by the following:
\begin{verbatim}
for (n = agfstnode(g); n; n = agnxtnode(g,n)) {
for (e = agfstout(g,n); e; e = agnxtout(g,e)) {
/* do something with e */
}
}
\end{verbatim}
\label{traversal}
The functions \verb"agfstin(g,n)" and \verb"afnxtin(g,e)" are
provided for walking in-edge lists.
In the case of a directed edge, the meanings of ``in'' and ``out'' are
obvious. For undirected graphs, Cgraph assigns an
orientation based on the analogous order of the two nodes when the edge
is created.
To visit all the edges of a node in an undirected graph:
\begin{verbatim}
for (e = agfstedge(g,n); e; e = agnxtedge(g,e,n))
/* do something with e */
\end{verbatim}
Be careful if your code deletes an edge or node during the traversal, as
then the object will no longer be valid to get the next object.
This is typically handled by code like:
\begin{verbatim}
for (e = agfstedge(g,n); e; e = f) {
f = agnxtedge(g,e,n))
/* delete e */
}
\end{verbatim}
Traversals are guaranteed to visit the nodes of a graph, or edges of a node,
in their order of creation in the root graph (unless we allow programmers
to override object ordering, as mentioned in section~\ref{sec:openissues}).
\section{External Attributes}
\label{sec:externalattributes}
Graph objects may have associated string name-value pairs.
When a graph file is read, Cgraph's parser takes care of
the details of this, so attributes can just be added
anywhere in the file. In C programs, values must be
declared before use.
Cgraph assumes that all objects of a given kind (graphs/subgraphs,
nodes, or edges) have the same attributes - there's no notion of
subtyping within attributes. Information about attributes is
stored in data dictionaries. Each graph has three (for
graphs/subgraphs, nodes, and edges) for which you'll need the
predefined constants AGRAPH, AGNODE and AGEDGE in
calls to create, search and walk these dictionaries.
Thus, to create an attribute for nodes, one uses:
\begin{verbatim}
Agsym_t *sym;
sym = agattr(g,AGNODE,"shape","box");
\end{verbatim}
If this succeeds, \verb"sym" points to a descriptor for the
newly created (or updated) attribute. (Thus, even if \verb"shape"
was previously declared and had some other default value,
it would be set to \verb"box" by the above.)
By using a NULL pointer as the value,
you can use the same function to search the attribute definitions of a graph.
\begin{verbatim}
sym = agattr(g,AGNODE,"shape",0);
if (sym)
printf("The default shape is %s.\n",sym->defval);
\end{verbatim}
If you have the pointer to some graph object, you can also use the function
\verb"agattrsym".
\begin{verbatim}
Agnode_t* n;
Agsym_t* sym = agattrsym (n,"shape");
if (sym)
printf("The default shape is %s.\n",sym->defval);
\end{verbatim}
Both functions return NULL if the attribute is not defined.
Instead of looking for a particular attribute, it is possible to
iterate over all of them:
\begin{verbatim}
sym = 0; /* to get the first one */
while (sym = agnxtattr(g,AGNODE,sym)
printf("%s = %s\n",sym->name,sym->defval);
\end{verbatim}
Assuming an attribute already exists for some object, its value can be
obtained or set using its string name or its \verb"Agsym_t" descriptor.
To use the string name, we have:
\begin{verbatim}
str = agget(n,"shape");
agset(n,"shape","hexagon");
\end{verbatim}
If an attribute will be referenced often, it is faster to
use its descriptor as an index, as shown here:
\begin{verbatim}
Agsym_t *sym = agattr(g,AGNODE,"shape","box");
str = agxget(n,sym);
agxset(n,sym,"hexagon");
\end{verbatim}
Cgraph provides two helper functions for dealing with attributes. The function
\verb"agsafeset(void *obj, char *name, char *value, char *def)" first checks
that the attribute has been defined, defining it with the default value \verb"def"
if not. It then uses \verb"value" as the specific value assigned to \verb"obj".
It is sometimes useful to copy all of the values from one object to another. This can
be easily done using \verb"agcopyattr(void *src, void* tgt)". This assumes that the
source and target are the same type of graph objects, and that the attributes of
\verb"src" have already been defined for \verb"tgt". If \verb"src" and \verb"tgt"
belong to the same root graph, this will automatically be true.
\section{Internal Attributes}
\label{sec:internalattributes}
It would be possible to do everything using just string-valued
attributes. In general, though, this will be too inefficient.
To deal with this,
each graph object (graph, node or edge) may have a list of
associated internal data records. The layout of each such
record is programmer-defined, except each must have an
\verb"Agrec_t" header. The records are allocated through
Cgraph. For example:
\begin{verbatim}
typedef struct mynode_s {
Agrec_t h;
int count;
} mynode_t;
mynode_t *data;
Agnode_t *n;
n = agnode(g,"mynodename",TRUE);
data = (mynode_t*)agbindrec(n,"mynode_t",
sizeof(mynode_t),FALSE);
data->count = 1;
\end{verbatim}
In a similar way, \verb"aggetrec" searches for a record, returning a pointer
to the record if it exists and NULL otherwise;
\verb"agdelrec" removes a record from an object.
Although each graph object may have its own unique, individual collection
of records, for convenience, there are functions that update an entire graph
by allocating or removing the same record from all nodes,
edges or subgraphs at the same time. These functions are:
\begin{verbatim}
void aginit(Agraph_t *g, int kind, char *rec_name,
int rec_size, int move_to_front);
void agclean(Agraph_t *g, int kind, char *rec_name);
\end{verbatim}
Note that in addition to \verb"agdelrec" and \verb"agclean", records are removed
and their storage freed when their associated graph object is deleted. Only the
record data structure is freed. If the application has attached any additional heap
memory to a record, it is the responsibility of the application to handle this before
the actual record is deleted.
For further efficiency, there is a way to ``lock'' the data pointer of
a graph object to point to a given record. This can be done by using \verb"TRUE"
as the last argument in \verb"agbindrec", \verb"aginit" or \verb"aggetrec".
If this is done, in the above example
we could then simply cast this pointer to the appropriate type
for direct (un-typesafe) access to the data.
\begin{verbatim}
(mydata_t*) (n->base.data)->count = 1;
\end{verbatim}
Typically, it is convenient to encapsulate this access using macros. For example,
we may have:
\begin{verbatim}
#define ND_count(n) (((mydata_t*)(AGDATA(n)))->count)
ND_count(n) = 1;
\end{verbatim}
As this is unsafe if the record was not allocated for some object, it is good form
to two versions of the macro:
\begin{verbatim}
#ifdef DEBUG
#define ND_count(n) \
(assert(aggetrec(n,"mynode_t",1)),((mynode_t*)(AGDATA(n)))->count)
#else
#define ND_count(n) (((mydata_t*)(AGDATA(n)))->count)
#endif
\end{verbatim}
\section{Subgraphs}
\label{sec:subgraphs}
Subgraphs are an important construct in Cgraph. They are intended for
organizing subsets of graph objects and can be used interchangeably with
top-level graphs in almost all Cgraph functions.
A subgraph may contain any nodes or edges of its parent.
(When an edge is inserted in a subgraph, its nodes
are also implicitly inserted if necessary. Similarly,
insertion of a node or edge automatically implies
insertion in all containing subgraphs up to the root.)
Subgraphs of a graph form a hierarchy (a tree).
Cgraph has functions to create, search, and iterate over subgraphs.
For example,
\begin{verbatim}
Agraph_t *g, *h;
/* search for subgraph by name */
h = agsubg(g,"mysubgraph",FALSE);
if (!h)
/* create subgraph by name */
h = agsubg(g,"mysubgraph",TRUE);
for (h = agfstsubg(g); h; h = agnxtsubg(h)) {
/* agparent is up one level */
assert (g == agparent(h));
/* Use subgraph h */
}
\end{verbatim}
The function \verb"agparent" returns the (sub)graph immediately containing the
argument subgraph. The iteration done using \verb"agfstsubg" and \verb"agnxtsubg"
returns only immediate subgraphs. To find subgraphs further down the hierarchy
requires a recursive search.
It is not uncommon to want to populate a subgraph with nodes and edges that have
already been created. This can be done using
the functions \verb"agsubnode" and \verb"agsubedge",
\begin{verbatim}
Agnode_t *agsubnode(Agraph_t *g, Agnode_t *n, int create);
Agedge_t *agsubedge(Agraph_t *g, Agedge_t *e, int create);
\end{verbatim}
which take a subgraph,
and an object from another subgraph of the same
graph (or possibly a top-level object) and add it to the argument subgraph
if the \verb"create" flag is \verb"TRUE".
It is also added to all enclosing subgraphs, if necessary.
If the \verb"create" flag is \verb"FALSE", then the
request is only treated as a search and returns NULL for failure.
A subgraph can be removed by \verb"agdelsubg(g,subg)" or by
\verb"agclose(subg)".
\section{Utility Functions and Macros}
\label{sec:utilityfunctionsandmacros}
For convenience, Cgraph provides some polymorphic functions and macros
that apply to all Cgraph objects. (Most of these functions could
be implemented in terms of others already described, or by accessing
fields in the \verb"Agobj_t" base object.
\begin{itemize}
\item \verb"AGTYPE(obj)": object type - AGRAPH, AGNODE, or AGEDGE
\item \verb"AGID(obj)": internal object ID (an unsigned long)
\item \verb"AGSEQ(obj)": object creation timestamp (an integer)
\item \verb"AGDATA(obj)": data record pointer (an \verb"Agrec_t*")
\end{itemize}
Other useful functions include:
\begin{verbatim}
/* Returns root graph of obj */
Agraph_t *agroot(void* obj);
/* Returns root graph of obj or obj if a (sub)graph */
Agraph_t *agraphof(void* obj);
/* True of obj belongs to g */
int agcontains(Agraph_t *g, void *obj);
/* Delete obj from the (sub)graph */
int agdelete(Agraph_t *g, void *obj);
/* Synonym of AGTYPE */
int agobjkind(void *obj);
\end{verbatim}
A root graph \verb"g" will always have
\begin{verbatim}
g == agroot(g) == agraphof(g)
\end{verbatim}
\section{Error Handling}
\label{sec:errorhandling}
Cgraph provides some basic error handling functions, hampered by the
lack of exceptions in C. At present, there are basically two types of anomalies:
warnings and errors.
To report an anomaly, one uses:
\begin{verbatim}
typedef enum { AGWARN, AGERR, AGMAX, AGPREV
} agerrlevel_t;
int agerr(agerrlevel_t level, const char *fmt, ...);
\end{verbatim}
The \verb"agerr" function has a \verb"printf"-interface, with the first argument
indicating the severity of the problem.
A message is only written if its severity is higher than a programmer-controlled minimum,
which is AGWARN by default. The programmer can set this value using \verb"agseterr",
which returns the previous value. Calling \verb"agseterr(AGMAX)" turns off the writing of messages.
Sometimes additional context information is only available in functions calling the function
where the error is actually caught. In this case, the calling function can indicate that it
is continuing the current error by using \verb"AGPREV" as the first argument.
The function \verb"agwarningf" is shorthand for \verb"agerr(AGWARN,...)"; similarly,
\verb"agerrorf" is shorthand for \verb"agerr(AGERR,...)".
Some applications desire to directly control the writing of messages. Such an application can
use the function \verb"agseterrf" to register a function that the library should call to actually
write the message:
\begin{verbatim}
typedef int (*agusererrf) (char*);
agusererrf agseterrf(agusererrf);
\end{verbatim}
The previous error function is returned. By default, the messages are written to \verb"stderr".
Errors not written are stored in a log file. The last recorded error can be retrieved by
calling \verb"aglasterr".
The function \verb"agerrors" returns the maximum severity reported to \verb"agerr".
The function \verb"agreseterrors" is identical, except it also resets the error level
as though no errors have been reported.
\section{Strings}
\label{sec:strings}
As mentioned, Cgraph maintains a reference-counted shared string pool for each graph.
As there are often many identical strings used in a graph, this helps to reduce the
memory overhead.
\begin{verbatim}
char *agstrdup(Agraph_t *, char *);
char *agstrbind(Agraph_t *, char*);
int agstrfree(Agraph_t *, char *);
char *agstrdup_html(Agraph_t *, char *);
int aghtmlstr(char *);
char *agcanonStr(char *str);
char *agstrcanon(char *, char *);
char *agcanon(char *, int);
\end{verbatim}
Cgraph has functions to directly create and destroy references to shared strings.
As with any reference-counted object, you can use these as ordinary
strings, though the data should not be modified. If you need to store the pointer
in some data structure, you should call \verb"agstrdup(g,s)". This will create a
new copy of \verb"s" if necessary, and increment
the count of references, ensuring that the string will not be freed by some
other part of the program while you are still using it. Since \verb"agstrdup" makes
a copy of the string, the argument string can use temporary storage.
A call to \verb"agstrfree(g,s)" decrements the reference count and, if this becomes
zero, frees the string.
The function \verb"agstrbind(g,s)" checks if a shared string identical to \verb"s" exists
and, if so, returns it. Otherwise, it returns NULL.
The DOT language supports a special type of string, containing HTML-like markups. To make
sure that the HTML semantics are used, the application should call \verb"agstrdup_html(g,s)"
rather than \verb"agstrdup(g,s)". The following code makes sure the node's label will be
interpreted as an HTML-like string and appear as a table. If \verb"agstrdup" were used,
the label would appear as the literal string \verb"s".
\begin{verbatim}
Agraph_t* g;
Agnode_t* n;
char* s = "<TABLE><TR><TD>one cell</TD></TR></TABLE>";
agset (n, "label", agsgtrdup_html (g,s));
\end{verbatim}
The predicate \verb"aghtmlstr" returns \verb"TRUE" if the argument string is marked as an
HTML-like string. {\bf N.B.} This is only valid if the argument string is a shared string.
HTML-like strings are also freed using \verb"agstrfree".
The DOT language uses various special characters and escape sequences. When writing out
strings as concrete text, it is important that these lexical features are used in order
that the string can be re-read as DOT and be interpreted correctly.
Cgraph provides three functions to handle this. The simplest is \verb"agcanonStr(s)".
It returns a pointer to a canonical version of the input string. It uses an internal
buffer, so the returned string should be written or copied before another call to
\verb"agcanonStr". \verb"agstrcanon(s,buf)" is identical, except the calling function
also supplies a buffer where the canonical version may be written. An application should only use
the returned pointer, as it is possible the buffer will not be used at all. The buffer
needs to be large enough to hold the canonical version. Normally, an expansion of
\verb"2*strlen(s)+2" is sufficient.
Both \verb"agcanonStr" and \verb"agstrcanon" assume that string argument is a
shared string. For convenience, the library also provides the function \verb"agcanon(s,h)".
This is identical to \verb"agcanonStr(s)", except \verb"s" can be any string. If \verb"h"
is \verb"TRUE", the canonicalization assumes \verb"s" is an HTML-like string.
\section{Expert-level Tweaks}
\label{sec:expertleveltweaks}
\subsection{Callbacks}
\label{subsec:callbacks}
There is a way to register client functions to be called
whenever graph objects are inserted into or deleted from a graph or subgraph,
or have their string attributes modified.
The arguments to the callback functions for insertion and
deletion (an \verb"agobjfn_t") are the containing (sub)graph, the object and a pointer
to a piece of state data supplied by the application.
The object update callback (an \verb"agobjupdfn_t") also
receives the data dictionary entry pointer for the
name-value pair that was changed.
The graph argument will be the root graph.
\begin{verbatim}
typedef void (*agobjfn_t)(Agraph_t*, Agobj_t*, void*);
typedef void (*agobjupdfn_t)(Agraph_t*, Agobj_t*,
void*, Agsym_t*);
struct Agcbdisc_s {
struct {
agobjfn_t ins;
agobjupdfn_t mod;
agobjfn_t del;
} graph, node, edge;
};
\end{verbatim}
Callback functions are installed by \verb"agpushdisc", which also takes
a pointer to the client data structure \verb"state" that is
later passed to the callback function when it is invoked.
\begin{verbatim}
agpushdisc(Agraph_t *g, Agcbdisc_t *disc, void *state);
\end{verbatim}
Callbacks are removed by \verb"agpopdisc", which deletes a
previously installed set of callbacks anywhere in the stack.
This function returns zero for success. (In real life, this
function isn't used much; generally callbacks are set up and
left alone for the lifetime of a graph.)
\begin{verbatim}
int agpopdisc(Agraph_t *g, Agcbdisc_t *disc);
\end{verbatim}
The default is for callbacks to be issued synchronously, but it is
possible to hold them in a queue of pending callbacks to be delivered
on demand. This feature is controlled by the interface:
\begin{verbatim}
/* returns previous value */
int agcallbacks(Agraph_t *g, int flag);
\end{verbatim}
If the flag is zero, callbacks are kept pending.
If the flag is one, pending callbacks are immediately issued,
and the graph is put into immediate callback mode.
(Therefore the state must be reset via \verb"agcallbacks"
if they are to be kept pending again.)
{\bf N.B.}: it is a small inconsistency that Cgraph depends on the client
to maintain the storage for the callback function structure.
(Thus it should probably not be allocated on the dynamic stack.)
The semantics of \verb"agpopdisc" currently identify callbacks by
the address of this structure so it would take a bit of reworking
to fix this. In practice, callback functions are usually passed
in a static struct.
\subsection{Disciplines}
\label{subsec:disciplines}
A graph has an associated set of methods (``disciplines'')
for file I/O and graph object ID assignment.
\begin{verbatim}
typedef struct {
Agiddisc_t *id;
Agiodisc_t *io;
} Agdisc_t;
\end{verbatim}
A pointer to an \verb"Agdisc_t" structure is used as an argument when
a graph is created or read using \verb"agopen",
\verb"agread" and \verb"agconcat". If the pointer is NULL, the default
Cgraph disciplines are used. An application can pass in its own disciplines
to override the defaults. Note that it doesn't need to provide disciplines for
all three fields. If any field is the NULL pointer, Cgraph will use the default
discipline for that task.
The default disciplines are also accessible by name.
\begin{verbatim}
Agiddisc_t AgIdDisc;
Agiodisc_t AgIoDisc;
Agdisc_t AgDefaultDisc;
\end{verbatim}
This is useful because, unlike with \verb"Agdisc_t", all of the fields of
specific disciplines must be non-NULL.
Cgraph copies the three individual pointers. Thus, these three structures must remain
allocated for the life of the graph, though the \verb"Agdisc_t" may be temporary.
\subsubsection{I/O management}
The I/O discipline is probably the most frequently used of the three, as the I/O requirements of
applications vary widely.
\begin{verbatim}
typedef struct Agiodisc_s {
int (*afread)(void *chan, char *buf, int bufsize);
int (*putstr)(void *chan, char *str);
int (*flush)(void *chan);
} Agiodisc_t ;
\end{verbatim}
The default I/O discipline uses stdio and the FILE structure for reading and writing. The functions
\verb"afread", \verb"putstr" and \verb"flush" should have semantics roughly equivalent to
\verb"fread", \verb"fputs" and \verb"fflush", with the obvious permutation of arguments.
The implementation of the \verb"agmemread" function of Cgraph provides a typical example of using
a tailored I/O discipline. The idea is to read a graph from a given string of characters. The
implementation of the function is given below.
The \verb"rdr_t" provides a miniature version of FILE, providing the necessary state information.
The function \verb"memiofread" fills the role of \verb"afread" using the state provided by \verb"rdr_t".
The \verb"agmemread" puts everything together, creating the needed state using the argument string,
and constructing a discipline structure using \verb"memiofread" and the defaults. It then calls
\verb"agread" with the state and discipline to actually create the graph.
\lgrindfile{agmemread.tex}
\subsubsection{ID management}
Graph objects (nodes, edges, subgraphs) use an uninterpreted long integer value as keys.
The ID discipline gives the application control over how these keys are allocated, and how
they are mapped to and from strings.
The ID discipline makes it possible for a Cgraph client
control this mapping. For instance, in one application, the client
may create IDs that are pointers into another object space defined by
a front-end interpreter. In general, the ID discipline must provide
a map between internal IDs and external strings.
\begin{verbatim}
typedef unsigned long ulong;
typedef struct Agiddisc_s {
void *(*open)(Agraph_t *g, Agdisc_t*);
long (*map)(void *state, int objtype, char *name,
ulong *id, int createflag);
long (*alloc)(void *state, int objtype, ulong id);
void (*free)(void *state, int objtype, ulong id);
char *(*print)(void *state, int objtype, ulong id);
void (*close)(void *state);
void (*idregister) (void *state, int objtype, void *obj);
} Agiddisc_t;
\end{verbatim}
The \verb"open" function permits the ID discipline to initialize any data structures that it maintains
per individual graph. Its return value is then passed as the first argument (\verb"void *state") to all
subsequent ID manager calls.
When the graph is closed, the discipline's \verb"close" function is called to allow for finalizing and
freeing any resources used.
The \verb"alloc" and \verb"free" functions explicitly create and destroy IDs.
The former is used by Cgraph to see if the given ID is available for use. If it is available, the function
should allocate it and return \verb"TRUE"; otherwise, it should return \verb"FALSE".
If it is not available, the calling function will abort the operation.
\verb"free" is called to inform the ID manager that the object labeled with the given ID is about to be
deleted.
If supported buy the ID discipline, the \verb"map" function is called
to convert a string name into an ID for a given object type
(AGRAPH, AGNODE, or AGEDGE), with an optional flag that tells if the
ID should be allocated if it does not already exist.
There are four cases:
\begin{description}
\item[{\tt name} \&\& {\tt createflag}]
Map the string (e.g., a name in a graph file)
into an ID. If the ID manager can comply, then it stores the result
in the \verb"id" parameter and returns \verb"TRUE". It is then also responsible for
being able to print the ID again as a string. Otherwise, the ID manager
may return \verb"FALSE" but it must implement the following case, at least
in order for the reading and writing of graph files to work.
\item[!{\tt name} \&\& {\tt createflag}]
Create a unique new ID. It may return FALSE if it does not support
anonymous objects, but this is strongly discouraged
in order for Cgraph to support ``local names'' in graph files.
\item[{\tt name} \&\& !{\tt createflag}]
Test if \verb"name" has already been mapped and allocated. If so, the ID
should be returned in the \verb"id" parameter.
Otherwise, the ID manager may
either return \verb"FALSE", or may store any unallocated ID into result.
This is convenient, for example, if names
are known to be digit strings that are directly converted into integer values.
\item[!{\tt name} \&\& !{\tt createflag}] Never used.
\end{description}
The \verb"print" function is called to convert an internal ID back to a string.
It is allowed to return a pointer to a static buffer;
a caller must copy its value if needed past subsequent calls.
NULL should be returned by ID managers that do not map names.
Note that the \verb"alloc" and \verb"map" functions do not receive pointers to any
newly created object. If a client needs to install object pointers in a handle table,
it can obtain them via new object callbacks (Section~\ref{subsec:callbacks}).
To make this mechanism accessible, Cgraph provides functions
to create objects by ID instead of external name:
\begin{verbatim}
Agnode_t *agidnode(Agraph_t *g, unsigned long id, int create);
Agedge_t *agidedge(Agraph_t *g, Agnode_t *t, Agnode_t *h,
unsigned long id, int create);
Agraph_t *agidsubg(Agraph_t *g, unsigned long id, int create);
\end{verbatim}
Note that, with the default ID discipline, these functions return NULL.
\subsection{Flattened node and edge lists}
For random access, nodes and edges are usually stored in splay trees.
This adds a small but noticeable overhead when traversing the ``lists.''
For flat-out efficiency,
there is a way of linearizing the splay trees in which node
and edge sets are stored, converting them into flat lists. After
this they can be traversed very quickly.
\section{Related Libraries}
\label{sec:relatedlibraries}
{\bf Libgraph} is a predecessor of Cgraph and is now considered
obsolete. All of the Graphviz code is now written using Cgraph.
As some older applications using libgraph
may need to be converted to Cgraph, we note some of the main
differences.
A key difference between the two libraries is the handling of
C data structure attributes. Libgraph hard-wires these
at the end of the graph, node and edge structuress. That is,
an application programmer defines the structs {\tt graphinfo}, {\tt nodeinfo}
and {\tt edgeinfo} before including {\tt graph.h}, and the library inquires of
the size of these structures at runtime so it can allocate graph
objects of the proper size. Because there is only one shot at
defining attributes, this design creates an impediment to writing
separate algorithm libraries.
The dynamic \verb"Agrec_t" structures, described in Section~\ref{sec:internalattributes},
allows each algorithm to attach its own required data structure.
As noted in Section~\ref{sec:edges}, edges are implemented slightly
differently in the two libraries, so comparison of edge pointers for
equality should be replaced with \verb"ageqedge" unless you are certain
that the two pointers have consistent types. As an example where problems
could arise, we have the following code:
\begin{verbatim}
void traverse(Agraph_t *g, Agedge_t *e0)
{
Agnode_t *n = aghead (e0);
Agedge_t *e;
for (e = agfstout(g, n); n; n = agnxtout(g, e)) {
if (e == e0) continue; /* avoid entry edge */
/* do something with e */
}
\end{verbatim}
If \verb"e0" is an in-edge (\verb"AGTYPE(e) == AGINEDGE"), the comparison
with \verb"e" will always fail, as the latter is an out-edge.
In Cgraph, the nesting of subgraphs forms a tree.
In libgraph, a subgraph can belong to more than one parent,
so they form a DAG (directed acyclic graph). Libgraph
actually represents this DAG as a special meta-graph
that is navigated by libgraph calls. After gaining
experience with libgraph, we decided this complexity
was not worth its cost. In libgraph, the code for traversing
subgraphs would have a form something like\label{subg}:
\begin{verbatim}
Agedge_t* me;
Agnode_t* mn;
Agraph_t* mg = g->meta_node->graph;
Agraph_t* subg;
for (me = agfstout(mg, g->meta_node); me; me = agnxtout(mg, me)) {
mn = me->head;
subg = agusergraph(mn);
/* use subgraph subg */
}
\end{verbatim}
The similar traversal using Cgraph would have the form
\begin{verbatim}
Agraph_t* subg;
for (subg = agfstsubg(g); subg; subg = agnxtsubg(subg)) {
/* use subgraph subg */
}
\end{verbatim}
Finally, there are some small syntactic differences in the APIs.
For example, in libgraph, the name of a node or graph and
the head and tail nodes of an edge are
directly accessible via pointers while in Cgraph, it is necessary to use
the functions \verb"agnameof", \verb"agtail" and \verb"aghead".
Libgraph tends to have separate functions for creating and finding
an object, or for handling different types of objects, e.g.,
\verb"agnode" and \verb"agfindnode". Instead, Cgraph will use a single function
with an additional parameter. Overall, the two libraries are vary close, both
syntactically and semantically, so conversion is fairly straightforward.
Tables~\ref{table:libgraph:g}--\ref{table:libgraph:a} list the common
constants and operations in
libgraph, and the corresponding value or procedure in Cgraph, if any.
\begin{savenotes}
\begin{table*}[htb]
\centering
\begin{tabular}{|l|l|} \hline
\multicolumn{1}{|c|}{\em libgraph} & \multicolumn{1}{c|}{\em Cgraph} \\ \hline
\verb"AGRAPH" & \verb"Agundirected" \\ \hline
\verb"AGRAPHSTRICT" & \verb"Agstrictundirected" \\ \hline
\verb"AGDIGRAPH" & \verb"Agdirected" \\ \hline
\verb"AGDIGRAPHSTRICT" & \verb"Agstrictdirected" \\ \hline
\verb"aginit" & not necessary\footnote{The Cgraph library has a function called \texttt{aginit}, but it has a different signature and semantics.} \\ \hline
\verb"agopen(name,type)" & \verb"agopen(name,type,NULL)"\\ \hline
\verb"agread(filep)" & \verb"agread(filep,NULL)"\\ \hline
\verb"agread_usergets(filep,gets)" \\
& \verb"Agiddisc_t id = AgIdDisc;" \\
& \verb"Agiodisc_t io = AgIoDisc;" \\
& \verb"io.afread = gets;"\footnote{Note that the order of the parameters differs from the \texttt{gets} used in libgraph.} \\
& \verb"agread(filep,disc);"\\ \hline
\verb"agsetiodisc" & no direct analogue; see above \\ \hline
\verb"obj->name" & \verb"agnameof(obj);" \\ \hline
\verb"graph->root" & \verb"agroot(graph);" \\ \hline
\verb"node->graph" & \verb"aggraphof(node);" \\ \hline
\verb"edge->head" & \verb"aghead(edge);" \\ \hline
\verb"edge->tail" & \verb"agtail(edge);" \\ \hline
\verb"AG_IS_DIRECTED(graph)" & \verb"agisdirected(graph)" \\ \hline
\verb"AG_IS_STRICT(graph)" & \verb"agisstrict(graph)" \\ \hline
\verb"agobjkind(obj)" & \verb"AGTYPE(obj)" \\ \hline
\verb"agsubg(parent,name)" & \verb"agsubg(parent,name,1)" \\ \hline
\verb"agfindsubg(parent,name)" & \verb"agsubg(parent,name,0)" \\ \hline
\verb"g->meta_node_graph" & \verb"agfstsubg/agnxtsubg" \\
\verb"agusergraph(graph)" & See the examples on Page~\pageref{subg}\\ \hline
\verb"aginsert(graph, obj)" & \verb"agsubnode(graph,obj);" if obj is a node \\
& \verb"agsubedge(graph,obj);" if obj is a edge \\
& not allowed if obj is a graph \\ \hline
\end{tabular}
\caption{Graph function conversions}
\label{table:libgraph:g}
\end{table*}
\begin{table*}[ht]
\centering
\begin{tabular}{|l|l|} \hline
\multicolumn{1}{|c|}{\em libgraph} & \multicolumn{1}{c|}{\em Cgraph} \\ \hline
\verb"agnode(graph,name)" & \verb"agnode(graph,name,1") \\ \hline
\verb"agfindnode(graph,name") & \verb"agnode(graph,name,0") \\ \hline
\verb"agedge(graph,tail,head") & \verb"agedge(graph,tail,head,NULL,1") \\ \hline
\verb"agfindedge(graph,tail,head") & \verb"agedge(graph,tail,head,NULL,0") \\ \hline
\end{tabular}
\caption{Node and edge function conversions}
\label{table:libgraph:n}
\end{table*}
\begin{table*}[ht]
\centering
\begin{tabular}{|l|l|} \hline
\multicolumn{1}{|c|}{\em libgraph} & \multicolumn{1}{c|}{\em Cgraph} \\ \hline
\verb"agprotograph()" & no analogue \\ \hline
\verb"agprotonode(graph)" & not used \\ \hline
\verb"agprotoedge(graph)" & not used \\ \hline
\verb"agattr(obj, name, default)" & \verb"agattr(agroot(obj),AGTYPE(obj),name,default)" \\ \hline
\verb"agfindattr(obj, name)"& \verb"agattrsym(obj,name)" \\ \hline
\verb"agraphattr(graph, name default)" & \verb"agattr(graph,AGRAPH,name,default)"\\ \hline
\verb"agnodeattr(graph, name, default)" & \verb"agattr(graph,AGNODE,name,default)"\\ \hline
\verb"agedgeattr(graph, name, default)" & \verb"agattr(graph,AGEDGE,name,default)"\\ \hline
\verb"agfstattr(obj)" & \verb"agnxtattr(agroot(obj),AGTYPE(obj),NULL)"\\ \hline
\verb"agnxtattr(obj,sym)" & \verb"agnxtattr(agroot(obj),AGTYPE(obj),sym)"\\ \hline
\verb"aglstattr(obj)" & no analogue\\ \hline
\verb"agprvattr(obj, sym)" & no analogue\\ \hline
\end{tabular}
\caption{Attribute function conversions}
\label{table:libgraph:a}
\end{table*}
{\bf Lgraph} is a successor to Cgraph, written in C++ by Gordon Woodhull.
It follows Cgraph's overall graph model (particularly, its subgraphs and
emphasis on efficient dynamic graph operations) but uses the C++ type
system of templates and inheritance to provide typesafe, modular and
efficient internal attributes. (LGraph relies on cdt for dictionary
sets, with an STL-like C++ interface layer.) A fairly mature prototype
of the Dynagraph system (a successor to dot and neato to handle
online maintenance of dynamic diagrams) has been prototyped in LGraph.
See the dgwin (Dynagraph for Windows) page
\url{https://www.dynagraph.org/dgwin/} for further details.
\section{Interfaces to Other Languages}
\label{sec:interfacetootherlanguages}
If enabled, the Graphviz package contains bindings for Cgraph in a variety
of languages, including Java, Perl, PHP, Tcl, Python and Ruby.
\section{Open Issues}
\label{sec:openissues}
{\bf Node and Edge Ordering.} The intent in Cgraph's design was to
eventually support user-defined node and edge set ordering, overriding
the default (which is object creation timestamp order). For example,
in topologically embedded graphs, edge lists could potentially be sorted
in clockwise order around nodes.
Because Cgraph assumes that all edge sets in the same \verb"Agraph_t"
have the same ordering, there should probably be a new primitive to
switching node or edge set ordering functions. Please contact the
author if you need this feature.
{\bf XML.} XML dialects such as GXL and GraphML have been proposed
for graphs. Although it is simple to encode nodes and edges in XML,
there are subtleties to representing more complicated structures, such as
Cgraph's subgraphs and nesting of attribute defaults.
On the other hand, GXL and GraphML provide notions of compound graphs,
which are not available in DOT.
We've prototyped an XML parser and would like to complete and release
this work if we had a compelling application. (Simple XML output of
graphs is not difficult to program.)
{\bf Graph views; external graphs.} At times it would be convenient
to relate one graph's objects to another's without making one into a
subgraph of another. At other times there is a need to populate
a graph from objects delivered on demand from a foreign API (such as a
relational database that stores graphs). We are now experimenting with
attacks on some of these problems.
%{\bf Additional primitives.} To be done: Object renaming, cloning, first-class
%nodes and edges, etc.
\end{savenotes}
\section{Example}
\label{sec:example}
The following is a simple Cgraph filter that reads a graph and
emits its strongly connected components, each as a separate graph,
plus an overview map of the relationships between the components.
To save space, the auxiliary functions in the header ingraph.h
are not shown; the entire program can be found in the Graphviz
source code release under \verb"cmd/tools".
About lines 32-41 are the declarations for internal records for
graphs and nodes. Line 43-48 define access macros
for fields in these records. Lines 52-83 define a simple stack
structure needed for the strongly connected component algorithm
and down to line 97 are some global definitions for the program.
The rest of the code can be read from back-to-front. From
around line 262 to the end is boilerplate code that handles
command-line arguments and opening multiple graph files.
The real work is done starting with the function {\it process}
about line 223, which works on one graph at a time. After initializing
the node and graph internal records using {it aginit}, it creates a new graph for the
overview map, and it calls {\it visit} on unvisited nodes to
find components. {\it visit} implements a standard algorithm to form
the next strongly connected component on a stack. When one has been
completed, a new subgraph is created and the nodes of the component
are installed. (There is an option to skip trivial components that
contain only one node.) {\it nodeInduce} is called to process the
out-edges of nodes in this subgraph. Such edges either belong to
the component (and are added to it), or else point to a node in
another component that must already have been processed.
%\newpage
\lgrindfile{sccmap.tex}
\end{document}
|