<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Draft//EN">
<HTML>
<HEAD>
<TITLE>NoSQL: Generating or modifying rdbtables</TITLE>
</HEAD>
<BODY>
<A HREF="NoSQL-4.html">Previous</A>
<A HREF="NoSQL-6.html">Next</A>
<A HREF="NoSQL.html#toc5">Contents</A>
<HR>
<H2><A NAME="s5">5. Generating or modifying rdbtables</A> </H2>
<H2><A NAME="ss5.1">5.1 Generating new rdbtables</A>
</H2>
<P>Any editor may be used to construct or modify an rdbtable,
since it is a regular UNIX file, and this 'direct editing'
method is occasionally used, especially for small amounts
of data. However, avoid using an editor that destroys
TAB characters.
<P>To generate a new rdbtable the best plan (and usually the
safest one) is to first generate a template file, then
convert it to rdbtable format and add the rows of data.
Any convenient editor may be used to generate a template
file. To convert it to an rdbtable the command 'nsq-headchg
-gen' may be used, which will produce an empty rdbtable.
Next use the operator 'nsq-ed' to edit in rows of data.
<P>A typical template file is shown below:
<P>
<BLOCKQUOTE><CODE>
<PRE>
# These are lines of table documentation. They can be of any length,
# and any number of such lines may exist.
# Each line must start correctly, e.g. with "# " or " #". Any number of
# space characters may precede the sharp sign in the second case above.
0 Name 24 Name of item
1 Type 1 Type: 1,2,3,7,9,A,B,X
2 Count 3N Number of items
3 K 1 Constant modifier
4 SS7 2 Special status for type 7
5 Size 12N In Kilobytes
</PRE>
</CODE></BLOCKQUOTE>
<P>It makes sense to have all significant or critical
documentation about an rdbtable embedded in the rdbtable,
rather than in some other place. The above template file
contains the usual elements to describe a table of six
columns: table documentation (the comment lines that
each start with a sharp sign '#'), index number (the
first number on each of the column lines), column name
("Name", "Type", "Count", ...), column definition ("24",
"1", "3N", ...), and column documentation for each column
(the text at the end of each column line).
<P>Note that the index number, column name, and column
definition all consist of contiguous characters, each
forming a word separated by whitespace. Also note that
there are one or more space characters after the column
definition and before the column documentation. That is,
the column documentation starts with the fourth word on
the line.
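<P>The word-based layout described above can be illustrated with a
short Python sketch. It is not part of the NoSQL package: the
function name 'parse_template' and the sample text are hypothetical,
and the sketch assumes only what the text states, namely that comment
lines start with a sharp sign (optionally preceded by spaces) and that
column documentation begins at the fourth word of a column line.

```python
def parse_template(text):
    """Split template text into comment lines and column entries.

    Hypothetical helper, for illustration only.
    """
    comments = []
    columns = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # ignore blank lines
        if stripped.startswith("#"):
            # Table documentation: "#" optionally preceded by spaces.
            comments.append(stripped)
            continue
        # The first three words are index, name and definition;
        # everything from the fourth word on is column documentation.
        parts = stripped.split(None, 3)
        if len(parts) == 3:
            parts.append("")  # no documentation text on this line
        if len(parts) != 4:
            raise ValueError("malformed column line: %r" % line)
        index, name, definition, doc = parts
        columns.append((int(index), name, definition, doc))
    return comments, columns


TEMPLATE = """\
# Sample table documentation.
 # A second comment line.
0 Name 24 Name of item
1 Type 1 Type: 1,2,3,7,9,A,B,X
2 Count 3N Number of items
"""

comments, columns = parse_template(TEMPLATE)
for index, name, definition, doc in columns:
    print(index, name, definition, doc)
```

<P>Splitting on whitespace at most three times keeps the column
documentation intact, including any spaces it contains.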
<P>When the template file is converted into an rdbtable,
all documentation will remain in the header (although
the column documentation may be hard to read if there are
many columns). At any time the entire header, including
documentation, can be viewed by using the command 'nsq-valid
-templ < rdbtable' (or 'nsq-headchg -templ < rdbtable'). The
output from either command will be essentially like the
above example.
<P>
<H2><A NAME="ss5.2">5.2 Modifying existing rdbtables</A>
</H2>
<P>There are two basic ways to modify an existing
rdbtable: use either 'nsq-ed' or 'nsq-merge'.
<P>The operator 'nsq-ed' can be used to add new rows, change
existing rows, or delete existing rows of data in an
rdbtable. It can be used in either 'column' or 'list'
form, and the choice depends somewhat on the structure
of the rdbtable. Suppose the rdbtable has several columns
of relatively narrow data (that will all fit in the width
of the current window or terminal) and also several very
wide columns (none of which will fit). If changes need to
be made to some of the narrow columns, then it makes sense
to use 'nsq-ed' on the desired narrow columns in 'column'
form, as in:
<P>
<BLOCKQUOTE><CODE>
<PRE>
nsq-ed table narrow_cola narrow_colb ...
</PRE>
</CODE></BLOCKQUOTE>
<P>If changes need to be made to some of the wide columns
then use 'nsq-ed' in 'list' form on the wide columns,
plus any key columns necessary, as in:
<P>
<BLOCKQUOTE><CODE>
<PRE>
nsq-ed -list table control_col ... wide_cola wide_colb ...
</PRE>
</CODE></BLOCKQUOTE>
<P>After editing an rdbtable it is always recommended that
the structure of the rdbtable be checked with the operator
'nsq-valid'. If there are data values that are longer than
the defined column width, use the command 'nsq-valid -w'
to produce more verbose output.
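<P>The kind of width check mentioned above can be sketched in Python.
This is only an illustration and does not reproduce what 'nsq-valid'
actually does. It assumes that the leading number of a column
definition (such as "24" or "3N") is the column width and that data
rows are TAB-separated; the function names are hypothetical.

```python
import re


def column_width(definition):
    """Return the leading integer of a definition such as "24" or "3N".

    Assumption: that leading integer is the defined column width.
    """
    match = re.match(r"\d+", definition)
    if match is None:
        raise ValueError("no width in definition: %r" % definition)
    return int(match.group())


def overlong_values(definitions, rows):
    """Yield (row_number, column_number, value) for each value that is
    longer than the defined width of its column.

    Rows are numbered from 1, columns from 0 (like the template index).
    """
    widths = [column_width(d) for d in definitions]
    for row_number, row in enumerate(rows, start=1):
        values = row.split("\t")  # assumption: TAB-separated columns
        for column_number, (value, width) in enumerate(zip(values, widths)):
            if len(value) > width:
                yield row_number, column_number, value


definitions = ["24", "1", "3N"]
rows = ["Hammer\t1\t12", "Wrench\t27\t1000"]
problems = list(overlong_values(definitions, rows))
for row_number, column_number, value in problems:
    print("row", row_number, "column", column_number, "value", value)
```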
<P>The 'nsq-merge' process actually involves other operators,
like 'nsq-search' and 'nsq-ed', and works only when the existing
rdbtable is sorted on one or more columns (which is a
fairly common case). The process includes selecting
rows from an existing sorted rdbtable (using 'nsq-search')
into a small rdbtable which is easy to edit (using
'nsq-ed') and then combining the two rdbtables again (using
'nsq-merge'). Since 'nsq-ed' is used, modifications may include
changes, additions, or deletions of rows. Also note that
'nsq-merge' keeps the final table in sort order.
<P>Compared with running 'nsq-ed' on the whole table, the
advantages are that 'nsq-search' is much faster than 'nsq-row'
or 'nsq-ed', that the editing is done on a table of conveniently
small size, and that the 'nsq-merge' operation can be
done in the background. Remember that whether one uses
'nsq-merge' or 'nsq-ed', putting the data back together after
editing requires a pass over the entire original table,
which can take some time if the original rdbtable is large.
<P>
<H2><A NAME="ss5.3">5.3 Concatenating rdbtables</A>
</H2>
<P>The need to concatenate rdbtables comes up every
so often, and although it is simple to do, it may not
be obvious. The UNIX 'cat' command cannot be used,
as it would duplicate the header and thus produce an
invalid rdbtable. And of course, only rdbtables with the
same header should be concatenated, otherwise an invalid
rdbtable would result (a gross inconsistency if the number
of columns were different). If we have two rdbtables,
TABA and TABB, then to concatenate TABB onto the end of
TABA we use the command:
<P>
<BLOCKQUOTE><CODE>
<PRE>
nsq-headchg -del < TABB >> TABA
</PRE>
</CODE></BLOCKQUOTE>
<P>Note that this avoids duplicating the header, and
that in this case the operator 'nsq-headchg' does not
use a template file.
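<P>The idea behind the command above (drop the header of the second
table, then append its rows to the first) can be sketched in Python.
This is an illustration under stated assumptions, not the
implementation of 'nsq-headchg'. The sketch assumes that a header
consists of any leading comment lines followed by one column-name
line and one column-definition line. It compares only the column-name
lines before concatenating. The function names and sample tables are
hypothetical.

```python
def split_header(lines):
    """Split table lines into (header, rows).

    Assumed layout: leading "#" comment lines, then one column-name
    line and one column-definition line, then the data rows.
    """
    comment_count = 0
    for line in lines:
        if not line.lstrip(" ").startswith("#"):
            break
        comment_count += 1
    header_end = comment_count + 2
    if len(lines[comment_count:header_end]) != 2:
        raise ValueError("table has no name and definition lines")
    return lines[:header_end], lines[header_end:]


def concatenate(table_a, table_b):
    """Return table_a followed by the data rows of table_b."""
    header_a, rows_a = split_header(table_a)
    header_b, rows_b = split_header(table_b)
    # Second-to-last header line is the column-name line (assumption).
    if header_a[-2] != header_b[-2]:
        raise ValueError("column names differ; refusing to concatenate")
    return header_a + rows_a + rows_b


taba = ["# Parts on hand", "Name\tCount", "24\t3N", "Hammer\t12"]
tabb = ["Name\tCount", "24\t3N", "Wrench\t7"]
combined = concatenate(taba, tabb)
print("\n".join(combined))
```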
<P>The operator 'nsq-merge' may also be used to merge
two like rdbtables based on a key of one or more columns.
In this case, however, the two rdbtables must be sorted on
the key.
<P>
<HR>
<A HREF="NoSQL-4.html">Previous</A>
<A HREF="NoSQL-6.html">Next</A>
<A HREF="NoSQL.html#toc5">Contents</A>
</BODY>
</HTML>