<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Draft//EN">
<HTML>
<HEAD>
<TITLE>NoSQL: Generating or modifying rdbtables</TITLE>
</HEAD>
<BODY>
<A HREF="NoSQL-4.html">Previous</A>
<A HREF="NoSQL-6.html">Next</A>
<A HREF="NoSQL.html#toc5">Contents</A>
<HR>
<H2><A NAME="s5">5. Generating or modifying rdbtables</A>  </H2>

<H2><A NAME="ss5.1">5.1 Generating new rdbtables</A>
        </H2>

<P>Any editor may be used to construct or modify an rdbtable,
since it is a regular UNIX file, and this 'direct editing'
method is occasionally used, especially for small amounts
of data.  However, avoid using an editor that destroys
TAB characters.
<P>To generate a new rdbtable the best plan (and usually the
safest one) is to first generate a template file, then
convert it to rdbtable format and add the rows of data.
Any convenient editor may be used to generate a template
file.  To convert it to an rdbtable the command 'nsq-headchg
-gen' may be used, which will produce an empty rdbtable.
Next use the operator 'nsq-ed' to edit in rows of data.
<P>A typical template file is shown below:
<P>
<BLOCKQUOTE><CODE>
<PRE>

      # These are lines of table documentation. They can be of any length,
      # and any number of such lines may exist.
      # Each line must start correctly, e.g. with &quot;# &quot; or &quot; #&quot;. Any number of
      # space characters may precede the sharp sign in the second case above.
      0       Name  24   Name of item
      1       Type   1   Type: 1,2,3,7,9,A,B,X
      2      Count   3N  Number of items
      3          K   1   Constant modifier
      4        SS7   2   Special status for type 7
      5       Size  12N  In Kilobytes
    
        
</PRE>
</CODE></BLOCKQUOTE>
<P>It makes sense to have all significant or critical
documentation about an rdbtable embedded in the rdbtable,
rather than in some other place.  The above template file
contains the usual elements to describe a table of six
columns: table documentation (the comment lines that
each start with a sharp sign '#'), index number (the
first number on each of the column lines), column name
("Name", "Type", "Count", ...), column definition ("24",
"1", "3N", ...), and column documentation for each column
(the text at the end of each  column line).
<P>Note that the index number, column name, and column
definition all consist of contiguous characters, each
forming a word separated by whitespace.  Also note that
there are one or more space characters after the column
definition and before the column documentation.  That is,
the column documentation starts with the fourth word on
the line.
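<P>The layout rules above can be sketched in a few lines of Python.
This is only an illustration of how a template file breaks down into
its parts, not the parser that 'nsq-headchg' itself uses; the function
name parse_template and the sample text are hypothetical.

```python
def parse_template(text):
    """Split a template file into table documentation and column entries."""
    docs, columns = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # ignore blank lines
        if stripped.startswith("#"):
            # table documentation: a sharp sign, optionally preceded by spaces
            docs.append(stripped.lstrip("#").strip())
            continue
        # index, name, definition, then documentation from the fourth word on
        parts = stripped.split(None, 3)
        if len(parts) not in (3, 4):
            raise ValueError("malformed column line: " + repr(line))
        doc = parts[3] if len(parts) == 4 else ""
        columns.append((int(parts[0]), parts[1], parts[2], doc))
    return docs, columns


TEMPLATE = """\
      # Example table documentation.
      0       Name  24   Name of item
      1       Type   1   Type: 1,2,3,7,9,A,B,X
      2      Count   3N  Number of items
"""

docs, columns = parse_template(TEMPLATE)
for index, name, definition, doc in columns:
    print(index, name, definition, doc)
```

<P>Splitting on whitespace at most three times keeps the column
documentation intact, even when it contains spaces of its own.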
<P>When the template file is converted into an rdbtable,
all documentation will remain in the header (although
the column documentation may be hard to read if there are
many columns).  At any time the entire header, including
documentation, can be viewed by using the command 'nsq-valid
-templ &lt; rdbtable' (or 'nsq-headchg -templ &lt; rdbtable'). The
output from either command will be essentially like the
above example.
<P>
<H2><A NAME="ss5.2">5.2 Modifying existing rdbtables</A>
    </H2>

<P>There are two ways to modify an existing rdbtable:
with 'nsq-ed' or with 'nsq-merge'.
<P>The operator 'nsq-ed' can be used to add new rows, change
existing rows, or delete existing rows of data in an
rdbtable.  To modify an rdbtable 'nsq-ed' can be used in
either column or list form.  The choice of form to use
depends somewhat on the structure of the rdbtable.  If the
rdbtable has several columns of relatively narrow data
(that will all fit in the width of the current window
or terminal) and also several very wide columns (none
of which will fit) and changes need to be made to some
of the narrow columns, then it makes sense to use 'nsq-ed'
on the desired narrow columns in 'column' form, as in:
<P>
<BLOCKQUOTE><CODE>
<PRE>
      nsq-ed  table  narrow_cola  narrow_colb  ...
        
</PRE>
</CODE></BLOCKQUOTE>
<P>If changes need to be made to some of the wide columns
then use 'nsq-ed' in 'list' form on the wide columns,
plus any key columns necessary, as in:
<P>
<BLOCKQUOTE><CODE>
<PRE>
      nsq-ed  -list  table  control_col  ...  wide_cola  wide_colb  ...
        
</PRE>
</CODE></BLOCKQUOTE>
<P>After editing an rdbtable it is always recommended that
the structure of the rdbtable be checked with the operator
'nsq-valid'.  If there are data values that are longer than
the defined column width, use the command 'nsq-valid -w'
to get more verbose output.
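<P>To make the idea of an over-long value concrete, here is a small
Python sketch of such a width check.  It assumes that the leading
number of a column definition ("24", "1", "3N") is the column width;
it does not reproduce what 'nsq-valid' actually does or prints, and
the function names are hypothetical.

```python
import re


def column_width(definition):
    """Return the leading number of a definition such as '24' or '3N'."""
    match = re.match(r"\d+", definition)
    return int(match.group()) if match else None


def overlong_values(names, definitions, rows):
    """Yield (row number, column name, value) for values wider than defined."""
    widths = [column_width(d) for d in definitions]
    for row_number, row in enumerate(rows, start=1):
        for name, width, value in zip(names, widths, row):
            if width is not None and len(value) > width:
                yield row_number, name, value


names = ["Name", "Type", "Count"]
definitions = ["24", "1", "3N"]
rows = [["Widget", "A", "12"], ["Gadget", "XY", "1000"]]

problems = list(overlong_values(names, definitions, rows))
for row_number, name, value in problems:
    print("row", row_number, "column", name, "value", repr(value))
```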
<P>The 'nsq-merge' process actually involves other operators,
like 'nsq-search' and 'nsq-ed', and works only when the existing
rdbtable is sorted on one or more columns (which is a
fairly common case).  The process includes selecting
rows from an existing sorted rdbtable (using 'nsq-search')
into a small rdbtable which is easy to edit (using
'nsq-ed') and then combining the two rdbtables again (using
'nsq-merge'). Since 'nsq-ed' is used, modifications may include
changes, additions, or deletions of rows.  Also note that
'nsq-merge' keeps the final table in sort order.
<P>The difference from editing the whole table with 'nsq-ed' is
that 'nsq-search' is much faster than 'nsq-row' or 'nsq-ed',
the editing is done on a table of conveniently small size,
and the 'nsq-merge' operation can be done in the background.
Remember that whether one uses
'nsq-merge' or 'nsq-ed', putting the data back together after
editing requires the entire original table to be passed,
which can take some time if the original rdbtable is large.
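<P>The combining step can be pictured as an ordinary merge of two
sorted lists.  The Python sketch below is an assumption about the
general technique, not a description of how 'nsq-merge' is
implemented: it assumes unique keys, lets a row from the edited
table replace the original row with the same key, and does not
handle deletions.  The name merge_sorted is hypothetical.

```python
def merge_sorted(original, edited, key_index=0):
    """Merge two key-sorted row lists; an edited row replaces a same-key original."""
    merged, i, j = [], 0, 0
    while i != len(original) and j != len(edited):
        old_key = original[i][key_index]
        new_key = edited[j][key_index]
        if old_key == new_key:
            merged.append(edited[j])  # changed row wins
            i += 1
            j += 1
        elif new_key > old_key:
            merged.append(original[i])  # untouched original row
            i += 1
        else:
            merged.append(edited[j])  # newly added row
            j += 1
    merged.extend(original[i:])
    merged.extend(edited[j:])
    return merged


original = [["apple", "1"], ["cherry", "3"], ["plum", "5"]]
edited = [["banana", "2"], ["cherry", "30"]]

result = merge_sorted(original, edited)
for row in result:
    print("\t".join(row))
```

<P>Every row of the original table is visited once, which is why
the pass over a large rdbtable takes time even when only a few
rows were edited.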
<P>
<H2><A NAME="ss5.3">5.3 Concatenating rdbtables</A>
        </H2>

<P>The need to concatenate rdbtables comes up every
so often, and although it is simple to do, it may not
be obvious.  The UNIX 'cat' command cannot be used,
as it would duplicate the header and
thus make an invalid rdbtable.  And of course, only
rdbtables with the same header should be concatenated,
otherwise an invalid rdbtable would result (in
this case it could be a gross inconsistency if the number
of columns were different).  If we have two rdbtables,
TABA and TABB, then to concatenate TABB onto the end of
TABA we use the command:
<P>
<BLOCKQUOTE><CODE>
<PRE>
      nsq-headchg -del &lt; TABB &gt;&gt; TABA
        
</PRE>
</CODE></BLOCKQUOTE>
<P>Note that this avoids duplicating the header.
Note also that in this case the operator 'nsq-headchg'
does not  use a template file.
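<P>The effect of the command can be sketched in Python as follows.
The sketch assumes that an rdbtable header consists of any leading
comment lines followed by two lines (column names and column
definitions); that layout is an assumption made for illustration,
and the function names are hypothetical.  Unlike the shell command,
the sketch also refuses to concatenate tables whose headers differ.

```python
def split_header(lines):
    """Split rdbtable lines into (header, body).

    Assumes the header is any leading '#' comment lines followed by
    two lines: column names and column definitions.
    """
    n = 0
    while n != len(lines) and lines[n].lstrip().startswith("#"):
        n += 1
    return lines[: n + 2], lines[n + 2 :]


def concatenate(taba, tabb):
    """Append the rows of tabb to taba, keeping only taba's header."""
    head_a, body_a = split_header(taba)
    head_b, body_b = split_header(tabb)
    # compare names and definitions only; comment lines may differ
    if head_a[-2:] != head_b[-2:]:
        raise ValueError("headers differ; refusing to concatenate")
    return head_a + body_a + body_b


TABA = ["# items on hand", "Name\tCount", "24\t3N", "bolt\t10"]
TABB = ["Name\tCount", "24\t3N", "nut\t25"]

combined = concatenate(TABA, TABB)
print("\n".join(combined))
```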
<P>Note also that the operator 'nsq-merge' may be used to merge
two like rdbtables based on a key of one or more columns.
In this case however the two rdbtables must be sorted on
the key.
<P>
<HR>
<A HREF="NoSQL-4.html">Previous</A>
<A HREF="NoSQL-6.html">Next</A>
<A HREF="NoSQL.html#toc5">Contents</A>
</BODY>
</HTML>