#!/usr/bin/awk -f
#****************************************************************************
# ## ## ##### ##### ## ** NoSQL RDBMS - ddsort *
# ### ## ####### ####### ## ** $Revision: 2.4.0 $ *
# #### ## ### ## ## ## ************************************
# ####### #### ##### ## ## ## ** Carlo Strozzi (c) 1998-2000 *
# ####### ###### ##### ## # ## ## ************************************
# ## #### ## ## ### ## ### ## ** Written by *
# ## ### ###### ####### ###### ###### ** Carlo Strozzi *
# ## ## #### ##### #### # ###### ** e-mail: carlos@linux.it *
#****************************************************************************
# NoSQL RDBMS, Copyright (C) 1998 Carlo Strozzi. *
# This program comes with ABSOLUTELY NO WARRANTY; for details *
# refer to the GNU General Public License. *
#****************************************************************************
# NOTE: to edit, set ts=8 in 'vi' (or equivalent)
# to print, pipe through 'pr -t -e8'
#****************************************************************************
# NAME
#     ddsort - print column positions, in a form suitable for sort(1).
#
# SYNOPSIS
#     ddsort < table
#
#     Note: options must be passed through the environment
#     variable _awk_args, i.e.:
#
#         _awk_args='[options] [-bdfiMnr] column ...'
#
#
# DESCRIPTION
#
#     Selects columns by name (and order) and outputs the list of their
#     respective positions in the table, as found in the specified
#     description file (Data Dictionary). The output format is especially
#     meant for use by the sort(1) utility. Positional sort(1) options
#     may be specified before each column specification. If a column name
#     does not match any of the columns in the table, it is silently
#     ignored. If no columns are specified, then the current sequence of
#     fields as found in the description file is printed. An error code
#     is returned if the data dictionary file cannot be read.
#
# OPTIONS
#     -a|--add-missing
#         If a column name does not match any of the columns in the
#         table, instead of ignoring it print a nonexistent field,
#         i.e. a field with position NF+1.
#
#     -l|--last
#         If the input table contains duplicated column names,
#         pick the last occurrence of each. The default is to
#         pick the first one. This is sometimes useful after
#         the 'join' operator.
#
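# EXAMPLE
#     A minimal sketch of typical use, assuming a hypothetical table
#     whose first line holds the tab-separated column names 'name',
#     'age' and 'city', in that order (these names are illustrative
#     only):
#
#         _awk_args='-n age name' ddsort < table
#
#     Tracing the code below, this should print the single line
#     (without a trailing newline):
#
#         -n +1 -2 +0 -1
#
#     i.e. the '-n' flag is attached to the second field ('age'),
#     followed by the key for the first field ('name').
#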
########################################################################
BEGIN {
    NULL = "" ; FS = OFS = "\t"
    # Options and column names arrive space-separated in _awk_args.
    split( ENVIRON["_awk_args"], args, " " )
    while ( args[++i] != NULL )
    {
        if ( args[i] == "-a" || args[i] == "--add-missing" )
        {
            add_missing = 1
        }
        else if ( args[i] == "-l" || args[i] == "--last" ) pick_last = 1
        else if ( args[i] ~ /^-/ )
        {
            # Any other option is a sort(1) flag bound to the next column.
            col_args[++j] = args[i]
            cols[j] = args[++i]
        }
        else cols[++j] = args[i]
    }
    # This is necessary only if using ARGV.
    #j-- ; ARGC = 0
}
########################################################################
# Main loop
########################################################################
NR == 1 {
    nf = NF
    i = 0
    # Map each column name to its position (P) and each position to its name (N).
    while ( ++i <= NF )
    {
        if ( pick_last ) { P[$i] = i ; N[i] = $i }
        else
        {
            if ( ! P[$i] ) { P[$i] = i ; N[i] = $i }
        }
    }
    # If no columns were specified, then print all column positions.
    if ( !j )
    {
        for ( j = 1; j <= NF; j++ ) cols[j] = N[j]
        j--
    }
    i = 0
    while ( ++i <= j )
    {
        if ( !P[cols[i]] )
        {
            # Unknown column: skip it unless --add-missing was given.
            if ( !add_missing ) continue
            P[cols[i]] = ++nf
        }
        if ( k ) printf(" ")
        if ( col_args[i] != NULL ) printf("%s ", col_args[i])
        # Emit the old-style sort(1) key '+start -end' (zero-based start).
        printf("+%d -%d", P[cols[i]] - 1, P[cols[i]] )
        k++
    }
}
NR > 1 { exit } # Skip the rest of the input table.
########################################################################
# End of program.
########################################################################