<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<!--Converted with LaTeX2HTML 99.2beta8 (1.46)
original version by: Nikos Drakos, CBLU, University of Leeds
* revised and updated by: Marcus Hennecke, Ross Moore, Herb Swan
* with significant contributions from:
Jens Lippmann, Marek Rouchal, Martin Wilck and others -->
<HTML>
<HEAD>
<TITLE>K.3 The long and winding road</TITLE>
<META NAME="description" CONTENT="K.3 The long and winding road">
<META NAME="keywords" CONTENT="GMT_Docs">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<META NAME="Generator" CONTENT="LaTeX2HTML v99.2beta8">
<META HTTP-EQUIV="Content-Style-Type" CONTENT="text/css">
<LINK REL="STYLESHEET" HREF="GMT_Docs.css">
<LINK REL="next" HREF="node136.html">
<LINK REL="previous" HREF="node134.html">
<LINK REL="up" HREF="node132.html">
</HEAD>
<BODY bgcolor="#ffffff">
<!--Navigation Panel-->
<A NAME="tex2html2951"
HREF="node136.html">
<IMG WIDTH="37" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="next" SRC="next.gif"></A>
<A NAME="tex2html2945"
HREF="node132.html">
<IMG WIDTH="26" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="up" SRC="up.gif"></A>
<A NAME="tex2html2939"
HREF="node134.html">
<IMG WIDTH="63" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="previous" SRC="prev.gif"></A>
<A NAME="tex2html2947"
HREF="node1.html">
<IMG WIDTH="65" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="contents" SRC="contents.gif"></A>
<A NAME="tex2html2949"
HREF="node149.html">
<IMG WIDTH="43" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="index" SRC="index.gif"></A>
<BR>
<B> Next:</B> <A NAME="tex2html2952"
HREF="node136.html">K.4 The Five Resolutions</A>
<B> Up:</B> <A NAME="tex2html2946"
HREF="node132.html">K. The GMT High-Resolution</A>
<B> Previous:</B> <A NAME="tex2html2940"
HREF="node134.html">K.2 Format required by</A>
  <B> <A NAME="tex2html2948"
HREF="node1.html">Contents</A></B>
  <B> <A NAME="tex2html2950"
HREF="node149.html">Index</A></B>
<BR>
<BR>
<!--End of Navigation Panel-->
<H1><A NAME="SECTION002630000000000000000">
K.3 The long and winding road</A>
</H1>
<P>
The WVS and WDB together represent more than 100 Mb of binary
data and roughly 20 million data points, so any manipulation
of these data must be automated. Even the reasonable requirement
that no coastline should cross another coastline becomes a
complicated processing step.
<P>
<OL>
<LI>To begin, we made sure that all data were "clean",
i.e., that there were no outliers or bad points. We had to
write several programs to ensure data consistency and to remove
"spikes" and bad points from the raw data. Crossing
segments were automatically "trimmed" provided only
a few points had to be deleted; a few hundred more complicated
cases had to be examined semi-manually.
<P>
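As an illustration of the kind of cleaning described in this step, here is a minimal Python sketch of a spike filter. It is not the program used to build the database: the function name <code>remove_spikes</code>, the detour criterion, and the thresholds <code>max_detour</code> and <code>min_leg</code> are assumptions made for this example, and coordinates are treated as planar.

```python
import math

def remove_spikes(points, max_detour=5.0, min_leg=0.1):
    """Drop interior points that form a long out-and-back detour.

    points is a list of (x, y) tuples. max_detour and min_leg are
    hypothetical tuning values, not taken from the GMT processing.
    """
    if len(points) < 3:
        return list(points)
    kept = [points[0]]
    for i in range(1, len(points) - 1):
        prev, cur, nxt = kept[-1], points[i], points[i + 1]
        leg_in = math.dist(prev, cur)
        leg_out = math.dist(cur, nxt)
        direct = math.dist(prev, nxt)
        # A spike: both legs are long, yet the neighbours are close together.
        is_spike = (min(leg_in, leg_out) > min_leg
                    and leg_in + leg_out > max_detour * max(direct, 1e-12))
        if not is_spike:
            kept.append(cur)
    kept.append(points[-1])
    return kept

# A straight coastline with one wild point at index 3.
coast = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.5, 40.0), (3.0, 0.0), (4.0, 0.0)]
cleaned = remove_spikes(coast)
```

Here <code>cleaned</code> keeps the five collinear points and drops the outlier. Real data would need a distance measure suited to geographic coordinates.
<P>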
</LI>
<LI>Programs were written to examine all the loose segments
and determine which segments should be joined to produce
polygons. Because not all segments joined exactly (there were
non-zero gaps between some segments), we had to find all possible
combinations and choose the simplest ones.
The WVS segments joined to produce more than 200,000 polygons,
the largest being the Africa-Eurasia polygon, which has 1.4
million points. The WDB data resulted in a smaller database
(~25% of WVS).
<P>
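The joining problem can be sketched in Python as follows. This is a deliberate simplification: instead of enumerating all possible combinations and choosing the simplest, as described above, it greedily attaches the unused segment whose endpoint is nearest to the end of the current chain. The name <code>join_segments</code> and the gap tolerance <code>tol</code> are assumptions for this example.

```python
import math

def join_segments(segments, tol=0.01):
    """Greedily chain segments whose endpoints lie within tol into closed rings.

    A simplified stand-in for an exhaustive search; tol is a hypothetical gap
    tolerance in the same (planar) units as the coordinates.
    """
    unused = [list(s) for s in segments]
    polygons = []
    while unused:
        chain = unused.pop(0)
        extended = True
        # Keep extending until the chain closes or nothing more fits.
        while extended and math.dist(chain[0], chain[-1]) > tol:
            extended = False
            best = None
            for idx, seg in enumerate(unused):
                for flipped in (False, True):
                    start = seg[-1] if flipped else seg[0]
                    gap = math.dist(chain[-1], start)
                    if gap <= tol and (best is None or gap < best[0]):
                        best = (gap, idx, flipped)
            if best is not None:
                _, idx, flipped = best
                seg = unused.pop(idx)
                chain.extend(reversed(seg) if flipped else seg)
                extended = True
        if len(chain) >= 3 and math.dist(chain[0], chain[-1]) <= tol:
            polygons.append(chain)  # closed to within tol
    return polygons

# Three pieces of a unit square; seg_b is stored backwards and gaps are non-zero.
seg_a = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
seg_b = [(0.0, 1.0), (1.0, 1.004)]
seg_c = [(0.003, 1.0), (0.0, 0.5), (0.0, 0.002)]
rings = join_segments([seg_a, seg_b, seg_c])
```

The three pieces join into one ring of eight points. Chains that never close are dropped in this sketch; a real tool would report them for inspection.
<P>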
</LI>
<LI>We then needed to combine the WVS and WDB databases.
The main problem is that many polygons are duplicated:
most of the features in WVS are also in WDB. However, because
the resolutions of the two data sets differ, it is nontrivial to decide
which polygons in WDB to include and which ones to ignore.
We used two techniques to address this problem.
First, we looked for crossovers between all possible pairs of
polygons. Because of the crossover processing in step 1 we know
that no crossovers remain within WVS or within WDB; thus
any crossover must be between a WVS polygon and a WDB polygon. A crossover
could mean one of two things: (1) a slightly misplaced WDB polygon
crosses a more accurate WVS polygon, both representing the same
geographic feature, or (2) a misplaced WDB polygon representing a
different feature (e.g., a small
coastal lake) crosses the accurate WVS shoreline. We distinguished
between these cases by comparing the areas and centroids of the two
polygons. In almost all cases it was obvious when we had
duplicates; a few cases had to be checked manually. Second,
on many occasions the WDB duplicate polygon did not cross its
WVS counterpart but lay either entirely inside or entirely outside the
WVS polygon. In those cases we relied on the area-centroid tests alone.
<P>
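An area-centroid comparison of the kind mentioned above might look like the following Python sketch. The text does not give the actual criteria, so the function names and the thresholds <code>area_tol</code> and <code>shift_tol</code> are assumptions, and the polygons are treated as planar rather than spherical.

```python
import math

def area_and_centroid(ring):
    """Shoelace area (absolute value) and centroid of a simple planar ring."""
    a2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        cross = x0 * y1 - x1 * y0
        a2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    # a2 is twice the signed area; a degenerate ring (a2 == 0) is not handled.
    return abs(a2) / 2.0, (cx / (3.0 * a2), cy / (3.0 * a2))

def looks_like_duplicate(ring_a, ring_b, area_tol=0.2, shift_tol=0.25):
    """Heuristic: similar area and nearby centroids suggest the same feature.

    area_tol and shift_tol are hypothetical thresholds chosen for this example.
    """
    area_a, c_a = area_and_centroid(ring_a)
    area_b, c_b = area_and_centroid(ring_b)
    rel_area = abs(area_a - area_b) / max(area_a, area_b)
    size = math.sqrt(max(area_a, area_b))
    return rel_area <= area_tol and math.dist(c_a, c_b) <= shift_tol * size

wvs_island = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
wdb_island = [(0.2, 0.1), (4.1, 0.1), (4.1, 3.2), (0.2, 3.2)]    # same feature, shifted
coastal_lake = [(3.5, 1.0), (4.2, 1.0), (4.2, 1.5), (3.5, 1.5)]  # crosses the shoreline

same_feature = looks_like_duplicate(wvs_island, wdb_island)
different_feature = looks_like_duplicate(wvs_island, coastal_lake)
```

With these inputs the shifted island is flagged as a duplicate, while the small lake that merely crosses the shoreline is not, because its area differs too much.
<P>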
</LI>
<LI>While the largest polygons were easy to identify by visual
inspection, the majority remained unidentified. Since it is
important to know whether a polygon is a continent or a small
pond inside an island inside a lake, we wrote programs that
determine the hierarchical level of each polygon. Here, level 1
represents ocean/land boundaries, 2 is land/lake boundaries, 3 is
lake/island-in-lake boundaries, and 4 is
island-in-lake/pond-in-island-in-lake boundaries.
Level 4 was the highest level encountered in the data.
To determine the hierarchical levels automatically, the
programs compared all possible pairs of polygons
and counted how many polygons a given polygon was inside. Because
of the size and number of the polygons, such programs would
typically run for 3 days on a Sparc-2 workstation.
<P>
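The level computation can be sketched in Python as a containment count over all pairs. This is only an illustration, assuming planar coordinates and rings that never cross (as guaranteed by the earlier steps), so that testing a single vertex of each polygon suffices. The names <code>point_in_ring</code> and <code>hierarchy_levels</code> are made up for this example.

```python
def point_in_ring(pt, ring):
    """Even-odd ray-casting test: is pt strictly inside the closed ring?"""
    x, y = pt
    inside = False
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        if (y0 > y) != (y1 > y):  # edge straddles the horizontal ray
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside

def hierarchy_levels(rings):
    """Level = 1 + number of other rings that contain this ring.

    Compares all pairs, so the cost grows with the square of the ring count.
    """
    levels = []
    for i, ring in enumerate(rings):
        containers = sum(
            1 for j, other in enumerate(rings)
            if j != i and point_in_ring(ring[0], other)
        )
        levels.append(containers + 1)
    return levels

def square(lo, hi):
    return [(lo, lo), (hi, lo), (hi, hi), (lo, hi)]

# continent, lake, island in lake, pond on island, and a separate island
rings = [square(0, 10), square(2, 8), square(3, 7), square(4, 6), square(20, 21)]
levels = hierarchy_levels(rings)
```

For the nested squares above the levels come out as 1, 2, 3, 4 and 1. The all-pairs comparison is what makes the full data set so expensive to process.
<P>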
</LI>
<LI>Once we knew what type each polygon was, we could enforce a
common "orientation" for all polygons. We arranged them so
that when you move along a polygon from beginning to end, your
left hand points toward "land". At this step we also
computed the area of every polygon, since we wanted the
option to plot only features larger than a minimum
area specified by the user.
<P>
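One way to enforce such an orientation is through the sign of the shoelace area, as in the Python sketch below. It assumes planar coordinates with x increasing eastward and y northward, so that a counter-clockwise ring has its interior on the left. Odd levels (land inside) are then made counter-clockwise and even levels (water inside) clockwise. The function names are illustrative, not those of the original programs.

```python
def signed_area(ring):
    """Shoelace area: positive for counter-clockwise rings, negative for clockwise."""
    return 0.5 * sum(x0 * y1 - x1 * y0
                     for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]))

def orient_land_left(ring, level):
    """Return the ring ordered so that land lies to the left of travel.

    Odd levels enclose land (interior on the left means counter-clockwise);
    even levels enclose water, so land is outside and the ring runs clockwise.
    """
    want_ccw = (level % 2 == 1)
    is_ccw = signed_area(ring) > 0
    return list(ring) if is_ccw == want_ccw else list(reversed(ring))

island_cw = [(0.0, 0.0), (0.0, 3.0), (4.0, 3.0), (4.0, 0.0)]  # stored clockwise
as_island = orient_land_left(island_cw, level=1)  # reversed to counter-clockwise
as_lake = orient_land_left(island_cw, level=2)    # already clockwise, unchanged
```

The absolute value of <code>signed_area</code> is the polygon area that a minimum-area filter would compare against the user's cutoff. For geographic data a spherical area formula would be needed instead.
<P>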
</LI>
<LI>Obviously, if you need to make a map of Denmark then
you do not want to read the entire 1.4 million points making
up the Africa-Eurasia polygon. Furthermore, most plotting
devices will not let you paint and fill a polygon of that size
due to memory restrictions. Hence, we needed to partition the
polygons so that smaller subsets can be accessed rapidly.
Likewise, if you want to plot a world map on letter-size paper
there is no need to plot 10 million data points, as most of them
will plot several times on the same pixel and the operation
would take a very long time to complete. We chose to make 5
versions of the database, corresponding to different resolutions.
The decimation was carried out using the Douglas-Peucker (DP)
line-reduction algorithm<A NAME="tex2html714"
HREF="footnode.html#foot15598"><SUP>K.2</SUP></A>. We chose the
cutoffs so that each subset was approximately 20% of the size of
the next higher resolution. The five resolutions are called
<B>f</B>ull, <B>h</B>igh, <B>i</B>ntermediate, <B>l</B>ow, and
<B>c</B>rude; they are accessed in <A NAME="tex2html718"
HREF="../pscoast.html"><I><B>pscoast</B></I></A><A NAME="15647"></A>, <A NAME="tex2html719"
HREF="../gmtselect.html"><I><B>gmtselect</B></I></A><A NAME="15656"></A>,
and <A NAME="tex2html720"
HREF="../grdlandmask.html"><I><B>grdlandmask</B></I></A><A NAME="15665"></A> with the <B>-D</B> option<A NAME="tex2html715"
HREF="footnode.html#foot15579"><SUP>K.3</SUP></A>. For each of
these 5 data sets (<B>f</B>, <B>h</B>, <B>i</B>, <B>l</B>, <B>c</B>)
we specified an equidistant grid (1&#176;, 2&#176;, 5&#176;,
10&#176;, 20&#176;, respectively) and split all polygons into line segments
that each fit inside one of the many boxes defined by these grid
lines. Thus, to paint the entire continent of Australia we
instead paint many smaller polygons made up of these line
segments and gridlines. Some bookkeeping is required, since
we need to know which parent polygon each smaller piece came
from in order to assign the correct paint, or to skip the piece if the
feature is smaller than the cutoff specified by the user. The
resulting segment coordinates were then scaled to fit in short
integer format to preserve precision and written in netCDF format
for ultimate portability across hardware platforms<A NAME="tex2html716"
HREF="footnode.html#foot15599"><SUP>K.4</SUP></A>.
<P>
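For readers unfamiliar with the Douglas-Peucker algorithm mentioned in this step, the following Python sketch shows the idea. Between two retained points, it keeps the point farthest from the connecting line if that distance exceeds the tolerance, then recurses on both halves. It uses planar distances and an explicit stack instead of recursion. It is an illustration only, not the code used to produce the five resolutions, and the tolerance values are arbitrary.

```python
import math

def point_segment_distance(p, a, b):
    """Planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def douglas_peucker(points, tolerance):
    """Return the subset of points retained by Douglas-Peucker line reduction."""
    if len(points) < 3:
        return list(points)
    keep = [False] * len(points)
    keep[0] = keep[-1] = True
    stack = [(0, len(points) - 1)]
    while stack:
        first, last = stack.pop()
        worst, worst_dist = None, tolerance
        for i in range(first + 1, last):
            d = point_segment_distance(points[i], points[first], points[last])
            if d > worst_dist:
                worst, worst_dist = i, d
        if worst is not None:
            # The farthest point is significant: keep it and refine both halves.
            keep[worst] = True
            stack.append((first, worst))
            stack.append((worst, last))
    return [p for p, k in zip(points, keep) if k]

line = [(0.0, 0.0), (1.0, 0.05), (2.0, 0.0), (3.0, 2.0),
        (4.0, 0.0), (5.0, 0.05), (6.0, 0.0)]
fine = douglas_peucker(line, 0.5)    # keeps the peak and its two base points
coarse = douglas_peucker(line, 3.0)  # only the end points survive
```

A larger tolerance removes more points, which is how a series of cutoffs can yield progressively smaller data sets.
<P>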
</LI>
<LI>While we are now back to a file of line-segments we are in
a much better position to create smaller polygons for painting.
Two problems must be overcome to correctly paint an area:
<P>
<UL>
<LI>We must be able to join line segments and grid cell borders
into meaningful polygons; how we do this will depend on whether
we want to paint the land or the oceans.
<P>
</LI>
<LI>We want to nest the polygons so that no paint falls on areas
that are "wet" (or "dry"); e.g., if a grid cell completely on
land contains a lake with a small island, we do not want to paint
the lake and then draw the island, but rather paint the annulus or "donut"
formed by the land surrounding the lake, and then plot the island.
<P>
</LI>
</UL>
<P>
<A NAME="tex2html721"
HREF="http://www.soest.hawaii.edu/gmt"><B>GMT</B></A> uses a polygon-assembly routine that carries out these
tasks on the fly.
<A NAME="15589"></A>
<A NAME="15590"></A>
<P>
</LI>
</OL>
<P>
<HR>
<ADDRESS>
Paul Wessel
2001-04-18
</ADDRESS>
</BODY>
</HTML>