<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">

<!--Converted with LaTeX2HTML 99.2beta8 (1.46)
original version by:  Nikos Drakos, CBLU, University of Leeds
* revised and updated by:  Marcus Hennecke, Ross Moore, Herb Swan
* with significant contributions from:
  Jens Lippmann, Marek Rouchal, Martin Wilck and others -->
<HTML>
<HEAD>
<TITLE>K.3 The long and winding road</TITLE>
<META NAME="description" CONTENT="K.3 The long and winding road">
<META NAME="keywords" CONTENT="GMT_Docs">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<META NAME="Generator" CONTENT="LaTeX2HTML v99.2beta8">
<META HTTP-EQUIV="Content-Style-Type" CONTENT="text/css">

<LINK REL="STYLESHEET" HREF="GMT_Docs.css">

<LINK REL="next" HREF="node136.html">
<LINK REL="previous" HREF="node134.html">
<LINK REL="up" HREF="node132.html">
</HEAD>

<BODY  bgcolor="#ffffff">
<!--Navigation Panel-->
<A NAME="tex2html2951"
  HREF="node136.html">
<IMG WIDTH="37" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="next" SRC="next.gif"></A> 
<A NAME="tex2html2945"
  HREF="node132.html">
<IMG WIDTH="26" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="up" SRC="up.gif"></A> 
<A NAME="tex2html2939"
  HREF="node134.html">
<IMG WIDTH="63" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="previous" SRC="prev.gif"></A> 
<A NAME="tex2html2947"
  HREF="node1.html">
<IMG WIDTH="65" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="contents" SRC="contents.gif"></A> 
<A NAME="tex2html2949"
  HREF="node149.html">
<IMG WIDTH="43" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="index" SRC="index.gif"></A> 
<BR>
<B> Next:</B> <A NAME="tex2html2952"
  HREF="node136.html">K.4 The Five Resolutions</A>
<B> Up:</B> <A NAME="tex2html2946"
  HREF="node132.html">K. The GMT High-Resolution</A>
<B> Previous:</B> <A NAME="tex2html2940"
  HREF="node134.html">K.2 Format required by</A>
 &nbsp; <B>  <A NAME="tex2html2948"
  HREF="node1.html">Contents</A></B> 
 &nbsp; <B>  <A NAME="tex2html2950"
  HREF="node149.html">Index</A></B> 
<BR>
<BR>
<!--End of Navigation Panel-->

<H1><A NAME="SECTION002630000000000000000">
K.3 The long and winding road</A>
</H1> 

<P>
The WVS and WDB together represent more than 100 MB of binary
data and something like 20 million data points.  Any manipulation
of data on this scale must therefore be automated.  For instance,
the reasonable requirement that no coastline should cross another
coastline becomes a complicated processing step.

<P>

<OL>
<LI>To begin, we first made sure that all data were ``clean'',
i.e. that there were no outliers and bad points.  We had to
write several programs to ensure data consistency and remove
``spikes'' and bad points from the raw data.  Also, crossing
segments were automatically ``trimmed'' provided only
a few points had to be deleted.  A few hundred more complicated
cases had to be examined semi-manually.

<P>
</LI>
<LI>Programs were written to examine all the loose segments
and determine which segments should be joined to produce
polygons.  Because not all segments joined exactly (there were
non-zero gaps between some segments), we had to find all possible
combinations and choose the simplest ones.
The WVS segments joined to produce more than 200,000 polygons,
the largest being the Africa-Eurasia polygon which has 1.4
million points.  The WDB data resulted in a smaller data base
(~25% of WVS).

<P>
</LI>
<LI>We now needed to combine the WVS and WDB data bases.
The main problem here is that we have duplicates of polygons:
most of the features in WVS are also in WDB.  However, because
the resolutions of the two data sets differ it is nontrivial to figure
out which polygons in WDB to include and which ones to ignore.
We used two techniques to address this problem.
First, we looked for crossovers between all possible pairs of
polygons.  Because of the crossover processing in step 1 above we know
that there are no remaining crossovers within WVS or within WDB; thus
any crossovers would be between WVS and WDB polygons.  Crossovers
could mean two things: (1) A slightly misplaced WDB polygon
crosses a more accurate WVS polygon, both representing the same
geographic feature, or (2) a misplaced WDB polygon (e.g. a small
coastal lake) crosses the accurate WVS shoreline.  We distinguished
between these cases by comparing the area and centroid of the two
polygons.  In almost all cases it was obvious when we had
duplicates; a few cases had to be checked manually.  Second,
on many occasions the WDB duplicate polygon did not cross its
WVS counterpart but was either entirely inside or outside the
WVS polygon.  In those cases we relied on the area-centroid tests.

<P>
</LI>
<LI>While the largest polygons were easy to identify by visual
inspection, the majority remained unidentified.  Since it is
important to know whether a polygon is a continent or a small
pond inside an island inside a lake we wrote programs that would
determine the hierarchical level of each polygon.  Here, level&nbsp;=&nbsp;1
represents ocean/land boundaries, 2 is land/lakes borders, 3 is
lakes/islands-in-lakes, and 4 is islands-in-lakes/ponds-in-islands-in-lakes.
Level 4 was the highest level encountered in the data.
To automatically determine the hierarchical levels we wrote
programs that would compare all possible pairs of polygons
and find how many polygons a given polygon was inside.  Because
of the size and number of the polygons such programs would
typically run for 3 days on a Sparc-2 workstation.

<P>
</LI>
<LI>Once we know what type a polygon is we can enforce a
common ``orientation'' for all polygons. We arranged them so
that when you move along a polygon from beginning to end, your
left hand is pointing toward ``land''.  At this step we also
computed the area of all polygons since we would like the
option to plot only features that are bigger than a minimum
area to be specified by the user.

<P>
</LI>
<LI>Obviously, if you need to make a map of Denmark then
you do not want to read the entire 1.4 million points making
up the Africa-Eurasia polygon.  Furthermore, most plotting
devices will not let you paint and fill a polygon of that size
due to memory restrictions.  Hence, we need to partition the
polygons so that smaller subsets can be accessed rapidly.
Likewise, if you want to plot a world map on letter-size paper
there is no need to plot 10 million data points as most of them
will plot several times on the same pixel and the operation
would take a very long time to complete.  We chose to make 5
versions of the database, corresponding to different resolutions.
The decimation was carried out using the Douglas-Peucker (DP)
line-reduction algorithm<A NAME="tex2html714"
  HREF="footnode.html#foot15598"><SUP>K.2</SUP></A>.  We chose the
cutoffs so that each subset was approximately 20% the size of
the next higher resolution.  The five resolutions are called
<B>f</B>ull, <B>h</B>igh, <B>i</B>ntermediate, <B>l</B>ow, and
<B>c</B>rude; they are accessed in <A NAME="tex2html718"
  HREF="../pscoast.html"><I><B>pscoast</B></I></A><A NAME="15647"></A>, <A NAME="tex2html719"
  HREF="../gmtselect.html"><I><B>gmtselect</B></I></A><A NAME="15656"></A>,
and <A NAME="tex2html720"
  HREF="../grdlandmask.html"><I><B>grdlandmask</B></I></A><A NAME="15665"></A> with the <B>-D</B> option<A NAME="tex2html715"

HREF="footnode.html#foot15579"><SUP>K.3</SUP></A>.  For each of
these 5 data sets (<B>f</B>, <B>h</B>, <B>i</B>, <B>l</B>, <B>c</B>)
we specified an equidistant grid (1&deg;, 2&deg;, 5&deg;,
10&deg;, 20&deg;) and split all polygons into line-segments
that each fit inside one of the many boxes defined by these grid
lines.  Thus, to paint the entire continent of Australia we
instead paint many smaller polygons made up of these line
segments and gridlines.  Some book-keeping has to be done since
we need to know which parent polygon these smaller pieces came
from in order to prescribe the correct paint or ignore if the
feature is smaller than the cutoff specified by the user.  The
resulting segment coordinates were then scaled to fit in short
integer format to preserve precision and written in netCDF format
for ultimate portability across hardware platforms<A NAME="tex2html716"
  HREF="footnode.html#foot15599"><SUP>K.4</SUP></A>.

<P>
</LI>
<LI>While we are now back to a file of line-segments we are in
a much better position to create smaller polygons for painting.
Two problems must be overcome to correctly paint an area:

<P>

<UL>
<LI>We must be able to join line segments and grid cell borders
into meaningful polygons; how we do this will depend on whether
we want to paint the land or the oceans.

<P>
</LI>
<LI>We want to nest the polygons so that no paint falls on areas
that are ``wet'' (or ``dry''); e.g., if a grid cell completely on
land contains a lake with a small island, we do not want to paint
the lake and then draw the island, but paint the annulus or ``donut''
that is represented by the land and lake, and then plot the island.

<P>
</LI>
</UL>

<P>
<A NAME="tex2html721"
  HREF="http://www.soest.hawaii.edu/gmt"><B>GMT</B></A> uses a polygon-assembly routine that carries out these
tasks on the fly.
<A NAME="15589"></A>
<A NAME="15590"></A>

<P>
</LI>
</OL> 
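<P>
The area-centroid comparison used in step 3 can be illustrated with a
minimal Python sketch.  It treats longitude and latitude as planar
coordinates and assumes polygons with non-zero area.  The function names
and both tolerances are hypothetical choices made for illustration; they
are not the programs or thresholds that were used to build the data base.

<PRE>
```python
import math


def area_and_centroid(ring):
    """Return (unsigned area, (cx, cy)) of a simple polygon given as a list of (x, y)."""
    twice_area = cx = cy = 0.0
    # Pair every vertex with its successor, wrapping around to close the ring.
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    area = 0.5 * twice_area  # signed; assumed to be non-zero
    return abs(area), (cx / (6.0 * area), cy / (6.0 * area))


def looks_like_duplicate(ring_a, ring_b, area_tol=0.2, dist_tol=0.5):
    """Heuristic: similar area and nearby centroids suggest the same feature.

    area_tol and dist_tol are illustrative values, not taken from the text.
    """
    area_a, (ax, ay) = area_and_centroid(ring_a)
    area_b, (bx, by) = area_and_centroid(ring_b)
    largest = max(area_a, area_b)
    similar_area = abs(area_a - area_b) <= area_tol * largest
    # Measure the centroid separation relative to the size of the feature.
    close = math.hypot(ax - bx, ay - by) <= dist_tol * math.sqrt(largest)
    return similar_area and close
```
</PRE>

<P>
In this sketch a pair that passes both tests would be flagged as a
probable duplicate, while a small lake crossing a long shoreline fails the
area test and is treated as a separate feature.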
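<P>
The level assignment described in step 4 amounts to counting, for each
polygon, how many other polygons contain it.  The Python sketch below
shows the idea with a planar ray-casting test.  It assumes that no two
polygons cross (as guaranteed by the earlier steps), so testing a single
vertex is enough; points lying exactly on an edge are not handled
specially.  The function names are hypothetical.

<PRE>
```python
def point_in_ring(point, ring):
    """Ray-casting test: True if point lies inside the closed ring."""
    x, y = point
    inside = False
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        # Only edges that straddle the horizontal line through the point matter.
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            # Count crossings of a ray that runs from the point toward +x.
            if x_cross > x:
                inside = not inside
    return inside


def hierarchical_levels(rings):
    """Level = 1 + number of other rings that contain the ring's first vertex."""
    levels = []
    for i, ring in enumerate(rings):
        containers = sum(
            1
            for j, other in enumerate(rings)
            if j != i and point_in_ring(ring[0], other)
        )
        levels.append(containers + 1)
    return levels
```
</PRE>

<P>
Because every polygon is compared with every other polygon, the work grows
with the square of the number of polygons, which is consistent with the
multi-day run times quoted in step 4.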
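<P>
The common orientation and the polygon areas of step 5 can both be derived
from the signed area of a polygon.  The Python sketch below uses the planar
shoelace formula with x as longitude and y as latitude; a true area on the
sphere would need a different formula.  It assumes that odd levels enclose
land and even levels enclose water, following the level definitions in
step 4.  The function names are hypothetical.

<PRE>
```python
def signed_area(ring):
    """Shoelace formula: positive for counter-clockwise rings, negative for clockwise."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        total += x0 * y1 - x1 * y0
    return 0.5 * total


def orient_land_on_left(ring, level):
    """Return the ring ordered so that land lies to the left of the direction of travel.

    Odd levels (1, 3) enclose land, so they are made counter-clockwise;
    even levels (2, 4) enclose water, so they are made clockwise.
    """
    want_ccw = (level % 2 == 1)
    is_ccw = signed_area(ring) > 0.0
    return list(ring) if is_ccw == want_ccw else list(reversed(ring))
```
</PRE>

<P>
The absolute value of <TT>signed_area</TT> could then serve as the quantity
that is compared against the minimum area specified by the user.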

<P>
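<P>
The Douglas-Peucker decimation mentioned in step 6 can be sketched in
Python as follows.  This is a generic, iterative version of the algorithm
in planar coordinates, not the code used to produce the five resolutions.
The tolerance is expressed in the same units as the coordinates, and the
function names are hypothetical.

<PRE>
```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (planar approximation)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        # Degenerate segment, e.g. a closed polygon whose end points coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Return a subset of points that stays within tolerance of the input line."""
    if len(points) < 3:
        return list(points)
    keep = [False] * len(points)
    keep[0] = keep[-1] = True
    stack = [(0, len(points) - 1)]  # index ranges that still need checking
    while stack:
        first, last = stack.pop()
        worst, worst_dist = -1, tolerance
        for i in range(first + 1, last):
            d = point_segment_distance(points[i], points[first], points[last])
            if d > worst_dist:
                worst, worst_dist = i, d
        if worst != -1:
            # The farthest point exceeds the tolerance: keep it and recurse on both halves.
            keep[worst] = True
            stack.append((first, worst))
            stack.append((worst, last))
    return [p for p, k in zip(points, keep) if k]
```
</PRE>

<P>
A larger tolerance discards more points, so a sequence of increasing
tolerances yields progressively coarser versions of the same line.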
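<P>
The scaling of segment coordinates to short integers in step 6 can be
illustrated as follows.  The Python sketch assumes that each coordinate is
stored as an offset from the corner of its grid box and scaled to an
unsigned 16-bit range; the text above only states that short integers were
used, so the exact range, the rounding, and the function names are
assumptions.

<PRE>
```python
def pack_to_short(value, bin_origin, bin_size, max_short=65535):
    """Scale a coordinate inside [bin_origin, bin_origin + bin_size] to 0..max_short."""
    scaled = round((value - bin_origin) / bin_size * max_short)
    if not 0 <= scaled <= max_short:
        raise ValueError("coordinate lies outside its grid box")
    return scaled


def unpack_from_short(stored, bin_origin, bin_size, max_short=65535):
    """Recover the approximate coordinate from its stored integer."""
    return bin_origin + stored * bin_size / max_short
```
</PRE>

<P>
Under these assumptions the round-trip error is at most half of one
integer step, i.e. the box size divided by twice the integer range, so
smaller boxes give finer positional precision.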
<HR>
<ADDRESS>
Paul Wessel
2001-04-18
</ADDRESS>
</BODY>
</HTML>