1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
|
# Example loader for UK Ordnance Survey Mastermap Loader
# Inspired and functionally equivalent to https://github.com/AstunTechnology/Loader
#
# Main ETL, also to file (output_file) for test purposes
[etl]
chains = input_sql_pre|schema_name_filter|output_postgres,
input_big_gml_files|osmm_filter|xml_assembler|output_file,
input_big_gml_files|osmm_filter|xml_assembler|output_ogr2ogr
# Pre SQL file inputs to be executed
[input_sql_pre]
class = stetl.inputs.fileinput.StringFileInput
file_path = postgres/create-schema.sql,postgres/create-tables.sql,postgres/create-indexes.sql
# Generic filter to substitute Python-format string values like {schema} in string
[schema_name_filter]
class = stetl.filters.stringfilter.StringSubstitutionFilter
# format args {schema} is schema name
format_args = schema:{schema}
[output_postgres]
class = stetl.outputs.dboutput.PostgresDbOutput
database = {database}
host = {host}
port = {port}
user = {user}
password = {password}
schema = {schema}
# The source input file(s) from dir and produce osmm:*Member elements
# Namespaces are stripped as prep_osmgml.py uses non-namespaced xpath
[input_big_gml_files]
class = stetl.inputs.fileinput.XmlElementStreamerFileInput
file_path = {gml_files}
element_tags = topographicMember,boundaryMember,cartographicMember
strip_namespaces = True
# modifying filter for OSMM specific element manipulation
# prep_class is the specific Python class to actually modify single
# feature elements. Original code from AstunTech in prep_osmgml.py
[osmm_filter]
class = python.osmmfilter.OrdSurveyGmlFilter
prep_class = python.prep_osmgml.prep_osmm_topo
# Assembles etree docs gml:featureMember elements, each with "max_elements" elements
[xml_assembler]
class = stetl.filters.xmlassembler.XmlAssembler
max_elements = {max_features}
container_doc = <osgb:FeatureCollection
xmlns:osgb='http://www.ordnancesurvey.co.uk/xml/namespaces/osgb'
xmlns:gml='http://www.opengis.net/gml'
xmlns:xlink='http://www.w3.org/1999/xlink'
xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
xsi:schemaLocation='http://www.ordnancesurvey.co.uk/xml/namespaces/osgb http://www.ordnancesurvey.co.uk/xml/schema/v7/OSDNFFeatures.xsd'
fid='GDS-58116-1'>
<gml:description>Ordnance Survey, (c) Crown Copyright. All rights reserved, 2009-07-30</gml:description>
<gml:boundedBy>
<gml:null>unknown</gml:null>
</gml:boundedBy>
<osgb:queryTime>2009-07-30T02:35:17</osgb:queryTime>
<osgb:queryExtent>
<osgb:Rectangle srsName='osgb:BNG'>
<gml:coordinates>291000.000,92000.000 293000.000,94000.000</gml:coordinates>
</osgb:Rectangle>
</osgb:queryExtent>
</osgb:FeatureCollection >
element_container_tag = FeatureCollection
# The ogr2ogr command-line, may use any output here, as long as
# the input is a GML file. The "temp_file" is where etree-docs
# are saved. It has to be the same file as in the ogr2ogr command.
# TODO: find a way to use a GML-stream through stdin to ogr2ogr
[output_ogr2ogr]
class = stetl.outputs.ogroutput.Ogr2OgrOutput
temp_file = {temp_dir}/osmm-tmp.gml
gfs_file = gfs/osmm_topo_postgres.gfs
# lco will only be added to ogr2ogr on first run
# lco = -lco LAUNDER=YES -lco PRECISION=NO
ogr2ogr_cmd = ogr2ogr
-append -skipfailures
-t_srs EPSG:27700
-f PostgreSQL
"PG:dbname={database} host={host} port={port} user={user} password={password} active_schema={schema}"
{temp_dir}/osmm-tmp.gml
--config GML_EXPOSE_FID NO
--config PG_USE_COPY YES
# Below Alternative outputs for testing
# Send to stdout
[output_std]
class = stetl.outputs.standardoutput.StandardXmlOutput
[output_file]
class = stetl.outputs.fileoutput.FileOutput
file_path = output/osmm_topo_prepared.gml
|