.. _intro_to_creating_rdf: 

====================
Creating RDF triples
====================

Creating Nodes
--------------

RDF data is a graph whose nodes are URI references, blank nodes or literals. In RDFLib, these node types are
represented by the classes :class:`~rdflib.term.URIRef`, :class:`~rdflib.term.BNode`, and :class:`~rdflib.term.Literal`.
``URIRefs`` and ``BNodes`` can both be thought of as resources, such as a person, a company, a website, etc.

* A ``BNode`` is a node where the exact URI is not known - usually a node with identity only in relation to other nodes.
* A ``URIRef`` is a node where the exact URI is known. In addition to representing some subjects and objects in RDF graphs, ``URIRef``\s are always used to represent properties/predicates.
* ``Literals`` represent object values, such as a name, a date, a number, etc. The most common literal values are XML Schema data types, e.g. string, int, etc., but custom types can be declared too.

Nodes can be created by the constructors of the node classes:

.. code-block:: python

   from rdflib import URIRef, BNode, Literal

   bob = URIRef("http://example.org/people/Bob")
   linda = BNode()  # a GUID is generated

   name = Literal("Bob")  # passing a string
   age = Literal(24)  # passing a python int
   height = Literal(76.5)  # passing a python float

Literals can be created from Python objects; this creates ``data-typed literals``. For the details on the mapping see
:ref:`rdflibliterals`.

For creating many ``URIRefs`` in the same ``namespace``, i.e. URIs with the same prefix, RDFLib has the
:class:`rdflib.namespace.Namespace` class:

.. code-block:: python

   from rdflib import Namespace

   n = Namespace("http://example.org/people/")

   n.bob  # == rdflib.term.URIRef("http://example.org/people/bob")
   n.eve  # == rdflib.term.URIRef("http://example.org/people/eve")


This is very useful for schemas where all properties and classes have the same URI prefix. RDFLib defines Namespaces for
some common RDF/OWL schemas, including most W3C ones:

.. code-block:: python

    from rdflib.namespace import CSVW, DC, DCAT, DCTERMS, DOAP, FOAF, ODRL2, ORG, OWL, \
                               PROF, PROV, RDF, RDFS, SDO, SH, SKOS, SOSA, SSN, TIME, \
                               VOID, XMLNS, XSD

    RDF.type
    # == rdflib.term.URIRef("http://www.w3.org/1999/02/22-rdf-syntax-ns#type")

    FOAF.knows
    # == rdflib.term.URIRef("http://xmlns.com/foaf/0.1/knows")

    PROF.isProfileOf
    # == rdflib.term.URIRef("http://www.w3.org/ns/dx/prof/isProfileOf")

    SOSA.Sensor
    # == rdflib.term.URIRef("http://www.w3.org/ns/sosa/Sensor")


Adding Triples to a graph
-------------------------

We already saw in :doc:`intro_to_parsing` how triples can be added from files and online locations with the
:meth:`~rdflib.graph.Graph.parse` method.

Triples can also be added within Python code directly, using the :meth:`~rdflib.graph.Graph.add` method:

.. automethod:: rdflib.graph.Graph.add
    :noindex:

:meth:`~rdflib.graph.Graph.add` takes a 3-tuple (a "triple") of RDFLib nodes. Using the nodes and
namespaces we defined previously:

.. code-block:: python

    from rdflib import Graph, URIRef, Literal, BNode
    from rdflib.namespace import FOAF, RDF

    g = Graph()
    g.bind("foaf", FOAF)

    bob = URIRef("http://example.org/people/Bob")
    linda = BNode()  # a GUID is generated

    name = Literal("Bob")
    age = Literal(24)

    g.add((bob, RDF.type, FOAF.Person))
    g.add((bob, FOAF.name, name))
    g.add((bob, FOAF.age, age))
    g.add((bob, FOAF.knows, linda))
    g.add((linda, RDF.type, FOAF.Person))
    g.add((linda, FOAF.name, Literal("Linda")))

    print(g.serialize())


outputs: 

.. code-block:: Turtle

    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

    <http://example.org/people/Bob> a foaf:Person ;
        foaf:age 24 ;
        foaf:knows [ a foaf:Person ;
                foaf:name "Linda" ] ;
        foaf:name "Bob" .

For some properties, only one value per resource makes sense (i.e. they are *functional properties*, or have a
max-cardinality of 1). The :meth:`~rdflib.graph.Graph.set` method is useful for this:

.. code-block:: python

    from rdflib import Graph, URIRef, Literal
    from rdflib.namespace import FOAF

    g = Graph()
    bob = URIRef("http://example.org/people/Bob")

    g.add((bob, FOAF.age, Literal(42)))
    print(f"Bob is {g.value(bob, FOAF.age)}")
    # prints: Bob is 42

    g.set((bob, FOAF.age, Literal(43)))  # replaces 42 set above
    print(f"Bob is now {g.value(bob, FOAF.age)}")
    # prints: Bob is now 43


:meth:`rdflib.graph.Graph.value` is the matching query method. It will return a single value for a property, optionally
raising an exception if there are more.

You can also add triples by combining entire graphs; see :ref:`graph-setops`.


Removing Triples
----------------

Similarly, triples can be removed by a call to :meth:`~rdflib.graph.Graph.remove`:

.. automethod:: rdflib.graph.Graph.remove
    :noindex:

When removing, it is possible to leave parts of the triple unspecified by passing ``None``; this removes all
matching triples:

.. code-block:: python

    g.remove((bob, None, None))  # remove all triples about bob


An example
----------

LiveJournal produces FOAF data for its users, but it seems to use
``foaf:member_name`` for a person's full name. ``foaf:member_name``
isn't in FOAF's namespace; perhaps ``foaf:name`` should have been used instead.

To retrieve some LiveJournal data, add a ``foaf:name`` for every
``foaf:member_name`` and then remove the ``foaf:member_name`` values so that
the data aligns with other FOAF data, we could do this:

.. code-block:: python

    from rdflib import Graph
    from rdflib.namespace import FOAF

    g = Graph()
    # get the data
    g.parse("http://danbri.livejournal.com/data/foaf")

    # for every foaf:member_name, add foaf:name and remove foaf:member_name
    # (list() takes a snapshot so the graph is not modified while being iterated)
    for s, p, o in list(g.triples((None, FOAF['member_name'], None))):
        g.add((s, FOAF['name'], o))
        g.remove((s, FOAF['member_name'], o))

.. note:: Since rdflib 5.0.0, using ``foaf:member_name`` is somewhat prevented in RDFLib, since FOAF is declared
    as a :class:`~rdflib.namespace.ClosedNamespace` instance that has a closed set of members, and
    ``foaf:member_name`` isn't one of them! If LiveJournal had used RDFLib 5.0.0, an error would have been raised for
    ``foaf:member_name`` when the triple was created.


Creating Containers & Collections
---------------------------------
There are two convenience classes for RDF Containers & Collections which you can use instead of declaring each
triple of a Container or a Collection individually:

    * :class:`~rdflib.container.Container` (also ``Bag``, ``Seq`` & ``Alt``) and
    * :class:`~rdflib.collection.Collection`

See their documentation for details on how to use them.