
Core Document Properties
========================

The Open XML format provides for several sets of descriptive properties to be
maintained with each document. One of these sets is the *core file properties*.
The core properties are common to all Open XML formats and appear in
document, presentation, and spreadsheet files. The 'Core' in core document
properties refers to `Dublin Core`_, a metadata standard that defines a core
set of elements to describe resources.

The core properties are described in Part 2 of the ISO/IEC 29500 spec, in
Section 11. The names of some core properties in |docx| are changed from
those in the spec to conform to the MS API.

Other properties, such as company name, are extended properties, held in
``app.xml``.


Candidate Protocol
------------------

::

    >>> from docx import Document
    >>> document = Document()
    >>> core_properties = document.core_properties
    >>> core_properties.author
    'python-docx'
    >>> core_properties.author = 'Brian'
    >>> core_properties.author
    'Brian'


Properties
----------

Fifteen properties are supported. Every unicode (string) value is limited to
255 characters (not bytes).
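
The distinction between characters and bytes matters for non-ASCII text. The
following is a minimal sketch of such a length check, not the |docx|
implementation; the ``check_length`` helper and its error message are
hypothetical names introduced only for illustration.

```python
def check_length(value, limit=255):
    """Hypothetical validator: reject strings longer than `limit` characters."""
    if len(value) > limit:
        raise ValueError(
            "exceeded %d character limit, got %d characters" % (limit, len(value))
        )
    return value


# 255 accented characters occupy 510 bytes in UTF-8, but they still pass
# because the limit counts characters rather than encoded bytes.
text = "\u00e9" * 255
assert len(text) == 255
assert len(text.encode("utf-8")) == 510
check_length(text)
```

A 256-character string would raise ``ValueError`` under this sketch, regardless
of how many bytes it encodes to.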

author *(unicode)*
    Note: named 'creator' in spec. An entity primarily responsible for making
    the content of the resource. (Dublin Core)

category *(unicode)*
    A categorization of the content of this package. Example values for this
    property might include: Resume, Letter, Financial Forecast, Proposal,
    Technical Presentation, and so on. (Open Packaging Conventions)

comments *(unicode)*
    Note: named 'description' in spec. An explanation of the content of the
    resource. Values might include an abstract, table of contents, reference
    to a graphical representation of content, and a free-text account of the
    content. (Dublin Core)

content_status *(unicode)*
    The status of the content. Values might include “Draft”, “Reviewed”, and
    “Final”. (Open Packaging Conventions)

created *(datetime)*
    Date of creation of the resource. (Dublin Core)

identifier *(unicode)*
    An unambiguous reference to the resource within a given context.
    (Dublin Core)

keywords *(unicode)*
    A delimited set of keywords to support searching and indexing. This is
    typically a list of terms that are not available elsewhere in the
    properties. (Open Packaging Conventions)

language *(unicode)*
    The language of the intellectual content of the resource. (Dublin Core)

last_modified_by *(unicode)*
    The user who performed the last modification. The identification is
    environment-specific. Examples include a name, email address, or employee
    ID. It is recommended that this value be as concise as possible.
    (Open Packaging Conventions)

last_printed *(datetime)*
    The date and time of the last printing. (Open Packaging Conventions)

modified *(datetime)*
    Date on which the resource was changed. (Dublin Core)

revision *(int)*
    The revision number. This value might indicate the number of saves or
    revisions, provided the application updates it after each revision.
    (Open Packaging Conventions)

subject *(unicode)*
    The topic of the content of the resource. (Dublin Core)

title *(unicode)*
    The name given to the resource. (Dublin Core)

version *(unicode)*
    The version designator. This value is set by the user or by the
    application. (Open Packaging Conventions)
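
The correspondence between the property names above and the element names in
the spec can be written down as a lookup table. This is a sketch assembled
from the property list and the schema in this document; the names ``NSMAP``,
``ELEMENT_NAMES``, and ``qualified_tag`` are hypothetical and are not part of
the |docx| API.

```python
# Namespace URIs as they appear in the specimen XML and schema.
NSMAP = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Property name -> prefixed element name in core.xml.
ELEMENT_NAMES = {
    "author": "dc:creator",          # renamed from 'creator'
    "category": "cp:category",
    "comments": "dc:description",    # renamed from 'description'
    "content_status": "cp:contentStatus",
    "created": "dcterms:created",
    "identifier": "dc:identifier",
    "keywords": "cp:keywords",
    "language": "dc:language",
    "last_modified_by": "cp:lastModifiedBy",
    "last_printed": "cp:lastPrinted",
    "modified": "dcterms:modified",
    "revision": "cp:revision",
    "subject": "dc:subject",
    "title": "dc:title",
    "version": "cp:version",
}


def qualified_tag(prop_name):
    """Return the Clark-notation tag, e.g. '{namespace-uri}creator'."""
    prefix, local_name = ELEMENT_NAMES[prop_name].split(":")
    return "{%s}%s" % (NSMAP[prefix], local_name)
```

Only ``author`` and ``comments`` differ from the spec name in more than
spelling convention; the rest are the spec names converted to snake case.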


Specimen XML
------------

.. highlight:: xml

core.xml produced by Microsoft Word::

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <cp:coreProperties
        xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
        xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:dcterms="http://purl.org/dc/terms/"
        xmlns:dcmitype="http://purl.org/dc/dcmitype/"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <dc:title>Core Document Properties Exploration</dc:title>
      <dc:subject>PowerPoint core document properties</dc:subject>
      <dc:creator>Steve Canny</dc:creator>
      <cp:keywords>powerpoint; open xml; dublin core; microsoft office</cp:keywords>
      <dc:description>
        One thing I'd like to discover is just how line wrapping is handled
        in the comments. This paragraph is all on a single
        line._x000d__x000d_This is a second paragraph separated from the
        first by two line feeds.
      </dc:description>
      <cp:lastModifiedBy>Steve Canny</cp:lastModifiedBy>
      <cp:revision>2</cp:revision>
      <dcterms:created xsi:type="dcterms:W3CDTF">2013-04-06T06:03:36Z</dcterms:created>
      <dcterms:modified xsi:type="dcterms:W3CDTF">2013-06-15T06:09:18Z</dcterms:modified>
      <cp:category>analysis</cp:category>
    </cp:coreProperties>
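
As an illustration of how values like those in the specimen might be read,
the sketch below parses an abbreviated copy of it with the standard library's
``xml.etree.ElementTree``. It is not how |docx| reads the part; the helpers
``text_of`` and ``parse_w3cdtf`` are hypothetical, and the timestamp parser
assumes only the UTC ``Z`` form seen in the specimen.

```python
import datetime as dt
import xml.etree.ElementTree as ET

NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Abbreviated copy of the specimen above.
CORE_XML = """\
<cp:coreProperties
    xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <dc:title>Core Document Properties Exploration</dc:title>
  <dc:creator>Steve Canny</dc:creator>
  <cp:revision>2</cp:revision>
  <dcterms:created xsi:type="dcterms:W3CDTF">2013-04-06T06:03:36Z</dcterms:created>
</cp:coreProperties>
"""


def text_of(root, path):
    """Return the text of the child at `path`, or None when it is absent."""
    element = root.find(path, NS)
    return None if element is None else element.text


def parse_w3cdtf(value):
    """Parse the 'YYYY-MM-DDThh:mm:ssZ' form only (an assumption of this sketch)."""
    return dt.datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


root = ET.fromstring(CORE_XML)
author = text_of(root, "dc:creator")                       # spec name: creator
revision = int(text_of(root, "cp:revision"))               # stored as text
created = parse_w3cdtf(text_of(root, "dcterms:created"))   # naive datetime, UTC
```

Returning ``None`` for a missing element reflects that no child element is
required to be present.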


Schema Excerpt
--------------

::

    <xs:schema
      targetNamespace="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
      xmlns="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
      xmlns:xs="http://www.w3.org/2001/XMLSchema"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:dcterms="http://purl.org/dc/terms/"
      elementFormDefault="qualified"
      blockDefault="#all">

      <xs:import
        namespace="http://purl.org/dc/elements/1.1/"
        schemaLocation="http://dublincore.org/schemas/xmls/qdc/2003/04/02/dc.xsd"/>
      <xs:import
        namespace="http://purl.org/dc/terms/"
        schemaLocation="http://dublincore.org/schemas/xmls/qdc/2003/04/02/dcterms.xsd"/>
      <xs:import
        id="xml"
        namespace="http://www.w3.org/XML/1998/namespace"/>

      <xs:element name="coreProperties" type="CT_CoreProperties"/>

      <xs:complexType name="CT_CoreProperties">
        <xs:all>
          <xs:element name="category"        type="xs:string"   minOccurs="0"/>
          <xs:element name="contentStatus"   type="xs:string"   minOccurs="0"/>
          <xs:element ref="dcterms:created"                     minOccurs="0"/>
          <xs:element ref="dc:creator"                          minOccurs="0"/>
          <xs:element ref="dc:description"                      minOccurs="0"/>
          <xs:element ref="dc:identifier"                       minOccurs="0"/>
          <xs:element name="keywords"        type="CT_Keywords" minOccurs="0"/>
          <xs:element ref="dc:language"                         minOccurs="0"/>
          <xs:element name="lastModifiedBy"  type="xs:string"   minOccurs="0"/>
          <xs:element name="lastPrinted"     type="xs:dateTime" minOccurs="0"/>
          <xs:element ref="dcterms:modified"                    minOccurs="0"/>
          <xs:element name="revision"        type="xs:string"   minOccurs="0"/>
          <xs:element ref="dc:subject"                          minOccurs="0"/>
          <xs:element ref="dc:title"                            minOccurs="0"/>
          <xs:element name="version"         type="xs:string"   minOccurs="0"/>
        </xs:all>
      </xs:complexType>

      <xs:complexType name="CT_Keywords" mixed="true">
        <xs:sequence>
          <xs:element name="value" minOccurs="0" maxOccurs="unbounded" type="CT_Keyword"/>
        </xs:sequence>
        <xs:attribute ref="xml:lang" use="optional"/>
      </xs:complexType>

      <xs:complexType name="CT_Keyword">
        <xs:simpleContent>
          <xs:extension base="xs:string">
            <xs:attribute ref="xml:lang" use="optional"/>
          </xs:extension>
        </xs:simpleContent>
      </xs:complexType>

    </xs:schema>
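
Because every child of ``CT_CoreProperties`` is declared with
``minOccurs="0"`` inside ``xs:all``, any subset of the elements may appear, in
any order. The sketch below builds such a partial element with the standard
library; the ``build_core_properties`` function is hypothetical and only
covers three of the fifteen properties.

```python
import xml.etree.ElementTree as ET

CP = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
DC = "http://purl.org/dc/elements/1.1/"

# Ask ElementTree to serialize with the conventional prefixes.
ET.register_namespace("cp", CP)
ET.register_namespace("dc", DC)


def build_core_properties(title=None, author=None, revision=None):
    """Build a cp:coreProperties element, omitting any property left as None."""
    root = ET.Element("{%s}coreProperties" % CP)
    if title is not None:
        ET.SubElement(root, "{%s}title" % DC).text = title
    if author is not None:
        # 'author' is stored under the spec name 'creator'.
        ET.SubElement(root, "{%s}creator" % DC).text = author
    if revision is not None:
        # The schema types revision as xs:string, so the int is written as text.
        ET.SubElement(root, "{%s}revision" % CP).text = str(revision)
    return root


xml_text = ET.tostring(
    build_core_properties(title="Report", revision=3), encoding="unicode"
)
```

Here ``author`` is omitted, so no ``dc:creator`` element is written at all
rather than an empty one.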


.. _Dublin Core:
   http://en.wikipedia.org/wiki/Dublin_Core