File: comments.rst

package info (click to toggle)
python-docx 1.2.0%2Bdfsg-1
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid
  • size: 7,216 kB
  • sloc: xml: 25,323; python: 23,414; makefile: 175
file content (168 lines) | stat: -rw-r--r-- 7,240 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
.. _comments:

Working with Comments
=====================

Word allows *comments* to be added to a document. This is an aspect of the *reviewing*
feature-set and is typically used by a second party to provide feedback to the author
without changing the document itself.

The procedure is simple:

- You select some range of text with the mouse or Shift+Arrow keys
- You press the *New Comment* button (Review toolbar)
- You type or paste in your comment

.. image:: /_static/img/comment-parts.png

A comment can only be added to the main document. A comment cannot be added in a header,
a footer, or within a comment. A comment _can_ be added to a footnote or endnote, but
those are not yet supported by *python-docx*.

**Comment Anatomy.** Each comment has two parts, the *comment-reference* and the
*comment-content*:

The **comment-refererence**, sometimes *comment-anchor*, is the text in the main
document you selected before pressing the *New Comment* button. It is a so-called
*range* in the main document that starts at the first selected character and ends after
the last one.

The **comment-content**, sometimes just *comment*, is whatever content you typed or
pasted in. The content for each comment is stored in a separate comment object, and
these comment objects are stored in a separate *comments-part* (part-name
``word/comments.xml``), not in the main document. Each comment is assigned a unique id
when it is created, allowing the comment reference to be associated with its content and
vice versa.

**Comment Reference.** The comment-reference is a *range*. A range must both start and
end at an even *run* boundary. Intuitively, a range corresponds to a *selection* of text
in the Word UI, one formed by dragging with the mouse or using the *Shift-Arrow* keys.

In the XML, this range is delimited by a start marker `<w:commentRangeStart/>` and an
end marker `<w:commentRangeEnd/>`, both of which contain the *id* of the comment they
delimit. The start marker appears before the run starting with the first character of
the range and the end marker appears immediately after the run ending with the last
character of the range. Adding a comment that references an arbitrary range of text in
an existing document may require splitting runs on the desired character boundaries.

In general a range can span paragraphs, such that the range begins in one paragraph and
ends in a later paragraph. However, a range must enclose *contiguous* runs, such that a
range that contains only two vertically adjacent cells in a multi-column table is not
possible (even though Word allows such a selection with the mouse).

**Comment Content.** Interestingly, although commonly used to contain a single line of
plain text, the comment-content can contain essentially any content that can appear in
the document body. This includes rich text with emphasis, runs with a different typeface
and size, both paragraph and character styles, hyperlinks, images, and tables. Note that
tables do not appear in the comment as displayed in the *comment-sidebar* although they
do apper in the *reviewing-pane*.

**Comment Metadata.** Each comment can be assigned *author*, *initals*, and *date*
metadata. In Word, these fields are assigned automatically based on values in ``Settings
> User`` of the installed Word application. These might be configured automatically in
an enterprise installation, based on the user account, but by default they are empty.

*author* metadata is required, although silently assigned the empty string by Word if
the user name is not configured. *initials* is optional, but always set by Word, to the
empty string if not configured. *date* is also optional, but always set by Word to the
UTC date and time the comment was added, with seconds resolution (no milliseconds or
microseconds).

**Additional Features.** Later versions of Word allow a comment to be *resolved*. A
comment in this state will appear grayed-out in the Word UI. Later versions of Word also
allow a comment to be *replied to*, forming a *comment thread*. Neither of these
features is supported by the initial implementation of comments in *python-docx*.

**Applicability.** Note that comments cannot be added to a header or footer and cannot
be nested inside a comment itself. In general the *python-docx* API will not allow these
operations but if you outsmart it then the resulting comment will either be silently
removed or trigger a repair error when the document is loaded by Word.


Adding a Comment
----------------

A simple example is adding a comment to a paragraph::

    >>> from docx import Document
    >>> document = Document()
    >>> paragraph = document.add_paragraph("Hello, world!")

    >>> comment = document.add_comment(
    ...    runs=paragraph.runs,
    ...    text="I have this to say about that"
    ...    author="Steve Canny",
    ...    initials="SC",
    ... )
    >>> comment
    <docx.comments.Comment object at 0x02468ACE>
    >>> comment.id
    0
    >>> comment.author
    'Steve Canny'
    >>> comment.initials
    'SC'
    >>> comment.date
    datetime.datetime(2025, 6, 11, 20, 42, 30, 0, tzinfo=datetime.timezone.utc)
    >>> comment.text
    'I have this to say about that'

The API documentation for :meth:`.Document.add_comment` provides further details.


Accessing and using the Comments collection
-------------------------------------------

The comments collection is accessed via the :attr:`.Document.comments` property::

    >>> comments = document.comments
    >>> comments
    <docx.parts.comments.Comments object at 0x02468ACE>
    >>> len(comments)
    1

The comments collection supports random access to a comment by its id::

    >>> comment = comments.get(0)
    >>> comment
    <docx.comments.Comment object at 0x02468ACE>


Adding rich content to a comment
--------------------------------

A comment is a _block-item container_, just like the document body or a table cell, so
it can contain any content that can appear in those places. It does not contain
page-layout sections and cannot contain a comment reference, but it can contain multiple
paragraphs and/or tables, and runs within paragraphs can have emphasis such as bold or
italic, and have images or hyperlinks.

A comment created with `text=""` will contain a single paragraph with a single empty run
containing the so-called *annotation reference* but no text. It's probably best to leave
this run as it is but you can freely add additional runs to the paragraph that contain
whatever content you like.

The methods for adding this content are the same as those used for the document and
table cells::

    >>> paragraph = document.add_paragraph("The rain in Spain.")
    >>> comment = document.add_comment(
    ...     runs=paragraph.runs,
    ...     text="",
    ... )
    >>> cmt_para = comment.paragraphs[0]
    >>> cmt_para.add_run("Please finish this thought. I believe it should be ")
    >>> cmt_para.add_run("falls mainly in the plain.").bold = True


Updating comment metadata
-------------------------

The author and initials metadata can be updated as desired::

    >>> comment.author = "John Smith"
    >>> comment.initials = "JS"
    >>> comment.author
    'John Smith'
    >>> comment.initials
    'JS'