File: FAQ-wordcount.html

<head>
<title>UK TeX FAQ -- question label wordcount</title>
</head><body>
<h3>How many words have you written?</h3>
<p>One often has to submit a document (e.g., a paper or a dissertation)
under some sort of constraint on its size.  Sensible people set a
constraint in terms of the number of pages, but there are some who
persist in limiting the number of words you type.
<p>A rough answer to the requirement follows from a simple
observation: the powers that be are unlikely to count all the
words of a document submitted to them.  Therefore, a statistical
method can be employed: find how many words there are on a full page;
find how many full pages there are in the document (allowing for
displays of various sorts, this number will probably not be an
integer); multiply the two.  However, if the document to be submitted
is to determine the success of the rest of one's life, it takes a
brave person to thumb their nose at authority quite so
comprehensively...
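<p>As a minimal sketch of that arithmetic, the following shell fragment
multiplies the two figures.  Both numbers are hypothetical placeholders,
not measurements of any real document; <i>awk</i> is used only because
plain shell arithmetic cannot handle a non-integer page count.

```shell
# Hypothetical figures: replace them with values measured on your own document.
words_per_page=420   # words counted by hand on one full page of text
full_pages=37.5      # full-page equivalents, allowing for displays

# Multiply the two; awk copes with the fractional page count.
estimate=$(awk -v w="$words_per_page" -v p="$full_pages" 'BEGIN { printf "%.0f", w * p }')
echo "Estimated word count: $estimate"
```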
<p>The simplest way to get an actual count is to strip out the (La)TeX
markup, and to count what's left.  On a Unix-like system, this may be
done using <i>detex</i> and the standard <i>wc</i> utility:
<pre>
  detex &lt;filename&gt; | wc -w
</pre>
The <i>latexcount</i> script does the same sort of job in one
step; being a <i>perl</i> script, it is in principle rather
easily configured (see the documentation inside the script).
<i>Winedt</i> (see <a href="FAQ-editors.html">editors and shells</a>)
provides this functionality directly in the Windows environment.
<p>Simply stripping (La)TeX markup isn't entirely reliable, however:
the markup itself may contribute typeset words, so a count of what
remains can differ from a count of what is printed.  The <i>wordcount</i> package
contains a Bourne shell (i.e., typically Unix) script for running a
LaTeX file with a special piece of supporting TeX code, and then
counting word indications in the log file.  This is probably about as
accurate as automatic counting can get.
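<p>To illustrate how markup can contribute typeset words, consider a
fragment that uses \today, which LaTeX typesets as the current date.
The sketch below strips markup with a crude <i>sed</i> expression; this
is an assumed stand-in for illustration only, and is not how
<i>detex</i> or the <i>wordcount</i> package works.

```shell
# A one-line LaTeX fragment: \today typesets a date (several words on paper),
# but contributes nothing once the markup has been stripped.
fragment='Submitted on \today{} to the committee.'

# Crude stand-in for a markup stripper (an assumption, for illustration):
# delete control sequences and braces, then count the words that remain.
count=$(printf '%s\n' "$fragment" | sed -e 's/\\[A-Za-z]*//g' -e 's/[{}]//g' | wc -w | tr -d '[:space:]')
echo "Words counted after stripping: $count"
```

The stripped text yields five words, whereas the typeset sentence also
contains the words of the date, so a strip-and-count figure would come
out low for this fragment.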
<dl>
<dt><tt><i>detex</i></tt><dd><a href="ftp://cam.ctan.org/tex-archive/support/detex.tar.gz">support/detex</a> (<a href="ftp://cam.ctan.org/tex-archive/support/detex.zip">zip</a>, <a href="http://www.tex.ac.uk/tex-archive/support/detex/">browse</a>)
<dt><tt><i>wordcount</i></tt><dd><a href="ftp://cam.ctan.org/tex-archive/macros/latex/contrib/wordcount.tar.gz">macros/latex/contrib/wordcount</a> (<a href="ftp://cam.ctan.org/tex-archive/macros/latex/contrib/wordcount.zip">zip</a>, <a href="http://www.tex.ac.uk/tex-archive/macros/latex/contrib/wordcount/">browse</a>)
</dl>
<p>This question on the Web: <a href="http://www.tex.ac.uk/cgi-bin/texfaq2html?label=wordcount">http://www.tex.ac.uk/cgi-bin/texfaq2html?label=wordcount</a>
</body>