File: 01-introduction.Rmd

package info (click to toggle)
r-cran-bookdown 0.32%2Bdfsg-1
  • links: PTS, VCS
  • area: main
  • in suites: bookworm
  • size: 3,664 kB
  • sloc: javascript: 11,322; makefile: 21; sh: 20
file content (193 lines) | stat: -rw-r--r-- 14,681 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
\mainmatter

# Introduction

This book is a guide to authoring books and technical documents with R Markdown [@R-rmarkdown] and the R package **bookdown** [@R-bookdown]. It focuses on the features specific to writing books, long-form articles, or reports, such as:

- how to typeset equations, theorems, figures and tables, and cross-reference them;
- how to generate multiple output formats such as HTML, PDF, and e-books for a single book;
- how to customize the book templates and style different elements in a book;
- editor support (in particular, the RStudio IDE); and
- how to publish a book.

It is not a comprehensive introduction to R Markdown or the **knitr** package [@R-knitr], on top of which **bookdown** was built. To learn more about R Markdown, please check out the online documentation <http://rmarkdown.rstudio.com>. For **knitr**, please see @xie2015. You do not have to be an expert of the R language [@R-base] to read this book, but you are expected to have some basic knowledge about R Markdown and **knitr**. For beginners, you may get started with the cheatsheets at <https://www.rstudio.com/resources/cheatsheets/>. The appendix of this book contains brief introductions to these software packages. To be able to customize the book templates and themes, you should be familiar with LaTeX, HTML and CSS.

## Motivation

Markdown is a wonderful language to write relatively simple documents that contain elements like sections, paragraphs, lists, links, and images, etc. Pandoc (<http://pandoc.org>) has greatly extended the [original Markdown syntax,](http://daringfireball.net/projects/markdown/) and added quite a few useful new features, such as footnotes, citations, and tables. More importantly, Pandoc makes it possible to generate output documents of a large variety of formats from Markdown, including HTML, LaTeX/PDF, Word, and slides.

There are still a few useful features missing in Pandoc's Markdown at the moment that are necessary to write a relatively complicated document like a book, such as automatic numbering of figures and tables in the HTML output, cross-references of figures and tables, and fine control of the appearance of figures (e.g., currently it is impossible to specify the alignment of images using the Markdown syntax). These are some of the problems that we have addressed in the **bookdown** package.

Under the constraint that we want to produce the book in multiple output formats, it is nearly impossible to cover all possible features specific to these diverse output formats. For example, it may be difficult to reinvent a certain complicated LaTeX environment in the HTML output using the (R) Markdown syntax. Our main goal is not to replace _everything_ with Markdown, but to cover _most_ common functionalities required to write a relatively complicated document, and make the syntax of such functionalities consistent across all output formats, so that you only need to learn one thing and it works for all output formats.\index{Markdown}\index{LaTeX}

Another goal of this project is to make it easy to produce books that look visually pleasant. Some nice existing examples include GitBook (<https://www.gitbook.com>), Tufte CSS (<http://edwardtufte.github.io/tufte-css/>), and Tufte-LaTeX (<https://tufte-latex.github.io/tufte-latex/>). We hope to integrate these themes and styles into **bookdown**, so authors do not have to dive into the details of how to use a certain LaTeX class or how to configure CSS for HTML output.

## Get started

The easiest way for beginners to get started with writing a book with R Markdown and **bookdown** is through the demo `bookdown-demo` on GitHub:

1. Download the GitHub repository <https://github.com/rstudio/bookdown-demo> as a [Zip file,](https://github.com/rstudio/bookdown-demo/archive/main.zip) then unzip it locally.
1. Install the RStudio IDE. Note that you need a version higher than 1.0.0. Please [download the latest version](https://www.rstudio.com/products/rstudio/download/) if your RStudio version is lower than 1.0.0.
1. Install the R package **bookdown**:

    ```{r eval=FALSE}
    # stable version on CRAN
    install.packages('bookdown')
    # or development version on GitHub
    # remotes::install_github('rstudio/bookdown')
    ```

1. Open the `bookdown-demo` repository you downloaded in RStudio by clicking `bookdown-demo.Rproj`.
1. Open the R Markdown file `index.Rmd` and click the button `Build Book` on the `Build` tab of RStudio.

```{block2, type='rmdnote'}
If you are planning on printing your book to PDF, you will need a LaTeX distribution. We recommend that you install TinyTeX (which includes XeLaTeX): <https://yihui.org/tinytex/>.
```

Now you should see the index page of this book demo in the RStudio Viewer. You may add or change the R Markdown files, and hit the `Knit` button again to preview the book. If you prefer not to use RStudio, you may also compile the book through the command line. See the next section for details.

Although you see quite a few files in the `bookdown-demo` example, most of them are not essential to a book. If you feel overwhelmed by the number of files, you can use this minimal example instead, which is essentially one file `index.Rmd`: https://github.com/yihui/bookdown-minimal. The `bookdown-demo` example contains some advanced settings that you may want to learn later, such as how to customize the LaTeX preamble, tweak the CSS, and build the book on GitHub, etc.

## Usage {#usage}

A typical **bookdown** book contains multiple chapters, and one chapter lives in one R Markdown file, with the filename extension `.Rmd`. Each R Markdown file must start immediately with the chapter title using the first-level heading, e.g., `# Chapter Title`. All R Markdown files must be encoded in UTF-8, especially when they contain multi-byte characters such as Chinese, Japanese, and Korean. Here is an example (the bullets are the filenames, followed by the file content):

- index.Rmd

    ```markdown
    # Preface {-}
    
    In this book, we will introduce an interesting
    method.
    ```

- 01-intro.Rmd

    ```markdown
    # Introduction
    
    This chapter is an overview of the methods that
    we propose to solve an **important problem**.
    ```

- 02-literature.Rmd

    ```markdown
    # Literature
    
    Here is a review of existing methods.
    ```

- 03-method.Rmd

    ```markdown
    # Methods
    
    We describe our methods in this chapter.
    ```

- 04-application.Rmd

    ```markdown
    # Applications
    
    Some _significant_ applications are demonstrated
    in this chapter.
    
    ## Example one
    
    ## Example two
    ```

- 05-summary.Rmd

    ```markdown
    # Final Words
    
    We have finished a nice book.
    ```

By default, **bookdown** merges all Rmd files by the order of filenames, e.g., `01-intro.Rmd` will appear before `02-literature.Rmd`. Filenames that start with an underscore `_` are skipped. If there exists an Rmd file named `index.Rmd`, it will always be treated as the first file when merging all Rmd files. The reason for this special treatment is that the HTML file `index.html` to be generated from `index.Rmd` is usually the default index file when you view a website, e.g., you are actually browsing http://yihui.org/index.html when you open http://yihui.org/.

You can override the above behavior by including a configuration file named `_bookdown.yml`\index{\_bookdown.yml} in the book directory. It is a YAML\index{YAML} file (https://en.wikipedia.org/wiki/YAML), and R Markdown users should be familiar with this format since it is also used to write the metadata in the beginning of R Markdown documents (you can learn more about YAML in Section \@ref(r-markdown)). You can use a field named `rmd_files` to define your own list and order of Rmd files for the book. For example,

```yaml
rmd_files: ["index.Rmd", "abstract.Rmd", "intro.Rmd"]
```

In this case, **bookdown** will use the list of files you defined in this YAML field (`index.Rmd` will be added to the list if it exists, and filenames starting with underscores are always ignored). If you want both HTML and LaTeX/PDF output from the book, and use different Rmd files for HTML and LaTeX output, you may specify these files for the two output formats separately, e.g.,

```yaml
rmd_files:
  html: ["index.Rmd", "abstract.Rmd", "intro.Rmd"]
  latex: ["abstract.Rmd", "intro.Rmd"]
```

Although we have been talking about R Markdown files, the chapter files do not actually have to be R Markdown. They can be plain Markdown files (`.md`), and do not have to contain R code chunks at all. You can certainly use **bookdown** to compose novels or poems!

At the moment, the major output formats that you may use include `bookdown::pdf_book`, `bookdown::gitbook`, `bookdown::html_book`, and `bookdown::epub_book`. There is a `bookdown::render_book()`\index{bookdown::render\_book()} function similar to `rmarkdown::render()`, but it was designed to render _multiple_ Rmd documents into a book using the output format functions. You may either call this function from command line directly, or click the relevant buttons in the RStudio IDE. Here are some command-line examples:

```{r eval=FALSE}
bookdown::render_book('foo.Rmd', 'bookdown::gitbook')
bookdown::render_book('foo.Rmd', 'bookdown::pdf_book')
bookdown::render_book('foo.Rmd', bookdown::gitbook(lib_dir = 'libs'))
bookdown::render_book('foo.Rmd', bookdown::pdf_book(keep_tex = TRUE))
```

To use `render_book` and the output format functions in the RStudio IDE, you can define a YAML field named `site` that takes the value `bookdown::bookdown_site`,^[This function calls `bookdown::render_book()`.] and the output format functions can be used in the `output` field, e.g.,

```yaml
---
site: "bookdown::bookdown_site"
output:
  bookdown::gitbook:
    lib_dir: "book_assets"
  bookdown::pdf_book:
    keep_tex: yes
---
```

Then you can click the `Build Book` button in the `Build` pane in RStudio to compile the Rmd files into a book, or click the `Knit` button on the toolbar to preview the current chapter.

More **bookdown** configuration options in `_bookdown.yml` are explained in Section \@ref(configuration). Besides these configurations, you can also specify some Pandoc-related configurations in the YAML metadata of the _first_ Rmd file of the book, such as the title, author, and date of the book, etc. For example:

```yaml
--- 
title: "Authoring A Book with R Markdown"
author: "Yihui Xie"
date: "`r '\x60r Sys.Date()'``"
site: "bookdown::bookdown_site"
output:
  bookdown::gitbook: default
documentclass: book
bibliography: ["book.bib", "packages.bib"]
biblio-style: apalike
link-citations: yes
---
```

## Two rendering approaches {#new-session}

Merging all chapters into one Rmd file and knitting it is one way to render the book in **bookdown**. There is actually another way: you may knit each chapter in a _separate_ R session, and **bookdown** will merge the Markdown output of all chapters to render the book. We call these two approaches "Merge and Knit" (M-K) and "Knit and Merge" (K-M), respectively. The differences between them may seem subtle, but can be fairly important depending on your use cases.

- The most significant difference is that M-K runs _all_ code chunks in all chapters in the same R session, whereas K-M uses separate R sessions for individual chapters. For M-K, the state of the R session from previous chapters is carried over to later chapters (e.g., objects created in previous chapters are available to later chapters, unless you deliberately deleted them); for K-M, all chapters are isolated from each other.^[Of course, no one can stop you from writing out some files in one chapter, and reading them in another chapter. It is hard to isolate these kinds of side-effects.] If you want each chapter to compile from a clean state, use the K-M approach. It can be very tricky and difficult to restore a running R session to a completely clean state if you use the M-K approach. For example, even if you detach/unload packages loaded in a previous chapter, R will not clean up the S3 methods registered by these packages.
- Because **knitr** does not allow duplicate chunk labels in a source document, you need to make sure there are no duplicate labels in your book chapters when you use the M-K approach, otherwise **knitr** will signal an error when knitting the merged Rmd file. Note that this means there must not be duplicate labels throughout the whole book. The K-M approach only requires no duplicate labels within any single Rmd file.
- K-M does not allow Rmd files to be in subdirectories, but M-K does.

The default approach in **bookdown** is M-K. To switch to K-M, you either use the argument `new_session = TRUE` when calling `render_book()`, or set `new_session: yes` in the configuration file `_bookdown.yml`.

You can configure the `book_filename` option in `_bookdown.yml` for the K-M approach, but it should be a Markdown filename, e.g., `_main.md`, although the filename extension does not really matter, and you can even leave out the extension, e.g., just set `book_filename: _main`. All other configurations work for both M-K and K-M.

## Some tips

Typesetting under the paging constraint (e.g., for LaTeX/PDF output) can be an extremely tedious and time-consuming job. I'd recommend you not to look at your PDF output frequently, since most of the time you are very unlikely to be satisfied: text may overflow into the page margin, figures may float too far away, and so on. Do not try to make things look right _immediately_, because you may be disappointed over and over again as you keep on revising the book, and things may be messed up again even if you only made some minor changes (see <http://bit.ly/tbrLtx> for a nice illustration).

If you want to preview the book, preview the HTML output. Work on the PDF version after you have finished the content of the book, and are very sure no major revisions will be required.

If certain code chunks in your R Markdown documents are time-consuming to run, you may cache them by adding the chunk option `cache = TRUE` in the chunk header, and you are recommended to label such code chunks as well, e.g.,

````markdown
`r ''````{r important-computing, cache=TRUE}
````

In Chapter \@ref(editing), we will talk about how to quickly preview a book as you edit . In short, you can use the `preview_chapter()` function to render a single chapter instead of the whole book. The function `serve_book()` makes it easy to live-preview HTML book pages: whenever you modify an Rmd file, the book can be recompiled and the browser can be automatically refreshed accordingly.