---
title: "Crowd sourced benchmarks"
author: "Colin Gillespie"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Crowd sourced benchmarks}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
---

```{r echo=FALSE, purl=FALSE}
library("benchmarkme")
data(sample_results, package = "benchmarkme")
res = sample_results
```


# System benchmarking 

R benchmarking made easy. The package contains a number of benchmarks, heavily based on the benchmarks at https://mac.R-project.org/benchmarks/R-benchmark-25.R, for assessing 
the speed of your system. 

## Overview 

A straightforward way of speeding up your analysis is to buy a better computer. Modern
desktops are relatively cheap, especially compared to user time. However, it isn't
always clear whether upgrading your computer is worth the cost. The **benchmarkme** package
provides a set of benchmarks to help quantify your system's performance. More importantly, it allows
you to compare your timings with _other_ systems.

<!-- You can view past benchmarks via the [Shiny](https://jumpingrivers.shinyapps.io/benchmarkme/) interface. -->

## Getting started

The package is on [CRAN](https://cran.r-project.org/package=benchmarkme) and can be installed in the usual way:

```{r, eval=FALSE}
install.packages("benchmarkme")
```

There are two groups of benchmarks:

  * `benchmark_std()`: this benchmarks numerical operations such as loops and matrix operations. The benchmark consists
  of three separate benchmarks: `prog`, `matrix_fun`, and `matrix_cal`.
  * `benchmark_io()`: this benchmarks reading and writing a 5MB or 50MB CSV file.

### The benchmark_std() function

This function benchmarks numerical operations such as loops and matrix operations.
It consists of three separate benchmarks: `prog`, `matrix_fun`, and `matrix_cal`.
If you have less than 3GB of RAM (run `get_ram()` to find out how much is
available on your system), then you should close any memory-hungry applications, e.g.
Firefox, and set `runs = 1` as an argument.

To benchmark your system, use

```{r eval=FALSE}
library("benchmarkme")
## Increase runs if you have a higher spec machine
res = benchmark_std(runs = 3)
```
and upload your results

```{r, eval=FALSE}
## You can control exactly what is uploaded. See details below.
upload_results(res)
```

You can compare your results to other users via

```{r eval=FALSE}
plot(res)
```

<!-- You can also compare your results using the [Shiny](https://jumpingrivers.shinyapps.io/benchmarkme/) interface.  -->
<!-- Simply create a results bundle -->
<!-- ```{r, eval=FALSE} -->
<!-- create_bundle(res, filename = "results.rds") -->
<!-- ``` -->
<!-- and upload to the webpage. -->

### The benchmark_io() function

This function benchmarks reading and writing a 5MB or 50MB CSV file. If you have less than 4GB of RAM, reduce the number
of `runs` to 1. Run the benchmark using
```{r eval=FALSE}
res_io = benchmark_io(runs = 3)
upload_results(res_io)
plot(res_io)
```
By default, the files are written to a temporary directory generated by
```{r eval=FALSE}
tempdir()
```
which depends on the value of 
```{r eval=FALSE}
Sys.getenv("TMPDIR")
```
You can alter this via the `tmpdir` argument. This is useful for comparing
hard drive access to a network drive.
```{r eval=FALSE}
res_io = benchmark_io(tmpdir = "some_other_directory")
```
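
To illustrate what such a comparison measures, here is a minimal base R sketch that times one CSV write and read in a chosen directory. The helper `time_csv_io()` and its arguments are hypothetical and not part of **benchmarkme**. The data size is arbitrary rather than the 5MB or 50MB files used by `benchmark_io()`.

```{r eval=FALSE}
## Hypothetical helper (not part of benchmarkme): time one CSV write/read cycle
time_csv_io = function(dir = tempdir(), n_rows = 1e4, n_cols = 10) {
  ## Random numeric data; the size is illustrative only
  m = matrix(rnorm(n_rows * n_cols), nrow = n_rows, ncol = n_cols)
  df = as.data.frame(m)
  ## Create the file inside the directory being tested and clean up afterwards
  fname = tempfile(tmpdir = dir, fileext = ".csv")
  on.exit(unlink(fname), add = TRUE)
  write_time = system.time(write.csv(df, fname, row.names = FALSE))[["elapsed"]]
  read_time = system.time(read.csv(fname))[["elapsed"]]
  c(write = write_time, read = read_time)
}

## Elapsed seconds for writing and reading in the default temporary directory
timings = time_csv_io(dir = tempdir())
timings
```

Calling the helper once with a local directory and once with a network-mounted one gives a rough idea of the difference that `benchmark_io(tmpdir = ...)` measures more carefully.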

### Parallel benchmarks

The benchmark functions above have a parallel option: simply specify the number of cores you
want to test. For example, to test using four cores
```{r eval=FALSE}
res = benchmark_std(runs = 3, cores = 4)
```
The process for the parallel benchmarks of the pseudo function `benchmark_x(cores = n)` is:

  - Initialise the parallel environment
  - Start the timer
  - Run job x on cores 1, 2, ..., n simultaneously
  - When __all__ jobs finish, stop the timer
  - Stop the parallel environment

This procedure is repeated `runs` times.
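
The steps above can be sketched with the base **parallel** package. This is only an illustration of the procedure, not the package's actual implementation. The helper `time_parallel()` and the example `job` are hypothetical names.

```{r eval=FALSE}
library("parallel")

## Hypothetical helper (not part of benchmarkme): time `job` on `cores` workers
time_parallel = function(job, cores = 2, runs = 3) {
  elapsed = numeric(runs)
  for (i in seq_len(runs)) {
    ## Initialise the parallel environment
    cl = makeCluster(cores)
    ## Time one copy of the job per core; parLapply() returns only
    ## when all jobs have finished
    elapsed[i] = system.time(
      parLapply(cl, seq_len(cores), function(j, f) f(), f = job)
    )[["elapsed"]]
    ## Stop the parallel environment
    stopCluster(cl)
  }
  elapsed
}

## Illustrative job: invert a random matrix
job = function() {
  m = matrix(rnorm(200 * 200), nrow = 200)
  invisible(solve(m))
}

## One elapsed time (in seconds) per run
res_par = time_parallel(job, cores = 2, runs = 2)
res_par
```

Because the timer stops only once every worker has returned, each recorded time reflects the slowest of the simultaneous jobs.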


## Previous versions of this package

This package was started around 2015. However, multiple changes in the byte compiler
over the last few years have made it very difficult to use previous results, so we have had to
start from scratch.

The previous data can be obtained via

```{r}
data(past_results, package = "benchmarkmeData")
```

## Machine specs

The package has a few useful functions for extracting system specs:

  * RAM: `get_ram()`
  * CPUs: `get_cpu()`
  * BLAS library: `get_linear_algebra()`
  * Is byte compiling enabled: `get_byte_compiler()`
  * General platform info: `get_platform_info()`
  * R version: `get_r_version()`
  
The above functions have been tested on a number of systems. If they don't work
on your system, please raise a [GitHub issue](https://github.com/csgillespie/benchmarkme/issues).

## Uploaded data sets

A summary of the uploaded data sets is available in the [benchmarkmeData](https://github.com/csgillespie/benchmarkme-data) package:
```{r}
data(past_results_v2, package = "benchmarkmeData")
```

One column of this data set contains the unique identifier returned by the
`upload_results()` function.

## What's uploaded

Two objects are uploaded:

1. Your benchmarks from `benchmark_std()` or `benchmark_io()`;
1. A summary of your system information (`get_sys_details()`).

The `get_sys_details()` function returns:

  * `Sys.info()`;
  * `get_platform_info()`;
  * `get_r_version()`;
  * `get_ram()`;
  * `get_cpu()`;
  * `get_byte_compiler()`;
  * `get_linear_algebra()`;
  * `installed.packages()`;
  * `Sys.getlocale()`;
  * The `benchmarkme` version number;
  * Unique ID - used to extract results;
  * The current date.

The function `Sys.info()` includes the user and node names. In the public release
of the data, this information will be removed. If you don't wish to upload certain
information, set the corresponding argument to `FALSE`, e.g.

```{r eval=FALSE}
upload_results(res, args = list(sys_info = FALSE))
```
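
To check what would be sent before uploading, you can call `get_sys_details()` yourself. The sketch below assumes that the elements of `args` are forwarded to `get_sys_details()`, so that it accepts the same `sys_info` argument. That assumption is inferred from the example above and should be checked against the function's help page.

```{r eval=FALSE}
library("benchmarkme")
## Assumption: get_sys_details() accepts the same sys_info argument as `args`
details = get_sys_details(sys_info = FALSE)
## List the components that would be included in an upload
names(details)
```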

---

Development of this package was supported by [Jumping Rivers](https://www.jumpingrivers.com).