# Plot one-dimensional data using quasirandom noise and kernel density

[![Build Status](https://travis-ci.org/sherrillmix/vipor.svg?branch=master)](https://travis-ci.org/sherrillmix/vipor)
[![codecov.io](https://codecov.io/github/sherrillmix/vipor/coverage.svg?branch=master)](https://codecov.io/github/sherrillmix/vipor?branch=master)
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/vipor)](https://cran.r-project.org/package=vipor)

## Introduction

`vipor` (VIolin POints in R) provides a way to plot one-dimensional data (perhaps divided into several categories) by spreading the data points to fill the kernel density. It uses a [van der Corput sequence](http://en.wikipedia.org/wiki/Van_der_Corput_sequence) to space the dots and avoid generating distracting patterns in the data. See the examples below.

Violin scatter plots (also known as column scatter plots, beeswarm plots, or one-dimensional scatter plots) plot points that would ordinarily overlap so that they fall next to each other instead. In addition to reducing overplotting, this helps visualize the density of the data at each point (similar to a violin plot) while still showing each data point individually.
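To give a feel for why a van der Corput sequence spaces points evenly, here is a minimal base R sketch. The helper name `van_der_corput_sketch` is made up for this illustration and is not meant to describe how the package computes the sequence internally.

```r
# Illustrative helper (hypothetical name, not vipor's implementation).
# The i-th van der Corput number is found by writing i in the given base
# and mirroring its digits around the radix point.
van_der_corput_sketch <- function(n, base = 2) {
  vapply(seq_len(n), function(i) {
    value <- 0
    denom <- 1
    while (i > 0) {
      denom <- denom * base
      value <- value + (i %% base) / denom  # lowest digit becomes the first fractional digit
      i <- i %/% base
    }
    value
  }, numeric(1))
}

van_der_corput_sketch(8)
# 0.5000 0.2500 0.7500 0.1250 0.6250 0.3750 0.8750 0.0625
```

Each new value lands in one of the largest remaining gaps, so consecutive points avoid both clumping and the regular stripes that a simple evenly spaced grid would produce.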

## Installation
This package is on CRAN, so installation should be as simple as:

```r
install.packages('vipor')
```

If you want the development version from GitHub, you can do:


```r
devtools::install_github("sherrillmix/vipor")
```

## Examples

### Violin point examples

We use the provided function `offsetX` to generate the x-offsets for plotting.

```r
library(vipor)
# Generate data
set.seed(12345)
dat <- list(rnorm(50), rnorm(500), c(rnorm(100), rnorm(100,5)), rcauchy(100))
names(dat) <- c("Normal", "Dense Normal", "Bimodal", "Extremes")

# Violin points of several distributions
par(mfrow=c(4,1), mar=c(2.5,3.1, 1.2, 0.5),mgp=c(2.1,.75,0),
	cex.axis=1.2,cex.lab=1.2,cex.main=1.2)
sapply(names(dat),function(label) {
	y<-dat[[label]]
	offsets <- list(
		'Default'=offsetX(y),  # Default
		'Adjust=2'=offsetX(y, adjust=2),    # More smoothing
		'Adjust=.1'=offsetX(y, adjust=0.1),  # Tighter fit
		'Width=10%'=offsetX(y, width=0.1)    # Less wide
	)  
	ids <- rep(1:length(offsets), each=length(y))
	plot(unlist(offsets) + ids, rep(y, length(offsets)), ylab='y value',
		xlab='', xaxt='n', pch=21,col='#00000099',bg='#00000033',las=1,main=label)
	axis(1, 1:length(offsets), names(offsets))
})
```

![plot of chunk adjust-examples](tools/adjust-examples-1.png)
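The effect of the smoothing and width settings can be pictured with a rough base R sketch of the general idea: estimate the kernel density at each point, then spread the points left and right by a quasirandom amount scaled by that density. Everything below (the function name `sketch_offsets`, its defaults, and the assignment of sequence values in data order) is an assumption made for illustration, not the package's actual implementation.

```r
# Hypothetical sketch of density-scaled offsets; not vipor's implementation.
sketch_offsets <- function(y, width = 0.4, adjust = 1) {
  # Kernel density estimate, looked up at each data point
  dens <- stats::density(y, adjust = adjust)
  height <- stats::approx(dens$x, dens$y, xout = y)$y
  height <- height / max(height)  # densest point gets relative height 1

  # Base-2 van der Corput numbers in (0, 1), one per point, in data order
  unit <- vapply(seq_along(y), function(i) {
    value <- 0
    denom <- 1
    while (i > 0) {
      denom <- denom * 2
      value <- value + (i %% 2) / denom
      i <- i %/% 2
    }
    value
  }, numeric(1))

  # Map (0, 1) to (-1, 1), then scale by the local density and maximum width
  (2 * unit - 1) * width * height
}

set.seed(12345)
y <- rnorm(200)
off <- sketch_offsets(y)
range(off)  # stays within -width .. width
```

In this sketch a larger `adjust` smooths the density estimate, giving a broader and rounder outline, while a smaller `width` narrows the whole column. Plotting `1 + off` against `y` gives a single violin-shaped column of points.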


### Comparison with other methods

```r
library(beeswarm)
par(mfrow=c(4,1), mar=c(2.5,3.1, 1.2, 0.5),mgp=c(2.1,.75,0),
	cex.axis=1.2,cex.lab=1.2,cex.main=1.2)
sapply(names(dat),function(label) {
	y<-dat[[label]]
	# beeswarm's swarmx needs an open plot, so start an empty one first;
	# xlim is hard-coded to fit the 8 methods compared below
	plot(1,1,type='n',ylab='y value',xlim=c(.5,8+.5),
		ylim=range(y),xlab='', xaxt='n',las=1,main=label)
	offsets <- list(
		'Quasi'=offsetX(y),  # Default
		'Pseudo'=offsetX(y, method='pseudorandom',nbins=100),
		'Frown'=offsetX(y, method='frowney',nbins=20),
		'Smile\n20 bin'=offsetX(y, method='smiley',nbins=20),
		'Smile\n100 bin'=offsetX(y, method='smiley',nbins=100),
		'Smile\nn/5 bin'=offsetX(y, method='smiley',nbins=round(length(y)/5)),
		'Tukey'=offsetX(y, method='tukey'),
		'Beeswarm'=swarmx(rep(0,length(y)),y)$x
	)
	ids <- rep(1:length(offsets), each=length(y))

	points(unlist(offsets) + ids, rep(y, length(offsets)),pch=21,col='#00000099',bg='#00000033')
	par(lheight=.8)
	axis(1, 1:length(offsets), names(offsets),padj=1,mgp=c(0,-.3,0),tcl=-.5)
})
```

![plot of chunk other-methods](tools/other-methods-1.png)

The same comparison using the county land area data from Tukey and Tukey (the `counties` data used below):

```r
par(mar=c(2.5,3.1, 1.2, 0.5),mgp=c(2.1,.75,0))
y<-log10(counties$landArea)
offsets <- list(
  'Quasi'=offsetX(y),  # Default
  'Quasi\nadjust=.25'=offsetX(y,adjust=.25),
  'Pseudo'=offsetX(y, method='pseudorandom',nbins=100),
  'Smile'=offsetX(y, method='smiley'),
  'Smile\nadjust=.25'=offsetX(y, method='smiley',adjust=.25),
  'Tukey'=offsetX(y, method='tukey')
)
ids <- rep(1:length(offsets), each=length(y))
plot(
  unlist(offsets) + ids,
  rep(y, length(offsets)),
  xlab='', ylab='Land area (log10)',
  main='Counties', xaxt='n', las=1,
  pch='.'
)
par(lheight=.8)
axis(1, 1:length(offsets), names(offsets),padj=1,mgp=c(0,-.3,0),tcl=-.5)
```

![plot of chunk methods-county](tools/methods-county-1.png)

------
Authors: Scott Sherrill-Mix and Erik Clarke