# Error messages
```{r, include = FALSE}
knitr::opts_chunk$set(eval = FALSE)
```
An error message should start with a general statement of the problem, then give a concise description of what went wrong.
Consistent use of punctuation and formatting makes errors easier to parse.
This guide assumes that you're using `cli::cli_abort()`.
We are transitioning to this function in the tidyverse because it:
- Makes it easy to generate bulleted lists.
- Uses glue-style interpolation to insert data into the error.
- Supports a wide range of [inline markup](https://cli.r-lib.org/reference/inline-markup.html).
- Provides convenient tools to [chain errors together](https://rlang.r-lib.org/reference/topic-error-chaining.html).
- Can control the [name of the function](https://rlang.r-lib.org/reference/topic-error-call.html) shown in the error.
Much of the advice in this guide still applies if you're using `stop()`, but it will be much more work to generate the message.
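To illustrate the difference, here is a rough sketch of the same check written with `stop()` and with `cli_abort()`.
Both `check_size_base()` and `check_size_cli()` are hypothetical helpers invented for this comparison, and the sketch assumes that the cli and rlang packages are installed.

```{r}
library(cli)

# Hypothetical validator written with base R only.
check_size_base <- function(x, size) {
  if (length(x) != size) {
    # With stop(), the message has to be pasted together by hand.
    stop(
      "`x` must have length ", size, ", not length ", length(x), ".",
      call. = FALSE
    )
  }
  invisible(x)
}

# The same hypothetical validator written with cli.
check_size_cli <- function(x, size) {
  if (length(x) != size) {
    # Braces interpolate values glue-style; `.arg` marks up the argument name.
    cli_abort("{.arg x} must have length {size}, not length {length(x)}.")
  }
  invisible(x)
}
```

The cli version keeps the whole sentence in one string, which tends to be easier to read and to review.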
<!--# Need a brief description of the overall structure of an error here, as a guide to the following sections -->
## Problem statement
Every error message should start with a general statement of the problem.
It should be concise but informative (this is hard!).
The problem statement should use sentence case and end with a full stop.
- If the cause of the problem is clear (e.g. an incorrect type or size), use "**must**":
```{r}
dplyr::nth(1:10, "x")
#> Error:
#> ! `n` must be a numeric vector, not a character vector.
dplyr::nth(1:10, 1:2)
#> Error:
#> ! `n` must have length 1, not length 2.
```
Do your best to tell the user both what is expected ("a numeric vector") and what they actually provided ("a character vector").
- If you cannot state what was expected, use "**can't**":
```{r}
mtcars %>% pull(b)
#> Error:
#> ! Can't find column `b` in `.data`.
as_vector(environment())
#> Error:
#> ! Can't coerce `.x` to a vector.
purrr::modify_depth(list(list(x = 1)), 3, ~ . + 1)
#> Error:
#> ! Can't find specified `.depth` in `.x`.
```
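The two phrasings can sit side by side in one function.
The sketch below is only an illustration: `pull_column()` is a hypothetical helper (not the dplyr implementation of `pull()`), and it assumes that cli and rlang are installed.

```{r}
library(cli)

# Hypothetical column lookup used only to illustrate "must" and "can't".
pull_column <- function(.data, name) {
  if (!is.character(name) || length(name) != 1) {
    # The cause is clear, so state what is expected and what was supplied.
    cli_abort(paste0(
      "{.arg name} must be a single string, ",
      "not a {.cls {class(name)[[1]]}} vector of length {length(name)}."
    ))
  }
  if (!name %in% names(.data)) {
    # No expectation can be stated here, so fall back to "can't".
    cli_abort("Can't find column {.code {name}} in {.arg .data}.")
  }
  .data[[name]]
}
```

Calling `pull_column(mtcars, 1)` should take the "must" branch, while `pull_column(mtcars, "b")` should take the "can't" branch.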
<!--# Are there other forms that we use? Would be good to have some examples -->
## Error location
Ideally the error message should mention the failing function call.
<!--# I think it'd be useful to include a small example here -->
See <https://rlang.r-lib.org/reference/topic-error-call.html> for more about how to pass calls through error helpers.
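As a minimal sketch of that pattern, assuming cli and rlang are installed: an internal checker accepts a `call` argument and forwards it to `cli_abort()`, so that the error is attributed to the user-facing function.
Both `check_string()` and `greet()` are hypothetical names used for illustration.

```{r}
library(cli)

# Hypothetical internal helper shared by several user-facing functions.
# `call` defaults to the caller's environment so that the error is
# attributed to the calling function rather than to `check_string()`.
check_string <- function(x, call = rlang::caller_env()) {
  if (!is.character(x) || length(x) != 1) {
    cli_abort("{.arg x} must be a single string.", call = call)
  }
  invisible(x)
}

# Hypothetical user-facing function.
greet <- function(x) {
  check_string(x)
  paste0("Hello, ", x, "!")
}

# Capture the error instead of halting, then inspect the recorded call.
err <- tryCatch(greet(1), error = function(e) e)
conditionCall(err)
```

The recorded call should refer to `greet()`, which is the function the user actually typed.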
## Error details
After the problem statement, use a bulleted list to provide further information.
Use cross bullets (`x`) to let the user know what the problem is, then use info bullets (`i`) to provide contextual information.
These are easy to create with `cli_abort()`: <!--# Include brief example here -->
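A minimal sketch of such a call, assuming cli and rlang are installed.
`slice_one()` is a hypothetical helper written for this example; it is not how `vctrs::vec_slice()` is implemented.

```{r}
library(cli)

# Hypothetical subsetting helper used only to illustrate bullet types.
slice_one <- function(x, i) {
  n <- length(x)
  if (i > n) {
    cli_abort(c(
      # Unnamed element: the problem statement.
      "Can't subset elements past the end.",
      # Cross bullet: what the problem is.
      "x" = "Location {i} doesn't exist.",
      # Info bullet: context; cli pluralises "element" from `n`.
      "i" = "There are only {n} element{?s}."
    ))
  }
  x[[i]]
}
```

The names of the character vector (`"x"`, `"i"`) select the bullet type, so the structure of the message is visible in the code.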
<!--# Can you reorganise to show cross bullets then info bullets -->
Try to keep the sentences short and sweet:
```{r}
# Good
vec_slice(letters, 100)
#> ! Can't subset elements past the end.
#> ℹ Location 100 doesn't exist.
#> ℹ There are only 26 elements.
# Bad
vec_slice(letters, 100)
#> ! Must index an existing element.
#> There are 26 elements and you've tried to subset element 100.
```
Do your best to reveal the location, name, and/or content of the troublesome component of the input.
The goal is to make it as easy as possible for the user to find and fix the problem.
```{r}
# Good
map_int(1:5, ~ "x")
#> Error:
#> ! Each result must be a single integer.
#> ✖ Result 1 is a character vector.
# Bad
map_int(1:5, ~ "x")
#> Error:
#> ! Each result must be a single integer
```
(It is often not easy to identify the exact problem; it may require passing around extra arguments so that error messages generated at a lower-level can know the original source. For frequently used functions, the effort is typically worth it.)
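One possible way to pass that extra information around is sketched below, assuming cli and rlang are installed.
The low-level checker takes `arg` and `call` parameters so that it can name the user's argument and the user-facing function; `check_number()` and `scale_by()` are hypothetical names.

```{r}
library(cli)

# Hypothetical low-level checker. `arg` and `call` are threaded through so
# the message can name the caller's argument and the caller itself.
check_number <- function(x,
                         arg = rlang::caller_arg(x),
                         call = rlang::caller_env()) {
  if (!is.numeric(x) || length(x) != 1) {
    cli_abort(
      "{.arg {arg}} must be a single number, not a {.cls {class(x)[[1]]}} vector.",
      call = call
    )
  }
  invisible(x)
}

# Hypothetical user-facing function with its own argument name.
scale_by <- function(data, factor) {
  check_number(factor)
  data * factor
}
```

With this setup, `scale_by(1:3, "a")` should complain about `factor`, the name the user knows, rather than about the checker's internal `x`.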
If the source of the error is unclear, avoid offering an opinion about it, because a wrong guess points the user in the wrong direction:
```{r}
# Good
pull(mtcars, b)
#> Error:
#> ! Can't find column `b` in `.data`.
tibble(x = 1:2, y = 1:3, z = 1)
#> Error:
#> ! Tibble columns must have compatible sizes.
#> • Size 2: Existing data.
#> • Size 3: Column `y`.
#> ℹ Only values of size one are recycled.
# Bad: implies one argument at fault
pull(mtcars, b)
#> Error:
#> ! Column `b` must exist in `.data`.
pull(mtcars, b)
#> Error:
#> ! `.data` must contain column `b`.
tibble(x = 1:2, y = 1:3, z = 1)
#> Error:
#> ! Column `x` must be length 1 or 3, not 2.
```
If there are multiple issues, or an inconsistency revealed across several arguments or items, prefer a bulleted list:
```{r}
# Good
purrr::reduce2(1:4, 1:2, `+`)
#> Error:
#> ! `.x` and `.y` must have compatible lengths:
#> ✖ `.x` has length 4.
#> ✖ `.y` has length 2.
# Bad: harder to scan
purrr::reduce2(1:4, 1:2, `+`)
#> Error:
#> ! `.x` and `.y` must have compatible lengths: `.x` has length 4 and
#> `.y` has length 2
```
If the list of issues might be long, make sure to truncate to only show the first few:
```{r}
# Good
#> Error: NAs found at 1,000,000 locations: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ...
```
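One way to truncate is sketched below in base R.
`format_locations()` is a hypothetical helper, and the default cutoff of 10 is an arbitrary choice that mirrors the example above.

```{r}
# Hypothetical helper: show at most `max` locations, then an ellipsis.
format_locations <- function(loc, max = 10) {
  shown <- paste(head(loc, max), collapse = ", ")
  if (length(loc) > max) {
    # Signal that more locations exist without listing them all.
    shown <- paste0(shown, ", ...")
  }
  shown
}

na_locations <- which(is.na(c(NA, 1, NA, NA)))
format_locations(na_locations, max = 2)
```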
If you want to correctly pluralise the error message, consider using `ngettext()`.
See the notes in `?ngettext()` for some challenges related to correct translation to other languages.
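A small sketch of `ngettext()` in use; `na_message()` is a hypothetical helper, and no translation catalogue is assumed, so the English strings are returned unchanged.

```{r}
# Hypothetical message builder: ngettext() picks the singular or plural
# template, and sprintf() fills in the count.
na_message <- function(n) {
  sprintf(
    ngettext(n, "Found %d missing value.", "Found %d missing values."),
    n
  )
}

na_message(1L)
na_message(3L)
```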
## Hints
If the source of the error is clear and common, you may want to provide a hint as to how to fix it.
The hint should be the last bullet, use an info bullet (`i`), and end in a question mark.
```{r}
dplyr::filter(iris, Species = "setosa")
#> Error:
#> ! Filter specifications must not be named.
#> ℹ Did you mean `Species == "setosa"`?
ggplot2::ggplot(ggplot2::aes())
#> Error:
#> ! Can't plot data with class "uneval".
#> ℹ Did you accidentally provide the results of `aes()` to the `data` argument?
mtcars |> ggplot() |> geom_point()
#> Error in `validate_mapping()`:
#> ! `mapping` must be created by `aes()`.
#> ℹ Did you use `|>` instead of `+`?
```
Hints are particularly important if the source of the error is far away from the root cause:
```{r}
# Bad
mean[[1]]
#> Error:
#> ! object of type 'closure' is not subsettable
# BETTER
mean[[1]]
#> Error:
#> ! Can't subset a function.
# BEST
mean[[1]]
#> Error:
#> ! Can't subset a function.
#> ℹ Have you forgotten to define a variable named `mean`?
```
Good hints are difficult to write because you want to avoid steering users in the wrong direction.
Generally, we avoid writing a hint unless the problem is common, and you can easily find a common pattern of incorrect usage (e.g. by searching Stack Overflow).
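A hint can be attached only when the tell-tale pattern is actually detected.
The sketch below is one way this might look, assuming cli and rlang are installed; `check_conditions()` is a hypothetical helper, not how dplyr implements `filter()`.

```{r}
library(cli)

# Hypothetical filter-like helper: a named input usually means `=` was
# typed where `==` was intended, so the hint is built from that input.
check_conditions <- function(...) {
  # Capture the expressions without evaluating them.
  exprs <- as.list(substitute(list(...)))[-1]
  nms <- names(exprs)
  if (is.null(nms)) {
    nms <- rep("", length(exprs))
  }
  bad <- which(nzchar(nms))
  if (length(bad) > 0) {
    i <- bad[[1]]
    suggestion <- paste0(nms[[i]], " == ", deparse(exprs[[i]]))
    cli_abort(c(
      "Filter specifications must not be named.",
      # The hint is the last bullet, an info bullet, and a question.
      "i" = "Did you mean {.code {suggestion}}?"
    ))
  }
  invisible(TRUE)
}
```

Because the hint is derived from what the user typed, it is less likely to steer them in the wrong direction than a generic suggestion.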
## Punctuation
<!--# This should be spread amongst the relevant sections. -->
- Errors should be written in sentence case, and should end in a full stop.
Bullets should be formatted similarly; make sure to capitalise the first word (unless it's an argument or column name).
- Prefer the singular in problem statements:
```{r}
# Good
map_int(1:2, ~ "a")
#> Error:
#> ! Each result must be coercible to a single integer.
#> ✖ Result 1 is a character vector.
# Bad
map_int(1:2, ~ "a")
#> Error:
#> ! Results must be coercible to single integers.
#> ✖ Result 1 is a character vector.
```
- If you can detect multiple problems, list up to five.
This allows the user to fix multiple problems in a single pass without being overwhelmed by many errors that may have the same source.
<!--# Mention cli helper here -->
```{r}
# BETTER
map_int(1:10, ~ "a")
#> Error:
#> ! Each result must be coercible to a single integer.
#> ✖ Result 1 is a character vector.
#> ✖ Result 2 is a character vector.
#> ✖ Result 3 is a character vector.
#> ✖ Result 4 is a character vector.
#> ✖ Result 5 is a character vector.
#> ... and 5 more problems.
```
- Pick a natural connector between problem statement and error location: this may be ", not", ";", or ":" depending on the context.
- Surround the names of arguments in backticks, e.g. `` `x` ``.
Use "column" to disambiguate columns and arguments: `` Column `x` ``.
Avoid "variable", because it is ambiguous.
<!--# Needs to be updated to refer to cli inline formatting -->
- Ideally, each component of the error message should be less than 80 characters wide.
Do not add manual line breaks to long error messages; they will not look correct if the console is narrower (or much wider) than expected.
Instead, use bullets to break up the error into shorter logical components.
If you do need longer sentences, let cli perform the paragraph wrapping.
It inserts newlines automatically depending on the width of the console.
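The advice above about listing up to five problems can be sketched as follows, assuming cli and rlang are installed.
`check_results()` is a hypothetical checker (not the purrr implementation), and the wording of its bullets is illustrative only.

```{r}
library(cli)

# Hypothetical checker: reports at most `max` failing results, then
# summarises how many more were found.
check_results <- function(results, max = 5) {
  ok <- vapply(results, function(r) is.integer(r) && length(r) == 1, logical(1))
  bad <- which(!ok)
  if (length(bad) == 0) {
    return(invisible(results))
  }

  shown <- head(bad, max)
  types <- vapply(results[shown], typeof, character(1))
  sizes <- lengths(results[shown])
  # One short cross bullet per reported problem.
  bullets <- sprintf("Result %d has type %s and length %d.", shown, types, sizes)
  names(bullets) <- rep("x", length(bullets))

  extra <- length(bad) - length(shown)
  if (extra > 0) {
    # cli pluralises "problem" from the value of `extra`.
    bullets <- c(bullets, " " = "... and {extra} more problem{?s}.")
  }
  cli_abort(c("Each result must be a single integer.", bullets))
}
```

Each bullet stays well under 80 characters, so no manual line breaks are needed.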
## Before and after
More examples gathered from around the tidyverse.
```{r}
dplyr::filter(mtcars, cyl)
#> BEFORE:
#> ! Argument 2 filter condition does not evaluate to a logical vector.
#> AFTER:
#> ! Each argument must be a logical vector.
#> * Argument 2 (`cyl`) is a double vector.
tibble::tribble("x", "y")
#> BEFORE: ! Expected at least one column name; e.g. `~name`
#> AFTER: ! Must supply at least one column name, e.g. `~name`.
ggplot2::ggplot(data = diamonds) + ggplot2::geom_line(ggplot2::aes(x = cut))
#> BEFORE: ! geom_line requires the following missing aesthetics: y
#> AFTER: ! `geom_line()` must have the following aesthetics: `y`.
dplyr::rename(mtcars, cyl = xxx)
#> BEFORE: ! `xxx` contains unknown variables
#> AFTER: ! Can't find column `xxx` in `.data`.
dplyr::arrange(mtcars, xxx)
#> BEFORE: ! Evaluation error: object 'xxx' not found.
#> AFTER: ! Can't find column `xxx` in `.data`.
```
## Localisation
Be as informative as possible, but keep each sentence simple so that localisation and translation remain possible.
[A Localization Horror Story: It Could Happen To You](https://metacpan.org/pod/distribution/Locale-Maketext/lib/Locale/Maketext/TPJ13.pod) is a good summary of the challenges of localising error messages.
You might not support localised messages right now, but you should make it as easy as possible to do so in the future.