1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118
|
---
title: "Parsetools Coding Conventions"
author: "Andrew Redd"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Parsetools Coding Conventions}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
Coding Conventions for the `parsetools` package
===============================================
Functions
---------
All the functions in `parsetools` are concerned with `parse-data` objects.
Functions either obtain the `parse-data`, such as `get_parse_data()`;
convert or transform the parse data, such as `classify_comments()`;
identify elements of the parse data tree, such as `pd_is_function()`;
or navigate the tree, such as `get_parent_id()`.
### Arguments ###
With the exception of obtaining functions and manipulation functions,
all function will either take an argument `pd` as a stand alone argument
which expects a `parse-data` object, or will take the combination of
an `id` and a `pd`, exclusively in that order. The `id` argument is
expected to be an integer of values that exist in `pd$id` which denotes
the node or nodes of interest. Whether id is a singleton or vector
is differentiated by function naming conventions described in the
following sections. If there is a need for additional function arguments
they must occur after the `id` and `pd` arguments.
### Naming Conventions Overview ###
Function names should follow the underscore standard with appropriate prefixes and suffixes,
and conform to proper plurality.
form Meaning Accepts Returns Exported
------------- ------------------------- ---------- --------------------------- ------------
pd_is_* Logical test function id vector logical 1:1 for input id Yes
pd_get_*_id Navigation function id vector id integer 1:1 for input Yes
pd_get_*_ids Set identification single id id vector many:1 for input Yes
get_*_pd Subsetting id(1)+pd subsetted parse-data No
all_*_ids Global sets parse-data id vector many:1 for input No
pd_* Other action function id+pd Depending
### Logical Test Functions ###
Functions of the form `pd_is_<name>` test if the specified id satisfies the
criteria for `<name>`. For example, `pd_is_function(id,pd)` tests if the
id identifies a function expression, namely the token for id is `expr`,
and the token for the firstborn, i.e. child with minimum id, is `FUNCTION`.
These also appear in the form `pd_is_in_<name>`.
Example, `pd_is_in_class_definition` tests if the expression is nested inside
any defined class definition.
### Navigation Functions ###
Functions that get a single id relative to the current id follow the format
`pd_get_<name>_id`, they may accept a vector of inputs and return a vector
of outputs, with each element of the output vector corresponding respectively
to the input vector.
### Set Identification Functions ###
Function that of the form `pd_get_<name>_ids` are set identification functions.
They take a single node, and only a single node and return a set of nodes
relative to the given node.
## Shortcut functions
There are several functions that are shortcuts for internal expressions.
These should not be exported but may use the shortcut of defining the function
to infer the `pd` argument from the parent function environment,
`pd=get('pd', parent.frame())`.
Exported functions should not utilize this shortcut. The shortcut functions
are renamed with the following conventions:
- `pd_is_<name>` → `is_<name>`
- `pd_get_<name>_id` → `<name>`
- `pd_get_<name>_ids` → `<name>s` or the appropriate plural form.
## Error Checking
Exported functions should perform error checking on arguments.
This can be made optional by the .check argument, and when used internally
the checking should be turned off.
## Tests
In testing code blocks results of output should be directly in the `expect_*`
function, or stored in an object called `test.object`.
## Vectorization
Functions that return a single id value for each input should vectorize over input.
When possible keep the shortest form, if explicit vectorization is needed use:
if (length(id) > 1L) return(sapply(id, <function_name>, pd=pd, <other arguments>, .check=FALSE))
Those that return multiple values for each input id accept only only singleton ids.
These functions should check that the input is of length 1.
|