# Awkward Arrays of vectors

First, [install](../index.md#installation) and import Vector and [Awkward Array](https://awkward-array.org/).

In [1]:
import vector
import awkward as ak

## Making an Awkward Array of vectors

Awkward Arrays are arrays with more complex data structures than NumPy allows, such as variable-length lists, nested records, missing and even heterogeneous data (different data types in the same array).

Vectors can be included among those data structures. In this context, vectors are Awkward "records," objects with named fields, that can be nested inside of other structures. The vector properties and methods are implemented through Awkward Array's [behavior](https://awkward-array.org/doc/main/reference/ak.behavior.html) mechanism. Unlike [vector objects](object.md) and [NumPy subclasses](numpy.md), the vectors can't be ordinary Python classes because they might be nested within other data structures, such as variable-length lists, and these lists are implemented in a columnar way that isn't open to Python's introspection.

Let's start with an example. Below, we create an Awkward Array using its [ak.Array](https://awkward-array.org/doc/main/reference/generated/ak.Array.html) constructor, but include `with_name` and `behavior` arguments:

In [2]:
arr = ak.Array(
    [
        [{"x": 1.1, "y": 2.2}, {"x": 3.3, "y": 4.4}],
        [],
        [{"x": 5.5, "y": 6.6}],
    ],
    with_name="Vector2D",
    behavior=vector.backends.awkward.behavior,
)
arr

The above array contains 3 lists: the first has length 2, the second has length 0, and the third has length 1. The lists contain records with field names `"x"` and `"y"`, and the record type is named `"Vector2D"`. In addition, this array has `behavior` from `vector.backends.awkward.behavior`, which is a large dict containing the classes and functions that implement vector operations.

For instance, we can compute `rho` and `phi` coordinates in the same way as with the [NumPy subclasses](numpy.md), an array at a time:

In [3]:
arr.rho

In [4]:
arr.phi

As with NumPy, performing operations an array at a time is usually much faster than writing Python `for` loops. What Awkward Array provides on top of that is the ability to do these operations _through_ variable-length lists and other structures.

An Awkward Array needs all of the following for its records to be interpreted as vectors:

1. the record name, which can be assigned using [ak.with_name](https://awkward-array.org/doc/main/reference/generated/ak.with_name.html) or as a constructor argument, must be one of `"Vector2D"`, `"Momentum2D"`, `"Vector3D"`, `"Momentum3D"`, `"Vector4D"`, and `"Momentum4D"`
2. the field names must be recognized coordinate names, following the same conventions as [vector objects](object.md)
3. the array must have `vector.backends.awkward.behavior` as its `behavior`.

When Awkward Arrays are saved in files, such as with [ak.to_parquet](https://awkward-array.org/doc/main/reference/generated/ak.to_parquet.html), they retain their record names and field names, so conditions 1 and 2 above are persistent. They don't preserve condition 3, the behaviors, since these are Python classes and functions.

To make sure that Vector behaviors are always available, you can call [vector.register_awkward](make_awkward.md#vector.register_awkward) at the beginning of every script, like this:

```python
import awkward as ak
import vector
vector.register_awkward()
```

This function copies Vector's behaviors into Awkward's global [ak.behavior](https://awkward-array.org/doc/main/reference/ak.behavior.html) so that any array with the right record and field names (such as one read from a file) automatically has Vector behaviors.

Vector also has a [vector.Array](make_awkward.md#vector.Array) constructor, which works like [ak.Array](https://awkward-array.org/doc/main/reference/generated/ak.Array.html) but sets `with_name` automatically, as well as [vector.zip](make_awkward.md#vector.zip), which works like [ak.zip](https://awkward-array.org/doc/main/reference/generated/ak.zip.html) and sets `with_name` automatically. However, these functions still require you to set field names appropriately. If you need to do something complex, it's easier to use Awkward Array's own functions and assign the record name after the array is built, using [ak.with_name](https://awkward-array.org/doc/main/reference/generated/ak.with_name.html).

## Using an Awkward Array of vectors

First, let's make some arrays to use in examples:

In [5]:
import numpy as np
import awkward as ak
import vector

vector.register_awkward()

In [6]:
def array_of_momentum3d(num_vectors):
    return ak.zip(
        {
            "px": np.random.normal(0, 1, num_vectors),
            "py": np.random.normal(0, 1, num_vectors),
            "pz": np.random.normal(0, 1, num_vectors),
        },
        with_name="Momentum3D",
    )


def array_of_lists_of_momentum3d(mean_num_per_list, num_lists):
    num_per_list = np.random.poisson(mean_num_per_list, num_lists)
    return ak.unflatten(
        array_of_momentum3d(np.sum(num_per_list)),
        num_per_list,
    )


a = array_of_momentum3d(10)
b = array_of_lists_of_momentum3d(1.5, 10)

In [7]:
a

In [8]:
b

Awkward Array uses array-at-a-time functions like NumPy, so if we want to compute dot products of each vector in `a` with every vector of each list in `b`, we'd say:

In [9]:
a.dot(b)

Note that `a` and `b` have different numbers of vectors, but the same array lengths. The operation above [broadcasts](https://awkward-array.org/doc/main/user-guide/how-to-math-broadcasting.html) array `a` into `b`, like the following code:

In [10]:
for i in range(len(a)):
    print("[", end="")

    for j in range(len(b[i])):
        out = a[i].dot(b[i, j])

        print(out, end=" ")

    print("]")

[0.3470024410105361 -1.32262374856204 ]
[]
[-0.0350058354646221 -0.3845727581312185 ]
[-0.05513156034267678 0.1939866533163302 2.2806023784977056 -2.861051860111839 -2.3860250251022475 ]
[-0.17338156209593328 -0.7816799910864897 ]
[-0.8071770038569903 0.7044860439866077 ]
[-0.1279745100114199 ]
[-1.3483450919978617 -0.44626327125953613 ]
[0.9449043237160631 ]
[-1.1124522691465186 -5.579496184145224 ]


As with NumPy, the array-at-a-time expression is more concise and faster than the explicit loops:

In [11]:
a = array_of_momentum3d(10000)
b = array_of_lists_of_momentum3d(1.5, 10000)

In [12]:
%%timeit -n1 -r1

out = np.zeros(10000)

for i in range(len(a)):
    for j in range(len(b[i])):
        out[i] += a[i].dot(b[i, j])

9.08 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)


In [13]:
%%timeit

out = np.sum(a.dot(b), axis=1)

2.44 ms ± 20.2 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)


(Note the units: the explicit loops took seconds, whereas the array-at-a-time expression took milliseconds.)

Just as with NumPy, all of the coordinate transformations and vector operations are implemented for Awkward Arrays of vectors.

## Some troubleshooting hints

Make sure that the Vector behaviors are actually installed and applied to your data. In the data type, the record type should appear as `"Vector2D"`, `"Momentum2D"`, `"Vector3D"`, `"Momentum3D"`, `"Vector4D"`, or `"Momentum4D"`, rather than the generic curly brackets `{` and `}`. If you extract one record from the array, can you perform a vector operation on it?

Make sure that your arrays broadcast the way that you want them to. If the vector behaviors are clouding the picture, make simpler arrays with numbers in place of records. Can you add them with `+`? (Addition uses the same broadcasting rules as all other operations.)

If your code runs but doesn't give the results you expect, try slicing the arrays to just the first two items with `arr[:2]`. Step through the calculation on just two elements, observing the results of each operation. Are they what you expect?

## Advanced: subclassing Awkward-Vector behaviors

It is possible to write subclasses of the Awkward-Vector behaviors as mixins to extend Vector's functionality. For instance, the `MomentumAwkward` classes can be extended in the following way:

In [14]:
behavior = vector.backends.awkward.behavior


@ak.mixin_class(behavior)
class TwoVector(vector.backends.awkward.MomentumAwkward2D):
    pass


@ak.mixin_class(behavior)
class ThreeVector(vector.backends.awkward.MomentumAwkward3D):
    pass


# required for transforming vectors
# the class names must always end with "Array"
TwoVectorArray.ProjectionClass2D = TwoVectorArray  # noqa: F821
TwoVectorArray.ProjectionClass3D = ThreeVectorArray  # noqa: F821
TwoVectorArray.MomentumClass = TwoVectorArray  # noqa: F821

ThreeVectorArray.ProjectionClass2D = TwoVectorArray  # noqa: F821
ThreeVectorArray.ProjectionClass3D = ThreeVectorArray  # noqa: F821
ThreeVectorArray.MomentumClass = ThreeVectorArray  # noqa: F821

In [15]:
vec = ak.zip(
    {
        "pt": [[1, 2], [], [3], [4]],
        "phi": [[1.2, 1.4], [], [1.6], [3.4]],
    },
    with_name="TwoVector",
    behavior=behavior,
)
vec

The binary operators are not automatically registered by Awkward, but Vector methods can be used to perform operations on subclassed vectors.

In [16]:
vec.add(vec)

Similarly, other vector methods can be used by the new methods internally.

In [17]:
import numbers

In [18]:
@ak.mixin_class(behavior)
class LorentzVector(vector.backends.awkward.MomentumAwkward4D):
    @ak.mixin_class_method(np.divide, {numbers.Number})
    def divide(self, factor):
        return self.scale(1 / factor)


# required for transforming vectors
# the class names must always end with "Array"
LorentzVectorArray.ProjectionClass2D = TwoVectorArray  # noqa: F821
LorentzVectorArray.ProjectionClass3D = ThreeVectorArray  # noqa: F821
LorentzVectorArray.ProjectionClass4D = LorentzVectorArray  # noqa: F821
LorentzVectorArray.MomentumClass = LorentzVectorArray  # noqa: F821

In [19]:
vec = ak.zip(
    {
        "pt": [[1, 2], [], [3], [4]],
        "eta": [[1.2, 1.4], [], [1.6], [3.4]],
        "phi": [[0.3, 0.4], [], [0.5], [0.6]],
        "energy": [[50, 51], [], [52], [60]],
    },
    with_name="LorentzVector",
    behavior=behavior,
)
vec

In [20]:
vec / 2

In [21]:
vec.like(vector.obj(x=1, y=2))

In [22]:
vec.like(vector.obj(x=1, y=2, z=3))

It is also possible to enable binary operators by manually adding them to Vector's behavior dict.

In [23]:
_binary_dispatch_cls = {
    "TwoVector": TwoVector,
    "ThreeVector": ThreeVector,
    "LorentzVector": LorentzVector,
}
_rank = [TwoVector, ThreeVector, LorentzVector]

for lhs, lhs_to in _binary_dispatch_cls.items():
    for rhs, rhs_to in _binary_dispatch_cls.items():
        out_to = min(lhs_to, rhs_to, key=_rank.index)
        behavior[(np.add, lhs, rhs)] = out_to.add
        behavior[(np.subtract, lhs, rhs)] = out_to.subtract

In [24]:
vec + vec

In [25]:
vec.to_2D() + vec.to_2D()

Finally, instead of manually registering the superclass ufuncs, one can use the `copy_behaviors` utility function to copy behavior items for a new subclass:

In [26]:
behavior.update(ak._util.copy_behaviors("Vector2D", "TwoVector", behavior))
behavior.update(ak._util.copy_behaviors("Vector3D", "ThreeVector", behavior))
behavior.update(ak._util.copy_behaviors("Momentum4D", "LorentzVector", behavior))

In [27]:
vec + vec

In [28]:
vec.to_2D() + vec.to_2D()