Grouping by multiple arrays with Xarray

Monday, September 2nd, 2024



TLDR#

Xarray now supports grouping by multiple variables (docs). 🎉 😱 🤯 🥳. Try it out!

How do I use it?#

Install xarray>=2024.09.0 and optionally flox for better performance with reductions.

Simple example#

Simple grouping by multiple categorical variables is easy:

import numpy as np
import xarray as xr
from xarray.groupers import UniqueGrouper

# Two label arrays along the same dimension "d"
da = xr.DataArray(
    np.array([1, 2, 3, 0, 2, np.nan]),
    dims="d",
    coords=dict(
        labels1=("d", np.array(["a", "b", "c", "c", "b", "a"])),
        labels2=("d", np.array(["x", "y", "z", "z", "y", "x"])),
    ),
)

# Pass a list of names to group by both variables at once
gb = da.groupby(["labels1", "labels2"])
gb

<DataArrayGroupBy, grouped over 2 grouper(s), 9 groups in total:
	'labels1': 3 groups with labels 'a', 'b', 'c'
	'labels2': 3 groups with labels 'x', 'y', 'z'>

Reductions work as usual:

gb.mean()

So does map:

gb.map(lambda x: x[0])


More complex time grouping#

Grouping by multiple virtual variables (datetime components such as "time.year" and "time.month") is also supported:

import xarray as xr

# Downloads the tutorial dataset on first use
ds = xr.tutorial.open_dataset("air_temperature")
ds.groupby(["time.year", "time.month"]).mean()


Multiple Grouper types#

The above syntax da.groupby(["labels1", "labels2"]) is a shortcut for using Grouper objects:

da.groupby(labels1=UniqueGrouper(), labels2=UniqueGrouper())


Grouper objects allow you to express more complicated GroupBy problems. For example, you can combine different grouper types: categorical grouping with UniqueGrouper, binning with BinGrouper, and resampling with TimeResampler.

from xarray.groupers import BinGrouper

ds = xr.Dataset(
    {"foo": (("x", "y"), np.arange(12).reshape((4, 3)))},
    coords={"x": [10, 20, 30, 40], "letters": ("x", list("abba"))},
)
# Bin the numeric coordinate "x" and group "letters" by unique value
gb = ds.groupby(x=BinGrouper(bins=[5, 15, 25]), letters=UniqueGrouper())
gb

<DatasetGroupBy, grouped over 2 grouper(s), 4 groups in total:
	'x_bins': 2 groups with labels (5,, 15], (15,, 25]
	'letters': 2 groups with labels 'a', 'b'>

Now reduce as usual:

gb.mean()


© 2024, Xarray core developers. Apache 2.0 Licensed.
