CuPy-Xarray: Xarray on GPUs!

Wednesday, January 17th, 2024



TLDR

The CuPy-Xarray project makes it convenient to mix GPU acceleration into Xarray workflows! Explore the new documentation and tutorials to see how CuPy-Xarray enables GPU acceleration on large multidimensional datasets. 🎉 🥳 🚀

Background

What is CuPy-Xarray?

CuPy is a GPU-accelerated library for numerical computations. CuPy provides a NumPy-like array object -- a duck array -- that follows various standard array protocols and executes computations on CUDA-capable devices. Xarray can wrap duck array objects (i.e. NumPy-like arrays) that follow specific protocols.

Thus Xarray can handle CuPy arrays, and cupy-xarray provides a number of useful methods under the xarray_object.cupy namespace, allowing seamless transition between CPU and GPU computations in your data pipeline.
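
As a rough illustration of that CPU-to-GPU round trip, here is a minimal sketch. It assumes that importing cupy_xarray registers the .cupy accessor with as_cupy() and the is_cupy property, and it uses Xarray's own as_numpy() to bring the result back to the host. The array contents and the names da, da_work, and result are made up for this example. When no CUDA device (or no CuPy install) is found, the sketch simply stays on NumPy so that it still runs.

```python
import numpy as np
import xarray as xr

# CuPy and cupy-xarray are optional here: fall back to NumPy when no GPU is usable.
try:
    import cupy as cp
    import cupy_xarray  # noqa: F401  (assumed to register the `.cupy` accessor)

    gpu_available = cp.cuda.runtime.getDeviceCount() > 0
except Exception:
    gpu_available = False

# A small NumPy-backed DataArray living in host (CPU) memory.
da = xr.DataArray(np.arange(12.0).reshape(3, 4), dims=("x", "y"), name="demo")

if gpu_available:
    # Move the underlying data to the GPU; dims, coords, and attrs are kept.
    da_work = da.cupy.as_cupy()
    print("backed by CuPy:", da_work.cupy.is_cupy)
else:
    da_work = da

# The same Xarray code runs on either backend.
doubled_mean = (da_work * 2).mean("x")

# Bring the result back to host memory as a NumPy-backed DataArray.
result = doubled_mean.as_numpy()
print(result.values)
```

The point of the sketch is that only the transfer steps are GPU-specific; the computation in the middle is ordinary Xarray code.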

Why is this important?

GPU acceleration is becoming increasingly important in scientific research, data analysis, and AI/ML techniques due to its ability to perform massively parallel computations. GPUs can greatly accelerate the processing of array datasets, allowing for faster analysis and modeling of large datasets. By leveraging the power of GPUs with tools such as CuPy and CuPy-Xarray, Xarray users can gain significant performance improvements and unlock new opportunities for scientific discovery.

New Documentation and Tutorials

We have recently created detailed documentation with examples to help users get started with CuPy-Xarray. Check it out at this link.

The new documentation covers the following topics:

  1. Basics of CuPy: An introduction to CuPy, the basics of GPU computing, and data transfer between host and device.
  2. Introduction to CuPy-Xarray
  3. Basic Computations with CuPy-Xarray
  4. High-level Computation with CuPy-Xarray: Applying high-level functions like groupby, resample, rolling, and apply_ufunc to Xarray objects.
  5. Custom Kernels with apply_ufunc: Writing custom CUDA kernels for apply_ufunc and using apply_ufunc with groupby and resample.
  6. A real-world example: Using CuPy-Xarray to accelerate a real-world Earth system model analysis workflow. The demo uses the NASA Earth Exchange Global Daily Downscaled Projections (NEX-GDDP-CMIP6) dataset to show how CuPy-Xarray speeds up computations on climate data variables.
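
To give a flavor of the apply_ufunc topics listed above, here is a small sketch, not taken from the tutorials themselves. It applies a plain array function (a root-mean-square along a core dimension) through xr.apply_ufunc rather than a hand-written CUDA kernel. The function rms_over_last_axis, the dimension names site and time, and the sample data are all hypothetical. The sketch assumes the .cupy.as_cupy() accessor from cupy-xarray and falls back to NumPy when no GPU is available.

```python
import numpy as np
import xarray as xr

# CuPy and cupy-xarray are optional here: fall back to NumPy when no GPU is usable.
try:
    import cupy as cp
    import cupy_xarray  # noqa: F401  (assumed to register the `.cupy` accessor)

    gpu_available = cp.cuda.runtime.getDeviceCount() > 0
except Exception:
    cp = None
    gpu_available = False


def rms_over_last_axis(block):
    """Root-mean-square along the last axis for a NumPy or CuPy array."""
    # Pick the array module that matches the input so the same code serves both backends.
    xp = cp.get_array_module(block) if gpu_available else np
    return xp.sqrt(xp.mean(block ** 2, axis=-1))


# Hypothetical data: 2 sites, 8 time steps each.
da = xr.DataArray(np.arange(16.0).reshape(2, 8), dims=("site", "time"), name="signal")
if gpu_available:
    da = da.cupy.as_cupy()

# apply_ufunc moves the core dimension "time" to the last axis and drops it from the output.
rms = xr.apply_ufunc(rms_over_last_axis, da, input_core_dims=[["time"]])

# Copy the (possibly GPU-resident) result back to host memory.
rms_host = rms.as_numpy()
print(rms_host.values)
```

Writing the function against whichever array module matches its input keeps one code path for both CPU and GPU, which makes it easy to compare results between the two.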

If you have any questions, encounter issues, or want to contribute, the community forum is a great place to start.

Upstream Work

We also worked to improve upstream support for the primitives that Xarray needs. For example, this pull request enabled the use of Xarray's .rolling methods. An open pull request, when merged, will make it clearer when Xarray objects are wrapping CuPy arrays.

Summary

CuPy-Xarray is a Python library that helps you combine CuPy, a GPU array library, with Xarray, a library for multi-dimensional labeled array computations, to enable fast and friendly data processing on GPUs. With the new documentation and tutorials, users can quickly adopt this integration and optimize their data science workflows. 🚀

Acknowledgments

A special thanks to the Xarray, CuPy, and Pangeo communities for making this integration possible. Collaborations like these are a testament to the power of open-source and community-driven development. 💪 Many thanks to the NVIDIA RAPIDS team (specifically Jacob Tomlinson and John Kirkham) for initiating the cupy-xarray project and guiding us along the way. This work was partly funded by the NSF EarthCube award "Jupyter Meets the Earth" (1928374) and NASA's Open Source Tools, Frameworks, and Libraries award "Enhancing analysis of NASA data with the open-source Python Xarray Library" (80NSSC22K0345).

Appendix I: Installation Instructions

With conda (from the conda-forge channel):

    conda install cupy-xarray -c conda-forge

From PyPI:

    python -m pip install cupy-xarray

Appendix II: Additional Resources

  1. CuPy User Guide
  2. Xarray User Guide
  3. CuPy-Xarray GitHub
  4. NCAR GPU Workshop