open_mfdataset with glob syntax on S3 not working since Zarr v3
What happened?
Since the package update that implements the new Zarr specification (v3), passing a glob pattern for S3 paths to xarray's open_mfdataset function no longer works. This breaks code that relied on the previous syntax.
Example of non-working code
import xarray as xr
xr.open_mfdataset("s3://mybucket/myzarr/*.zarr", engine="zarr")
Error message
TypeError: Unsupported type for store_like: 'FSMap'
What did you expect to happen?
The previous syntax is expected to remain working. The change to the new zarr specification should not have broken the glob syntax for loading S3 files.
Minimal Complete Verifiable Example
import xarray as xr
from s3fs import S3FileSystem
s3 = S3FileSystem()
xr.open_mfdataset(["s3://" + file for file in s3.glob("s3://mybucket/myzarr/*.zarr")], engine="zarr")
MVCE confirmation
- [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [ ] Complete example — the example is self-contained, including all data and the text of any traceback.
- [ ] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- [ ] New issue — a search of GitHub Issues suggests this is not a duplicate.
- [ ] Recent environment — the issue occurs with the latest version of xarray and its dependencies.
Relevant log output
Anything else we need to know?
No response
Environment
xarray: 2025.1.2
pandas: 2.2.3
numpy: 2.2.3
scipy: 1.15.2
netCDF4: 1.7.2
pydap: 3.5.3
h5netcdf: None
h5py: 3.12.1
zarr: 3.0.4
cftime: 1.6.4
nc_time_axis: None
iris: None
bottleneck: None
dask: 2025.2.0
distributed: 2025.2.0
matplotlib: 3.10.1
cartopy: None
seaborn: None
numbagg: None
fsspec: 2025.2.0
cupy: None
pint: None
sparse: None
flox: 0.10.0
numpy_groupies: 0.11.2
setuptools: 75.8.2
pip: 25.0.1
conda: None
pytest: None
mypy: None
IPython: 9.0.1
sphinx: None
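To check whether another environment is on the affected side of the Zarr v3 change, the installed versions can be read with the standard library. This is a minimal sketch, assuming the distribution names match the package names listed in the report; installed_versions and is_zarr_v3 are hypothetical helper names, not part of xarray or zarr.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def is_zarr_v3(zarr_version: Optional[str]) -> bool:
    """Return True when the version string has a major component of 3 or more."""
    if zarr_version is None:
        return False
    return int(zarr_version.split(".")[0]) >= 3


# Report the packages most relevant to this issue.
versions = installed_versions(["xarray", "zarr", "fsspec", "s3fs"])
print(versions)
print("zarr v3 installed:", is_zarr_v3(versions["zarr"]))
```

xarray's own xr.show_versions() prints a fuller report in the same spirit as the listing in this section.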
Workaround
Until the glob syntax works again, you can expand the pattern yourself with s3fs and pass the resulting list of URLs to open_mfdataset:
from s3fs import S3FileSystem
import xarray as xr
s3 = S3FileSystem()
xr.open_mfdataset(["s3://" + file for file in s3.glob("s3://mybucket/myzarr/*.zarr")], engine="zarr")
This workaround requires importing the S3FileSystem class from the s3fs library and using a list comprehension to prepend the "s3://" prefix to each path returned by glob.
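The same steps can be wrapped in a small helper. This is a minimal sketch, not part of xarray or s3fs: the name expand_s3_glob and its parameters are hypothetical, and it only assumes that the filesystem object exposes a glob method that returns paths without the protocol, as s3fs does.

```python
from typing import Iterable, List, Protocol


class SupportsGlob(Protocol):
    """Anything with a glob() method, e.g. s3fs.S3FileSystem."""

    def glob(self, path: str) -> Iterable[str]: ...


def expand_s3_glob(fs: SupportsGlob, pattern: str, protocol: str = "s3") -> List[str]:
    """Expand a glob pattern into a sorted list of fully qualified URLs.

    Hypothetical helper for illustration; not an xarray or s3fs API.
    """
    prefix = f"{protocol}://"
    urls = [
        # Put the protocol back only when the filesystem stripped it.
        path if path.startswith(prefix) else prefix + path
        for path in fs.glob(pattern)
    ]
    if not urls:
        # Fail early with a clear message instead of passing an empty list on.
        raise FileNotFoundError(f"no stores matched {pattern!r}")
    return sorted(urls)


# Intended use (requires s3fs, xarray, and access to the bucket):
#   from s3fs import S3FileSystem
#   import xarray as xr
#   urls = expand_s3_glob(S3FileSystem(), "s3://mybucket/myzarr/*.zarr")
#   ds = xr.open_mfdataset(urls, engine="zarr")
```

Sorting keeps the order of the stores deterministic; whether that order matters depends on how open_mfdataset is asked to combine the datasets.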
Future plans
It is unclear whether the old behavior will be restored. xarray is under active development, so users are encouraged to follow new releases and to give feedback on the library's design and functionality.
Questions and answers
Q: What happened to the glob syntax for loading S3 files with xarray's open_mfdataset function?
A: It has stopped working since the update to the new Zarr specification (v3).
Q: What is the error message I get when trying to use the glob syntax?
A: The call fails with:
TypeError: Unsupported type for store_like: 'FSMap'
Q: What is the workaround for loading S3 files using the glob syntax?
A: Expand the glob yourself with s3fs and pass the resulting list of URLs to open_mfdataset:
from s3fs import S3FileSystem
import xarray as xr
s3 = S3FileSystem()
xr.open_mfdataset(["s3://" + file for file in s3.glob("s3://mybucket/myzarr/*.zarr")], engine="zarr")
Q: Why do I need to import the S3FileSystem class from the s3fs library?
A: S3FileSystem provides the glob method used to list the matching stores on S3. Because the glob string is no longer accepted directly, the pattern has to be expanded explicitly before calling open_mfdataset. xarray itself does not talk to S3 directly: it reaches s3:// URLs through fsspec, which uses s3fs as its S3 backend.
Q: Why do I need a list comprehension to prepend the "s3://" prefix to the file names?
A: The glob method of S3FileSystem returns a list of paths without the "s3://" prefix. The list comprehension builds a new list with the prefix added back to the front of each path, so that each entry is a full S3 URL.
Q: Will the old behavior be restored?
A: It is unclear. Until that is decided, the explicit glob workaround remains a way to load multiple Zarr stores from S3.
Q: What can I do to stay up to date with the latest developments in xarray?
A: You can:
- Check the official xarray documentation for updates and changes
- Follow the xarray team on social media for news about new features and releases
- Take part in the xarray community by asking questions and giving feedback
Q: How can I provide feedback on xarray's design and functionality?
A: You can:
- Submit issues and feature requests on the xarray GitHub repository
- Take part in xarray community discussions
- Reach out to the xarray team directly with feedback and suggestions for improvement