Package netCDF4 :: Module _netCDF4 :: Class Dataset

Class Dataset

object --+
         |
        Dataset
A netCDF `netCDF4.Dataset` is a collection of dimensions, groups, variables and attributes. Together they describe the meaning of data and relations among data fields stored in a netCDF file. See `netCDF4.Dataset.__init__` for more details.

A list of attribute names corresponding to global netCDF attributes defined for the `netCDF4.Dataset` can be obtained with the `netCDF4.Dataset.ncattrs` method. These attributes can be created by assigning to an attribute of the `netCDF4.Dataset` instance. A dictionary containing all the netCDF attribute name/value pairs is provided by the `__dict__` attribute of a `netCDF4.Dataset` instance.

The following class variables are read-only and should not be modified by the user.

**`dimensions`**: The `dimensions` dictionary maps the names of dimensions defined for the `netCDF4.Group` or `netCDF4.Dataset` to instances of the `netCDF4.Dimension` class.

**`variables`**: The `variables` dictionary maps the names of variables defined for this `netCDF4.Dataset` or `netCDF4.Group` to instances of the `netCDF4.Variable` class.

**`groups`**: The groups dictionary maps the names of groups created for this `netCDF4.Dataset` or `netCDF4.Group` to instances of the `netCDF4.Group` class (the `netCDF4.Dataset` class is simply a special case of the `netCDF4.Group` class which describes the root group in the netCDF4 file).

**`cmptypes`**: The `cmptypes` dictionary maps the names of compound types defined for the `netCDF4.Group` or `netCDF4.Dataset` to instances of the `netCDF4.CompoundType` class.

**`vltypes`**: The `vltypes` dictionary maps the names of variable-length types defined for the `netCDF4.Group` or `netCDF4.Dataset` to instances of the `netCDF4.VLType` class.

**`data_model`**: `data_model` describes the netCDF data model version, one of `NETCDF3_CLASSIC`, `NETCDF4`, `NETCDF4_CLASSIC` or `NETCDF3_64BIT`.

**`file_format`**: same as `data_model`, retained for backwards compatibility.

**`disk_format`**: `disk_format` describes the underlying file format, one of `NETCDF3`, `HDF5`, `HDF4`, `PNETCDF`, `DAP2`, `DAP4` or `UNDEFINED`. Only available if using netcdf C library version >= 4.3.1, otherwise will always return `UNDEFINED`.

**`parent`**: `parent` is a reference to the parent `netCDF4.Group` instance. `None` for the root group or `netCDF4.Dataset` instance.

**`path`**: `path` shows the location of the `netCDF4.Group` in the `netCDF4.Dataset` in a unix directory format (the names of groups in the hierarchy separated by forward slashes). A `netCDF4.Dataset` instance is the root group, so the path is simply `'/'`.

**`keepweakref`**: If `True`, child Dimension and Variable objects only keep weak references to the parent Dataset or Group.

Instance Methods
 
__delattr__(...)
x.__delattr__('name') <==> del x.name
 
__enter__(...)
 
__exit__(...)
 
__getattr__(...)
 
__getattribute__(...)
x.__getattribute__('name') <==> x.name
 
__getitem__(x, y)
x[y]
 
__init__(...)
**`__init__(self, filename, mode="r", clobber=True, diskless=False, persist=False, keepweakref=False, format='NETCDF4')`**
 
__new__(T, S, ...)
a new object with type S, a subtype of T
 
__repr__(x)
repr(x)
 
__setattr__(...)
x.__setattr__('name', value) <==> x.name = value
 
__unicode__(...)
 
_enddef(...)
 
_redef(...)
 
close(...)
**`close(self)`**
 
createCompoundType(...)
**`createCompoundType(self, datatype, datatype_name)`**
 
createDimension(...)
**`createDimension(self, dimname, size=None)`**
 
createGroup(...)
**`createGroup(self, groupname)`**
 
createVLType(...)
**`createVLType(self, datatype, datatype_name)`**
 
createVariable(...)
**`createVariable(self, varname, datatype, dimensions=(), zlib=False, complevel=4, shuffle=True, fletcher32=False, contiguous=False, chunksizes=None, endian='native', least_significant_digit=None, fill_value=None)`**
 
delncattr(...)
**`delncattr(self,name)`**
 
filepath(...)
**`filepath(self)`**
 
getncattr(...)
**`getncattr(self,name)`**
 
ncattrs(...)
**`ncattrs(self)`**
 
renameAttribute(...)
**`renameAttribute(self, oldname, newname)`**
 
renameDimension(...)
**`renameDimension(self, oldname, newname)`**
 
renameGroup(...)
**`renameGroup(self, oldname, newname)`**
 
renameVariable(...)
**`renameVariable(self, oldname, newname)`**
 
set_auto_mask(...)
**`set_auto_mask(self, True_or_False)`**
 
set_auto_maskandscale(...)
**`set_auto_maskandscale(self, True_or_False)`**
 
set_auto_scale(...)
**`set_auto_scale(self, True_or_False)`**
 
set_fill_off(...)
**`set_fill_off(self)`**
 
set_fill_on(...)
**`set_fill_on(self)`**
 
setncattr(...)
**`setncattr(self,name,value)`**
 
setncatts(...)
**`setncatts(self,attdict)`**
 
sync(...)
**`sync(self)`**

Inherited from object: __format__, __hash__, __reduce__, __reduce_ex__, __sizeof__, __str__, __subclasshook__

Properties
  __orthogonal_indexing__
  _grpid
  _isopen
  cmptypes
  data_model
  dimensions
  disk_format
  file_format
  groups
  keepweakref
  parent
  path
  variables
  vltypes

Inherited from object: __class__

Method Details

__delattr__(...)

 

x.__delattr__('name') <==> del x.name

Overrides: object.__delattr__

__getattribute__(...)

 

x.__getattribute__('name') <==> x.name

Overrides: object.__getattribute__

__init__(...)
(Constructor)

 

**`__init__(self, filename, mode="r", clobber=True, diskless=False, persist=False, keepweakref=False, format='NETCDF4')`**

`netCDF4.Dataset` constructor.

**`filename`**: Name of netCDF file to hold dataset.

**`mode`**: access mode. `r` means read-only; no data can be modified. `w` means write; a new file is created, an existing file with the same name is deleted. `a` and `r+` mean append (in analogy with serial files); an existing file is opened for reading and writing. Appending `s` to modes `w`, `r+` or `a` will enable unbuffered shared access to `NETCDF3_CLASSIC` or `NETCDF3_64BIT` formatted files. Unbuffered access may be useful even if you don't need shared access, since it may be faster for programs that don't access data sequentially. This option is ignored for `NETCDF4` and `NETCDF4_CLASSIC` formatted files.

**`clobber`**: if `True` (default), opening a file with `mode='w'` will clobber an existing file with the same name. if `False`, an exception will be raised if a file with the same name already exists.

**`format`**: underlying file format (one of `'NETCDF4', 'NETCDF4_CLASSIC', 'NETCDF3_CLASSIC'` or `'NETCDF3_64BIT'`). Only relevant if `mode = 'w'` (if `mode = 'r','a'` or `'r+'` the file format is automatically detected). Default `'NETCDF4'`, which means the data is stored in an HDF5 file, using netCDF 4 API features. Setting `format='NETCDF4_CLASSIC'` will create an HDF5 file, using only netCDF 3 compatible API features. netCDF 3 clients must be recompiled and linked against the netCDF 4 library to read files in `NETCDF4_CLASSIC` format. `'NETCDF3_CLASSIC'` is the classic netCDF 3 file format that does not handle 2+ GB files very well. `'NETCDF3_64BIT'` is the 64-bit offset version of the netCDF 3 file format, which fully supports 2+ GB files, but is only compatible with clients linked against netCDF version 3.6.0 or later.

**`diskless`**: If `True`, create diskless (in memory) file. This is an experimental feature added to the C library after the netcdf-4.2 release.

**`persist`**: if `diskless=True`, persist file to disk when closed (default `False`).

**`keepweakref`**: if `True`, child Dimension and Variable instances will keep weak references to the parent Dataset or Group object. Default is `False`, which means strong references will be kept. Having Dimension and Variable instances keep a strong reference to the parent Dataset instance, which in turn keeps a reference to child Dimension and Variable instances, creates circular references. Circular references complicate garbage collection, which may mean increased memory usage for programs that create many Dataset instances with lots of Variables. Setting `keepweakref=True` allows Dataset instances to be garbage collected as soon as they go out of scope, potentially reducing memory usage. However, in most cases this is not desirable, since the associated Variable instances may still be needed, but are rendered unusable when the parent Dataset instance is garbage collected.

Overrides: object.__init__

__new__(T, S, ...)

 
Returns: a new object with type S, a subtype of T
Overrides: object.__new__

__repr__(x)
(Representation operator)

 

repr(x)

Overrides: object.__repr__

__setattr__(...)

 

x.__setattr__('name', value) <==> x.name = value

Overrides: object.__setattr__

close(...)

 

**`close(self)`**

Close the Dataset.

createCompoundType(...)

 

**`createCompoundType(self, datatype, datatype_name)`**

Creates a new compound data type named `datatype_name` from the numpy dtype object `datatype`.

***Note***: If the new compound data type contains other compound data types (i.e. it is a 'nested' compound type, where not all of the elements are homogeneous numeric data types), then the 'inner' compound types **must** be created first.

The return value is the `netCDF4.CompoundType` class instance describing the new datatype.

createDimension(...)

 

**`createDimension(self, dimname, size=None)`**

Creates a new dimension with the given `dimname` and `size`.

`size` must be a positive integer or `None`, which stands for "unlimited" (default is `None`). Specifying a size of 0 also results in an unlimited dimension. The return value is the `netCDF4.Dimension` class instance describing the new dimension. To determine the current maximum size of the dimension, use the `len` function on the `netCDF4.Dimension` instance. To determine if a dimension is 'unlimited', use the `netCDF4.Dimension.isunlimited` method of the `netCDF4.Dimension` instance.

createGroup(...)

 

**`createGroup(self, groupname)`**

Creates a new `netCDF4.Group` with the given `groupname`.

If `groupname` is specified as a path, using forward slashes as in unix to separate components, then intermediate groups will be created as necessary (analogous to `mkdir -p` in unix). For example, `createGroup('/GroupA/GroupB/GroupC')` will create `GroupA`, `GroupA/GroupB`, and `GroupA/GroupB/GroupC`, if they don't already exist. If the specified path describes a group that already exists, no error is raised.

The return value is a `netCDF4.Group` class instance.

createVLType(...)

 

**`createVLType(self, datatype, datatype_name)`**

Creates a new VLEN data type named `datatype_name` from a numpy dtype object `datatype`.

The return value is the `netCDF4.VLType` class instance describing the new datatype.

createVariable(...)

 

**`createVariable(self, varname, datatype, dimensions=(), zlib=False, complevel=4, shuffle=True, fletcher32=False, contiguous=False, chunksizes=None, endian='native', least_significant_digit=None, fill_value=None)`**

Creates a new variable with the given `varname`, `datatype`, and `dimensions`. If dimensions are not given, the variable is assumed to be a scalar.

If `varname` is specified as a path, using forward slashes as in unix to separate components, then intermediate groups will be created as necessary. For example, `createVariable('/GroupA/GroupB/VarC', 'f4', ('x','y'))` will create groups `GroupA` and `GroupA/GroupB`, plus the variable `GroupA/GroupB/VarC`, if the preceding groups don't already exist.

The `datatype` can be a numpy datatype object, or a string that describes a numpy dtype object (like the `dtype.str` attribute of a numpy array). Supported specifiers include: `'S1' or 'c' (NC_CHAR), 'i1' or 'b' or 'B' (NC_BYTE), 'u1' (NC_UBYTE), 'i2' or 'h' or 's' (NC_SHORT), 'u2' (NC_USHORT), 'i4' or 'i' or 'l' (NC_INT), 'u4' (NC_UINT), 'i8' (NC_INT64), 'u8' (NC_UINT64), 'f4' or 'f' (NC_FLOAT), 'f8' or 'd' (NC_DOUBLE)`. `datatype` can also be a `netCDF4.CompoundType` instance (for a structured, or compound array), a `netCDF4.VLType` instance (for a variable-length array), or the python `str` builtin (for a variable-length string array). Numpy string and unicode datatypes with length greater than one are aliases for `str`.

Data from netCDF variables is presented to python as numpy arrays with the corresponding data type.

`dimensions` must be a tuple containing dimension names (strings) that have been defined previously using `netCDF4.Dataset.createDimension`. The default value is an empty tuple, which means the variable is a scalar.

If the optional keyword `zlib` is `True`, the data will be compressed in the netCDF file using gzip compression (default `False`).

The optional keyword `complevel` is an integer between 1 and 9 describing the level of compression desired (default 4). Ignored if `zlib=False`.

If the optional keyword `shuffle` is `True`, the HDF5 shuffle filter will be applied before compressing the data (default `True`). This significantly improves compression. Ignored if `zlib=False`.

If the optional keyword `fletcher32` is `True`, the Fletcher32 HDF5 checksum algorithm is activated to detect errors. Default `False`.

If the optional keyword `contiguous` is `True`, the variable data is stored contiguously on disk. Default `False`. Setting to `True` for a variable with an unlimited dimension will trigger an error.

The optional keyword `chunksizes` can be used to manually specify the HDF5 chunksizes for each dimension of the variable. A detailed discussion of HDF chunking and I/O performance is available [here](http://www.hdfgroup.org/HDF5/doc/H5.user/Chunking.html). Basically, you want the chunk size for each dimension to match as closely as possible the size of the data block that users will read from the file. `chunksizes` cannot be set if `contiguous=True`.

The optional keyword `endian` can be used to control whether the data is stored in little or big endian format on disk. Possible values are `little, big` or `native` (default). The library will automatically handle endian conversions when the data is read, but if the data is always going to be read on a computer with the opposite format as the one used to create the file, there may be some performance advantage to be gained by setting the endian-ness.

The `zlib, complevel, shuffle, fletcher32, contiguous, chunksizes` and `endian` keywords are silently ignored for netCDF 3 files that do not use HDF5.

The optional keyword `fill_value` can be used to override the default netCDF `_FillValue` (the value that the variable gets filled with before any data is written to it, defaults given in `netCDF4.default_fillvals`). If fill_value is set to `False`, then the variable is not pre-filled.

If the optional keyword parameter `least_significant_digit` is specified, variable data will be truncated (quantized). In conjunction with `zlib=True` this produces 'lossy', but significantly more efficient compression. For example, if `least_significant_digit=1`, data will be quantized using `numpy.around(scale*data)/scale`, where `scale = 2**bits`, and `bits` is determined so that a precision of 0.1 is retained (in this case `bits=4`). From the [PSD metadata conventions](http://www.esrl.noaa.gov/psd/data/gridded/conventions/cdc_netcdf_standard.shtml): "least_significant_digit -- power of ten of the smallest decimal place in unpacked data that is a reliable value." Default is `None` (no quantization, i.e. 'lossless' compression).
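
The quantization arithmetic described above can be checked directly with numpy (a standalone sketch of the formula, not a call into the library):

```python
import numpy

# For least_significant_digit=1, bits is chosen so that 2**bits resolves
# a precision of 0.1 (here bits=4, so scale=16).
least_significant_digit = 1
bits = int(numpy.ceil(numpy.log2(10.0 ** least_significant_digit)))
scale = 2.0 ** bits

data = numpy.array([0.123, 2.718, 3.14159])
quantized = numpy.around(scale * data) / scale
print(bits, scale)  # 4 16.0
print(quantized)    # [0.125  2.6875  3.125] -- each within 0.05 of the input
```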

When creating variables in a `NETCDF4` or `NETCDF4_CLASSIC` formatted file, HDF5 creates something called a 'chunk cache' for each variable. The default size of the chunk cache may be large enough to completely fill available memory when creating thousands of variables. The optional keyword `chunk_cache` allows you to reduce (or increase) the size of the default chunk cache when creating a variable. The setting only persists as long as the Dataset is open - you can use the set_var_chunk_cache method to change it the next time the Dataset is opened. Warning - messing with this parameter can seriously degrade performance.

The return value is the `netCDF4.Variable` class instance describing the new variable.

A list of names corresponding to netCDF variable attributes can be obtained with the `netCDF4.Variable` method `netCDF4.Variable.ncattrs`. A dictionary containing all the netCDF attribute name/value pairs is provided by the `__dict__` attribute of a `netCDF4.Variable` instance.

`netCDF4.Variable` instances behave much like array objects. Data can be assigned to or retrieved from a variable with indexing and slicing operations on the `netCDF4.Variable` instance. A `netCDF4.Variable` instance has six standard attributes: `dimensions, dtype, shape, ndim, name` and `least_significant_digit`. Application programs should never modify these attributes. The `dimensions` attribute is a tuple containing the names of the dimensions associated with this variable. The `dtype` attribute is a string describing the variable's data type (`i4, f8, S1,` etc). The `shape` attribute is a tuple describing the current sizes of all the variable's dimensions. The `name` attribute is a string containing the name of the Variable instance. The `ndim` attribute is the number of variable dimensions. The `least_significant_digit` attribute describes the power of ten of the smallest decimal place in the data that contains a reliable value; if `None`, the data is not truncated.

delncattr(...)

 

**`delncattr(self,name)`**

delete a netCDF dataset or group attribute. Use if you need to delete a netCDF attribute with the same name as one of the reserved python attributes.

filepath(...)

 

**`filepath(self)`**

Get the file system path (or the opendap URL) which was used to open/create the Dataset. Requires netcdf >= 4.1.2

getncattr(...)

 

**`getncattr(self,name)`**

retrieve a netCDF dataset or group attribute. Use if you need to get a netCDF attribute with the same name as one of the reserved python attributes.

ncattrs(...)

 

**`ncattrs(self)`**

return netCDF global attribute names for this `netCDF4.Dataset` or `netCDF4.Group` in a list.

renameAttribute(...)

 

**`renameAttribute(self, oldname, newname)`**

rename a `netCDF4.Dataset` or `netCDF4.Group` attribute named `oldname` to `newname`.

renameDimension(...)

 

**`renameDimension(self, oldname, newname)`**

rename a `netCDF4.Dimension` named `oldname` to `newname`.

renameGroup(...)

 

**`renameGroup(self, oldname, newname)`**

rename a `netCDF4.Group` named `oldname` to `newname` (requires netcdf >= 4.3.1).

renameVariable(...)

 

**`renameVariable(self, oldname, newname)`**

rename a `netCDF4.Variable` named `oldname` to `newname`.

set_auto_mask(...)

 

**`set_auto_mask(self, True_or_False)`**

Call `netCDF4.Variable.set_auto_mask` for all variables contained in this `netCDF4.Dataset` or `netCDF4.Group`, as well as for all variables in all its subgroups.

**`True_or_False`**: Boolean determining if automatic conversion to masked arrays shall be applied for all variables.

***Note***: Calling this function only affects existing variables. Variables created after calling this function will follow the default behaviour.

set_auto_maskandscale(...)

 

**`set_auto_maskandscale(self, True_or_False)`**

Call `netCDF4.Variable.set_auto_maskandscale` for all variables contained in this `netCDF4.Dataset` or `netCDF4.Group`, as well as for all variables in all its subgroups.

**`True_or_False`**: Boolean determining if automatic conversion to masked arrays and variable scaling shall be applied for all variables.

***Note***: Calling this function only affects existing variables. Variables created after calling this function will follow the default behaviour.

set_auto_scale(...)

 

**`set_auto_scale(self, True_or_False)`**

Call `netCDF4.Variable.set_auto_scale` for all variables contained in this `netCDF4.Dataset` or `netCDF4.Group`, as well as for all variables in all its subgroups.

**`True_or_False`**: Boolean determining if automatic variable scaling shall be applied for all variables.

***Note***: Calling this function only affects existing variables. Variables created after calling this function will follow the default behaviour.

set_fill_off(...)

 

**`set_fill_off(self)`**

Sets the fill mode for a `netCDF4.Dataset` open for writing to `off`.

This will prevent the data from being pre-filled with fill values, which may result in some performance improvements. However, you must then make sure the data is actually written before being read.

set_fill_on(...)

 

**`set_fill_on(self)`**

Sets the fill mode for a `netCDF4.Dataset` open for writing to `on`.

This causes data to be pre-filled with fill values. The fill values can be controlled by the variable's `_FillValue` attribute, but it is usually sufficient to use the netCDF default `_FillValue` (defined separately for each variable type). The default behavior of the netCDF library corresponds to `set_fill_on`. Data which are equal to the `_FillValue` indicate that the variable was created, but never written to.

setncattr(...)

 

**`setncattr(self,name,value)`**

set a netCDF dataset or group attribute using a name,value pair. Use if you need to set a netCDF attribute with the same name as one of the reserved python attributes.

setncatts(...)

 

**`setncatts(self,attdict)`**

set a bunch of netCDF dataset or group attributes at once using a python dictionary. This may be faster when setting a lot of attributes for a `NETCDF3` formatted file, since `nc_redef`/`nc_enddef` is not called in between setting each attribute.

sync(...)

 

**`sync(self)`**

Writes all buffered data in the `netCDF4.Dataset` to the disk file.