Library Reference
First level variables
- __version__
The version of the blosc package.
- blosclib_version
The version of the Blosc C library.
- clib_versions
A map of the versions of the compression libraries included in the C library.
- cnames
The list of compressors included in the C library.
- cname2clib
A map between compressor names and their libraries (or formats).
- ncores
The number of cores detected.
Public functions
- blosc.compress(bytesobj, typesize[, clevel=9, shuffle=True, cname='blosclz'])
Compress bytesobj, with a given type size.
Parameters: bytesobj : str / bytes
The data to be compressed.
typesize : int
The data type size.
clevel : int (optional)
The compression level from 0 (no compression) to 9 (maximum compression). The default is 9.
shuffle : bool (optional)
Whether you want to activate the shuffle filter or not. The default is True.
cname : string (optional)
The name of the compressor used internally in Blosc. It can be any compressor supported by Blosc ('blosclz', 'lz4', 'lz4hc', 'snappy', 'zlib', and possibly others). The default is 'blosclz'.
Returns: out : str / bytes
The compressed data in form of a Python str / bytes object.
Raises: TypeError
If bytesobj is not of type bytes or string.
ValueError
If bytesobj is too long. If typesize is not within the allowed range. If clevel is not within the allowed range.
Examples
>>> import array
>>> a = array.array('i', range(1000*1000))
>>> a_bytesobj = a.tobytes()
>>> c_bytesobj = blosc.compress(a_bytesobj, typesize=4)
>>> len(c_bytesobj) < len(a_bytesobj)
True
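The shuffle option of compress regroups the bytes of each typesize-wide item so that bytes of equal significance end up next to each other, which tends to help the compressor on numeric data. The pure-Python sketch below only illustrates that idea. The helper names byte_shuffle and byte_unshuffle are hypothetical and are not part of the blosc API; the real filter lives in the Blosc C library and is applied for you when shuffle=True.

```python
def byte_shuffle(data, typesize):
    """Group the k-th byte of every item together (illustrative only)."""
    # Simplifying assumption of this sketch: the buffer holds whole items.
    if typesize < 1 or len(data) % typesize:
        raise ValueError("len(data) must be a multiple of typesize")
    # The slice data[k::typesize] picks byte k of each item.
    return b"".join(data[k::typesize] for k in range(typesize))


def byte_unshuffle(data, typesize):
    """Invert byte_shuffle, restoring the original item layout."""
    if typesize < 1 or len(data) % typesize:
        raise ValueError("len(data) must be a multiple of typesize")
    nitems = len(data) // typesize
    out = bytearray(len(data))
    for k in range(typesize):
        # Plane k holds byte k of every item; scatter it back in place.
        out[k::typesize] = data[k * nitems:(k + 1) * nitems]
    return bytes(out)


# Two little-endian 4-byte integers: 1 and 2.
raw = b"\x01\x00\x00\x00\x02\x00\x00\x00"
shuffled = byte_shuffle(raw, 4)
restored = byte_unshuffle(shuffled, 4)
```

After shuffling, the low bytes of both integers sit together and the zero bytes form one long run, which is the kind of regularity a compressor can exploit.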
- blosc.compress_ptr(address, items, typesize[, clevel=9, shuffle=True, cname='blosclz'])
Compress the data at address with given items and typesize.
Parameters: address : int or long
The pointer to the data to be compressed.
items : int
The number of items (of typesize) to be compressed.
typesize : int
The data type size.
clevel : int (optional)
The compression level from 0 (no compression) to 9 (maximum compression). The default is 9.
shuffle : bool (optional)
Whether you want to activate the shuffle filter or not. The default is True.
cname : string (optional)
The name of the compressor used internally in Blosc. It can be any compressor supported by Blosc ('blosclz', 'lz4', 'lz4hc', 'snappy', 'zlib', and possibly others). The default is 'blosclz'.
Returns: out : str / bytes
The compressed data in form of a Python str / bytes object.
Raises: TypeError
If address is not of type int or long.
ValueError
If items * typesize is larger than the maximum allowed buffer size. If typesize is not within the allowed range. If clevel is not within the allowed range. If cname is not within the supported compressors.
Notes
This function can be used anywhere that a memory address is available in Python, for example through the NumPy __array_interface__['data'][0] construct or through the ctypes module.
Importantly, the user is responsible for making sure that the memory address is valid and that the memory pointed to is contiguous. Passing an invalid address is very likely to crash the interpreter with a segmentation fault.
Examples
>>> import numpy
>>> items = 7
>>> np_array = numpy.arange(items)
>>> c = blosc.compress_ptr(np_array.__array_interface__['data'][0], items, np_array.dtype.itemsize)
>>> d = blosc.decompress(c)
>>> np_ans = numpy.frombuffer(d, dtype=np_array.dtype)
>>> numpy.array_equal(np_array, np_ans)
True

>>> import ctypes
>>> typesize = 8
>>> data = [float(i) for i in range(items)]
>>> Array = ctypes.c_double * items
>>> a = Array(*data)
>>> c = blosc.compress_ptr(ctypes.addressof(a), items, typesize)
>>> d = blosc.decompress(c)
>>> import struct
>>> ans = [struct.unpack('d', d[i:i+typesize])[0] for i in range(0,items*typesize,typesize)]
>>> data == ans
True
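The Notes for compress_ptr put the burden of a valid, contiguous address on the caller. As a rough sketch of how the three pointer arguments relate, the snippet below derives them from a ctypes array and reads the same region back with ctypes.string_at. The helper pointer_args is hypothetical (it is not part of blosc), and the snippet deliberately does not call blosc, so it runs with the standard library alone.

```python
import ctypes
import struct


def pointer_args(c_array):
    """Return (address, items, typesize) for a ctypes array (hypothetical helper)."""
    items = len(c_array)
    typesize = ctypes.sizeof(c_array._type_)  # size of one element in bytes
    address = ctypes.addressof(c_array)
    # A ctypes array occupies one contiguous block of items * typesize bytes.
    if ctypes.sizeof(c_array) != items * typesize:
        raise ValueError("unexpected array layout")
    return address, items, typesize


# Keep a reference to the array for as long as the address is in use.
values = (ctypes.c_double * 7)(*[float(i) for i in range(7)])
address, items, typesize = pointer_args(values)

# string_at copies items * typesize bytes starting at address: the same
# region a call such as blosc.compress_ptr(address, items, typesize) reads.
raw = ctypes.string_at(address, items * typesize)
decoded = list(struct.unpack("%dd" % items, raw))
```

Deriving items and typesize from the object that owns the memory, instead of passing them by hand, is one way to lower the risk of reading past the end of the buffer.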
- blosc.decompress(bytesobj)
Decompresses a bytesobj compressed object.
Parameters: bytesobj : str / bytes
The data to be decompressed.
Returns: out : str / bytes
The decompressed data in form of a Python str / bytes object.
Raises: TypeError
If bytesobj is not of type bytes or string.
Examples
>>> import array
>>> a = array.array('i', range(1000*1000))
>>> a_bytesobj = a.tobytes()
>>> c_bytesobj = blosc.compress(a_bytesobj, typesize=4)
>>> a_bytesobj2 = blosc.decompress(c_bytesobj)
>>> a_bytesobj == a_bytesobj2
True
>>> b"" == blosc.decompress(blosc.compress(b"", 1))
True
>>> b"1"*7 == blosc.decompress(blosc.compress(b"1"*7, 8))
True
- blosc.decompress_ptr(bytesobj, address)
Decompresses a bytesobj compressed object into the memory at address.
Parameters: bytesobj : str / bytes
The data to be decompressed.
address : int or long
The pointer to the memory where the decompressed data will be written.
Returns: nbytes : int
The number of bytes written to the buffer.
Raises: TypeError
If bytesobj is not of type bytes or string. If address is not of type int or long.
Notes
This function can be used anywhere that a memory address is available in Python, for example through the NumPy __array_interface__['data'][0] construct or through the ctypes module.
Importantly, the user is responsible for making sure that the memory address is valid and that the memory pointed to is contiguous and can be written to. Passing an invalid address is very likely to crash the interpreter with a segmentation fault.
Examples
>>> import numpy
>>> items = 7
>>> np_array = numpy.arange(items)
>>> c = blosc.compress_ptr(np_array.__array_interface__['data'][0], items, np_array.dtype.itemsize)
>>> np_ans = numpy.empty(items, dtype=np_array.dtype)
>>> nbytes = blosc.decompress_ptr(c, np_ans.__array_interface__['data'][0])
>>> numpy.array_equal(np_array, np_ans)
True
>>> nbytes == items * np_array.dtype.itemsize
True

>>> import ctypes
>>> typesize = 8
>>> data = [float(i) for i in range(items)]
>>> Array = ctypes.c_double * items
>>> in_array = Array(*data)
>>> c = blosc.compress_ptr(ctypes.addressof(in_array), items, typesize)
>>> out_array = ctypes.create_string_buffer(items*typesize)
>>> nbytes = blosc.decompress_ptr(c, ctypes.addressof(out_array))
>>> import struct
>>> ans = [struct.unpack('d', out_array[i:i+typesize])[0] for i in range(0,items*typesize,typesize)]
>>> data == ans
True
>>> nbytes == items * typesize
True
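Because decompress_ptr writes straight into caller-owned memory, the destination has to be writable and large enough before the call. The sketch below shows one way to prepare such a destination with a typed ctypes array, so the values can be read back without struct. Note the assumption: ctypes.memmove is used here only as a stand-in for the write that decompress_ptr would perform, so that the snippet runs without blosc installed.

```python
import ctypes
import struct

items, typesize = 7, 8

# Stand-in for the decompressed payload: 7 doubles packed as raw bytes.
payload = struct.pack("%dd" % items, *[float(i) for i in range(items)])

# Destination: a writable, contiguous array of doubles, zero-initialised.
out_array = (ctypes.c_double * items)()

# Check the capacity before anything writes to the raw address.
if ctypes.sizeof(out_array) < len(payload):
    raise ValueError("destination buffer is too small")

# memmove only imitates the write; with blosc this line would be a call
# like blosc.decompress_ptr(c, ctypes.addressof(out_array)).
ctypes.memmove(ctypes.addressof(out_array), payload, len(payload))

values = list(out_array)
```

Using a typed array as the destination keeps the element type and the buffer size in one place, whereas create_string_buffer leaves the decoding to a later struct.unpack step.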
- blosc.pack_array(array[, clevel=9, shuffle=True, cname='blosclz'])
Pack (compress) a NumPy array.
Parameters: array : ndarray
The NumPy array to be packed.
clevel : int (optional)
The compression level from 0 (no compression) to 9 (maximum compression). The default is 9.
shuffle : bool (optional)
Whether you want to activate the shuffle filter or not. The default is True.
cname : string (optional)
The name of the compressor used internally in Blosc. It can be any compressor supported by Blosc ('blosclz', 'lz4', 'lz4hc', 'snappy', 'zlib', and possibly others). The default is 'blosclz'.
Returns: out : str / bytes
The packed array in form of a Python str / bytes object.
Raises: TypeError
If array does not quack like a numpy ndarray.
ValueError
If array.itemsize * array.size is larger than the maximum allowed buffer size. If typesize is not within the allowed range. If clevel is not within the allowed range. If cname is not within the supported compressors.
Examples
>>> import numpy
>>> a = numpy.arange(1e6)
>>> parray = blosc.pack_array(a)
>>> len(parray) < a.size*a.itemsize
True
- blosc.unpack_array(packed_array)
Unpack (decompress) a packed NumPy array.
Parameters: packed_array : str / bytes
The packed array to be decompressed.
Returns: out : ndarray
The decompressed data in form of a NumPy array.
Raises: TypeError
If packed_array is not of type bytes or string.
Examples
>>> import numpy
>>> a = numpy.arange(1e6)
>>> parray = blosc.pack_array(a)
>>> len(parray) < a.size*a.itemsize
True
>>> a2 = blosc.unpack_array(parray)
>>> numpy.array_equal(a, a2)
True
Utilities
- blosc.clib_info(cname)
Return info for a compression library in the C library.
Parameters: cname : str
The compressor name.
Returns: out : tuple
The associated library name and version.
- blosc.compressor_list()
Returns a list of compressors available in the C library.
Parameters: None
Returns: out : list
The list of names.
- blosc.detect_number_of_cores()
Detect the number of cores in this system.
Returns: out : int
The number of cores in this system.
- blosc.free_resources()
Free possible memory temporaries and thread resources.
Returns: out : None
Notes
Blosc maintains a pool of threads waiting for work as well as some temporary space. You can use this function to release these resources when you are not going to use Blosc for a long while.
Examples
>>> blosc.free_resources()
>>>
- blosc.get_clib(bytesobj)
Return the name of the compression library for the Blosc bytesobj buffer.
Parameters: bytesobj : str / bytes
The compressed buffer.
Returns: out : str
The name of the compression library.
- blosc.set_nthreads(nthreads)
Set the number of threads to be used during Blosc operation.
Parameters: nthreads : int
The number of threads to be used during Blosc operation.
Returns: out : int
The previous number of used threads.
Raises: ValueError
If nthreads is larger than the maximum number of threads Blosc can use.
Notes
By default, Blosc uses as many threads as there are cores detected on your machine (via detect_number_of_cores). In some cases Blosc gets better results if you set the number of threads to a value slightly below your number of cores.
Examples
Set the number of threads to 2 and then to 1:
>>> oldn = blosc.set_nthreads(2)
>>> blosc.set_nthreads(1)
2
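The advice in the Notes about staying slightly below the core count can be turned into a small calculation. The sketch below is only one possible reading of that advice: the helper suggested_nthreads and its spare parameter are hypothetical, and os.cpu_count() stands in for blosc.detect_number_of_cores() so that the snippet runs without blosc installed.

```python
import os


def suggested_nthreads(ncores, spare=1):
    """Leave `spare` cores free, but never go below one thread."""
    return max(1, ncores - spare)


# Stand-in for blosc.detect_number_of_cores(); cpu_count() may return None.
ncores = os.cpu_count() or 1
nthreads = suggested_nthreads(ncores)

# With python-blosc available, the value would then be applied with:
#     previous = blosc.set_nthreads(nthreads)
```

Whether leaving a core free actually helps depends on the workload, so it is worth timing both settings on representative data.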
- blosc.print_versions()
Print all the versions of software that python-blosc relies on.