Storage
This module provides both non-persistent and persistent storage and interpolation to be used by any module in Colossus.
Basics
There are two levels of storage: all stored fields are kept in dictionaries in memory, and fields can additionally be written to disk (persistent storage). The data can be of any type, but if persistent storage is used, the data must be picklable. Persistent storage can be turned on and off by the user, both globally and for each field individually.
Each “user” of the storage module receives their own storage space and a uniquely identifying hash code that can be used to detect changes that make it necessary to reset the storage, for example changes in physical parameters to a model. The code example below shows how to set up a storage user within a class:
from colossus.utils import storage

class DemoClass():

    def __init__(self, some_parameter = 1.5):
        self.some_parameter = some_parameter
        self.su = storage.StorageUser('myModule', 'rw', self.getName, self.getHashableString,
                                      self.reportChanges)
        self.some_data = [2.6, 9.5]
        self.su.storeObject('test_data', self.some_data, persistent = False)
        return

    def getName(self):
        return 'MyModule'

    def getHashableString(self):
        param_string = 'MyModule_%.4f' % (self.some_parameter)
        return param_string

    def reportChanges(self):
        print('Changes in this module detected, storage has been reset.')
        return

    def loadData(self):
        data = self.su.getStoredObject('test_data')
        return data
In the constructor, we pass the storage module three functions: one that returns a unique name for this module, one that returns a hashable string, and one that is called when changes are detected. The hashable string must change if the previously stored data is to be discarded upon a parameter change. In the class above, we discard the field some_data when a change in some_parameter is detected. The persistent parameter determines whether this object will be written to disk as part of a pickle and loaded the next time the same user class (same name and same hash code) is instantiated. Let us now try loading the data:
dc = DemoClass()
print(dc.loadData())
>>> [2.6, 9.5]
Let us now change some_parameter. The data object is discarded and the load function returns None, indicating that no valid some_data for the new hash string was found:
dc.some_parameter = 1.2
print(dc.loadData())
>>> Changes in this module detected, storage has been reset.
>>> None
The storage module offers native support for interpolation tables. For example, if we have stored a table of variables x and y, we can get a spline interpolator for y(x) or even a reverse interpolator for x(y) by calling:
interp_y_of_x = su.getStoredObject('xy', interpolator = True)
interp_x_of_y = su.getStoredObject('xy', interpolator = True, inverse = True)
where su is a StorageUser object. The getStoredObject() function returns None if no object is found.
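To illustrate what a forward and an inverse interpolator over the same stored table mean, here is a minimal sketch that uses only the Python standard library. It is not the Colossus implementation: piecewise-linear interpolation stands in for the spline, and make_interpolator is a hypothetical helper introduced for this example. The inverse x(y) is only well defined if y is strictly monotonic in x, which the sketch checks.

```python
from bisect import bisect_right

def make_interpolator(table, inverse=False):
    """Return a piecewise-linear interpolator for a table of shape [2, n].

    Hypothetical helper for illustration only; Colossus returns a spline.
    With inverse=True the roles of x and y are swapped, giving x(y).
    """
    x, y = table
    if inverse:
        x, y = y, x
    # The abscissa must be strictly ascending for the lookup to be well defined.
    if any(b <= a for a, b in zip(x, x[1:])):
        raise ValueError('abscissa must be strictly ascending')

    def interp(value):
        if not x[0] <= value <= x[-1]:
            raise ValueError('value outside the tabulated range')
        # Find the interval [x[i], x[i + 1]] that contains the value.
        i = min(bisect_right(x, value) - 1, len(x) - 2)
        t = (value - x[i]) / (x[i + 1] - x[i])
        return y[i] + t * (y[i + 1] - y[i])

    return interp

# A table of y = x^2 on ascending x; y is ascending too, so x(y) exists.
xy = [[0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 4.0, 9.0]]
interp_y_of_x = make_interpolator(xy)
interp_x_of_y = make_interpolator(xy, inverse=True)

print(interp_y_of_x(2.0))   # 4.0, a tabulated point
print(interp_x_of_y(4.0))   # 2.0, the inverse lookup of the same point
```

Both interpolators are built from the same stored table, so only one object needs to be kept in storage.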
Module reference
- class utils.storage.StorageUser(module, persistence, func_name, func_hashstring, func_changed)
A storage user object allows access to persistent and non-persistent storage.
- Parameters
- module: str
The name of the module to which this user belongs. This name determines the cache sub-directory where files will be stored.
- persistence: str
A combination of 'r' and 'w', e.g. 'rw' or '', indicating whether the storage is read from and/or written to disk.
- func_name: function
A function that takes no parameters and returns the name of the user class.
- func_hashstring: function
A function that takes no parameters and returns a unique string identifying the user class and any of its properties that, if changed, should trigger a resetting of the storage. If the hash string changes, the storage is emptied.
- func_changed: function
A function that takes no parameters and will be called if the hash string has been found to have changed (see above).
Methods
checkForChangedHash()    Check whether the properties of the user class have changed.
getHash()    Get a unique string from the user class and convert it to a hash.
getStoredObject(object_name[, interpolator, ...])    Retrieve a stored object from memory or file.
getUniqueFilename()    Create a unique filename for this storage user.
resetStorage()    Reset the storage arrays and load persistent storage from file.
storeObject(object_name, object_data[, ...])    Save an object in memory and/or file storage.
- getHash()
Get a unique string from the user class and convert it to a hash.
- Returns
- hash: str
A string that changes if the input string is changed, but can be much shorter than the input string.
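As a rough illustration of how a long hashable string can be reduced to a short, fixed-length hash, consider the sketch below. The choice of SHA-256 and the function name get_hash are assumptions made for this example; the text above does not state which algorithm Colossus uses.

```python
import hashlib

def get_hash(hashable_string):
    """Convert an arbitrarily long string into a fixed-length hash.

    Assumption: SHA-256 is used purely as an example of a hash function.
    """
    return hashlib.sha256(hashable_string.encode('utf-8')).hexdigest()

hash_a = get_hash('MyModule_%.4f' % 1.5)
hash_b = get_hash('MyModule_%.4f' % 1.2)

print(len(hash_a))        # 64 hexadecimal characters, whatever the input length
print(hash_a != hash_b)   # True: a different parameter gives a different hash
```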
- getUniqueFilename()
Create a unique filename for this storage user.
- Returns
- filename: str
A filename that is unique to this module, storage user name, and the properties of the user as encapsulated in its hashable string.
- checkForChangedHash()
Check whether the properties of the user class have changed.
- Returns
- has_changed: bool
Returns True if the hash has changed compared to the last stored hash.
- resetStorage()
Reset the storage arrays and load persistent storage from file.
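The check-and-reset cycle can be pictured with the following sketch. ChangeTracker is a hypothetical stand-in, not the Colossus class: it keeps only an in-memory dictionary and omits the reloading of persistent storage from file that resetStorage() performs. As in the introductory example, the check happens when an object is retrieved.

```python
import hashlib

class ChangeTracker:
    """Hypothetical stand-in for the hash check; not the Colossus class."""

    def __init__(self, func_hashstring, func_changed):
        self.func_hashstring = func_hashstring
        self.func_changed = func_changed
        self.storage = {}
        self.last_hash = self._current_hash()

    def _current_hash(self):
        # Assumption: SHA-256 stands in for whatever hash is actually used.
        return hashlib.sha256(self.func_hashstring().encode('utf-8')).hexdigest()

    def check_for_changed_hash(self):
        return self._current_hash() != self.last_hash

    def get(self, name):
        # On every retrieval, compare the current hash to the last stored one.
        if self.check_for_changed_hash():
            self.func_changed()
            self.storage = {}
            self.last_hash = self._current_hash()
        return self.storage.get(name)

params = {'some_parameter': 1.5}
messages = []
tracker = ChangeTracker(lambda: 'MyModule_%.4f' % params['some_parameter'],
                        lambda: messages.append('storage has been reset'))
tracker.storage['test_data'] = [2.6, 9.5]

first = tracker.get('test_data')    # parameters unchanged, data is returned
params['some_parameter'] = 1.2
second = tracker.get('test_data')   # hash changed, storage emptied, None returned
print(first, second, messages)      # [2.6, 9.5] None ['storage has been reset']
```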
- storeObject(object_name, object_data, persistent=True)
Save an object in memory and/or file storage.
The object is written to a dictionary in memory, and also to file if persistent == True, provided that persistence contains 'w'.
- Parameters
- object_name: str
The name of the object by which it can be retrieved later.
- object_data: any
The object; can be any picklable data type.
- persistent: bool
If True, store this object on disk (if persistence is activated globally).
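The interplay between the per-object persistent flag and the global persistence string can be sketched as follows. The function store_object and the file layout (one pickled dictionary per file) are assumptions made for illustration; this is not the Colossus code.

```python
import os
import pickle
import tempfile

def store_object(memory, file_path, persistence, name, data, persistent=True):
    """Store data in the in-memory dict and, if allowed, in a pickle file.

    Hypothetical sketch: the one-dictionary-per-file layout is an assumption.
    Returns True if the object was also written to disk.
    """
    memory[name] = data
    if persistent and 'w' in persistence:
        on_disk = {}
        if os.path.exists(file_path):
            with open(file_path, 'rb') as f:
                on_disk = pickle.load(f)
        on_disk[name] = data
        with open(file_path, 'wb') as f:
            pickle.dump(on_disk, f)
        return True
    return False

memory = {}
file_path = os.path.join(tempfile.mkdtemp(), 'demo_storage.pkl')
wrote_a = store_object(memory, file_path, 'rw', 'a', [1, 2], persistent=True)
wrote_b = store_object(memory, file_path, 'rw', 'b', [3, 4], persistent=False)
wrote_c = store_object(memory, file_path, 'r', 'c', [5, 6], persistent=True)
print(wrote_a, wrote_b, wrote_c)   # True False False
```

All three objects end up in memory, but only the first reaches the file: the second opts out individually, and the third is blocked because the persistence string lacks 'w'.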
- getStoredObject(object_name, interpolator=False, inverse=False, path=None, store_interpolator=True, store_path_data=True)
Retrieve a stored object from memory or file.
If an object is already stored in memory, return it. If not, try to load it from file; if that fails, return None. If the object is a 2-dimensional table, this function can also return an interpolator. If the path parameter is passed, the object is loaded from that file path.
- Parameters
- object_name: str
The name of the object to be loaded.
- interpolator: bool
If True, return a spline interpolator instead of the underlying table. For this to work, the object data must either be an array of dimensionality [2, n] or a tuple with three entries of the format (x, y, z[x, y]) where x and y are ascending arrays and z is of dimensionality len(x), len(y).
- inverse: bool
Return an interpolator that gives x(y) instead of y(x). This parameter only works for a 1-dimensional interpolator (see interpolator above).
- path: str
If not None, data is loaded from this file path (unless it has already been loaded, in which case it is found in memory).
- store_interpolator: bool
If True (the default), an interpolator that has been created is temporarily stored so that it does not need to be created again.
- store_path_data: bool
If True (the default), data loaded from a file defined by path is stored temporarily so that it does not need to be loaded again.
- Returns
- object_data: any
Returns the loaded object (any picklable data type), or a scipy.interpolate.InterpolatedUnivariateSpline interpolator object, or None if no object was found.
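The lookup order described for getStoredObject() (memory first, then file, otherwise None) can be sketched as below. The function get_stored_object and the one-dictionary-per-file layout are assumptions for illustration only; interpolators and the path parameter are left out.

```python
import os
import pickle
import tempfile

def get_stored_object(memory, file_path, persistence, name):
    """Return an object from memory, else from the pickle file, else None.

    Hypothetical sketch; not the Colossus implementation.
    """
    if name in memory:
        return memory[name]
    if 'r' in persistence and os.path.exists(file_path):
        with open(file_path, 'rb') as f:
            on_disk = pickle.load(f)
        if name in on_disk:
            memory[name] = on_disk[name]   # keep a copy in memory for next time
            return memory[name]
    return None

# Prepare a file that holds one object, and a memory dict that holds another.
file_path = os.path.join(tempfile.mkdtemp(), 'demo_storage.pkl')
with open(file_path, 'wb') as f:
    pickle.dump({'on_disk_only': [2.6, 9.5]}, f)
memory = {'in_memory': 'fast path'}

found_memory = get_stored_object(memory, file_path, 'rw', 'in_memory')
found_disk = get_stored_object(memory, file_path, 'rw', 'on_disk_only')
found_none = get_stored_object(memory, file_path, 'rw', 'missing')
print(found_memory, found_disk, found_none)   # fast path [2.6, 9.5] None
```

Because a missing object yields None rather than an exception, callers typically test the return value and compute the object only when needed.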
- utils.storage.getCacheDir(module=None)
Get a directory for the persistent caching of data. The function attempts to locate the home directory and (if necessary) create a ‘.colossus’ sub-directory. In the rare case where that fails, the location of this code file is used as a base directory.
- Parameters
- module: string
The name of the module that is requesting this cache directory. Each module has its own directory in order to avoid name conflicts.
- Returns
- path: string
The cache directory.
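A directory lookup of this kind might look like the sketch below. The function get_cache_dir, its home argument (present only so the example can run without touching the real home directory), and the exact nesting of the module directory under '.colossus' are assumptions; only the home-directory lookup and the fallback to the code file location are taken from the description above.

```python
import os
import tempfile

def get_cache_dir(module=None, home=None):
    """Return (and create if necessary) a cache directory.

    Hypothetical sketch: the `home` argument and the directory nesting
    are assumptions made for this example.
    """
    if home is None:
        home = os.path.expanduser('~')
    try:
        path = os.path.join(home, '.colossus')
        os.makedirs(path, exist_ok=True)
    except OSError:
        # Fall back to the directory that contains this code file.
        path = os.path.dirname(os.path.abspath(__file__))
    if module is not None:
        # Each module gets its own sub-directory to avoid name conflicts.
        path = os.path.join(path, module)
        os.makedirs(path, exist_ok=True)
    return path

fake_home = tempfile.mkdtemp()
cache_dir = get_cache_dir('myModule', home=fake_home)
print(cache_dir)
```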