Reference Manual
All data of MemCom objects are 1- or 2-dimensional numpy arrays. MemCom supports the data types numpy.char, numpy.int32, numpy.int64, numpy.float32, numpy.float64, numpy.complex32, numpy.complex64, or a Python sequence of relational tables, i.e. a sequence of one or more memcom.tb objects.
The db Object
A Database Object is associated with a MemCom database. It is similar to a Python mapping type, mapping a string (the dataset name) to a dataset object (the value of the object).
A MemCom database is opened when the memcom.db object is created. To close a database, either (1) call the method memcom.db.close() or (2) delete the memcom.db object. The MemCom database is implicitly closed when the memcom.db object is deleted (i.e. when this object becomes unreachable and is garbage-collected). However, since garbage collection is not guaranteed to happen and a Python implementation is allowed to postpone it, an explicit close() method is provided for the memcom.db object.
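Because implicit closing depends on garbage collection, it can help to make the close explicit and exception-safe. The sketch below uses contextlib.closing from the standard library, which calls close() when the with block is left. FakeDb is a hypothetical stand-in so that the snippet runs without MemCom installed; with the real module one would presumably pass memcom.db("test.mc") instead. Whether memcom.db supports the with statement directly is not stated in this reference, so the sketch does not rely on it.

```python
from contextlib import closing


class FakeDb:
    """Hypothetical stand-in for memcom.db; only models the open/closed state."""

    def __init__(self, dirname):
        self.dirname = dirname
        self.is_open = True  # the real db opens its files on construction

    def close(self):
        # Mirrors the documented behaviour: the object survives,
        # but its connection to the database files is gone.
        self.is_open = False


def is_open_after_use(dirname):
    """Use a database inside a with block and report its state afterwards."""
    with closing(FakeDb(dirname)) as db:
        assert db.is_open  # still open while inside the block
    return db.is_open


print(is_open_after_use("test.mc"))  # False
```

The same pattern closes the database even if the body of the with block raises an exception.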
The Database Object class and member function reference below is followed by a list of the Database Object operators.
- class db(dirname=None, mode='r', npages=None, psize=None, handle=None)
Creates a Database instance and opens the database files. No data are transferred at this stage.
- Parameters:
dirname (str) – Path to database.
mode (str) – Open mode: ‘r’ means read only (default), ‘w’ means read/write.
npages (int) – If specified, open the MemCom database in buffered mode with npages pages of size psize bytes.
psize (int) – Page size (bytes), if specified together with npages.
handle (int) – Existing database handle. Default: no handle, i.e. the database is not open.
- Raises:
MemcomIOError – If dirname is not a MemCom database.
- clear(wildcard=None, match=None, search=None)
Removes all selected datasets. By default all datasets are removed.
- Parameters:
wildcard (str) – Selects names with regexp expressions.
match (str) – Selects names by exact match.
search (str) – Selects names by pattern.
- close()
Close the MemCom database. Note that the db object is still there but with no connection to the database files.
Explicitly closing a database invalidates all references to the database. Since closing does not mean saving all data, datasets not explicitly saved to the database will be lost.
- copy(wildcard=None, match=None, search=None)
Returns a (shallow) copy of selected datasets.
- Parameters:
wildcard (str) – Selects names with regexp expressions.
match (str) – Selects names by exact match.
search (str) – Selects names by pattern.
- Returns:
A dictionary containing the copies of the datasets.
- get(name, default=None)
Returns the dataset object identified by name. If the [:] operator is appended the content of the dataset is returned. See examples. get is equivalent to the db[] operator.
- Parameters:
name (str) – Dataset name.
- Returns:
Dataset object.
- Raises:
MemcomKeyError – Dataset not found.
- has_key(name)
Checks if a dataset identified by name exists on the database.
- Parameters:
name (str) – Dataset name.
- Returns:
True if the dataset exists, False otherwise.
- keys(wildcard=None, match=None, search=None)
Searches for selected dataset names.
- Parameters:
wildcard (str) – Selects names with regexp expressions.
match (str) – Selects names by exact match.
search (str) – Selects names by pattern.
- Returns:
List containing the selected dataset names.
- items(wildcard=None, match=None, search=None)
Searches for selected dataset names.
- Parameters:
wildcard (str) – Selects names with regexp expressions.
match (str) – Selects names by exact match.
search (str) – Selects names by pattern.
- Returns:
List of pairs (name, dataset).
- pop(name, *arg)
Removes the dataset identified by name from the database and returns the object.
- Parameters:
name (str) – Dataset name.
- Returns:
Popped dataset.
- Raises:
MemcomKeyError – Dataset not found.
- setdefault(name, value=None)
Searches the database for a dataset identified by name. If the dataset exists, its content, i.e. the value, is returned. If not, the dataset is set to value and the content, i.e. the value, is returned.
- Parameters:
name (str) – Dataset name
value (array) – Value to be set. Default is None.
- Returns:
The dataset object db[name].
- values(wildcard=None, match=None, search=None)
Searches the database by dataset name and returns the values of the selected datasets.
- new_dataset(name, typ, dims, colspec=None)
Creates a new empty dataset on the database.
Relational tables created with subsets cannot be filled on a one by one basis - they must be filled with a list containing all subset dictionaries (relational tables).
- Parameters:
name (str) – Dataset name.
typ (str) – Dataset type.
dims (tuple) – Dimensions of the dataset, i.e. (nrows,) or (nrows, ncols).
colspec (str) – List containing column names (optional, required for array tables only).
- Raises:
MemcomErrorRonly – Database is read only.
MemcomErrorSetExists – Dataset exists.
MemcomIOError – Cannot create dataset.
- pack()
Packs the database file by removing unused space.
- get_stream(address, type, size)
Return an array from the database at a specified address.
- Parameters:
address (int) – Byte address on the database file where the array starts. address must be greater than or equal to 0, i.e. the lowest byte address is 0.
type (str) – Array type.
size (int) – Array size to be read (data elements, not bytes). Must be greater than or equal to 0.
A numerical array of size elements of type type is returned.
- set_stream(address, values)
Write an array to the database at a specified address (caution!).
- Parameters:
address (int) – Byte address on the database file stream where the array starts. Must be greater than or equal to 0, i.e. the lowest byte address is 0.
values (array) – Values to be written.
- dir(file=sys.stdout)
Prints the attributes of all datasets. Note: print db_object will print dataset names only.
- property addrsize
Returns the creator platform address size (4 or 8 bytes, int).
- property cdate
Returns the creation date in seconds since the Epoch (int).
- property adate
Returns the last access in seconds since the Epoch (int).
- property handle
Returns the MemCom file handle (int).
- property iotype
Returns the database file I/O type (str).
- property mode
Returns the database access mode (str).
- property dbname
Returns the database full name.
- property naddr
Returns the size of the db stream (int).
- property nentries
Returns the number of dataset entries on the database (int).
- property nhtentries
Returns the number of hash table entries.
- property nsets
Returns the number of datasets.
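The address and size arguments of get_stream use different units: the address is a byte offset, while the size counts data elements. The following sketch illustrates that distinction on a plain bytes buffer with the standard struct module. read_stream is a hypothetical helper, not part of MemCom, and the little-endian float64 layout is an assumption made only for this example.

```python
import struct


def read_stream(buf, address, fmt_char, size):
    """Read `size` elements of struct type `fmt_char` starting at byte `address`.

    Hypothetical helper for illustration only; it is not MemCom's get_stream.
    """
    if address < 0 or size < 0:
        raise ValueError("address and size must be >= 0")
    fmt = "<%d%s" % (size, fmt_char)  # e.g. "<2d" = two little-endian doubles
    return list(struct.unpack_from(fmt, buf, address))


# Four float64 values occupy 4 * 8 = 32 bytes.
buf = struct.pack("<4d", 1.0, 2.0, 3.0, 4.0)

# Byte address 8 is the start of the second element; read two elements.
values = read_stream(buf, 8, "d", 2)
print(values)  # [2.0, 3.0]
```

Reading two float64 elements consumes 16 bytes, which is why element counts and byte addresses must not be mixed up.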
Operator | Description |
---|---|
db | List of all dataset names of database object db. |
db[name] | Addresses a dataset object identified by name. Raises a MemcomKeyError exception if the dataset is not in the database. |
db[name] = value | Creates or replaces the content of the dataset identified by name and initializes it with value. If a dataset of that name already exists, it is removed from the database before the new dataset is inserted. |
db1[name1] == db2[name2] | Compares the dataset object identified by name1 to another object identified by name2 and returns True if they are equal, False otherwise. |
del db[name] | Deletes the dataset identified by name from the database and invalidates any reference to the object. |
db1[name1] != db2[name2] | Compares dataset objects and returns True if they are not equal, False otherwise. |
len(db) | Returns the number of datasets on the database. |
name in db | True if the dataset name is in the database db, False otherwise. |
name not in db | True if the dataset name is not in the database db, False otherwise. |
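The operators in this table follow Python's mapping protocol. As a rough illustration of how such operators map onto special methods, the sketch below implements them on an in-memory stand-in. MiniDb is hypothetical and stores plain Python values; it raises the built-in KeyError where the table mentions MemcomKeyError.

```python
class MiniDb:
    """Hypothetical in-memory stand-in that mimics the db operators."""

    def __init__(self):
        self._sets = {}

    def __getitem__(self, name):
        # db[name]; raises KeyError if the name is unknown
        return self._sets[name]

    def __setitem__(self, name, value):
        # db[name] = value; replaces any existing dataset of that name
        self._sets[name] = value

    def __delitem__(self, name):
        # del db[name]
        del self._sets[name]

    def __contains__(self, name):
        # name in db / name not in db
        return name in self._sets

    def __len__(self):
        # number of datasets held
        return len(self._sets)

    def setdefault(self, name, value=None):
        # Return the existing value, or store `value` first if absent.
        return self._sets.setdefault(name, value)


db = MiniDb()
db["COOR.1"] = [0.0, 1.0, 2.0]
db.setdefault("COOR.2", [3.0])
print(len(db), "COOR.1" in db, "COOR.9" not in db)
del db["COOR.1"]
print(len(db))
```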
Examples
Open a database in read only mode (default):
db = memcom.db("test.mc")
Open a database file in read-write mode:
my_db = memcom.db("test.mc",'rw')
Get a list of all dataset names in db:
db.keys()
Get a list of dataset names matching 'COOR.*':
db.keys('COOR.*')
['COOR.3', 'COOR.2', 'COOR.1']
Note that datasets are not listed in sorted order. To get sorted order:
sorted(db.keys('COOR.*'))
['COOR.1', 'COOR.2', 'COOR.3']
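To make the selection and sorting steps concrete without a database at hand, here is a sketch that filters a list of names with the standard re module. select_names is a hypothetical helper; it assumes the expression must match the whole name, which may differ from how MemCom interprets its wildcard argument.

```python
import re


def select_names(names, wildcard=None):
    """Return the names matching the regular expression `wildcard`, sorted.

    Hypothetical helper; assumes the expression must match the full name.
    """
    if wildcard is None:
        return sorted(names)
    pattern = re.compile(wildcard)
    return sorted(n for n in names if pattern.fullmatch(n))


names = ["COOR.3", "DISP.1", "COOR.2", "COOR.1"]
print(select_names(names, "COOR.*"))  # ['COOR.1', 'COOR.2', 'COOR.3']
```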
The dataset Object
The memcom.dataset object is associated with a MemCom dataset of a MemCom database. It is similar to a multidimensional mutable sequence Python type.
Depending on the type of the memcom.dataset, accessing a dataset object returns a multidimensional numpy array of type numpy.char, numpy.int32, numpy.int64, numpy.float32, numpy.float64, numpy.complex32, numpy.complex64, or a Python sequence of relational tables, i.e. a sequence of one or more memcom.tb objects.
- class dataset(db, name=None, attr=None)
Create a reference to an existing MemCom dataset. Note: The value of a dataset object is similar to a multidimensional sequence.
- Parameters:
db (db) – Database object.
name (str) – Dataset name. If not specified, the MemCom dataset attributes object attr parameter will be interrogated for the name.
- get_db()
Database object of this dataset.
- property db
Database object
- browse()
Start the MemCom browser for the dataset
- browse_desc()
Start the mcbrowser for the descriptor of the set
- get_colspec()
Column specification of Array Table (AT) or Sparse Table (ST) dataset. A list of lists.
- property colspec
column specification
- get_desc()
Dataset descriptor (dict). Setting the descriptor replaces any existing descriptor.
- property desc
Descriptor
- get_dims()
Shape (dimensions) of dataset. List of int (nrows,) or (nrows, ncols).
- property dims
Shape of dataset
- get_shape()
Shape (dimensions) of dataset. Tuple of int (nrows,) or (nrows, ncols).
- property shape
Shape of dataset
- get_dsize()
Dataset descriptor size (in bytes).
- property dsize
Dataset descriptor size
- get_faddr()
Stream byte address of the dataset on the database.
- property faddr
File address
- get_name()
Return dataset name (str).
- set_name(value)
Dataset name (str). Setting the name renames the dataset on the database.
- property name
Dataset name
- get_size()
Returns the total size of the dataset in units of array elements (int).
- property size
Returns the total size of the dataset in units of array elements (int).
- get_type()
Dataset type (str).
- property type
Dataset type
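For array datasets, the element count reported by size would normally be the product of the dimensions reported by dims or shape. The sketch below shows that relationship with the standard library only. expected_size is a hypothetical helper, and the assumption that size equals the product of the dimensions is not confirmed by this reference for every dataset type.

```python
import math


def expected_size(dims):
    """Number of array elements implied by a shape such as (nrows,) or (nrows, ncols)."""
    return math.prod(dims)


print(expected_size((100,)))    # 100
print(expected_size((100, 3)))  # 300
```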
The following standard mutable sequence operations are defined on a dataset instance:
Operator | Description |
---|---|
ds[i] | The subarray of dataset ds selected by the slice i. |
ds[i] = value | The subarray of dataset ds selected by the slice i is replaced by value. |
ds1 == ds2 | True if the two datasets have the same content. |
ds1 != ds2 | True if the two datasets do not have the same content. |
The following operations and properties are also defined on a dataset instance:
Operation | Result |
---|---|
ds.name | Name of the dataset. |
ds.name = value | Rename (change the name of) the dataset ds to value. |
ds.type | Type of the dataset ds. |
ds.dims | Dimension of the dataset ds. |
ds.dims = dims | Change the dimension(s) of the dataset ds to dims. |
ds.colspec | The colspec of a dataset of type RTable*. If the dataset is of type ATable, a description of the columns of the dataset is returned. |
ds.desc | Descriptor (RTable object) of the dataset ds. |
ds.desc = d | Change the descriptor of the dataset ds to d (dict). |
The tb Object
The tb object, a dictionary, is associated with a loaded MemCom relational table, i.e. a relational table mapped in memory. It maps a string (the key) to a numpy array of type numpy.char, numpy.int32, numpy.int64, numpy.float32, numpy.float64, numpy.complex32, or numpy.complex64, referred to as value.
- class tb(dataset=None, desc=None, pos=-1, value=None)
Creates a dictionary or array of dictionaries object associated with MemCom relational table datasets, mapping one or more keys to numpy arrays of type numpy.char, numpy.int32, numpy.int64, numpy.float32, numpy.float64, numpy.complex32, or numpy.complex64, referred to as value.
- get_name()
Dataset name (str).
- property name
Dataset name
- close()
Closes and invalidates the relational table. The content of the table is NOT saved.
- has_key(key)
Checks if a key exists in the relational table.
- Parameters:
key (str) – Key name.
- Returns:
True if key exists, False otherwise.
- get(key, value=None)
Returns the value of key in the relational table.
- Parameters:
key (str) – Key name.
- Returns:
value or None.
- setdefault(key, value='')
Searches for an entry key in the relational table. If key exists, the content, i.e. the value, is returned. If not, a new entry key with value is inserted in the relational table and value is returned. Note that setdefault() is similar to get().
- Parameters:
key (str) – Key name.
value (array) – Value to be set. Default is an empty string.
- Returns:
The value of the entry.
- pop(key, *arg)
Removes a relational table entry identified by key from the relational table and returns the value.
- Parameters:
key (str) – Key name.
- Raise:
MemcomKeyError when no default value is given and the entry is not found.
- popitem()
Removes and returns an arbitrary (key, value) pair from the relational table.
- Raise:
MemcomKeyError if dataset is not found or table empty.
- clear(wildcard=None, match=None, search=None)
Clears the relational table buffer, i.e removes all entries.
- Parameters:
wildcard (str) – Selects relational table entries (names) with regexp expressions.
match (str) – Selects relational table entries (names) by exact match.
search (str) – Selects relational table entries (names) by pattern.
- Raise:
MemcomIOError if the table on the database cannot be cleared.
- keys(wildcard=None, match=None, search=None)
Returns a list of the entries (names) contained in the relational table buffer.
- Parameters:
wildcard (str) – Selects relational table entries (names) with regexp expressions.
match (str) – Selects relational table entries (names) by exact match.
search (str) – Selects relational table entries (names) by pattern.
- values(wildcard=None, match=None, search=None)
Returns a list of the values of selected entries.
- Parameters:
wildcard (str) – Selects relational table entries (names) with regexp expressions.
match (str) – Selects relational table entries (names) by exact match.
search (str) – Selects relational table entries (names) by pattern.
- items(wildcard=None, match=None, search=None)
Returns a list containing selected relational table objects by (name, value) pairs.
- Parameters:
wildcard (str) – Selects relational table entries (names) with regexp expressions.
match (str) – Selects relational table entries (names) by exact match.
search (str) – Selects relational table entries (names) by pattern.
- update(b)
Updates the relational table with a dictionary b containing the (key, value) pairs to be updated.
- Parameters:
b (dict) – Dictionary of relational table entries to be updated.
- copy(wildcard=None, match=None, search=None)
Returns a dict with copies of table objects.
- Parameters:
wildcard (str) – Selects relational table entries (names) with regexp expressions.
match (str) – Selects relational table entries (names) by exact match.
search (str) – Selects relational table entries (names) by pattern.
- iterkeys(wildcard=None, match=None, search=None)
Returns an iterator over the selected relational table entries.
- Parameters:
wildcard (str) – Selects relational table entries (names) with regexp expressions.
match (str) – Selects relational table entries (names) by exact match.
search (str) – Selects relational table entries (names) by pattern.
- itervalues(wildcard=None, match=None, search=None)
Returns an iterator over the values of the selected relational table entries.
- Parameters:
wildcard (str) – Selects relational table entries (names) with regexp expressions.
match (str) – Selects relational table entries (names) by exact match.
search (str) – Selects relational table entries (names) by pattern.
- iteritems(wildcard=None, match=None, search=None)
Returns an iterator over the (key, value) pairs of the selected relational table entries.
- Parameters:
wildcard (str) – Selects relational table entries (names) with regexp expressions.
match (str) – Selects relational table entries (names) by exact match.
search (str) – Selects relational table entries (names) by pattern.
- sync()
Saves the current state of the relational table to the database.
- Raise:
MemcomIOError if the database is open in read-only mode.
- get_size()
Relational table buffer size (bytes)
- property size
Relational table size
- get_used_size()
Relational table used space (bytes)
- property used_size
Relational table used size
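The difference between get() and setdefault() is easiest to see on a plain Python dict, which the tb object is described as resembling. The sketch uses only built-in dict methods and list values in place of numpy arrays; that tb behaves identically in every detail is an assumption.

```python
table = {"id": [1, 2, 3]}

# get() returns the fallback but leaves the mapping unchanged.
missing = table.get("mass", [])
present_after_get = "mass" in table
print(missing, present_after_get)

# setdefault() returns the fallback and also inserts it under the key.
inserted = table.setdefault("mass", [])
present_after_setdefault = "mass" in table
print(inserted, present_after_setdefault)

# For an existing key both simply return the stored value.
print(table.get("id"), table.setdefault("id", []))
```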
The following standard mapping type operations are defined on a tb instance:
Operation | Result |
---|---|
dict(tb) | Returns the Python dict of tb. |
len(tb) | Returns the number of entries (key, value) of a relational table. Iterates through all entries of the relational table (complexity of the operation: number of keys in the relational table). |
len(tb[key]) | Returns the number of array elements of the value of key. |
tb[key] | Addresses the value of key. |
tb[key] = value | Assigns a new value to key. Any old value is overwritten. |
del tb[key] | Removes the entry key from the table tb. |
tb.clear() | Removes all entries from the table tb. |
tb.copy() | Makes a (shallow) copy of tb. |
key in tb | 1 if tb contains key, 0 otherwise. |
key not in tb | 1 if tb does not contain key, 0 otherwise. |
tb == tb1 | Returns True if the two relational table objects have the same content. |
tb != tb1 | Returns True if the two relational table objects do not have the same content. |
The following operations and properties are also defined on a tb instance:
Operation | Description |
---|---|
tb.close() | Saves the relational table tb on the database and invalidates tb. |
tb.sync() | Saves the relational table tb on the database. |
tb.size | Returns the buffer size of the relational table tb (bytes). |
tb.used_size | Returns the used space of the relational table tb (bytes). |
Utility Functions
- is_db(path)
Returns 1 if path is a MemCom database and 0 if it is not. If the file path does not exist, -1 is returned.
- get_version()
Return a tuple of the current MemCom version.
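Since is_db() is documented to return -1 for a path that does not exist, and -1 is truthy in Python, a bare truth test on the result would treat a missing file as a database. The sketch below shows an explicit check on the return code. classify is a hypothetical helper that works on the three documented codes alone, so it runs without MemCom installed.

```python
def classify(code):
    """Translate the documented is_db() return codes into a readable status."""
    if code == 1:
        return "database"
    if code == 0:
        return "not a database"
    if code == -1:
        return "missing"
    raise ValueError("unexpected return code: %r" % (code,))


# -1 is truthy, so compare against 1 explicitly instead of relying on bool().
for code in (1, 0, -1):
    print(code, bool(code), classify(code))
```

With the real function, the safe test would therefore be a comparison such as is_db(path) == 1.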
Extended Slices
An extended slice selects a multidimensional sub-array in a multidimensional array (e.g., a dataset object or a numpy array). Extended slicing may be used in expressions or as targets in assignments.
The dimension of the selected slice can be between 0 (e.g., only one element is selected) and the dimension of the multidimensional array (e.g., all elements are selected).
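The rule for the dimension of the selected slice can be sketched in a few lines: each integer index removes one dimension, and each slice keeps one. selected_ndim is a hypothetical helper that follows numpy's basic indexing rules for integers and slices; it does not cover Ellipsis, None, or index arrays.

```python
def selected_ndim(ndim, index):
    """Dimension of the result of indexing an `ndim`-dimensional array.

    `index` is an int, a slice, or a tuple of them, as in a[1, 2:5].
    """
    if not isinstance(index, tuple):
        index = (index,)
    if len(index) > ndim:
        raise IndexError("too many indices")
    removed = sum(1 for item in index if isinstance(item, int))
    return ndim - removed


print(selected_ndim(2, (1, 2)))                      # 0: one element selected
print(selected_ndim(2, (1, slice(None))))            # 1: one row
print(selected_ndim(2, (slice(None), slice(None))))  # 2: whole array
```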
numpy Arrays
Please consult the numpy documentation.