Reference Manual

All data of MemCom objects are 1- or 2-dimensional numpy arrays. MemCom supports the data types numpy.char, numpy.int32, numpy.int64, numpy.float32, numpy.float64, numpy.complex32, numpy.complex64, or a Python sequence of relational tables, i.e. a sequence of one or more memcom.tb objects.

The db Object

A Database Object is associated with a MemCom database. It is similar to a Python mapping type, mapping a string (the dataset name) to a dataset object (the value of the object).

A MemCom database is opened when the memcom.db object is created. To close a database, either (1) call the method memcom.db.close() or (2) delete the memcom.db object. The MemCom database is implicitly closed when the memcom.db object is deleted (i.e. when the object becomes unreachable and is garbage-collected). Because a Python implementation may postpone garbage collection or never perform it, the memcom.db object provides an explicit close() method.
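Since garbage collection may be postponed, a deterministic close is usually preferable. The following is a minimal sketch of one way to guarantee it, using contextlib.closing from the standard library. StubDb is a hypothetical stand-in so that the snippet runs without MemCom; the assumption is that a real memcom.db object, which provides close(), could be wrapped in the same way.

```python
from contextlib import closing


class StubDb:
    """Hypothetical stand-in for memcom.db (not part of MemCom)."""

    def __init__(self, dirname, mode="r"):
        self.dirname = dirname
        self.mode = mode
        self.is_open = True  # illustrative flag, not a MemCom attribute

    def close(self):
        self.is_open = False


# closing() calls db.close() when the block exits, even after an exception.
with closing(StubDb("test.mc")) as db:
    opened_inside = db.is_open

closed_after = not db.is_open
```

With this pattern the close no longer depends on when, or whether, the object is garbage-collected.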

The Database Object class and member function reference below is complemented by a list of the Database Object operators.

class db(dirname=None, mode='r', npages=None, psize=None, handle=None)

Creates a Database instance and opens the database files. No data are transferred at this stage.

Parameters:
  • dirname (str) – Path to database.

  • mode (str) – Open mode: ‘r’ means read only (default), ‘w’ means read/write.

  • npages (int) – If specified, open the MemCom database in buffered mode with npages pages of size psize bytes.

  • psize (int) – Page size in bytes, if specified together with npages.

  • handle (int) – Existing database handle. Default: no handle, i.e. the database is not open.

Raises:

MemcomIOError – If dirname is not a MemCom database.

clear(wildcard=None, match=None, search=None)

Removes all selected datasets. By default all datasets are removed.

Parameters:
  • wildcard (str) – Selects names with regexp expressions.

  • match (str) – Selects names by exact match.

  • search (str) – Selects names by pattern.

close()

Close the MemCom database. Note that the db object is still there but with no connection to the database files.

Explicitly closing a database invalidates all references to the database. Since closing does not mean saving all data, datasets not explicitly saved to the database will be lost.

copy(wildcard=None, match=None, search=None)

Returns a (shallow) copy of selected datasets.

Parameters:
  • wildcard (str) – Selects names with regexp expressions.

  • match (str) – Selects names by exact match.

  • search (str) – Selects names by pattern.

Returns:

A dictionary containing the copies of the datasets.

get(name, default=None)

Returns the dataset object identified by name. If the [:] operator is appended, the content of the dataset is returned. See examples. get is equivalent to the db[] operator.

Parameters:

name (str) – Dataset name.

Returns:

Dataset object.

Raises:

MemcomKeyError – Dataset not found.

has_key(name)

Checks if a dataset identified by name exists on the database.

Parameters:

name (str) – Dataset name.

Returns:

True if the dataset exists, False otherwise.

keys(wildcard=None, match=None, search=None)

Searches for selected dataset names.

Parameters:
  • wildcard (str) – Selects names with regexp expressions.

  • match (str) – Selects names by exact match.

  • search (str) – Selects names by pattern.

Returns:

List containing the selected dataset names.

items(wildcard=None, match=None, search=None)

Searches for selected datasets and returns them together with their names.

Parameters:
  • wildcard (str) – Selects names with regexp expressions.

  • match (str) – Selects names by exact match.

  • search (str) – Selects names by pattern.

Returns:

List of pairs (name, dataset).

pop(name, *arg)

Removes a dataset identified by name from the database and returns the object.

Parameters:

name (str) – Dataset name.

Returns:

Popped dataset.

Raises:

MemcomKeyError – Dataset not found.

setdefault(name, value=None)

Searches the database for a dataset identified by name. If the dataset exists, its content, i.e. the value, is returned. If not, the dataset is set to value and that value is returned.

Parameters:
  • name (str) – Dataset name

  • value (array) – Value to be set. Default is None.

Returns:

The dataset object db[name].
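The behaviour described for setdefault mirrors dict.setdefault in standard Python. A minimal sketch follows, using a plain dict as a stand-in for the database (an assumption made so that the snippet runs without MemCom; the names and values are illustrative only).

```python
# A plain dict stands in for a database object (assumption, not MemCom).
store = {"COOR.1": [0.0, 1.0]}

# Name present: the stored value is returned, the argument is ignored.
existing = store.setdefault("COOR.1", [9.9])

# Name absent: the value is stored first and then returned.
created = store.setdefault("COOR.2", [2.0])
```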

values(wildcard=None, match=None, search=None)

Searches the database by dataset name and returns the values of the selected datasets.

Parameters:
  • wildcard (str) – Selects names with regexp expressions.

  • match (str) – Selects names by exact match.

  • search (str) – Selects names by pattern.

Returns:

List containing the values (content) of the selected datasets. By default all datasets are processed.

new_dataset(name, typ, dims, colspec=None)

Creates a new empty dataset on the database.

Relational tables created with subsets cannot be filled one by one; they must be filled with a list containing all subset dictionaries (relational tables).

Parameters:
  • name (str) – Dataset name.

  • typ (str) – Dataset type.

  • dims (tuple) – List containing the dimensions of the dataset, i.e. (nrows,) or (nrows, ncols).

  • colspec (str) – List containing column names (optional, required for array tables only).

Raises:
  • MemcomErrorRonly – Database is read only.

  • MemcomErrorSetExists – Dataset exists.

  • MemcomIOError – Cannot create dataset.

pack()

Packs the database file by removing unused space.

get_stream(address, type, size)

Return an array from the database at a specified address.

Parameters:
  • address (int) – Byte address on the database file where the array starts. address must be greater than or equal to 0, i.e. the lowest byte address is 0.

  • type (str) – Array type.

  • size (int) – Number of array elements to be read (data elements, not bytes). Must be greater than or equal to 0.

A numerical array of size elements of type type is returned.
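The distinction between the byte address and the element count can be illustrated without MemCom. The following is a minimal sketch using the standard struct module on an in-memory buffer. The helper read_elements and its fmt codes are hypothetical and are not part of the MemCom API, whose type strings may differ.

```python
import struct


def read_elements(buf, address, fmt, size):
    """Hypothetical helper: read `size` elements of struct code `fmt`
    starting at byte offset `address` of `buf`."""
    if address < 0 or size < 0:
        raise ValueError("address and size must be >= 0")
    return list(struct.unpack_from(f"<{size}{fmt}", buf, address))


# Eight padding bytes followed by four little-endian 32-bit integers.
buf = bytes(8) + struct.pack("<4i", 10, 20, 30, 40)

# The address counts bytes (8 padding + 4 for the first integer),
# while the size counts elements (two integers, i.e. 8 bytes).
values = read_elements(buf, address=12, fmt="i", size=2)
```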

set_stream(address, values)

Write an array to the database at a specified address (caution!).

Parameters:
  • address (int) – Byte address on the database file stream where the array starts. Must be greater than or equal to 0, i.e. the lowest byte address is 0.

  • values (array) – Values to be written.

dir(file=sys.stdout)

Prints the dataset attributes of all datasets. Note: print(db_object) prints the dataset names only.

property addrsize

Returns the creator platform address size (4 or 8 bytes, int).

property cdate

Returns the creation date in seconds since the Epoch (int).

property adate

Returns the last access in seconds since the Epoch (int).
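Values given in seconds since the Epoch, such as cdate and adate, can be turned into readable timestamps with the standard datetime module. A minimal sketch follows. The helper name epoch_to_utc and the sample values are illustrative, and interpreting the values as UTC is an assumption.

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds):
    """Convert seconds since the Epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


stamp = epoch_to_utc(0)                  # the Epoch itself
text = epoch_to_utc(86400).isoformat()   # exactly one day later
```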

property handle

Returns the MemCom file handle (int).

property iotype

Returns the database file I/O type (str).

property mode

Returns the database access mode (str).

property dbname

Returns the database full name.

property naddr

Returns the size of the db stream (int).

property nentries

Returns the number of dataset entries on the database (int).

property nhtentries

Number of hash table entries.

property nsets

Number of datasets.

Database Object Operators

Operator – Description

  • db – List of all dataset names of database object db.

  • db[name] – Addresses the dataset object identified by name. Raises a MemcomKeyError exception if the dataset is not in the database.

  • db[name] = value – Creates or replaces the content of the dataset identified by name and initializes it with value. If a dataset with this name already exists, it is removed from the database before the new dataset is inserted.

  • db1[name1] == db2[name2] – Compares the dataset object identified by name1 to the dataset object identified by name2 and returns True if they are equal, False otherwise.

  • del db[name] – Deletes the dataset identified by name from the database and invalidates any reference to the object. All memcom.dataset objects that reference the deleted dataset become invalid. Raises a MemcomKeyError exception if the dataset is not in the database.

  • db1[name1] != db2[name2] – Compares dataset objects and returns True if they are not equal, False otherwise.

  • len(db) – Returns the number of datasets on the database.

  • name in db – True if the dataset name is in the database db, False otherwise.

  • name not in db – True if the dataset name is not in the database db, False otherwise.
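The operators listed here correspond to Python's mapping protocol. As a rough illustration of that correspondence, the sketch below implements the protocol for a small in-memory class with collections.abc.MutableMapping. MiniStore is hypothetical and says nothing about how MemCom itself is implemented.

```python
from collections.abc import MutableMapping


class MiniStore(MutableMapping):
    """Hypothetical in-memory mapping; illustrates the protocol only."""

    def __init__(self):
        self._sets = {}

    def __getitem__(self, name):          # store[name]
        return self._sets[name]

    def __setitem__(self, name, value):   # store[name] = value
        self._sets[name] = value

    def __delitem__(self, name):          # del store[name]
        del self._sets[name]

    def __iter__(self):                   # iteration over names
        return iter(self._sets)

    def __len__(self):                    # len(store)
        return len(self._sets)


store = MiniStore()
store["COOR.1"] = [1.0, 2.0]
store["COOR.2"] = [3.0]
del store["COOR.2"]
names = list(store)
```

MutableMapping derives the in and not in tests, as well as get, pop, setdefault, keys, values and items, from these five methods.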

Examples

Open a database in read only mode (default):

import memcom

db = memcom.db("test.mc")

Open a database file in read-write mode:

my_db = memcom.db("test.mc",'rw')

Get a list of all dataset names in db:

db.keys()

Get a list of dataset names matching 'COOR.*':

db.keys('COOR.*')
['COOR.3', 'COOR.2', 'COOR.1']

Note that datasets are not listed in sorted order. To get sorted order:

sorted(db.keys('COOR.*'))
['COOR.1', 'COOR.2', 'COOR.3']
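Plain sorted() orders names as strings, so a name such as 'COOR.10' would come before 'COOR.2'. If numeric order is wanted, a key function can be supplied. The following is a minimal sketch operating on an ordinary list of names; natural_key is a hypothetical helper, and the name 'COOR.10' is made up for illustration.

```python
import re


def natural_key(name):
    """Split a name into text and integer chunks for number-aware sorting."""
    return [int(part) if part.isdigit() else part
            for part in re.split(r"(\d+)", name)]


names = ["COOR.10", "COOR.2", "COOR.1"]

plain = sorted(names)                     # string order
natural = sorted(names, key=natural_key)  # numeric suffixes compared as ints
```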

The dataset Object

The memcom.dataset object is associated with a MemCom dataset of a MemCom database. It is similar to a multidimensional mutable Python sequence type.

Depending on the type of the memcom.dataset, accessing a dataset object returns a multidimensional numpy array of type numpy.char, numpy.int32, numpy.int64, numpy.float32, numpy.float64, numpy.complex32, numpy.complex64, or a Python sequence of relational tables, i.e. a sequence of one or more memcom.tb objects.

class dataset(db, name=None, attr=None)

Create a reference to an existing MemCom dataset. Note: The value of a dataset object is similar to a multidimensional sequence.

Parameters:
  • db (db) – Database object.

  • name (str) – Dataset name. If not specified, the MemCom dataset attributes object attr parameter will be interrogated for the name.

get_db()

Database object of this dataset.

property db

Database object

browse()

Start the MemCom browser for the dataset

browse_desc()

Start the mcbrowser for the descriptor of the set

get_colspec()

Column specification of Array Table (AT) or Sparse Table (ST) dataset. A list of lists.

property colspec

column specification

get_desc()

Dataset descriptor (dict). Setting the descriptor replaces any existing descriptor.

property desc

Descriptor

get_dims()

Shape (dimensions) of dataset. List of int (nrows,) or (nrows, ncols).

property dims

Shape of dataset

get_shape()

Shape (dimensions) of dataset. Tuple of int (nrows,) or (nrows, ncols).

property shape

Shape of dataset

get_dsize()

Dataset descriptor size (in bytes).

property dsize

Dataset descriptor size

get_faddr()

Stream byte address of the dataset on the database.

property faddr

File address

get_name()

Return dataset name (str).

set_name(value)

Dataset name (str). Setting the name renames the dataset on the database.

property name

Dataset name

get_size()

Returns the total size of the dataset in units of array elements (int).

property size

Returns the total size of the dataset in units of array elements (int).
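For a dataset with dimensions (nrows,) or (nrows, ncols), the element count would be expected to equal the product of the dimensions. This relation is an assumption drawn from the descriptions of dims and size, not a documented guarantee. A minimal sketch with a hypothetical helper:

```python
from math import prod


def element_count(dims):
    """Hypothetical helper: number of elements implied by a dims tuple."""
    return prod(dims)


vector_size = element_count((5,))      # 1-dimensional dataset
table_size = element_count((4, 3))     # 2-dimensional dataset
```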

get_type()

Dataset type (str).

property type

Dataset type

The following standard mutable sequence operations are defined on a dataset instance:

Dataset standard mutable sequence operations

Operator – Description

  • ds[i] – The subarray of dataset ds selected by the slice i.

  • ds[i] = value – The subarray of dataset ds selected by the slice i is replaced by value.

  • ds1 == ds2 – True if the two datasets have the same content.

  • ds1 != ds2 – True if the two datasets do not have the same content.

The following operations and properties are also defined on a dataset instance:

Standard mutable sequence dataset operations

Operation – Result

  • ds.name – Name of the dataset.

  • ds.name = value – Renames the dataset ds to value.

  • ds.type – Type of the dataset ds.

  • ds.dims – Dimensions of the dataset ds.

  • ds.dims = dims – Changes the dimension(s) of the dataset ds to dims.

  • ds.colspec – The colspec of a dataset of type RTable*. If the dataset is of type ATable, a description of the columns of the dataset is returned.

  • ds.desc – Descriptor (RTable object) of the dataset ds.

  • ds.desc = d – Changes the descriptor of the dataset ds to d (dict).

The tb Object

The tb object, a dictionary, is associated with a loaded MemCom relational table, i.e. a relational table mapped in memory. It maps a string (the key) to a numpy array of type numpy.char, numpy.int32, numpy.int64, numpy.float32, numpy.float64, numpy.complex32, numpy.complex64, referred to as the value.

class tb(dataset=None, desc=None, pos=-1, value=None)

Creates a dictionary or array of dictionaries object associated with MemCom relational table datasets, mapping one or more keys to numpy arrays of type numpy.char, numpy.int32, numpy.int64, numpy.float32, numpy.float64, numpy.complex32, numpy.complex64, referred to as values.

get_name()

Dataset name (str).

property name

Dataset name

close()

Closes and invalidates the relational table. The content of the table is NOT saved.

has_key(key)

Checks if a key exists in the relational table.

Parameters:

key (str) – Key name.

Returns:

True if key exists and False else.

get(key, value=None)

Returns the value of key in the relational table.

Parameters:

key (str) – Key name.

Returns:

value or None.

setdefault(key, value='')

Searches for an entry key in the relational table. If key exists, its content, i.e. the value, is returned. If not, a new entry key with value is inserted in the relational table and value is returned. Note that setdefault() is similar to get().

Parameters:
  • key (str) – Key name.

  • value (array) – Value to be set. Default is ''.

Returns:

The value of the entry.

pop(key, *arg)

Removes a relational table entry identified by key from the relational table and returns the value.

Parameters:

key (str) – Key name.

Raises:

MemcomKeyError – If no default value is given and key is not found.
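The optional default argument of pop works like the one of dict.pop. The sketch below uses a plain dict as a stand-in for a relational table, so it raises the built-in KeyError where the text above names MemcomKeyError. The key 'TITLE' and its value are made up for illustration.

```python
# A plain dict stands in for a relational table (assumption, not MemCom).
table = {"TITLE": "beam"}

removed = table.pop("TITLE")          # key present: value returned, entry removed
fallback = table.pop("TITLE", None)   # key absent, default given: default returned

try:
    table.pop("TITLE")                # key absent, no default: an error is raised
    raised = False
except KeyError:
    raised = True
```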

popitem()

Removes and returns an arbitrary (key, value) pair from the relational table.

Raises:

MemcomKeyError – If the dataset is not found or the table is empty.

clear(wildcard=None, match=None, search=None)

Clears the relational table buffer, i.e removes all entries.

Parameters:
  • wildcard (str) – Selects relational table entries (names) with regexp expressions.

  • match (str) – Selects relational table entries (names) by exact match.

  • search (str) – Selects relational table entries (names) by pattern.

Raises:

MemcomIOError – If the table on the database cannot be cleared.

keys(wildcard=None, match=None, search=None)

Returns a list of the entries (names) contained in the relational table buffer.

Parameters:
  • wildcard (str) – Selects relational table entries (names) with regexp expressions.

  • match (str) – Selects relational table entries (names) by exact match.

  • search (str) – Selects relational table entries (names) by pattern.

values(wildcard=None, match=None, search=None)

Returns a list of the values of selected entries.

Parameters:
  • wildcard (str) – Selects relational table entries (names) with regexp expressions.

  • match (str) – Selects relational table entries (names) by exact match.

  • search (str) – Selects relational table entries (names) by pattern.

items(wildcard=None, match=None, search=None)

Returns a list containing selected relational table objects by (name, value) pairs.

Parameters:
  • wildcard (str) – Selects relational table entries (names) with regexp expressions.

  • match (str) – Selects relational table entries (names) by exact match.

  • search (str) – Selects relational table entries (names) by pattern.

update(b)

Updates the relational table with a dictionary b containing (key, value) pairs to be updated.

Parameters:

b (dict) – Dictionary of relational table entries to be updated.

copy(wildcard=None, match=None, search=None)

Returns a dict with copies of table objects.

Parameters:
  • wildcard (str) – Selects relational table entries (names) with regexp expressions.

  • match (str) – Selects relational table entries (names) by exact match.

  • search (str) – Selects relational table entries (names) by pattern.

iterkeys(wildcard=None, match=None, search=None)

Returns an iterator over the keys of the selected relational table entries.

Parameters:
  • wildcard (str) – Selects relational table entries (names) with regexp expressions.

  • match (str) – Selects relational table entries (names) by exact match.

  • search (str) – Selects relational table entries (names) by pattern.

itervalues(wildcard=None, match=None, search=None)

Returns an iterator over the values of the selected relational table entries.

Parameters:
  • wildcard (str) – Selects relational table entries (names) with regexp expressions.

  • match (str) – Selects relational table entries (names) by exact match.

  • search (str) – Selects relational table entries (names) by pattern.

iteritems(wildcard=None, match=None, search=None)

Returns an iterator over the (key, value) pairs of the selected relational table entries.

Parameters:
  • wildcard (str) – Selects relational table entries (names) with regexp expressions.

  • match (str) – Selects relational table entries (names) by exact match.

  • search (str) – Selects relational table entries (names) by pattern.

sync()

Saves the current state of the relational table to the database.

Raises:

MemcomIOError – If the database is open in read-only mode.
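To illustrate why sync() matters, the sketch below models a table whose edits stay in an in-memory buffer until they are written back. BufferedTable is a hypothetical class built on dict. It only mimics the behaviour described here and makes no claim about how MemCom implements it.

```python
class BufferedTable(dict):
    """Hypothetical table: edits live in memory until sync() is called."""

    def __init__(self, backing):
        super().__init__(backing)   # load a copy of the stored entries
        self._backing = backing     # stands in for the table on the database

    def sync(self):
        # Write the current in-memory state back to the backing store.
        self._backing.clear()
        self._backing.update(self)


stored = {"TITLE": "beam"}
tb = BufferedTable(stored)
tb["UNITS"] = "mm"

before_sync = dict(stored)   # the new entry has not reached the store yet
tb.sync()
after_sync = dict(stored)    # now it has
```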

get_size()

Relational table buffer size (bytes)

property size

Relational table size

get_used_size()

Relational table used space (bytes)

property used_size

Relational table used size

The following standard mapping type operations are defined on a tb instance:

Standard tb mapping type operations

Operation – Result

  • dict(tb) – Returns the Python dict of tb.

  • len(tb) – Returns the number of entries (key, value) of a relational table. Iterates through all datasets of the relational table (complexity of the operation: number of keys in the relational table).

  • len(tb[key]) – Returns the number of array elements of the value of key.

  • tb[key] – Addresses the value of key.

  • tb[key] = value – Assigns a new value to key. Any old value is overwritten.

  • del tb[key] – Removes the entry key from the table tb.

  • tb.clear() – Removes all entries from the table tb.

  • tb.copy() – Makes a (shallow) copy of tb.

  • key in tb – 1 if tb contains key, 0 otherwise.

  • key not in tb – 1 if tb does not contain key, 0 otherwise.

  • tb == tb1 – Returns True if the two relational table objects have the same content.

  • tb != tb1 – Returns True if the two relational table objects do not have the same content.

The following operations and properties are also defined on a tb instance:

tb operations and properties

Operation – Description

  • tb.close() – Closes and invalidates the relational table tb. The content of the table is not saved.

  • tb.sync() – Saves the relational table tb on the database.

  • tb.size – Returns the size of the relational table buffer in bytes.

  • tb.used_size – Returns the used space of the relational table in bytes.

Utility Functions

is_db(path)

Returns 1 if path is a MemCom database and 0 if it is not. If the file path does not exist, -1 is returned.
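Because -1 is truthy in Python, the result of is_db should be compared with 1 explicitly rather than used in a bare truth test. A minimal sketch follows. The helper is_memcom_database is hypothetical and takes the status code as an argument, so it runs without MemCom.

```python
def is_memcom_database(status):
    """Interpret a status code as documented for is_db: 1, 0 or -1."""
    if status not in (1, 0, -1):
        raise ValueError(f"unexpected status: {status!r}")
    return status == 1


# A missing path yields -1, which a bare truth test would treat as True.
naive = bool(-1)
checked = is_memcom_database(-1)
```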

get_version()

Return a tuple of the current MemCom version.

Extended Slices

An extended slice selects a multidimensional sub-array in a multidimensional array (e.g., a dataset object or a numpy array). Extended slices may be used as expressions or as targets in assignments.

The dimension of the selected slice can be between 0 (e.g., only one element is selected) and the dimension of the multidimensional array (e.g., all elements are selected).
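In Python, an extended slice such as a[0:2, 1:3] reaches the object's __getitem__ method as a tuple of int and slice objects. The sketch below shows this with a small hypothetical Grid class backed by nested lists. It is read-only and two-dimensional only, and it is not how dataset objects or numpy arrays are implemented.

```python
class Grid:
    """Hypothetical 2-D container backed by nested lists."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def __getitem__(self, index):
        row_sel, col_sel = index   # the extended slice arrives as a tuple
        if isinstance(row_sel, slice):
            # Several rows selected: apply the column selection to each one.
            return [row[col_sel] for row in self.rows[row_sel]]
        # A single row selected: the result loses one dimension.
        return self.rows[row_sel][col_sel]


grid = Grid([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

element = grid[1, 2]        # dimension 0: a single element
row_part = grid[1, 0:2]     # dimension 1: part of one row
block = grid[0:2, 1:3]      # dimension 2: a sub-array
everything = grid[:, :]     # dimension 2: all elements
```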

numpy Arrays

Please consult the numpy documentation.