Using the MemCom Python Module

Import the Module

Make sure that the PYTHONPATH environment variable is defined and that you call the correct Python version (the MemCom Python module works with both Python 2 and Python 3). Start the Python interpreter from the shell and import the MemCom module:

import memcom

Optionally, import the numpy module, which may be needed when manipulating datasets:

import numpy

Database Open and Close Functions

MemCom database files are mapped to the Python class memcom.db. When a memcom.db object is created in Python, the associated MemCom database file must be specified in the argument list of the constructor.

Example Create a new memcom object, assigning a new MemCom database file demo.mc to it:

db = memcom.db("demo.mc", "n")

Create a new memcom object, assigning an existing MemCom database file toto.mc to it:

db = memcom.db("toto.mc")

By default, an existing MemCom database file is opened in read-only mode. To open an existing MemCom database file toto.mc in read and write mode:

db = memcom.db("toto.mc", "rw")

An open MemCom database does not have to be closed explicitly, since it is synced and closed when leaving Python or when no references to the object exist anymore. To save the current state of the open MemCom database file, issue

db.sync()

Close a MemCom database file and delete the object (not the database file!) in the current Python run:

del db

Note that exiting Python with Ctrl+D automatically saves and closes all open MemCom database files.

Data viewing functions

To view the directory of an open MemCom database file db, type

db.dir()

To view the content of a dataset COOR.1 of an open MemCom database file db, type

print(db['COOR.1'][:])

This command prints all elements of the dataset. To print selected elements only, use Python’s subscription operator with slices.

Example Print the first 10 elements (here: rows) of a two-dimensional data set COOR.1

print(db['COOR.1'][0:10])

 [[  9774.2        0.      1632.76 ]
  [  9774.29     512.203   1634.77 ]
  [  9773.65    1032.49    1634.42 ]
  [ 11228.7        0.      1606.1  ]
  [ 11229.       513.084   1607.76 ]
  [ 11228.9     1033.3     1609.29 ]
  [ 10502.97       0.      1707.75 ]
  [ 10502.83    1030.59    1713.92 ]
  [ 10503.75     510.311   1711.1  ]
  [  9764.22       0.      1040.54 ]]

The range operator will be explained later. To select specific datasets, make use of the keys method of the db class.

Example Loop over all datasets DISP.1.*.0.1:

for s in db.keys('DISP.1.*.0.1'):
    print(s)
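The effect of such a pattern can be tried without a database. The sketch below assumes that the pattern passed to keys behaves like a shell-style wildcard, which the text above suggests but does not state. The dataset names and the helper select are hypothetical and only stand in for the directory of a real database.

```python
from fnmatch import fnmatchcase

# Hypothetical dataset names standing in for the directory of a database.
names = ["COOR.1", "DISP.1.1.0.1", "DISP.1.2.0.1", "DISP.1.2.0.2", "DISP.2.1.0.1"]

def select(names, pattern):
    """Return the names matching a shell-style wildcard pattern.

    Assumption: db.keys(pattern) selects names in a comparable way.
    """
    return [n for n in names if fnmatchcase(n, pattern)]

# Only the sets of the form DISP.1.<anything>.0.1 are selected.
for s in select(names, "DISP.1.*.0.1"):
    print(s)
```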

Data manipulation functions

MemCom distinguishes between positional and non-positional data. For positional data sets the Python numpy module is used, i.e. positional data sets are loaded into and stored from numpy arrays. For non-positional data sets (relational tables), Python dictionaries are used, i.e. MemCom relational tables are loaded into Python dictionaries and vice versa. For dictionaries the same rules as for positional data apply, i.e. integer and floating-point values are represented as numpy arrays.

MemCom floating-point and integer values are typed according to the desired binary resolution. While there is a unique type relation between MemCom and numpy, there is no such relation between Python and MemCom. The table below explains the relation between numpy and MemCom data types:

numpy data type    MemCom data type
numpy.int32        I (32-bit signed integer value)
numpy.int64        J (64-bit signed integer value)
numpy.float32      E (32-bit floating-point value)
numpy.float64      F (64-bit floating-point value)
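The type relation can be checked with numpy alone. The following sketch uses a lookup dictionary MEMCOM_TO_NUMPY and a helper bits, both hypothetical and not part of the memcom module, to print the width of each type. It also shows how to request a specific resolution explicitly when building an array, since plain Python numbers carry no such type information.

```python
import numpy

# Hypothetical lookup table (not part of the memcom module):
# MemCom type letter -> numpy data type.
MEMCOM_TO_NUMPY = {
    "I": numpy.int32,
    "J": numpy.int64,
    "E": numpy.float32,
    "F": numpy.float64,
}

def bits(type_letter):
    """Return the width in bits of the numpy type for a MemCom type letter."""
    return numpy.dtype(MEMCOM_TO_NUMPY[type_letter]).itemsize * 8

for letter in "IJEF":
    print(letter, numpy.dtype(MEMCOM_TO_NUMPY[letter]).name, bits(letter))

# Request the resolution explicitly instead of relying on numpy's default.
ids = numpy.array([1, 2, 3], dtype=numpy.int32)
xyz = numpy.array([1.0, 2.0, 3.0], dtype=numpy.float64)
print(ids.dtype, xyz.dtype)
```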

Positional data sets

A dataset of a database object db is addressed with the [] operator. Data sets can be modified directly in the database, or by copying them to Python objects first and then copying them back to the database.

Example Identify a dataset object

db["COOR.1"]

To extract data from the database, simply address the dataset.

Example Copy the first row of dataset COOR.1 from the database db to the Python variable c:

c = db["COOR.1"][0]

To extract the entire positional data set COOR.1 into a numpy array coor, make use of the [:] operator.

Example Copy the dataset COOR.1 from the database db to the numpy array coor:

coor = db["COOR.1"][:]

To insert an element in a positional data set on the database db, simply assign a value to the element of the set.

Example Insert a row of data:

db["COOR.1"][0] = (1., 2., 3.)

The operator [:] means “select all elements of the array”. Note also that the numpy array syntax i:j:k means “from i to j-1 in steps of k”. In the case of the two-dimensional B2000++ array COOR.1 containing n rows of 3 columns each, the instruction coor = db["COOR.1"][:] loads the complete two-dimensional array into a numpy array.
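The slice syntax can be tried on an ordinary numpy array. The sketch below uses a small array with arbitrary values as a stand-in for COOR.1. It assumes, as the text states, that indexing a MemCom dataset follows the numpy rules.

```python
import numpy

# Stand-in for a COOR.1-like array: 12 rows of 3 columns, arbitrary values.
coor = numpy.arange(36, dtype=numpy.float64).reshape(12, 3)

all_rows = coor[:]          # every row
first_ten = coor[0:10]      # rows 0 to 9
every_third = coor[1:10:3]  # rows 1, 4 and 7

print(all_rows.shape)
print(first_ten.shape)
print(every_third)
```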

If a single row or a block of rows is to be loaded, the numpy array syntax applies.

Example Extract the first 10 rows, i.e. rows 0 to 9, of COOR.1 into the numpy array xyz:

xyz = db["COOR.1"][0:10]

Observe: If no index operator such as [0] or [:] is specified, as in

db["COOR.1"] = xyz

the data set COOR.1 of the database file db is deleted and rewritten. This creates unused space in the database and might require compression, see db.pack().

Relational Table Datasets

Relational tables are similar to Python dictionaries: When loaded from the database they become Python dictionary objects if converted with the Python dict() function.

Relational tables in a MemCom database have a fixed maximum size. A relational table must therefore be created or reserved in a database file with a size large enough to hold any data appended to it later (an alternative way of working with relational tables is described below).

New relational tables in a database are created with the new_set method of the db object:

db.new_set(name, memcom.RTable, (size,))

Example Create a single empty relational table in the database db, dimensioned to 4096 bytes:

db.new_set("RTABLE1", memcom.RTable, (4096,))

Example Create a relational table ETAB.1 with 10000 sub-tables of 128 bytes each, in the database db:

db.new_set("ETAB.1", memcom.RTable, (10000, 128))

In both examples the dataset type memcom.RTable is passed to the new_set function, and the tuple specifies the size. Note the trailing comma in (4096,): it is required to make a one-element tuple. Elements of the table are then inserted by assigning dictionary values to the table.
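The role of the comma is plain Python behaviour and can be verified without MemCom. In the short sketch below the variable names are illustrative only.

```python
single = (4096,)      # one-element tuple, as used for a single table
not_a_tuple = (4096)  # parentheses only: this is the integer 4096
pair = (10, 4096)     # two-element tuple, as used for 10 tables of 4096 bytes

print(type(single).__name__, len(single))
print(type(not_a_tuple).__name__)
print(type(pair).__name__, len(pair))
```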

Example Insert an integer NE with the value 123 in the table RTABLE1 in the database db:

db["RTABLE1"][:]["NE"] = 123

This looks rather complicated, but it simply means “insert key NE with a value of 123 in the dataset RTABLE1, relational table pool [:], in the database db”.

If the dataset is an array of relational tables, such as the B2000++ dataset ELEMENT-PARAMETERS, the syntax is similar.

Example Extract the content of key NAME of the third entry of ELEMENT-PARAMETERS:

name = db["ELEMENT-PARAMETERS"][2]["NAME"]

Multiple empty relational tables (subsets) are created by specifying the number of tables and the table size.

Example Create a data set ETAB.1 containing 10 tables, each of them 4096 bytes wide:

db.new_set("ETAB.1", memcom.RTable, (10, 4096,))

Dataset descriptors are dictionaries similar to relational tables, with the exception that there is only one relational table in a descriptor.

Example Extract the key ANALYSIS from the dataset descriptor of dataset CASE.1:

analysis = db["CASE.1"].desc["ANALYSIS"]

An alternative method of working with relational tables consists of (1) loading a table and converting it to a dict, (2) working with the dict, and (3) writing the dict back to the relational table in the database. The drawback is that writing a dict back to the database creates a new set. Thus, one has to remove or rename the old dataset and write the dict back as a new set. This increases the database size, unless the database is packed with db.pack().