OMPC parser/syntax adaptor

OMPC is not a full compiler stack. It takes advantage of the similarity between the syntax of MATLAB's m-files and the syntax of many other dynamic languages. Python, as one of them, happens to have a stable numerical library that allows very similar
In recent years there have been a number of projects that try to put the same interpreter or engine behind multiple languages. So rather than a full reimplementation we choose to compile m-files into Python bytecode. Doing this directly would not give us any advantage, because we would still have to deal with all the trouble of analyzing source code.
Therefore we translate (adapt) m-files into syntax that is accepted by the Python interpreter. So after running our OMPC compiler

a = 1;
for i = 1:10,
    a = a + 1;
end
a

becomes

a = 1;
for i = _mslice[1:10]
    a = a + 1;
end
print a

and the Python compiler takes care of everything else. _mslice is an object that returns
The proper way of using _mslice is as a function _mslice(start,stop,step), which is how Python's slice function works. The function _mslice() differs from Python's version in that it assumes the first index is 1 and that the last index is included.

To make things easier the 'end' keyword has to be defined with a special value because of its multiple functions:
  • stated on its own it closes one local scope, any value or object is acceptable here
  • stated in a slice, a(1:end) would resolve to the Python equivalent a(_mslice(1,None))
  • stated in a slice as a(1:end-1) would resolve in
The resolution is that we use end = 0, and _mslice interprets 0 in the stop position as the last item (included of course), returning the Python slice(1,None).
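The conversion described above can be sketched as follows. This is only an illustration under stated assumptions, not OMPC's actual _mslice: the names mslice_sketch and END are hypothetical, the result is applied to a plain 0-based Python list (so the 1-based start is shifted by one), and the treatment of end-1 as a negative stop is an assumption that follows from end = 0.

```python
END = 0  # assumed sentinel value for MATLAB's 'end', as proposed in the text


def mslice_sketch(start, stop, step=1):
    """Turn a 1-based, inclusive MATLAB-style range into a Python slice.

    Hypothetical helper for illustration; not OMPC's real _mslice.
    """
    py_start = start - 1  # 1-based -> 0-based
    if stop == END:
        # 'end' in the stop position: run through the last item.
        py_stop = None
    else:
        # A positive inclusive 1-based stop equals the exclusive 0-based stop.
        # A negative value (end-1 == -1) already means "up to, but not
        # including, the last item" in Python, which matches a(1:end-1).
        py_stop = stop
    return slice(py_start, py_stop, step)


data = [10, 20, 30, 40, 50]
whole = data[mslice_sketch(1, END)]          # a(1:end)
first_three = data[mslice_sketch(1, 3)]      # a(1:3)
all_but_last = data[mslice_sketch(1, END - 1)]  # a(1:end-1)
```

With end = 0 the expression end-1 evaluates to -1 before the slice is built, so no special parsing of the arithmetic is needed in this sketch.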

The advantages become obvious after running very simple tests and measuring their times. In such tests, simple loops are faster in Python than in MATLAB. The Python standard library allows us to look at the result of compilation, the bytecode, and analyze it. There are also attempts to optimize such code, some written in pure Python and others written in C, like Psyco.
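As an illustration of inspecting the bytecode, the standard library's dis module can list the instructions of a compiled function. The function count_up below is a hypothetical hand-written Python counterpart of the loop example, not output produced by OMPC.

```python
import dis


def count_up(n):
    # Hand-written equivalent of the MATLAB loop "for i = 1:n, a = a + 1; end".
    a = 1
    for i in range(1, n + 1):
        a = a + 1
    return a


# Collect the opcode names of the compiled function for analysis.
opnames = [ins.opname for ins in dis.get_instructions(count_up)]
```

Calling dis.dis(count_up) prints the same instructions as a human-readable listing; the exact opcodes vary between Python versions.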

Parsing/Translation rules


It is very common to use some kind of parser/lexer library that, once a grammar is specified, generates a full parser in a chosen programming language. OMPC takes a shortcut: the aim is to let the Python parser take care of interpreting the language. OMPC's role is only to make a number of adjustments that make an m-file compliant with the syntactical rules of Python. Luckily, MATLAB has a very simple syntax.

Statements end either with an end-of-line sequence ('\r\n', '\r' or '\n') or with the semicolon character ';'. Each such statement is supposed to be syntactically correct.
There is one exception, vertical stacking: elements enclosed in [] and {} can be specified over several lines without using continuations.
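The splitting rule above could be sketched like this. It is a minimal illustration, not OMPC's implementation: the name split_statements is hypothetical, and string literals, comments and '...' continuations are deliberately not handled.

```python
def split_statements(source):
    """Split m-file text into statements at newlines and semicolons.

    Separators inside [] or {} are kept, which covers vertical stacking.
    Hypothetical sketch: strings, comments and '...' are NOT handled, and
    the information whether a statement ended with ';' is dropped.
    """
    # Normalise '\r\n' and '\r' line endings to '\n' first.
    text = source.replace("\r\n", "\n").replace("\r", "\n")
    statements = []
    current = []
    depth = 0  # nesting level of [] and {}
    for ch in text:
        if ch in "[{":
            depth += 1
        elif ch in "]}":
            depth -= 1
        if ch in ";\n" and depth == 0:
            stmt = "".join(current).strip()
            if stmt:
                statements.append(stmt)
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements


parts = split_statements("a = 1;\nb = [1 2;\n 3 4];\r\nc = 2")
```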


Keywords


The following keywords are assumed:

"break", "case", "catch", "continue", "else", "elseif", "end", "for", "function", "global", "if", "otherwise", "persistent", "return", "switch", "try", "while"

Most of them are compatible with Python; the exceptions are:

"case" -> special treatment,
"catch" -> except,
"elseif" -> elif,
"end" -> can be left in place if it is defined as None,
"function" -> def,
"otherwise" -> else for switch statement,
"persistent" -> equivalent of static variable, _mfunction decorator should take care of such variables,
"switch"
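The one-to-one renames in this list can be sketched with a small lookup table. This is an assumption-laden illustration: KEYWORD_MAP and rename_keywords are hypothetical names, only the direct renames are covered, and "case", "end", "otherwise", "persistent" and "switch" are left out because they need context.

```python
import re

# Hypothetical table with only the direct renames from the list above.
KEYWORD_MAP = {
    "catch": "except",
    "elseif": "elif",
    "function": "def",
}

# Match whole words only, so identifiers such as 'myfunction' are untouched.
_KEYWORD_RE = re.compile(r"\b(" + "|".join(KEYWORD_MAP) + r")\b")


def rename_keywords(line):
    """Replace MATLAB keywords that have a direct Python counterpart.

    Note: this naive version would also rename words inside string literals.
    """
    return _KEYWORD_RE.sub(lambda m: KEYWORD_MAP[m.group(1)], line)


renamed = rename_keywords("elseif x > 1")
```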

M-functions


An example M-function looks like this

function [out1,out2] =

All functions are

for nargout('fun') function

Anonymous functions '@' and lambda


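Python's 'lambda' has very similar functionality to MATLAB's '@' symbol. There are however subtle differences in the treatment of local variables. Python uses references, whereas MATLAB copies the values of the variables at the moment the anonymous function is created, as the following comparison shows.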

Python and numpy (no OMPC):

>>> a = zeros([1,5]);
>>> b = lambda x: a+x;
>>> b(2)
array([[ 2.,  2.,  2.,  2.,  2.]])
>>> a[:] = ones([1,5]);
>>> a
array([[ 1.,  1.,  1.,  1.,  1.]])
>>> b(2)
array([[ 3.,  3.,  3.,  3.,  3.]])

MATLAB:

>> a = zeros(1,5);
>> b = @(x) a+x;
>> b(2)
ans =
     2     2     2     2     2

>> a(:) = ones(1,5);
>> a
a =
     1     1     1     1     1

>> b(2)
ans =
     2     2     2     2     2



A possible solution is to implement an 'anonymous' function that takes MATLAB's '@' expression and returns a regular Python function whose local variables are copies of the parameter values.
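One way such a helper might look is sketched below. The signature of anonymous, with captured variables passed as keyword arguments, is an assumption made for illustration and not OMPC's API. Plain Python lists stand in for numpy arrays so that the example needs only the standard library.

```python
import copy


def anonymous(func, **captured):
    """Return a function that sees copies of the captured variables.

    Hypothetical sketch: the values are deep-copied once, when the handle
    is created, which mimics MATLAB's '@' capturing values by copy.
    """
    snapshot = copy.deepcopy(captured)

    def handle(*args):
        return func(*args, **snapshot)

    return handle


a = [0.0] * 5
b = anonymous(lambda x, a: [v + x for v in a], a=a)
first = b(2)

a[:] = [1.0] * 5   # modify 'a' in place, as in the comparison above
second = b(2)      # still computed from the snapshot taken earlier
```

Unlike a bare lambda, the handle returned here is unaffected by the later in-place change to a, so both calls give the same result.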

Strings


All strings have to be compiled using the raw string prefix

'\1' ---> r'\1'

This matters especially in the case above, which is common in regular expressions: without the prefix Python interprets '\1' as '\x01'.
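The difference can be checked directly in Python. The variable names below are only illustrative.

```python
import re

plain = '\1'   # escape sequence: a single character with code 1
raw = r'\1'    # raw string: a backslash followed by the character '1'

# In a regular expression replacement, the raw form is a backreference
# to the first group, which is what a MATLAB regexprep pattern intends.
swapped = re.sub(r'(\w+) (\w+)', r'\2 \1', 'hello world')
```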


Comments