Parsing Toratope Notation

By Paul English, Tue 01 August 2017, in category Parsing

mathematics, parsing, plotting, python

Toratope notation is a rather simple notational scheme for representing functions of arbitrary quadratic variables and root sum square operations. It's not any kind of particular mathematical standard, but I came across it on a small wiki, and thought it was a curious notation. At the minimum it provides a simple one-off parsing problem.

The Notation

From what I can gather from the short wiki page on the topic there are effectively two rules. In the notation I represents a new variable and parenthesese around any variables or other expression represent the root sum square of those minus some radius term, $\sqrt{x_1^2 + x_2^2 + \cdots + x_k^2}$.

With these notation rules we can describe quite a few equations of commonly known objects, including:

Since this notation is simple enough, it presents a good first example of the idea of parsing and evaluation. A parser is any kind of function that can take some arbitrary string of characters and convert the characters into a tree based representation of what the characters mean. Programming languages are all parsed. Natural language is also parsed by computers, but that's different enough not to be categorized as traditional parsing.

Additionally it's important to note, that parsing is independent from evaluation or making used of a parse tree. Parsing only cares about grammar and valid syntax, it doesn't care at all about semantics of a language.

Parsing

Sticking to the problem at hand, we'll describe our notation using a library built for generating parsers. In python, we'll use the parsec library.

The first thing to describe is just the character I which we define as a variable. Then we can assume to start that a toratope expression is any chain of variables.

import parsec

variable = parsec.string('I')
toratope = parsec.many(variable)

Unfortunately this is incomplete, and we can't express much with just single variables. (Of course our parser doesn't know anything it works)

Next we want to represent the notation for the root sum square operation. To do this we need to define the other two literals that are valid, the left and right parentheses. With those three root symbols, we can say that a root sum square operation is anything that starts with a left parens contains one or more toratope expressions, and then has a right parens.

lparen = parsec.string('(')
rparen = parsec.string(')')

@parsec.generate('root sum square')
def root_sum_square_expr():
    yield lparen
    variables = yield parsec.many(toratope_expr)
    yield rparen
    return ['rss', variables]

toratope_expr = variable | root_sum_square_expr

toratope = parsec.many(toratope_expr)

It's important to note that this is recursively defined. We assign a toratope_expr in the second to last statement, but also make use of this in the definition of the root_sum_square_expr. This is the key that makes this parser work for any deeply nested toratope string.

With that, what do we get now when we parse this notation?

toratope.parse('I')
# > ['I']

toratope.parse('II')
# > ['I', 'I']

toratope.parse('(II)')
# > ['rss', ['I', 'I']]

We can see that we're able to parse an expression with parentheses, and get an object response that represents the structure of that expression. We haven't added any examples of it, but at this point the parser is fully functional, and can parse any nested toratope string.

Evaluation

A parsed toratope is a nested list of lists, which is one way we represent a tree. If we now build a routine for traversing this tree, we can make sense of the parsed expression. Jumping into the deep end let's create a function that traverses this particular tree structure and provides us with a sympy equation, which we can make use of.

import sympy
import re

def next_symbol(symbols):
    return chr(len([s for s in symbols if not re.search('r[0-9]+', s)]) + 97)

def symbolic_toratope(tree, symbols=None, depth=0):
    if isinstance(tree, str):
        tree = toratope.parse(tree)
    if symbols == None:
        symbols = []

    if len(tree) == 0:
        return None, symbols

    summand = tree[0]
    rest = tree[1:]

    if summand == 'I':
        sym = next_symbol(symbols)
        symbols.append(sym)
        expr = sympy.symbols(sym)**2
    elif isinstance(summand, list) and summand[0] == 'rss':
        sub_expr, symbols = symbolic_toratope(summand[1], symbols, depth=depth+1)

        radius = 'r%s' % (depth+1)
        symbols.append(radius)

        expr = sympy.sqrt(sub_expr) - sympy.symbols(radius)
    else:
        raise Exception('Invalid tree')

    next_expr, symbols = symbolic_toratope(rest, symbols, depth=depth+1)
    if next_expr != None:
        expr += next_expr

    return expr, symbols

Here we've built a recursive function (recursion and trees tend to work well together) for traversing a tree and returning sympy symbols and root sum square expressions. This is a complex function, but in essence it's taking one symbol at a time, and building a sympy equation for everything it sees as it goes along. Whenever we see an rss we know we need to recurse, and get a nested expression from the next element in our tree.

Now, thanks to the work sympy is doing for us, we can see a formula result when passing in toratope string.

expr, symbols = symbolic_toratope('(II)')
expr
# > -r1 + sqrt(a**2 + b**2)

We can build on this and evaluate one of these sympy functions. Note that our notation is in implicit form so we pass in all our variables and the result is some arbitrary number. The surface of our shape, in this case a Torus, exists everywhere that this function returns zero. Depending on if our result is positive or negative tells us if we're inside our outside of this shape given the coordinates and parameters we've supplied.

from sympy.utilities.lambdify import lambdify

def implicit_toratope(toratope_string):
    expr, symbols = symbolic_toratope(toratope_string)
    return lambdify(symbols, expr)

sphere = implicit_toratope('(II)')
sphere(2, 2, 1)
# > 1.8284271247461903

Plotting

This isn't strictly interesting alone so lets try plotting these shapes. Using a tip from StackOverflow we can find contour lines for each independent dimension. Drawing these together gives us a method of plotting the shape of the implicit function.

from mpl_toolkits.mplot3d import axes3d
import matplotlib.pyplot as plt
import numpy as np

def plot_implicit(fn, bbox=(-2.5,2.5), contour=100, slices=15):
    ''' create a plot of an implicit function
    fn  ...implicit function (plot where fn==0)
    bbox ..the x,y,and z limits of plotted interval'''

    xmin, xmax, ymin, ymax, zmin, zmax = bbox*3
    fig = plt.figure()
    ax = fig.add_subplot(111, projection='3d')

    A = np.linspace(xmin, xmax, contour)
    B = np.linspace(xmin, xmax, slices)

    A1,A2 = np.meshgrid(A,A) # grid on which the contour is plotted

    for z in B: # plot contours in the XY plane
        X,Y = A1,A2
        Z = fn(X,Y,z)
        cset = ax.contour(X, Y, Z+z, [z], zdir='z')
        # [z] defines the only level to plot for this contour for this value of z

    for y in B: # plot contours in the XZ plane
        X,Z = A1,A2
        Y = fn(X,y,Z)
        cset = ax.contour(X, Y+y, Z, [y], zdir='y')

    for x in B: # plot contours in the YZ plane
        Y,Z = A1,A2
        X = fn(x,Y,Z)
        cset = ax.contour(X+x, Y, Z, [x], zdir='x')

    # must set plot limits because the contour will likely extend
    # way beyond the displayed level.  Otherwise matplotlib extends the plot limits
    # to encompass all values in the contour.
    ax.set_zlim3d(zmin,zmax)
    ax.set_xlim3d(xmin,xmax)
    ax.set_ylim3d(ymin,ymax)
torus = implicit_toratope('((II)I)')
r1,r2 = 0.5, 2
impl_torus = lambda a,b,c: torus(a,b,r2,c,r1)

plot_implicit(impl_torus, slices=20, contour=200)

Implicit Torus Plot

f = implicit_toratope('((I(I))I)')
r1,r2,r4 = 0.3, 0.8, 1
impl_f = lambda a,b,c: f(a,b,r4,r2,c,r1)

plot_implicit(impl_f, slices=20, contour=200)

Implicit Toratope Plot

Conclusion

This is one of the easiest parsing examples I've come across, but it does offer some useful practice with recursion and basic understanding of what a parser combinator is doing. Being able to take an arbitrary notation, or any language, and turn it into something a computer can understand is a key idea in computer science, and useful in many areas. I find it rare to need to build a parser in most businesses, but occasionally it's a necessary tool.