Monday, September 24, 2007

Terracotta and Jython - The Naive Approach

Reading Jonas Bonér's blog entry on clustering JRuby with Terracotta had me wondering how hard it would be to cluster Jython with Terracotta. The naive implementation proved educational, but was unfortunately a dead end because of machine-specific requirements of the interpreter. Looking at failures is often a good way to see where a tool is useful and where it breaks down.

In the naive implementation I started by trying to cluster the Jython interpreter directly. Specifically, I tried to cluster the fields PythonInterpreter.module and PythonInterpreter.locals. These two fields hold the main dictionaries that store Jython objects for the interpreter. Unfortunately, as with any non-trivial system not designed with multi-process execution in mind, a few problems arose. They are worth reviewing, as they may show up in other applications designed without Terracotta in mind.

The first problem was a java.lang.ref.ReferenceQueue used in org.python.core.PyType. Terracotta cannot cluster weak references, as they are direct memory pointers on the local heap. This queue is used to clean up the hierarchical references to all of the subclasses used in Python. Simply marking the field as transient in tc-config.xml was not enough to get the classes to load. Since PyType.subclasses_refq is not re-created automatically when the object is loaded on another node, either BeanShell scripting or a custom initTransients method was required to prevent NullPointerExceptions from appearing when any object was created. A trivial initTransients method seemed to work.

  // Re-create the transient queue once the object has been loaded.
  public void initTransients() {
    subclasses_refq = new java.lang.ref.ReferenceQueue();
  }

First, in the <instrumented-classes> block of tc-config.xml, the following is necessary:

<include>
  <class-expression>org.python.core.PyType</class-expression>
  <on-load>
    <method>initTransients</method>
  </on-load>
</include>

Next, the field itself needs to be marked as transient in the appropriate location with:

<transient-fields>
  <field-name>org.python.core.PyType.subclasses_refq</field-name>
</transient-fields>
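
As a rough analogy for why the on-load hook is needed, the sketch below uses plain Java serialization rather than Terracotta itself: a transient field does not travel with the object, so a copy arrives with that field set to null until something re-creates it. TypeLike, subclassesRefq and roundTrip are hypothetical stand-ins, not Jython or Terracotta code, and Terracotta's own loading mechanism differs in its details.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.ref.ReferenceQueue;

public class TransientInitDemo {

    // Hypothetical stand-in for PyType; not the real Jython class.
    static class TypeLike implements Serializable {
        private static final long serialVersionUID = 1L;

        String name;

        // Transient: never copied along with the object, so any copy that
        // was not built by the constructor sees null here.
        transient ReferenceQueue<Object> subclassesRefq = new ReferenceQueue<>();

        TypeLike(String name) {
            this.name = name;
        }

        // Plays the role of the <on-load> method: re-create local-only state.
        void initTransients() {
            subclassesRefq = new ReferenceQueue<>();
        }
    }

    // Assumption: serialization stands in for "the object shows up in another VM".
    static TypeLike roundTrip(TypeLike original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (TypeLike) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TypeLike copy = roundTrip(new TypeLike("int"));
        System.out.println("queue is null before hook: " + (copy.subclassesRefq == null));
        copy.initTransients();
        System.out.println("queue is null after hook:  " + (copy.subclassesRefq == null));
    }
}
```

Without the call to initTransients, any code that touches the queue on the copy would fail with a NullPointerException, which is the failure mode described above.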

These are the two difficult steps needed to get two VMs running Jython and to see the roots. Unfortunately, executing the built-in function dir() in any interpreter other than the original one gives yet another NullPointerException. This exception comes from org.python.core.PyStringMap and appears to be due to interned strings. It can be worked around by modifying PyStringMap.java and adding one line, marked here with // temp fix:

  public synchronized PyObject __finditem__(String key) {
    if (keys == null) resize(1); // temp fix.

Unfortunately, while this gets rid of all of the NullPointerExceptions and the roots are visible in the console, it does not appear to be a complete fix: dir() in the remaining consoles now shows an empty namespace. This appears to be because Jython uses StringMaps for the __dict__ objects and, further, the Strings in the StringMaps are interned. Interned strings look to be unclusterable in Terracotta at this point. Hopefully a workable solution to clustering interned strings will be found at some point, as well as a way to cluster WeakReferences. Until those two things happen, however, clustering the Jython interpreter looks to require a heavy rewrite of Jython, which may adversely affect the operation of Jython on a single node and would immediately make the solution unviable. Stay tuned, however, for how to access clustered Java objects from within separate Jython VMs.
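
To illustrate why interned keys are fragile across VMs, here is a minimal sketch. It assumes the namespace map compares keys by identity (==) rather than equals(), which is what storing interned strings suggests, and that a key arriving from another node is an equal but distinct String object. Both are assumptions for illustration. InternedKeyDemo, newNamespace and freshCopy are hypothetical names, and java.util.IdentityHashMap stands in for PyStringMap.

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class InternedKeyDemo {

    // Hypothetical identity-keyed namespace; a stand-in, not Jython's PyStringMap.
    static Map<String, Object> newNamespace() {
        Map<String, Object> namespace = new IdentityHashMap<>();
        namespace.put("spam".intern(), 42); // key is this VM's canonical "spam"
        return namespace;
    }

    // Simulates a key that arrives as a fresh copy instead of the
    // locally interned instance (assumed behaviour, for illustration).
    static String freshCopy(String s) {
        return new String(s);
    }

    public static void main(String[] args) {
        Map<String, Object> namespace = newNamespace();
        String copied = freshCopy("spam");

        // Literals are interned, so this is the same object used as the key.
        System.out.println("literal key:     " + namespace.get("spam"));
        // Equal contents, different object: an identity lookup misses.
        System.out.println("copied key:      " + namespace.get(copied));
        // Re-interning maps the copy back to the canonical local instance.
        System.out.println("re-interned key: " + namespace.get(copied.intern()));
    }
}
```

Under these assumptions a lookup with the copied key finds nothing even though the entry exists, which would be consistent with dir() showing an empty namespace on the other nodes.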

Files modified for this exercise:
tc-config.xml
org.python.core.PyType.java
org.python.core.PyStringMap.java