Monday, May 25, 2009

Software Development as Tradecraft

While browsing through the New York Times today, an article titled "The Case for Working With Your Hands" caught my interest. Since I have not updated this blog in a while, I thought this would be a good way to start back in. The article discusses the difference between working in the trades and what is classically termed knowledge work. Many people, myself included, compare developing software to building houses: without a good foundation, the whole edifice can come crumbling down around your head. I wonder, however, how many people developing software have experience with the more mundane parts of the trades, where your decisions really can have direct physical consequences. To this day I remember one of the key lessons taught to me in a tiny little crawlspace under a house, helping my dad test drain lines.

A little background for those with no plumbing experience. When you are plumbing a new house you are required to fill test your drains. This means plugging all of the drain lines and filling the entire system until water runs out over the roof. You then have to make sure all of this water is held for some time without leaking. As you can imagine, in a multi-story house the sheer weight of all of this water gives you immediate feedback on your craftsmanship.

This is not the only lesson that you learn; one of the most interesting relates to subtle timing issues. To drain the system you have to unscrew the test plug to remove the force compressing 2 interlocking rubber balls, while quickly yanking one of the balls out of the main drain line so that it covers the other 4" hole that you inserted the test plug through (see inset picture for a view of the test plug). If this is done correctly, very little water leaks out and the system drains. If you do it wrong, however, the entire head of water ends up under the house with you, making for a cold mud bath.

Over the years I have struggled with how to teach software developers and other high tech knowledge workers to see through to the consequences of their actions. The conclusion I usually come to is that they should be sent to spend a summer working with my dad.

Monday, October 8, 2007

Readline setup in Jython on OSX

In the last blog post I neglected to state that I enabled readline support in the startup script for Jython. To remove it, comment out:

-Dpython.console=org.python.util.ReadlineConsole \
-Dpython.console.readlinelib=GnuReadline \

in tc-jython.sh.

If you want to enable readline support on OSX, however, download this file, unzip it, and move the 2 files into the /Library/Java/Extensions folder. Do not move the folder there, just the 2 files that it contains. I did not compile these personally; unfortunately I lost the reference to the site that led me to this epiphany.

Jython Clustering Interface

In our last installment I attempted to cluster the entire Jython interpreter. This failed due to the lack of support of WeakReferences, Interned Strings and likely other JVM specific items. In this installment I create a way to interact with the Terracotta DSO cluster from an API within Jython.

Much of this is a simple port, with Jython style semantics, of Jonas Bonér's blog entry, JRuby with Terracotta. One thing to notice about Jonas' implementation of clustering with JRuby is that he uses anonymous blocks, a feature that Ruby has and that Python has yet to embrace, so the API needed for this clustering will be a little different. We will be a little more verbose in our translation of:

lock target
  transaction begin
    modify target
  transaction commit
unlock target

Rather than:

guard @messages do
  @messages.add msg
end

We will have to do something like the following:

guard(messages)
try:
  messages.add(msg)
finally:
  endGuard(messages)

This is simply because the implementation of guard hides an assumption: it accepts an anonymous block and wraps it with the equivalent of a try/finally block in Python. In the attached file chatter.py, the only lines dealing with Terracotta are the ones adding transaction boundaries around the addition to an ArrayList:

-->  DSO.guard(messages)
     messages.add("[#"+time.asctime()+" "+name+"] "+text)
-->  DSO.endGuard(messages)

And the following line to set up the shared root:

  messages=DSO.lookupOrCreateRoot("myRoot", messages)
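The guard/endGuard pairing above can also be hidden behind a small helper, which gets closer to the Ruby block form. The following is only a sketch under stated assumptions: FakeDSO is a hypothetical stand-in that records calls instead of talking to Terracotta, a plain Python list replaces the ArrayList, and the `with` variant assumes an interpreter that supports that statement.

```python
import contextlib


class FakeDSO(object):
    """Hypothetical stand-in for the post's DSO module; it only records calls."""

    def __init__(self):
        self.calls = []

    def guard(self, target):
        self.calls.append("guard")

    def endGuard(self, target):
        self.calls.append("endGuard")


DSO = FakeDSO()


def guarded_call(target, body):
    """Run body(target) between guard and endGuard, like the Ruby block form."""
    DSO.guard(target)
    try:
        return body(target)
    finally:
        # Always release, even if body raised.
        DSO.endGuard(target)


@contextlib.contextmanager
def guarded(target):
    """The same pairing as a context manager, for use with `with`."""
    DSO.guard(target)
    try:
        yield target
    finally:
        DSO.endGuard(target)


messages = []  # plain list standing in for the shared ArrayList
guarded_call(messages, lambda m: m.append("hello"))
with guarded(messages) as m:
    m.append("world")
```

Either form keeps the lock release in one place, so a caller cannot forget the endGuard call.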

It also appears that since Jonas wrote his article on how to cluster JRuby, the startup requirements have changed slightly. The following command is required to fire up a Jython interpreter that can talk to a Terracotta cluster:

java -Xbootclasspath/p:$BOOTJAR \
  -Dtc.config=$JYTHON_INSTALL_DIR/tc-config.xml \
  -Dtc.install-root=$TC_INSTALL_DIR \
  -Dpython.console=org.python.util.ReadlineConsole \
  -Dpython.console.readlinelib=GnuReadline \
  -cp $JYTHON_INSTALL_DIR/jython.jar:\
$TC_INSTALL_DIR/lib/tc.jar org.python.util.jython chatter.py


Where $BOOTJAR is the location of your Terracotta boot.jar file.

Files of interest:
tc-jython.sh
DSO.py
chatter.py

Monday, September 24, 2007

Terracotta and Jython - The Naive Approach

Reading Jonas Bonér's blog entry on clustering JRuby with Terracotta had me wondering how hard clustering Jython with Terracotta would be. The naive implementation proved to be educational, but unfortunately a dead end due to machine specific requirements of the interpreter. Looking at failures is often a way to see where a tool will be useful and where it breaks down.

In the naive implementation I started by trying to cluster the Jython interpreter directly. Specifically, I tried to cluster the variables PythonInterpreter.module and PythonInterpreter.locals. These two variables contain the main dictionaries used to store Jython objects for the interpreter. Unfortunately, as with any non-trivial system not designed with multi-process execution in mind, a few problems arose. These are somewhat educational to review as they may present themselves in applications designed without Terracotta in mind.

The first problem was a java.lang.ref.ReferenceQueue that was used in org.python.core.PyType. Terracotta cannot cluster weak references as they are direct memory pointers on the local heap. This queue is used to clean up the hierarchical references to all of the subclasses used in Python. Simply marking this field as transient in the tc-config.xml was not enough to get the classes to load. Since PyType.subclasses_refq does not have a 0 argument constructor, either BeanShell scripting or a custom initTransients method was required to prevent NullPointerExceptions from appearing when any object was created. A trivial initTransients method seemed to work:

  public void initTransients() {
    subclasses_refq =
        new java.lang.ref.ReferenceQueue();
  }

First, in the <instrumented-classes> block, the following is necessary:

<include>
  <class-expression>org.python.core.PyType</class-expression>
  <on-load>
    <method>initTransients</method>
  </on-load>
</include>

Second, the field needs to be marked as transient in the appropriate location with:

<transient-fields>
  <field-name>org.python.core.PyType.subclasses_refq</field-name>
</transient-fields>
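As a loose analogy only, the "transient field plus on-load hook" idea can be sketched in plain Python with the __getstate__ and __setstate__ hooks that copy and pickle use. None of this is Terracotta API: TypeRegistry and its field names are hypothetical, and a WeakSet merely stands in for the ReferenceQueue.

```python
import copy
import weakref


class TypeRegistry(object):
    """Hypothetical toy class with one field that must stay local."""

    def __init__(self, name):
        self.name = name
        self.init_transients()

    def init_transients(self):
        # Weak references only make sense inside one process,
        # so this field is rebuilt locally instead of being copied.
        self.subclass_refs = weakref.WeakSet()

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["subclass_refs"]  # treat the field as transient
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.init_transients()  # plays the role of the on-load method


original = TypeRegistry("PyType")
clone = copy.deepcopy(original)  # goes through __getstate__/__setstate__
```

Without the call in __setstate__, the clone would have no subclass_refs attribute at all, which is roughly the situation the NullPointerExceptions described above point to.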

These two configuration changes are the difficult points in getting 2 VMs running Jython and being able to see the roots. Unfortunately, executing the built-in function dir() in any interpreter except the original one gives yet another NullPointerException. This one comes from org.python.core.PyStringMap and looks to be due to interned strings. It can be worked around by modifying PyStringMap.java and adding the one line marked here with // temp fix:

  public synchronized PyObject __finditem__(String key) {
    if (keys == null) resize(1); // temp fix.

Unfortunately, while this gets rid of all of the null pointer exceptions and you can see the roots in the console, it doesn't appear to be a complete fix, as a dir() in the remaining consoles now produces an empty namespace. This appears to be because Jython uses StringMaps for the __dict__ objects and, further, the Strings in the StringMaps are interned strings. Interned strings look to be unclusterable in Terracotta at this point. Hopefully a workable solution to clustering interned strings will be found at some point, as well as a way to cluster WeakReferences. Until those 2 things happen, however, clustering the Jython interpreter looks to require quite a heavy rewrite of Jython, which may adversely affect the operation of Jython on a single node and would immediately make the solution unviable. Stay tuned, however, for how to access clustered Java objects from within separate Jython VMs.
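One plausible way interned keys could lead to an empty namespace is a lookup that compares keys by identity rather than by value. The toy below illustrates that idea only. It is not PyStringMap's code, IdentityMap is a hypothetical name, and it assumes CPython 3, where sys.intern is available and a string built at run time is a separate object from an equal interned one.

```python
import sys


class IdentityMap(object):
    """Hypothetical toy map that finds keys by identity (`is`), not equality."""

    def __init__(self):
        self._items = []

    def put(self, key, value):
        self._items.append((sys.intern(key), value))

    def find(self, key):
        for stored_key, value in self._items:
            if stored_key is key:  # identity check, as an interned table might do
                return value
        return None


names = IdentityMap()
names.put("messages", 1)

# Built at run time: equal to the stored key but a different object,
# much like a string that arrived from another VM.
remote_key = "".join(["mess", "ages"])

local_hit = names.find(sys.intern("messages"))
remote_miss = names.find(remote_key)
remote_hit = names.find(sys.intern(remote_key))
```

In this sketch the un-interned key misses even though it is equal to the stored one, and interning it again restores the hit. A string copied between VMs without being re-interned would behave like remote_key here, if the assumption about identity-based lookup holds.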

Files modified for this exercise:
tc-config.xml
org.python.core.PyType.java
org.python.core.PyStringMap.java