GPars performance test

August 29th, 2011 6 comments

We just added a REST interface for replicating data between servers. Parts of the service require us to GET a collection of URLs all at once and to POST an object to a collection of remote servers all at once. I thought this would be a great time to try out GPars. After all, you can’t get much easier pooling/multi-threaded support than:

    GParsPool.withPool {
        urlList.eachParallel {
            ...get/post with Jersey client...
        }
    }

Some of my co-workers expressed concern that this would involve creating and destroying the pool data structures (specifically, Java threads) for every url or server collection we submitted. They thought this would take too much overhead. So I decided to put together a few tests to see what GPars could get us using its simplest form of concurrency. First, an interesting tidbit from the reference guide:

While the GParsPool class relies on the jsr-166y Fork/Join framework and so offers greater functionality and better performance, the GParsExecutorsPool uses good old Java executors and so is easier to set up in a managed or restricted environment. It needs to be stated, however, that GParsPool performs typically much better than GParsExecutorsPool does. (Section 3 intro).

and, from Groovy in Action v2:

GParsPool does not create threads. Instead, it takes them from a fork/join thread pool of the jsr166y library, which is a candidate for inclusion in future Java versions. GPars uses this library extensively, especially its support for parallel arrays that are the basis for all parallel collection processing in GPars. (Section 17.2)

If these statements were correct, then hopefully we didn’t have to worry about maintaining an existing pool and setting up a countdown latch of some form.

Tests

I threw together some quick-and-dirty tests:

  • A series of mathematical operations (i.e. pure cpu)
  • Open and read in the text of 120 files, each about 1-4Kb
  • Get the contents of a small web page (9kb) hosted on a machine on the local network

Obviously, these were not meant to be a definitive test of all of GPars’ capabilities. I just wanted to see if we could use the simplest form of GPars notation or if we had to do something more complicated.

I ran each test in a loop various numbers of times (100 up to 10000), just to see if there was a significant difference over time. The results I list below are for the test that ran the loop 5000 times. The core bit of code I timed was something like this:

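    // StopWatch and DebugUtils are our own small timing/logging helpers
    int numLoops = 5000
    int ms = 0
    numLoops.times {
        StopWatch timer = new StopWatch().go()
        // swap in GParsPool.withPool here to test the fork/join pool
        GParsExecutorsPool.withPool {
            List result = data.collectParallel {
                //(1..100).sum {i -> i^it}
                //or
                //it.text.size()
            }
        }
        ms += timer.stop
    }
    def message = DebugUtils.logTimePerItem("GParsExecutorsPool", numLoops, ms)
    println message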

where “it” was a File or URL (or a number for test #1).

I ran the test using regular sequential code (i.e. commenting out the withPool block and changing collectParallel to the normal collect), and then using GParsPool.withPool and GParsExecutorsPool.withPool. I also tried using the GPars ParallelEnhancer class and the makeConcurrent method, both of which let me just use the normal collect call rather than having to write collectParallel. For some reason, both of these approaches slowed down the collection processing by a noticeable amount. I did not dive into why that happens, but I suspect it has to do with the additional overhead of the custom MetaClass handling.

Results

These are the results on my Dell Precision M6400 running 32-bit Fedora 14. The JVM had 1.5G of RAM.

Test: Mathematical Operation

Normal: 5000 loops in 14594 ms (343/s)
GParsPool: 5000 loops in 12480 ms (401/s)
GParsExecutorsPool: 5000 loops in 15685 ms (319/s)

I reran this test with the timer inside the withPool call to see what the overhead of creating the pool was, i.e.:

        GParsExecutorsPool.withPool {
            MemStopWatch timer = new MemStopWatch().go()
            List result = data.collectParallel {
                (1..100).sum {i -> i^it}
            }
            ms += timer.stop
        }

with these results:
GParsPool: 5000 loops in 11253 ms (444/s)
GParsExecutorsPool: 5000 loops in 13308 ms (376/s)

So setting up the pool each time definitely has some overhead cost, but even doing that, the GParsPool is still faster than normal, single-threaded sequential execution, even on my little ol’ dual core machine.
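For readers who want to reproduce the "new pool per batch versus shared pool" comparison without GPars, here is a minimal sketch in plain java.util.concurrent. It is not GPars code and says nothing about how GPars manages its pools internally; the class name, the pool size of 2, and the loop counts are all assumptions chosen for illustration. The arithmetic mirrors the test closure above (note that `^` is XOR in both Groovy and Java).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class PoolPerCall {

    /** Applies fn to every item on the given pool and returns results in input order. */
    static <T, R> List<R> collectParallel(ExecutorService pool, List<T> items, Function<T, R> fn) {
        List<Future<R>> futures = new ArrayList<>();
        for (T item : items) {
            Callable<R> task = () -> fn.apply(item);
            futures.add(pool.submit(task));
        }
        List<R> results = new ArrayList<>();
        try {
            for (Future<R> future : futures) {
                results.add(future.get());
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
        return results;
    }

    /** Strategy A: build a pool for this one batch and shut it down afterwards. */
    static <T, R> List<R> withFreshPool(List<T> items, Function<T, R> fn) {
        ExecutorService pool = Executors.newFixedThreadPool(2); // assumed size: dual core
        try {
            return collectParallel(pool, items, fn);
        } finally {
            pool.shutdown();
        }
    }

    /** Sum of (i XOR n) for i in 1..100, the same expression as the test closure. */
    static int xorSum(int n) {
        int sum = 0;
        for (int i = 1; i <= 100; i++) {
            sum += i ^ n;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            data.add(i);
        }

        long start = System.nanoTime();
        for (int loop = 0; loop < 200; loop++) {
            withFreshPool(data, PoolPerCall::xorSum);
        }
        long fresh = System.nanoTime() - start;

        // Strategy B: one pool reused for every batch.
        ExecutorService shared = Executors.newFixedThreadPool(2);
        start = System.nanoTime();
        for (int loop = 0; loop < 200; loop++) {
            collectParallel(shared, data, PoolPerCall::xorSum);
        }
        long reused = System.nanoTime() - start;
        shared.shutdown();

        System.out.println("fresh pool per batch: " + fresh / 1_000_000 + " ms");
        System.out.println("shared pool:          " + reused / 1_000_000 + " ms");
    }
}
```

The absolute numbers will depend heavily on the machine and JVM warm-up, so treat the output as a rough indication only.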

Test: Open and Read Files

Normal: 5000 loops in 31726 ms (157/s)
GParsPool: 5000 loops in 24050 ms (207/s)
GParsExecutorsPool: 5000 loops in 25351 ms (197/s)

Very similar scale of results as with the straight mathematical operation.

Test: Get small web page over local network

All tests resulted in the same numbers – network latency was the deciding factor. Sorry for not having the exact metrics on this one.

Conclusion

So what does all this mean? I think it means that just using the simple GParsPool.withPool structure to iterate over a collection is perfectly fine for our needs. We could optimize a bit with different structures and a pre-existing pool, but it honestly won’t make a bit of difference in real performance given that network latency is the deciding factor for us. Your mileage may vary, especially if you are running an open server that has higher load requirements.

Categories: groovy

Xerces and xml-api Dependency Hell

June 29th, 2011 3 comments

One of the projects I work on includes a whole mish-mash of XML-related libraries including xerces, jdom, dom4j, jaxen, xalan. Some are direct dependencies and some are pulled in by other third-party dependencies like hibernate, tika, gate, etc. Many of these libraries have transitive dependencies on xerces and/or on some form of xml-api artifact, though the exact artifact name, and even the group name seem to vary randomly. What was xerces:xmlParserApis vs xml-apis:xml-apis vs xml-apis:xmlParserAPIs? Why were there versions of xml-api artifacts in the 2.0.x range, but they seemed older than version 1.0.b2 which so many libs depend on?

I recently tried to upgrade the included version of xerces from 2.6.2 to 2.9.1. This is the latest official release posted to Maven Central, though it is nearly 4 years old. (The latest official xerces release, 2.11.0, and the previous one, 2.10.0, are not in the primary maven repos. See XERCESJ-1454 if interested in more on why.) The upgrade caused some rather strange class loader errors that forced me to finally dig into this. What follows are my rough notes on the various xml-api related artifacts. They go in chronological order.

| Group ID | Artifact ID | Version | Release Date | Notes |
|---|---|---|---|---|
| xerces, xml-apis | xmlParserApis | 2.0.0 | 01/30/2002 | |
| xerces, xml-apis | xmlParserApis | 2.0.2 | 06/21/2002 | |
| xerces | xmlParserApis | 2.2.1 | 11/11/2002 | includes all classes in 2.0.2, plus some security support stuff and other mods |
| xml-apis | xml-apis | 1.0.b2, 2.0.0, 2.0.2 | 12/01/2002 | includes all but some security support and other util class in xerces:xmlParserApis:2.2.1, plus some additions |
| xerces | xmlParserApis | 2.6.0, 2.6.0, 2.6.2 | 11/18/2003 | all but 1 class from xml-apis:1.0.b2, plus the security support classes that were in xerces:xmlParserApis:2.2.1; 2.6.2 was the last of this artifact |
| xml-apis | xml-apis | 1.2.01 | | no jar, just a relocation tag to xerces:xmlParserApis:2.6.2; looks like this was added on 02/03/2010 (judging by date in http://repo1.maven.org/maven2/xml-apis/xml-apis/), about 3 years after other xml-apis:xml-apis entries like 1.3.04 |
| xml-apis | xml-apis | 1.3.02 | 07/22/2005 | includes all but 1 class from v2.6.2 (dropped older security support stuff), plus many additions; included with xerces 2.7.1 |
| xml-apis | xml-apis | 1.3.03 | 02/25/2006 | released with xerces 2.8.0; xercesImpl:2.8.0 was the first one where they included dependency info in the pom |
| xml-apis | xml-apis | 1.3.04 | 11/19/2006 | xerces:xercesImpl:2.9.1 (09/14/2007) depends on this; this is the last of this artifact in maven repos |

One interesting note is that xml-apis:xml-apis:2.0.0 and 2.0.2 are newer than their equivalent versions of xerces:xmlParserApis and xml-apis:xmlParserAPIs.

While tedious, working out these relationships helped me track down the conflicting dependencies.  I added these entries to my root project’s dependencyManagement section:

<dependency>
    <groupId>xml-apis</groupId>
    <artifactId>xml-apis</artifactId>
    <version>1.3.04</version>
</dependency>
<dependency>
    <groupId>jaxen</groupId>
    <artifactId>jaxen</artifactId>
    <version>1.1.1</version>
    <exclusions>
        <exclusion>
            <groupId>xerces</groupId>
            <artifactId>xmlParserAPIs</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>jmimemagic</groupId>
    <artifactId>jmimemagic</artifactId>
    <version>0.1.2</version>
    <exclusions>
        <exclusion>
            <groupId>xml-apis</groupId>
            <artifactId>xmlParserAPIs</artifactId>
        </exclusion>
    </exclusions>
</dependency>

and all was good in the world again.
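When chasing this kind of conflict, it can help to ask the JVM which jar a suspect class was actually loaded from. The following is a small sketch using only standard JDK calls; the class name WhichJar and the two default class names are just examples, and classes supplied by the JDK itself report no code source.

```java
import java.security.CodeSource;

public class WhichJar {

    /** Returns where a class was loaded from, or a marker for classes supplied by the JDK itself. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return source == null ? "JDK (no code source)" : String.valueOf(source.getLocation());
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Example class names; pass your own on the command line instead.
        String[] names = args.length > 0 ? args
                : new String[] {"org.w3c.dom.Node", "javax.xml.parsers.DocumentBuilderFactory"};
        for (String name : names) {
            System.out.println(name + " <- " + locationOf(Class.forName(name)));
        }
    }
}
```

Run it with the same classpath as the failing application; if an API class shows up under an unexpected jar, that jar is a good candidate for an exclusion.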

Categories: maven, XML

Representing an XML qualified name as a string

May 31st, 2011 1 comment

I am working on a project where we need to store qualified XML names (QNames i.e. namespace and local name) as strings outside of an XML document. This includes QNames from any third party namespace that a user of our package wants to include. So I set out to find the standard way of doing this in a way that would give other apps the best chance of being able to properly parse the string back into a QName, especially for QNames which already had a somewhat widely used string representation. We are storing meta-data about “things” (documents, sensor recordings, you name it), so I paid particular attention to popular schemas in the semantic web space. Should we use ns:name, ns/name, ns#name, or something else? After spending way too much time on this, here is what I found:

  • There is no official standard. A qualified name is officially defined as two strings – the namespace and the local name. Oh, great.
  • One of the first papers on this by James Clark says {namespace}local is proper. This is what javax.xml.namespace.QName.toString produces, and the QName.valueOf method will parse that format. This form is also what the groovy QName class uses, but, interestingly, the equals for that class will accept a string that uses a colon delimiter.
  • http://docstore.mik.ua/orelly/xml/xmlnut/ch04_02.htm talks of both {namespace}local and namespace#local
  • http://www.rpbourret.com/xml/NamespacesFAQ.htm#names_15 has great detail on namespaces overall. It talks of {namespace}local and another form, namespace^local, which is what SAX filter uses, according to the page. I found no other examples or mention of this “caret” format.
  • javax.xml.soap.Name uses namespace:local. Apache axis does the same thing, which is not surprising considering I believe one came from the other.
  • ECMAScript for XML (and, thus, Adobe ActionScript) uses 2 colons – namespace::local. This is partly because it uses the two colons as an operator of sorts, and needed to separate it from other uses of a colon in the ECMAScript syntax.
  • Dublin Core (DC) explicitly defines the URIs of the terms in its schema. It uses the path divider ‘/’ as the delimiter between namespace and local name. Of note, if you try to put one of those URIs into a web browser as a URL, it will redirect to a page which uses ‘#’ to note the fragment in an RDF schema. For example, http://purl.org/dc/terms/ will resolve to http://dublincore.org/2010/10/11/dcterms.rdf#name. I didn’t find any other schema/taxonomy that explicitly defines the URI for each element.
  • Regardless of the above behaviour, the Dublin Core XSD defines the namespace to include the ending ‘/’.
  • The namespaces of the RDF and OWL specifications include an ending ‘#’.
  • All namespaces included in the output from pingthesemanticweb, which lists the most popular semantic schemas, end in ‘/’ or ‘#’. Even the few that use urn format end in ‘#’ (e.g. urn:x-inspire:specification:gmlas:HydroPhysicalWaters:3.0#).
  • The Department of Defense Discovery Metadata Specification (DDMS) namespace, based heavily on Dublin Core, includes the ending ‘/’ just as DC does.
  • I could not find any namespaces that end in ‘}’, ‘^’, or ‘:’ (the first two of which are illegal, I think)

  • So, you might be thinking that we could just concatenate the namespace and local name together to form the string. To parse it, we could then split the string at the last occurrence of the delimiter character, keeping the delimiter as part of the namespace if it is a ‘/’ or a ‘#’. But wait! There’s more…

  • Many non-semantic-web schemas, like the XML Schema itself, xlink, and the OGC standards like gml, do not include the ending delimiter in their namespaces.
  • National Information Exchange Model (NIEM) namespaces, arguably somewhat-semantic, also do not include a trailing delimiter.
  • Neither does the Intelligence Community Metadata Standard for Information Security Marking (IC-ISM) namespace (which is in urn format).
  • Nor does the DOD core metadata OWL schema, at least as far as I can tell. Sorry, I couldn’t find an exact reference to that one.
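To make the problem concrete, here is a short sketch using the JDK's javax.xml.namespace.QName. It shows the {namespace}local round trip mentioned above, and then how naive "split at the last delimiter" parsing behaves for a namespace that carries its own trailing delimiter versus one that does not. The helper splitAtLastDelimiter is a hypothetical name for illustration only.

```java
import javax.xml.namespace.QName;

public class QNameRoundTrip {

    /** Naive parse: split at the last '/' or '#', keeping the delimiter in the namespace. */
    static QName splitAtLastDelimiter(String text) {
        int pos = Math.max(text.lastIndexOf('/'), text.lastIndexOf('#'));
        return new QName(text.substring(0, pos + 1), text.substring(pos + 1));
    }

    public static void main(String[] args) {
        QName title = new QName("http://purl.org/dc/terms/", "title");

        // The braces form survives a round trip through toString/valueOf.
        String braces = title.toString();
        System.out.println(braces);
        System.out.println(QName.valueOf(braces).equals(title));

        // Plain concatenation works when the namespace ends in its own delimiter...
        System.out.println(splitAtLastDelimiter("http://purl.org/dc/terms/title").equals(title));

        // ...but the XML Schema namespace has no trailing '#', so the naive split
        // leaves the delimiter stuck on the end of the namespace.
        System.out.println(
                splitAtLastDelimiter("http://www.w3.org/2001/XMLSchema#string").getNamespaceURI());
    }
}
```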

Resolution Rules

So if you want to represent a particular qualified name as a string and do it in a way that others are most likely to recognize as the “accepted” way to represent that particular QName and you want it to be reversible, at least within your own app, the best rules I could come up with are:

Creating the String

Call the path divider ‘/’ and fragment ‘#’ symbols sticky delimiters because they may be a part of (i.e. stick to) a namespace. Call the other possibilities (‘:’, ‘::’, ‘}’, ‘^’) formal delimiters because you know they only serve the purpose of being a delimiter.

  1. If the namespace ends in a delimiter of any form, simply append the local name directly to it.
  2. Else, use ‘:’, ‘^’ or, to be totally safe, surround the namespace string with ‘{}’ and then append the local name. I chose ‘:’ because I at least saw some uses of that form on various pages while I never saw any uses of the caret ‘^’ or the surrounding ‘{}’. If you have total control of your input and output, use the surrounding braces format since it is totally unambiguous.

Parsing the String

  1. If there is a ‘{}’ pair, you can assume the form is {namespace}local
  2. Else, find the last possible delimiter in the string. If it is a “formal” delimiter, then drop the delimiter and make the namespace the chars before it and local name the chars after it.
  3. Else, if the last delimiter is “sticky”, you have to guess whether to keep it in the namespace. I put some basic logic in my code to recognize well known namespaces (like those above) that do not end in a delimiter, but then otherwise assume that a sticky delimiter should be included in the namespace.

It’s not a perfect solution, but that’s what you get when there is no standard.
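The rules above can be sketched in Java roughly as follows. This is not the code from my project, just a minimal illustration: the class and method names are made up, and BARE_NAMESPACES is a deliberately tiny, hypothetical stand-in for the "well known namespaces" lookup in step 3.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import javax.xml.namespace.QName;

public class QNameStrings {

    // Hypothetical sample of well-known namespaces that do NOT end in a delimiter.
    static final Set<String> BARE_NAMESPACES = new HashSet<>(Arrays.asList(
            "http://www.w3.org/2001/XMLSchema",
            "http://www.w3.org/1999/xlink",
            "http://www.opengis.net/gml"));

    /** Creating the string: append directly after a trailing delimiter, else join with ':'. */
    static String format(QName name) {
        String ns = name.getNamespaceURI();
        if (ns.isEmpty()) {
            return name.getLocalPart();
        }
        char last = ns.charAt(ns.length() - 1);
        String joiner = "/#:^}".indexOf(last) >= 0 ? "" : ":";
        return ns + joiner + name.getLocalPart();
    }

    /** Parsing the string, following the three rules in order. */
    static QName parse(String text) {
        // Rule 1: {namespace}local is unambiguous, so let the JDK handle it.
        if (text.startsWith("{") && text.indexOf('}') > 0) {
            return QName.valueOf(text);
        }
        // Find the last possible delimiter.
        int pos = -1;
        for (int i = text.length() - 1; i >= 0 && pos < 0; i--) {
            if ("/#:^".indexOf(text.charAt(i)) >= 0) {
                pos = i;
            }
        }
        if (pos < 0) {
            return new QName(text); // no namespace at all
        }
        String local = text.substring(pos + 1);
        char delimiter = text.charAt(pos);
        // Rule 2: formal delimiters (':', '::', '^') are dropped.
        if (delimiter == ':' || delimiter == '^') {
            boolean doubleColon = delimiter == ':' && pos > 0 && text.charAt(pos - 1) == ':';
            return new QName(text.substring(0, doubleColon ? pos - 1 : pos), local);
        }
        // Rule 3: sticky delimiters ('/', '#') stay with the namespace unless the
        // namespace is known not to end in one.
        String bare = text.substring(0, pos);
        if (BARE_NAMESPACES.contains(bare)) {
            return new QName(bare, local);
        }
        return new QName(text.substring(0, pos + 1), local);
    }
}
```

For example, parse("http://purl.org/dc/terms/title") keeps the trailing ‘/’ in the namespace, while parse("http://www.w3.org/2001/XMLSchema#string") drops the ‘#’ because that namespace is in the known list. A namespace that itself ends in ‘:’ will not round-trip through this sketch, which is one more reason to prefer the braces form when you control both ends.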

Categories: groovy, OGC, semantic web, XML

Running latest Groovy from Maven

April 5th, 2011 2 comments

Say you have a groovy-project that you build with maven.  You use the org.codehaus.gmaven:gmaven-plugin to compile your groovy code and run groovy tests without a problem.  Then you add some features or tests that need groovy 1.7.  You add the proper dependency and version to the <dependencies> section of your pom, run your test… and watch it blow up because the gmaven-plugin defaults to using groovy 1.6.  So you dig around on the web and find references for how to use the <providerSelection> tag of the gmaven-plugin to get your code compiled with 1.7 and to use 1.7 when running tests.  Things seem good.  Until…

You add a feature that requires some version of groovy greater than 1.7.4 (the version included with the latest gmaven-plugin, 1.3).  In my case, I used the @Delegate annotation with some inheritance in a test configuration and hit a bug that was fixed in groovy 1.7.6.  No matter what version I used in my dependencies section, my tests were executed under groovy 1.7.4.  I finally came up with the configuration below which let me run with a different groovy.  Note that it made no difference what I included in the dependencies section.  The gmaven-plugin configuration appears to be completely independent of that.

<plugin>
    <groupId>org.codehaus.gmaven</groupId>
    <artifactId>gmaven-plugin</artifactId>
    <version>1.3</version>
    <configuration>
        <providerSelection>1.7</providerSelection>
        <!-- This is only used if you want to run a groovy script from the command line using maven -->
        <source>${groovy.script}</source>
    </configuration>
    <executions>
        <execution>
            <goals>
                <goal>compile</goal>
                <goal>testCompile</goal>
            </goals>
        </execution>
    </executions>
    <!-- This block is required in order to make the gmaven plugins use a groovy other than 1.7.4.
     This is independent of the groovy entry in the dependencies section.  This does not affect the class path.

     What is interesting is that there must be both the gmaven.runtime entry with the explicit excludes
     and the additional dependency on whatever version we do want to use.  If you exclude the former,
     it will throw an exception. -->
    <dependencies>
        <dependency>
            <groupId>org.codehaus.gmaven.runtime</groupId>
            <artifactId>gmaven-runtime-1.7</artifactId>
            <version>1.3</version>
            <exclusions>
                 <exclusion>
                     <groupId>org.codehaus.groovy</groupId>
                     <artifactId>groovy-all</artifactId>
                 </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.codehaus.groovy</groupId>
            <artifactId>groovy-all</artifactId>
            <version>1.7.6</version>
        </dependency>
    </dependencies>
</plugin>

It can happen to you: SIOCSIFFLAGS: Unknown error 132

November 10th, 2010 No comments

Came home from work with my laptop. Brought it out of sleep mode. No wireless. Menu says “Wireless disabled.” Huh? Try “ifconfig wlan0 up” and get back this oh so helpful message:

SIOCSIFFLAGS: Unknown error 132

Tweak, tweak. Google, google. Find page with people joking about how they always forget to check if the “disable wireless” switch on their machine has been set. Naaahhh…#$%^$%&%^&^%!!! It must have gotten toggled while in my bag.  I now have a new use for duct tape holding that thing in place.

Categories: Uncategorized

You don’t know what you don’t know

October 12th, 2010 No comments

Disclaimer: If you are reading this post, then you probably don’t need to.  It likely doesn’t contain anything that you don’t already know or practice.  But perhaps it will help you convince a co-worker who is still pounding away in the dark.

I used to believe that I didn’t need to keep up with technical RSS feeds or newsgroups. I learn things quickly and know how to use Google. Thus, if I ever wanted to find some tool to accomplish a certain task or if I hit a problem with a library I was using, I could quickly go out and find what I was looking for. Spending time reading the Hibernate user group and the JBoss user groups and the Spring user group and all the rest just took away time from designing and implementing the next feature on the schedule. Who could possibly remember all the other new product, library, and feature announcements unless you had a need for them right then?

I now know that this is an excellent way to regularly reinvent the wheel. Why? Because you don’t know what you don’t know. Sometimes it’s easy to guess when there are tools, libraries, or entire languages that can help you do your job. For example, we all know that there are many ORM, logging, and xml-parsing libraries.  There are many situations, however, where it is unlikely that you will realize you are facing a problem that someone else has already solved.

This happened to me last spring. We were in crunch time trying to finish up a release and I had fallen behind on my feed/newsgroup reading. I winced every time I opened up Google Reader and saw the unread count tick higher and higher. As part of this release, I wrote a handy little utility routine that let me override a single method of a class by using groovy’s ExpandoMetaClass. The nice thing about the routine was that it always discarded the modified MetaClass after the closure that was passed to it finished. I could thus remove the many try…finally blocks that were piling up in my tests as I tried to make sure I didn’t corrupt a MetaClass from one test to another.

A couple of days later, I was able to whittle down the backlog in Google Reader. That’s when I saw this entry from mrhaki: http://mrhaki.blogspot.com/2010/03/grails-goodness-using-metaclass-with.html. Built into the GrailsUnitTestCase class was a method that did exactly what I wanted, plus a lot more: registerMetaClass. Granted, my utility routine can be used outside of a Grails unit test, so it’s not a complete waste. But I could have saved myself a few hours of effort if I had been up to date on my reading. Perhaps I could have spent those hours catching up on the rest of my “To Read” list…

Categories: pain, Testing

Using Hudson to run tests in parallel

October 11th, 2010 1 comment

We have been using Hudson to do CI on our grails application for a few months now. We use it to run unit, integration, jsunit, and functional (via selenium) tests, as well as code coverage metrics using Cobertura. Since this was the first project that we used Hudson for, the person who originally set it up just put all of these steps into a single job. It worked fairly well, but a single build was taking between 20-30 minutes, depending on how extensive the changes were. This can be a real pain, especially during very active development cycles when you most need rapid feedback. I decided to look into how I could run the tests in parallel, and hopefully get our build times to under 5 minutes. This is what I have come up with so far. It’s not perfect, but the tests do run more quickly (around 5 1/2 minutes total, primarily limited by the speed of the Selenium tests).

Primary Job – Poll and Build

The primary job is very simple. It polls Subversion and executes clean and test-compile targets. (We use maven with the grails-plugin to compile the app.) It normally only takes 30-40s to run. It doesn’t do a full package/install because that adds another minute to the build time.

Primary Build Steps

Build steps for the primary job

Using the Groovy Plugin

One shortcoming of not running the package step is that the job doesn’t generate a unique file that can be used for fingerprinting like a jar or a war. You need to turn on fingerprinting in order to aggregate test results.  (More on that subject, below.)

To resolve this, I used the Groovy plugin to execute a one-line script to generate a unique file for this purpose. I pass in the path and build information as script parameters rather than environment properties because the Groovy plugin doesn’t resolve Hudson environment variables when they are set in the Properties field. This seems like a big shortcoming, so perhaps I just misunderstand how to pass them correctly as properties.

A side-effect of using the Groovy command option rather than invoking an existing Groovy script is that you end up with a copy of the script in your workspace directory. In fact, you get a copy of it for each build. I am not certain why the plugin needs to generate the script file at all as opposed to just running it in memory. Hopefully, this will get corrected in a future release. For now, I may put the script in a regular file just so I don’t have to clear out all the temps.
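The original one-liner is not reproduced here, but the idea is simple enough to sketch. The Java version below is only an illustration of "write a file whose content is unique per build"; the file name, the argument order, and the content format are all assumptions, and in the real job the values came in as script parameters.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class FingerprintFile {

    /** Writes a small file whose content differs for every build, and returns its path. */
    static Path write(Path directory, String jobName, String buildNumber) {
        String content = jobName + "#" + buildNumber + "\n";
        try {
            Files.createDirectories(directory);
            return Files.write(directory.resolve("build-fingerprint.txt"),
                    content.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical argument order: <directory> <job name> <build number>
        System.out.println(write(Paths.get(args[0]), args[1], args[2]));
    }
}
```

Because the content includes the build number, each build produces a different file, which is what lets it stand in for the jar or war that a package step would normally provide.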

Downstream Jobs – Test, Test, Test

The tests are divided into 3 downstream jobs that are kicked off when the primary job has a successful build. The jobs run the unit and integration tests, the Selenium tests, and the JsUnit tests, respectively. I didn’t bother splitting out the unit and integration tests into separate jobs because they take less time combined than do the Selenium tests. I could easily split them out later.

Cloning the Workspace

In order to speed up the downstream jobs, I didn’t want to pull the code for each and recompile. Instead, I used the Clone Workspace SCM Plugin to make copies of the primary job’s workspace for each downstream job. This adds only a few seconds to the primary job in order to create the zip archive.

Cloning a workspace - upstream project view

Cloning a workspace - downstream project view

I had two small issues with the plugin:

  1. The zip command doesn’t include empty directories by default. This can be an issue if any of your tests in downstream jobs expect some directories to be there already.
  2. In the downstream job, the working directory is set to the workspace directory itself rather than the (sub-)directory pulled from Subversion as it is in the primary job. This makes sense since the Clone Workspace plugin does clone the entire workspace and not just what was pulled from the SCM system. It just threw off some of the tests that expected files at specific paths. (Yes, I know – the tests should be more flexible. It’s “on the list…”)

I need to do more research to see if I can use this approach when I spin up some slave machines. I will post when I tackle that issue.

Downstream Build Steps

The build steps for the downstream jobs look like this:

Unit and Integration Tests Job

Functional Tests Job

JsUnit Tests Job

You can see that the two jobs which run the server-side test jobs execute a shell command rather than a regular maven target. This was because of the working directory issue I mentioned above. I use the shell to cd into the proper directory first, and then execute the tests. A little hackish, but it was a good experiment with using the Execute shell build step.

You can also see how many different ports are used by the tests – the JsUnit acceptor servlet, the Selenium server servlet, the app itself when launched for Selenium, etc. I use the Hudson Port Allocator Plug-in to keep tests from stomping on each other.

Aggregating Test Results and Fingerprinting

I turned on test aggregation in the primary job so I could get a single view of test results. In order for this to work, the primary job needs to know how to associate specific downstream builds with specific builds in it. This is done through the fingerprinting feature. None of the docs mention this connection. I didn’t figure it out until I did a build, clicked on the Aggregate Test Results link, and saw this error message:

Generating a unique-per-build file using the Groovy script discussed above let me link the jobs together for aggregating.

NOTE: You do not select the “Publish JUnit test result report” option in the primary job as part of this feature. That option is only used in the jobs that actually run the tests. If you turn it on in the primary job, you will get an error during the build because there are no JUnit files for it to process.

The Aggregate Test Results feature is nice since it provides links through to the test results pages for the downstream builds and a simple table listing the fail/total counts from the downstream builds.

Aggregate results report for a good build

Unfortunately, there appears to be a bug in it where it will not report results for any downstream job that has failures. A failing downstream job is shown in the build details page for the primary job

Aggregate results summary with failing job

but you can see that the Aggregated Test Result link lists “no failures.” Clicking that link shows the summary table, but it is missing a line for the downstream job with failures:

Aggregate results, but failing job is missing

In addition to having this bug, the feature does not show the test result trend chart or the tests by packages in the primary build. This makes it of very limited usefulness.

Next Steps

I was able to accomplish my primary goal of cutting our CI build times, but not without losing some required output.  Most of the shortcomings for this approach are related to one core issue – Multiple jobs means multiple places to configure and view results. For example, I had to configure failure email notifications in all downstream jobs rather than just one. Also, there is no way to get an aggregated code coverage report that spans all the tests (unit, integration, and functional). I could live with not having a single view of all test failures since I can get that info from the downstream jobs, but not having accurate code coverage metrics is not an option. I have to figure out a way around that.

Since most of the problems with this configuration were related to aggregating the test results (both direct results and coverage stats), my next step will be to try a “diamond” build configuration using the Join Plugin. Hopefully, I can pull all of the test results, cobertura.ser, and other such files into the bottom of the diamond to get a single place to view the status of the build.

I also want to get CodeNarc output displaying for the job via the Violations plugin. I can generate the CodeNarc output, and the Violations plugin tries to parse it, but it then crashes with an NPE. I need to narrow down what part of the file is causing the exception so I can report the issue.

Categories: Testing

Hudson + Windows + JsUnit = Boom

October 5th, 2010 No comments

Anyone thinking about upgrading Hudson on a Windows box might want to hold off for a bit if you have any Ant build steps. In Hudson v1.370, they introduced a change for parsing some Ant parameters (HUDSON-7108). However, this caused several other problems, including one that will bust any step you have that runs JsUnit tests (HUDSON-7657). I banged my head against this for a while before I realized that my configuration changes hadn’t caused the problem, so I thought I’d try to save others some pain.

Categories: Testing

Mr. Haki tears it up

June 15th, 2010 No comments

I always love the Groovy Goodness blog. So many gems in such a compact space. He just posted about a dozen tips in the past day alone, all on stuff that I could use in one project or another. The only thing I don’t like about it is it makes me want to go in and update all the places in my code where I could use the new calls right now. Major time suck… must resist… ohhhh… maybe just one little tweak…

Categories: Uncategorized

Validation differences between Grails 1.1 and 1.2

June 9th, 2010 No comments

…or My Penance For Ignoring Validation Errors…

We recently updated our app from Grails 1.1.1 to 1.2.2. (We wanted to move all the way to 1.3.1 but since we build with maven, we finally decided to wait until a new grails-maven plugin is released. See GRAILS-6327.) During the upgrade, we hit two particularly annoying issues related to persistence and setting up associations.

The first involved a belongsTo relation like this:

class AppUser {
    UserPrefs prefs = new UserPrefs()
}

class UserPrefs {
    static belongsTo = [user: AppUser]
}

This worked under Grails 1.1, but under Grails 1.2 the prefs object was not getting persisted when the user was saved. Other belongsTo relations, both one-to-many and one-to-one worked as expected. We finally discovered that using the short form of the relation notation did work:

static belongsTo = AppUser

This post provided a clue as to where the problem may be. The user property of the UserPrefs object is not automatically set because the relationship is one-to-one. (At least, it is not automatically set all the time. If we load an AppUser object with Hibernate, it appears that the user property is set in the UserPrefs object. Go figure.) Under Grails 1.1, having a null user was apparently fine. My guess (based on the second error we hit, below) is that any validation errors caused by the child object were not stopping the save of the parent object. Grails 1.2, on the other hand, does care about the child object validation, though I could never get it to report anything in the errors property of either the parent or child object.

In order to work around the issue, I loosened the constraints on UserPrefs a little:

class UserPrefs {
    static constraints = {
        user(nullable: true)
    }
    static belongsTo = [user: AppUser]
}

With this change, cascading persistence works. Granted, it’s not an ideal solution, but it does let us have access to the user property if needed.

What made this extra confusing is that we have a similar relationship in another set of classes that worked without a problem. The only difference there is that the child object is not automatically created by a field initializer, as is done for the prefs property in AppUser. It is always explicitly set into the owning object. I’ve run out of time to pursue this one further, so I don’t have a final answer. Anyone else out there have more insight into this?

The second persistence issue related to this section of code in the update method of one of our controllers:

def update = {
    def hubMap = HubMap.get(params.id);
    hubMap.properties = params;
    if (hubMap.save()) {
        ...add success message to response...
    } else {
        ...add failure message to response...
    }
    ...
}

Again, the code worked fine under Grails 1.1.1, but the save call failed under 1.2.2. Unfortunately, there was a hole in both our unit and integration tests for this method, so we didn’t catch it until much later in the release cycle.

Was the JSON conversion in the controller’s get method that generated the original data for the browser different? Nope.
Had the behaviour of the properties meta-method changed? Nope.

The difference is in how Grails handles any existing validation errors on a domain object when you call save(). In our case, the JSON that was being sent to the controller (via a Prototype AJAX call) contained two properties that were references to other objects. The javascript object conversion in the browser was not setting these properties in a meaningful way for the AJAX call; they were both coming across with the string value [object Object]. Since these fields are never updated in this particular workflow, we had never checked what was happening to them. Because Grails obviously could not convert the string values to the proper objects, it ignored the values and set two errors on the domain object to record them. However, we didn’t check for errors after the properties call. We went straight to the save call. Under Grails 1.1, deep within the saving sequence in a random class called AbstractDynamicPersistentMethod, you come across this bit of code:

doInvokeInternal(...) {
    ...
    Errors errors = new BeanPropertyBindingResult(target, target.getClass().getName());
    mc.setProperty(target, ERRORS_PROPERTY, errors);
    ...
}

Any existing errors are replaced with a fresh, clean BeanPropertyBindingResult object, wiping out the error information. We never spotted it because we never expected the properties with errors to change anyway. That hole has been closed in Grails 1.2:

doInvokeInternal(...) {
    ...
    Errors errors = setupErrorsProperty(target);
    ...
}

The new setupErrorsProperty call will copy out existing errors. We put in a few adjustments to not attempt to update those properties and all is well.
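To see why the same controller code succeeded under one version and failed under the other, here is a toy model in plain Java. It is not Grails or Spring code: Domain, the string-based errors list, and both save methods are invented for illustration, and the only thing carried over from the snippets above is the difference between replacing the errors object and keeping what was already recorded.

```java
import java.util.ArrayList;
import java.util.List;

public class ErrorsOnSave {

    /** Stand-in for a domain object; errors recorded during data binding live here. */
    static class Domain {
        List<String> errors = new ArrayList<>();
    }

    /** Stand-in for validation: passes only if no errors are recorded. */
    static boolean validate(Domain target) {
        return target.errors.isEmpty();
    }

    /** 1.1-style save: start from a fresh errors object, losing any binding errors. */
    static boolean saveDiscardingErrors(Domain target) {
        target.errors = new ArrayList<>();
        return validate(target);
    }

    /** 1.2-style save: carry the existing errors over, so binding errors fail the save. */
    static boolean savePreservingErrors(Domain target) {
        target.errors = new ArrayList<>(target.errors);
        return validate(target);
    }

    public static void main(String[] args) {
        Domain first = new Domain();
        first.errors.add("typeMismatch on a reference property");
        System.out.println("discarding: " + saveDiscardingErrors(first));

        Domain second = new Domain();
        second.errors.add("typeMismatch on a reference property");
        System.out.println("preserving: " + savePreservingErrors(second));
    }
}
```

In this model the first save reports success even though binding failed, which matches what we saw under 1.1. The practical lesson is the same either way: check for errors right after assigning properties rather than relying on save to surface them.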

So there you have it. A couple of gotchas in the upgrade path for grails. Hope this saves some folks from banging their head against the wall.