Zope Zeo vs. standalone setups

We do some Plone development here at Redi. Plone is a powerful but unfortunately rather heavy CMS, best suited for intranets, so we are always looking for ways to speed it up.

Enter the Zeo cluster - a feature that nowadays comes bundled with Zope and allows one database (in practice, Data.fs) to be shared by multiple Zope instances, or more accurately Zeo clients. In a standalone installation only one CPU / CPU core can effectively be used for processing requests (Zope does run several threads, but AFAIK Python's global interpreter lock lets only one of them execute Python code at a time). So if there are concurrent requests, they are queued, the database (ZODB, the Zope Object Database) usually has to wait for request processing to finish before it is asked for data, and only part of the processing power is used. With the Zeo server-client architecture, however, each Zeo client is a separate process that can do its processing on its own CPU/core (thus using the whole CPU processing power available), and the clients also minimize hard disk idle time by asking for data in a more or less asynchronous manner (in separate queues). ZODB even serves the same object simultaneously to different client processes for performance reasons. This can raise database ConflictErrors, which are nothing to be afraid of, as noted some paragraphs below.

Similarly, you could deploy Zeo clients on different computers in the local network (or wherever you want), but that is outside the scope of this article. Clients running on different machines rest on the same performance principle, but connection lag, bandwidth limits and the like decrease performance.

Theory vs. practice

Deploying a Zeo cluster instead of a standalone Zope instance should theoretically increase performance by a factor equal to the number of available CPUs / CPU cores. There might be some overhead from this setup though, so we tested it using ApacheBench (ab) - the benchmarking tool that comes bundled with Apache nowadays. But first something about…

Setting up Zeo & converting from standalone mode

In the easiest scenario, setting up Zeo is simple: the unified installer supports a Zeo server setup out of the box (there is a recipe for it). Just run the unified installer like this:

$ ./install.sh zeo

Luckily, the unified installer uses buildout from Plone 3.1 onwards. Thus, converting your current buildout instances to a Zeo cluster is nothing but a change of buildout configuration. Where you would normally have an ‘instance’ section in your buildout.cfg, you now need the following:

[zeoserver]
recipe = plone.recipe.zope2zeoserver
zope2-location = ${zope2:location}
zeo-address = 127.0.0.1:12000
#effective-user = __EFFECTIVE_USER__

[client1]
recipe = plone.recipe.zope2instance
zope2-location = ${zope2:location}
zeo-client = true
zeo-address = ${zeoserver:zeo-address}
# The line below sets only the initial password. It will not change an
# existing password.
user = admin:mysecretpassword
http-address = 12001
#effective-user = __EFFECTIVE_USER__
#debug-mode = on
#verbose-security = on

# If you want Zope to know about any additional eggs, list them here.
# This should include any development eggs you listed in develop-eggs above,
# e.g. eggs = ${buildout:eggs} ${plone:eggs} my.package
eggs =
    ${buildout:eggs}
    ${plone:eggs}

# If you want to register ZCML slugs for any packages, list them here.
# e.g. zcml = my.package my.other.package
zcml =

products =
    ${buildout:directory}/products
    ${productdistros:location}
    ${plone:products}

To add more clients (which is quite the point here), append an extra client section like this for each additional client:

[client2]
recipe = plone.recipe.zope2instance
zope2-location = ${zope2:location}
zeo-client = true
zeo-address = ${zeoserver:zeo-address}
user = ${client1:user}
http-address = 12002
#effective-user = __EFFECTIVE_USER__
#debug-mode = on
#verbose-security = on
eggs = ${client1:eggs}
zcml = ${client1:zcml}
products = ${client1:products}

That minimizes the need for retyping user names, passwords etc. These examples were taken from the Plone unified installer's buildout.cfg, with the ports changed.
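If you run more than a couple of clients, the repeated sections can also be generated instead of typed. The following is only a minimal sketch in Python: the function name `client_sections` and the assumption that HTTP ports are allocated consecutively after client1's 12001 are mine, not something the unified installer provides.

```python
# Sketch: build the text of extra [clientN] sections for buildout.cfg.
# Assumes [zeoserver] and [client1] exist as shown above and that the
# clients use consecutive HTTP ports (12002, 12003, ...).

CLIENT_TEMPLATE = """\
[client{n}]
recipe = plone.recipe.zope2instance
zope2-location = ${{zope2:location}}
zeo-client = true
zeo-address = ${{zeoserver:zeo-address}}
user = ${{client1:user}}
http-address = {port}
eggs = ${{client1:eggs}}
zcml = ${{client1:zcml}}
products = ${{client1:products}}
"""


def client_sections(total_clients, first_port=12001):
    """Return buildout text for client2..clientN (client1 is written by hand)."""
    sections = []
    for n in range(2, total_clients + 1):
        port = first_port + n - 1  # client2 -> 12002, client3 -> 12003, ...
        sections.append(CLIENT_TEMPLATE.format(n=n, port=port))
    return "\n".join(sections)


if __name__ == "__main__":
    # Print sections for a four-client cluster; paste them into buildout.cfg.
    print(client_sections(4))
```

Remember that buildout only builds the sections listed in the parts option of [buildout], so each new client name has to be added there as well.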

Starting, stopping & restarting

Now, to start your Zeo-powered Plone clients you could type:

bin/zeoserver start
bin/client1 start
bin/client2 start
...same for all the clients...

However, the unified installer has a recipe which automatically generates nice and simple shell scripts to control your cluster. At the end of your buildout.cfg, add:

[unifiedinstaller]
recipe = plone.recipe.unifiedinstaller
user = ${client1:user}
primary-port = ${client1:http-address}

That should generate the scripts. In fact, it probably also does something else that I’m not aware of. However, I haven’t bumped into any problems yet :) Anyway, to start the whole cluster (server & clients), type:

bin/startcluster.sh

And that does it (it starts the server and the clients). Shut it down via:

bin/shutdowncluster.sh

And restart:

bin/restartcluster.sh

ConflictErrors - not that erroneous

As noted before, in Zeo mode the ZODB might serve the same objects to two or more clients at the same time. If one client modifies the object before the others (i.e. edits values and saves the changes), the other requests will probably fail. This raises a ConflictError, which looks like this:

ConflictError: database conflict error (oid 0x0f39, class HelpSys.HelpSys.ProductHelp)

In this case the failed requests are retried automatically. This is a common database approach and thus a feature, not a bug (although Zope might want to say so in the error message!). For a more accurate explanation see the Plone discussion.
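The retry idea can be illustrated with a small, self-contained sketch. Everything here is hypothetical: the `ConflictError` class is a stand-in for the real ZODB exception, and `run_with_retries` and its default of three retries are assumptions made for illustration, not the actual Zope implementation.

```python
class ConflictError(Exception):
    """Stand-in for ZODB's conflict exception (hypothetical, for illustration)."""


def run_with_retries(handle_request, max_retries=3):
    """Call handle_request(); on a conflict, run the whole request again.

    max_retries is an assumed value, not read from any Zope configuration.
    """
    attempt = 0
    while True:
        try:
            return handle_request()
        except ConflictError:
            attempt += 1
            if attempt > max_retries:
                raise  # give up; only now would the user see the error


# Simulated request that loses the race twice and then succeeds.
calls = {"count": 0}


def flaky_request():
    calls["count"] += 1
    if calls["count"] <= 2:
        raise ConflictError("database conflict error")
    return "saved"


result = run_with_retries(flaky_request)
print(result, "after", calls["count"], "attempts")
```

The point of the sketch is that a conflict in the log does not necessarily mean a failed page for the user: the request is simply started over with fresh data.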

Putting it together with a web server

The Zeo components (server and clients) talk to each other over TCP. In the default setup, the Zeo server listens on port 8100 and the Zeo clients on 8080, 8081, etc. Thus, to access the separate clients as ‘one site’ we need to spread the requests across multiple clients. This can be achieved with a load balancer. Apache has at least one: mod_proxy_balancer, which should do exactly what we need. Apache isn’t the best choice for achieving high requests-per-second values, but it will do for our tests (compare it to the more lightweight but also more limited lighttpd). Just remember that there are other alternatives available, such as using Squid as the load balancer.

Our configuration is as follows (inside the VirtualHost directive):

  <Proxy balancer://lb>
    BalancerMember http://127.0.0.1:12001/
    BalancerMember http://127.0.0.1:12002/
    BalancerMember http://127.0.0.1:12003/
    BalancerMember http://127.0.0.1:12004/
  </Proxy>

  <Location /balancer-manager>
    SetHandler balancer-manager
    Order Deny,Allow
    Allow from all
  </Location>

  ProxyPass /balancer-manager !
  ProxyPass             / balancer://lb/http://localhost/VirtualHostBase/http/www.mydomain.com:80/plonesite/VirtualHostRoot/
  ProxyPassReverse      / balancer://lb/http://localhost/VirtualHostBase/http/www.mydomain.com:80/plonesite/VirtualHostRoot/

This setup also allows us to use the balancer-manager (accessible at /balancer-manager) that comes with mod_proxy_balancer. It’s useful for checking that the configuration works and that the balancer divides the requests equally. In my setup the balancer uses the default Request Counting algorithm, which gives each instance a numerically equal share of the requests, but you might also want to try Weighted Traffic Counting, which balances by the amount of data transferred and should suit real-world use better. In our test, however, only the front page is accessed, so every request transfers the same amount of data and weighted traffic counting brings no benefit.
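To make the idea of request counting concrete, here is a deliberately simplified model in Python. It is not Apache's actual algorithm (which also takes per-member load factors into account); the functions `pick_member` and `distribute` are hypothetical names, and the member addresses are the ones from the configuration above.

```python
def pick_member(counts):
    """Return the member that has served the fewest requests so far."""
    return min(counts, key=counts.get)


def distribute(members, n_requests):
    """Hand out n_requests one by one, always to the least-used member."""
    counts = {member: 0 for member in members}
    for _ in range(n_requests):
        counts[pick_member(counts)] += 1
    return counts


members = [
    "127.0.0.1:12001",
    "127.0.0.1:12002",
    "127.0.0.1:12003",
    "127.0.0.1:12004",
]
counts = distribute(members, 1000)
for member, served in counts.items():
    print(member, served)
```

With four equal members the requests end up split evenly, which is what the balancer-manager page should roughly show after a benchmark run.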

The test

The server machine

The setup

The tests were run locally in a development environment to minimize network lag (it was 0-1 ms).

The test commands

ApacheBench commands:

$ ab -n N -c C myurl

where N was either 1000 or 9000 (requests) and C 1, 10, 100 or 1000 (concurrent requests).
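The full set of runs can be listed with a few lines of Python. This is only a sketch: it assumes that every combination of N and C is of interest, and the URL is a placeholder rather than the address we actually tested.

```python
from itertools import product

REQUESTS = (1000, 9000)            # values used for -n
CONCURRENCY = (1, 10, 100, 1000)   # values used for -c


def ab_commands(url):
    """Return one ab argument list per (requests, concurrency) combination."""
    return [
        ["ab", "-n", str(n), "-c", str(c), url]
        for n, c in product(REQUESTS, CONCURRENCY)
    ]


# Placeholder URL; replace it with the site under test.
for cmd in ab_commands("http://www.mydomain.com/"):
    print(" ".join(cmd))
```

Each argument list can be passed to subprocess.run() to execute the benchmark for real; printing the commands first makes it easy to check the matrix before loading the server.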

The results

You can download the more in-depth test sheet Plone Standalone vs. Zeo installation (PDF).

To put it simply: theory and practice meet well - a Zeo setup is a lot more powerful with concurrent requests. With non-concurrent requests the results are about the same.

Having as many Zeo clients as CPUs / CPU cores can boost performance by up to the number of CPUs/cores. For example, on our quad-core server the Zeo setup gave nearly 4 times the requests per second of the standalone installation (~370% to be accurate). Increasing the number of Zeo clients to 6 didn’t help at all, as there is no processing power left over from 4 heavily stressed client processes. Also worth noting is that the waiting times for clients more than doubled (the median jumped from 126 to 305 ms) when raising concurrency from 1 to 10. This isn’t bad though - those are still low figures compared to standalone’s median of 1215 ms! Only when raising concurrency to 100 did we begin to see waiting times of some 3.6 seconds (6 seconds for standalone). As expected, increasing concurrency didn’t bring the requests/second rates down much (less than 5%).
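The ratios behind these figures are easy to recompute. The sketch below uses only the numbers quoted in the paragraph above; reading "~370%" as a 3.7x speedup and comparing the standalone median against the Zeo median at concurrency 10 are my interpretations of the text.

```python
cores = 4
speedup = 3.7                 # "~370%" of the standalone requests per second
zeo_median_c1_ms = 126        # Zeo median waiting time, concurrency 1
zeo_median_c10_ms = 305       # Zeo median waiting time, concurrency 10
standalone_median_ms = 1215   # standalone median (assumed: same concurrency)

# How much of the theoretical 4x did the four clients deliver?
efficiency = speedup / cores

# How much did waiting grow when going from 1 to 10 concurrent requests?
wait_growth = zeo_median_c10_ms / zeo_median_c1_ms

# How much longer did a standalone user wait than a Zeo user?
standalone_vs_zeo = standalone_median_ms / zeo_median_c10_ms

print(f"parallel efficiency:       {efficiency:.3f}")
print(f"Zeo wait growth (1 -> 10): {wait_growth:.2f}x")
print(f"standalone vs. Zeo wait:   {standalone_vs_zeo:.2f}x")
```

In other words, four clients deliver a bit over 90% of the ideal four-fold speedup, and the standalone median wait is about four times the Zeo median.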

Overall, the results were as expected, but now we have evidence: under concurrent request load a Zeo cluster is a good option for multiplying the performance of your site. For very low traffic sites which rarely get more than 1 request at a time this doesn’t matter.

One bad word about the resource requirements though: the RAM usage increase for the 6-client Zeo setup (standard Plone 3.1.2 + 12 additional Products) was a whopping 621 MB (1132 MB -> 1753 MB). That means about 100 MB per Zeo client, as the Zeo server's memory use was only about 12-15 MB. Thus, only use as many Zeo clients as absolutely necessary or you might find your beloved server machine suffering from a very serious Zope flu!
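The per-client figure follows directly from the measurements above. A short sketch of the arithmetic, where the helper `estimated_increase_mb` is a hypothetical rule of thumb that simply assumes memory grows linearly with the number of clients:

```python
ram_before_mb = 1132
ram_after_mb = 1753
clients = 6
zeo_server_mb = (12, 15)   # measured range for the Zeo server process

increase_mb = ram_after_mb - ram_before_mb

# Subtract the server's share, then split the rest between the clients.
per_client_mb = [(increase_mb - server) / clients for server in zeo_server_mb]


def estimated_increase_mb(n_clients, client_mb=100, server_mb=15):
    """Rough RAM increase for n_clients, assuming linear growth."""
    return server_mb + n_clients * client_mb


print("increase:", increase_mb, "MB")
print("per client:", per_client_mb, "MB")
print("estimate for 4 clients:", estimated_increase_mb(4), "MB")
```

On a quad-core machine, where clients beyond four brought no speedup, stopping at four clients would save roughly 200 MB under this assumption.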

Plone business, part II, conference in Shanghai

Let me first frame this blog entry: I was Shanghai’ed last night. I picked up this wonderful new term when I was eavesdropping on a conversation at the Plone conference. “Did you Shanghai’ed last night?” “What do you mean?” “Drink till 5 am.” I must admit that I Shanghai’ed quite badly. But you don’t need to worry about me. Friends from my hotel, they were totally Finnish’ed.

So, I am sitting at Fiumicino airport on my way back from the Plone conference, enjoying a well deserved hangover. Since I really don’t have anything else to do, I finally have time to update my blog. 6-7 months ago I wrote a blog entry about what it feels like to start one’s own software sweatshop. Things have progressed quite nicely and I’d like to give a status update.

First about the conference. Plone Conference 2007 (Plone being the best open source content management system out there, with a strong business-oriented community) was held in Naples, Italy. It was my first big open source gathering - 350 participants were present from all over the world. My home city, Oulu, is a bit off the international routes and travelling was expensive: my wallet got Finnish’ed too. Still, I believe the marvellous conference was worth every euro.

I finally met one of my US clients face-to-face. They are a nice senior couple who run a family web business. We had a lot of wonderful discussions and got some human depth into our business relationship, which I believe will greatly help get possible joint ventures done more smoothly in the future.

Getting faces and voices for IRC nicks was a little confusing, in a positive way. You were hit with thoughts like “Oh my, this guy is so tall” or “I expected a younger one”. I also noticed I have started referring to the Plone scene as “we” instead of “them” - a clear sign that the community has stolen my soul.

As for the presentations and workshop sessions, Plone developers are, to quote Limi, “- geniuses”. Plone/Zope is currently very hard to master from a software development point of view, but this is a recognized problem. People are working to make it easier and more like the other agile web frameworks out there (Django, Ruby on Rails or TurboGears).

By the way, writing on my MacBook feels like being the main character of Sex and the City. 90% of people in the Plone scene use Macs. I wonder why we don’t ask Apple for sponsorship, since being in touch with Plone folks is an ultra-effective word-of-mouth marketing channel for Apple. I didn’t have a Macintosh laptop at the previous Plone event I attended. The social pressure to get one is enormous - I had to order a MacBook just for this trip =)

The business part

Currently, Red Innovation has four employees and one contractor. I am managing things, doing less coding every day. Three of my friends work as flexible part-timers until they finish their studies. One freelancer does contracting through me for bigger clients. Plone is not the only thing in our boat: we have been doing everything from embedded mobile phone Linux to video and social sites in every known programming language.

At the conference, there were really cool presentations on how to become a Plone consultant and how to work with Plone consultants. I wish I could send the video of the latter presentation to all of my clients. Two of its main points were to be open and to have a budget for consultants. It’s really hard to work outside the information loop, when all feature requests arrive as if handed down from Heaven, without any explanation of the logic behind them. You can implement the feature, but the result is what the client told you, not what the client was thinking. Small fixes and adjustments follow the initial implementation when the client clarifies his/her wishes. Even more small adjustments follow and micro-management ensues. It’s stressful to work this way, since you have zero creative freedom and the same thing must be fixed repeatedly.

In the conference presentation on the consulting business they said that the first few years are hard. You are not paid well and you don’t have any cash reserve, which drives you to accept all the work you can get. This slowly makes you drift away from developing a sustainable, growing business and from your original goal as an open source entrepreneur.

Sadly, I can confirm this to be very true. Currently I am working hard just to keep the company running. Contracts are too short. The continuity is missing. The average hourly fee is somewhere between 40€ and 50€. As they mentioned in the presentation, in the UK the hourly price for Plone work was more like 80€, and this was for contracts shorter than a year. I can only dream about getting a Plone contract longer than a year. The most horrible thing to reveal is that our company is accepting PHP jobs whilst lacking suitable projects for our core technology platform.

However, this is just a start. It was expected. On the bright side, our average hourly price is rising, slowly but steadily. I have achieved almost everything I dreamed of when I finished high school. I have my own luxury top floor office and I can decide my own working hours. Most importantly, I am not doing a Dilbert job as a code monkey, transforming design documents into code whilst boring myself to death with insignificance in a grey corporate mass.

At the conference, a friend of mine asked something along the lines of: “Mikko, how high do you think you are in the Plone developer ranking?” This drew out the following truth: I won’t ever wave a flag at the very top of Mt. Plone-developers. When you want to run a business, be prepared to sacrifice the happiness of coding. You will write Word documents and emails, talk on the phone, and so on. Technology moves forward; you don’t have time to catch the train. Sooner or later you’ll see only the taillights of your formerly superior technical skills.

But, my dear Service Buying Friend, this doesn’t matter from your point of view as a client. Nowadays I offer something better than my coding thumbs. Instead of solving your problem myself, I can point you to a person who will (and take my little slice in the middle). I want to change the world and make it a happier place for all of us. I believe I achieve a far greater effect by establishing a role model for running an open source business than by just coding, coding and coding. Even if I code very cool stuff.

User group Finland?

Finland lacks an open source business scene. The contracting business is generally very conservative here. Companies, with the exception of Nokia, have no clue how to be part of a community or why to contribute back to it. Our public sector is enjoying high Microsoft-corruption where ASP and Sharepoint are the words.

There must be a change in this. Even for the sake of the national budget: I want my tax money spent on cost effective Finnish open source solutions instead of the crap produced in Redmond.

Maybe a Plone-Python user group, working with patience, could have an effect here. The evangelistic work of teaching customers, teaching the public sector and keeping up noise in the media would eventually convert the hearts of non-techies. Besides Red Innovation, there is one known active Plone consulting shop in Finland. Two universities are using Plone on a large scale. Roberto Allende’s great presentation about starting a user group just might have motivated me enough to gather some leadership and accept the task of bringing the pieces together.

P.S. Plone 3.0 compatible DataGridField release will come soon! I promise!

P.P.S. Posting this blog entry, with the Wordpress update and all, took 7 days. DGF 1.6 is already out there…

Old blog

The old blog can be found at http://www.redinnovation.com/old-blog from now on.

There might be some information and instructions you are looking for.

Gravatars = Globally recognized avatars

When travelling in the blogosphere you will be recognized, besides by your nick, by your avatar. An avatar is a “user icon”, a user-chosen image shown next to his/her comments. This cheerful little image brings a whole new dimension of feeling to otherwise dull textual messages.

In the past, you had to register and upload your image separately on every site you visited. Not anymore. Gravatar is a service where one can register a globally recognized avatar for himself/herself. An image is associated with an email address, and every time you post somewhere using this email address, your avatar image is automatically fetched from gravatar.com. Simple and cool. The Gravatar images even have age ratings!
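How a site turns an email address into an image address can be sketched in a few lines of Python. The scheme shown is the classic one, an MD5 hash of the trimmed, lower-cased address appended to the avatar URL; the function name `gravatar_url` and its default values are my own choices for illustration.

```python
import hashlib


def gravatar_url(email, size=80, rating="g"):
    """Build a Gravatar image URL for an email address.

    size is the image width in pixels and rating is the most mature
    content rating the site is willing to show.
    """
    normalized = email.strip().lower()
    digest = hashlib.md5(normalized.encode("utf-8")).hexdigest()
    return f"https://www.gravatar.com/avatar/{digest}?s={size}&r={rating}"


print(gravatar_url("youremailaddress@internet.com"))
```

Because only the hash appears in the URL, the commenter's email address itself is not printed on the page.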

Of course, this requires Gravatar support from the service you are posting to. But since the service is free and has received good community feedback, we can expect to see more Gravatars in the future. To help the progress of the world, I added Gravatar support to blog.redinnovation.com. You are welcome to comment on our blog with your Gravatar image to see whether this service works.

Also, I found a little bug in ZenPax’s Gravatar2 plug-in for Wordpress. The Wordpress 2.2 rich text editor sanitizes the XML mark-up used in blog posts. The Gravatar2 plug-in used malformed syntax in its “place your Gravatar in a post” code, so its tag was cleaned away. Here are the changes you need to make for the plug-in to work with Wordpress 2.2.

In function gravatars_in_post() change line (1055) beginning with preg_match_all to

preg_match_all("/<gravatar>([^<]+)<\/gravatar>/", $content, $matches);

Now you can place Gravatar images in a blog post. In the Wordpress rich-text editor, toggle Code view and use the following tag:

<gravatar>youremailaddress@internet.com</gravatar>
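What the pattern extracts from a post can be tried out with an equivalent expression in Python (used here only for illustration; the plug-in itself is PHP, and the helper name `emails_in_post` is hypothetical). Python regular expressions have no delimiters, so the slash in the closing tag needs no escaping.

```python
import re

# Same idea as the plug-in's pattern: capture the text between the tags.
GRAVATAR_TAG = re.compile(r"<gravatar>([^<]+)</gravatar>")


def emails_in_post(content):
    """Return every email address wrapped in <gravatar>...</gravatar>."""
    return GRAVATAR_TAG.findall(content)


post = (
    "Written by <gravatar>youremailaddress@internet.com</gravatar> "
    "with help from <gravatar>someone@example.org</gravatar>."
)
print(emails_in_post(post))
```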

Voilà!

Hello Internet

I am proud to present the new Red Innovation Ltd. company blog. We are ditching our old and faithful Plone-based Quills blog in favor of Wordpress. Now that we have several people blogging, this change makes our life easier.

Though Plone is a wonderful CMS platform, it really lacks a finished blog product. I had only seen Wordpress blogs before. When installing Wordpress, I must admit I was impressed, which is not easily achieved after all these years on the Internet. It’s so easy. Hundreds of times easier than playing around with Plone/Zope. My natural skepticism towards PHP products just took a hard hit. Though PHP might not be the sharpest sword in the programming language armory, it doesn’t seem to prevent the creation of nice end user products.

Copyright © Red Innovation Ltd. 2008 All Rights Reserved.