Relativity of time - shortcomings in Python datetime, and workaround

Recently I found out that the standard library support for date and time calculations in Python is not quite as capable as I needed. The superficial leanness and simplicity of Python’s datetime module bites back sooner than you would expect. Unfortunately, when looking for replacements, I found that the existing replacement modules have issues of their own. This blog entry highlights various problems with the current Python datetime implementation and offers a partial solution.

Basics of time zones

Time zones are a relatively new invention in the long history of measuring time. During most of the 19th century pretty much every European town had its own definition of local time. It was not until 1880 that Greenwich Mean Time was officially made the standard time in Great Britain; much of the rest of the world had adopted the idea by the 1920s. Today, all countries in the world use standard time zones, though not every one uses full-hour offsets from GMT as originally conceived.

The concept of summer time (daylight saving time in AmE) complicates things further: for example, in the European Union the member states switch to summer time on the last Sunday of March at exactly 01:00 GMT. Summer time lasts until the last Sunday of October, 01:00 GMT. In Finland, this means that this year on 30th March the official time stepped from 02:59:59 EET to 04:00:00 EEST in an instant. Likewise, on 26th October this year, the summer time clocks will tick up to 03:59:59 EEST, and on the next second the local time will be 03:00:00 EET; almost an hour later the clock will read 03:59:59 again, this time in EET. Thus, the number of seconds between 02:59:59 and 04:00:00 on a single day might be 1, 3601, or 7201; the difference between 02:59:59 and 03:00:01 might likewise be 2 or 3602 seconds… or even undefined.
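
The three figures can be checked with a little arithmetic. The sketch below assumes Python 3 and, to stay self-contained, models EET and EEST as fixed offsets with datetime.timezone instead of loading a real time zone database; the helper name seconds_between is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed offsets stand in for a real time zone database
EET = timezone(timedelta(hours=2), "EET")    # Finnish standard time
EEST = timezone(timedelta(hours=3), "EEST")  # Finnish summer time

def seconds_between(start, end):
    """Elapsed seconds between two aware datetimes."""
    return int((end - start).total_seconds())

# 30 March 2008: the clocks jump forward at 03:00 EET
spring = seconds_between(datetime(2008, 3, 30, 2, 59, 59, tzinfo=EET),
                         datetime(2008, 3, 30, 4, 0, 0, tzinfo=EEST))

# An ordinary summer day with no switch
ordinary = seconds_between(datetime(2008, 6, 19, 2, 59, 59, tzinfo=EEST),
                           datetime(2008, 6, 19, 4, 0, 0, tzinfo=EEST))

# 26 October 2008: the clocks fall back at 04:00 EEST
autumn = seconds_between(datetime(2008, 10, 26, 2, 59, 59, tzinfo=EEST),
                         datetime(2008, 10, 26, 4, 0, 0, tzinfo=EET))

print(spring, ordinary, autumn)  # prints: 1 3601 7201
```

The same wall-clock readings give three different answers only because the offset attached to each reading changes.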

To alleviate obvious confusions and misunderstandings, a reference time scale can be used for calculations that concern different time zones. The obvious choice is Coordinated Universal Time (UTC) that replaced Greenwich Mean Time as the standard reference time scale for civilian applications in 1972. During the Internet era UTC has become increasingly important.

Time zones in Python - welcome to hell

Suppose you have a shared web calendar application that is used by people all over the world. Each user wants to view the calendar in their respective local time, and you wish to use UTC on the server. The server has been set up with Europe/Helsinki as the local timezone. And you wish to use the facilities provided by the Python standard library modules. Simple date arithmetic would be needed - what could possibly go wrong? You will soon find out that it is not at all simple. Actually it is annoyingly complicated:

>>> from datetime import datetime
>>> dt = datetime.now()
>>> dt
datetime.datetime(2008, 6, 19, 14, 51, 41, 296552)
>>> # ok, it prints the local time. Let's try to
>>> # convert it to UTC time...
>>> dt.utctimetuple()
(2008, 6, 19, 14, 51, 41, 3, 171, 0)
>>> # wait, ahem... 14:51:41... that can't be right...
>>> # the docs say: utctimetuple(...)
>>> #     Return UTC time tuple, compatible with time.localtime().
>>> #
>>> # ok.. so UTC time tuple, compatible with localtime...
>>> # WTF?? my local time zone is not UTC... strangely enough
>>> # the last field in the tuple, "is_dst", is 0, or false...
>>> # I thought June was in summer...
>>>
>>> # Ok, the factory method I need seems to be utcnow
>>> # - that way I can get the time in UTC?
>>> datetime.utcnow()
datetime.datetime(2008, 6, 19, 11, 59, 9, 750844)
>>> # fair enough, UTC time.

>>> # Let's try simple date arithmetic: the difference
>>> # between now... and now...
>>> datetime.now() - datetime.utcnow()
datetime.timedelta(0, 10799, 999984)
>>> # Hmm... now did that statement really
>>> # take 3 hours to execute?

The reason for these anomalies is that without any time zone information, instances of the datetime class behave as if they stored time in UTC. For our purposes this is unacceptable: if a user of the hypothetical calendar application proposes a meeting 2 hours from now, be it 17:15 EEST or 14:15 UTC, meeting.start - datetime.now() should at this very moment result in 2 hours regardless of the time zone of the user asking.
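
What the calendar needs is arithmetic on instants rather than on wall-clock readings. A minimal sketch, assuming Python 3 (whose datetime.timezone class is a later addition to the standard library) and a fixed offset that is only valid for June 2008; all variable names are illustrative:

```python
from datetime import datetime, timedelta, timezone

EEST = timezone(timedelta(hours=3), "EEST")  # fixed offset, an assumption

# The same meeting, entered by a user in Helsinki and by a user working in UTC
meeting_helsinki = datetime(2008, 6, 19, 17, 15, tzinfo=EEST)
meeting_utc = datetime(2008, 6, 19, 14, 15, tzinfo=timezone.utc)

# "Now" as the server sees it, two hours before the meeting
now = datetime(2008, 6, 19, 12, 15, tzinfo=timezone.utc)

# Aware datetimes compare and subtract as instants, not as wall-clock readings
same_instant = meeting_helsinki == meeting_utc
wait_helsinki = meeting_helsinki - now
wait_utc = meeting_utc - now

print(same_instant, wait_helsinki, wait_utc)  # prints: True 2:00:00 2:00:00
```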

While there are several freely available Python modules that provide date and time calculations, like Zope’s DateTime, none of them is really compatible with the datetime interface: if you use code that expects datetime instances, Zope’s DateTime objects will not help you. Some of the replacement modules, like mxDateUtil, seem to use dubious date arithmetic and are not really useful either. Clearly, we have to either fix the Python datetime class somehow, or provide a compatible implementation that works as expected.

Fixing datetime

Fortunately, Python datetimes can be made time zone aware by supplying an instance of tzinfo in the constructor. Unfortunately, the Python standard library does not provide any concrete implementations. Dang! Enter pytz, a Python library that supplies hundreds of concrete time zone definitions.

>>> import pytz
>>> eurhel = pytz.timezone("Europe/Helsinki")
>>> localt = datetime.now(eurhel)
>>> utct = datetime.now(pytz.utc)
>>> utct - localt
datetime.timedelta(0, 0, 3410)

Works as expected. And, utct - datetime.utcnow() fails with “TypeError: can’t subtract offset-naive and offset-aware datetimes” - which is good, as it would not yield sensible results. However, a look under the hood reveals that something is fundamentally wrong:

>>> import datetime  # the module itself this time
>>> datetime.datetime.now(eurhel)
datetime.datetime(2008, 6, 23, 18, 2, 31, 101025,
        tzinfo=<DstTzInfo 'Europe/Helsinki' EEST+3:00:00 DST>)
>>> datetime.datetime(2008, 6, 23, 18, 2, 31, 101025, eurhel)
datetime.datetime(2008, 6, 1, 18, 0, tzinfo=<DstTzInfo 'Europe/Helsinki' HMT+1:40:00 STD>)
>>> # after a minute...
>>> datetime.datetime(2008, 6, 23, 18, 2, 31, 101025, eurhel) - datetime.datetime.now(eurhel)
datetime.timedelta(0, 4687, 688091)

That’s right, the datetime object created by a call to the datetime.datetime constructor now seems to think that Finland uses the ancient “Helsinki Mean Time”, which was obsoleted in the 1920s. The reason for this behaviour is clearly documented on the pytz page: it seems the Python datetime implementation never asks the tzinfo object what the offset to UTC on the given date would be, and without knowing the date pytz seems to default to the first historical definition. Now, some readers might argue that the problem would go away simply by defaulting to the latest time zone definition. However, the problem would still persist: for example, Venezuela switched to GMT-04:30 on 9th December, 2007, so datetime objects representing dates either before or after the change would be wrong, whichever definition was chosen as the default.

The solution offered by the pytz pages is to use the normalize and localize methods of pytz tzinfo instances; however, this renders the whole datetime system too cumbersome to use. As I wanted to use datetime objects with time zones as easily as possible, I had to subclass the Python datetime implementation and hack some internal aspects of it. The module, fixed_datetime, also contains a function, set_default_timezone, that allows mimicking naive datetime objects: unlike ordinary datetime objects, fixed_datetime.datetime objects are never ‘naive’, but many of the methods default to the time zone set by this function.
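
The idea behind localize can be sketched without pytz: given a wall-clock time, first work out which offset was in force, then attach it. The following is only an illustration, assuming Python 3 and the current EU summer time rule for Helsinki. The names last_sunday and localize_helsinki are hypothetical, and the sketch reproduces neither the historical data of pytz nor its handling of ambiguous times.

```python
from datetime import datetime, timedelta, timezone

EET = timezone(timedelta(hours=2), "EET")    # standard time, UTC+2
EEST = timezone(timedelta(hours=3), "EEST")  # summer time, UTC+3

def last_sunday(year, month):
    """Midnight on the last Sunday of a 31-day month (March or October)."""
    last_day = datetime(year, month, 31)
    # weekday() is 0 for Monday and 6 for Sunday
    return last_day - timedelta(days=(last_day.weekday() + 1) % 7)

def localize_helsinki(naive):
    """Attach EET or EEST to a naive Helsinki wall-clock time."""
    # Both switches happen at 01:00 UTC: 03:00 EET in March, 04:00 EEST in October
    summer_start = last_sunday(naive.year, 3).replace(hour=3)
    summer_end = last_sunday(naive.year, 10).replace(hour=4)
    # Simplification: the repeated hour in October is always read as summer time
    zone = EEST if summer_start <= naive < summer_end else EET
    return naive.replace(tzinfo=zone)

summer = localize_helsinki(datetime(2008, 6, 23, 18, 2, 31))
winter = localize_helsinki(datetime(2008, 1, 15, 12, 0, 0))
print(summer.isoformat(), winter.isoformat())
```

The point of the design is that the offset is chosen from the date being localized, not fixed once when the time zone object is created.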

>>> import fixed_datetime

>>> # set default timezone...
>>> fixed_datetime.set_default_timezone("Europe/Helsinki")

>>> # uses default timezone...
>>> fixed_datetime.datetime.now()
fixed_datetime.datetime(2008, 6, 23, 18, 33, 20, 525486,
        tzinfo=<DstTzInfo 'Europe/Helsinki' EEST+3:00:00 DST>)

>>> # also works correctly
>>> fixed_datetime.datetime(2008, 6, 23, 18, 33, 20, 525486)
fixed_datetime.datetime(2008, 6, 23, 18, 33, 20, 525486,
        tzinfo=<DstTzInfo 'Europe/Helsinki' EEST+3:00:00 DST>)

>>> # UTC timestamps returned with UTC tzinfo
>>> fixed_datetime.datetime.utcnow()
fixed_datetime.datetime(2008, 6, 23, 15, 37, 44, 777729, tzinfo=<UTC>)

>>> # subtraction still works correctly!
>>> utcdt = fixed_datetime.datetime.utcnow()
>>> heldt = fixed_datetime.datetime.now()
>>> heldt - utcdt
datetime.timedelta(0, 5, 495702)

As a bonus, fixed_datetime.datetime contains methods to convert datetimes to and from ISO 8601 format. The methods support parsing the time zone field, too:

>>> fixed_datetime.datetime.fromisoformat("20081010T010203+0500")
fixed_datetime.datetime(2008, 10, 10, 1, 2, 3, tzinfo=<UTC+05:00>)

>>> fixed_datetime.datetime.fromisoformat("2008-10-10 01:02:03Z")
fixed_datetime.datetime(2008, 10, 10, 1, 2, 3, tzinfo=<UTC>)

>>> # fractional hours, decimal comma, odd timezone
>>> fixed_datetime.datetime.fromisoformat("2008-10-10 01,0341666667-04:37")
fixed_datetime.datetime(2008, 10, 10, 1, 2, 3,
        tzinfo=<UTC-04:37>)

>>> fixed_datetime.datetime.today().isoformat(' ')
'2008-06-23 18:54:32+03:00'

>>> # isoformat supports short format, too
>>> fixed_datetime.datetime.now().isoformat(short=True)
'20080623T185303.489792+0300'

>>> # addition across DST boundary works as expected:
>>> before = fixed_datetime.datetime(2008, 10, 26, 2, 0, 0)
>>> before
fixed_datetime.datetime(2008, 10, 26, 2, 0, tzinfo=
        <DstTzInfo 'Europe/Helsinki' EEST+3:00:00 DST>)

>>> # now, add 2 hours
>>> before + fixed_datetime.timedelta(seconds=7200)
fixed_datetime.datetime(2008, 10, 26, 3, 0, tzinfo=
        <DstTzInfo 'Europe/Helsinki' EET+2:00:00 STD>)
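
For comparison, two of the conversions above can be approximated with the plain standard library. This sketch assumes Python 3, where strptime understands the %z directive; it is not how fixed_datetime is implemented, and fractional_hours is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

# Basic (short) ISO 8601 format with a numeric UTC offset
parsed = datetime.strptime("20081010T010203+0500", "%Y%m%dT%H%M%S%z")

def fractional_hours(text):
    """Convert a fractional hour field such as '01,0341666667' to a timedelta."""
    hours = float(text.replace(",", "."))
    # Round to whole seconds, since the decimal fraction is only an approximation
    return timedelta(seconds=round(hours * 3600))

offset = parsed.utcoffset()
time_of_day = fractional_hours("01,0341666667")
print(parsed.isoformat(), offset, time_of_day)
```

The fractional field works out to 1 hour, 2 minutes and 3 seconds, matching the result shown for the decimal comma example.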

You can download the said module below.

Remaining issues

Not every issue is solved. Fixed datetime still does not accept “24” as an hour value (mandated by the ISO standard), and will throw an exception on positive leap seconds. Fixed datetime is also much slower than the Python implementation, as many of the operations need to create 2 or 3 intermediate datetime instances.

Sadly it seems that Java got it right: having one class (Date) that stores times in UTC milliseconds relative to the Unix epoch, and subclasses of the abstract Calendar class that deal with getting and setting individual components and date arithmetic in a localized way, would indeed be the best long-term solution. To some, Java’s date and calendar handling may seem overly complicated; to me it is the simplest way of representing the complex world of different calendars, time zones and other aspects of timekeeping. If only someone could persuade the Python devs to add something similar to the standard library…
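
The split between a stored instant and a localized view can be sketched in Python, too. This is only an illustration of the idea, assuming Python 3 and fixed offsets (Helsinki summer time, and the GMT-04:30 offset Venezuela adopted in December 2007); it is not a port of the Java classes.

```python
from datetime import datetime, timedelta, timezone

# Store: a single number, seconds since the Unix epoch (always UTC)
stored = datetime(2008, 6, 23, 15, 0, tzinfo=timezone.utc).timestamp()

# Display: convert to the viewer's zone as late as possible
HELSINKI_SUMMER = timezone(timedelta(hours=3))         # fixed offset, assumption
CARACAS = timezone(timedelta(hours=-4, minutes=-30))   # fixed offset, assumption

as_utc = datetime.fromtimestamp(stored, tz=timezone.utc)
in_helsinki = datetime.fromtimestamp(stored, tz=HELSINKI_SUMMER)
in_caracas = datetime.fromtimestamp(stored, tz=CARACAS)

print(in_helsinki.isoformat())  # prints: 2008-06-23T18:00:00+03:00
print(in_caracas.isoformat())   # prints: 2008-06-23T10:30:00-04:30
```

All three datetime objects compare equal, because they are views of the same stored number.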

Download

Download fixed_datetime.py, released under 3-clause BSD license.

Viivi & Wagner strip scraper

I wrote this little script as a mental exercise and to prove the power of the Python programming language. If anyone accepts the challenge, I’d like to see submissions in other programming languages ;)

For the foreigners: this is the best comic in Finland, so I hope you’ll get translations soon! It tells about the relationship of a woman and a pig (sic) reflecting the deepest shadows of Finnish social life.

"""
	Creates a local mirror of Viivi & Wagner strips by fetching all of them from hs.fi.

	Will create downloaded strips as
		2004/1.1.2004.gif
		2004/2.1.2004.gif
		...
		until today

	Try this in C++!

	Motivation: No one has built a Viivi & Wagner search engine with speech bubble OCR support
	and I desperately wanted to find "Kottarainen lentaa korvaan" strip for my gf.

	Time to complete: 20 min.

"""

__docformat__ = "epytext"
__author__ = "Mikko Ohtamaa"
__license__ = "BSD"
__copyright__ = "2008 Mikko Ohtamaa"

import os
import re
import urllib
from BeautifulSoup import BeautifulSoup

# 1.1.2004 start page
url = "http://www.hs.fi/viivijawagner/1073386660690"

# Loop until there is no longer next link
while True:
	stream = urllib.urlopen(url)
	html = stream.read()
	stream.close()
	soup = BeautifulSoup(html)

	# Parse strip date from contents
	date = None

	# Find strip date, which is next to a title
	h1 = soup.findAll(text="Viivi ja Wagner")
	# Should be present always
	date = h1[0].parent.parent.p.string

	print "Fetching " + date

	# Scrape strip
	strip = soup.findAll("div", {"class": "strip"})
	img = strip[0].img

	stream = urllib.urlopen(img["src"])
	data = stream.read()
	stream.close()

	# For each year, give a new folder to avoid file system stress
	# (lotsa files in a folder kill poor Gnome)
	day, month, year = date.split(".")
	folder = year

	if not os.path.exists(folder):
		os.mkdir(folder)	

	# Store contents
	fname = os.path.join(folder, date + ".gif")
	f = open(fname, "wb")
	f.write(data)
	f.close()

	# Find next url, it is an <a> containing one img tag
	img = soup.findAll(alt="seuraava")
	if len(img) == 0:
		break
	a = img[0].parent
	url = a["href"]

See preview


PyS60 application release build toolchain

A common question from Python for Series 60 newcomers is how to build standalone Symbian applications from Python source code. We have been using a Makefile-based toolchain internally. It is described in this picture; I didn’t bother to add a thumbnail for the image, since it’s a 3400 pixels wide diagram.

The diagram describes building a PyS60 application with some Python extensions (Symbian native C++) mixed in and bundling it all into one downloadable SIS file. The application will appear as any first-class S60 application in the menu and the user does not know it’s running Python internally, apart from the bad installation experience (it challenges Microsoft installers with all those unnecessary yes/no questions), the extra uninstaller entries and the slow start-up time.

The biggest problems are caused by embedded SISs (SIS files inside other SIS files), which are not treated very well by several Symbian parties. In theory, one could build one monolithic SIS, but you’d need to recompile PyS60 from scratch and patch its UIDs with your own UIDs received from symbiansigned.com. We are planning to explore an SCons-based build solution to address this problem, since Makefiles are a bit inflexible with tasks like PKG file and UID range generation.

Here is a PKG file example for final user distributable SIS file.

Also, see UIKludges project for additional details for PKG files of Python extensions.

You need to have

You need to master

Pros

Cons

Ps. I would have put this thing to wiki.opensource.nokia.com, but their webmaster email address is non-functional and one cannot upload images to their Wiki.

The good, the bad and the Zope

I want to use the Zope 3 interface package to write a component architecture, i.e. to have plug-ins easily in Python. Zope 3 interfaces are very handy and, although this cannot be deduced from the name, are available outside Zope too. From my prior experience I know that the Zope 3 interfaces package is one of the best and most underrated Python packages out there. It even influenced the new design of Python 3k.

Well then… I haven’t used Zope 3 interfaces standalone before, so the first thing I do is type “zope 3 interfaces” into my Google search box.

This page comes up.

It’s horrible, and it is the very reason I am writing this quick blog entry. Some notes below (written from the point of view of an external visitor; I have my hands deep in Zope myself, so you don’t need to clarify these things for me or teach me anything).

In the post “No, you are not smart enough for Zope” Martijn Faassen highlights some problems of the Zope community. “It’s hard to get good content written”, Martijn claims. I disagree. Whoever created the page originally could have thought about what people coming to the page want. They don’t want to decrypt the brain core dump of a hardcore Zope developer. They want to know what this thing is, how it benefits them, how to get started with it and how to use it.

You all know how the Internet works. You have all visited web pages. You are all customers of the same thing you also produce. So writing a basic web page is not something you couldn’t do.

Hints:

Pardon the tone of this post. Zope is the 23rd best thing out there, but the Zope community has stagnated badly in some respects. Some things were acceptable ten years ago when the web was still young and Python developers hardcore, but if you don’t keep up with the pace you lose all the mindshare.

SDK released - Python in iPhone?

I just read waffle’s blog entry about the iPhone SDK release. Looks like Objective-C is the only supported language by default (I am just downloading the SDK). The comments speculated that embedding Python is not possible due to size constraints. Bollocks, I say =) Python for Series 60 phones is a 500 kB download without trimming. (That is less than the size of the HTML page you are viewing now, and the RAM footprint is even smaller.) If Series 60 phones, which have much more modest hardware specifications, can run Python, it shouldn’t be a problem for iPhones either.

Why didn’t Apple add additional language support by default? Well, they seem to have had their hands full getting the SDK out at all (delays), so we shouldn’t expect a perfect set in the 1.0 release.

Now, who wants to start a porting project with me? ;)

Debugging Django memory leak with TrackRefs and Guppy

I run Django in a standalone long-running application (a video encoding server). It leaked memory severely: htop showed two gigabytes reserved for /usr/bin/python after a while. Before starting the debugging session, I did not have the faintest idea what could be causing the problem. Django is robust technology, and this kind of thing hasn’t happened to me before. Since I was running Django in standalone mode, I suspected that some query cache was not getting cleared. But random poking around the source code didn’t give any clues.

It was time to do some serious memory debugging for Python.

Python as such doesn’t leak memory, since it’s a garbage-collected virtual machine. All “leaks” are design problems in the application logic. I found a good primer here on what’s going on inside Python’s memory management.
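
A “leak” of this kind is simply an object that something still refers to. A small sketch, assuming CPython 3 and a made-up Job class and history list, shows an object staying alive until the container holding it is emptied:

```python
import gc
import weakref

class Job:
    """Hypothetical unit of work."""

history = []  # module-level list that quietly keeps every job alive

def work():
    job = Job()
    history.append(job)       # the "design problem": nothing ever removes it
    return weakref.ref(job)   # a weak reference does not keep the job alive

ref = work()
gc.collect()
alive_before = ref() is not None   # still referenced from history

history.clear()
gc.collect()
alive_after = ref() is not None    # nothing refers to the job any more

print(alive_before, alive_after)  # prints: True False
```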

First I tried this nice TrackRefs class from Zope. It relies on Python’s own in-interpreter functions to monitor objects.

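import logging
import sys

# Assumption: the application logs through a module-level logger
logger = logging.getLogger(__name__)

class TrackRefs:
    """Object to track reference counts across test runs."""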

    def __init__(self, limit=40):
        self.type2count = {}
        self.type2all = {}
        self.limit = limit

    def update(self):
        obs = sys.getobjects(0)
        type2count = {}
        type2all = {}
        for o in obs:
            all = sys.getrefcount(o)

            if type(o) is str and o == '<dummy key>':
                # avoid dictionary madness
                continue
            t = type(o)
            if t in type2count:
                type2count[t] += 1
                type2all[t] += all
            else:
                type2count[t] = 1
                type2all[t] = all

        ct = [(type2count[t] - self.type2count.get(t, 0),
               type2all[t] - self.type2all.get(t, 0),
               t)
              for t in type2count.iterkeys()]
        ct.sort()
        ct.reverse()
        printed = False

        logger.debug("----------------------")
        logger.debug("Memory profiling")
        i = 0
        for delta1, delta2, t in ct:
            if delta1 or delta2:
                if not printed:
                    logger.debug("%-55s %8s %8s" % ('', 'insts', 'refs'))
                    printed = True

                logger.debug("%-55s %8d %8d" % (t, delta1, delta2))

                i += 1
                if i >= self.limit:
                    break 

        self.type2count = type2count
        self.type2all = type2all

You need to have Python compiled in debug mode to have the sys.getobjects() function. Luckily this beefed-up Python binary is available from Ubuntu’s stock repository:

sudo apt-get install python-dbg python-mysqldb-dbg

Note that native Python extensions don’t work unless they are specifically compiled against the Python debug build (hence python-mysqldb-dbg).
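
If a debug build is not at hand, a rougher version of the same per-type bookkeeping is possible with gc.get_objects(). Unlike sys.getobjects(), it only sees container objects tracked by the garbage collector, so plain ints and strings stay invisible. The sketch assumes Python 3; count_types, growth and the Leaky class are hypothetical names used only for the demonstration.

```python
import gc
from collections import Counter

def count_types():
    """Count live gc-tracked objects per type name."""
    return Counter(type(o).__name__ for o in gc.get_objects())

def growth(before, after, limit=10):
    """Return the types whose instance count grew, largest growth first."""
    delta = Counter(after)
    delta.subtract(before)
    return [(name, n) for name, n in delta.most_common(limit) if n > 0]

class Leaky:
    """Hypothetical class standing in for whatever piles up."""

before = count_types()
leak = [Leaky() for _ in range(1000)]   # simulate one leaky work cycle
after = count_types()

report = dict(growth(before, after))
for name, n in growth(before, after):
    print("%-30s %8d" % (name, n))
```

Calling count_types() once per main loop cycle and diffing against the previous result gives a table similar in spirit to the TrackRefs output.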

Then I add TrackRefs to my main loop:

    def run(self):

        self.running = True

        logger.info("Started worker " + self.get_worker_id_string())

        # Memory leak tracking
        tracker = TrackRefs()

        while self.running:

            self.mark_for_download()

            self.process_downloads()

            self.process_encodings()

            tracker.update() # Dump memory here
            time.sleep(settings.WORKER_POLL_DELAY)

And after running a while I start getting interesting results:

7956 [2008-03-07 02:59:28,767] INFO Jobs needing sources to download 0
7956 [2008-03-07 02:59:28,768] DEBUG Processable jobs: 0
7956 [2008-03-07 02:59:29,754] DEBUG ----------------------
7956 [2008-03-07 02:59:29,754] DEBUG Memory profiling
7956 [2008-03-07 02:59:29,754] DEBUG                                                            insts     refs
7956 [2008-03-07 02:59:29,754] DEBUG <type 'int'>                                                 150   137406
7956 [2008-03-07 02:59:29,755] DEBUG <type 'tuple'>                                               117   130211
7956 [2008-03-07 02:59:29,755] DEBUG <type 'dict'>                                                  5     8331
7956 [2008-03-07 02:59:29,755] DEBUG <type 'str'>                                                   3    27643
7956 [2008-03-07 02:59:29,755] DEBUG <type 'unicode'>                                               3     4606
7956 [2008-03-07 02:59:29,755] DEBUG <type 'list'>                                                  3     3492
7956 [2008-03-07 02:59:29,756] DEBUG <type 'frame'>                                                 1      962
7956 [2008-03-07 02:59:29,756] DEBUG <type 'cell'>                                                  0    12948
7956 [2008-03-07 02:59:29,756] DEBUG <type 'function'>                                              0     9479

Woah! Who reserved 130 000 ints and tuples? No wonder python soon gulps a gigabyte of memory. Since this is the only number which grows while the main loop cycles, and there are no references to classes or objects, debugging becomes a bit more difficult. I need to try to cross-reference the troublesome tuple objects.

This didn’t go well. With recursive gc.get_referrers() parsing I got some results (example below), but it soon became clear that debugging references from within the system itself was impossible: the memory debugging code will always create nasty cyclic references to the system, since it needs to track the objects. I gave up. There had to be something better.

9154 [2008-03-07 04:05:23,571] DEBUG /var/lib/python-support/python2.5/MySQLdb/connections.pyc
9154 [2008-03-07 04:05:23,571] DEBUG <type 'function'>
9154 [2008-03-07 04:05:23,572] DEBUG defaulterrorhandler
9154 [2008-03-07 04:05:23,572] DEBUG <type 'function'>
9154 [2008-03-07 04:05:23,572] DEBUG string_literal
9154 [2008-03-07 04:05:23,572] DEBUG <type 'function'>
9154 [2008-03-07 04:05:23,573] DEBUG unicode_literal
9154 [2008-03-07 04:05:23,573] DEBUG <type 'function'>
9154 [2008-03-07 04:05:23,573] DEBUG string_decoder
9154 [2008-03-07 04:05:23,573] DEBUG <type 'function'>
9154 [2008-03-07 04:05:23,573] DEBUG __exit__
9154 [2008-03-07 04:05:23,574] DEBUG <type 'function'>
9154 [2008-03-07 04:05:23,574] DEBUG begin
9154 [2008-03-07 04:05:23,574] DEBUG <type 'function'>
9154 [2008-03-07 04:05:23,574] DEBUG __init__
9154 [2008-03-07 04:05:23,574] DEBUG <type 'function'>
9154 [2008-03-07 04:05:23,575] DEBUG show_warnings

There was: Guppy. Thank you Sverker Nilsson! You saved my day.

Since the API of Guppy is a little eccentric, here are some examples for you:

import guppy

# init heapy
heapy = guppy.hpy()

# Print memory statistics
def update():
    print heapy.heap()

# Print relative memory consumption since last cycle
def update():
    print heapy.heap()
    heapy.setref()

# Print relative memory consumption w/heap traversing
def update():
    print heapy.heap().get_rp(40)
    heapy.setref()

With heapy.heap() ; heapy.setref() I got this output:

Partition of a set of 12 objects. Total size = 3544 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0      3  25     2244  63      2244  63 unicode
     1      2  17      708  20      2952  83 types.FrameType
     2      3  25      432  12      3384  95 dict (no owner)
     3      3  25      120   3      3504  99 str
     4      1   8       40   1      3544 100 list

Adding the get_rp() traversal magic makes everything clear:

Reference Pattern by <[dict of] class>.
 0: _ --- [-] 14 (dict (no owner) | list | str | types.FrameType | types.Gene...
 1: a      [-] 3 dict (no owner): 0x8c11f34*2, 0x8c1bd54*2, 0x8c1f854*2
 2: aa ---- [-] 1 list: 0x833c504*18
 3: a3       [-] 1 dict of django.db.backends.mysql.base.DatabaseWrapper: 0x8...
 4: a4 ------ [-] 1 dict (no owner): 0x83a65d4*2
 5: a5         [R] 1 guppy.heapy.heapyc.RootStateType: 0xb787c7a8L
 6: a3b ----- [-] 1 django.db.backends.mysql.base.DatabaseWrapper: 0x8356a34
 7: a3ba       [S] 7 dict of module: ..db, ..models, ..query, ..transaction...
 8: b ---- [S] 1 types.FrameType: <<lambda> at 0x8b16ecc>
 9: c      [-] 2 list: 0x833c504*18, 0xb7dafe6cL*5
<Type e.g. '_.more' for more.>

What could there be in the DatabaseWrapper object that keeps growing and growing… the query debugger. Django keeps track of all queries for debugging purposes (connection.queries). This list is reset at the end of each HTTP request. But in standalone mode, there are no requests. So you need to manually reset the queries list after each working cycle.

        while self.running:

            self.mark_for_download()

            self.process_downloads()

            self.process_encodings()

            tracker.update()

            time.sleep(settings.WORKER_POLL_DELAY)

            # Clear database connection and reset query debugger
            # between cycles to make sure that
            # related resources get released
            reset_queries()
            connection.close()

            print str(connection.queries)

But even after this fix, I still saw an increase in tuple and int usage when monitoring with TrackRefs, whereas with heapy.heap() alone there was no increase. So the tuple and int consumption must have been caused by the TrackRefs, sys.getobjects, gc, etc. magic itself.
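
The relative measurement that heapy.setref() provides can also be approximated with the tracemalloc module of the Python 3 standard library. This is a different tool from the one used in this post, shown only as a sketch; the cache list is a made-up stand-in for whatever grows between two work cycles.

```python
import tracemalloc

tracemalloc.start()
baseline = tracemalloc.take_snapshot()   # roughly what setref() does in heapy

# Simulated work cycle that keeps data around (hypothetical)
cache = [str(i) * 10 for i in range(10000)]

current = tracemalloc.take_snapshot()
diff = current.compare_to(baseline, "lineno")  # growth grouped by source line
grown = sum(stat.size_diff for stat in diff)

for stat in diff[:3]:
    print(stat)                           # biggest changes first

tracemalloc.stop()
```

Because the comparison is made between two snapshots, the bookkeeping of the profiler itself distorts the numbers far less than the TrackRefs approach did.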

Eclipse web developer plug-in memo

Currently I work in quite a wide field of software development: Python (standalone, Plone, Zope, Django), PHP, Java, Symbian and embedded Linux. I am using Eclipse for development, since it’s pretty much the only consistent platform filling my needs. The nature of the work also forces me to use different computers (Mac/Windows/Linux) with different clients. This drives me to reinstall Eclipse now and then.

Below are my personal notes on which plug-ins are needed to get a “perfect” Eclipse set-up. Basically they are just my own notes so that I don’t need to Google everything all over again every time I reinstall. I hope readers can find new pearls here or suggest improvements.

Eclipse setup

Eclipse has an internal updater/web installer. Plug-ins are either downloaded as ZIP files and extracted to the Eclipse folder or installed through the internal updater. Paste Eclipse update site URLs into the menu Help -> Software updates -> Find and Install, New Remote Location. You can use dummy text as the name of the update site.

Eclipse WTP (Web Tools Platform)

Eclipse Web Tools Platform bundles Eclipse, Java development tools, HTML editor, CSS editor and some other generic useful stuff.

Python

PyDev is a plug-in for Python and Jython development.

Site URL: http://pydev.sourceforge.net

Eclipse update site URL: http://pydev.sourceforge.net/updates/

PDT

PDT download provides Eclipse, HTML editor, PHP editor and CSS editor.

Site URL: http://www.eclipse.org

Eclipse update site URL: http://download.eclipse.org/tools/pdt/updates/

Subclipse

Subclipse provides Subversion version control integration to Eclipse.

Eclipse update site URL: http://subclipse.tigris.org/update_1.2.x

In the installer, uncheck the integration modules checkbox or the installer will complain about missing modules.

JSEclipse

JSEclipse provides a better editor (over WTP) for Javascript files, with impressive outlining and autofill capabilities.

Download requires Adobe developer account or similar fill-in-the-fields crap.

Site URL: http://labs.adobe.com/technologies/jseclipse/

ShellEd

Syntax coloring for Unix shell scripts

Project site: http://sourceforge.net/projects/shelled

SQL Explorer

SQL editor with limited GUI capabilities. Based on Eclipse platform. Comes standalone and as Eclipse plug-in.

Needs the MySQL JDBC driver.


A quick tryout: Documentation Generating for Plone products

Plone is a modular CMS which can be expanded with additional products. That means new features are easy to install, and also to customize. However, quickly understanding code that other people wrote might turn tricky, as each coder uses his own style. Therefore, it might be useful to get an overall picture of the system before diving into details.

Documentation generators are useful for giving a comprehensive view of code. These are applications that traverse code and extract information out of it, and then use that structured information to produce a nice-looking reference of the code. Ever heard of API documentation? Yep. Ever seen that sort of documentation with any 3rd party Plone product? At least I haven’t.
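
To give an idea of what such a tool does under the hood, here is a toy generator built on the standard inspect module. It is only a sketch assuming Python 3: describe is a hypothetical helper and Cart a made-up class, and real tools like epydoc or doxygen do far more.

```python
import inspect

def describe(cls):
    """Return reference lines for a class and its public methods."""
    lines = ["class %s: %s" % (cls.__name__,
                               inspect.getdoc(cls) or "(undocumented)")]
    for name, member in inspect.getmembers(cls, inspect.isfunction):
        if name.startswith("_"):
            continue  # skip private and special methods
        doc = inspect.getdoc(member) or "(undocumented)"
        # Keep the signature and the first line of the docstring
        lines.append("  %s%s: %s" % (name, inspect.signature(member),
                                     doc.splitlines()[0]))
    return lines

class Cart:
    """A hypothetical shopping cart."""

    def add(self, item, amount=1):
        """Add an item to the cart."""

    def total(self):
        pass

lines = describe(Cart)
print("\n".join(lines))
```

The output lists the class with its docstring, followed by one line per public method with its signature.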

Luckily, there are a few choices suitable for Plone/Python:

Parsers: doxygen (generic), epydoc (defines ‘epytext’, parses also others), docutils (defines and parses ‘reStructuredText’)
Extensions: graphviz (builds visualization graphs)
Plugins: eclox (an Eclipse plugin that uses doxygen, which uses graphviz)

(Plone API’s at api.plone.org use epydoc btw.)

Out of these, I quickly tested doxygen on a Plone product called EasyShop. The result was interesting but of no use. EasyShop does only a little subclassing, and therefore the documentation doxygen produced was basically a listing of separate classes and methods. Doxygen uses graphviz to build graphical visualizations of class relations, but those were of no use either. The problem here is that Plone products are not ordinary Python packages: they have adapters, utilities, views, events, subscribers and such. Creating a decent API reference out of these would need a specific solution targeted at the platform.

Documentation generation seems interesting, however, and graphviz the most promising of the whole bunch. Unfortunately, I couldn’t produce anything useful on my first few tries, but the subject just needs a little more research. After all, think about it: API-like documentation with UML-like graphs of any Plone product, wouldn’t that be nice?

Copyright © Red Innovation Ltd. 2008 All Rights Reserved.