Relativity of time - shortcomings in Python datetime, and workaround
Recently I found out that the standard library support for date and time calculations in Python is not quite as capable as I needed. It turned out that the superficial leanness and simplicity of Python’s datetime module bites back hard, and sooner than you would expect. Unfortunately, while looking for replacements, I found that the existing replacement modules have issues of their own. This blog entry highlights various problems with the current Python datetime implementation. A partial solution is offered, too.
Basics of time zones
Time zones are a relatively new invention in the long history of measuring time. During most of the 19th century pretty much each European town had its own definition of local time. It was not until 1880 that Greenwich Mean Time was officially made the standard time in Great Britain; much of the rest of the world had adopted the idea by the 1920s. Today, all countries in the world use standard time zones, though not every one uses full-hour offsets from GMT as was originally conceived.
The concept of summer time (daylight saving time in AmE) complicates things further. For example, the member states of the European Union switch to summer time on the last Sunday of March at exactly 01:00 GMT. Summer time lasts until the last Sunday of October, 01:00 GMT. In Finland, this means that this year on 30th March the official time stepped from 02:59:59 EET to 04:00:00 EEST in an instant. Likewise, on 26th October this year, the summer time clocks will tick up to 03:59:59 EEST, and on the next second the local time will be 03:00:00 EET; almost an hour later it will be 03:59:59 EET again. Thus, the number of seconds between 02:59:59 and 04:00:00 on a single day might be 1, 3601, or 7201; the difference between 02:59:59 and 03:00:01 might likewise be 2 or 3602 seconds… or even undefined.
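The European Union rule described above is mechanical enough to compute. The following is a minimal sketch using only the standard library of a modern Python 3; the helper names `last_sunday`, `eu_summer_time_bounds` and `helsinki_time` are made up for this illustration, and the +2 and +3 hour offsets for Finland are hard-coded assumptions rather than values looked up from a time zone database.

```python
from datetime import datetime, timedelta, timezone

def last_sunday(year, month):
    """Return midnight UTC on the last Sunday of the given month (month < 12)."""
    # Last day of the month = first day of the next month minus one day.
    last_day = datetime(year, month + 1, 1, tzinfo=timezone.utc) - timedelta(days=1)
    # weekday() is 0 for Monday ... 6 for Sunday; step back to the Sunday.
    return last_day - timedelta(days=(last_day.weekday() + 1) % 7)

def eu_summer_time_bounds(year):
    """Return the UTC instants at which EU summer time starts and ends."""
    start = last_sunday(year, 3) + timedelta(hours=1)   # 01:00 UTC in March
    end = last_sunday(year, 10) + timedelta(hours=1)    # 01:00 UTC in October
    return start, end

def helsinki_time(utc_dt):
    """Convert an aware UTC datetime to Finnish local time (EET or EEST)."""
    start, end = eu_summer_time_bounds(utc_dt.year)
    if start <= utc_dt < end:
        zone = timezone(timedelta(hours=3), "EEST")
    else:
        zone = timezone(timedelta(hours=2), "EET")
    return utc_dt.astimezone(zone)

start, end = eu_summer_time_bounds(2008)
print(start)                                         # 2008-03-30 01:00:00+00:00
print(helsinki_time(start - timedelta(seconds=1)))   # 2008-03-30 02:59:59+02:00
print(helsinki_time(start))                          # 2008-03-30 04:00:00+03:00
```

Note that the conversion only works in one direction without ambiguity: going from UTC to local time always has exactly one answer, while the reverse does not during the autumn transition.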
To alleviate obvious confusions and misunderstandings, a reference time scale can be used for calculations that concern different time zones. The obvious choice is Coordinated Universal Time (UTC) that replaced Greenwich Mean Time as the standard reference time scale for civilian applications in 1972. During the Internet era UTC has become increasingly important.
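To make the role of the reference time scale concrete, here is a small sketch that measures intervals by first mapping both endpoints to UTC. It assumes a modern Python 3 and uses fixed-offset zones named `EET` and `EEST` that are defined by hand for this example; `seconds_between` is a hypothetical helper, not a standard library function.

```python
from datetime import datetime, timedelta, timezone

EET = timezone(timedelta(hours=2), "EET")    # Finnish standard time
EEST = timezone(timedelta(hours=3), "EEST")  # Finnish summer time

def seconds_between(earlier, later):
    """Elapsed seconds between two aware datetimes, computed on the UTC scale."""
    delta = later.astimezone(timezone.utc) - earlier.astimezone(timezone.utc)
    return int(delta.total_seconds())

# Spring transition: the clock jumps from 02:59:59 EET to 04:00:00 EEST.
spring = seconds_between(datetime(2008, 3, 30, 2, 59, 59, tzinfo=EET),
                         datetime(2008, 3, 30, 4, 0, 0, tzinfo=EEST))

# An ordinary summer day: no transition between the two wall clock times.
ordinary = seconds_between(datetime(2008, 6, 19, 2, 59, 59, tzinfo=EEST),
                           datetime(2008, 6, 19, 4, 0, 0, tzinfo=EEST))

# Autumn transition: from 02:59:59 EEST to 04:00:00 EET, passing 03:xx twice.
autumn = seconds_between(datetime(2008, 10, 26, 2, 59, 59, tzinfo=EEST),
                         datetime(2008, 10, 26, 4, 0, 0, tzinfo=EET))

print(spring, ordinary, autumn)  # 1 3601 7201
```

The same pair of wall clock readings yields three different durations; only the explicit offsets make the answer well defined.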
Time zones in Python - welcome to hell
Suppose you have a shared web calendar application that is used by people all over the world. Each user wants to view the calendar in their respective local time, and you wish to use UTC on the server. The server has been set up with Europe/Helsinki as the local timezone. And you wish to use the facilities provided by the Python standard library modules. Simple date arithmetic would be needed - what could possibly go wrong? You will soon find out that it is not at all simple. Actually it is annoyingly complicated:
>>> from datetime import datetime
>>> dt = datetime.now()
>>> dt
datetime.datetime(2008, 6, 19, 14, 51, 41, 296552)
>>> # ok, it prints the local time. Let's try to
>>> # convert it to UTC time...
>>> dt.utctimetuple()
(2008, 6, 19, 14, 51, 41, 3, 171, 0)
>>> # wait, ahem... 14:51:41... that can't be right...
>>> # the docs say: utctimetuple(...)
>>> #     Return UTC time tuple, compatible with time.localtime().
>>> #
>>> # ok.. so UTC time tuple, compatible with localtime...
>>> # WTF?? my local time zone is not UTC... strangely enough
>>> # the last field in the tuple, "is_dst", is 0, or false...
>>> # I thought June was in summer...
>>>
>>> # Ok, the factory method I need seems to be utcnow
>>> # - that way I can get the time in UTC?
>>> datetime.utcnow()
datetime.datetime(2008, 6, 19, 11, 59, 9, 750844)
>>> # fair enough, UTC time.
>>> # Let's try simple date arithmetic: the difference
>>> # between now... and now...
>>> datetime.now() - datetime.utcnow()
datetime.timedelta(0, 10799, 999984)
>>> # Hmm... now did that statement really
>>> # take 3 hours to execute?
The reason for these anomalies is that without any time zone information, instances of the datetime class behave as if they stored time in UTC. For our purposes this is unacceptable: if a user of the hypothetical calendar application proposes a meeting 2 hours from now, be it 17:15 EEST or 14:15 UTC, meeting.start - datetime.now() should at this very moment result in 2 hours regardless of the time zone of the user asking it.
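The behaviour asked for here can be sketched with time zone aware datetimes. This example assumes a modern Python 3, where `datetime.timezone` provides fixed-offset zones; the `EEST` object and the fixed value of `now` are assumptions chosen to match the meeting example in the text.

```python
from datetime import datetime, timedelta, timezone

EEST = timezone(timedelta(hours=3), "EEST")  # hand-made fixed offset, summer only

# The same proposed meeting, written down by two different users.
start_helsinki = datetime(2008, 6, 19, 17, 15, tzinfo=EEST)
start_utc = datetime(2008, 6, 19, 14, 15, tzinfo=timezone.utc)

# A fixed "now" keeps the example reproducible; a real application
# would call datetime.now(timezone.utc) instead.
now = datetime(2008, 6, 19, 12, 15, tzinfo=timezone.utc)

print(start_helsinki == start_utc)   # True
print(start_helsinki - now)          # 2:00:00
print(start_utc - now)               # 2:00:00
```

Aware datetimes compare and subtract as instants, so the result does not depend on which zone each operand happens to be expressed in.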
While there are several freely available Python modules that provide date and time calculations, like Zope’s DateTime, the problem with them is that none of them is really compatible with the datetime interface: if you use code that expects datetime instances, Zope’s DateTime objects will not help you. Some of the replacement modules, like mxDateUtil, seem to use dubious date arithmetic, and are not really useful either. Clearly, we have to either fix the Python datetime class somehow, or provide a compatible implementation that works as expected.
Fixing datetime
Fortunately, Python datetimes can be made time zone aware by supplying an instance of tzinfo in the constructor. Unfortunately, the Python standard library does not provide any concrete implementations. Dang! Enter pytz, a Python library that supplies hundreds of concrete time zone definitions.
>>> import pytz
>>> eurhel = pytz.timezone("Europe/Helsinki")
>>> localt = datetime.now(eurhel)
>>> utct = datetime.now(pytz.utc)
>>> utct - localt
datetime.timedelta(0, 0, 3410)
Works as expected. And, utct - datetime.utcnow() fails with “TypeError: can’t subtract offset-naive and offset-aware datetimes” - which is good, as it would not yield sensible results. However, a look under the hood reveals that something is fundamentally wrong:
>>> datetime.datetime.now(eurhel)
datetime.datetime(2008, 6, 23, 18, 2, 31, 101025,
tzinfo=<DstTzInfo 'Europe/Helsinki' EEST+3:00:00 DST>)
>>> datetime.datetime(2008, 6, 23, 18, 2, 31, 101025, eurhel)
datetime.datetime(2008, 6, 23, 18, 2, 31, 101025,
tzinfo=<DstTzInfo 'Europe/Helsinki' HMT+1:40:00 STD>)
>>> # after a minute...
>>> datetime.datetime(2008, 6, 23, 18, 2, 31, 101025, eurhel) - datetime.datetime.now(eurhel)
datetime.timedelta(0, 4687, 688091)
That’s right, the datetime object created by a call to datetime.datetime constructor now seems to think that Finland uses the ancient “Helsinki Mean Time” which was obsoleted in the 1920s. The reason for this behaviour is clearly documented on the pytz page: it seems the Python datetime implementation never asks the tzinfo object what the offset to UTC on the given date would be. And without knowing it pytz seems to default to the first historical definition. Now, some of you fellow readers could insist on the problem going away simply by defaulting to the latest time zone definition. However, the problem would still persist: For example, Venezuela switched to GMT-04:30 on 9th December, 2007, causing the datetime objects representing dates either before, or after the change to become invalid.
The solution offered by the pytz pages is to use the normalize and localize methods of pytz tzinfo instances. However, this renders the whole datetime system too cumbersome to use. As I wanted to use datetime objects with time zones as easily as possible, I had to subclass the Python datetime implementation and hack some internal aspects of it. The module, fixed_datetime, also contains a function, set_default_timezone, to allow mimicking of the naive datetime objects; unlike ordinary datetime objects, fixed_datetime.datetime objects are never ‘naive’, but many of the methods will default to the time zone set by the said function.
>>> import fixed_datetime
>>> # set default timezone...
>>> fixed_datetime.set_default_timezone("Europe/Helsinki")
>>> # uses default timezone...
>>> fixed_datetime.datetime.now()
fixed_datetime.datetime(2008, 6, 23, 18, 33, 20, 525486,
tzinfo=<DstTzInfo 'Europe/Helsinki' EEST+3:00:00 DST>)
>>> # also works correctly
>>> fixed_datetime.datetime(2008, 6, 23, 18, 33, 20, 525486)
fixed_datetime.datetime(2008, 6, 23, 18, 33, 20, 525486,
tzinfo=<DstTzInfo 'Europe/Helsinki' EEST+3:00:00 DST>)
>>> # UTC timestamps returned with UTC tzinfo
>>> fixed_datetime.datetime.utcnow()
fixed_datetime.datetime(2008, 6, 23, 15, 37, 44, 777729, tzinfo=<UTC>)
>>> # subtraction still works correctly!
>>> utcdt = fixed_datetime.datetime.utcnow()
>>> heldt = fixed_datetime.datetime.now()
>>> heldt - utcdt
datetime.timedelta(0, 5, 495702)
As a bonus, fixed_datetime.datetime contains methods to convert datetimes to and from ISO 8601 format. The methods support parsing the time zone field, too:
>>> fixed_datetime.datetime.fromisoformat("20081010T010203+0500")
fixed_datetime.datetime(2008, 10, 10, 1, 2, 3, tzinfo=<UTC+05:00>)
>>> fixed_datetime.datetime.fromisoformat("2008-10-10 01:02:03Z")
fixed_datetime.datetime(2008, 10, 10, 1, 2, 3, tzinfo=<UTC>)
>>> # fractional hours, decimal comma, odd timezone
>>> fixed_datetime.datetime.fromisoformat("2008-10-10 01,0341666667-04:37")
fixed_datetime.datetime(2008, 10, 10, 1, 2, 3,
tzinfo=<UTC-04:37>)
>>> fixed_datetime.datetime.today().isoformat(' ')
'2008-06-23 18:54:32+03:00'
>>> # isoformat supports short format, too
>>> fixed_datetime.datetime.now().isoformat(short=True)
'20080623T185303.489792+0300'
>>> # addition across DST boundary works as expected:
>>> before = fixed_datetime.datetime(2008, 10, 26, 2, 0, 0)
>>> before
fixed_datetime.datetime(2008, 10, 26, 2, 0, tzinfo=
<DstTzInfo 'Europe/Helsinki' EEST+3:00:00 DST>)
>>> # now, add 2 hours
>>> before + fixed_datetime.timedelta(seconds=7200)
fixed_datetime.datetime(2008, 10, 26, 3, 0, tzinfo=
<DstTzInfo 'Europe/Helsinki' EET+2:00:00 STD>)
You can download the said module below.
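For comparison, the two simpler ISO 8601 strings shown above can also be parsed with the plain standard library. This sketch is not how fixed_datetime does it; it assumes Python 3.7 or newer, where the `%z` directive of `strptime` accepts both numeric offsets and the `Z` designator. Fractional hours and decimal commas are beyond what these format strings handle.

```python
from datetime import datetime, timedelta, timezone

# Basic ("short") format with a numeric offset.
basic = datetime.strptime("20081010T010203+0500", "%Y%m%dT%H%M%S%z")

# Extended format with the "Z" designator for UTC.
extended = datetime.strptime("2008-10-10 01:02:03Z", "%Y-%m-%d %H:%M:%S%z")

print(basic.isoformat())     # 2008-10-10T01:02:03+05:00
print(extended.isoformat())  # 2008-10-10T01:02:03+00:00
```

Each format string matches exactly one layout, so a general parser would still need to try several of them or fall back to a dedicated implementation.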
Remaining issues
Not every issue is solved. Fixed datetime still does not accept “24” as an hour value (allowed by the ISO standard), and will throw an exception on positive leap seconds. Fixed datetime is also much slower than the Python implementation, since many of the operations need to create 2 or 3 intermediate datetime instances.
Sadly it seems that Java got it right: having one class (Date) that stores times in UTC seconds relative to the Unix Epoch, and subclasses of the abstract Calendar class that deal with getting and setting individual components and date arithmetic in a localized way, would indeed be the best long-term solution. To some, Java’s date and calendar handling would seem overly complicated; to me it is the simplest way of representing the complex world of different calendars, time zones and other aspects of time keeping. If only someone could persuade the Python devs to add something similar to the standard library…
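The separation described above, one absolute instant plus zone-specific views of its calendar fields, can be imitated in a few lines. This is only a sketch of the idea in modern Python 3, not a port of the Java classes; `fields_in` is a hypothetical helper and both zones are hand-made fixed offsets.

```python
from datetime import datetime, timedelta, timezone

# One absolute instant, stored as seconds since the Unix epoch (UTC).
instant = datetime(2008, 6, 19, 11, 51, 41, tzinfo=timezone.utc).timestamp()

def fields_in(zone, epoch_seconds):
    """Break an absolute instant into calendar fields for one zone."""
    return datetime.fromtimestamp(epoch_seconds, tz=zone)

# Assumption: fixed offsets stand in for real time zone rules.
helsinki = timezone(timedelta(hours=3), "UTC+03:00")
caracas = timezone(timedelta(hours=-4, minutes=-30), "UTC-04:30")

print(fields_in(helsinki, instant))  # 2008-06-19 14:51:41+03:00
print(fields_in(caracas, instant))   # 2008-06-19 07:21:41-04:30
```

The stored value never changes; only the presentation layer knows about offsets, which is the property the paragraph above is praising.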
Download
Download fixed_datetime.py, released under 3-clause BSD license.
Viivi & Wagner strip scraper
I wrote this little script as a mental exercise and to prove the power of the Python programming language. If anyone accepts the challenge, I’d like to see submissions in other programming languages.
For the foreigners: this is the best comic in Finland, so I hope you’ll get translations soon! It tells about the relationship of a woman and a pig (sic) reflecting the deepest shadows of Finnish social life.
"""
Creates a local mirror of Viivi & Wagner strips by fetching all of them from hs.fi.

Will create downloaded strips as

    2004/1.1.2004.gif
    2004/2.1.2004.gif
    ...

until today.

Try this in C++!

Motivation: No one has built a Viivi & Wagner search engine with speech bubble OCR support
and I desperately wanted to find the "Kottarainen lentaa korvaan" strip for my gf.

Time to complete: 20 min.
"""

__docformat__ = "epytext"
__author__ = "Mikko Ohtamaa"
__license__ = "BSD"
__copyright__ = "2008 Mikko Ohtamaa"

import os
import re
import urllib

from BeautifulSoup import BeautifulSoup

# 1.1.2004 start page
url = "http://www.hs.fi/viivijawagner/1073386660690"

# Loop until there is no longer a "next" link
while True:

    stream = urllib.urlopen(url)
    html = stream.read()
    stream.close()

    soup = BeautifulSoup(html)

    # Parse strip date from contents
    date = None

    # Find strip date, which is next to a title
    h1 = soup.findAll(text="Viivi ja Wagner")

    # Should be present always
    date = h1[0].parent.parent.p.string
    print "Fetching " + date

    # Scrape strip
    strip = soup.findAll("div", {"class": "strip"})
    img = strip[0].img
    stream = urllib.urlopen(img["src"])
    data = stream.read()
    stream.close()

    # For each year, give a new folder to avoid file system stress
    # (lotsa files in a folder kill poor Gnome)
    day, month, year = date.split(".")
    folder = year
    if not os.path.exists(folder):
        os.mkdir(folder)

    # Store contents
    fname = os.path.join(folder, date + ".gif")
    f = open(fname, "wb")
    f.write(data)
    f.close()

    # Find the next url; it is an <a> tag containing one img tag
    img = soup.findAll(alt="seuraava")
    if len(img) == 0:
        break
    a = img[0].parent
    url = a["href"]
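The folder-per-year naming used by the script can be pulled out into a small helper, which makes that part testable without touching the network. The function name `strip_path` is made up for this sketch; it runs unchanged on Python 2 and Python 3.

```python
import os

def strip_path(date, extension=".gif"):
    """Map a date string such as '1.1.2004' to '<year>/<date>.gif'."""
    date = date.strip()
    # Raises ValueError if the string does not have exactly three parts.
    day, month, year = date.split(".")
    return os.path.join(year, date + extension)

print(strip_path("1.1.2004"))
```

Splitting first also gives an early failure if the scraped date string ever changes shape, instead of silently writing files under an odd folder name.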
PyS60 application release build toolchain
A common question from Python for Series 60 newcomers is how to build standalone Symbian applications from Python source code. We have been using a Makefile-based toolchain internally. I describe it in this picture. I didn’t bother to add a thumbnail for the image, since it’s a 3400 pixels wide diagram.
The diagram describes building a PyS60 application with some Python extensions (Symbian native C++) mixed in and bundling it all to one downloadable SIS file. The application will appear as any first class S60 application in the menu and the user does not know it’s running Python internally, besides bad installation experience (it challenges Microsoft installers with all those unnecessary yes/no questions), extra uninstaller entries and slow start-up time.
The biggest problems are caused by embedded SISs (SIS files inside other SIS files), which are not treated very well by several Symbian parties. In theory, you could build one monolithic SIS, but you’d need to recompile PyS60 from scratch and replace the UIDs inside it with your own UIDs received from symbiansigned.com. We are planning to explore a SCons-based build solution to address this problem, since Makefiles are a bit inflexible with tasks like PKG file and UID range generation.
Here is a PKG file example for final user distributable SIS file.
Also, see UIKludges project for additional details for PKG files of Python extensions.
You need to have
- Ensymble tool
- Series 60 SDK (contains some old GNU make)
You need to master
- A build tool (make)
- Symbian PKG file structure
- Lots of different command line tools
Pros
- It’s the best one we have for now
Cons
- Symbian signing and certification companies don’t understand embedded SIS files (all SIS files must be signed prior to embedding) and may have a hard time signing SIS files containing only an extension DLL for Python. The Symbian Signed test criteria have been built with only UI application based SIS files in mind.
- You cannot cook your own patched PyS60 distribution without revamping some hardcoded UIDs and paths, since otherwise there are UID conflicts (EXE and DLL file UIDs are in Nokia’s protected range)
- S60 installers ask for an extra confirmation for every embedded SIS file, even in the middle of the progress bar, so the user experience of installation is screwed up
- There will be an extra uninstallation entry for every embedded SIS file in the S60 application manager, confusing the user
- As you can see, most cons come from Symbian and Symbian signing limitations and have nothing to do with Python
Ps. I would have put this thing to wiki.opensource.nokia.com, but their webmaster email address is non-functional and one cannot upload images to their Wiki.
The good, the bad and the Zope
I want to use the Zope 3 interface package to write a component architecture, i.e. to have plug-ins easily in Python. Zope 3 interfaces are very handy and, which cannot be deduced from the name, are available outside Zope too. From prior experience I know that the Zope 3 interfaces package is one of the best and most underrated Python packages out there. It even influenced the new design of Python 3k.
Well then… I haven’t used Zope 3 interfaces standalone before, so the first thing I do is type “zope 3 interfaces” into Google search and look at the page that comes up.
It’s horrible, and that is the very reason I write this quick blog entry. Some notes below (I have written things from the point of view of an external visitor. I have my hands deep in Zope myself, so you don’t need to clarify these things for me or teach me anything)
- The information is tangled mess: please use subtitles and images
- You could mention that the package can be used outside Zope
- You could mention that the package is a generic Python package
- The list of CamelCaseWordsWhichGoesOnAndOnUnexplained made me puke. Please use proper English and explain the meaning of the links.
- It has a comment box: I tried to comment on the page but I got a permission error. Please do not show the comment box if commenting is not actually possible.
- Then I tried to register at Zope.org to comment or fix the basic things on the page. You have a join link on zope.org, but there is no registration form. A manual email conversation seems to be a prerequisite for registration. Is Zope a secret society or something…? The contribution barrier just rose too high.
- At this point, I give up. Even if you had the best interface package out there, I don’t care anymore. Looks like getting involved in fixing things takes too much of my time. You had your 10 minutes to impress me and you failed. Not only that, but I got so frustrated that I want to learn smoking, fast.
In the post “No, you are not smart enough for Zope” Martijn Faassen highlights some problems of the Zope community. “It’s hard to get good content written,” Martijn claims. I disagree. Whoever created the page originally could have thought about what people coming to the page want. They don’t want to decrypt the brain core dump of a hardcore Zope developer. They want to know what this thing is, how it is beneficial for them, how to get started with it and how to use it.
You all know how the Internet works. You all have visited web pages. You all are customers of the same thing you also produce. So writing a basic web page is not something you couldn’t do.
Hints:
- apt-get install python-zopeinterface
- README.TXT
Pardon the tone of this post. Zope is the 23rd best thing out there, but the Zope community has stagnated badly in some aspects. Some things were acceptable ten years ago when the web was still young and Python developers hardcore, but if you don’t keep up with the pace you lose all the mindshare.
SDK released - Python in iPhone?
I just read waffle’s blog entry about the iPhone SDK release. Looks like Objective C is the only supported language by default (I am just downloading the SDK). The comments speculated that embedding Python is not possible due to size constraints. Bollocks I say =) Python for Series 60 phones is a 500 kB download without trimming. (It’s less than the size of the HTML page you are viewing now, and the RAM footprint is even smaller.) If Series 60 phones, which have much more modest hardware specifications, can run Python, it shouldn’t be a problem for iPhones either.
Why didn’t Apple add additional language support by default? Well, they seem to have had their hands full getting the SDK out at all (delays), so we shouldn’t expect a perfect set in the 1.0 release.
Now, who wants to start a porting project with me?
Debugging Django memory leak with TrackRefs and Guppy
I run Django in a standalone long-running application (a video encoding server). It leaked memory severely: after a while, htop showed two gigabytes reserved for /usr/bin/python. Before starting the debugging session, I didn’t have the faintest idea what could be causing the problem. Django is robust technology, and this kind of thing hasn’t happened to me before. Since I was running Django in standalone mode, I suspected that some query cache was not getting cleared. But random poking around the source code didn’t give any clues.
It was time to do some serious memory debugging for Python.
Python as such doesn’t leak memory, since it’s a garbage collected virtual machine. All “leaks” are design problems in the application logic. I found a good primer here on what’s going on inside Python’s memory management.
First I tried this nice TrackRefs class from Zope. It relies on Python’s own in-interpreter functions to monitor objects.
import logging
import sys

# Assumption: a module-level logger; the original snippet uses one without showing its setup
logger = logging.getLogger(__name__)

class TrackRefs:
    """Object to track reference counts across test runs."""

    def __init__(self, limit=40):
        self.type2count = {}
        self.type2all = {}
        self.limit = limit

    def update(self):
        # sys.getobjects() exists only in debug builds of Python
        obs = sys.getobjects(0)
        type2count = {}
        type2all = {}
        for o in obs:
            all = sys.getrefcount(o)

            if type(o) is str and o == '<dummy key>':
                # avoid dictionary madness
                continue
            t = type(o)
            if t in type2count:
                type2count[t] += 1
                type2all[t] += all
            else:
                type2count[t] = 1
                type2all[t] = all

        # Per-type change in instance count and refcount since the last update
        ct = [(type2count[t] - self.type2count.get(t, 0),
               type2all[t] - self.type2all.get(t, 0),
               t)
              for t in type2count.iterkeys()]
        ct.sort()
        ct.reverse()
        printed = False

        logger.debug("----------------------")
        logger.debug("Memory profiling")
        i = 0
        for delta1, delta2, t in ct:
            if delta1 or delta2:
                if not printed:
                    logger.debug("%-55s %8s %8s" % ('', 'insts', 'refs'))
                    printed = True

                logger.debug("%-55s %8d %8d" % (t, delta1, delta2))

                # Stop after `limit` reported types
                i += 1
                if i >= self.limit:
                    break

        self.type2count = type2count
        self.type2all = type2all
You need to have Python compiled in debug mode to have the sys.getobjects() function. Luckily this beefed-up Python binary is available from Ubuntu’s stock repository:
sudo apt-get install python-dbg python-mysqldb-dbg
Note that native Python extensions don’t work unless they are specifically compiled against the Python debug build (python-mysqldb-dbg).
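When a debug build is not at hand, a rougher version of the same per-type bookkeeping can be done with `gc.get_objects()`, which is available in ordinary builds but only sees objects tracked by the garbage collector (so no plain ints or strings). This is a substitute technique sketched for a modern Python 3, not the Zope code; `count_by_type`, `growth` and the `Job` class are hypothetical names.

```python
import gc
from collections import Counter

def count_by_type():
    """Count GC-tracked objects per type name."""
    return Counter(type(o).__name__ for o in gc.get_objects())

def growth(before, after, limit=10):
    """Return (increase, type name) pairs for types that grew, largest first."""
    grown = [(after[name] - before.get(name, 0), name) for name in after]
    grown = [pair for pair in grown if pair[0] > 0]
    grown.sort(reverse=True)
    return grown[:limit]

class Job(object):
    """Hypothetical stand-in for an application object that piles up."""

leaked = []
before = count_by_type()
leaked.extend(Job() for _ in range(1000))   # simulate objects kept alive by mistake
after = count_by_type()

top = growth(before, after)
for increase, name in top:
    print("%-30s %8d" % (name, increase))
```

Calling `count_by_type()` once per main loop cycle and printing the growth gives a similar view to the TrackRefs output, minus the reference counts.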
Then I add TrackRefs to my main loop:
def run(self):
    self.running = True

    logger.info("Started worker " + self.get_worker_id_string())

    # Memory leak tracking
    tracker = TrackRefs()

    while self.running:
        self.mark_for_download()
        self.process_downloads()
        self.process_encodings()
        tracker.update()  # Dump memory here
        time.sleep(settings.WORKER_POLL_DELAY)
And after running a while I start getting interesting results:
7956 [2008-03-07 02:59:28,767] INFO Jobs needing sources to download 0
7956 [2008-03-07 02:59:28,768] DEBUG Processable jobs: 0
7956 [2008-03-07 02:59:29,754] DEBUG ----------------------
7956 [2008-03-07 02:59:29,754] DEBUG Memory profiling
7956 [2008-03-07 02:59:29,754] DEBUG                       insts     refs
7956 [2008-03-07 02:59:29,754] DEBUG <type 'int'>            150   137406
7956 [2008-03-07 02:59:29,755] DEBUG <type 'tuple'>          117   130211
7956 [2008-03-07 02:59:29,755] DEBUG <type 'dict'>             5     8331
7956 [2008-03-07 02:59:29,755] DEBUG <type 'str'>              3    27643
7956 [2008-03-07 02:59:29,755] DEBUG <type 'unicode'>          3     4606
7956 [2008-03-07 02:59:29,755] DEBUG <type 'list'>             3     3492
7956 [2008-03-07 02:59:29,756] DEBUG <type 'frame'>            1      962
7956 [2008-03-07 02:59:29,756] DEBUG <type 'cell'>             0    12948
7956 [2008-03-07 02:59:29,756] DEBUG <type 'function'>         0     9479
Woah! Who reserved 130 000 ints and tuples? No wonder that python soon gulps a gigabyte of memory. Since these are the only numbers which grow while the main loop is cycling, and there are no references to classes or objects, debugging becomes a bit more difficult. I need to try to cross-reference the difficult tuple objects.
This didn’t go well. By recursively walking gc.get_referrers() I got some results (example below), but it soon became clear that debugging references from within the system itself was impossible: the memory debugging code will always create nasty cyclic references into the system, since it needs to track the objects. I gave up. There had to be something better.
9154 [2008-03-07 04:05:23,571] DEBUG /var/lib/python-support/python2.5/MySQLdb/connections.pyc
9154 [2008-03-07 04:05:23,571] DEBUG <type 'function'>
9154 [2008-03-07 04:05:23,572] DEBUG defaulterrorhandler
9154 [2008-03-07 04:05:23,572] DEBUG <type 'function'>
9154 [2008-03-07 04:05:23,572] DEBUG string_literal
9154 [2008-03-07 04:05:23,572] DEBUG <type 'function'>
9154 [2008-03-07 04:05:23,573] DEBUG unicode_literal
9154 [2008-03-07 04:05:23,573] DEBUG <type 'function'>
9154 [2008-03-07 04:05:23,573] DEBUG string_decoder
9154 [2008-03-07 04:05:23,573] DEBUG <type 'function'>
9154 [2008-03-07 04:05:23,573] DEBUG __exit__
9154 [2008-03-07 04:05:23,574] DEBUG <type 'function'>
9154 [2008-03-07 04:05:23,574] DEBUG begin
9154 [2008-03-07 04:05:23,574] DEBUG <type 'function'>
9154 [2008-03-07 04:05:23,574] DEBUG __init__
9154 [2008-03-07 04:05:23,574] DEBUG <type 'function'>
9154 [2008-03-07 04:05:23,575] DEBUG show_warnings
There was: Guppy. Thank you Sverker Nilsson! You saved my day.
Since the API of Guppy is a little eccentric, here are some examples for you:
import guppy

# init heapy
heapy = guppy.hpy()

# Print memory statistics
def update():
    print heapy.heap()

# Print relative memory consumption since last cycle
def update():
    print heapy.heap()
    heapy.setref()

# Print relative memory consumption w/heap traversing
def update():
    print heapy.heap().get_rp(40)
    heapy.setref()
With heapy.heap() ; heapy.setref() I got this output:
Partition of a set of 12 objects. Total size = 3544 bytes.
Index Count % Size % Cumulative % Kind (class / dict of class)
0 3 25 2244 63 2244 63 unicode
1 2 17 708 20 2952 83 types.FrameType
2 3 25 432 12 3384 95 dict (no owner)
3 3 25 120 3 3504 99 str
4 1 8 40 1 3544 100 list
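A comparable "what has grown since the last checkpoint" view is available in the standard library of modern Python 3 through `tracemalloc`. This is a different tool from Guppy, shown here only as a sketch of the same measuring idea; the `retained` list is a made-up allocation to give the comparison something to report.

```python
import tracemalloc

tracemalloc.start()

# Checkpoint, playing the role of heapy.setref().
baseline = tracemalloc.take_snapshot()

# Simulate one work cycle that keeps some memory alive.
retained = [str(i) * 10 for i in range(10000)]

# Compare the current state against the checkpoint, grouped by source line.
current = tracemalloc.take_snapshot()
stats = current.compare_to(baseline, "lineno")
total_growth = sum(stat.size_diff for stat in stats)

for stat in stats[:3]:
    print(stat)
print("total growth in bytes:", total_growth)

tracemalloc.stop()
```

Unlike heapy's per-type partition, the result is grouped by the source line that allocated the memory, which points at code rather than at object kinds.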
One adds get_rp() traversal magic and everything becomes clear:
Reference Pattern by <[dict of] class>.
 0: _ --- [-] 14 (dict (no owner) | list | str | types.FrameType | types.Gene...
 1: a [-] 3 dict (no owner): 0x8c11f34*2, 0x8c1bd54*2, 0x8c1f854*2
 2: aa ---- [-] 1 list: 0x833c504*18
 3: a3 [-] 1 dict of django.db.backends.mysql.base.DatabaseWrapper: 0x8...
 4: a4 ------ [-] 1 dict (no owner): 0x83a65d4*2
 5: a5 [R] 1 guppy.heapy.heapyc.RootStateType: 0xb787c7a8L
 6: a3b ----- [-] 1 django.db.backends.mysql.base.DatabaseWrapper: 0x8356a34
 7: a3ba [S] 7 dict of module: ..db, ..models, ..query, ..transaction...
 8: b ---- [S] 1 types.FrameType: <<lambda> at 0x8b16ecc>
 9: c [-] 2 list: 0x833c504*18, 0xb7dafe6cL*5
<Type e.g. '_.more' for more.>
What could there be in the DatabaseWrapper object that keeps growing and growing… the query debugger. Django keeps track of all queries for debugging purposes (connection.queries). This list is reset at the end of each HTTP request. But in standalone mode, there are no requests. So you need to manually reset the queries list after each working cycle.
# reset_queries and connection both come from django.db
while self.running:
    self.mark_for_download()
    self.process_downloads()
    self.process_encodings()
    tracker.update()
    time.sleep(settings.WORKER_POLL_DELAY)

    # Clear database connection and reset query debugger
    # between cycles to make sure that
    # related resources get released
    reset_queries()
    connection.close()
    print str(connection.queries)
But even after this fix, I saw an increase in tuple and int usage when monitoring with TrackRefs. When I run heapy.heap() alone, there is no increase. So the tuple and int consumption must have been caused by the TrackRefs, sys.getobjects, gc, etc. magic itself.
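The underlying mechanism, a debug log that only a request boundary ever clears, can be reproduced without Django at all. The sketch below is a simplified stand-in and not Django code: `DebugConnection`, `work_cycle` and `run` are hypothetical names invented for the illustration.

```python
class DebugConnection(object):
    """Hypothetical stand-in for a database wrapper that logs every query."""

    def __init__(self):
        self.queries = []

    def execute(self, sql):
        # Every executed statement is remembered for debugging purposes.
        self.queries.append({"sql": sql})

def work_cycle(connection, cycle):
    """Pretend to do one round of work that issues three queries."""
    for n in range(3):
        connection.execute("SELECT %d, %d" % (cycle, n))

def run(connection, cycles, reset):
    """Run worker cycles and return the query log length after each one."""
    sizes = []
    for cycle in range(cycles):
        work_cycle(connection, cycle)
        if reset:
            # What the end of an HTTP request would normally do for us.
            del connection.queries[:]
        sizes.append(len(connection.queries))
    return sizes

print(run(DebugConnection(), 4, reset=False))  # [3, 6, 9, 12]
print(run(DebugConnection(), 4, reset=True))   # [0, 0, 0, 0]
```

Nothing here is a leak in the interpreter sense: every entry is still reachable, which is exactly why the garbage collector never frees it.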
Eclipse web developer plug-in memo
Currently I work in quite a wide field of software development: Python (standalone, Plone, Zope, Django), PHP, Java, Symbian and embedded Linux. I am using Eclipse for development, since it’s pretty much the only consistent platform fulfilling my needs. The nature of the work also forces me to use different computers (Mac/Windows/Linux) with different clients. This drives me to reinstall Eclipse every now and then.
Below are my personal notes on which plug-ins are needed to get a “perfect” Eclipse set-up. Basically they are just my own notes so that I don’t need to Google everything all over again every time I reinstall. I hope the readers can find new pearls here or suggest improvements.
Eclipse setup
Eclipse has internal updater/web installer. All plug-ins are downloaded as ZIP files and extracted to Eclipse folder or installed through the internal updater. Paste Eclipse update site URLs to menu Help -> Software updates -> Find and Install, New Remote Location. You can use dummy text as the name of update site.
Eclipse WTP (Web Tools Platform)
Eclipse Web Tools Platform bundles Eclipse, Java development tools, HTML editor, CSS editor and some other generic useful stuff.
- No separate Eclipse download needed. Download the bundle from http://download.eclipse.org/webtools/downloads/
Python
PyDev is a plug-in for Python and Jython development.
Site URL: http://pydev.sourceforge.net
Eclipse update site URL: http://pydev.sourceforge.net/updates/
PDT
PDT download provides Eclipse, HTML editor, PHP editor and CSS editor.
Site URL: http://www.eclipse.org
Eclipse update site URL: http://download.eclipse.org/tools/pdt/updates/
Subclipse
Subclipse provides Subversion version control integration to Eclipse.
Eclipse update site URL: http://subclipse.tigris.org/update_1.2.x
In the installer, uncheck the integration modules checkbox or the installer will complain about missing modules.
JSEclipse
JSEclipse provides a better editor (over WTP) for Javascript files, with impressive outlining and autofill capabilities.
Download requires Adobe developer account or similar fill-in-the-fields crap.
Site URL: http://labs.adobe.com/technologies/jseclipse/
ShellEd
Syntax coloring for Unix shell scripts
Project site: http://sourceforge.net/projects/shelled
SQL Explorer
SQL editor with limited GUI capabilities. Based on Eclipse platform. Comes standalone and as Eclipse plug-in.
- Download ZIP from http://sourceforge.net/project/showfiles.php?group_id=132863
- Needs the MySQL JDBC driver
A quick tryout: Documentation Generating for Plone products
Plone is a modular CMS which can be expanded with additional products. That means new features are easy to install, and also to customize. However, quickly understanding code that other people wrote might turn out tricky, as each coder uses their own style. Therefore, it might be useful to get an overall picture of the system before diving into details.
Documentation generators are useful for giving a comprehensive view on code. These are applications that traverse through code and extract information out of it. They use the structured information then to produce a nice looking reference of the code. Ever heard about API? Yep. Ever seen that sort of documentation among any 3rd party Plone product? At least I haven’t.
Luckily, there are a few choices suitable for Plone/Python:
Parsers: doxygen (generic), epydoc (defines ‘epytext’, parses also others), docutils (defines and parses ‘reStructuredText’)
Extensions: graphviz (builds visualization graphs)
Plugins: eclox (an Eclipse plugin that uses doxygen, which uses graphviz)
(Plone API’s at api.plone.org use epydoc btw.)
Out of these, I quickly tested doxygen on a Plone product called EasyShop. The result was interesting but of little use. EasyShop does only a little subclassing and therefore the documentation doxygen produced was basically listings of separate classes and methods. Doxygen uses graphviz to build graphical visualizations of class relations, but those were of little use as well. The problem here is that Plone products are not ordinary Python packages: they have adapters, utilities, views, events, subscribers and such. Creating a decent API reference out of these would need a specific solution targeted at the platform.
Documentation generation seems interesting, however, and graphviz the most promising of the whole bunch. Unfortunately, I couldn’t produce anything useful on my first few tries, but the subject just needs a little more research. After all, think about it: an API-like documentation with UML-like graphs of any Plone product, wouldn’t that be nice?
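To make concrete what such generators do at their core, here is a minimal sketch that walks Python source with the standard `ast` module and lists classes and functions with the first line of their docstrings. It is not how doxygen or epydoc work internally; the `outline` function and the sample `SOURCE` (a made-up shop module) are assumptions for illustration, written for a modern Python 3.

```python
import ast

# Hypothetical module source standing in for a real product's code.
SOURCE = '''
class Cart(object):
    """Hypothetical shopping cart."""

    def add(self, item):
        """Add an item to the cart."""

    def _recalculate(self):
        pass

def checkout(cart):
    """Turn a cart into an order."""
'''

def outline(source):
    """Return (kind, qualified name, docstring summary) tuples for the source."""
    entries = []

    def visit(node, prefix):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.ClassDef):
                kind = "class"
            elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                kind = "function"
            else:
                continue
            name = prefix + child.name
            doc = ast.get_docstring(child)
            summary = doc.splitlines()[0] if doc else "(undocumented)"
            entries.append((kind, name, summary))
            if kind == "class":
                # Descend into the class body to pick up its methods.
                visit(child, name + ".")

    visit(ast.parse(source), "")
    return entries

for kind, name, summary in outline(SOURCE):
    print("%-8s %-20s %s" % (kind, name, summary))
```

A listing like this is roughly what came out for EasyShop: it captures classes and methods, but nothing about adapters, views or subscribers, which live in registrations rather than in the class hierarchy.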