XSLT and big files

2007-10-28T10:49:43Z
Dave Pawson

I've been processing my blog for a while now, and it's taking progressively longer to chunk it up into HTML, sort it by date and so on. The integrated file is only 685K, which isn't particularly big until you hear Mike Kay say that processing such a file typically takes ten times that in memory. This has been bugging me for a while, so this morning I sat down and looked at the workflow.

I write individual entries, each with a root element of entry in the Atom namespace. These are saved to files named for the date, for ease of processing. The only time I integrate them into a single feed is to sort them into date order to generate atom.xml, which holds just the most recent 30 days' worth of entries. Daft really, so I rewrote that step to use the file list, generated by Python, which holds just the filenames. This is easy to sort, since even the two-per-day entries are simply related: the date comparison just jumps by an order of magnitude. This entry is 0710281 (second entry of the day). When comparing with an XSLT-formatted date for 'today', 071028, I simply multiply today's date by 10 to do the comparison. Then a rough A - B < 30 gives me the last 30 days' worth for the atom feed.
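For anyone wanting to try the same trick outside XSLT, here is a minimal Python sketch of the comparison described above. It is an illustration, not the stylesheet I actually run. The function names are made up for the example, and it assumes every entry ID is seven digits (YYMMDD plus a sequence digit), which the post does not spell out. The cut-off is left as a parameter because the comparison is only rough: on the scaled values one day within a month is a step of 10, and a month boundary is a much bigger jump.

```python
from datetime import date


def today_key(today: date) -> int:
    """Format a date as YYMMDD and multiply by 10, e.g. 2007-10-28 -> 710280."""
    return int(today.strftime("%y%m%d")) * 10


def entry_key(entry_id: str) -> int:
    """Turn an entry ID such as '0710281' into a number.

    Assumption: IDs are always YYMMDD followed by one sequence digit.
    """
    return int(entry_id)


def recent_entries(entry_ids, today: date, threshold: int):
    """Keep IDs whose distance from today's scaled date is under threshold.

    The result is sorted newest first. The threshold is applied to the
    scaled numbers, so it is a rough cut-off, not an exact day count.
    """
    key = today_key(today)
    recent = [e for e in entry_ids if key - entry_key(e) < threshold]
    return sorted(recent, key=entry_key, reverse=True)


if __name__ == "__main__":
    # Hypothetical file list; only '0710281' comes from the post itself.
    ids = ["0709010", "0710270", "0710281", "0710280"]
    print(recent_entries(ids, date(2007, 10, 28), 30))
```

Only the filenames are touched here, so nothing the size of the integrated feed ever has to be loaded just to decide what goes into atom.xml.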

I wonder if the XSLT WG is actively looking at processing large documents. It's been taken up on the FOSS side, but I've seen nothing from the W3C.

Anyway, if this feed isn't looking right, it's probably because I've changed my processing, so please let me know and I'll try to fix it.

Keywords: atom
