On Fri, Dec 31, 2004 at 11:48:19AM -0500, seth vidal wrote:
> using the reader at the C level, this includes decompressing the archive
> and walking through all nodes. The main cost is to turn the parsed data into
> Python's internal representation as I said.
>
> > then wouldn't it be useful to
> > implement that small portion in C? or isn't it such a small part?
>
> The string interning is in the Python lib, probably in C as it's a C API
> as far as I can tell. And no, I didn't look at the Python internal code.
I'm talking from ignorance here:
Would it be possible to speed up the string interning by providing your
own __repr__ methods in the libxml2 python module?
Unfortunately that's not where the problem lies, assuming I understand
what you suggest. __repr__ is used to make a string representation from
a Python object, while the problem we have is about building that Python
object (which happens to be a string) from the C string.
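
To illustrate the distinction, here is a minimal sketch, not the actual
libxml2 binding code. The Node class and its raw_name argument are
hypothetical stand-ins for a wrapped node and the C string it comes from,
and sys.intern is the Python 3 spelling of the older built-in intern().
Construction (decode plus intern) runs for every node, while __repr__ is
never called during that step:

```python
import sys


class Node:
    """Hypothetical stand-in for a wrapped libxml2 node."""

    repr_calls = 0  # counts how often __repr__ actually runs

    def __init__(self, raw_name: bytes):
        # Building the Python object: decode the C-style byte string and
        # intern the result. This is the step discussed as costly.
        self.name = sys.intern(raw_name.decode("utf-8"))

    def __repr__(self):
        # Only runs when something asks for a textual representation.
        Node.repr_calls += 1
        return f"<Node {self.name}>"


nodes = [Node(b"package") for _ in range(1000)]
calls_after_build = Node.repr_calls  # __repr__ played no part in building

text = repr(nodes[0])  # the first and only call to __repr__

# Interning makes equal names share a single string object.
shared = all(n.name is nodes[0].name for n in nodes)
```

So a custom __repr__ changes what happens at the repr() call only, and
leaves the cost of building the 1000 name strings untouched.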
We should double-check where time is actually spent. Using (k)cachegrind
is very useful to make such an analysis.
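
cachegrind looks at the C side. As a Python-level complement, and only as
a rough sketch, the standard cProfile module can show which Python-visible
calls dominate. The functions build_strings, walk and parse below are
hypothetical stand-ins, not part of the libxml2 bindings:

```python
import cProfile
import io
import pstats


def build_strings(raw_items):
    # Hypothetical stand-in for turning parsed C strings into Python strings.
    return [item.decode("utf-8") for item in raw_items]


def walk(raw_items):
    # Hypothetical stand-in for the rest of the parsing work.
    return sum(len(item) for item in raw_items)


def parse(raw_items):
    return build_strings(raw_items), walk(raw_items)


raw = [b"name-%d" % i for i in range(50000)]

profiler = cProfile.Profile()
profiler.enable()
names, total = parse(raw)
profiler.disable()

# Collect the report in a string, sorted by cumulative time per function.
out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(10)
report = out.getvalue()
print(report)
```

The report lists time per function, which gives a first hint about whether
the string construction or the tree walk deserves the closer look with
cachegrind.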
Daniel
--
Daniel Veillard      | Red Hat Desktop team http://redhat.com/
veillard(a)redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/