Monday, January 23, 2006

rencode -- Reduced length encodings

A space-efficient Python serialization module based on bencode, which can be used to decode strings from untrusted sources. It works well for complex, heterogeneous data structures with many small elements. For that use case, the encodings take considerably less space than those of bencode, gherkin, and Recipe 415503 [1], and this module is faster, too. (Source code), (Comparison).
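For readers who have not seen bencode, here is a minimal sketch of the baseline format that rencode starts from. It is not rencode's code and does not show rencode's byte layout, which this post does not describe. The function names `bencode` and `bdecode` are chosen for this illustration only. The point is the per-element overhead: every small integer costs at least three bytes (`i`, the digits, `e`), which is what hurts on structures with many small elements.

```python
def bencode(obj):
    """Encode ints, bytes, lists and dicts (with bytes keys) as bencode."""
    if isinstance(obj, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(obj, int):
        return b"i%de" % obj                      # 5 -> b"i5e"
    if isinstance(obj, bytes):
        return b"%d:%s" % (len(obj), obj)         # b"spam" -> b"4:spam"
    if isinstance(obj, list):
        return b"l" + b"".join(bencode(x) for x in obj) + b"e"
    if isinstance(obj, dict):
        items = sorted(obj.items())               # keys are emitted in sorted order
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("unsupported type: %r" % type(obj))


def bdecode(data, pos=0):
    """Return (value, next_pos). The input is only parsed, never executed."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            result[key], pos = bdecode(data, pos)
        return result, pos + 1
    if lead.isdigit():
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError("invalid bencode at offset %d" % pos)


small_ints = bencode([1, 2, 3])                   # three bytes per element plus list markers
```

Because the decoder only slices bytes and calls `int`, malformed input raises an exception instead of running anything, which is the property that makes this family of formats usable on untrusted data.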


Nathan said...

Thanks for posting this code; it was helpful to me. I am using Google App Engine to make Python games with distributed state, and this little module was perfect for replacing pickle and marshal.

Brandon said...

Benchmarks; the numbers in parentheses are the serialized size before/after gzip:

Time to serialize/deserialize a large dict
str/eval (186000/55250): 116.62 msec/pass
cPickle (324904/104183): 37.10 msec/pass
pickle (324900/104237): 368.52 msec/pass
rencode (99202/53403): 108.34 msec/pass

Time to serialize/deserialize a list of strings
str/eval (90000/213): 2.21 msec/pass
cPickle (40015/88): 0.47 msec/pass
pickle (40015/88): 4.13 msec/pass
split/join (99995/230): 0.09 msec/pass
rencode (60002/122): 2.15 msec/pass
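A rough sketch of how numbers in this shape could be gathered, using only the standard library. This is not the script that produced the figures above. The helper names `measure` and `report` and the sample data are hypothetical, `ast.literal_eval` stands in for `eval`, and rencode itself is left out so the block runs without third-party packages (it could be added as one more entry in `codecs`).

```python
import ast
import gzip
import pickle
import timeit


def measure(name, dumps, loads, data, passes=20):
    """Return (name, raw size, gzipped size, msec per dumps+loads pass)."""
    blob = dumps(data)
    assert loads(blob) == data                    # sanity check: round trip must match
    raw = len(blob)
    zipped = len(gzip.compress(blob))
    secs = timeit.timeit(lambda: loads(dumps(data)), number=passes)
    return name, raw, zipped, 1000.0 * secs / passes


def report(row):
    """Format one row in the same style as the figures in the comment."""
    return "%s (%d/%d): %.2f msec/pass" % row


# Hypothetical sample data: a list of short strings without newlines.
strings = ["item%04d" % i for i in range(1000)]

codecs = [
    ("str/eval", lambda d: repr(d).encode(), lambda b: ast.literal_eval(b.decode())),
    ("pickle", pickle.dumps, pickle.loads),
    ("split/join", lambda d: "\n".join(d).encode(), lambda b: b.decode().split("\n")),
]

rows = [measure(name, dumps, loads, strings) for name, dumps, loads in codecs]
for row in rows:
    print(report(row))
```

Timings from a harness like this depend heavily on the machine, the Python version, and the shape of the data, so they are only comparable within a single run.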