Velocity Reviews

Velocity Reviews (http://www.velocityreviews.com/forums/index.php)
-   Python (http://www.velocityreviews.com/forums/f43-python.html)
-   -   persistent composites (http://www.velocityreviews.com/forums/t687819-persistent-composites.html)

Aaron Brady 06-14-2009 02:27 PM

persistent composites
 
Hi, please forgive the multi-posting on this general topic.

Some time ago, I recommended a pursuit of keeping 'persistent
composite' types on disk, to be read and updated at other times by
other processes. Databases provide this functionality, with the
exception that field types in any given table are required to be
uniform. Python has no such restriction.

I tried out an implementation of composite collections, specifically
lists, sets, and dicts, using 'sqlite3' as a persistence back-end.
It's significantly slower, but we might argue that attempting to do it
by hand classifies as a premature optimization; it is easy to optimize
debugged code.

The essentials of the implementation are:
- each 'object' gets its own table.
= this includes immutable types
- a reference count table
= when an object's ref. count reaches zero, its table is dropped
- a type-map table
= maps object's table ID to a string of its type
- a single 'entry point' table, with the table ID of the entry-point
object
= the entry point is the only data structure available to new
connections. (I imagine it will usually be a list or dict.)

I will be sure to kill any interest you might have by now, by
"revealing" a snippet of code.

The object creation procedure:

def new_table( self, type ):
''' 'type' is a string, the name of the class the object is an
instance of '''
cur= self.conn.cursor( )
recs= cur.execute( '''SELECT max( tableid ) FROM refcounts''' )
rec= cur.fetchone( )
if rec[ 0 ] is None:
obid= 0
else:
obid= rec[ 0 ]+ 1
cur.execute( '''INSERT INTO types VALUES( ?, ? )''', ( obid,
type ) )
cur.execute( '''INSERT INTO refcounts VALUES( ?, ? )''', ( obid,
1 ) )

The increment ref. count procedure:

def incref( self, obid ):
cur= self.conn.cursor( )
recs= cur.execute( '''SELECT count FROM refcounts WHERE tableid
= ?''', ( obid, ) )
rec= recs.fetchone( )
newct= rec[ 0 ]+ 1
cur.execute( '''UPDATE refcounts SET count = ? WHERE tableid = ?''',
( newct, obid ) )

The top-level structure contains these two procedures, as well as
'setentry', 'getentry', and 'revive' procedures.

Most notably, user-defined types are possible. The dict is merely a
persistent dict. 'revive' checks the global namespace by name for the
original type, subject to the same restrictions that we all know and
love that 'pickle' has.

As usual, deadlocks and cyclic garbage pose the usual problems. The
approach I used last time was to maintain a graph of acquired locks,
and merely check for cycles to avert deadlocks, which would go in a
separate table. For garbage, I can't find a better solution than
Python already uses.

From the 3.0 docs:
gc.garbage

A list of objects which the collector found to be unreachable but
could not be freed (uncollectable objects).
....
Python doesn’t collect such [garbage] cycles automatically because, in
general, it isn’t possible for Python to guess a safe order in which
to run the __del__() methods. If you know a safe order, you can force
the issue by examining the garbage list, and explicitly breaking
cycles due to your objects within the list.

Before I go and flesh out the entire interfaces for the provided
types, does anyone have a use for it?

kindly 06-14-2009 02:54 PM

Re: persistent composites
 
On Jun 14, 3:27*pm, Aaron Brady <castiro...@gmail.com> wrote:
> Hi, please forgive the multi-posting on this general topic.
>
> Some time ago, I recommended a pursuit of keeping 'persistent
> composite' types on disk, to be read and updated at other times by
> other processes. *Databases provide this functionality, with the
> exception that field types in any given table are required to be
> uniform. *Python has no such restriction.
>
> I tried out an implementation of composite collections, specifically
> lists, sets, and dicts, using 'sqlite3' as a persistence back-end.
> It's significantly slower, but we might argue that attempting to do it
> by hand classifies as a premature optimization; it is easy to optimize
> debugged code.
>
> The essentials of the implementation are:
> * - each 'object' gets its own table.
> * * = this includes immutable types
> * - a reference count table
> * * = when an object's ref. count reaches zero, its table is dropped
> * - a type-map table
> * * = maps object's table ID to a string of its type
> * - a single 'entry point' table, with the table ID of the entry-point
> object
> * * = the entry point is the only data structure available to new
> connections. *(I imagine it will usually be a list or dict.)
>
> I will be sure to kill any interest you might have by now, by
> "revealing" a snippet of code.
>
> The object creation procedure:
>
> def new_table( self, type ):
> * ''' 'type' is a string, the name of the class the object is an
> instance of '''
> * cur= self.conn.cursor( )
> * recs= cur.execute( '''SELECT max( tableid ) FROM refcounts''' )
> * rec= cur.fetchone( )
> * if rec[ 0 ] is None:
> * * obid= 0
> * else:
> * * obid= rec[ 0 ]+ 1
> * cur.execute( '''INSERT INTO types VALUES( ?, ? )''', ( obid,
> type ) )
> * cur.execute( '''INSERT INTO refcounts VALUES( ?, ? )''', ( obid,
> 1 ) )
>
> The increment ref. count procedure:
>
> def incref( self, obid ):
> * cur= self.conn.cursor( )
> * recs= cur.execute( '''SELECT count FROM refcounts WHERE tableid
> = ?''', ( obid, ) )
> * rec= recs.fetchone( )
> * newct= rec[ 0 ]+ 1
> * cur.execute( '''UPDATE refcounts SET count = ? WHERE tableid = ?''',
> ( newct, obid ) )
>
> The top-level structure contains these two procedures, as well as
> 'setentry', 'getentry', and 'revive' procedures.
>
> Most notably, user-defined types are possible. *The dict is merely a
> persistent dict. *'revive' checks the global namespace by name for the
> original type, subject to the same restrictions that we all know and
> love that 'pickle' has.
>
> As usual, deadlocks and cyclic garbage pose the usual problems. *The
> approach I used last time was to maintain a graph of acquired locks,
> and merely check for cycles to avert deadlocks, which would go in a
> separate table. *For garbage, I can't find a better solution than
> Python already uses.
>
> From the 3.0 docs:
> gc.garbage
>
> * * A list of objects which the collector found to be unreachable but
> could not be freed (uncollectable objects).
> ...
> Python doesn’t collect such [garbage] cycles automatically because, in
> general, it isn’t possible for Python to guess a safe order in which
> to run the __del__() methods. If you know a safe order, you can force
> the issue by examining the garbage list, and explicitly breaking
> cycles due to your objects within the list.
>
> Before I go and flesh out the entire interfaces for the provided
> types, does anyone have a use for it?


I like it as a concept, have not got any use for it this minute, but I
am sure it will be useful someday.

Steven D'Aprano 06-14-2009 03:24 PM

Re: persistent composites
 
Aaron Brady wrote:

> Some time ago, I recommended a pursuit of keeping 'persistent
> composite' types on disk, to be read and updated at other times by
> other processes. Databases provide this functionality, with the
> exception that field types in any given table are required to be
> uniform. Python has no such restriction.

....
> I will be sure to kill any interest you might have by now, by
> "revealing" a snippet of code.


How about telling us what problem you're trying to solve? Giving us your
implementation isn't useful if we don't even know what it's for. Persistent
data on disk, okay, I get that -- but how is it different from (say) XML,
or INI files, or pickles, or similar? Is the idea to have *live* data
stored on disk, updated by multiple processes simultaneously? Don't assume
that people will remember your recommendation from "some time ago".


--
Steven


Aaron Brady 06-14-2009 04:41 PM

Re: persistent composites
 
On Jun 14, 8:24*am, Steven D'Aprano
<st...@REMOVETHIS.cybersource.com.au> wrote:
> Aaron Brady wrote:
> > Some time ago, I recommended a pursuit of keeping 'persistent
> > composite' types on disk, to be read and updated at other times by
> > other processes. *Databases provide this functionality, with the
> > exception that field types in any given table are required to be
> > uniform. *Python has no such restriction.

> ...
> > I will be sure to kill any interest you might have by now, by
> > "revealing" a snippet of code.

>
> How about telling us what problem you're trying to solve? Giving us your
> implementation isn't useful if we don't even know what it's for. Persistent
> data on disk, okay, I get that -- but how is it different from (say) XML,
> or INI files, or pickles, or similar? Is the idea to have *live* data
> stored on disk, updated by multiple processes simultaneously? Don't assume
> that people will remember your recommendation from "some time ago".
>
> --
> Steven


It was no time at all in the Usenet world, I suppose. There are lots
of strategies for storing data on a disk, and lots of strategies for
storing data in memory. I was trying on disk what Python uses on
memory.

If you want to argue that Python isn't useful, you're in the wrong
place. If you want to argue that Python's strategy is useful on
disks, you're talking to the right person.

I'll draw an analogy. static data structures : sql & co. :: Python
data structures : this undertaking. It has all the advantages over
SQL & XML structures that Python has over C structures: sort of a
'worst of neither' combination. I haven't written a textbook on DSs,
though, so I don't quite know where to begin. In my eyes, you're
practically asking, "What's your use case for Python?" Either that,
or I don't know enough other languages.

I have nothing to use it for, but someone might, and it might be just
plumb interesting to fellow Pythoneers. If I did, I'd have somewhere
to start.

Jaime Fernandez del Rio 06-14-2009 05:18 PM

Re: persistent composites
 
On Sun, Jun 14, 2009 at 4:27 PM, Aaron Brady<castironpi@gmail.com> wrote:
>
> Before I go and flesh out the entire interfaces for the provided
> types, does anyone have a use for it?


A real-world application of persistent data structures can be found here:

http://stevekrenzel.com/persistent-list

Jaime

--
(\__/)
( O.o)
( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus
planes de dominación mundial.

Diez B. Roggisch 06-15-2009 12:45 PM

Re: persistent composites
 
Aaron Brady wrote:

> Hi, please forgive the multi-posting on this general topic.
>
> Some time ago, I recommended a pursuit of keeping 'persistent
> composite' types on disk, to be read and updated at other times by
> other processes. Databases provide this functionality, with the
> exception that field types in any given table are required to be
> uniform. Python has no such restriction.
>
> I tried out an implementation of composite collections, specifically
> lists, sets, and dicts, using 'sqlite3' as a persistence back-end.
> It's significantly slower, but we might argue that attempting to do it
> by hand classifies as a premature optimization; it is easy to optimize
> debugged code.


<snip/>

Sounds like you are re-inventing the ZODB.

Diez

Aahz 06-15-2009 03:37 PM

Re: persistent composites
 
In article <79mtt7F1r4807U1@mid.uni-berlin.de>,
Diez B. Roggisch <deets@nospam.web.de> wrote:
>Aaron Brady wrote:
>>
>> Some time ago, I recommended a pursuit of keeping 'persistent
>> composite' types on disk, to be read and updated at other times by
>> other processes. Databases provide this functionality, with the
>> exception that field types in any given table are required to be
>> uniform. Python has no such restriction.
>>
>> I tried out an implementation of composite collections, specifically
>> lists, sets, and dicts, using 'sqlite3' as a persistence back-end.
>> It's significantly slower, but we might argue that attempting to do it
>> by hand classifies as a premature optimization; it is easy to optimize
>> debugged code.

>
>Sounds like you are re-inventing the ZODB.


....or SQLAlchemy or pickles in a SQL BLOB or...
--
Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/

"Many customs in this life persist because they ease friction and promote
productivity as a result of universal agreement, and whether they are
precisely the optimal choices is much less important." --Henry Spencer

Aaron Brady 06-15-2009 03:56 PM

Re: persistent composites
 
On Jun 15, 5:45*am, "Diez B. Roggisch" <de...@nospam.web.de> wrote:
> Aaron Brady wrote:
> > Hi, please forgive the multi-posting on this general topic.

>
> > Some time ago, I recommended a pursuit of keeping 'persistent
> > composite' types on disk, to be read and updated at other times by
> > other processes. *Databases provide this functionality, with the
> > exception that field types in any given table are required to be
> > uniform. *Python has no such restriction.

>
> > I tried out an implementation of composite collections, specifically
> > lists, sets, and dicts, using 'sqlite3' as a persistence back-end.
> > It's significantly slower, but we might argue that attempting to do it
> > by hand classifies as a premature optimization; it is easy to optimize
> > debugged code.

>
> <snip/>
>
> Sounds like you are re-inventing the ZODB.
>
> Diez


Alright, Diez. Here is some private consulting for free.

''Section 2.6.1:
The most common idiom that isn't caught by the ZODB is mutating a list
or dictionary''

My approach performs this for free.

The docs also don't mention interprocess communication, which is one
of the two primary functions that I satisfy in my approach.

The syntax is different, non-trivially so, and mine is more Pythonic.

Aaron Brady 06-15-2009 03:56 PM

Re: persistent composites
 
On Jun 14, 10:18*am, Jaime Fernandez del Rio <jaime.f...@gmail.com>
wrote:
> On Sun, Jun 14, 2009 at 4:27 PM, Aaron Brady<castiro...@gmail.com> wrote:
>
> > Before I go and flesh out the entire interfaces for the provided
> > types, does anyone have a use for it?

>
> A real-world application of persistent data structures can be found here:
>
> http://stevekrenzel.com/persistent-list
>
> Jaime


Jaime, thanks for the link. I contacted its author.

Aaron Brady 06-15-2009 04:18 PM

Re: persistent composites
 
On Jun 15, 8:37*am, a...@pythoncraft.com (Aahz) wrote:
> In article <79mtt7F1r480...@mid.uni-berlin.de>,
> Diez B. Roggisch <de...@nospam.web.de> wrote:
>
> >Aaron Brady wrote:

>
> >> Some time ago, I recommended a pursuit of keeping 'persistent
> >> composite' types on disk, to be read and updated at other times by
> >> other processes. *Databases provide this functionality, with the
> >> exception that field types in any given table are required to be
> >> uniform. *Python has no such restriction.

>
> >> I tried out an implementation of composite collections, specifically
> >> lists, sets, and dicts, using 'sqlite3' as a persistence back-end.
> >> It's significantly slower, but we might argue that attempting to do it
> >> by hand classifies as a premature optimization; it is easy to optimize
> >> debugged code.

>
> >Sounds like you are re-inventing the ZODB.

>
> ...or SQLAlchemy or pickles in a SQL BLOB or...


I recognize your aversion to my claim of making a new approach on
something. I suppose an austere review of literature would compare
with the existing alternatives.

My approach does not define tables, so it is not SQL Alchemy; it is
not mere sugar for SQL. It defines a 'set' class and a 'tuple' class,
so some of the functionality may (and probably should be expected to)
overlap.

http://www.sqlalchemy.org/docs/05/or...create-a-table

My approach does not pickle objects, so it is not a mere BLOB pickle.
If a user writes, listA[ 50 ].append( "abcde" ), one addition is made
to an existing structure, and the remaining contents of 'listA' are
unread and unchanged.


All times are GMT. The time now is 11:54 PM.

Powered by vBulletin®. Copyright ©2000 - 2014, vBulletin Solutions, Inc.
SEO by vBSEO ©2010, Crawlability, Inc.