Velocity Reviews


Philipp K. Janert, Ph.D. 09-07-2004 05:14 AM

Slow loading of large in-memory tables
 

Dear All!

I am trying to load a relatively large table (about 1 million
rows) into an SQLite table that is kept in memory. The
load process is very slow - on the order of 15 minutes.

I am accessing sqlite from Python, using the pysqlite driver.
I first load all records using cx.execute( "insert ..." ).
Only once I have run cx.execute() for all records do I commit
all the preceding inserts with conn.commit().

I have tried using cx.executemany(), but if anything, this
makes the process slower.
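
Schematically, my code looks like the sketch below (the real
schema and input data are replaced by stand-ins, and I am using
pysqlite's %s parameter style):

import sqlite  # module name exposed by pysqlite 0.5.x

conn = sqlite.connect(":memory:")   # DB kept entirely in memory
cx = conn.cursor()
cx.execute("create table data (a int, b int)")

rows = [(i, i * i) for i in xrange(1000000)]  # stand-in for the real data

# one cx.execute() per record ...
for row in rows:
    cx.execute("insert into data values (%s, %s)", row)

# ... or the executemany() variant, which was no faster:
# cx.executemany("insert into data values (%s, %s)", rows)

conn.commit()   # single commit at the very end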

I have not tried mucking around with transactions manually.
I have sufficiently convinced myself that the slow part
is in fact the cx.execute() - not reading the data from the
file or anything else.
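
One quick way to see where the time goes is to time the insert
loop and the final commit separately, e.g. (same stand-in table
and data as in the sketch above):

import time
import sqlite

conn = sqlite.connect(":memory:")
cx = conn.cursor()
cx.execute("create table data (a int, b int)")
rows = [(i, i * i) for i in xrange(100000)]

start = time.time()
for row in rows:
    cx.execute("insert into data values (%s, %s)", row)
print "execute loop: %.1f s" % (time.time() - start)

start = time.time()
conn.commit()
print "commit:       %.1f s" % (time.time() - start)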

My system specs and versions:
SuSE 9.1
Python 2.3.3
SQLite 2.8.12
pysqlite 0.5.1
1 GB memory (I am not swapping, so that is not the problem).

Are there ways to make this process faster?

Also, I want to keep the DB in memory, since I later run a
very DB-intensive simulation against it. However, this implies
that I need to load the DB from the same Python program which
will later run the simulation - I think.
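
In other words, the program would have to be structured roughly
like this (a sketch; load() and simulate() are placeholder names
with invented bodies):

import sqlite

def load(conn):
    cx = conn.cursor()
    cx.execute("create table data (a int, b int)")
    cx.executemany("insert into data values (%s, %s)",
                   [(i, i * i) for i in xrange(1000)])
    conn.commit()

def simulate(conn):
    cx = conn.cursor()
    cx.execute("select count(*) from data where b > %s", (100,))
    print cx.fetchone()[0]

conn = sqlite.connect(":memory:")   # DB lives only as long as this connection
load(conn)       # so loading ...
simulate(conn)   # ... and simulating must happen in the same program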

Any hints appreciated!

(Please cc me when replying to the list regarding this
message!)

Best regards,

Ph.

Thorsten Kampe 09-15-2004 02:27 PM

Re: Slow loading of large in-memory tables
 
* Philipp K. Janert, Ph.D. (2004-09-07 07:14 +0200)
> I am trying to load a relatively large table (about 1 million
> rows) into an SQLite table that is kept in memory. The
> load process is very slow - on the order of 15 minutes.
>
> I am accessing sqlite from Python, using the pysqlite driver.
> I first load all records using cx.execute( "insert ..." ).
> Only once I have run cx.execute() for all records do I commit
> all the preceding inserts with conn.commit().
>
> I have tried using cx.executemany(), but if anything, this
> makes the process slower.
>
> I have not tried mucking around with transactions manually.
> I have sufficiently convinced myself that the slow part
> is in fact the cx.execute() - not reading the data from the
> file or anything else.
>
> Are there ways to make this process faster?


According to [1]:

pragma temp_store = memory;
pragma cache_size = 4000;    -- or any bigger value (2000 is the default)
pragma count_changes = off;
pragma synchronous = off;
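
From pysqlite these can be issued like any other statement,
e.g. (a sketch):

import sqlite

conn = sqlite.connect(":memory:")
cx = conn.cursor()
for pragma in ("pragma temp_store = memory",
               "pragma cache_size = 4000",
               "pragma count_changes = off",
               "pragma synchronous = off"):
    cx.execute(pragma)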

Also, SQLite makes a commit after every SQL statement (not only
after those that alter the database) [2]. Therefore you have to
manually start a transaction before the first SQL statement and
commit manually after the last one. To be able to do this, you
have to turn off pysqlite's integrated committing:

connection = sqlite.connect(database, autocommit = 1)
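
Put together, loading inside one explicit transaction would look
roughly like this (a sketch; table and data are invented):

import sqlite

conn = sqlite.connect(":memory:", autocommit = 1)
cx = conn.cursor()
cx.execute("create table data (a int, b int)")

cx.execute("begin")        # open one transaction for the whole load
for i in xrange(1000000):
    cx.execute("insert into data values (%s, %s)", (i, i * i))
cx.execute("commit")       # one commit instead of one per statement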

> Also, I want to keep the DB in memory, since I later run a
> very DB-intensive simulation against it. However, this implies
> that I need to load the DB from the same Python program which
> will later run the simulation - I think.


Yes. A :memory: database exists only for the lifetime of the
connection that created it, so loading and simulation have to
happen in the same process.

Thorsten

[1] http://web.utk.edu/~jplyon/sqlite/SQ...ation_FAQ.html
[2] http://web.utk.edu/~jplyon/sqlite/SQ...l#transactions

