For my job at Intel, I recently started working with some python code that – among other things – accesses an sqlite database. The DB in question is currently around 3GB in size. One of the things this code does is generate a report for a given month, say “2013-10”, by submitting an SQL statement to sqlite that goes something like
SELECT * FROM <table> WHERE date >= '2013-10-01' AND date <= '2013-10-31';
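In python's sqlite3 module, a query like this would typically be issued with bound parameters so the date strings are quoted correctly. A minimal sketch (the table and column names here are made up, and an in-memory DB stands in for the real 3GB file):

```python
import sqlite3

# Stand-in database: a tiny "report" table instead of the real DB file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE report (date TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO report VALUES (?, ?)",
    [("2013-09-30", 1), ("2013-10-15", 2), ("2013-10-31", 3), ("2013-11-01", 4)],
)

# Bound parameters (?) take care of quoting the date strings.
rows = conn.execute(
    "SELECT * FROM report WHERE date >= ? AND date <= ?",
    ("2013-10-01", "2013-10-31"),
).fetchall()
# rows now holds only the October entries
```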
Since I’m expecting to have to run this for every month of 2013, and experience tells me that when my internal clients want it, they’ll want it right now, I wrote a wrapper script around this code that calls this python script N times in parallel, once for each month. This worked fine as long as N was 2 or less. When I started increasing N to 3 or more, sooner or later the sqlite ‘execute’ statement would throw an exception:
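The wrapper amounts to launching one process per month. A rough sketch of that idea (the real report script's name and arguments aren't shown in the post, so a tiny inline command stands in for it here):

```python
import subprocess
import sys

# One month string per job, "2013-01" through "2013-12".
months = ["2013-%02d" % m for m in range(1, 13)]

# Launch one child process per month. In the real wrapper this would
# invoke the actual report script, e.g. [sys.executable, "report.py", month];
# the inline -c command below is just a placeholder that echoes its argument.
procs = [
    subprocess.Popen(
        [sys.executable, "-c",
         "import sys; print('report for', sys.argv[1])", month]
    )
    for month in months
]

# Wait for all children and collect their exit codes.
exit_codes = [p.wait() for p in procs]
```

With all twelve jobs hitting the same 3GB database at once, each connection generates its own temporary data – which is what eventually exhausted /var/tmp.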
sqlite3.OperationalError: database or disk is full
but the disk the DB file is on is nowhere near full (unless you consider close to 700GB of free space as “full” in this context ;-). After doing some research, I found out that this happens because sqlite uses /var/tmp to store temporary data, and /var/tmp (this is a linux box) was running out of disk space. This makes sense: /var/tmp had about 7.5GB of free space before I started my script, and this free space would drop to below 1GB during my tests.
Further research showed that the sqlite pragma temp_store command lets me tune where temporary data is stored; according to the documentation, setting this value to 2 causes temporary data to be kept in memory. I first tried this using the commandline utility:
sqlite> pragma temp_store = 2;
but my tool still threw an exception. When I exited and then re-started sqlite, the value was back to 0 – temp_store turns out to be a per-connection setting, so setting it in the commandline shell has no effect on the connections my tool opens. I therefore put this piece of code into the session establishment function:
self.__cursor.execute("PRAGMA temp_store = 2")
This made the exception disappear.
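Put together, the session-establishment step looks roughly like this (a minimal sketch – the connection here uses an in-memory DB as a placeholder for the real file path, and since temp_store is per-connection, the pragma has to be issued on every new connection):

```python
import sqlite3

# Placeholder for the real database path.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()

# 2 = MEMORY: keep temporary tables and indices in RAM
# instead of files under /var/tmp.
cursor.execute("PRAGMA temp_store = 2")

# Querying the pragma without a value reads the current setting back.
setting = cursor.execute("PRAGMA temp_store").fetchone()[0]
```

One thing to keep an eye on: with N copies of the script running, the temporary data now lives in each process's memory rather than on disk, so memory use grows with N.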