On Sun, 05 Nov 2006 12:00:07 +0100, Jaap <> declaimed
the following in comp.lang.python:
>
> Apart from this I have a configuration file, which contains the list of
> itemID's i need to focus on per month. Not all itemID's are relevant for
> each month, but for example only every second or third month. All
> records in the logfile with other itemID's can be ignored. I have yet to
> define the format of this configuration file, but am thinking about a 0
> or 1 for each month, and then the itemID, like:
> "1 0 0 1 0 0 1 0 0 1 0 0 123456" for a itemID 123456 which only needs
> consideration at first month of each quarter.
>
Personally -- I'd put the ID first... Maybe even as a INI file
style:
123456: 1, 0, 0, 1, ...
and use the config parser to read those... Question though:
"consideration at first month of each quarter" => does that mean only
process that month, or process the entire quarter just in that month?
> My question to this forum is: which data structure would you propose?
> The logfile is not very big (about 200k max, average 200k) so I assume I
> can store in internal memory/list?
>
SQLite in-memory database... Should be relatively easy to then
select/group the data with SQL statements. Even the summation might be
coded directly into the SQL. If the processing configuration (above)
doesn't change often, go to an SQLite physical file database, and store
the ID/Month pairs that need processing -- then a subselect to get the
months per ID could be used to select the ID records for summation.
--
Wulfraed Dennis Lee Bieber KD6MOG
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff:
web-)
HTTP://www.bestiaria.com/