Re: best way to handle this in Python

 
 
Dennis Lee Bieber
      07-20-2012
{NOTE: preferences for comp.lang.python are to follow the RFC on
"netiquette" -- that is, post comments /under/ quoted material, trimming
what is not relevant... I've restructured this reply to match}

On Thu, 19 Jul 2012 21:28:12 -0400, Rita <(E-Mail Removed)>
declaimed the following in gmane.comp.python.general:

>
>
> On Thu, Jul 19, 2012 at 8:52 PM, Dave Angel <(E-Mail Removed)> wrote:
>
> > On 07/19/2012 07:51 PM, Rita wrote:
> > > Hello,
> > >
> > > I have data in many files (/data/year/month/day/) which are named like
> > > YearMonthDayHourMinute.gz.
> > >
> > > I would like to build a data structure which can easily handle querying
> > > the data. So for example, if I want to query data from 3 weeks ago till
> > > today, I can do it rather quickly.
> > >
> > > Each YearMonthDayHourMinute.gz file looks like this, and they are about
> > > 4 to 6 KB:
> > > red 34
> > > green 44
> > > blue 88
> > > orange 4
> > > black 3
> > > while 153
> > >
> > > I would like to query them so I can generate a plot rather quickly but
> > > not sure what is the best way to do this.
> > >
> >
> > What part of your code is giving you difficulty? You didn't post any
> > code. You don't specify the OS, nor version of your Python, nor what
> > other programs you expect to use along with Python.
> >
> Using Linux 2.6.31; Python 2.7.3.
> I am not necessarily looking for code, just a Pythonic way of doing it.
> Eventually, I would like to graph the data using matplotlib.
>

Which doesn't really answer the question. After all, since the
source data is already in date/time-stamped files, a simple, sorted,
"glob" of files within a desired span would answer the requirement.

But -- it would mean that you reparse the files for each processing
run.
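
A minimal sketch of the sorted-glob idea, assuming the directories are
zero-padded as /data/YYYY/MM/DD/ and the files are named YYYYMMDDHHMM.gz
(the original post does not spell out the exact padding). The function
name files_in_span is made up for illustration.

```python
import glob
import os
from datetime import datetime, timedelta


def files_in_span(root, start, end):
    """Return sorted (timestamp, path) pairs with start <= timestamp <= end."""
    found = []
    day = start.date()
    while day <= end.date():
        # Assumed layout: root/YYYY/MM/DD/YYYYMMDDHHMM.gz
        day_dir = os.path.join(root, day.strftime("%Y"),
                               day.strftime("%m"), day.strftime("%d"))
        for path in glob.glob(os.path.join(day_dir, "*.gz")):
            name = os.path.basename(path)[:-len(".gz")]
            try:
                stamp = datetime.strptime(name, "%Y%m%d%H%M")
            except ValueError:
                continue  # skip files that do not follow the assumed naming
            if start <= stamp <= end:
                found.append((stamp, path))
        day += timedelta(days=1)
    return sorted(found)
```

For "3 weeks ago till today" the call would be something like
files_in_span("/data", datetime.now() - timedelta(weeks=3), datetime.now()).
Only the day directories inside the span are globbed, so the cost scales
with the span rather than with the whole archive.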

An alternative would be to run a pre-processor that parses the files
into, say, an SQLite3 database (and which can determine, from the
highest datetime entry in the database, which /new/ files need to be
parsed on subsequent runs). Then do the query/plotting from a second
program which retrieves data from the database.
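
One way the pre-processor could look, as a rough sketch only. The table
name "samples", its columns, and all function names here are assumptions
made for the example, as is the guess that every data line is "name
integer". It is written for Python 3; on Python 2.7 the gzip.open mode
would need to be "rb" instead of "rt".

```python
import gzip
import sqlite3
from datetime import datetime

STAMP_FORMAT = "%Y-%m-%d %H:%M"  # sorts correctly as plain text


def parse_counts(path):
    """Parse one gzipped file of 'name value' lines into a dict."""
    counts = {}
    with gzip.open(path, "rt") as handle:
        for line in handle:
            fields = line.split()
            if len(fields) == 2:
                counts[fields[0]] = int(fields[1])
    return counts


def open_db(db_path):
    """Open the database and create the (hypothetical) samples table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        " stamp TEXT NOT NULL, color TEXT NOT NULL, value INTEGER NOT NULL,"
        " PRIMARY KEY (stamp, color))"
    )
    return conn


def load_new_files(conn, stamped_paths):
    """Parse only files newer than the highest stamp already stored.

    stamped_paths is an iterable of (datetime, path) pairs.
    Returns the number of files parsed.
    """
    newest = conn.execute("SELECT MAX(stamp) FROM samples").fetchone()[0]
    loaded = 0
    for stamp, path in sorted(stamped_paths):
        key = stamp.strftime(STAMP_FORMAT)
        if newest is not None and key <= newest:
            continue  # already in the database
        rows = [(key, color, value)
                for color, value in parse_counts(path).items()]
        with conn:  # one transaction per file
            conn.executemany("INSERT INTO samples VALUES (?, ?, ?)", rows)
        loaded += 1
    return loaded


def query_span(conn, start, end):
    """Return (stamp, color, value) rows between two datetimes, inclusive."""
    return conn.execute(
        "SELECT stamp, color, value FROM samples"
        " WHERE stamp BETWEEN ? AND ? ORDER BY stamp, color",
        (start.strftime(STAMP_FORMAT), end.strftime(STAMP_FORMAT)),
    ).fetchall()
```

The plotting program then only needs query_span; it never touches the
.gz files. Storing the stamp as "YYYY-MM-DD HH:MM" text keeps MAX() and
BETWEEN working with ordinary string comparison.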

But if this is a process that only needs to be run once, or at rare
intervals, maybe you only need to parse the files into an in-memory data
structure... Say a list of tuples of the form:

    [ (datetime, {color: value, color2: value2, ...}),
      (datetime2, ...) ]
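
A sketch of building that list and pulling one color out of it for a
date span. The helper names (parse_counts, build_records, series_for)
are invented for the example, and it assumes each data line is "name
integer". Written for Python 3; on Python 2.7 use mode "rb" in
gzip.open.

```python
import bisect
import gzip
from datetime import datetime


def parse_counts(path):
    """Parse one gzipped file of 'name value' lines into a dict."""
    counts = {}
    with gzip.open(path, "rt") as handle:
        for line in handle:
            fields = line.split()
            if len(fields) == 2:
                counts[fields[0]] = int(fields[1])
    return counts


def build_records(stamped_paths):
    """Turn (datetime, path) pairs into a sorted [(datetime, {color: value})]."""
    return [(stamp, parse_counts(path)) for stamp, path in sorted(stamped_paths)]


def series_for(records, color, start, end):
    """Return (timestamps, values) for one color with start <= stamp <= end.

    records must be sorted by timestamp; a color missing from a file
    is reported as 0.
    """
    stamps = [stamp for stamp, _ in records]
    lo = bisect.bisect_left(stamps, start)
    hi = bisect.bisect_right(stamps, end)
    window = records[lo:hi]
    xs = [stamp for stamp, _ in window]
    ys = [counts.get(color, 0) for _, counts in window]
    return xs, ys
```

The two lists from series_for can be handed straight to matplotlib,
e.g. pyplot.plot(xs, ys), since matplotlib accepts datetime objects on
the x axis. Because the list is sorted, bisect finds the span without
scanning every record.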

--
Wulfraed Dennis Lee Bieber AF6VN
(E-Mail Removed) HTTP://wlfraed.home.netcom.com/

 