Velocity Reviews > Re: Eliminate "extra" variable

# Re: Eliminate "extra" variable

Tim Chase
Guest
Posts: n/a

 12-08-2013
On 2013-12-07 23:14, Igor Korot wrote:
> def MyFunc(self, originalData):
> self.dates = []
> data = {}
> dateStrs = []
> for i in xrange(0, len(originalData)):
> dateStr, freq, source = originalData[i]
> data[str(dateStr)] = {source: freq}
> dateStrs.append(dateStr)
> for i in xrange(0, len(dateStrs) - 1):
> currDateStr = str(dateStrs[i])
> nextDateStr = str(dateStrs[i + 1])
> self.dates.append(currDateStr)
> currDate = datetime.datetime.strptime(currDateStr,
> '%Y-%m-%d') nextDate = datetime.datetime.strptime(nextDateStr,
> '%Y-%m-%d') if nextDate - curDate < datetime.timedelta(days=31):
> d = currDate + datetime.timedelta(days=1)
> while d < nextDate:
> self.dates.append(d.strftime('%Y-%m-%d'))
> d = d + datetime.timedelta(days=1)
> lastDateStr = dateStrs[-1]
> self.dates.append(str(lastDateStr))
> return data

It would help to know what you want this function to accomplish:
"MyFunc" isn't exactly descriptive. From what I gather by reading
it, you want it to do two things:

- append each date in the range from originalData[0] through
originalData[-1] to self.dates every time this function is called
(which means that multiple calls to this will grow self.dates
every time)

- if there's less than 31 days between N and N+1, also append all
the dates in between (this seems weird, but okay). Again, every
time this function is called.

- return a dictionary that maps dates in the input-data to the
associated source:freq dictionary.

It's hard to tell what you intend to do with these results. If you
just intend to iterate over them once, asking for associated data,
you could even create a generator that yields the date along with
either None or the associated data. See below for that.
Alternatively, you can return both the dict-mapping and the date-list
from the function:

def f(...):
return (the_dict, the_list)

a_dict, a_list = f(...)

That would prevent repeated mutation of self.dates

> As you can see there is many conversion going on and there's
> unneeded dateStrs which is used just to loop thru the dictionary
> and get the 2 consecutive keys.

The final snippet of code that I provided handles this pretty nicely
by zipping up the staggered lists and iterating over them while
unpacking them into sensible variable names. Unless you have a need
to operate on the dates as string, I'd just keep them as dates
throughout the code and only turn them into strings upon output.

> But if you see a better way - please share.

I'd likely incorporate Peter's sliding_window() suggestion and do
something like the following (I commented out using a tuple for the
values, but using a tuple/namedtuple might make more sense)

import itertools
def sliding_window(i):
a, b = itertools.tee(i)
next(b)
return itertools.izip(a, b)

def some_descriptive_function_name(self, original_data):
# construct these once-per-call rather than every loop
# or even move them out to module-scope
ONE_DAY = datetime.timedelta(days=1)
MONTHISH = datetime.timedelta(days=31)
for (
(cur_dt, cur_freq, cur_source),
(next_dt, next_freq, next_source)
) in sliding_window(original_data):
info = {cur_source: cur_freq}
# info = (cur_source, cur_freq)
yield cur_dt, info
if next_dt - cur_dt < MONTHISH:
d = cur_dt + ONE_DAY
while d < next_dt:
yield d, None
d += ONE_DAY
info = {next_source: next_freq}
# info = (next_source, next_freq)
yield next_dt, info

which can then be used with

for dt, info in self.some_descriptive_function_name(data):
# dates should be returned in the same sequence
if info is None:
do_something_when_no_info(dt)
else:
do_something_with_dt_and_info(dt, info)

If you need to iterate over the data multiple times, you can just do

tmp = list(self.some_descriptive_function_name(data))
do_first(tmp)
do_second(tmp)

-tkc