How to speed up this slow part of my program

 
 
Justin C
 
      03-30-2012
On 2012-03-28, Ben Morrow wrote:
>
> Quoth Justin C:
>>
>> We have a database of thousands of clothing items. Some of the items are
>> almost identical apart from their size. Consequently we use the same
>> image in our web-shop to advertise items of the same style, design and
>> colour.
>>
>> In a program I have to get new images from the art guy's computer I end
>> up grepping the entire list of items $#(list-of-items) times, there must
>> be a better way. The file names are exactly the same as the style codes
>> apart from the size suffix being dropped. I'm using File::Find.
>>
>> Here's some code:
>>
>> find(\&set_flag, (keys %{ $stock_groups->{text2code} }));
>>
>> sub set_flag {
>> return unless (-f $_ );
>>
>> (my $item_code_part = $_) =~ s/\.jpg//;
>> $item_code_part = uc($item_code_part);
>> $item_code_part =~ s|_|/|g;
>>
>> my @matches = grep(/$item_code_part/, keys %{ $stock_items });

>
> Careful: you want \Q there, even if you think you're sure the filenames
> are all safe.


Done that, thank you.
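
A minimal sketch of why \Q matters here. The item code and stock keys below are made up for illustration; they are not taken from the real data. Without \Q, characters such as "." and "+" in a file name are treated as regex operators, so the grep can match items it should not.

```perl
use strict;
use warnings;

# Hypothetical code part containing regex metacharacters.
my $item_code_part = 'AB.12+';

# Hypothetical stock item keys.
my @keys = ('AB.12+/XL', 'ABX122/XL');

# Unquoted: "." matches any character and "2+" means "one or more 2s",
# so both keys match.
my @unquoted = grep { /$item_code_part/ } @keys;

# Quoted: \Q...\E makes every character literal, so only the key that
# really contains the string "AB.12+" matches.
my @quoted = grep { /\Q$item_code_part\E/ } @keys;

print "unquoted: @unquoted\n";
print "quoted:   @quoted\n";
```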


>> foreach my $i (@matches) {
>> $stock_items->{$i}{got_image} = 1;
>> }
>> }

>
> I would probably turn this into a big pattern match. Something like
> this:
>
> use File::Find::Rule;
>
> my ($imgs) = map qr/$_/, join "|", map "\Q\U$_",
> map { (my ($x) = /(.*)\.jpg/) =~ s!_!/!g; $x }
> File::Find::Rule->file->in(keys %{...});
>
> while (my ($item, $entry) = each %$stock_items) {
> $item =~ $imgs and $entry->{got_image} = 1;
> }


That takes a bit of understanding, I'll have a read of the docs.
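
The suggestion above can be unpacked into separate steps. This is only a sketch under assumptions: the code parts and stock item keys are invented, the code parts are taken as already derived from the file names, and the only field of the item records used is got_image. The idea is to build one alternation pattern from all code parts, compile it once, and then make a single pass over the items instead of one grep per image.

```perl
use strict;
use warnings;

# Hypothetical code parts, assumed already derived from the image names.
my @code_parts = ('AB/100/RED', 'CD/200/BLU');

# Hypothetical stock items; only the got_image flag comes from the thread.
my $stock_items = {
    'AB/100/RED/S'  => {},
    'AB/100/RED/XL' => {},
    'CD/200/BLU/M'  => {},
    'EF/300/GRN/L'  => {},
};

# Guard against an empty list: an empty alternation would match every item.
if (@code_parts) {
    # Quote each part so it is matched literally, join with "|",
    # and compile the result once.
    my $alternation = join '|', map { quotemeta($_) } @code_parts;
    my $imgs        = qr/$alternation/;

    # One pass over the items, one match per item.
    while (my ($item, $entry) = each %$stock_items) {
        $entry->{got_image} = 1 if $item =~ $imgs;
    }
}

for my $item (sort keys %$stock_items) {
    printf "%-14s %s\n", $item,
        $stock_items->{$item}{got_image} ? 'image' : 'no image';
}
```

As far as I know, perl 5.10 and later can optimise an alternation of literal strings internally, which is part of why a single large pattern tends to beat thousands of separate greps. It is worth measuring on the real data before relying on that.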


> If you're using 5.14 you can get rid of the ugly map block using s///r
> and tr///r:
>
> map tr!_!/!r, map s/\.jpg//r,


Still on 5.10, we follow Debian 'stable' and so won't be upgrading for a
while.
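
On 5.10 the same conversion can be written with the copy-then-modify idiom instead of s///r and tr///r. A minimal sketch: the helper name code_part_from_filename and the file names are hypothetical, and anchoring the extension with \z and /i is a tightening that the original code does not have.

```perl
use strict;
use warnings;

# Convert an image file name into the code part used in the item keys.
# Works on 5.10: copy the name first, then modify the copy.
sub code_part_from_filename {
    my ($name) = @_;
    (my $code = $name) =~ s/\.jpg\z//i;   # drop the extension
    $code = uc $code;                     # codes are upper case
    $code =~ tr{_}{/};                    # "_" in file names stands for "/"
    return $code;
}

# Hypothetical file names.
my @files      = ('ab_100_red.jpg', 'cd_200_blu.jpg');
my @code_parts = map { code_part_from_filename($_) } @files;

print "$_\n" for @code_parts;
```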

Thanks for the suggestions.

Justin.

--
Justin C, by the sea.
 
 
 
 
 
Justin C
 
      03-30-2012
On 2012-03-29, J. Gleixner <glex_no-> wrote:
> On 03/28/12 10:24, Justin C wrote:
>> We have a database of thousands of clothing items. Some of the items are
>> almost identical apart from their size. Consequently we use the same
>> image in our web-shop to advertise items of the same style, design and
>> colour.
>>[...]
>> The bottle-neck, as I see it, is running grep 20k times, once for each
>> image found. Can anyone suggest a better way?

>
> Since you already have data in a DB, I'd suggest looking at
> associating these files, to the items, in the database.
>
> Maybe store the path to the file, or possibly the image as a BLOB, a
> many to one relationship.


Unfortunately it's not a DB I'm authorised to write to. I can run
queries that extract data, but any changes to the data must be done
through the 3rd party supplied interface.


Justin.

--
Justin C, by the sea.
 
 
 
 
 
Dr.Ruud
 
      03-30-2012
On 2012-03-28 17:24, Justin C wrote:

> We have a database of thousands of clothing items. Some of the items are
> almost identical apart from their size. Consequently we use the same
> image in our web-shop to advertise items of the same style, design and
> colour.


Are these (hard- or soft-) linked to a single file? If so, then you can
use that attribute.

--
Ruud
 
 
J. Gleixner
 
      03-30-2012
On 03/30/12 08:05, Justin C wrote:
> On 2012-03-29, J. Gleixner<glex_no-> wrote:
>> On 03/28/12 10:24, Justin C wrote:
>>> We have a database of thousands of clothing items. Some of the items are
>>> almost identical apart from their size. Consequently we use the same
>>> image in our web-shop to advertise items of the same style, design and
>>> colour.
>>> [...]
>>> The bottle-neck, as I see it, is running grep 20k times, once for each
>>> image found. Can anyone suggest a better way?

>>
>> Since you already have data in a DB, I'd suggest looking at
>> associating these files, to the items, in the database.
>>
>> Maybe store the path to the file, or possibly the image as a BLOB, a
>> many to one relationship.

>
> Unfortunately it's not a DB I'm authorised to write to. I can run
> queries that extract data, but any changes to the data must be done
> through the 3rd party supplied interface.


OK. Then put it in your own database. MySQL, Postgres, SQLite, even a
DBM file might work.
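
A minimal sketch of the DBM option, assuming a private file that maps item keys to image file names. It uses SDBM_File because it ships with perl; the path, keys and file names are hypothetical, and a temporary directory stands in for wherever the file would really live.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDWR O_CREAT);
use SDBM_File;
use File::Temp qw(tempdir);

# Hypothetical location for the private mapping file.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/image_map";

# Create the DBM file and record which image belongs to which item.
tie my %image_for, 'SDBM_File', $path, O_RDWR | O_CREAT, 0666
    or die "Cannot open DBM file $path: $!";
$image_for{'AB/100/RED/S'}  = 'ab_100_red.jpg';
$image_for{'AB/100/RED/XL'} = 'ab_100_red.jpg';
untie %image_for;

# A later run reopens the file and looks items up without rescanning.
tie my %again, 'SDBM_File', $path, O_RDWR, 0666
    or die "Cannot reopen DBM file $path: $!";
my $found   = $again{'AB/100/RED/XL'};
my $missing = exists $again{'EF/300/GRN/L'} ? 1 : 0;
untie %again;

print "XL uses $found\n";
print "EF/300/GRN/L has ", ($missing ? 'an' : 'no'), " image recorded\n";
```

SDBM limits the combined size of a key and its value to about 1 KB, which is enough for one file name per item but not for long lists; DB_File or SQLite would be the next step if that becomes a problem.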
 
 
Rainer Weikusat
 
      03-30-2012
"J. Gleixner" <glex_no-> writes:
> On 03/30/12 08:05, Justin C wrote:
>> On 2012-03-29, J. Gleixner<glex_no-> wrote:
>>> On 03/28/12 10:24, Justin C wrote:
>>>> We have a database of thousands of clothing items. Some of the items are
>>>> almost identical apart from their size. Consequently we use the same
>>>> image in our web-shop to advertise items of the same style, design and
>>>> colour.
>>>> [...]
>>>> The bottle-neck, as I see it, is running grep 20k times, once for each
>>>> image found. Can anyone suggest a better way?
>>>
>>> Since you already have data in a DB, I'd suggest looking at
>>> associating these files, to the items, in the database.
>>>
>>> Maybe store the path to the file, or possibly the image as a BLOB, a
>>> many to one relationship.

>>
>> Unfortunately it's not a DB I'm authorised to write to. I can run
>> queries that extract data, but any changes to the data must be done
>> through the 3rd party supplied interface.

>
> OK. Then put it in your own database. MySQL, Postgres, SQLite, even a
> DBM file might work.


This is a 'simple' key <-> value mapping, only complicated by the fact
that keys can have multiple values associated with them. Provided
that the code for a particular stock item can be calculated, the
simple first attempt at a solution would still be to use a Perl hash
mapping codes to sets of stock item keys. This hash can be generated
once in advance and then be used for any number of 'fast' lookups. A
more elaborate design would be to use a flat-file hashed database ('DBM
file') to store the mappings, update that whenever the set of stock
items changes and use it to associate presently existing images with
stock items without recalculating the mappings. An even better idea
would be to use a persistent database mapping stock items to image
files and update this database whenever the set of image files
changes.
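
The first of these options, a Perl hash mapping codes to sets of stock item keys, could look roughly like this. It is a sketch under a loud assumption: that an item key is the style code followed by "/SIZE", so the code can be calculated by dropping the last "/"-separated segment. The helper code_for_item and all keys are hypothetical; the real rule for deriving the code has to come from the actual data.

```perl
use strict;
use warnings;

# Hypothetical stock items, keyed by style code plus "/SIZE".
my $stock_items = {
    'AB/100/RED/S'  => {},
    'AB/100/RED/XL' => {},
    'CD/200/BLU/M'  => {},
};

# Assumption: the code is the item key without its final "/SIZE" part.
sub code_for_item {
    my ($item) = @_;
    (my $code = $item) =~ s{/[^/]+\z}{};
    return $code;
}

# Build the mapping once: code => [ item keys sharing that code ].
my %items_for_code;
for my $item (keys %$stock_items) {
    push @{ $items_for_code{ code_for_item($item) } }, $item;
}

# Each image then costs one hash lookup instead of a grep over all items.
# The codes here stand in for those derived from the image file names.
for my $code ('AB/100/RED', 'ZZ/999/NON') {
    my $items = $items_for_code{$code} or next;   # no item uses this image
    $stock_items->{$_}{got_image} = 1 for @$items;
}

for my $item (sort keys %$stock_items) {
    printf "%-14s %s\n", $item,
        $stock_items->{$item}{got_image} ? 'image' : 'no image';
}
```

Unlike the substring grep, this only finds exact code matches, so it depends on the code really being calculable from the item key.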

 
 
 
 