Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Perl > Perl Misc > smarter zipcode search algorithm

Reply
Thread Tools

smarter zipcode search algorithm

 
 
premgrps@gmail.com
Guest
Posts: n/a
 
      07-05-2006
Hi,
I have a database with two tables
a) A table of 2 million records with city, zip and associated
information (say XYZ) and
b) zipcode latitude, longitude table having >40,000 records/zip codes

PROBLEM:
I need to find the the XYZs within the the range of a certain zipcode.
This zipcode and radial range in miles is entered by the user (web
interface).

The brute force way is to calculate the distance between the user
zipcode and all the zipcodes in the database. Once the zipcode_range
subroutine gives back the zipcodes within a certain radius, I need to
find all the XYZs from the table #1.

Another approach is to find the zipcodes with a square region (min/max
of the user zipcode latitude/longitude position).

Both the approaches are consuming too much time. Especially if the
radial distance starts increasing.

My questions:
1. Is there any other smart way to do the above task.
2. I am working on a 2.4ghz/512MB RAM machine. Any suggestions how to
increase the performance. Right now each select command to the
2Million record table takes about a minute.

Thanks.

 
Reply With Quote
 
 
 
 
Dr.Ruud
Guest
Posts: n/a
 
      07-05-2006
http://www.velocityreviews.com/forums/(E-Mail Removed) schreef:

> I have a database with two tables
> a) A table of 2 million records with city, zip and associated
> information (say XYZ) and
> b) zipcode latitude, longitude table having >40,000 records/zip codes
>
> PROBLEM:
> I need to find the the XYZs within the the range of a certain zipcode.
> This zipcode and radial range in miles is entered by the user (web
> interface).
>
> The brute force way is to calculate the distance between the user
> zipcode and all the zipcodes in the database. Once the zipcode_range
> subroutine gives back the zipcodes within a certain radius, I need to
> find all the XYZs from the table #1.
>
> Another approach is to find the zipcodes with a square region (min/max
> of the user zipcode latitude/longitude position).
>
> Both the approaches are consuming too much time. Especially if the
> radial distance starts increasing.
>
> My questions:
> 1. Is there any other smart way to do the above task.
> 2. I am working on a 2.4ghz/512MB RAM machine. Any suggestions how to
> increase the performance. Right now each select command to the
> 2Million record table takes about a minute.


This is not a Perl question.

Which RDBMS?

1. Make sure there are indexes on the lattitude and longitude columns of
the zipcode table. I assume that zipcode is the PK.

2. Pre-calculate the (min_lattitude, max_lattitude, min_longitude,
max_longitude) from the user input.

3. Fill a temporary zipcode table (again with the same indexes) with a
selection from the full zipcode table, with only the candidate zipcodes.
Be sure to have the zipcode column left from the comparison:
WHERE (zipcode.lattitude >= min_lattitude AND zipcode.lattitude <=
max_lattitude AND zipcode.longtitude >= min_longtitude AND
zipcode.longtitude <= max_longtitude).

4. If necessary, refine that result (the temporary zipcode table) by
deleting the records that aren't in the radial distance.

That temporary zipcode table can also be a query, on which you run the
second (radial distance) query.

For the city table, have an index on the zipcode, and join that with the
(refined) result.

--
Affijn, Ruud

"Gewoon is een tijger."


 
Reply With Quote
 
 
 
 
anno4000@radom.zrz.tu-berlin.de
Guest
Posts: n/a
 
      07-06-2006
<(E-Mail Removed)> wrote in comp.lang.perl.misc:
> Hi,
> I have a database with two tables
> a) A table of 2 million records with city, zip and associated
> information (say XYZ) and
> b) zipcode latitude, longitude table having >40,000 records/zip codes
>
> PROBLEM:
> I need to find the the XYZs within the the range of a certain zipcode.
> This zipcode and radial range in miles is entered by the user (web
> interface).


[...]

Your question has nothing to do with Perl (nor with zip codes, btw.).
It is a problem in computational geometry, and that is where I would
start looking for algorithms.

Anno
 
Reply With Quote
 
gmillerd@gmail.com
Guest
Posts: n/a
 
      07-06-2006
i wrote this many years ago for a singles website i made ... you may
find that just making a sql table with zipcode_to,zipcode_from,distance
is where you want to be. making a table for three digits and for five
digit zipcodes is a plus as well.

also you can do the lookups now with google map api i just noticed, way
to get banned with your 29000 zipcode entries

good luck

use strict;
use Math::Trig;

##
## racine wisconsin
##
my $a_lat = 42.13404;
my $a_long = -87.934097;

##
## wheeling illinois
##
my $b_lat = 42.716112;
my $b_long = -87.823329;

my $b_lat = 41.37372;
my $b_long = -93.376719;

################################################## ####################

print
gc(
$a_lat,
$a_long,
$b_lat,
$b_long),
"\n";

################################################## ####################

sub gc
{
my ($a_lat, $a_long, $b_lat, $b_long) = @_;

##
## convert degrees to radians
##
$a_long = deg2rad($a_long);
$b_long = deg2rad($b_long);
$a_lat = deg2rad($a_lat);
$b_lat = deg2rad($b_lat);

##
## meters in a mile
##
my $meterspermile = 1609.344;

##
## earth's radius
##
my $earth_radius = 6367000;
my $earth_radius = 6378.14 * 1000;

my $d = acos(sin($a_lat) * sin($b_lat) + cos($a_lat) * cos($b_lat)
* cos($a_long - $b_long));

$d = ($earth_radius * $d) / $meterspermile;

return($d);
}




http://www.velocityreviews.com/forums/(E-Mail Removed)-berlin.de wrote:
> <(E-Mail Removed)> wrote in comp.lang.perl.misc:
> > Hi,
> > I have a database with two tables
> > a) A table of 2 million records with city, zip and associated
> > information (say XYZ) and
> > b) zipcode latitude, longitude table having >40,000 records/zip codes
> >
> > PROBLEM:
> > I need to find the the XYZs within the the range of a certain zipcode.
> > This zipcode and radial range in miles is entered by the user (web
> > interface).

>
> [...]
>
> Your question has nothing to do with Perl (nor with zip codes, btw.).
> It is a problem in computational geometry, and that is where I would
> start looking for algorithms.
>
> Anno


 
Reply With Quote
 
gmillerd@gmail.com
Guest
Posts: n/a
 
      07-06-2006
i wrote this many years ago for a singles website i made ... you may
find that just making a sql table with zipcode_to,zipcode_from,distance
is where you want to be. making a table for three digits and for five
digit zipcodes is a plus as well.

also you can do the lookups now with google map api i just noticed, way
to get banned with your 29000 zipcode entries

good luck

use strict;
use Math::Trig;

##
## racine wisconsin
##
my $a_lat = 42.13404;
my $a_long = -87.934097;

##
## wheeling illinois
##
my $b_lat = 42.716112;
my $b_long = -87.823329;

my $b_lat = 41.37372;
my $b_long = -93.376719;

################################################## ####################

print
gc(
$a_lat,
$a_long,
$b_lat,
$b_long),
"\n";

################################################## ####################

sub gc
{
my ($a_lat, $a_long, $b_lat, $b_long) = @_;

##
## convert degrees to radians
##
$a_long = deg2rad($a_long);
$b_long = deg2rad($b_long);
$a_lat = deg2rad($a_lat);
$b_lat = deg2rad($b_lat);

##
## meters in a mile
##
my $meterspermile = 1609.344;

##
## earth's radius
##
my $earth_radius = 6367000;
my $earth_radius = 6378.14 * 1000;

my $d = acos(sin($a_lat) * sin($b_lat) + cos($a_lat) * cos($b_lat)
* cos($a_long - $b_long));

$d = ($earth_radius * $d) / $meterspermile;

return($d);
}




(E-Mail Removed)-berlin.de wrote:
> <(E-Mail Removed)> wrote in comp.lang.perl.misc:
> > Hi,
> > I have a database with two tables
> > a) A table of 2 million records with city, zip and associated
> > information (say XYZ) and
> > b) zipcode latitude, longitude table having >40,000 records/zip codes
> >
> > PROBLEM:
> > I need to find the the XYZs within the the range of a certain zipcode.
> > This zipcode and radial range in miles is entered by the user (web
> > interface).

>
> [...]
>
> Your question has nothing to do with Perl (nor with zip codes, btw.).
> It is a problem in computational geometry, and that is where I would
> start looking for algorithms.
>
> Anno


 
Reply With Quote
 
xhoster@gmail.com
Guest
Posts: n/a
 
      07-06-2006
(E-Mail Removed) wrote:
> Hi,
> I have a database with two tables
> a) A table of 2 million records with city, zip and associated
> information (say XYZ) and
> b) zipcode latitude, longitude table having >40,000 records/zip codes


40,001 and 10**89 are both greater than 40,000. I assuming you mean ~40,
000 rather than >40,000, otherwise it is meaningless.

Is that ~40,000 records where each record is a zipcode?
Or is it ~40,000 records per zip code?


> PROBLEM:
> I need to find the the XYZs within the the range of a certain zipcode.
> This zipcode and radial range in miles is entered by the user (web
> interface).


Are you assuming zip codes to be points?


> The brute force way is to calculate the distance between the user
> zipcode and all the zipcodes in the database. Once the zipcode_range
> subroutine gives back the zipcodes within a certain radius, I need to
> find all the XYZs from the table #1.


OK. You could probably even build a table giving the distances between
every pair of zip codes, or at least every pair within 300 miles or so
of each other.

> Another approach is to find the zipcodes with a square region (min/max
> of the user zipcode latitude/longitude position).


Yep, that is another approach. You could construct a square around the
circle, use a data structure to get everything in the square, then brute
force just those things in the square to see if they are also within the
circle. (neglecting the curvature of the earth.)

> Both the approaches are consuming too much time. Especially if the
> radial distance starts increasing.


Starts increasing? You make it sound like an organic creature. Isn't it
a simple number provided by the end user?

Anyway, what is taking a long time? Getting a list of zip codes within
X miles of the reference zip code? Turning that list into a list of all
the qualifying XYZ?

>
> My questions:
> 1. Is there any other smart way to do the above task.


We only have the vaguest idea of how you are doing it now. You can't make
tuning decisions based on vague notions.

> 2. I am working on a 2.4ghz/512MB RAM machine. Any suggestions how to
> increase the performance. Right now each select command to the
> 2Million record table takes about a minute.


What database system are you using? What query are you issuing to it?
What indices exist? What does this have to do with Perl?

If you are using Perl as your database system, have you looked at Tree::R?

Xho

--
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service $9.95/Month 30GB
 
Reply With Quote
 
PGPS
Guest
Posts: n/a
 
      07-11-2006
Thanks for all your answers. I finally got the problem figured out
myself with a subquery. I'm new to databases though.

Thanks.

(E-Mail Removed) wrote:
> (E-Mail Removed) wrote:
> > Hi,
> > I have a database with two tables
> > a) A table of 2 million records with city, zip and associated
> > information (say XYZ) and
> > b) zipcode latitude, longitude table having >40,000 records/zip codes

>
> 40,001 and 10**89 are both greater than 40,000. I assuming you mean ~40,
> 000 rather than >40,000, otherwise it is meaningless.
>
> Is that ~40,000 records where each record is a zipcode?
> Or is it ~40,000 records per zip code?
>
>
> > PROBLEM:
> > I need to find the the XYZs within the the range of a certain zipcode.
> > This zipcode and radial range in miles is entered by the user (web
> > interface).

>
> Are you assuming zip codes to be points?
>
>
> > The brute force way is to calculate the distance between the user
> > zipcode and all the zipcodes in the database. Once the zipcode_range
> > subroutine gives back the zipcodes within a certain radius, I need to
> > find all the XYZs from the table #1.

>
> OK. You could probably even build a table giving the distances between
> every pair of zip codes, or at least every pair within 300 miles or so
> of each other.
>
> > Another approach is to find the zipcodes with a square region (min/max
> > of the user zipcode latitude/longitude position).

>
> Yep, that is another approach. You could construct a square around the
> circle, use a data structure to get everything in the square, then brute
> force just those things in the square to see if they are also within the
> circle. (neglecting the curvature of the earth.)
>
> > Both the approaches are consuming too much time. Especially if the
> > radial distance starts increasing.

>
> Starts increasing? You make it sound like an organic creature. Isn't it
> a simple number provided by the end user?
>
> Anyway, what is taking a long time? Getting a list of zip codes within
> X miles of the reference zip code? Turning that list into a list of all
> the qualifying XYZ?
>
> >
> > My questions:
> > 1. Is there any other smart way to do the above task.

>
> We only have the vaguest idea of how you are doing it now. You can't make
> tuning decisions based on vague notions.
>
> > 2. I am working on a 2.4ghz/512MB RAM machine. Any suggestions how to
> > increase the performance. Right now each select command to the
> > 2Million record table takes about a minute.

>
> What database system are you using? What query are you issuing to it?
> What indices exist? What does this have to do with Perl?
>
> If you are using Perl as your database system, have you looked at Tree::R?
>
> Xho
>
> --
> -------------------- http://NewsReader.Com/ --------------------
> Usenet Newsgroup Service $9.95/Month 30GB


 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
[ANN] FastRI 0.2.0: full-text searching, smarter search strategies Mauricio Fernandez Ruby 7 11-21-2006 04:17 PM
[YO] Free Swingers Search, See Pics, Read Stories, Search Via Zipcode! Swin4Real Mon Computer Support 0 11-23-2004 03:20 AM
[OT] Free Swingers Search, See Pics, Read Stories, Search Via Zipcode! Joe Deluga Computer Support 0 11-17-2004 05:19 AM
ZIPCODE DB ACCESS TO SQL DaveF ASP .Net 6 09-29-2004 11:44 PM
Smarter search???? BillKirch Computer Information 3 12-28-2003 09:12 PM



Advertisments