Velocity Reviews - Computer Hardware Reviews

URL detection follow-up

 
 
"Dandy" Randy
      09-10-2003
Hi,

As per my previous posts, I am searching for a way to open a text file that
contains a few paragraphs of text, locate web URLs and wrap them in the
needed HTML tags such as <a href="..."> and </a>. Most of the responses
suggested using a Perl module ... URI::Find and similar methods.
Unfortunately, the hosting company I run my scripts from does not have this
module installed, and they are not prepared to install it just for my needs.
I also cannot change hosting companies.

So ... is there ANY other way my goal can be accomplished given that I cannot
use URI::Find? Here is an example of theoretical code:

#!/usr/bin/perl
use strict;
use warnings;

# get text file data
open(my $in, '<', 'data/data.txt') or die "Can't open file: $!";
my @data = <$in>;
close($in);

# find web URLs and replace occurrences with needed HTML tags
# (placeholder: this is the step I don't know how to write)
# scan @data > replace;

# write the changed data back to the text file
open(my $out, '>', 'data/data.txt') or die "Can't open file: $!";
print $out @data;
close($out);

Please, it is very important to me to find a solution. If you have any
ideas, post back. Working code examples are very welcome. Thanks everyone!

Randy


 
A. Sinan Unur
      09-10-2003
"Dandy" Randy wrote:

> methods. Unfortunately, the hosting company I run my scripts from does
> not have this module installed, and they are not prepared to install
> it just for my needs. I also cannot change hosting companies.


perldoc -q lib
Found in C:\Perl\lib\pod\perlfaq8.pod

How do I keep my own module/library directory?

When you build modules, use the PREFIX option when generating
Makefiles:

perl Makefile.PL PREFIX=/u/mydir/perl

then either set the PERL5LIB environment variable before you run
scripts that use the modules/libraries (see perlrun) or say

use lib '/u/mydir/perl';

This is almost the same as

BEGIN {
    unshift(@INC, '/u/mydir/perl');
}

except that the lib module checks for machine-dependent
subdirectories. See Perl's lib for more information.
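
For anyone who wants to see the mechanism in isolation, here is a minimal
sketch of how putting a directory on @INC makes a module loadable. Everything
in it is made up for the demo: the temporary directory stands in for
/u/mydir/perl, and My::Hello is a throwaway module, not a real one. It uses
require lib plus lib->import, the run-time form of use lib, only because the
directory name is not known until the script runs.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);
use File::Spec;

# Stand-in for a private library directory such as /u/mydir/perl.
my $libdir = tempdir(CLEANUP => 1);

# Write a tiny throwaway module, My::Hello, into that directory.
make_path(File::Spec->catdir($libdir, 'My'));
my $pm = File::Spec->catfile($libdir, 'My', 'Hello.pm');
open(my $fh, '>', $pm) or die "Can't write $pm: $!";
print {$fh} "package My::Hello;\n",
            "sub greet { return 'loaded from a private lib' }\n",
            "1;\n";
close($fh) or die "Can't close $pm: $!";

# Run-time equivalent of "use lib $libdir": the directory goes to the
# front of @INC, so require can find My/Hello.pm there.
require lib;
lib->import($libdir);

require My::Hello;
my $greeting = My::Hello::greet();
print "$greeting\n";
```

In a real script the path is fixed, so the compile-time form from the FAQ is
all that is needed: use lib '/u/mydir/perl'; followed by use URI::Find;. That
assumes URI::Find and whatever it depends on have been installed or copied
under that directory.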


--
A. Sinan Unur
 
Brian Wakem
      09-10-2003
"Dandy" Randy wrote:
> So ... is there ANY other way my goal can be accomplished given I cannot
> use URI::Find? Here is an example of theoretical code:
>
> # find web URLs and replace occurrences with needed HTML tags
> scan @list > replace;



If they are all like http://www.domain.com/dir/file.html then you could do
something like -


foreach (@data) {
    s!(http://.*?)(?:\s|$)!<a href="$1">$1</a>!gi;
}

Not perfect, but it'll get you started.
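
To make "not perfect" concrete, here is that same substitution run on two
made-up lines. The sample text is an assumption for illustration; only the
s!!! line comes from the post.

```perl
use strict;
use warnings;

# Made-up sample lines; real input would come from data/data.txt.
my @data = (
    "See http://www.domain.com/dir/file.html for details.\n",
    "Home page: http://www.domain.com/\n",
);

foreach (@data) {
    # $1 is everything from http:// up to the next whitespace character
    # or the end of the string. The whitespace itself is part of the
    # match, so it is replaced along with the URL.
    s!(http://.*?)(?:\s|$)!<a href="$1">$1</a>!gi;
}

print @data;
```

Both URLs come out wrapped in <a href="..."> tags, but the character that
ended each match is gone: the first line now reads "...</a>for details." and
the second line has lost its newline, because \s matches "\n" as well.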

--
Brian Wakem


 
"Dandy" Randy
      09-10-2003
Awesome ... works great ... now ... can you formulate a replacement command
that will take an email address and add the <a href="mailto:..."> tags so
that email addresses will also become linked? You've been a great help!

Randy

"Brian Wakem" wrote:
> If they are all like http://www.domain.com/dir/file.html then you could do
> something like -
>
> foreach(@data) {
> s!(http://.*?)(?:\s|$)!<a href="$1">$1</a>!gi;
> }
>
> Not perfect, but it'll get you started.
 
Brian Wakem
      09-10-2003
"Dandy" Randy wrote:
> Awesome ... works great ... now ... can you formulate a replacement command
> that will take an email address and add the <a href="mailto:..."> tags so
> that email addresses will also become linked? You've been a great help!
>
> Randy


Nice example of top posting.

To match email addresses perfectly every time is probably impossible, but a
simple and effective way of matching 99%+ would be:-

foreach (@data) {
    s!([-\w.]+\@[-\w.]+)!<a href="mailto:$1">$1</a>!g;
}
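
Here is the same substitution on two made-up lines, to show where the missing
cases come from. The sample text is an assumption for illustration; the
"left-pad@1.3.0" string is just something that looks like an address to this
pattern without being one.

```perl
use strict;
use warnings;

# Made-up sample text. Single quotes stop Perl from treating @example
# and friends as arrays to interpolate.
my @data = (
    'Write to bob@example.com or alice-smith@mail.example.org.',
    'Install left-pad@1.3.0 first',
);

foreach (@data) {
    # One or more of [-\w.] on each side of a literal @.
    s!([-\w.]+\@[-\w.]+)!<a href="mailto:$1">$1</a>!g;
}

print "$_\n" for @data;
```

The first address is linked cleanly. The second picks up the sentence's full
stop, because "." is in the character class, so its href ends in
"example.org." with the dot included. On the second line "left-pad@1.3.0" is
linked as a mailto: even though it is not an email address at all.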

--
Brian Wakem


 
"Dandy" Randy
      09-10-2003
"Brian Wakem" wrote:

> To match email addresses perfectly every time is probably impossible, but a
> simple and effective way of matching 99%+ would be:-
>
> foreach(@data) {
> s!([-\w.]+\@[-\w.]+)!<a href="mailto:$1">$1</a>!g;
> }


Brian, thanks again, that one worked too. Here is what I'm now using, which
seems to work correctly:

$contents=~ s/http:\/\///g;
$contents=~ s!(www.*?)(?:\s|$)!<a href="http://$1">$1</a> !gi;
$contents=~ s!([-\w.]+\@[-\w.]+)!<a href="mailto:$1">$1</a> !g;

The first line strips the http:// in case the text contained a full URL;
I then adjusted your code to look for www instead. You may also notice a
deliberate space after the </a> tags ... this was needed as your code seemed
to kill the trailing space. Owe you one.

Randy

P.S. Sorry about the last top post.


 
Brian Wakem
      09-10-2003
"Dandy" Randy wrote:
> Brian, thanks again, that one worked too. Here is what I'm now using that
> seems to work correctly:
>
> $contents=~ s/http:\/\///g;
> $contents=~ s!(www.*?)(?:\s|$)!<a href="http://$1">$1</a> !gi;
> $contents=~ s!([-\w.]+\@[-\w.]+)!<a href="mailto:$1">$1</a> !g;
>
> The first code eliminates the http:// in case the text contained a full url,
> then adjusted your code to start looking for www. You also may notice a
> deliberate space after the </a> tags ... this was needed as your code seemed
> to kill the trailing space. Owe you one.



Yes, it would have swallowed the space. Capturing the trailing whitespace
and putting it back with $2 should sort that out:

s!(http://.*?)(\s|$)!<a href="$1">$1</a>$2!gi;

I'm glad they worked for you, but it's important to understand why, in case
you need to alter something. It's also important to understand why those
regexes are not perfect: they will not work for every URL or email address
and, from time to time, could match things that aren't URLs or email
addresses.
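
As a rough sketch of how some of those gaps could be narrowed without any
non-core module, the version below handles URLs and email addresses in one
pass, keeps the surrounding whitespace, leaves trailing sentence punctuation
outside the link, and HTML-escapes the text. linkify and escape_html are
hypothetical helper names, and both patterns are loose assumptions made for
this example; this is not the thread's code and not a substitute for
URI::Find.

```perl
use strict;
use warnings;

# Deliberately loose patterns (assumptions for this sketch, not a full
# URL or email grammar). Neither contains a capturing group.
my $EMAIL = qr{[\w.+-]+\@[\w-]+(?:\.[\w-]+)+};
my $URL   = qr{(?:https?://|www\.)[^\s<>"]+}i;

# Escape the characters that are special in HTML text and attributes.
sub escape_html {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    return $s;
}

sub linkify {
    my ($text) = @_;
    my ($out, $last) = ('', 0);

    while ($text =~ /($EMAIL)|($URL)/g) {
        # Save the match data first: the inner s/// below overwrites
        # $1, @- and @+.
        my ($email, $url) = ($1, $2);
        my ($start, $end) = ($-[0], $+[0]);

        # Copy the plain text between the previous match and this one.
        $out .= escape_html(substr($text, $last, $start - $last));

        if (defined $email) {
            $out .= sprintf '<a href="%s">%s</a>',
                escape_html("mailto:$email"), escape_html($email);
        }
        else {
            # Trailing punctuation usually belongs to the sentence.
            my $trail = $url =~ s/([.,;:!?)]+)\z// ? $1 : '';
            my $href  = $url =~ m{\Ahttps?://}i ? $url : "http://$url";
            $out .= sprintf '<a href="%s">%s</a>%s',
                escape_html($href), escape_html($url), escape_html($trail);
        }
        $last = $end;
    }
    return $out . escape_html(substr($text, $last));
}

print linkify('Docs at http://example.com/a?x=1&y=2, or www.example.org. '
            . 'Mail bob@example.com!'), "\n";
```

Applied to the lines read from the file, this would be something like
$_ = linkify($_) for @data; before writing them back. It still has blind
spots: a URL that legitimately ends in ")" loses that character, and
anything shaped like name@host.tld is treated as an address.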

--
Brian Wakem


 