On 28 Feb 2004 15:22:09 -0800, J. VerSchave wrote:
>I am trying to do this using regular expressions in Java:
>
>replace
>
>http://whatever.com
>
>with
>
><a href="http://whatever.com">http://whatever.com</a>
>
>
>
>This URL is embedded within a String, i.e.:
>
>String buffer = "The other I came across http://whatever.com. It is
>cool.";
>
>I want to perform some operation on this buffer and have the result
>be:
>
>"The other I came across <a
>href="http://whatever.com">http://whatever.com</a>. It is cool."
>
>
>Seems like this should be easy to do but I have been unsuccessful thus
>far in finding a solution. Thanks.
>
>-j
How complex does it need to be? For the example you gave, this should
work:
String result = buffer.replaceAll("http://\\S+(?<=\\w)",
"<a href=\"$0\">$0</a>");
The "\\S+" matches everything up to the next whitespace character;
the lookbehind, "(?<=\\w)", then causes the match to back up, if
necessary, until the last character matched is a letter, digit, or
underscore. That's just a quick and dirty way to keep the period at
the end of the sentence (or other trailing punctuation) from being
included in the URL.
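To see the backtracking in action, here is a minimal, self-contained sketch built around the same replaceAll call. The class name LinkifyHttp, the helper method linkify, and the second test sentence are my own inventions for illustration; only the regex and the replacement string come from the snippet above.

```java
public class LinkifyHttp {
    // Hypothetical helper: wraps the replaceAll call shown above.
    // $0 in the replacement string stands for the whole match.
    static String linkify(String text) {
        return text.replaceAll("http://\\S+(?<=\\w)", "<a href=\"$0\">$0</a>");
    }

    public static void main(String[] args) {
        String buffer = "The other I came across http://whatever.com. It is cool.";
        // prints: The other I came across <a href="http://whatever.com">http://whatever.com</a>. It is cool.
        System.out.println(linkify(buffer));

        // \S+ first grabs "whatever.com/page).", then gives back "." and ")"
        // until the lookbehind sees a word character at the end of the match.
        System.out.println(linkify("See (http://whatever.com/page)."));
    }
}
```

Note that the same trick also trims a trailing slash, so "http://whatever.com/" would be linked as "http://whatever.com". That is usually harmless, but it is part of why I call this quick and dirty.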
But what about https and ftp links? Or mailto links? It's easy
enough to include those:
String result = buffer.replaceAll(
"((https?|ftp)://|mailto:)\\S+(?<=\\w)",
"<a href=\"$0\">$0</a>");
The real fun starts when you need to capture URLs that don't have the
protocol prefix, e.g., "whatever.com". But I won't go into that
unless you need it.
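Going back to the https/ftp/mailto version for a moment: if you run it over a lot of text, it may be worth compiling the pattern once instead of calling String.replaceAll each time. A rough sketch follows, assuming the alternation is meant to read "((https?|ftp)://|mailto:)". The class name LinkifySchemes, the linkify helper, and the sample addresses are made up for the example.

```java
import java.util.regex.Pattern;

public class LinkifySchemes {
    // Assumed form of the multi-scheme pattern: http, https, ftp, or mailto,
    // followed by non-whitespace, ending on a word character.
    private static final Pattern URL =
            Pattern.compile("((https?|ftp)://|mailto:)\\S+(?<=\\w)");

    // Hypothetical helper: $0 is still the whole match, so the capturing
    // groups used for the alternation do not affect the replacement.
    static String linkify(String text) {
        return URL.matcher(text).replaceAll("<a href=\"$0\">$0</a>");
    }

    public static void main(String[] args) {
        System.out.println(linkify("Try https://whatever.com, or ftp://whatever.com/file.zip!"));
        System.out.println(linkify("Write to mailto:j@whatever.com."));
    }
}
```

The comma, exclamation mark, and final period all stay outside the generated links, for the same reason as in the plain http case.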
Alan