regex: how to loop through individual matches

 
 
darrel (Guest), 12-29-2004
I have some VB.NET code that runs a regex, matches groups, and replaces
them. I'm trying to come up with a simple script that will strip all
attributes from all HTML tags.

This is what I have:

=============================================================

Function stripAllAttributes(ByVal textToParse As String, ByVal tagToFind As String) As String
    Dim s As String
    Dim r2 As New Regex( _
        "(?<theTag>(<" & tagToFind & "))" & _
        "(?<everythingUpToEndTag>(([^/>].|\n)*))" _
        , RegexOptions.IgnoreCase)
    Dim m2 As Match = r2.Match(textToParse)
    Dim strTheTag As String = m2.Groups("theTag").Value.ToString
    s = r2.Replace(textToParse, strTheTag)
    Return s
End Function

=============================================================

This works, but, as you can see, I need to pass each tag I want to strip all
attributes from separately. The reason is that if I just use a regex like
this to grab the opening part of the tag:

(<)([^/>\s\n])*

it WILL grab the opening part of the first tag it sees, but it will then use
the first matched text to replace ALL matches it finds in the rest of the
text it is parsing. I imagine this is due more to my vb code than regex.

For example, if my markup is this:

<table width="100">
<tr width="100">
<td width="100">

And if I run the function (using the generic 'find all tags' regex) against
that, I get this returned:

<table>
<table>
<table>

When I want this:

<table>
<tr>
<td>
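
The behaviour described above can be reproduced in a small sketch. The generic pattern and the module name here are assumptions for illustration, not the exact code from this post. It suggests the cause is on the VB side: Match() returns only the first match, and Replace() then writes that one fixed string over every match, whereas Matches() exposes each match with its own group value.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module FirstMatchDemo
    Sub Main()
        Dim markup As String = String.Join(Environment.NewLine, New String() { _
            "<table width=""100"">", "<tr width=""100"">", "<td width=""100"">"})

        ' Hypothetical generic pattern: "<" plus a tag name, then everything up to ">".
        Dim r As New Regex("(?<theTag><[a-z]+)[^>]*", RegexOptions.IgnoreCase)

        ' Match() yields only the FIRST match, so fixedText is always "<table".
        Dim first As Match = r.Match(markup)
        Dim fixedText As String = first.Groups("theTag").Value

        ' Replace() with a plain string substitutes that same text for every match.
        Console.WriteLine(r.Replace(markup, fixedText))

        ' Matches() returns all matches; each one carries its own theTag value.
        For Each m As Match In r.Matches(markup)
            Console.WriteLine(m.Groups("theTag").Value)
        Next
    End Sub
End Module
```

The first WriteLine should print the three-times-"<table>" result shown above, while the loop prints a different tag for each match.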

Off the top of my head, I can only think of doing it this way:

Function: find the first HTML tag to strip (i.e., the first tag that has at
least one attribute)
    if there's a match
        then pass that on to my current function (shown above) to replace all
        instances of that tag,
        then recursively call this same function so that it finds the next tag
    else
        assume it has stripped all attributes from all tags
    end if

Or is there a way in my original script to do the same without the recursive
part?

-Darrel


 
Blair Bonnett (Guest), 01-03-2005
I'd try something like the following:

Function stripAllAttributes(ByVal textToParse As String, ByVal tagToFind As String) As String
    Dim s As String
    Dim r2 As New Regex( _
        "(?<theTag>(<" & tagToFind & "))" & _
        "(?<everythingUpToEndTag>(([^/>].|\n)*))" _
        , RegexOptions.IgnoreCase)
    ' $1 is expanded separately for every match, so each tag keeps its own text.
    ' No ">" is appended: on the sample markup the pattern stops just before
    ' the closing ">", which is left in place.
    s = r2.Replace(textToParse, "$1")
    Return s
End Function

That uses a backreference to the first captured group ($1) in the replacement
string. For more info on backreferences, check out
http://www.devarticles.com/c/a/VB.Ne...ons-in-.NET/1/
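
The same idea can be extended so that no tag name has to be passed in at all. What follows is only a sketch under stated assumptions: the pattern, the group names "name" and "close", and the function names are hypothetical and not from this thread. It will also be confused by a ">" inside an attribute value, as regex-based HTML handling usually is. It shows two non-recursive options: a substitution string that is expanded per match, and a MatchEvaluator callback that Replace() invokes once for each individual match.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module StripAttributesSketch

    ' Hypothetical pattern: an opening tag name, any attributes, and an
    ' optional " /" (self-closing marker) before the final ">".
    Private ReadOnly OpeningTag As New Regex( _
        "<(?<name>[a-z][a-z0-9]*)\b[^>]*?(?<close>\s*/?)>", _
        RegexOptions.IgnoreCase)

    ' Variant 1: ${name} and ${close} are expanded separately for every match.
    Function StripWithSubstitution(ByVal textToParse As String) As String
        Return OpeningTag.Replace(textToParse, "<${name}${close}>")
    End Function

    ' Variant 2: Replace() calls this once per match, which is the place to
    ' put any per-match logic (for example, skipping certain tags).
    Private Function RebuildTag(ByVal m As Match) As String
        Return "<" & m.Groups("name").Value & m.Groups("close").Value & ">"
    End Function

    Function StripWithEvaluator(ByVal textToParse As String) As String
        Return OpeningTag.Replace(textToParse, New MatchEvaluator(AddressOf RebuildTag))
    End Function

    Sub Main()
        Dim markup As String = String.Join(Environment.NewLine, New String() { _
            "<table width=""100"">", "<tr width=""100"">", "<td width=""100"">"})
        Console.WriteLine(StripWithSubstitution(markup))
        Console.WriteLine(StripWithEvaluator(markup))
    End Sub
End Module
```

On the sample markup from the question, both variants should give <table>, <tr> and <td> on separate lines. The evaluator form is more verbose, but it is the one to reach for when the replacement depends on more than the captured text.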

Blair
 