Velocity Reviews - Computer Hardware Reviews

Regular expression to match an XML tag

 
 
Karthik
 
      11-02-2007
Hi All,

I am trying to match an XML tag using JS regular expressions. The
pattern I am using is

pattern="/(&lt;" + tagname + "&gt" + "(*)" + "(&lt;." + tagname +
"&gt;/g)";

where I want to replace the tagname variable with the name of the tag
which I want to search for. Unfortunately this doesn't work. If I
replace the tagname variable with the actual tag's name it works.
Any idea how to fix this issue?

If any of you could post a script that could do this it would be
great.

Thanks
Karthik

 
 
 
 
 
Karthik
 
      11-02-2007
Hi All,

I modified the pattern to
var patt="(&lt;" + tagname + "&gt" + "(*)" + "(&lt;." + tagname +
"&gt";

without the initial / and the ending /g, but still no go...




 
 
 
 
 
Karthik
 
      11-02-2007

Here is the full script.
Here str is just temporary storage; actually I will be applying the
pattern to the source of the HTML page of the "current window"
object.

<html>
<body>

<script type="text/javascript">
var tagname="ContentId";
var result="";
var str = "&lt;ContentId&gt;12345&lt;/ContentId&gt;";
var patt="(&lt;" + tagname + "&gt" + "(*)" + "(&lt;." + tagname +
"&gt";
//var patt=/(&lt;ContentId&gt([\d]*)/g
document.write(patt + " &nbsp PAttern <BR>");
document.write(str + "<BR>");
var patt2=new RegExp(patt);

result=patt2.exec(str);
document.write(result + " Result &nbsp <BR>");
document.write(RegExp.$2);
</script>

</body>
</html>


 
 
Karthik
 
      11-02-2007

Got the expression...

Here it is:
var regexpr= new RegExp("(&lt;" + tagname + "&gt([A-Z]*[[a-z]*[0-9]*)
(&lt;." + tagname + "&gt");
Apply exec with this pattern to any string, HTML source, or XML file, and it
will fetch the values between the tags.
One word of warning, though: if the tag has child tags, it will
retrieve all the child tags as well.
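
For comparison, here is a minimal sketch of building such a pattern from a variable. It is not the exact expression posted above. It assumes literal `<` and `>` rather than entity references, swaps the character classes for a lazy `[\s\S]*?`, and uses two made-up helper names (`escapeRegExp`, `matchTagContent`). The string handed to `new RegExp` carries no surrounding `/` or `/g`; flags, if needed, go in the second argument.

```javascript
// Hypothetical helper: escape characters that are special in a regular
// expression, so the tag name is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: return the text between <tagName> and </tagName>,
// or null if there is no match. Assumes no attributes and no nesting.
function matchTagContent(source, tagName) {
  var name = escapeRegExp(tagName);
  // No leading "/" or trailing "/g" inside the string; flags would go in
  // the second argument of the RegExp constructor.
  var pattern = new RegExp("<" + name + ">([\\s\\S]*?)</" + name + ">");
  var result = pattern.exec(source);
  return result ? result[1] : null;
}

var str = "<ContentId>12345</ContentId>";
console.log(matchTagContent(str, "ContentId")); // 12345
```

Backslashes have to be doubled inside the string literal, which is why `[\s\S]` is written as `[\\s\\S]` here.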

Thanks
Karthik

 
 
Jeremy
 
      11-02-2007

Using regular expressions alone will never really get you a robust
parser. For example, "<foo>bar<afoo>" would match your current
expression, even though <afoo> doesn't close <foo>.
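
A small sketch of that false positive, assuming the expression is written with literal `<` and `>` and a single alphanumeric character class (the variable names `loose` and `strict` are only for illustration). The `.` in front of the tag name matches any character, not just `/`:

```javascript
// Assumed shape of the expression under discussion, written with literal
// "<" and ">": the "." before the tag name matches ANY character.
var tagname = "foo";
var loose = new RegExp("<" + tagname + ">([A-Za-z0-9]*)<." + tagname + ">");
// Requiring a literal "/" only accepts a real end tag.
var strict = new RegExp("<" + tagname + ">([A-Za-z0-9]*)</" + tagname + ">");

console.log(loose.test("<foo>bar<afoo>"));   // true (false positive)
console.log(strict.test("<foo>bar<afoo>"));  // false
console.log(strict.test("<foo>bar</foo>"));  // true
```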

You want to search through the current document for a certain tag?
Wouldn't it be easier to use DOM for this purpose?

Jeremy
 
 
Bart Van der Donck
 
      11-03-2007
Karthik wrote:

> var regexpr= new RegExp("(&lt;" + tagname + "&gt([A-Z]*[[a-z]*[0-9]*)
> (&lt;." + tagname + "&gt");
> apply a exec of this pattern on any string/html source/xml file, it
> will fetch you the values between the tags..
> one word of warning though if the tag has got child tags, it will
> retrieve all the child tags also


And that's only the very beginning.

Take a look at

http://groups.google.com/group/comp....b006db41efc7b/

to get an idea of the complexity of real XML string parsing.

Do yourself a favour and load it into the XML parser.
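
A rough sketch of what that could look like in a browser, assuming `DOMParser` is available (it is a browser API, so the function below simply returns null elsewhere). The function name `getElementText` is made up for illustration:

```javascript
// Sketch only: DOMParser is a browser API, so this returns null where it
// is unavailable (for example in plain Node.js).
function getElementText(xmlSource, tagName) {
  if (typeof DOMParser === "undefined") {
    return null;
  }
  var doc = new DOMParser().parseFromString(xmlSource, "application/xml");
  // Browsers report malformed XML through a <parsererror> element.
  if (doc.getElementsByTagName("parsererror").length > 0) {
    return null;
  }
  var elements = doc.getElementsByTagName(tagName);
  return elements.length > 0 ? elements[0].textContent : null;
}

console.log(getElementText("<ContentId>12345</ContentId>", "ContentId"));
```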

--
Bart

 
 
Thomas 'PointedEars' Lahn
 
      11-08-2007
Karthik wrote:
> Got the expression...


Not at all, you don't.

> here it is...
> var regexpr= new RegExp("(&lt;" + tagname + "&gt([A-Z]*[[a-z]*[0-9]*)
> (&lt;." + tagname + "&gt");
> apply a exec of this pattern on any string/html source/xml file, it
> will fetch you the values between the tags..


Only if the content is ASCII-alphanumeric. XML, however, is UTF-8-safe.

> one word of warning though if the tag has got child tags, it will

^^^^^^^^^^^^^^^^^^^^^^^^^^
> retrieve all the child tags also

^^^^^^^^^^
http://www.w3.org/TR/REC-html40/intr...t.html#h-3.2.1 (esp. the last,
green-colored paragraph)

It will _not_ match any child _elements_, as you have explicitly excluded
their start tags from the content of the `tagname' element, assuming that
the double `[' was but a typo (if it was not, the expression would match `['
in the content as well). Why you escape `<' and `>' remains a mystery;
further assuming that you use it within an XHTML `script' element (where
declaring it as CDATA would have sufficed to avoid the character entity
references), the possible match would be

<foo>abc<bar>def</bar>ghi</foo>
^^^^^^^^^^

However, that match is discarded because `ar' does not match `fo'.

The Chomsky hierarchy, taught in computer science classes, tells us that
it is generally not possible to use (only) a regular grammar, such as the one
regular expressions are based on, to parse a context-free language, such as
SGML-based markup: every regular language is context-free, but not
every context-free language is regular.
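
To illustrate with a small sketch (the sample strings are made up): a single expression has no way to count nesting depth, so neither a lazy nor a greedy quantifier pairs start and end tags reliably.

```javascript
var nested = "<a>1<a>2</a>3</a>";

// Lazy: stops at the first </a>, which belongs to the inner element.
var lazy = /<a>([\s\S]*?)<\/a>/.exec(nested);
console.log(lazy[1]); // 1<a>2

// Greedy: runs to the last </a>, which swallows sibling elements.
var siblings = "<a>1</a><a>2</a>";
var greedy = /<a>([\s\S]*)<\/a>/.exec(siblings);
console.log(greedy[1]); // 1</a><a>2
```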

Therefore, if you need to parse the markup as such instead of accessing
the corresponding DOM objects, you are looking for a non-deterministic
pushdown automaton (which can parse those languages), implemented as an XML
parser (such as DOMParser in Gecko-based UAs). If you don't want
to use such an external API, it is possible to combine the efficiency of
regular expression matching with the reliability of an NPDA in your own code.
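
One way such a combination might look, as a very reduced sketch: a regular expression finds the individual tags, and an explicit stack pairs them up. The helper name `collectElements` is hypothetical, and the code assumes markup without attributes, comments, CDATA sections, or self-closing tags, so it is nowhere near a real XML parser.

```javascript
// Hypothetical helper: collect the inner source of every <tagName> element,
// using a regex to find tags and a stack to pair them up.
// Assumes no attributes, comments, CDATA, or self-closing tags.
function collectElements(source, tagName) {
  var tagPattern = /<(\/?)([A-Za-z_][\w.-]*)>/g;
  var stack = [];    // currently open elements
  var results = [];  // inner source of each matched element
  var match;
  while ((match = tagPattern.exec(source)) !== null) {
    var isEndTag = match[1] === "/";
    var name = match[2];
    if (!isEndTag) {
      // Remember where the content of this element begins.
      stack.push({ name: name, contentStart: tagPattern.lastIndex });
      continue;
    }
    var open = stack.pop();
    if (!open || open.name !== name) {
      throw new Error("Mismatched end tag: </" + name + ">");
    }
    if (name === tagName) {
      results.push(source.slice(open.contentStart, match.index));
    }
  }
  if (stack.length > 0) {
    throw new Error("Unclosed element: <" + stack[stack.length - 1].name + ">");
  }
  return results;
}

console.log(collectElements("<a>1<a>2</a>3</a>", "a")); // ["2", "1<a>2</a>3"]
```

Inner elements are reported before outer ones because an element is only recorded once its end tag has been seen.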

http://en.wikipedia.org/wiki/Chomsky_hierarchy


PointedEars
--
Anyone who slaps a 'this page is best viewed with Browser X' label on
a Web page appears to be yearning for the bad old days, before the Web,
when you had very little chance of reading a document written on another
computer, another word processor, or another network. -- Tim Berners-Lee
 