Regular Expression Help

 
 
pbreah
 
      03-14-2007
I need to figure out a pattern that can match each letter of the
message, but leaves all the html entities alone.

For example, I have an input like this:

<div>
This is the content &nbsp; &lt; Hello &gt;
</div>

Just as an example to make it clearer: if I wanted to replace all the
letters of the message with the number "1", I would get this
result:

<div>
1111 11 111 1111111 &nbsp; &lt; 11111 &gt;
</div>

Can anyone help?

Thanks in advance...

 
pbd22
 
      03-14-2007
On Mar 14, 11:11 am, "pbreah" <pbr...@gmail.com> wrote:
> I need to figure out a pattern that can match each letter of the
> message, but leaves all the html entities alone.
>
> For example, I have an input like this:
>
> <div>
> This is the content &nbsp; &lt; Hello &gt;
> </div>
>
> Just as an example to make it clearer: if I wanted to replace all the
> letters of the message with the number "1", I would get this
> result:
>
> <div>
> 1111 11 111 1111111 &nbsp; &lt; 11111 &gt;
> </div>
>
> Can anyone help?
>
> Thanks in advance...



Hi.
Check out the "replace" method of JavaScript strings.
If that doesn't do what you are looking for, have a look at
the other ways of manipulating strings described here:

http://javascriptkit.com/javatutors/string4.shtml

Hope that helps.
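
For reference, a minimal sketch of what a plain "replace" call does with the sample input. The variable names are only illustrative. It shows why a bare letter class is not enough on its own: the letters inside tags and entities are replaced as well.

```javascript
// Naive attempt: replace every ASCII letter with "1".
var input = "<div>\nThis is the content &nbsp; &lt; Hello &gt;\n</div>";
var naive = input.replace(/[a-z]/gi, "1");

console.log(naive);
// <111>
// 1111 11 111 1111111 &1111; &11; 11111 &11;
// </111>
```

So the pattern has to recognise tags and entities first and leave them alone.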

 
Geoffrey Summerhayes
 
      03-14-2007
On Mar 14, 2:11 pm, "pbreah" <pbr...@gmail.com> wrote:
> I need to figure out a pattern that can match each letter of the
> message, but leaves all the html entities alone.
>
> For example, I have an input like this:
>
> <div>
> This is the content &nbsp; &lt; Hello &gt;
> </div>
>
> Just as an example to make it clearer: if I wanted to replace all the
> letters of the message with the number "1", I would get this
> result:
>
> <div>
> 1111 11 111 1111111 &nbsp; &lt; 11111 &gt;
> </div>
>


Not sure if it can be done in one line,
but then I don't have much experience
using regexps.

Best I could come up with off the cuff (partially tested):

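// x is assumed to hold the input markup as a string
var y=/(<[^<>&;]*>)|(&[a-z]*;)|([^<>&]+)/gi; // tag | entity | plain text
var array=x.match(y)||[]; // match() returns null when nothing matches
var output="";
for(var i=0;i<array.length;i++)
{
  var str=array[i];
  if(/[<&]/.test(str))
    output+=str; // tags and entities pass through unchanged
  else
    output+=str.replace(/\S/g,"1"); // mask every non-whitespace character
}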
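
As for doing it in one line: a possible single-pass variant, offered as a sketch rather than a tested solution, passes a callback to "replace". The function name maskLetters and the entity pattern &#?\w+; are assumptions made for this example. Alternatives in a regexp are tried left to right, so whole tags and entities are consumed before the letter class gets a chance to match inside them.

```javascript
// Hypothetical helper: mask ASCII letters, keep tags and entities intact.
function maskLetters(html) {
  // Order matters: tag first, then entity, then a single letter.
  return html.replace(/<[^<>]*>|&#?\w+;|[a-z]/gi, function (match) {
    var first = match.charAt(0);
    // Tags start with "<" and entities with "&"; return both unchanged.
    return (first === "<" || first === "&") ? match : "1";
  });
}

var sample = "<div>\nThis is the content &nbsp; &lt; Hello &gt;\n</div>";
var masked = maskLetters(sample);
// masked === "<div>\n1111 11 111 1111111 &nbsp; &lt; 11111 &gt;\n</div>"
```

Unlike the \S version, this only touches letters, so digits and punctuation in the text stay as they are.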

--
Geoff

 
 
Dr J R Stockton
 
      03-15-2007
In comp.lang.javascript, on Wed, 14 Mar 2007 11:11:22, pbreah posted:
>I need to figure out a pattern that can match each letter of the
>message, but leaves all the html entities alone.
>
>For example, I have an input like this:
>
><div>
>This is the content &nbsp; &lt; Hello &gt;
></div>
>
>Just as an example to make it clearer: if I wanted to replace all the
>letters of the message with the number "1", I would get this
>result:
>
><div>
>1111 11 111 1111111 &nbsp; &lt; 11111 &gt;
></div>
>
>Can anyone help?


The following code, probably slowly, encodes every alphanumeric entity
by adding 999 to the character code of each of its word characters.
Slightly tested.

It is then trivial to replace all remaining letters by "1" and to see
how to reverse the 999.

Adding 999 moves the ASCII word characters into the Cyrillic range, so
if your text may contain Russian, use another number or be more careful
about reversing.

// Sample input from the question
var St = "<div>\nThis is the content &nbsp; &lt; Hello &gt;\n</div>"

function WOK(P, x) { return P.replace(/(\w)/g, // Encode all word chars by x
function (z, p1) {
return String.fromCharCode((p1.charCodeAt(0)+x)) } ) }

// Shift the characters of each entity such as &nbsp; out of the ASCII range
St = St.replace(/(&\w+;)/g, function (z, p1) { return WOK(p1, +999) } )
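
The remaining two steps (replace the letters, then reverse the 999) might look like the sketch below. The names OFFSET, shiftWordChars and maskKeepingEntities are made up for this example, and the restore step assumes the input contains no characters in the range U+0417 to U+0461, which is where the shifted word characters end up. Numeric entities such as &#160; are not matched by &\w+; and are left as they are.

```javascript
var OFFSET = 999; // assumption: the input has no characters in U+0417..U+0461

// Shift every word character (0-9, A-Z, _, a-z) by delta code units.
function shiftWordChars(text, delta) {
  return text.replace(/\w/g, function (ch) {
    return String.fromCharCode(ch.charCodeAt(0) + delta);
  });
}

function maskKeepingEntities(text) {
  // 1. Hide the entities by shifting their word characters out of ASCII.
  var hidden = text.replace(/&\w+;/g, function (entity) {
    return shiftWordChars(entity, OFFSET);
  });
  // 2. Replace every remaining ASCII letter by "1".
  var masked = hidden.replace(/[A-Za-z]/g, "1");
  // 3. Shift the hidden characters back (48+999 = 0x417, 122+999 = 0x461).
  return masked.replace(/[\u0417-\u0461]/g, function (ch) {
    return String.fromCharCode(ch.charCodeAt(0) - OFFSET);
  });
}

var result = maskKeepingEntities("This is the content &nbsp; &lt; Hello &gt;");
// result === "1111 11 111 1111111 &nbsp; &lt; 11111 &gt;"
```

Tag names are not protected by this sketch: "<div>" comes out as "<111>".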




More is needed if the message can contain markup such as <b>no</b> (or
the <div> in the example), since the markup would also need to be
protected.

The complete tool should be able to irreversibly obfuscate the content
of a Web page, so that it could be submitted for criticism without
revealing its textual content.


Afterthought: put a semicolon at the beginning and an ampersand at the
end, and replace every letter between a semicolon and the next ampersand
with a "1".
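
A minimal sketch of that afterthought, with the added step of removing the two sentinels again afterwards. The function name maskBetweenEntities is hypothetical. It assumes every "&" in the text starts an entity that is closed by a ";", and like the code above it does not protect tags.

```javascript
// Hypothetical helper: mask the letters that lie outside "&...;" entities.
function maskBetweenEntities(text) {
  // Sentinels: a leading ";" and a trailing "&" turn every stretch of
  // plain text into a ";...&" span.
  var wrapped = ";" + text + "&";
  var masked = wrapped.replace(/;[^&]*&/g, function (span) {
    return span.replace(/[A-Za-z]/g, "1");
  });
  // Drop the two sentinel characters again.
  return masked.slice(1, -1);
}

var out = maskBetweenEntities("This is the content &nbsp; &lt; Hello &gt;");
// out === "1111 11 111 1111111 &nbsp; &lt; 11111 &gt;"
```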

It's a good idea to read the newsgroup and its FAQ. See below.

--
(c) John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v6.05 IE 6
news:comp.lang.javascript FAQ <URL:http://www.jibbering.com/faq/index.html>.
<URL:http://www.merlyn.demon.co.uk/js-index.htm> jscr maths, dates, sources.
<URL:http://www.merlyn.demon.co.uk/> TP/BP/Delphi/jscr/&c, FAQ items, links.
 