Velocity Reviews - Computer Hardware Reviews

asp innerText?

 
 
Giles
      02-16-2006
In DHTML, body.innerText nicely strips out the raw textual contents of a
formatted page. Is there a straightforward way to do this with a server-side
ASP function (e.g. on a string containing the HTML)? It is to fill a
database field used for a simple search routine.
I don't have permission to use third-party components on this server; it's
plain IIS6.
Thanks.
Giles


 
Bob Barrows [MVP]
      02-16-2006
Giles wrote:
> in DHTML, body.innerText nicely strips out the raw textual contents
> of a formatted page. Is there a straightforward way to do this with a
> server-side ASP function (e.g. on a string containing the HTML)? It
> is to fill a database field used for a simple search routine.
> I don't have permission on this server to use 3rd party components,
> it's plain IIS6.


Use a Regular Expression.
Bob Barrows
--
Microsoft MVP - ASP/ASP.NET
Please reply to the newsgroup. This email account is my spam trap so I
don't check it very often. If you must reply off-line, then remove the
"NO SPAM"


 
Giles
      02-16-2006
from Bob Barrows [MVP]
> Giles wrote:
>> in DHTML, body.innerText nicely strips out the raw textual contents
>> of a formatted page. Is there a straightforward way to do this with a
>> server-side ASP function (e.g. on a string containing the HTML)? It
>> is to fill a database field used for a simple search routine.
>> I don't have permission on this server to use 3rd party components,
>> it's plain IIS6.

>
> Use a Regular Expression.
> Bob Barrows


RegExp is a black art to me! Off the top of my head:
delete from "<head" to "/head>"
delete from "<style" to "/style>" (in case not in head)
delete from "<script" to "/script>" (in case not in head)
replace anything in chevrons with nothing.
replace line-breaks with spaces
replace multiple spaces with single spaces
replace HTML entities with literals
Does that sound about right?
thanks, Giles
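
The steps listed above can be sketched with the built-in VBScript RegExp object, which needs no third-party components. This is only a rough approximation of innerText, not an HTML parser. The function names ReReplace and HtmlToText are hypothetical, the comment-stripping step is an addition to the list, and only a handful of named entities are decoded (numeric entities are left alone).

```vbscript
<% Option Explicit

' Replace every match of a pattern in s (case-insensitive, all occurrences).
Function ReReplace(s, pattern, replacement)
    Dim re: Set re = New RegExp
    re.Pattern = pattern
    re.Global = True
    re.IgnoreCase = True
    ReReplace = re.Replace(s, replacement)
End Function

' Hypothetical helper: approximate the innerText of an HTML string.
Function HtmlToText(html)
    Dim s: s = html

    ' Drop head, style and script blocks together with their contents.
    ' [\s\S] is used because "." does not match line breaks in VBScript.
    s = ReReplace(s, "<head\b[\s\S]*?</head\s*>", " ")
    s = ReReplace(s, "<style\b[\s\S]*?</style\s*>", " ")
    s = ReReplace(s, "<script\b[\s\S]*?</script\s*>", " ")

    ' Addition to the list: drop HTML comments before the generic tag pass.
    s = ReReplace(s, "<!--[\s\S]*?-->", " ")

    ' Remove every remaining tag, angle brackets included. A space is left
    ' behind so that words in adjacent elements do not run together.
    s = ReReplace(s, "<[^>]*>", " ")

    ' Decode a few common entities. &amp; goes last so that "&amp;lt;"
    ' ends up as the literal text "&lt;" rather than "<".
    s = Replace(s, "&nbsp;", " ")
    s = Replace(s, "&lt;", "<")
    s = Replace(s, "&gt;", ">")
    s = Replace(s, "&quot;", """")
    s = Replace(s, "&amp;", "&")

    ' Collapse line breaks, tabs and runs of spaces into single spaces.
    s = ReReplace(s, "\s+", " ")
    HtmlToText = Trim(s)
End Function

' Writes: Hello world & more
Response.Write HtmlToText("<p>Hello <b>world</b> &amp; more</p>")
%>
```

Because every tag is replaced by a space, inline markup in the middle of a word (for example un<b>bold</b>) splits the word in two. For a search field that is usually an acceptable trade against gluing the last word of one paragraph to the first word of the next.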


 
Bob Barrows [MVP]
      02-16-2006
Giles wrote:
> from Bob Barrows [MVP]
>> Giles wrote:
>>> in DHTML, body.innerText nicely strips out the raw textual contents
>>> of a formatted page. Is there a straightforward way to do this with
>>> a server-side ASP function (e.g. on a string containing the HTML)?
>>> It is to fill a database field used for a simple search routine.
>>> I don't have permission on this server to use 3rd party components,
>>> it's plain IIS6.

>>
>> Use a Regular Expression.
>> Bob Barrows

>
> RegExp is a black art to me!

Somewhat to me as well ...
A couple of people in this group (Chris Hohmann comes to mind) have it down
pretty well. There are also websites that provide libraries of
regular expression patterns.

> Off the top of my head:
> delete from "<head" to "/head>"
> delete from "<style" to "/style>" (in case not in head)
> delete from "<script" to "/script>" (in case not in head)
> replace anything in chevrons with nothing.
> replace line-breaks with spaces
> replace multiple spaces with single spaces
> replace HTML entities with literals
> Does that sound about right?


I guess so, but why are you leaving the closing and opening brackets?




 
Justin Piper
      02-16-2006
Giles wrote:
> in DHTML, body.innerText nicely strips out the raw textual contents of a
> formatted page. Is there a straightforward way to do this with a server-side
> ASP function (e.g. on a string containing the HTML)? It is to fill a
> database field used for a simple search routine.


If you can, you might consider using Indexing Service instead of
rolling your own search routine.

http://www.codeproject.com/asp/indexserver.asp

If that's not an option, you should be able to automate Internet Explorer
from an ASP page.

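<% Option Explicit

' Start an Internet Explorer instance and load a blank page.
Dim ie: Set ie = CreateObject("InternetExplorer.Application")
ie.Navigate "about:blank"

' Navigate is asynchronous: wait until the document is ready
' (ReadyState 4 = READYSTATE_COMPLETE) before touching ie.Document.
Do While ie.Busy Or ie.ReadyState <> 4
Loop

' Write the HTML to be converted into the blank document.
Dim doc: Set doc = ie.Document
doc.open
doc.writeln "<dl>"
doc.writeln "<dt>em</dt>"
doc.writeln "<dd>Indicates <em>emphasis</em></dd>"
doc.writeln "<dt>strong</dt>"
doc.writeln "<dd>Indicates <strong>stronger emphasis</strong></dd>"
doc.writeln "</dl>"
doc.close

Response.ContentType = "text/plain"
Response.Write doc.documentElement.InnerText

' Close the browser so iexplore.exe processes do not pile up on the server.
ie.Quit
Set ie = Nothing
%>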
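
A lighter-weight variation on the same idea, offered as an assumption rather than something proposed in the thread: MSHTML can usually be created directly through the "htmlfile" ProgID, which parses the markup without launching a browser process. The function name InnerTextOf is hypothetical. Whether this is permitted under the IIS worker identity on a given server needs to be checked there, and script blocks in the input may be run by the parser, so untrusted HTML should have them stripped first.

```vbscript
<% Option Explicit

' Hypothetical helper: let MSHTML parse an HTML string and return its text.
Function InnerTextOf(html)
    ' "htmlfile" creates an in-process MSHTML document (an assumption that
    ' it is available and allowed on this server; it ships with IE).
    Dim doc: Set doc = CreateObject("htmlfile")
    doc.open
    doc.write html
    doc.close
    InnerTextOf = doc.body.innerText
    Set doc = Nothing
End Function

Response.ContentType = "text/plain"
Response.Write InnerTextOf("<dl><dt>em</dt><dd>Indicates <em>emphasis</em></dd></dl>")
%>
```

Unlike the InternetExplorer.Application version, there is no separate process to wait on or to quit, which makes it easier to call once per record when filling the search field.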
 
Tom Kaminski [MVP]
      02-17-2006
Giles wrote:
> in DHTML, body.innerText nicely strips out the raw textual contents of a
> formatted page. Is there a straightforward way to do this with a
> server-side ASP function (e.g. on a string containing the HTML)? It is to
> fill a database field used for a simple search routine.
> I don't have permission on this server to use 3rd party components, it's
> plain IIS6.


With ASP you have complete control over the content of the page before it
gets written so it's not clear to me why you would need to do this ...

--
Tom Kaminski IIS MVP
http://www.microsoft.com/windowsserv...y/centers/iis/
http://mvp.support.microsoft.com/
http://www.iistoolshed.com/ - tools, scripts, and utilities for running IIS


 