Search - stop words - array/database/text file?

 
 
Rob Meade
      02-07-2004
Lo all,

OK - I'm adding site search functionality to a database-driven website.

I have a list of 390 stop/ignore words. Having looked at ASPFAQ already, I
see that the example uses an array. What I was wondering was whether this
would still be the best practice for this quantity of stop words?

There is a larger overhead in defining the array initially, as I will
have to hard-code them all. Alternatively, I thought I could import them
into a SQL Server table from the Excel file they are currently in and then
query that, but I believe the ASPFAQ article gave a good reason for not
doing that. My last thought was to read them in from a text file...

Anyone got any thoughts? Would an array be as efficient for 390 stop
words as it is for 10-20? Is it better to hard-code them rather than grab
them from a database?

Any help / advice would be appreciated.

Regards

Rob


 
Bob Barrows
      02-07-2004

If the list will be static, I would store it in an Application variable,
making the decision as to whether to store it in a database or a textfile
superfluous. If the list has no relationship to any of your database data,
then a text file on your web server seems to be indicated.

Moreover, I would suggest storing it as an XML DOMDocument, allowing you to
use the XML Parser DOM methods to easily search for values in the list.

Bob Barrows
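
A minimal sketch of the text-file-plus-Application-variable idea described above. The file name stopwords.txt, the one-word-per-line layout, and the function name LoadStopWords are assumptions made for illustration, not anything taken from ASPFAQ or from this thread.

```vbscript
Option Explicit

' Reads one stop word per line from a text file and returns them as an array.
' Blank lines are skipped; an empty file yields an empty array (UBound = -1).
Function LoadStopWords(strPath)
    Dim fso, ts, strLine, strList
    Set fso = CreateObject("Scripting.FileSystemObject")
    Set ts = fso.OpenTextFile(strPath, 1)   ' 1 = ForReading
    strList = ""
    Do While Not ts.AtEndOfStream
        strLine = Trim(ts.ReadLine)
        If Len(strLine) > 0 Then strList = strList & strLine & vbLf
    Loop
    ts.Close
    ' Drop the trailing separator before splitting
    If Len(strList) > 0 Then strList = Left(strList, Len(strList) - 1)
    LoadStopWords = Split(strList, vbLf)
End Function

' Hypothetical use in global.asa, run once in Application_OnStart:
'   Application.Lock
'   Application("StopWords") = LoadStopWords(Server.MapPath("stopwords.txt"))
'   Application.UnLock
' Any page could then pick up the cached array with:
'   aIgnoreWords = Application("StopWords")
```

Loading once at application start means the file is not re-read on every search request; the trade-off is that changes to the file only show up after the application restarts.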

--
Microsoft MVP - ASP/ASP.NET
Please reply to the newsgroup. This email account is my spam trap so I
don't check it very often. If you must reply off-line, then remove the
"NO SPAM"


 
Rob Meade
      02-07-2004
"Bob Barrows" wrote ...

> If the list will be static, I would store it in an Application variable,
> making the decision as to whether to store it in a database or a textfile
> superfluous.


Hi Bob,

Yes, initially this list will definitely be static, I do not plan to add to
the list dynamically at this stage.

> If the list has no relationship to any of your database data,
> then a text file on your web server seems to be indicated.


ok

> Moreover, I would suggest storing it as an XML DOMDocument, allowing you to
> use the XML Parser DOM methods to easily search for values in the list.


hmmm...hadn't thought of XML for this..

Wouldn't this be quite a bit of extra code considering all I want to do is
iterate through the list and chop those words out of the original search
string etc? Maybe not, not sure...can't see the advantages of this method?

Any further info appreciated..

Regards

Rob


 
Bob Barrows
      02-07-2004
Rob Meade wrote:
> Wouldn't this be quite a bit of extra code considering all I want to
> do is iterate through the list and chop those words out of the
> original search string etc? Maybe not, not sure...can't see the
> advantages of this method?
>


Ah! I see. I was thinking you would need to do the opposite: find specific
words in the list.

To find a word in an array:
for i = 0 to ubound(ar)
    if ar(i) = strWord then   ' strWord holds the word being looked up
        exit for              ' found: i is the index of the match
    end if
next

To find a word in an XML Document:
set oNode = xmldoc.selectsinglenode("/root/node[value='" & strWord & "']")
' oNode is Nothing when no node matches

There is no extra code involved in looping through a DOM Document:

for each oNode in xmldoc.documentelement.childnodes
    'do something with oNode.Text
next

Given the comparative sizes of the array and xml document, if I did not need
search capabilities, I would go with the array.

Bob Barrows
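
To make the DOM lookup above concrete, here is a small sketch that builds the document from an array and tests a word against it. It assumes MSXML 6.0 is installed and uses a hypothetical layout of one <node> element per word under <root>, matched on the element's own text rather than on a child element named value; the function names are made up for illustration.

```vbscript
Option Explicit

' Builds <root><node>word</node>...</root> from an array of words.
Function BuildStopWordDoc(aWords)
    Dim doc, root, el, i
    Set doc = CreateObject("MSXML2.DOMDocument.6.0")
    Set root = doc.createElement("root")
    doc.appendChild root
    For i = 0 To UBound(aWords)
        Set el = doc.createElement("node")
        el.text = aWords(i)
        root.appendChild el
    Next
    Set BuildStopWordDoc = doc
End Function

' True when some <node> has exactly strWord as its text.
' Words containing an apostrophe would need escaping before going into the XPath.
Function IsStopWord(doc, strWord)
    Dim oNode
    Set oNode = doc.selectSingleNode("/root/node[.='" & strWord & "']")
    IsStopWord = Not (oNode Is Nothing)
End Function
```

For example, with a document built from Array("a", "and", "the"), IsStopWord(doc, "and") is True and IsStopWord(doc, "an") is False, because the predicate compares the whole text of each node.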



 
Rob Meade
      02-07-2004
"Bob Barrows" wrote ...

> Given the comparative sizes of the array and xml document, if I did not need
> search capabilities, I would go with the array.


Hi Bob,

Many thanks for the reply and examples. I will use the array method for now
then. If you have time, I've another question - see Search
(part2) :D

Cheers

Rob


 
Roland Hall
      02-07-2004
"Bob Barrows" wrote:
: To find a word in an array:
: for i = 0 to ubound(ar)
: if ar(i) = <something> then
: exit for
: end if
: next
:
: To find a word in an XML Document:
: xmldoc.selectsinglenode("/root/node[value='<something>']")
:
: There is no extra code involved in looping through a DOM Document:
:
: for each oNode in xmldoc.documentelement.childnodes
: 'do something with oNode.Text
: next
:
: Given the comparative sizes of the array and xml document, if I did not need
: search capabilities, I would go with the array.

Or you could use Filter and avoid looping over the whole array yourself. Filter
matches substrings, so the hits still get checked for an exact match:

<%@ Language=VBScript %>
<%
Option Explicit
Response.Buffer = True

' Write a message followed by a line break
sub lPrt(strMsg)
    Response.Write(strMsg & "<br />" & vbCrLf)
end sub

' Write a message with no line break
sub Prt(strMsg)
    Response.Write(strMsg)
end sub

' Report whether fWord is an element of arr
sub findWord(arr, fWord)
    if isFound(arr, fWord) = fWord Then
        Response.Write(fWord & " found in array.<br />" & vbCrLf)
    else
        Response.Write(fWord & " not found in array.<br />" & vbCrLf)
    end if
end sub

' Return fWord if it is an exact element of arr, otherwise ""
function isFound(arr, fWord)
    dim f, i
    isFound = ""
    ' Filter returns every element that contains fWord as a substring,
    ' so there may be zero, one or several hits to check
    f = Filter(arr, fWord)
    for i = 0 to ubound(f)
        if f(i) = fWord then
            isFound = f(i)
            exit for
        end if
    next
end function

dim str, myarray
str = "one two three four five six seven eight nine ten"
myarray = Split(str)

lPrt("Using Filter to find words in an array")
lPrt("Array elements: " & str)
Prt("Testing eleven: ")
findWord myarray, "eleven"
Prt("Testing five: ")
findWord myarray, "five"
%>

http://kiddanger.com/lab/filter.asp

--
Roland Hall
/* This information is distributed in the hope that it will be useful, but
without any warranty; without even the implied warranty of merchantability
or fitness for a particular purpose. */
Technet Script Center - http://www.microsoft.com/technet/scriptcenter/
WSH 5.6 Documentation - http://msdn.microsoft.com/downloads/list/webdev.asp
MSDN Library - http://msdn.microsoft.com/library/default.asp
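
One caveat worth spelling out for a stop-word list: Filter does substring matching, so short words such as "a" hit several elements at once. The sketch below illustrates this with a made-up four-word list; HasExactMatch is a hypothetical helper name, not part of VBScript.

```vbscript
Option Explicit

Dim aWords, aHits
aWords = Array("a", "and", "at", "the")

' Filter keeps every element that contains "a" anywhere, not just the element "a"
aHits = Filter(aWords, "a")
' aHits now holds "a", "and" and "at", so UBound(aHits) is 2

' An exact test therefore compares each hit with the search word
Function HasExactMatch(arr, strWord)
    Dim aSub, i
    HasExactMatch = False
    aSub = Filter(arr, strWord)
    For i = 0 To UBound(aSub)
        If aSub(i) = strWord Then
            HasExactMatch = True
            Exit For
        End If
    Next
End Function
```

A test that only accepts a single hit (UBound = 0) would report "a" as missing from this list, which is why the hits are compared one by one.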


 
Bob Barrows
      02-08-2004
Roland Hall wrote:
> Or you could use Filter and eliminate the For...Next loop:


Hah! I had forgotten about that. Thanks for the heads-up.

Bob Barrows




 
Rob Meade
      02-08-2004
"Roland Hall" wrote ...

> No problem. I was workin' on it couple of days ago so it was fresh in my
> mind.


Thanks for that Roland,

Tell me, is using the Filter method perhaps more efficient than what I
bashed out with two pencils stuck to my head yesterday (I'm guessing so but
figured I would ask)...

<!--INSERT VERY LARGE ARRAY UP HERE-->

intMatch = 0

For intLoop = 0 To UBound(aSearchCriteria)
    ' Compare this search word against every ignore word, case-insensitively
    For intLoop2 = 0 To UBound(aIgnoreWords)
        If UCase(aSearchCriteria(intLoop)) = UCase(aIgnoreWords(intLoop2)) Then
            intMatch = 1
            strIgnoredWords = strIgnoredWords & aIgnoreWords(intLoop2) & ", "
            Exit For
        End If
    Next

    ' Keep the word only if it was not in the ignore list
    If intMatch = 0 Then
        strTempSearchCriteria = strTempSearchCriteria & aSearchCriteria(intLoop) & " "
    End If

    intMatch = 0
Next

strSearchCriteria = Trim(strTempSearchCriteria)

In this I'm obviously iterating through the entire array of ignore words for
each word in the search criteria. I am then creating a new string of the words
that are not found, which eventually gets used as the criteria, and I also create
a string of 'ignored' words, which then gets dumped on the page to make it
look really clevaaarrr :D

Your example has fewer lines of code, so I suspect it's far more efficient and
probably the preferred way; mine was bashed out whilst drinking Stella :)

Regards

Rob
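
On the efficiency question, fewer lines does not by itself mean faster code. A different technique from either the nested loop or Filter is a keyed lookup with Scripting.Dictionary. The following is only a sketch under stated assumptions: the function names are hypothetical, the search string is assumed to be space-separated, and nobody in this thread has measured it against the array version.

```vbscript
Option Explicit

' Builds a case-insensitive lookup table from the stop-word array.
Function BuildStopWordLookup(aIgnoreWords)
    Dim dict, i
    Set dict = CreateObject("Scripting.Dictionary")
    dict.CompareMode = vbTextCompare   ' "The" and "the" become the same key
    For i = 0 To UBound(aIgnoreWords)
        If Not dict.Exists(aIgnoreWords(i)) Then dict.Add aIgnoreWords(i), True
    Next
    Set BuildStopWordLookup = dict
End Function

' Returns the search string with stop words removed.
' strIgnored is filled with the removed words, each followed by ", ".
Function StripStopWords(strSearch, dict, strIgnored)
    Dim aTerms, i, strKept
    aTerms = Split(Trim(strSearch), " ")
    strKept = ""
    strIgnored = ""
    For i = 0 To UBound(aTerms)
        If Len(aTerms(i)) > 0 Then
            If dict.Exists(aTerms(i)) Then
                strIgnored = strIgnored & aTerms(i) & ", "
            Else
                strKept = strKept & aTerms(i) & " "
            End If
        End If
    Next
    StripStopWords = Trim(strKept)
End Function
```

Each search word then costs one dictionary lookup instead of up to 390 comparisons, and the dictionary only has to be built once per page (or once per application if it is rebuilt from a cached array).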


 
Roland Hall
      02-08-2004
"Bob Barrows" wrote:
: Hah! I had forgotten about that. Thanks for the heads-up.

No problem. I was workin' on it couple of days ago so it was fresh in my
mind.

Roland


 