[Search Engine - Internal site] DB or not DB ?

 
 
rs
      06-08-2006
Hello,

I have a site with more than 15000 (15 thousand) pages.
Each page is almost entirely textual content.
Each page is about 10-25 KB.

I need to build an internal search engine
using ASP.NET code.


Which is the best way?

1)


Create a DB (I have SQL Server 2005 Express)
with a table containing 5 columns:
Id, page-link, page-title, keywords, all the textual content of the page

Example row:
05
/Einstein.htm
Einstein life
birth, death
Einstein was born in... and he won the Nobel prize... and he died in
Berlin.

Then query the DB using SELECT
with CONTAINS (on the 5th column)
and write out the results with
Me.Response.Write(WhatIFound)
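
A minimal sketch of option 1, written in Python rather than ASP.NET so it can be run anywhere. It makes several assumptions: an in-memory SQLite database stands in for SQL Server 2005 Express, a plain LIKE substring match stands in for CONTAINS (which needs a full-text index on SQL Server), and the table name `pages`, the function `search_pages`, and the Newton row are made up for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server 2005 Express.
conn = sqlite3.connect(":memory:")

# The five columns described in the post (names are illustrative).
conn.execute(
    """
    CREATE TABLE pages (
        id       INTEGER PRIMARY KEY,
        link     TEXT NOT NULL,
        title    TEXT NOT NULL,
        keywords TEXT NOT NULL,
        content  TEXT NOT NULL
    )
    """
)

conn.executemany(
    "INSERT INTO pages (id, link, title, keywords, content) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (5, "/Einstein.htm", "Einstein life", "birth, death",
         "Einstein was born in... and he won the Nobel prize..."),
        # Hypothetical second row, only here so the query has a non-match.
        (6, "/Newton.htm", "Newton life", "birth, death",
         "Newton formulated the laws of motion..."),
    ],
)


def search_pages(conn, term):
    """Return (link, title) for every page whose content contains term.

    LIKE is a substring match (case-insensitive for ASCII in SQLite);
    '%' and '_' inside term act as wildcards. The term is passed as a
    bound parameter, never concatenated into the SQL text.
    """
    rows = conn.execute(
        "SELECT link, title FROM pages WHERE content LIKE ? ORDER BY id",
        (f"%{term}%",),
    )
    return rows.fetchall()


results = search_pages(conn, "Nobel")
```

Each (link, title) pair in `results` is what the page would then write out as a search hit.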



or



2)


Use no DB
and search within the page tags (Title, Keywords, Body),
presumably using regular expressions and StringBuilder,
and write out the results with
Me.Response.Write(WhatIFound)
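
A rough sketch of option 2, again in Python for easy testing. The regular expressions, the function names, and the `SAMPLE` page are all assumptions made for illustration; in particular the keywords pattern expects `name` to come before `content` in the meta tag, and regex-based HTML parsing like this only copes with simple, regular markup. Note that this approach has to read and scan every page on every query, which for 15000 pages of 10-25 KB means roughly 150-375 MB of text per search.

```python
import re

# Illustrative patterns for the three parts of a page mentioned above.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)
KEYWORDS_RE = re.compile(
    r'<meta\s+name=["\']keywords["\']\s+content=["\'](.*?)["\']',
    re.IGNORECASE | re.DOTALL,
)
BODY_RE = re.compile(r"<body[^>]*>(.*?)</body>", re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")  # any remaining tag inside the body


def extract_fields(html):
    """Pull title, keywords and tag-free body text out of one HTML page."""
    def first(regex):
        match = regex.search(html)
        return match.group(1).strip() if match else ""

    body_text = TAG_RE.sub(" ", first(BODY_RE))
    return {
        "title": first(TITLE_RE),
        "keywords": first(KEYWORDS_RE),
        "body": " ".join(body_text.split()),  # collapse whitespace
    }


def page_matches(html, term):
    """True if term occurs (case-insensitively) in any extracted field."""
    needle = re.compile(re.escape(term), re.IGNORECASE)
    return any(needle.search(value) for value in extract_fields(html).values())


# Hypothetical page used only to exercise the functions.
SAMPLE = """<html><head><title>Einstein life</title>
<meta name="keywords" content="birth, death">
</head><body><h1>Einstein</h1><p>He won the Nobel prize.</p></body></html>"""

fields = extract_fields(SAMPLE)
```

A real search would loop over all the files on disk and call `page_matches` for each one, collecting the links of the pages that match.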



-----------------

Which of the two methods is better?

Any suggestions, optimizations, or advice
about either method are also welcome.

-------------------


Thanks
 
sdbillsfan@gmail.com
      06-08-2006
Ask yourself, WWGD (what would Google do). You definitely need to
create some sort of indexing tool here that spiders the pages in case
content changes, and then stores the indexed results in a DB. All that
being said, I wouldn't reinvent the wheel here. There are plenty of
third-party tools that do exactly what you want. Just search Google for
"intranet search engine".
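
To make the indexing idea concrete, here is a small sketch of an inverted index in Python: one pass over the pages builds a word-to-pages map, and each query is then a few dictionary lookups instead of a scan of every page. The function names and the two sample pages are assumptions for illustration; a real tool would also spider the site, strip the HTML, and persist the index (for example in a DB table).

```python
import re
from collections import defaultdict

WORD_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return WORD_RE.findall(text.lower())


def build_index(pages):
    """Build a word -> set-of-links map from a {link: plain_text} dict."""
    index = defaultdict(set)
    for link, text in pages.items():
        for word in tokenize(text):
            index[word].add(link)
    return index


def search(index, query):
    """Return the sorted links of pages containing every word in query."""
    words = tokenize(query)
    if not words:
        return []
    # Intersect the page sets of all query words (AND semantics).
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)


# Hypothetical pages, already reduced to plain text.
PAGES = {
    "/Einstein.htm": "Einstein won the Nobel prize",
    "/Newton.htm": "Newton formulated the laws of motion",
}

index = build_index(PAGES)
```

The index only needs rebuilding when pages change, so the cost of reading all the pages is paid once rather than on every search.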



 
rs
      06-09-2006
I will not add many pages (5-10 pages a year),
so indexing is not a problem.

I'm a new programmer and want to learn.

I'd like to get technical information
about sizes, speed, queries, caching...
so that I can finally decide which of the two methods is better...




 
sdbillsfan@gmail.com
      06-09-2006
You want to automate the indexing here: the flexibility it gives you
is well worth the effort of building it.
Store your collection/indexing results in a database, and querying,
caching, speed and storage size will largely be handled for you (you
can also learn about database tuning this way, a piece of knowledge
almost all programmers should have). You can use a built-in text
searching mechanism (every RDBMS that I know of has one) or write (or
reuse) an implementation of any of the string searching algorithms out
there. Make sure you abstract whatever implementation you choose for
each part (collection, indexing, searching, etc.) as much as possible,
so you can modify things as desired or needed (e.g. plugging in a
different search algorithm, database, etc.).
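
One possible shape for that abstraction, sketched in Python: the page-rendering code depends only on a small `Searcher` interface, so a naive substring scan can later be swapped for an index (or a DB-backed search) without touching the caller. All class and function names here are hypothetical, and the two implementations are deliberately simplistic.

```python
from typing import Protocol


class Searcher(Protocol):
    """Anything that can turn a search term into a list of page links."""

    def search(self, term: str) -> list[str]: ...


class SubstringSearcher:
    """Scans the text of every page on each query (no index)."""

    def __init__(self, pages: dict[str, str]) -> None:
        self._pages = pages

    def search(self, term: str) -> list[str]:
        needle = term.lower()
        return sorted(
            link for link, text in self._pages.items()
            if needle in text.lower()
        )


class WordIndexSearcher:
    """Builds a word -> links map once, then answers by lookup."""

    def __init__(self, pages: dict[str, str]) -> None:
        self._index: dict[str, set[str]] = {}
        for link, text in pages.items():
            for word in text.lower().split():
                self._index.setdefault(word, set()).add(link)

    def search(self, term: str) -> list[str]:
        return sorted(self._index.get(term.lower(), set()))


def render_results(searcher: Searcher, term: str) -> str:
    """Caller code: knows only the Searcher interface."""
    links = searcher.search(term)
    return "\n".join(links) if links else "No results"


# Hypothetical pages, already reduced to plain text.
PAGES = {
    "/Einstein.htm": "Einstein won the Nobel prize",
    "/Newton.htm": "Newton formulated the laws of motion",
}

scan_output = render_results(SubstringSearcher(PAGES), "nobel")
index_output = render_results(WordIndexSearcher(PAGES), "nobel")
```

Both calls go through the same `render_results` function; only the object passed in changes, which is the point of abstracting the search part.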




 