Snitz Forums 2000
 Searching Snitz DB using ASP.NET
wildfiction
Junior Member

167 Posts

Posted - 22 December 2005 : 15:02:13
I'm interested in writing some search code in ASP.NET that makes searching the Snitz DB lightning fast. Does anybody have any links or suggestions on where I can find some info to get started?

I'm generally looking for an algorithm that takes all of the forum text out of the DB and creates an index file that can itself be searched quickly and that refers back to the forum DB. The index file would obviously have to be updated at regular intervals to keep it current.

Any ideas or links for me?
Thanks!

laser
Advanced Member

Australia
3859 Posts

Posted - 22 December 2005 : 15:58:26
Why take the data out of the database? Wouldn't you just use the database directly?

wildfiction
Junior Member

167 Posts

Posted - 22 December 2005 : 18:47:29
laser - yes, you can use the DB directly, but then (correct me if I'm wrong) the search has to look at each and every record and scan all of its data for the word or phrase you're looking for, right?

If you had pre-processed all of those records and created a number of index files, you could look up a word and the index would tell you which records contain it.

So, for example, say you'd preprocessed your DB and generated 26 files, one for each letter of the alphabet.

Someone searches for 'acrobat', so your search code opens the a.ndx file, finds the word 'acrobat' in it, and discovers that records 25698, 26186, and 127969 contain that word.

You very quickly find the records you want. It is also quick to add NOT words. If you added '-circus' to your search, you'd open the c.ndx file, get the records that contain 'circus', and remove them from the results of the previous search.
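
The scheme described above is essentially an inverted index. Here is a minimal in-memory sketch in Python, assuming the per-letter .ndx files are modelled as dictionaries keyed by first letter. The function names (`build_index`, `lookup`, `search`) and the sample record texts are made up for illustration; only the record IDs and the 'acrobat -circus' query come from the post.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(records):
    """Build one bucket per first letter, standing in for a.ndx, c.ndx, etc.

    records: dict mapping record ID -> post text (hypothetical input shape).
    Returns: first letter -> word -> set of record IDs containing that word.
    """
    buckets = defaultdict(lambda: defaultdict(set))
    for record_id, text in records.items():
        for word in tokenize(text):
            buckets[word[0]][word].add(record_id)
    return buckets


def lookup(buckets, word):
    """Return a copy of the set of record IDs that contain the word."""
    word = word.lower()
    return set(buckets.get(word[0], {}).get(word, set()))


def search(buckets, query):
    """All plain words must match; words prefixed with '-' are excluded."""
    include, exclude = [], []
    for term in query.split():
        if term.startswith("-") and len(term) > 1:
            exclude.append(term[1:])
        else:
            include.append(term)
    if not include:
        return set()
    result = lookup(buckets, include[0])
    for word in include[1:]:
        result &= lookup(buckets, word)
    for word in exclude:
        result -= lookup(buckets, word)
    return result


# Sample data: the texts are invented, the IDs are the ones from the post.
records = {
    25698: "The acrobat joined the circus last year",
    26186: "Adobe Acrobat reader tips",
    127969: "Acrobat training schedule",
}
index = build_index(records)
print(sorted(search(index, "acrobat")))
print(sorted(search(index, "acrobat -circus")))
```

A real version would write each bucket to disk and rebuild or update it on a schedule, as the first post suggests. Note that this only matches whole words, so it does not replace a substring search.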


pdrg
Support Moderator

United Kingdom
2897 Posts

Posted - 23 December 2005 : 05:34:01
What's the DB server? Just indexing properly in the DB will help a lot, but if you're using MS SQL Server, have a look at the 'fulltext' searches. They're probably exactly what you want.

hth

wildfiction
Junior Member

167 Posts

Posted - 09 January 2006 : 15:10:40
Fulltext search is probably what would be best here. Thanks for the idea.

mios
Junior Member

United Kingdom
101 Posts

Posted - 10 January 2006 : 08:39:50
Take a look at Lucene (http://www.dotlucene.net/). It's a .NET port of Apache Lucene and is very fast and easy to use.

You basically add documents (a document is a collection of fields) to the index, so a document could contain topicID, subject, post, author, date, and so on. These fields are then fully searchable.

So, for example:

author:mios

Would return all my posts
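
To show the idea of fielded documents and a `field:term` query, here is a toy sketch in Python. It is not the Lucene API: the `FieldIndex` class and its methods are hypothetical, it handles only a single `field:term` query, and the sample posts are invented. It is only meant to show why a query like author:mios can be answered without scanning every post.

```python
import re
from collections import defaultdict


class FieldIndex:
    """Toy fielded index (hypothetical; not how Lucene is implemented)."""

    def __init__(self):
        # (field name, term) -> set of document IDs
        self.postings = defaultdict(set)
        self.docs = {}

    def add_document(self, doc_id, fields):
        """Store a document (a dict of field -> value) and index every field."""
        self.docs[doc_id] = fields
        for field, value in fields.items():
            for term in re.findall(r"[a-z0-9]+", str(value).lower()):
                self.postings[(field, term)].add(doc_id)

    def search(self, query):
        """Answer a single 'field:term' query with a sorted list of doc IDs."""
        field, _, term = query.partition(":")
        return sorted(self.postings.get((field, term.lower()), set()))


# Invented sample documents using the field names mentioned in the post.
index = FieldIndex()
index.add_document(1, {"topicID": 101, "subject": "Searching Snitz DB",
                       "post": "Take a look at Lucene", "author": "mios"})
index.add_document(2, {"topicID": 101, "subject": "Searching Snitz DB",
                       "post": "Why not use the database directly",
                       "author": "laser"})
print(index.search("author:mios"))
```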

I've used this for a DMS project and have indexed 20,000 documents (including the complete works of Shakespeare). The index size is about 6GB and a search takes about 0.5 sec.

You can also use it in conjunction with the highlighter object, which generates summaries of the returned items that depend on the search terms. Very cool! Oh, and it's free!