Snitz Forums 2000
Snitz Forums 2000
Home | Profile | Register | Active Topics | Members | Search | FAQ
Username:
Password:
Save Password
Forgot your Password?

 All Forums
 Help Groups for Snitz Forums 2000 Users
 Help: Database: MS SQL Server
 Search.asp: use full-text indexing?
 New Topic  Topic Locked
 Printer Friendly
Author Previous Topic Topic Next Topic  

aiken
Starting Member

28 Posts

Posted - 20 July 2002 :  13:23:30  Show Profile
Has anyone already modified a search.asp to use SQL's full text indexing? I'm about to embark on this task and thought I should make sure I'm not reinventing the wheel.

Cheers
-b

ruirib
Snitz Forums Admin

Portugal
26364 Posts

Posted - 20 July 2002 :  13:26:44  Show Profile  Send ruirib a Yahoo! Message
Not that I know of.

-------------------------------------------------
Installation Guide | Do's and Dont's | MODs
Go to Top of Page

HuwR
Forum Admin

United Kingdom
20595 Posts

Posted - 20 July 2002 :  13:39:25  Show Profile  Visit HuwR's Homepage
This would require you to convert your db to allow full text serching, something which is probably not worth the extra overhead for a facility very few people use.

Go to Top of Page

aiken
Starting Member

28 Posts

Posted - 20 July 2002 :  14:09:55  Show Profile
SQL Server really doesn't require any additional work to full-text index a database. It's pretty easy and painless; I already have a full-text catalog in the DB for other purposes, so it's just a matter of adding the _TOPICS and _REPLY tables to the index and writing a full-text query that unions the results of the two.

It should dramatically improve performance for search.asp, since it will be using the pre-built text indexes rather than a bunch of "like '%whatever%'"'s. I definitely see quite a bit of usage of the search page, so I'll go on with the project and post the modifications here.

Cheers
-b

Go to Top of Page

ruirib
Snitz Forums Admin

Portugal
26364 Posts

Posted - 20 July 2002 :  17:55:15  Show Profile  Send ruirib a Yahoo! Message
I agree with you on the performance benefits of using Full Text Indexing. Don't know if you have taken a look at the New Search feature in thesee forums. It's performance has increased so much that I thought HuwR had used full text indexing there also. He hasn't, but performance increased a lot anyway.

This doesn't mean that you should not use full-text indexing. I'm convinced that performance will be even better when using it. Unfortunately, from a Snitz point of view, this can only be used with SQL Server Dbs, so Access and MySQL won't benefit from it. But that should not stop you from doing it, and it would be most interesting to have your results posted here, maybe even as a mod.

-------------------------------------------------
Installation Guide | Do's and Dont's | MODs
Go to Top of Page

Gremlin
General Help Moderator

New Zealand
7528 Posts

Posted - 20 July 2002 :  18:56:17  Show Profile  Visit Gremlin's Homepage
I did muck around with a full text index about 6 months ago for Snitz, not sure if I've still got the code it definately did make an improvement however I've since found that most of the optimisation work I've done on the DB and number of calls mitigated the need for a full-text index so I dumped it.

www.daoc-halo.com

Edited by - Gremlin on 20 July 2002 18:56:35
Go to Top of Page

HuwR
Forum Admin

United Kingdom
20595 Posts

Posted - 20 July 2002 :  19:39:18  Show Profile  Visit HuwR's Homepage
quote:

SQL Server really doesn't require any additional work to full-text index a database.


No, but your server has to work a lot harder, since as well as running your db, it will mow be running mssearch which is what actually does the search against the catalog.

quote:

form microsofts technet article
Full-text indexing is a CPU and memory intensive operation. It is typical to see CPU usage close to 100 percent, especially during times of full population



Using the like clause results in a clustered index scan of the table (ie won't use afulltext index). A query using the contains clause performs a remote scan of the full text index, then a clustered index seek to locate the actual record.

Which is faster will be dependent upon your data , ie the number of rows etc. and the load on the server when the queery is executed.

Full text indexing is not designed to enhance the performance of a search queery, but to enable you to get different kinds of results like ranking and words which sound like the one you entered etc

Go to Top of Page

aiken
Starting Member

28 Posts

Posted - 20 July 2002 :  22:04:59  Show Profile
Hmm... well, HuwR, you're certainly plenty knowledgeable so I feel a bit odd contradicting you, but I still think FTS is an appealing solution.

First, a clustered index scan is a pretty expensive operation, especially on large tables, compared to a pre-built keyword index lookup from FTS combined with the clustered index seek. I guess it depends on table size, but I'm projecting forums with >250,000 posts within a year (and well over 750,000 total posts by that time), so it really does matter (I don't want to archive *anything* if I can avoid it).

FTS also gives you the benefit of built-in ranking, so search results can be more relevant.

However, I do agree that initial population of a FTS index is brutal on the CPU; but it's easy enough to schedule that for 3am or some other off-peak time. After that, change tracking and update in background allow for minimal impact.

I'll definitely look for the "new search" feature that ruirib mentioned; that may solve my problems and let me avoid creating one more branch away from standard snitz code. If I do end up implementing FTS, I'll post sample code and benchmark results here.

Cheers
-b

Go to Top of Page

HuwR
Forum Admin

United Kingdom
20595 Posts

Posted - 20 July 2002 :  22:22:04  Show Profile  Visit HuwR's Homepage
I'm not saying you shouldn't do it, just trying to show you the pifalls.

Web servers rarely have a fall off time, they are worldwide don't forget.
You may gain information ie ranking, but you will not see much improvement in performance
(depending on hardware configuration of course, but then if hardware is configured correctly, you will see improved performance anyway).

The 'New Search' referred to here is fast because it is a server side application, it does not run or pass it's data to you via IIS, it uses pretty much the same querry as the ordinary search, admittedly I tweeked it a little with better joins, but essentially it is the same query, which pretty conclusivley shows it is not queerying the database which is your bottleneck when talking to databases, but IIS. With some minor mods, the 'New search' could be made even faster, especially if I ditched ADO and talked directly to the SQL DB.

Go to Top of Page

grazman
Junior Member

USA
193 Posts

Posted - 21 July 2002 :  14:31:24  Show Profile  Visit grazman's Homepage
quote:
No, but your server has to work a lot harder, since as well as running your db, it will mow be running mssearch which is what actually does the search against the catalog.


I disagree. You only run mssearch when you are populating the index or searching it. You can schedule it to populate during downtimes. Contrary to what you say (in another reply) I think most web sites have downtimes. I know on sqlteam my peak hours (7AM - 7PM CST in America) are 4.5 times the traffic of my lowest hours. You can also do incrememtal populations that only update changed rows and then fully repopulate on the weekends or monthly.

With full-text indexing you are replacing a very slow search with a very fast search. A clustered index scan (as opposed to a clustered index seek) without a WHERE predicate in the SHOWPLAN is a very expensive operation. It's basically a table scan. The server is going through the entire table looking for rows that meet certain criteria. It's expensive in terms of disk I/O and expensive in terms of memory as all these pages are pulled off the disk and pulled into the memory buffer.

As an example, a clusted index seek on my FORUM_REPLY table runs 128.8 TIMES faster than a clustered index scan.

A full-text query searches it's index of words. It then joins that result set to the base table to return what you've requested. That should be significantly faster than running a clustered index scan (i.e. table scan).

quote:
This would require you to convert your db to allow full text serching, something which is probably not worth the extra overhead for a facility very few people use.


As was noted in other replies, you don't convert a database to use full-text indexing. You install the feature and then build full-text indexes.

quote:
I did muck around with a full text index about 6 months ago for Snitz, not sure if I've still got the code it definately did make an improvement however I've since found that most of the optimisation work I've done on the DB and number of calls mitigated the need for a full-text index so I dumped it.


This is a very interesting comment. I've been wanting to install full-text indexing on Snitz for a while myself. Unfortunately, my host doesn't support it. The key thing thing here is the reason the search page is slow isn't the searching for the keywords part. It's all the queries that are run on the resulting rows. In active.asp, topic.asp and forum.asp the forums run multiple queries PER ROW RETURNED. That's what slows down performance. That means that a search result that returns 100 topics will generate 200, 300 or maybe even 400 queries to get information about those 100 topics. It sounds like removing these is what Gremlin did to improve performance. I haven't tested the search result but I'm guessing it performs like the others do. They share code for moderation and subscriptions. This is the biggest thing you could do to improve performance in my opinion.

quote:
Full text indexing is not designed to enhance the performance of a search queery, but to enable you to get different kinds of results like ranking and words which sound like the one you entered etc


Again, I'd disagree with this statement. Full-text indexing is specifically designed to improve the performance of word searches. As part of this it provides some very neat additional features: ranking, advanced word matching, etc.

Overall, I'd say that full-text indexing would result in a performance improvement. It would also give you features that the search on Snitz doesn't have (i.e. matching ran to run). It does need more care and feeding than what's in place now though. Full-text indexing can be very particular and a pain to maintain. Microsoft says that for under a million rows indexed you don't need any specific configuration for it.

However, I think working to remove those extra queries run per row of returned data would give you a much, much larger performance improvement. As a matter of fact I think nothing anyone does to performance will make a huge difference until those extra, repetitive queries are removed. I'm also fairly certain that will be completed in the upcoming 3.4. And no, I don't know when it's coming out. And no, no else does either. Please don't ask.



SQLTeam.com - For SQL Server Developers and Administrators
Snitz Info - SQL Server info on Snitz Forums
Go to Top of Page

ruirib
Snitz Forums Admin

Portugal
26364 Posts

Posted - 21 July 2002 :  14:44:11  Show Profile  Send ruirib a Yahoo! Message
quote:

However, I think working to remove those extra queries run per row of returned data would give you a much, much larger performance improvement. As a matter of fact I think nothing anyone does to performance will make a huge difference until those extra, repetitive queries are removed. I'm also fairly certain that will be completed in the upcoming 3.4. And no, I don't know when it's coming out. And no, no else does either. Please don't ask.


I think something like this must have been done by HuwR in the New Search feature. Have you tried it? It would be nice if HuwR could publish the code sometime.

Regarding 3.4, although I cannot tell when it will be release, this time looks like it will be pretty soon. Certainly more than a week away, but maybe not that much more, I'd risk saying (and I'm saying it at my own risk, and no I have no previledge info nor am I a member of the Dev team).

-------------------------------------------------
Installation Guide | Do's and Dont's | MODs
Go to Top of Page
  Previous Topic Topic Next Topic  
 New Topic  Topic Locked
 Printer Friendly
Jump To:
Snitz Forums 2000 © 2000-2021 Snitz™ Communications Go To Top Of Page
This page was generated in 0.19 seconds. Powered By: Snitz Forums 2000 Version 3.4.07