Author |
Topic |
|
StephenD
Senior Member
Australia
1044 Posts |
Posted - 30 December 2008 : 13:36:43
|
I finally got round to archiving my forum and ran into a few issues with 'timeouts' halting the archive process. After a few goes the archiving all seemed to go well but I find that I now have a lot of duplicate entries in the forum_a_reply table showing the same reply_id.
Any way to scan for and remove these duplicate entries? I have 137,000 rows in this table and I think it should be @100,000< |
|
AnonJr
Moderator
United States
5768 Posts |
Posted - 30 December 2008 : 14:34:57
|
I want to say that there has been a script or two posted 'round here already as others who aren't in the habit of archiving frequently have run into the same problem. Have you searched yet?< |
|
|
StephenD
Senior Member
Australia
1044 Posts |
Posted - 30 December 2008 : 19:11:20
|
I found this post by Huwr which is relevant : http://forum.snitz.com/forum/topic.asp?ARCHIVE=true&TOPIC_ID=63288&SearchTerms=duplicate,replies
to determine whether you have duplicate topics or replies, run the following queries
for topics SELECT * FROM FORUM_A_TOPICS AT WHERE EXISTS(SELECT TOPIC_ID FROM FORUM_TOPICS T WHERE T.TOPIC_ID = AT.TOPIC_ID) for replies SELECT * FROM FORUM_A_REPLY AR WHERE EXISTS(SELECT REPLY_ID FROM FORUM_REPLY R WHERE R.REPLY_ID = AR.REPLY_ID)
when you have discovered which tables has the duplicates, I would strongly recomend making a back up of your database, then you can run the following queeries to remove the duplicate records.
topics DELETE FROM FORUM_A_TOPICS AT WHERE EXISTS (SELECT TOPIC_ID FROM FORUM_TOPICS T WHERE T.TOPIC_ID = FT.TOPIC_ID) replies DELETE FROM FORUM_A_REPLY AR WHERE EXISTS (SELECT REPLY_ID FROM FORUM_REPLY R WHERE R.REPLY_ID = AR.REPLY_ID)
which shows me I have 6227 rows in forum_a_reply that exist in forum_reply table but not how many duplicates in forum_a_reply.< |
|
|
StephenD
Senior Member
Australia
1044 Posts |
Posted - 30 December 2008 : 19:18:02
|
Found this bit of SQL: SELECT reply_id, COUNT(reply_id) AS NumOccurrences FROM forum_a_reply GROUP BY reply_id HAVING ( COUNT(reply_id) > 1 )
Gives me 12,249 rows with some of the NumOccurrences ranging from 2 to 9.< |
|
|
StephenD
Senior Member
Australia
1044 Posts |
Posted - 30 December 2008 : 19:34:58
|
... to remove is it then:
DELETE FROM FORUM_A_REPLY WHERE REPLY_ID IN(SELECT reply_id, COUNT(reply_id) AS NumOccurrences FROM forum_a_reply GROUP BY reply_id HAVING ( COUNT(reply_id) > 1 ))
Can anyone confirm that for me please.
Edit : this won't work as it will delete the ones I want to keep too I think.
Maybe select the rows with a count of 1 into another table. Drop Forum_a_reply then rename the new table as Forum_a_reply? Any indexes to re-establish? < |
Edited by - StephenD on 30 December 2008 19:37:14 |
|
|
ruirib
Snitz Forums Admin
Portugal
26364 Posts |
Posted - 30 December 2008 : 20:10:13
|
The idea would be to select just unique records into another table, delete the records from the current table and then retrieve the records from the unique recors table into the original table.< |
Snitz 3.4 Readme | Like the support? Support Snitz too |
|
|
StephenD
Senior Member
Australia
1044 Posts |
Posted - 31 December 2008 : 21:25:42
|
Still struggling with the SQL to dump unique records into another table. Anyone able to assist please.< |
|
|
ruirib
Snitz Forums Admin
Portugal
26364 Posts |
|
StephenD
Senior Member
Australia
1044 Posts |
Posted - 31 December 2008 : 23:23:20
|
MS SQL< |
|
|
ruirib
Snitz Forums Admin
Portugal
26364 Posts |
Posted - 01 January 2009 : 06:24:31
|
Ok, a quick solution then, with manual creation of two helper columns.
1. Create columns named R2 and R3 on FORUM_A_REPLY. These should be int columns.
2. Run this script:
DECLARE @C int;
SET @C=0;
UPDATE FORUM_A_REPLY SET R3=@C, @C = @C + 1;
DECLARE @prev int;
DECLARE @ID int,@R3 int;
DECLARE RC CURSOR
FOR SELECT REPLY_ID,R3 FROM FORUM_A_REPLY ORDER BY REPLY_ID;
SET NOCOUNT ON
SET @prev=0
OPEN RC
FETCH NEXT FROM RC INTO @ID,@R3
WHILE @@FETCH_STATUS=0
BEGIN
IF @ID = @prev
UPDATE FORUM_A_REPLY SET R2=1 WHERE R3=@R3
Else
BEGIN
UPDATE FORUM_A_REPLY SET R2=0 WHERE R3=@R3
SET @prev = @ID
END
FETCH NEXT FROM RC INTO @ID,@R3
END
CLOSE RC
DEALLOCATE RC
DELETE FROM FORUM_A_REPLY WHERE R2=1
3. Check that you have no further duplicate replies and remove the columns R2 and R3.
Done.< |
Snitz 3.4 Readme | Like the support? Support Snitz too |
|
|
StephenD
Senior Member
Australia
1044 Posts |
Posted - 01 January 2009 : 09:22:11
|
Thank you so much Rui! Almost 2hrs to run :)
Very handy piece of code. Cheers and happy new year!< |
|
|
ruirib
Snitz Forums Admin
Portugal
26364 Posts |
|
|
Topic |
|