SQL: Getting Forum Topics 1 per User?

Snitz™ Forums 2000
https://forum.snitz.com/forumTopic/Posts/69976?pagenum=1
05 November 2025, 12:35

Topic


SiSL
SQL: Getting Forum Topics 1 per User?
02 August 2011, 12:44


Is there an easy way to get the latest x topics, but with only 1 topic per author (or, instead of filtering by author, with no two topics sharing the same title, as with spammed ones)? For example, if "SiSL" opened 3 of the last 10 topics, I don't want all 3 of SiSL's topics, just 1 topic from SiSL and 1 topic per author for the rest, with topic IDs and such, for SQL Server.




 

Replies ...


Carefree
02 August 2011, 14:25


So you want distinct entries. You'd use something like this:
Code:
strSql = "SELECT DISTINCT T_AUTHOR FROM " & strActivePrefix & "TOPICS"
SiSL
02 August 2011, 19:11


Yup, but how can I join it with the other data in the Topics table?

SELECT TOPIC_ID, T_SUBJECT... ?
Shaggy
03 August 2011, 04:30


Don't know MSSQL but this should work for all database types:
Code:
SELECT DISTINCT T_AUTHOR,[OTHER_FIELDS]
FROM FORUM_TOPICS
WHERE [CLAUSES] GROUP BY T_AUTHOR
ORDER BY [SORT_FIELD] LIMIT [X]
ruirib
03 August 2011, 07:09


Shaggy,

I must say that I never really understood the use of a GROUP BY clause outside of an aggregation query, and I never use it. So I looked at this solution of yours with some curiosity, because I said to myself: this cannot work :P
I wrote the query in MySQL this way:
Code:

SELECT DISTINCT T_AUTHOR, T_SUBJECT, T_LAST_POST FROM FORUM_TOPICS
GROUP BY T_AUTHOR
ORDER BY T_LAST_POST DESC
LIMIT 20
I then executed this over a live forum and compared it with the results of the same query, without the GROUP BY clause.

Both queries return 20 records (I used 20 to make sure I had multiple posts by the same author), but the query with the GROUP BY did not return the correct results, in terms of what is desired: last topics by different authors.
These are the results, no GROUP BY on the left, with GROUP BY on the right (the screenshot is not preserved here):


I removed the titles to avoid identifying the forum, but this shows what I said before. The last post date values are correct for the topics on the left, but not for the topics on the right. I can't explain what is happening, but this reinforces my willingness to use GROUP BY only in aggregation queries.
I will try to post a solution that works in all cases.
ruirib
03 August 2011, 08:55


After giving this some considerable thought, I don't think it can be solved by a "simple" (or even complex) SQL statement. The simplest approach I could think of would be a multi-statement table-valued function. I will write one.
HuwR
03 August 2011, 09:56


In MSSQL you can't GROUP BY T_AUTHOR if the SELECT statement contains other columns; it will just error, as the other columns are neither in an aggregate function nor in the GROUP BY.
ruirib
03 August 2011, 10:01


Originally posted by HuwR
In MSSQL you can't GROUP BY T_AUTHOR if the SELECT statement contains other columns; it will just error, as the other columns are neither in an aggregate function nor in the GROUP BY.
I don't think that applies to a query that does not include aggregation functions, Huw.
HuwR
03 August 2011, 10:26


This query should give a list of the last 20 distinct authors who posted, and when. Once those results are found, you would need to squirt them into a temporary table and then join that to the topics table to get the other info you need, like TOPIC_ID etc. Not a simple thing to do.
SELECT TOP (20) T_AUTHOR, MAX(T_LAST_POST) AS LastPost
FROM FORUM_TOPICS
GROUP BY T_AUTHOR
ORDER BY LastPost DESC
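The aggregation step above can be tried without a SQL Server instance. The following is a minimal sketch, assuming an in-memory SQLite database as a stand-in, a cut-down FORUM_TOPICS table with made-up rows, and a hypothetical helper name `latest_post_per_author`; SQLite spells the row cap as LIMIT rather than TOP (n). It illustrates the limitation discussed here: the result holds only the author and the date, not the TOPIC_ID.

```python
import sqlite3


def latest_post_per_author(conn, n):
    """Return (T_AUTHOR, LastPost) for the n most recently active topic authors."""
    # Same shape as the T-SQL query above, with LIMIT standing in for TOP (n).
    sql = """
        SELECT T_AUTHOR, MAX(T_LAST_POST) AS LastPost
        FROM FORUM_TOPICS
        GROUP BY T_AUTHOR
        ORDER BY LastPost DESC
        LIMIT ?
    """
    return conn.execute(sql, (n,)).fetchall()


# Hypothetical sample data; dates use the 14-character YYYYMMDDHHMMSS string form.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FORUM_TOPICS ("
    "TOPIC_ID INTEGER PRIMARY KEY, T_AUTHOR INTEGER, "
    "T_SUBJECT TEXT, T_LAST_POST TEXT)"
)
conn.executemany(
    "INSERT INTO FORUM_TOPICS VALUES (?, ?, ?, ?)",
    [
        (1, 10, "First", "20110801090000"),
        (2, 10, "Second", "20110802120000"),
        (3, 20, "Hello", "20110801100000"),
        (4, 30, "Question", "20110803080000"),
        (5, 10, "Third", "20110802130000"),
    ],
)

rows = latest_post_per_author(conn, 2)
print(rows)
```

Author 10 has three topics in the sample data but appears only once in the result, paired with the date of the most recent one.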
HuwR
03 August 2011, 10:28


Originally posted by HuwR
In MSSQL you can't GROUP BY T_AUTHOR if the SELECT statement contains other columns; it will just error, as the other columns are neither in an aggregate function nor in the GROUP BY.
I don't think that applies to a query that does not include aggregation functions, Huw.
It applies to any query with more than one column in the SELECT statement; they must either be in the GROUP BY or in an aggregate function, as in my example above.
ruirib
03 August 2011, 10:42


Originally posted by HuwR
It applies to any query with more than one column in the SELECT statement; they must either be in the GROUP BY or in an aggregate function, as in my example above.
You are right about this. Your example is not the best one to counter my earlier objection, since you do have an aggregation function in there, but I confirmed it anyway.
HuwR
03 August 2011, 10:45


You can do it without the GROUP BY, but it still requires an aggregate function. It is just not possible to get all the info required in a simple or complex query, because you can't return all the info you need, like the topic ID etc., in the same query.
ruirib
03 August 2011, 10:50


Originally posted by HuwR
You can do it without the GROUP BY, but it still requires an aggregate function. It is just not possible to get all the info required in a simple or complex query, because you can't return all the info you need, like the topic ID etc., in the same query.
Yes, I had reached the same conclusion.
I was trying to use a function returning a table, but that is not a simple thing to do either, at least not without using cursors, which I am trying to avoid.
HuwR
03 August 2011, 10:56


OK, here goes. This should return the entire topic record for the last twenty topics posted to (only the latest topic for each author):

SELECT DISTINCT TOP (20) T_AUTHOR, MAX(T_LAST_POST) OVER(PARTITION BY T_AUTHOR) AS LastPost
INTO #TempTopics
FROM FORUM_TOPICS
ORDER BY LastPost DESC;

SELECT * FROM FORUM_TOPICS Topics, #TempTopics
WHERE Topics.T_AUTHOR = #TempTopics.T_AUTHOR AND Topics.T_LAST_POST = #TempTopics.LastPost
ruirib
03 August 2011, 11:01


No temp tables here:
Code:

CREATE FUNCTION GetLastTopicsOneByPoster()
RETURNS @topics TABLE
(
MEMBER_ID int PRIMARY KEY,
TOPIC_ID int
)
As
BEGIN
INSERT INTO @topics
SELECT T_AUTHOR, (SELECT TOPIC_ID FROM FORUM_TOPICS T2 WHERE T2.T_AUTHOR=T1.T_AUTHOR AND T2.T_LAST_POST=T1.LASTPOST) As T_ID
FROM (
SELECT TOP (10) T_AUTHOR, MAX(T_LAST_POST) AS LastPost
FROM FORUM_TOPICS
GROUP BY T_AUTHOR
ORDER BY LastPost DESC
) As T1

RETURN
END
and then

Code:

SELECT T.* FROM FORUM_TOPICS T INNER JOIN GetLastTopicsOneByPoster() G On T.TOPIC_ID=G.TOPIC_ID

Must say that I "pirated" from your initial query, Huw. I was following another path but this one is better. It returns just 10, but it's pretty easy to change it to return any number, passed as a parameter.
HuwR
03 August 2011, 11:07


Well, you are in fact creating a temp table :) That is what @topics is.
HuwR
03 August 2011, 11:09


You could rewrite my query as follows to use an in-memory table:

SELECT DISTINCT TOP (20) T_AUTHOR, MAX(T_LAST_POST) OVER(PARTITION BY T_AUTHOR) AS LastPost
INTO @TempTopics
FROM FORUM_TOPICS
ORDER BY LastPost DESC;

SELECT * FROM FORUM_TOPICS Tp INNER JOIN @TempTopics ON
Tp.T_AUTHOR = @TempTopics.T_AUTHOR AND Tp.T_LAST_POST = @TempTopics.LastPost

Actually, using an in-memory table is a bit more long-winded:
DECLARE @TempTopics TABLE (
T_AUTHOR INT,
LastPost VARCHAR(14) );

INSERT @TempTopics(T_AUTHOR,LASTPOST)
SELECT DISTINCT TOP (20) T_AUTHOR, MAX(T_LAST_POST) OVER(PARTITION BY T_AUTHOR) AS LastPost
FROM FORUM_TOPICS
ORDER BY LastPost DESC;

SELECT * FROM FORUM_TOPICS Tp INNER JOIN @TempTopics Tt ON
Tp.T_AUTHOR = Tt.T_AUTHOR AND Tp.T_LAST_POST = Tt.LastPost
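The OVER (PARTITION BY ...) idea used above can also be pushed one step further with ROW_NUMBER(), which numbers each author's topics from newest to oldest so that the outer query keeps only row 1. This is a sketch under stated assumptions, not something posted in this thread: it runs against an in-memory SQLite database (window functions need SQLite 3.25 or later) with made-up rows, the helper name `latest_topic_per_author_ranked` is hypothetical, and LIMIT stands in for TOP (n). ROW_NUMBER() itself has been available in SQL Server since the 2005 release. Breaking ties on TOPIC_ID is an added choice, so that an author with two topics sharing the same T_LAST_POST still yields a single row.

```python
import sqlite3


def latest_topic_per_author_ranked(conn, n):
    """Return the newest topic row per author, limited to the n most recent."""
    sql = """
        SELECT TOPIC_ID, T_AUTHOR, T_SUBJECT, T_LAST_POST
        FROM (
            SELECT TOPIC_ID, T_AUTHOR, T_SUBJECT, T_LAST_POST,
                   ROW_NUMBER() OVER (
                       PARTITION BY T_AUTHOR
                       ORDER BY T_LAST_POST DESC, TOPIC_ID DESC
                   ) AS rn
            FROM FORUM_TOPICS
        ) AS ranked
        WHERE rn = 1
        ORDER BY T_LAST_POST DESC
        LIMIT ?
    """
    return conn.execute(sql, (n,)).fetchall()


# Hypothetical sample data; topics 2 and 5 tie on T_LAST_POST for author 10.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FORUM_TOPICS ("
    "TOPIC_ID INTEGER PRIMARY KEY, T_AUTHOR INTEGER, "
    "T_SUBJECT TEXT, T_LAST_POST TEXT)"
)
conn.executemany(
    "INSERT INTO FORUM_TOPICS VALUES (?, ?, ?, ?)",
    [
        (1, 10, "First", "20110801090000"),
        (2, 10, "Second", "20110802130000"),
        (3, 20, "Hello", "20110801100000"),
        (4, 30, "Question", "20110803080000"),
        (5, 10, "Third", "20110802130000"),
    ],
)

ranked_topics = latest_topic_per_author_ranked(conn, 3)
for row in ranked_topics:
    print(row)
```

A join on T_AUTHOR and T_LAST_POST alone would return both tied topics for author 10; the TOPIC_ID tiebreak is what keeps it to one.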
ruirib
03 August 2011, 11:11


Originally posted by HuwR
Well, you are in fact creating a temp table :) That is what @topics is.
Yeah, just in memory... :D
HuwR
03 August 2011, 11:17


code posted above to do the same with just an inline query, no functions needed
ruirib
03 August 2011, 11:21


Originally posted by HuwR
code posted above to do the same with just an inline query, no functions needed
Go away, mine is much neater :D
ruirib
03 August 2011, 11:25


Originally posted by HuwR
code posted above to do the same with just an inline query, no functions needed
Oh yeah?
Look, no temp tables:

Code:

SELECT T.* FROM FORUM_TOPICS T INNER JOIN 
(SELECT T_AUTHOR, (SELECT TOPIC_ID FROM FORUM_TOPICS T2 WHERE T2.T_AUTHOR=T1.T_AUTHOR AND T2.T_LAST_POST=T1.LASTPOST) As T_ID
FROM (
SELECT TOP (10) T_AUTHOR, MAX(T_LAST_POST) AS LastPost
FROM FORUM_TOPICS
GROUP BY T_AUTHOR
ORDER BY LastPost DESC
)T1
) As MyT ON MyT.T_ID=T.TOPIC_ID
ORDER BY T_LAST_POST DESC

This one also runs on MySQL, just by replacing TOP by LIMIT (and placing limit at the end of the subquery).
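The derived-table query above can be exercised the same way with LIMIT in place of TOP. Below is a minimal sketch, assuming an in-memory SQLite database as a stand-in, made-up rows, and a hypothetical helper name `latest_topic_per_author`. One deliberate change from the posted query: the inner lookup uses MAX(TOPIC_ID) rather than a bare TOPIC_ID. On SQL Server, a scalar subquery that returns more than one row raises an error, which would happen if an author had two topics with the same T_LAST_POST; wrapping it in MAX is one possible way to keep it to a single value.

```python
import sqlite3


def latest_topic_per_author(conn, n):
    """Return full topic rows, one per author, for the n most recent authors."""
    sql = """
        SELECT T.*
        FROM FORUM_TOPICS T
        INNER JOIN (
            SELECT T1.T_AUTHOR,
                   (SELECT MAX(T2.TOPIC_ID)
                    FROM FORUM_TOPICS T2
                    WHERE T2.T_AUTHOR = T1.T_AUTHOR
                      AND T2.T_LAST_POST = T1.LastPost) AS T_ID
            FROM (
                SELECT T_AUTHOR, MAX(T_LAST_POST) AS LastPost
                FROM FORUM_TOPICS
                GROUP BY T_AUTHOR
                ORDER BY LastPost DESC
                LIMIT ?
            ) AS T1
        ) AS MyT ON MyT.T_ID = T.TOPIC_ID
        ORDER BY T.T_LAST_POST DESC
    """
    return conn.execute(sql, (n,)).fetchall()


# Hypothetical sample data; topics 2 and 5 tie on T_LAST_POST for author 10.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FORUM_TOPICS ("
    "TOPIC_ID INTEGER PRIMARY KEY, T_AUTHOR INTEGER, "
    "T_SUBJECT TEXT, T_LAST_POST TEXT)"
)
conn.executemany(
    "INSERT INTO FORUM_TOPICS VALUES (?, ?, ?, ?)",
    [
        (1, 10, "First", "20110801090000"),
        (2, 10, "Second", "20110802130000"),
        (3, 20, "Hello", "20110801100000"),
        (4, 30, "Question", "20110803080000"),
        (5, 10, "Third", "20110802130000"),
    ],
)

topics = latest_topic_per_author(conn, 2)
for row in topics:
    print(row)
```

The number of authors is passed as a parameter here, in line with the earlier remark that the hard-coded 10 is easy to replace.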
ruirib
03 August 2011, 11:34


This was fun, now to get back to work ;) Of course, I am hoping no one noticed that I contradicted myself and ended up solving it all with a very, very simple (:P) SQL statement.
HuwR
03 August 2011, 11:50


Well, to be honest, four nested selects isn't exactly simple :)
ruirib
03 August 2011, 11:55


Originally posted by HuwR
Well, to be honest, four nested selects isn't exactly simple :)
It's not simple?!! :D :P 8) >:)
I promise I wrote them one at a time ;)
HuwR
03 August 2011, 11:59


I still think my first attempt was the simplest :P I just should have dropped the #TempTopics table after doing the select :)
ruirib
03 August 2011, 12:26


From an execution point of view, I tested both ways with a database from a live Snitz forum, and your option had a higher subtree cost (0.8 vs. 3.17) and a higher number of scans and logical reads (a rather big difference here). Probably a disadvantage on a high-traffic forum.
I had performed a similar comparison before, with Snitz too (query performance has always been of interest to me) and using derived tables wasn't actually faster than using temporary tables, probably because the number of records involved was much higher. I thought it was interesting to compare it in this situation.
This is fun, but I have already spent 2 or 3 hours on this. I'd better go do something more boring :)
HuwR
03 August 2011, 13:10


:) Yes, I ought to finish what I was doing before getting sidetracked.