Postgres: limit the number of rows per id in a WHERE IN subquery on another table

2024-05-10 • Q&A

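I have a messaging application and need to return all the conversations a user belongs to, along with the messages associated with each conversation. I want to limit the number of messages returned per conversation.

The table structure is as follows:

users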

| id   | name | email    | created_at |
|------|------|----------|------------|
| 1    | Bob  | a@b.com  | timestamp  |
| 2    | Tom  | b@b.com  | timestamp  |
| 3    | Mary | c@b.com  | timestamp  |

messages

| id   | sender_id | conversation_id  | message | created_at |
|------|-----------|------------------|---------|------------|
| 1    | 1         | 1                | text    | timestamp  |
| 2    | 2         | 2                | text    | timestamp  |
| 3    | 2         | 1                | text    | timestamp  |
| 4    | 3         | 3                | text    | timestamp  |

conversations

| id | created_at |
|----|------------|
| 1  | timestamp  |
| 2  | timestamp  |
| 3  | timestamp  |

conversations_users

| id | user_id | conversation_id |
|----|---------|-----------------|
| 1  | 1       | 1               |
| 2  | 2       | 1               |
| 3  | 2       | 2               |
| 4  | 3       | 2               |
| 5  | 3       | 3               |
| 6  | 1       | 3               |

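I want to load all the conversations that the user with id 1 is in (conversations 1 and 3 in the example). For each conversation I need its messages, grouped by conversation_id and sorted by created_at ASC. My current query handles this: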

SELECT
    *
FROM
    messages
WHERE
    conversation_id IN (
        SELECT
            conversation_id
        FROM
            conversations_users
        WHERE
            user_id = 1
    )
ORDER BY
    conversation_id,created_at ASC;

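However, this pulls a lot of data into memory, so I want to limit the number of messages per conversation.

I have looked at rank() and ROW_NUMBER(), but I'm not sure how to implement them, or whether they are needed at all.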

row_number() can limit the result to 100 messages per conversation. Order by created_at descending to get the latest messages.
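A minimal sketch of that idea, assuming the tables from the question. It keeps the question's IN subquery, numbers the messages of each conversation from newest to oldest, and keeps the first 100. The alias rn, the derived-table name ranked, and the final ascending sort are illustrative choices, not part of the original answer.

```sql
-- Sketch only: the limit of 100 and the names rn / ranked are illustrative.
SELECT id, sender_id, conversation_id, message, created_at
FROM (
    SELECT
        m.*,
        ROW_NUMBER() OVER (
            PARTITION BY m.conversation_id
            ORDER BY m.created_at DESC      -- newest message gets rn = 1
        ) AS rn
    FROM messages m
    WHERE m.conversation_id IN (
        SELECT conversation_id
        FROM conversations_users
        WHERE user_id = 1
    )
) ranked
WHERE rn <= 100                              -- keep the 100 newest per conversation
ORDER BY conversation_id, created_at ASC;    -- show the kept rows oldest first
```

If two messages in a conversation can share the same created_at, adding m.id as a second key in the window's ORDER BY makes the cut-off deterministic.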

You can indeed use row_number(). The following query gives you the last 10 messages of each conversation for a given user:

select *
from (
    select 
        m.*,row_number() over(
            partition by cu.user_id,m.conversation_id 
            order by m.created_at desc
        ) rn
    from messages m
    inner join conversations_users cu 
        on  cu.conversation_id  = m.conversation_id 
        and cu.user_id = 1
) t
where rn <= 10
order by conversation_id,created_at desc

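Notes:

I turned the in subquery into a regular join, because I find it a neater way to express the requirement.
I added the user id to the partition clause, so if you remove the condition that filters on the user (cu.user_id = 1 in the join), you get the last 10 messages of each conversation for every user.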
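The second note can be sketched as follows, assuming the same tables. The user filter is dropped from the join, and cu.user_id is added to the select list (an addition for illustration) so each row shows which user it belongs to.

```sql
-- Sketch only: the same query without the user filter.
SELECT *
FROM (
    SELECT
        cu.user_id,                          -- added so rows can be told apart per user
        m.*,
        ROW_NUMBER() OVER (
            PARTITION BY cu.user_id, m.conversation_id
            ORDER BY m.created_at DESC
        ) AS rn
    FROM messages m
    INNER JOIN conversations_users cu
        ON cu.conversation_id = m.conversation_id
) t
WHERE rn <= 10
ORDER BY user_id, conversation_id, created_at DESC;
```

Because every member of a conversation sees the same messages, each message appears once per member in this result, which is why user_id is part of the partition.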

You can use ROW_NUMBER() to limit the number of messages per conversation. To get the latest ones:

SELECT m.*
FROM (SELECT m.*,
             -- seqnum = 1 is the newest message of each conversation
             ROW_NUMBER() OVER (PARTITION BY m.conversation_id ORDER BY m.created_at DESC) AS seqnum
      FROM messages m
     ) m JOIN
     conversations_users cu
     ON m.conversation_id = cu.conversation_id
WHERE cu.user_id = 1 AND seqnum <= <n>  -- replace <n> with the per-conversation limit
ORDER BY m.conversation_id, m.created_at ASC;

Another approach is a lateral join:

select m.*
from conversations_users cu cross join lateral
     (select m.*
      from messages m
      where m.conversation_id = cu.conversation_id
      order by m.created_at desc
      limit <n>  -- replace <n> with the per-conversation limit
     ) m
where cu.user_id = 1
order by m.conversation_id, m.created_at;

I think this may perform better on larger data sets, but you would need to test it.
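One way to run that test is sketched below. It assumes no suitable index exists yet. The index name is hypothetical, and LIMIT 10 stands in for the per-conversation limit. An index on (conversation_id, created_at DESC) matches the shape of the lateral subquery, but whether the planner uses it depends on the actual data.

```sql
-- Hypothetical index name; column order follows the lateral subquery:
-- equality on conversation_id, then created_at in descending order.
CREATE INDEX IF NOT EXISTS messages_conversation_created_idx
    ON messages (conversation_id, created_at DESC);

-- Run the query for real and report the plan, timings, and buffer usage.
EXPLAIN (ANALYZE, BUFFERS)
SELECT m.*
FROM conversations_users cu
CROSS JOIN LATERAL (
    SELECT m.*
    FROM messages m
    WHERE m.conversation_id = cu.conversation_id
    ORDER BY m.created_at DESC
    LIMIT 10                                 -- stands in for the per-conversation limit
) m
WHERE cu.user_id = 1;
```

Running the same EXPLAIN on the ROW_NUMBER() variants gives comparable timings on your own data.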
