Python中关于SQLAlchemy的计数函数count()的使用问题

是这样的,我有一个这么一个 query,它的 sql 语句是这样的:

SELECT post.id AS post_id, post.group_id AS post_group_id, post.topic_id AS post_topic_id, post.title AS post_title, post.ukey_origin AS post_ukey_origin, post.ukey_author AS post_ukey_author, post._summary AS post__summary, post.date_created AS post_date_created, post.date_modified AS post_date_modified, post.date_last_replied AS post_date_last_replied, post.date_digested AS post_date_digested, post.date_combined AS post_date_combined, post.is_replyable AS post_is_replyable, post.is_stick AS post_is_stick, post.is_digest AS post_is_digest, post.is_virgin AS post_is_virgin, post.is_anon AS post_is_anon, group_1.id AS group_1_id, group_1.name AS group_1_name, group_1.category_id AS group_1_category_id, group_1.icon AS group_1_icon, group_1.background AS group_1_background, group_1.introduction AS group_1_introduction, group_1.tags AS group_1_tags, group_1.friends_group AS group_1_friends_group, group_1.date_created AS group_1_date_created, group_1.date_modified AS group_1_date_modified, group_1.is_application_required AS group_1_is_application_required, group_1.is_publicly_readable AS group_1_is_publicly_readable, group_1.is_indexable AS group_1_is_indexable, group_1.level AS group_1_level, group_1.custom_info AS group_1_custom_info, group_1.reply_prompt AS group_1_reply_prompt
FROM post LEFT OUTER JOIN post_statistics ON post.id = post_statistics.id JOIN "group" AS group_1 ON group_1.id = post.group_id

但是一旦我执行了 query.count()之后,它的 inner join 就会消失,执行后的 sql 语句是这样子的:

SELECT count(*) AS count_1
FROM (SELECT post.id AS post_id, post.group_id AS post_group_id, post.topic_id AS post_topic_id, post.title AS post_title, post.ukey_origin AS post_ukey_origin, post.ukey_author AS post_ukey_author, post._summary AS post__summary, post.date_created AS post_date_created, post.date_modified AS post_date_modified, post.date_last_replied AS post_date_last_replied, post.date_digested AS post_date_digested, post.date_combined AS post_date_combined, post.is_replyable AS post_is_replyable, post.is_stick AS post_is_stick, post.is_digest AS post_is_digest, post.is_virgin AS post_is_virgin, post.is_anon AS post_is_anon
FROM post LEFT OUTER JOIN post_statistics ON post.id = post_statistics.id) AS anon_1

我看了一下 sqlalchemy orm 的 query 源码,发现里面用到了 from_self()函数,然后我试了一下如果不给这个函数传参数的话,sql 语句里的 inner join 是不会消失的;

但是如果我往里面传了 count(*)这个参数,sql 语句里的 inner join 就不见了;我看了一下 from_self()函数的实现,里面的 entity 参数按理说应该只是覆盖 sql 语句前面的 select 部分,我不明白为什么后面的 from 部分也会被覆盖掉。

希望有大神能看看,谢谢了


Python中关于SQLAlchemy的计数函数count()的使用问题

3 回复

SQLAlchemy里用count()确实有几个坑。最直接的方式是用func.count()

from sqlalchemy import func

# 统计User表总记录数
count = session.query(func.count(User.id)).scalar()
print(f"总用户数: {count}")

如果你要按条件统计,比如统计状态为’active’的用户:

active_count = session.query(func.count(User.id))\
    .filter(User.status == 'active').scalar()

注意.scalar()是获取单个值,.first()返回的是元组。还有个常见错误是直接对查询对象用.count()

# 这样效率低,会加载所有数据
query = session.query(User).filter(User.status == 'active')
bad_count = query.count()  # 不推荐

# 这样更好
good_count = session.query(func.count(User.id))\
    .filter(User.status == 'active').scalar()

func.count(1)func.count('*')也可以,但func.count(User.id)通常更明确。如果关联查询需要去重统计,加上.distinct()

# 统计有订单的不同用户数
users_with_orders = session.query(func.count(distinct(Order.user_id))).scalar()

总结:用func.count()配合.scalar()获取统计结果。


代码都没有, 怎么看看。。

大概看了一下
如果 group 的 id 字段有 UNIQUE 约束,post 的 id 字段有 NOT NULL 和 REFERENCES group(id) 约束
那么 INNER JOIN 和 count 的结果无关,可以被优化掉

我从来没用过 SQLAlchemy,至于是怎么优化掉的就不清楚了

回到顶部