Calculating a cumulative sum in PostgreSQL

I'm using count and group by to get the number of subscribers registered each day:
  SELECT created_at, COUNT(email)
    FROM subscriptions
GROUP BY created_at;
Result:
created_at  count
-----------------
04-04-2011  100
05-04-2011   50
06-04-2011   50
07-04-2011  300
I'd like to get the cumulative total of subscribers for each day instead. How do I get this?
created_at  count
-----------------
04-04-2011  100
05-04-2011  150
06-04-2011  200
07-04-2011  500
    
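Answers:
For larger data sets, window functions are the most efficient way to run this kind of query: the table is scanned only once, instead of once per date as a self-join would do. It also looks a lot simpler. :) PostgreSQL 8.4 and later support window functions. This is what it looks like: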
SELECT created_at, sum(count(email)) OVER (ORDER BY created_at)
FROM subscriptions
GROUP BY created_at;
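OVER creates the window here; ORDER BY created_at means that the counts are summed in created_at order.
Edit: If you want to remove duplicate emails within a single day, you can use sum(count(distinct email)). Unfortunately, this won't remove duplicates that span different dates.
If you want to remove all duplicates, I think the easiest way is to use a subquery and DISTINCT ON. This attributes each email to its earliest date (because the subquery sorts by created_at in ascending order, the earliest row is the one picked):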
SELECT created_at, sum(count(email)) OVER (ORDER BY created_at)
FROM (
    SELECT DISTINCT ON (email) created_at, email
    FROM subscriptions ORDER BY email, created_at
) AS subq
GROUP BY created_at;
If you create an index on (email, created_at), this query shouldn't be too slow either. (If you want to test it, this is how I created the sample data set.)
create table subscriptions as
   select date '2000-04-04' + (i/10000)::int as created_at,
          'foofoobar@foobar.com' || (i%700000)::text as email
   from generate_series(1,1000000) i;
create index on subscriptions (email, created_at);
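
The nested sum(count(email)) OVER (...) in the first query above can be hard to read at first. As a minimal sketch, the same running total can be written in two explicit steps. The names daily, daily_count, and cumulative_count are introduced here only for illustration and are not part of the original schema.

```sql
-- Step 1 (inner query): count sign-ups per day.
-- Step 2 (outer query): running sum over those daily counts, in date order.
-- "daily", "daily_count" and "cumulative_count" are illustrative names.
SELECT created_at,
       sum(daily_count) OVER (ORDER BY created_at) AS cumulative_count
FROM (
    SELECT created_at, count(email) AS daily_count
    FROM subscriptions
    GROUP BY created_at
) AS daily
ORDER BY created_at;
```

Because the aggregate is evaluated before the window function, this form and the one-line sum(count(email)) OVER (ORDER BY created_at) form should return the same rows; the one-line version simply folds both steps into a single SELECT.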
    
Use:
SELECT a.created_at,
       (SELECT COUNT(b.email)
          FROM SUBSCRIPTIONS b
         WHERE b.created_at <= a.created_at) AS count
  FROM SUBSCRIPTIONS a
    
SELECT
  s1.created_at,
  COUNT(s2.email) AS cumul_count
FROM subscriptions s1
  INNER JOIN subscriptions s2 ON s1.created_at >= s2.created_at
GROUP BY s1.created_at
    
I'm assuming you want only one row per day, and that you still want to show days without any subscriptions (if nobody subscribes on a certain date, do you want to show that date with the previous day's balance?). If that is the case, you can use the 'with' feature:
with recursive serialdates(adate) as (
    select cast('2011-04-04' as date)
    union all
    select adate + 1 from serialdates where adate < cast('2011-04-07' as date)
)
select D.adate,
(
    select count(distinct email)
    from subscriptions
    where created_at between date_trunc('month', D.adate) and D.adate
)
from serialdates D
    
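The best way is to have a calendar table:
calendar (
   date date,
   month int,
   quarter int,
   half int,
   week int,
   year int
)
Then you can join to that table to summarize by whichever field you need.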
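
As a rough sketch of the calendar idea, generate_series can stand in for a real calendar table. The example below assumes created_at is a date column and uses the date range from the question. The alias cal(day) and the column name cumulative_count are made up for this illustration. Days with no subscriptions still get a row and carry the previous total forward.

```sql
-- Assumption: subscriptions.created_at is of type date.
-- generate_series here plays the role of a calendar table for 4-7 April 2011.
SELECT cal.day::date AS created_at,
       sum(count(s.email)) OVER (ORDER BY cal.day) AS cumulative_count
FROM generate_series(date '2011-04-04',
                     date '2011-04-07',
                     interval '1 day') AS cal(day)
LEFT JOIN subscriptions s
       ON s.created_at = cal.day::date   -- no match => count(s.email) is 0 for that day
GROUP BY cal.day
ORDER BY cal.day;
```

Note that this only counts subscriptions inside the generated range; rows dated before the first calendar day would not be included in the running total unless the range is widened.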
