I have the following table:
id date cid birth_place location
1 2022-01-01 1 France Germany
2 2022-01-30 1 France France
3 2022-01-25 2 Spain Spain
4 2022-01-12 3 France France
5 2022-02-01 4 England Italy
6 2022-02-12 1 France France
7 2022-03-05 5 Spain England
8 2022-03-08 2 Spain Spain
9 2022-03-15 2 Spain Spain
10 2022-03-30 5 Spain Italy
11 2022-03-22 4 England England
12 2022-03-22 3 France England
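I am trying to count people per location for each month, but if a person returned to their birth place during that month, that location takes priority. Here is my desired output: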
date location count
2022-01-01 France 2
2022-01-01 Spain 1
2022-02-01 Italy 1
2022-02-01 France 1
2022-03-01 Spain 1
2022-03-01 England 3
cid 1 has a row with location = birth_place in 2022-01-01, and no other customer has Germany as a location in that period, so Germany does not appear in my desired output.
Edit: the values in the last row were incorrect, so I fixed them.
Edit: here is another view to further explain what I want to achieve. I hope it helps.
id date cid birth_place location | (I want to count distinct cids grouped by trunc('month', date) based on these values)
1 2022-01-01 1 France Germany | France (cid 1 has location France in Jan)
2 2022-01-30 1 France France | France
3 2022-01-25 2 Spain Spain | Spain
4 2022-01-12 3 France France | France
5 2022-02-01 4 England Italy | Italy (cid 4 has only location Italy in Feb and no England, so Italy is taken)
6 2022-02-12 1 France France | France
7 2022-03-05 5 Spain England | England (cid 5 has two locations that are not Spain, so the first one that the query sees is simply taken)
8 2022-03-08 2 Spain Spain | Spain
9 2022-03-15 2 Spain Spain | Spain
10 2022-03-30 5 Spain Italy | England
11 2022-03-22 4 England England | England
12 2022-03-22 3 France England | England
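To make the rule in the right-hand column concrete, here is a minimal Python sketch (not part of the original post). It assumes that "the first one that the query sees" means the row with the lowest id; the helper names `effective_locations` and `monthly_counts` are hypothetical.

```python
from collections import defaultdict

# Sample rows from the question: (id, date, cid, birth_place, location)
ROWS = [
    (1, "2022-01-01", 1, "France", "Germany"),
    (2, "2022-01-30", 1, "France", "France"),
    (3, "2022-01-25", 2, "Spain", "Spain"),
    (4, "2022-01-12", 3, "France", "France"),
    (5, "2022-02-01", 4, "England", "Italy"),
    (6, "2022-02-12", 1, "France", "France"),
    (7, "2022-03-05", 5, "Spain", "England"),
    (8, "2022-03-08", 2, "Spain", "Spain"),
    (9, "2022-03-15", 2, "Spain", "Spain"),
    (10, "2022-03-30", 5, "Spain", "Italy"),
    (11, "2022-03-22", 4, "England", "England"),
    (12, "2022-03-22", 3, "France", "England"),
]


def effective_locations(rows):
    """Map each (month, cid) to one location, preferring the birth place."""
    chosen = {}
    for _id, date, cid, birth_place, location in sorted(rows):  # lowest id first
        key = (date[:7] + "-01", cid)  # truncate the ISO date to its month
        is_home = location == birth_place
        # Keep the first location seen, unless a home row can replace a non-home one.
        if key not in chosen or (is_home and chosen[key] != birth_place):
            chosen[key] = location
    return chosen


def monthly_counts(rows):
    """Count distinct cids per (month, effective location)."""
    counts = defaultdict(int)
    for (month, _cid), location in effective_locations(rows).items():
        counts[(month, location)] += 1
    return dict(sorted(counts.items()))
```

On the sample rows, `monthly_counts(ROWS)` gives the six (month, location) counts listed in the desired output above.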
Edit: the SQL code I am using
CREATE TABLE tbl (
id int NOT NULL
, date date NOT NULL
, cid int NOT NULL
, birth_place text NOT NULL
, location text NOT NULL
);
INSERT INTO tbl VALUES
(1 , '2022-01-01', 1, 'France' , 'Germany')
, (2 , '2022-01-30', 1, 'France' , 'France')
, (3 , '2022-01-25', 2, 'Spain' , 'Spain')
, (4 , '2022-01-12', 3, 'France' , 'France')
, (5 , '2022-02-01', 4, 'England', 'Italy')
, (6 , '2022-02-12', 1, 'France' , 'France')
, (7 , '2022-03-05', 5, 'Spain' , 'England')
, (8 , '2022-03-08', 2, 'Spain' , 'Spain')
, (9 , '2022-03-15', 2, 'Spain' , 'Spain')
, (10, '2022-03-30', 5, 'Spain' , 'Italy')
, (11, '2022-03-22', 4, 'England', 'England')
, (12, '2022-03-22', 3, 'France' , 'England');
with
t as (
select id, date_trunc('month', date)::date AS date, cid, birth_place, location
from tbl),
t1 as (
select date, cid, location
from t
where birth_place = location),
t2 as (
select date, cid, location, row_number() over (partition by date, cid order by date) as row
from t
where birth_place <> location),
t3 as (
select t.*,
case
when t1.location is not null then t1.location
else t2.location
end as new_loc
from t
left join t1
on t.cid = t1.cid and t.date = t1.date
left join t2
on t.cid = t2.cid and t.date = t2.date and t2.row = 1)
select date, new_loc, count(distinct cid)
from t3
group by 1, 2
order by 1, 2
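Answer from a uj5u.com user:
Assuming this objective:
Truncate the date to the month.
Pick one location per (month, cid), with the home location taking priority.
Then count the rows per (month, location).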
SELECT date, location, count(*)
FROM (
SELECT DISTINCT ON (1, 2) -- choose **one** location per (month, cid)
date_trunc('month', date)::date AS date, cid, location
FROM tbl
ORDER BY 1, 2, birth_place = location DESC -- priority to home location, else **arbitrary**
) sub
GROUP BY 1, 2
ORDER BY 1, 2; -- optional
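For readers without a PostgreSQL instance at hand, the same idea can be tried with Python's built-in `sqlite3` module. This is only a sketch under stated assumptions: SQLite has no `DISTINCT ON` or `date_trunc`, so `row_number()` and `strftime` stand in for them (window functions need SQLite 3.25 or later), and an extra `id` tie-breaker is added so that the otherwise arbitrary pick becomes deterministic. The names `QUERY` and `run_query` are hypothetical.

```python
import sqlite3

# Sample rows from the question: (id, date, cid, birth_place, location)
ROWS = [
    (1, "2022-01-01", 1, "France", "Germany"),
    (2, "2022-01-30", 1, "France", "France"),
    (3, "2022-01-25", 2, "Spain", "Spain"),
    (4, "2022-01-12", 3, "France", "France"),
    (5, "2022-02-01", 4, "England", "Italy"),
    (6, "2022-02-12", 1, "France", "France"),
    (7, "2022-03-05", 5, "Spain", "England"),
    (8, "2022-03-08", 2, "Spain", "Spain"),
    (9, "2022-03-15", 2, "Spain", "Spain"),
    (10, "2022-03-30", 5, "Spain", "Italy"),
    (11, "2022-03-22", 4, "England", "England"),
    (12, "2022-03-22", 3, "France", "England"),
]

# row_number() = 1 plays the role of DISTINCT ON (month, cid);
# (birth_place = location) DESC puts home rows first, id breaks ties (assumption).
QUERY = """
SELECT month, location, count(*) AS cnt
FROM (
    SELECT strftime('%Y-%m-01', date) AS month, cid, location,
           row_number() OVER (
               PARTITION BY strftime('%Y-%m-01', date), cid
               ORDER BY (birth_place = location) DESC, id
           ) AS rn
    FROM tbl
) sub
WHERE rn = 1
GROUP BY month, location
ORDER BY month, location
"""


def run_query(rows=ROWS):
    """Load the sample rows into an in-memory database and run QUERY."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE tbl (id INTEGER, date TEXT, cid INTEGER, "
        "birth_place TEXT, location TEXT)"
    )
    con.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in run_query():
        print(row)
```

The `row_number()` form is more verbose than `DISTINCT ON`, but it should also carry over to other databases that support window functions.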
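Answer from a uj5u.com user:
My approach is to use case...when inside the count. That way it works whether or not the where filter is used, which leaves room for other aggregations in the same query later on.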
SELECT
date_trunc('month', date)::date AS date, t.location
, count(distinct (case when t.location=t.birth_place then t.cid else null end)) as "count"
FROM tbl AS t
WHERE t.location=t.birth_place
GROUP BY date_trunc('month', date)::date, t.location
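To see what this query returns on the sample data, here is a rough sketch using Python's built-in `sqlite3` module as a stand-in for PostgreSQL. The assumptions: `strftime` replaces `date_trunc`, the table is named `tbl` as in the question, and the names `TEMPLATE` and `home_counts` are hypothetical.

```python
import sqlite3

# Sample rows from the question: (id, date, cid, birth_place, location)
ROWS = [
    (1, "2022-01-01", 1, "France", "Germany"),
    (2, "2022-01-30", 1, "France", "France"),
    (3, "2022-01-25", 2, "Spain", "Spain"),
    (4, "2022-01-12", 3, "France", "France"),
    (5, "2022-02-01", 4, "England", "Italy"),
    (6, "2022-02-12", 1, "France", "France"),
    (7, "2022-03-05", 5, "Spain", "England"),
    (8, "2022-03-08", 2, "Spain", "Spain"),
    (9, "2022-03-15", 2, "Spain", "Spain"),
    (10, "2022-03-30", 5, "Spain", "Italy"),
    (11, "2022-03-22", 4, "England", "England"),
    (12, "2022-03-22", 3, "France", "England"),
]

# The CASE expression yields NULL for non-home rows, which count(DISTINCT ...) ignores.
TEMPLATE = """
SELECT strftime('%Y-%m-01', date) AS month, location,
       count(DISTINCT CASE WHEN location = birth_place THEN cid END) AS cnt
FROM tbl
{where_clause}
GROUP BY strftime('%Y-%m-01', date), location
ORDER BY 1, 2
"""


def home_counts(use_where, rows=ROWS):
    """Run the query with or without the WHERE filter on an in-memory table."""
    where_clause = "WHERE location = birth_place" if use_where else ""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE tbl (id INTEGER, date TEXT, cid INTEGER, "
        "birth_place TEXT, location TEXT)"
    )
    con.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?, ?)", rows)
    result = con.execute(TEMPLATE.format(where_clause=where_clause)).fetchall()
    con.close()
    return result
```

In this sketch the non-zero counts are the same with and without the filter; without it, locations visited only by non-natives show up with a count of 0. Note that only visits to the birth place are counted, so on the sample data the 2022-02-01 Italy row from the desired output is not produced and 2022-03-01 England comes out as 1.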
Tags: sql, postgresql, aggregate, greatest-n-per-group