Given the following table:
elems
['a', 'b', 'c', 'd', 'e']
['v', 'w', 'x', 'y']
I would like to turn it into something like this:
tuple
['a', 'b', 'c']
['b', 'c', 'd']
['c', 'd', 'e']
['v', 'w', 'x']
['w', 'x', 'y']
That is, I want all overlapping triples.
My current attempt is as follows:
WITH foo AS (
SELECT ['a', 'b', 'c', 'd', 'e'] AS elems UNION ALL
SELECT ['v', 'w', 'x', 'y']),
single AS (
SELECT * FROM
foo,
UNNEST(elems) elem
),
tuples AS (
SELECT ARRAY_AGG(elem) OVER (ROWS BETWEEN 2 PRECEDING AND 0 FOLLOWING) AS tuple
FROM single
)
SELECT * FROM tuples
WHERE ARRAY_LENGTH(tuple) >= 3
The problem is that it also returns some unwanted rows, namely the ones that fall "between" the original rows of the foo table.
tuple
['a', 'b', 'c']
['b', 'c', 'd']
['c', 'd', 'e']
['d', 'e', 'v'] <--- unwanted
['e', 'v', 'w'] <--- unwanted
['v', 'w', 'x']
['w', 'x', 'y']
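Also, is the row order of single guaranteed to be correct, or does it only work by accident in my minimal example because of the low cardinality? (I suspect there is a simpler solution that avoids this intermediate step.)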
A uj5u.com user replied:
Consider the following approach:
select [elems[offset(index - 1)], elems[offset(index)], elems[offset(index + 1)]] as tuple
from your_table, unnest([array_length(elems)]) len,
unnest(generate_array(1, len - 2)) index
When applied to the sample data in your question, the output is the set of overlapping triples requested there.
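For anyone who wants to run this directly, here is a minimal self-contained sketch of the same indexing idea against the sample data from the question. It assumes BigQuery (GoogleSQL); the foo CTE stands in for your_table, and folding len into the GENERATE_ARRAY call is a simplification of mine, not part of the query above.

```sql
-- Sketch only: foo reproduces the sample data from the question.
WITH foo AS (
  SELECT ['a', 'b', 'c', 'd', 'e'] AS elems UNION ALL
  SELECT ['v', 'w', 'x', 'y']
)
SELECT
  -- index is the zero-based position of the middle element of each triple
  [elems[OFFSET(index - 1)], elems[OFFSET(index)], elems[OFFSET(index + 1)]] AS tuple
FROM foo,
  -- positions 1 .. length - 2 are exactly those with a neighbour on both sides
  UNNEST(GENERATE_ARRAY(1, ARRAY_LENGTH(elems) - 2)) AS index
```

Because every triple is built from a single elems value, no triple can span two source rows. Arrays with fewer than three elements yield an empty GENERATE_ARRAY and so contribute no rows.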
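A uj5u.com user replied:
You can consider the following query.
Also, is the row order of single guaranteed to be correct, or does it only work by accident in my minimal example because of the low cardinality?
AFAIK, it is not guaranteed unless you explicitly use WITH OFFSET in the query.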
WITH foo AS (
SELECT ['a', 'b', 'c', 'd', 'e'] AS elems UNION ALL
SELECT ['v', 'w', 'x', 'y']),
single AS (
SELECT * FROM
foo,
UNNEST(elems) elem WITH OFFSET
),
tuples AS (
SELECT ARRAY_AGG(elem) OVER (PARTITION BY FORMAT('%t', elems) ORDER BY offset ROWS BETWEEN 2 PRECEDING AND 0 FOLLOWING) AS tuple
FROM single
)
SELECT * FROM tuples
WHERE ARRAY_LENGTH(tuple) >= 3;
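One caveat with partitioning by FORMAT('%t', elems): two source rows holding identical arrays would land in the same partition, and their elements could then be mixed within a window. A possible variation, assuming the table has (or can be given) a unique row key, is sketched below. The id column and the pos alias are hypothetical names introduced for illustration; they are not part of the original data.

```sql
-- Sketch only: id is an assumed unique key per source row.
WITH foo AS (
  SELECT 1 AS id, ['a', 'b', 'c', 'd', 'e'] AS elems UNION ALL
  SELECT 2, ['v', 'w', 'x', 'y']
),
single AS (
  -- WITH OFFSET exposes each element's position so the order is explicit
  SELECT id, elem, pos
  FROM foo, UNNEST(elems) AS elem WITH OFFSET AS pos
),
tuples AS (
  SELECT
    id,
    pos,
    -- the window never leaves the current source row because of PARTITION BY id
    ARRAY_AGG(elem) OVER (
      PARTITION BY id
      ORDER BY pos
      ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS tuple
  FROM single
)
SELECT tuple
FROM tuples
WHERE ARRAY_LENGTH(tuple) >= 3  -- drop the first two, incomplete windows of each row
ORDER BY id, pos
```

The final ORDER BY is only there to make the result order deterministic; drop it if the order of the output rows does not matter.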
A uj5u.com user replied:
Just to give you another idea:
create temp function slice(arr ARRAY<string>, pos float64, len float64)
returns array<string> language js as
r"return arr.slice(pos, pos + len);";
select slice(elems, index, 3) as tuple
from foo, unnest([array_length(elems)]) len,
unnest(generate_array(0, len - 3)) index
I leave it to you to refactor the above query until it looks like this:
select tuple
from foo, unnest(slices(elems, 3)) as tuple
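One possible shape for such a slices function is sketched below as a SQL UDF. This is an assumption about what the refactoring could look like, not the answerer's own code. Since BigQuery does not allow arrays of arrays, the sketch wraps each slice in a STRUCT, so the final query reads s.tuple rather than tuple; the names n, start, p and s are made up for illustration.

```sql
-- Sketch only: returns every contiguous slice of length n, each wrapped in a STRUCT.
CREATE TEMP FUNCTION slices(arr ARRAY<STRING>, n INT64) AS (
  ARRAY(
    SELECT AS STRUCT
      ARRAY(
        SELECT x
        FROM UNNEST(arr) AS x WITH OFFSET AS p
        WHERE p BETWEEN start AND start + n - 1  -- keep positions start .. start + n - 1
        ORDER BY p
      ) AS tuple
    FROM UNNEST(GENERATE_ARRAY(0, ARRAY_LENGTH(arr) - n)) AS start
    ORDER BY start
  )
);

WITH foo AS (
  SELECT ['a', 'b', 'c', 'd', 'e'] AS elems UNION ALL
  SELECT ['v', 'w', 'x', 'y']
)
SELECT s.tuple
FROM foo, UNNEST(slices(elems, 3)) AS s
```

Arrays shorter than n produce an empty result from slices and therefore no output rows.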
Tags: Google Cloud Collective, sql, google-bigquery