Replace zeros with the last non-zero value of each column in a MultiIndex DataFrame

I have the following MultiIndex DataFrame:

        0   1   2   3
0   0   0   0   0   0
0   1   36  19  2   4
0   2   233 21  2   4
0   3   505 25  1   4
0   4   751 27  1   4
0   5   976 28  1   4
0   9   0   0   0   0
0   10  0   0   0   0
0   11  0   0   0   0
0   12  0   0   0   0
1   0   40  19  2   4
1   1   323 18  1   4
1   2   595 24  1   4
1   3   844 26  1   4
1   4   0   0   0   0
1   5   0   0   0   0
1   9   0   0   0   0
1   10  0   0   0   0
1   11  0   0   0   0
1   12  0   0   0   0
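
For anyone who wants to try the answers, here is a minimal sketch that rebuilds the frame above. The construction (MultiIndex.from_product and the variable names inner, zeros, rows) is my own assumption; the question does not show how df was created.

```python
import pandas as pd

# Second-level labels repeat in every group; note the jump from 5 to 9.
inner = [0, 1, 2, 3, 4, 5, 9, 10, 11, 12]
index = pd.MultiIndex.from_product([[0, 1], inner])

zeros = [0, 0, 0, 0]
rows = (
    # group 0: one zero row, five data rows, four trailing zero rows
    [zeros, [36, 19, 2, 4], [233, 21, 2, 4], [505, 25, 1, 4],
     [751, 27, 1, 4], [976, 28, 1, 4]] + [zeros] * 4
    # group 1: four data rows, six trailing zero rows
    + [[40, 19, 2, 4], [323, 18, 1, 4], [595, 24, 1, 4],
       [844, 26, 1, 4]] + [zeros] * 6
)
df = pd.DataFrame(rows, index=index, columns=range(4))
print(df)
```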

What is the simplest way to repeat the last value greater than zero until the end of each group? The expected result is:

        0   1   2   3
0   0   0   0   0   0
0   1   36  19  2   4
0   2   233 21  2   4
0   3   505 25  1   4
0   4   751 27  1   4
0   5   976 28  1   4
0   9   976 28  1   4
0   10  976 28  1   4
0   11  976 28  1   4
0   12  976 28  1   4
1   0   40  19  2   4
1   1   323 18  1   4
1   2   595 24  1   4
1   3   844 26  1   4
1   4   844 26  1   4
1   5   844 26  1   4
1   9   844 26  1   4
1   10  844 26  1   4
1   11  844 26  1   4
1   12  844 26  1   4

Thanks

uj5u.com user reply:

You can use replace with method='ffill' on each group:

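# group_keys=False keeps the original two-level index in newer pandas versions.
# Note: the `method` argument of replace is deprecated in recent pandas releases.
out = df.groupby(level=0, group_keys=False).apply(lambda x: x.replace(0, method='ffill'))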
print(out)

# Output
        0   1  2  3
0 0     0   0  0  0
  1    36  19  2  4
  2   233  21  2  4
  3   505  25  1  4
  4   751  27  1  4
  5   976  28  1  4
  9   976  28  1  4
  10  976  28  1  4
  11  976  28  1  4
  12  976  28  1  4
1 0    40  19  2  4
  1   323  18  1  4
  2   595  24  1  4
  3   844  26  1  4
  4   844  26  1  4
  5   844  26  1  4
  9   844  26  1  4
  10  844  26  1  4
  11  844  26  1  4
  12  844  26  1  4
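
Grouping by the first level matters when a group begins with zeros: without it, the fill carries the last value of the previous group across the boundary. The sketch below shows the difference on a small made-up Series (s and its values are hypothetical, not from the question). It uses where plus ffill instead of replace(method='ffill'), which is a swapped-in technique and not the exact call from the answer.

```python
import pandas as pd

idx = pd.MultiIndex.from_product([[0, 1], [0, 1, 2]])
s = pd.Series([3, 9, 0, 0, 4, 0], index=idx)   # group 1 starts with a zero

masked = s.where(s.ne(0))                      # zeros become NaN
ungrouped = masked.ffill()                     # ignores group boundaries
grouped = masked.groupby(level=0).ffill()      # restarts in every group

print(ungrouped.tolist())             # the 9 leaks into the first row of group 1
print(grouped.fillna(0).tolist())     # the leading zero of group 1 is kept
```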

uj5u.com user reply:

You can split the DataFrame into one list per column and loop over each list:

l = [36, 233, 505, 751, 976, 0, 0, 0, 0, 0, 40, 323, 595, 844, 0, 0, 0, 0, 0]

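# Walk the list, remembering the last non-zero value seen so far.
# Note: the list carries no group information, so a group that starts
# with 0 would inherit the value of the previous group.
current = l[0]
for i in range(len(l)):
    if l[i] != 0:
        current = l[i]   # non-zero value: remember it
    else:
        l[i] = current   # zero: overwrite with the remembered value

print(l)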

Output:

[36, 233, 505, 751, 976, 976, 976, 976, 976, 976, 40, 323, 595, 844, 844, 844, 844, 844, 844]
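
The answer only shows the loop on a single flat list. One way to apply the same idea to every column of every group could look like the sketch below. The names fill_zeros_with_last, fill_column and df_demo are hypothetical, and the small frame is made-up sample data.

```python
import pandas as pd

def fill_zeros_with_last(values):
    """Return a list where each 0 is replaced by the last non-zero value seen."""
    filled = []
    current = 0                      # nothing seen yet: leading zeros stay 0
    for v in values:
        if v != 0:
            current = v
        filled.append(current)
    return filled

def fill_column(col):
    # Run the loop separately on each first-level group of one column,
    # so the remembered value is reset at every group boundary.
    return col.groupby(level=0).transform(
        lambda g: pd.Series(fill_zeros_with_last(g), index=g.index)
    )

idx = pd.MultiIndex.from_product([[0, 1], [0, 1, 2, 3]])
df_demo = pd.DataFrame({"a": [0, 36, 233, 0, 40, 0, 0, 0],
                        "b": [19, 0, 0, 0, 0, 18, 24, 0]}, index=idx)

out_demo = df_demo.apply(fill_column)   # one call per column
print(out_demo)
```

Like the loop above, this fills every zero that follows a non-zero value, not only the trailing ones, and it will be slower than the vectorized answers on large frames.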

uj5u.com user reply:

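You can use a mask to select the trailing run of zeros in each group (with the help of GroupBy.cummax), then ffill to replace those zeros with the last non-zero value of the group: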

# select the last stretch of zeros per group
mask = df[::-1].groupby(level=0).cummax().eq(0)

# mask the above found values and ffill them
out = df.mask(mask).ffill(downcast='infer')

Output:

        0   1  2  3
0 0     0   0  0  0
  1    36  19  2  4
  2   233  21  2  4
  3   505  25  1  4
  4   751  27  1  4
  5   976  28  1  4
  9   976  28  1  4
  10  976  28  1  4
  11  976  28  1  4
  12  976  28  1  4
1 0    40  19  2  4
  1   323  18  1  4
  2   595  24  1  4
  3   844  26  1  4
  4   844  26  1  4
  5   844  26  1  4
  9   844  26  1  4
  10  844  26  1  4
  11  844  26  1  4
  12  844  26  1  4
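
To see why the reversed cummax finds only the trailing zeros, here is a step-by-step sketch on a small made-up Series (s, rev_cummax, trailing_zero and filled are hypothetical names). It assumes the values are non-negative, as in the question; with negative values the running maximum could stay at 0 and mask too much.

```python
import pandas as pd

idx = pd.MultiIndex.from_product([[0, 1], [0, 1, 2, 3]])
s = pd.Series([0, 36, 0, 0, 40, 0, 7, 0], index=idx)

# Walk each group from its last row upwards: the running maximum stays 0
# only while every row seen so far (that is, every later row) is 0.
rev_cummax = s[::-1].groupby(level=0).cummax()
trailing_zero = rev_cummax.eq(0).reindex(s.index)   # back to original order

# Hide the trailing zeros and copy the last visible value downwards.
filled = s.mask(trailing_zero).ffill().astype("int64")

print(trailing_zero.tolist())
print(filled.tolist())
```

The zero at (1, 1) sits between two non-zero values, so it is not part of the trailing run and stays 0.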

uj5u.com user reply:

Sample data that shows the difference between the solutions:

print (df)

        0   1  2  3
0 0     0   0  0  0
  1    36  19  2  4
  2     0   0  2  4
  3     0  25  1  4
  4   751  27  1  4
  5   976  28  1  4
  9     0   0  0  0
  10    0   0  0  0
  11    0   0  0  0
  12    0   0  0  0
1 0     0  19  2  4
  1   323  18  1  4
  2   595  24  1  4
  3   844  26  1  4
  4     0   0  0  0
  5     0   0  0  0
  9     0   0  0  0
  10    0   0  0  0
  11    0   0  0  0
  12    0   0  0  0

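Replace 0 with missing values and use GroupBy.ffill by the first level of the MultiIndex. Note that this also replaces zeros in the middle of a group, if there are any: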

import numpy as np

df1 = df.replace(0, np.nan).groupby(level=0).ffill().fillna(0, downcast='infer')
print (df1)
        0   1  2  3
0 0     0   0  0  0
  1    36  19  2  4
  2    36  19  2  4
  3    36  25  1  4
  4   751  27  1  4
  5   976  28  1  4
  9   976  28  1  4
  10  976  28  1  4
  11  976  28  1  4
  12  976  28  1  4
1 0     0  19  2  4
  1   323  18  1  4
  2   595  24  1  4
  3   844  26  1  4
  4   844  26  1  4
  5   844  26  1  4
  9   844  26  1  4
  10  844  26  1  4
  11  844  26  1  4
  12  844  26  1  4

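The new solution replaces only the trailing zeros of each group. It finds them by back filling per group and testing which values are still missing: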

# after a per-group back fill, only the trailing zeros are still NaN
trailing = df.replace(0, np.nan).groupby(level=0).bfill().isna()
df2 = df.mask(trailing).ffill(downcast='infer')

print (df2)
        0   1  2  3
0 0     0   0  0  0
  1    36  19  2  4
  2     0   0  2  4
  3     0  25  1  4
  4   751  27  1  4
  5   976  28  1  4
  9   976  28  1  4
  10  976  28  1  4
  11  976  28  1  4
  12  976  28  1  4
1 0     0  19  2  4
  1   323  18  1  4
  2   595  24  1  4
  3   844  26  1  4
  4   844  26  1  4
  5   844  26  1  4
  9   844  26  1  4
  10  844  26  1  4
  11  844  26  1  4
  12  844  26  1  4
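
The downcast argument used above is deprecated in recent pandas releases, and the final ffill is not grouped, so a group made up entirely of zeros would pick up values from the previous group. A sketch of a variant that avoids both, under my own assumptions (fill_trailing_zeros and demo are hypothetical names, and the data is made up):

```python
import pandas as pd

def fill_trailing_zeros(frame):
    """Replace only the trailing zeros of each first-level group."""
    as_nan = frame.where(frame.ne(0))                 # zeros become NaN
    # After back filling inside a group, only the trailing zeros are still NaN.
    trailing = as_nan.groupby(level=0).bfill().isna()
    filled = frame.mask(trailing).groupby(level=0).ffill()
    # A group consisting only of zeros has nothing to copy: put its zeros back.
    return filled.fillna(0).astype(frame.dtypes)

idx = pd.MultiIndex.from_product([[0, 1], [0, 1, 2, 3]])
demo = pd.DataFrame({"a": [36, 0, 751, 0, 0, 0, 0, 0],
                     "b": [0, 25, 0, 0, 19, 18, 0, 0]}, index=idx)

result = fill_trailing_zeros(demo)
print(result)
```

In column a the zero between 36 and 751 is kept, and the all-zero second group stays at 0 instead of inheriting 751.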

Please credit the source when reposting. Link to this article: https://www.uj5u.com/houduan/496673.html
