How to get the difference between the current time and 15 seconds ahead in pandas?

I have a pandas DataFrame that stores stock prices and timestamps. The time column is a datetime column (parsed with pd.to_datetime).

Here is a demo:

import pandas as pd
df = pd.DataFrame([['2022-09-01 09:33:00', 100.], ['2022-09-01 09:33:14', 101.], ['2022-09-01 09:33:16', 99.4], ['2022-09-01 09:33:30', 100.9]], columns=['time', 'price'])
df['time'] = pd.to_datetime(df['time'])

In [11]: df
Out[11]: 
                 time  price
0 2022-09-01 09:33:00  100.0
1 2022-09-01 09:33:14  101.0
2 2022-09-01 09:33:16   99.4
3 2022-09-01 09:33:30  100.9

I want to compute the future return over 15 seconds (the first price at least 15 seconds later, minus the current price).

What I want is:

In [13]: df
Out[13]: 
                 time  price  return
0 2022-09-01 09:33:00  100.0    -0.6  // the future price is 99.4, period is 16s
1 2022-09-01 09:33:14  101.0    -0.1  // the future price is 100.9, period is 16s
2 2022-09-01 09:33:16   99.4     NaN
3 2022-09-01 09:33:30  100.9     NaN

I know df.diff gives the difference between consecutive rows by index. Is there a good way to do this by time instead?
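
For context, here is a minimal sketch of what `diff` returns on the demo frame. It subtracts the previous row whatever the elapsed time, so on its own it cannot express "the first price at least 15 seconds later". The names `row_diff` and `gap` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    [['2022-09-01 09:33:00', 100.0],
     ['2022-09-01 09:33:14', 101.0],
     ['2022-09-01 09:33:16', 99.4],
     ['2022-09-01 09:33:30', 100.9]],
    columns=['time', 'price'],
)
df['time'] = pd.to_datetime(df['time'])

# diff() compares each row with the previous row, regardless of elapsed time
row_diff = df['price'].diff()

# The gaps between consecutive rows are uneven (14s, 2s, 14s),
# so a fixed row offset does not correspond to a fixed time window
gap = df['time'].diff().dt.total_seconds()

print(pd.DataFrame({'row_diff': row_diff, 'gap_seconds': gap}))
```

Because the gaps are irregular, a time-aware lookup is needed rather than a positional shift.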

A uj5u.com user replied:

`merge_asof` to the rescue

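Subtract a 15s timedelta from the time column of the right dataframe, then use merge_asof on time with direction='forward'. This selects the first row in the right dataframe whose on key is greater than or equal to the on key in the left dataframe, which is the first row at least 15 seconds later. Then subtract the price columns to compute the return.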

df1 = pd.merge_asof(
    left=df,
    right=df.assign(time=df['time'] - pd.Timedelta('15s')),
    on='time', direction='forward', suffixes=['', '_r']
)

df1['return'] = df1.pop('price_r') - df1['price']

Result

                 time  price  return
0 2022-09-01 09:33:00  100.0    -0.6
1 2022-09-01 09:33:14  101.0    -0.1
2 2022-09-01 09:33:16   99.4     NaN
3 2022-09-01 09:33:30  100.9     NaN
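
As a cross-check on the same lookup, here is a sketch of an alternative using NumPy's `searchsorted`. This is not the answerer's method, only another way to find "the first row at least 15 seconds later". It assumes the frame is sorted by time, and the names `pos`, `has_future` and `future_price` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [['2022-09-01 09:33:00', 100.0],
     ['2022-09-01 09:33:14', 101.0],
     ['2022-09-01 09:33:16', 99.4],
     ['2022-09-01 09:33:30', 100.9]],
    columns=['time', 'price'],
)
df['time'] = pd.to_datetime(df['time'])
# searchsorted requires sorted input
df = df.sort_values('time').reset_index(drop=True)

times = df['time'].to_numpy()
prices = df['price'].to_numpy()

# For each row, the position of the first timestamp >= time + 15s
pos = np.searchsorted(times, times + np.timedelta64(15, 's'), side='left')

# A position equal to len(df) means no row lies 15s or more ahead
has_future = pos < len(df)
future_price = np.full(len(df), np.nan)
future_price[has_future] = prices[pos[has_future]]

df['return'] = future_price - prices
print(df)
```

On the demo data this pairs row 0 with row 2 and row 1 with row 3, matching the merge_asof result up to floating-point rounding.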

A uj5u.com user replied:

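Please try this (though I am not convinced the output makes much sense :-( ). Is this what you expect? (I realize this code assigns the return for the previous 15 seconds rather than the next 15 seconds. But that is how returns are usually indexed: at the time they are realized, not while they are still expected in the future.)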

import numpy as np
import pandas as pd

df = pd.DataFrame([['2022-09-01 09:33:00', 100.], ['2022-09-01 09:33:14', 101.], ['2022-09-01 09:33:16', 99.4], ['2022-09-01 09:33:30', 100.9]], columns=['time', 'price'])

df['time'] = pd.to_datetime(df['time'])
df = df.sort_values('time').reset_index(drop=True)

df.loc[:, 'return'] = df['price'].diff()

df['time_diff'] = df['time'].diff()

df['15sec_or_more'] = (df['time_diff'] >= np.timedelta64(15, 's'))

for k, i in enumerate(df.index):
    if k:
        if not df.loc[i,'15sec_or_more']:
            temp = df.iloc[k:].loc[:,['return','time_diff']].cumsum(axis=0)
            conds = (temp['time_diff'] >= np.timedelta64(15, 's'))
            if conds.sum():
                true_return_index = conds.idxmax()
                df.loc[i, 'return'] = df.loc[true_return_index, 'return']
            else:
                df.loc[i, 'return'] = np.nan

df = df[['time', 'price' ,'return']]
print(df)
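
The idea of indexing a return at the time it is realized can also be sketched with `merge_asof` and direction='backward'. This is an assumption about what a trailing version might look like, not the code above: here the return at each row is defined as the current price minus the most recent price that is at least 15 seconds old. The names `trailing` and `price_past` are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [['2022-09-01 09:33:00', 100.0],
     ['2022-09-01 09:33:14', 101.0],
     ['2022-09-01 09:33:16', 99.4],
     ['2022-09-01 09:33:30', 100.9]],
    columns=['time', 'price'],
)
df['time'] = pd.to_datetime(df['time'])
df = df.sort_values('time').reset_index(drop=True)

# Shift the right-hand timestamps forward by 15s, then look backward:
# each left row matches the latest right row whose original time
# is at or before (left time - 15s)
trailing = pd.merge_asof(
    left=df,
    right=df.assign(time=df['time'] + pd.Timedelta('15s')),
    on='time', direction='backward', suffixes=('', '_past'),
)

# Realized return: current price minus the matched past price
trailing['return'] = trailing['price'] - trailing.pop('price_past')
print(trailing)
```

With the demo data the first two rows have no price that is 15 seconds old, so their return is NaN, and the last two rows carry the realized values.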

Please credit the source when reposting. Link to this article: https://www.uj5u.com/qukuanlian/507682.html
