I have a dataframe:
import pandas as pd
iris=pd.read_csv("https://gist.githubusercontent.com/netj/8836201/raw/6f9306ad21398ea43cba4f7d537619d0e07d5ae3/iris.csv")
iris.tail(5)
iris.head(5)
From the iris dataframe I create the df_setosa, df_virginica and df_versicolor dataframes:
df_setosa = iris[iris['variety'] == 'Setosa']
df_virginica = iris[iris['variety'] == 'Virginica']
df_versicolor = iris[iris['variety'] == 'Versicolor']
# paste the corresponding variety name as the suffix to each dataframe
df_setosa = df_setosa.add_suffix('_setosa')
df_virginica = df_virginica.add_suffix('_virginica')
df_versicolor = df_versicolor.add_suffix('_versicolor')
print(df_virginica.columns)
print(df_versicolor.columns)
print(df_setosa.columns)
print(df_setosa.shape) # 50 rows by 5 columns
print(df_versicolor.shape) # 50 rows by 5 columns
print(df_virginica.shape) # 50 rows by 5 columns
Since each dataframe has shape (50, 5), I want to join the three dataframes side by side (what we would call cbind in R).
My attempt:
#### I need help concatenating the three dataframes
concat_df = pd.concat([df_setosa,df_virginica,df_versicolor]) # this returns a lot of NaN
concat_df.shape # this returns a shape of 150 rows by 15 columns instead of 50 rows by 15 columns
concat_df should have a shape of 50 rows by 15 columns.
Thanks in advance.
Answer from a uj5u.com user:
When you create the "sub" dataframes, reset their index, since there is no reason to keep the original iris index in this case:
df_setosa = iris[iris['variety'] == 'Setosa'].reset_index(drop=True)
df_virginica = iris[iris['variety'] == 'Virginica'].reset_index(drop=True)
df_versicolor = iris[iris['variety'] == 'Versicolor'].reset_index(drop=True)
Then, when you concatenate, make sure the join is horizontal by setting the "axis" argument to 1, like this:
concat_df = pd.concat([df_setosa,df_virginica,df_versicolor], axis=1)
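The two steps above can be sketched end to end without downloading the CSV. This is only an illustration: `toy` is a made-up stand-in for iris with two rows per variety and a single measurement column, so the expected result is 2 rows by 6 columns rather than 50 by 15.

```python
import pandas as pd

# Hypothetical stand-in for iris: 2 rows per variety, sorted by variety.
toy = pd.DataFrame({
    "sepal.length": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
    "variety": ["Setosa", "Setosa", "Versicolor",
                "Versicolor", "Virginica", "Virginica"],
})

parts = []
for name in ["Setosa", "Virginica", "Versicolor"]:
    part = (
        toy[toy["variety"] == name]
        .reset_index(drop=True)          # every part now has index 0..n-1
        .add_suffix(f"_{name.lower()}")  # e.g. sepal.length_setosa
    )
    parts.append(part)

# axis=1 places the parts side by side instead of stacking them.
concat_df = pd.concat(parts, axis=1)
print(concat_df.shape)  # (2, 6)
```

With the real iris data the same pattern should give 50 rows by 15 columns, since each variety has 50 rows and 5 columns.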
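You can also leave "reset_index" until the last step before concatenating (for example, after add_suffix). If you skip it, concat will still produce 150 rows even with axis=1, because it aligns the rows on their index labels 0 to 149 and fills every position that has no match with NaN.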
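To see the index alignment in isolation, here is a minimal sketch with two made-up frames (`left` and `right` are hypothetical names, not part of the question) whose index labels do not overlap:

```python
import pandas as pd

left = pd.DataFrame({"a": [1, 2]}, index=[0, 1])
right = pd.DataFrame({"b": [3, 4]}, index=[2, 3])

# Rows are matched by index label, so nothing lines up.
misaligned = pd.concat([left, right], axis=1)
print(misaligned.shape)               # (4, 2)
print(misaligned.isna().sum().sum())  # 4

# After resetting the index, both frames share the labels 0 and 1.
aligned = pd.concat([left, right.reset_index(drop=True)], axis=1)
print(aligned.shape)                  # (2, 2)
```

The same thing happens with the three iris subsets: their original labels are 0 to 49, 50 to 99 and 100 to 149, so no row of one subset matches a row of another until the index is reset.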