I have a DataFrame in which one column contains lists of dictionaries. Here is what a sample value from that column looks like:
[{'score': 0.09248554706573486, 'category': 'soccer', 'threshold': 0.13000713288784027}, {'score': 0.09267200529575348, 'category': 'soccer', 'threshold': 0.11795613169670105}, {'score': 0.1703065186738968, 'category': 'soccer', 'threshold': 0.2004493921995163}, {'score': 0.08060390502214432, 'category': 'basketball', 'threshold': 0.09613725543022156}, {'score': 0.16494056582450867, 'category': 'basketball', 'threshold': 0.2284235805273056}, {'score': 0.008428425528109074, 'category': 'basketball', 'threshold': 0.018201233819127083}, {'score': 0.0761604905128479, 'category': 'hockey', 'threshold': 0.0924532413482666}, {'score': 0.10853488743305206, 'category': 'basketball', 'threshold': 0.1252049058675766}, {'score': 0.0012563085183501244, 'category': 'soccer', 'threshold': 0.008611497469246387}, {'score': 0.058744996786117554, 'category': 'soccer', 'threshold': 0.08366610109806061}, {'score': 0.20794744789600372, 'category': 'rugby', 'threshold': 0.26308900117874146}, {'score': 0.1463163197040558, 'category': 'hockey', 'threshold': 0.18053030967712402}, {'score': 0.12938784062862396, 'category': 'hockey', 'threshold': 0.13267497718334198}, {'score': 0.09140244871377945, 'category': 'basketball', 'threshold': 0.13820350170135498}, {'score': 0.06976936012506485, 'category': 'hockey', 'threshold': 0.0989123210310936}, {'score': 0.05813559517264366, 'category': 'basketball', 'threshold': 0.06885409355163574}, {'score': 0.09365707635879517, 'category': 'hockey', 'threshold': 0.12393374741077423},]
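I want to create a separate DataFrame that takes the above column value from every row and produces a DataFrame with category as one column, alongside the corresponding score and threshold values.
For example: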
category | score | threshold
soccer | 0.09248554706573486 | 0.13000713288784027
soccer | 0.09267200529575348 | 0.13000713288784027
soccer | 0.1703065186738968 | 0.13000713288784027
basketball | 0.16494056582450867 | 0.018201233819127083
basketball | 0.08060390502214432 | 0.018201233819127083
basketball | 0.10853488743305206 | 0.018201233819127083
Answer from a uj5u.com user:
Assuming lst is the input list, simply use the DataFrame constructor:
import pandas as pd
df = pd.DataFrame(lst)
Output:
score category threshold
0 0.092486 soccer 0.130007
1 0.092672 soccer 0.117956
2 0.170307 soccer 0.200449
3 0.080604 basketball 0.096137
4 0.164941 basketball 0.228424
5 0.008428 basketball 0.018201
6 0.076160 hockey 0.092453
7 0.108535 basketball 0.125205
8 0.001256 soccer 0.008611
9 0.058745 soccer 0.083666
10 0.207947 rugby 0.263089
11 0.146316 hockey 0.180530
12 0.129388 hockey 0.132675
13 0.091402 basketball 0.138204
14 0.069769 hockey 0.098912
15 0.058136 basketball 0.068854
16 0.093657 hockey 0.123934
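To get the column order shown in the question (category, score, threshold), the columns can be reordered after construction. The following is a minimal sketch, assuming a shortened three-record version of the list with rounded values; the names records and table are illustrative and not from the original post.

```python
import pandas as pd

# Shortened, illustrative version of the list from the question (values rounded).
records = [
    {'score': 0.0925, 'category': 'soccer', 'threshold': 0.1300},
    {'score': 0.0806, 'category': 'basketball', 'threshold': 0.0961},
    {'score': 0.0762, 'category': 'hockey', 'threshold': 0.0925},
]

# Each dict becomes one row; the dict keys become the column names.
table = pd.DataFrame(records)

# Reorder the columns to match the layout asked for in the question.
table = table[['category', 'score', 'threshold']]
print(table)
```

Selecting columns with a list of names returns a new DataFrame with the columns in that order, so the row data itself is unchanged.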
If every item in a Series holds a list like this, use itertools.chain:
from itertools import chain
df2 = pd.DataFrame(chain.from_iterable(df['col']))
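As a self-contained illustration of the flattening step, here is a small sketch. It assumes a hypothetical two-row frame whose col column holds lists of dicts; the data values are made up for the example.

```python
from itertools import chain

import pandas as pd

# Hypothetical frame: each cell of 'col' holds a list of dicts.
df = pd.DataFrame({
    'col': [
        [{'score': 0.1, 'category': 'soccer', 'threshold': 0.2},
         {'score': 0.3, 'category': 'hockey', 'threshold': 0.4}],
        [{'score': 0.5, 'category': 'rugby', 'threshold': 0.6}],
    ]
})

# chain.from_iterable flattens the per-row lists into one stream of dicts.
# Wrapping it in list() is a defensive choice here, not a requirement.
df2 = pd.DataFrame(list(chain.from_iterable(df['col'])))
print(df2)
```

The result has one row per dictionary across all rows of the original frame, so the two-row input above yields three rows.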
Answer from a uj5u.com user:
IIUC, you need to iterate over the values in the column, convert each one to a DataFrame, and append it to a new DataFrame. For example:
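dout = pd.DataFrame()
for dd in df['col']:
    # Convert each list of dicts to a frame and append it to the result.
    dout = pd.concat([dout, pd.DataFrame(dd)])
# Replace the repeated per-list indices with a fresh 0..n-1 index.
dout = dout.reset_index(drop=True)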
Or, more concisely, with a generator expression:
dout = pd.concat(pd.DataFrame(v) for v in df['col']).reset_index(drop=True)
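If it matters which original row each record came from, one possible variation is to tag every per-row frame before concatenating. This is only a sketch under assumptions: the source_row column name and the sample data are hypothetical and not part of the original answer.

```python
import pandas as pd

# Hypothetical frame: each cell of 'col' holds a list of dicts.
df = pd.DataFrame({
    'col': [
        [{'score': 0.1, 'category': 'soccer', 'threshold': 0.2},
         {'score': 0.3, 'category': 'hockey', 'threshold': 0.4}],
        [{'score': 0.5, 'category': 'rugby', 'threshold': 0.6}],
    ]
})

# Build one frame per row, tagging each with the index label it came from.
# 'source_row' is an illustrative column name.
frames = [
    pd.DataFrame(records).assign(source_row=row_label)
    for row_label, records in df['col'].items()
]

# Concatenate once; ignore_index=True gives a fresh 0..n-1 index.
dout = pd.concat(frames, ignore_index=True)
print(dout)
```

Collecting the frames in a list and calling pd.concat a single time avoids re-copying the accumulated result on every loop iteration, which can matter for columns with many rows.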