I have a DataFrame:
df = spark.sql("""select number,name,owner,support,user,business_unit from table""")
I want to rename owner.display_value to owner_display_value and support.display_value to support_display_value.
The owner and support columns are structs, so I am only pulling the display_value field out of each of them. My attempt:
df2 = df.select("number","name","owner.display_value" as owner_display_value,"support.display_value" as support_display_value, "user_group","business_unit")
But I get this error:
'DataFrame' object has no attribute 'rename'
A helpful uj5u.com user replied:
Replace

df2 = df.select(
    "number",
    "name",
    "owner.display_value" as owner_display_value,
    "support.display_value" as support_display_value,
    "user_group",
    "business_unit"
)
with

df2 = df.selectExpr(
    "number",
    "name",
    "owner.display_value as owner_display_value",
    "support.display_value as support_display_value",
    "user_group",
    "business_unit"
)
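selectExpr parses each string as a SQL expression, so the "as" alias is handled by Spark's SQL parser rather than by Python. If you prefer to stay with select, the same expression strings can be wrapped individually in F.expr; a minimal sketch, assuming the columns from the question:

import pyspark.sql.functions as F

# F.expr parses a SQL expression string (including the alias)
# and can be mixed freely with plain column names inside select.
df2 = df.select(
    "number",
    "name",
    F.expr("owner.display_value as owner_display_value"),
    F.expr("support.display_value as support_display_value"),
    "user_group",
    "business_unit",
)
df2.printSchema()  # the two display_value fields now appear as top-level columns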
Another helpful uj5u.com user replied:
Use F.col("column_name").alias("new_name"):

Full example:
import pyspark.sql.functions as F
from pyspark.sql.types import StructType, StructField, LongType, StringType

schema = StructType([
    StructField("number", LongType()),
    StructField("name", StringType()),
    StructField("owner", StructType([StructField("display_value", StringType())])),
    StructField("support", StructType([StructField("display_value", StringType())])),
    StructField("user_group", StringType()),
    StructField("business_unit", StringType()),
])
df = spark.createDataFrame(data=[[123, "abc", ("onwr",), ("sprt",), "usr", "bu"]], schema=schema)
df2 = df.select(
    "number",
    "name",
    F.col("owner.display_value").alias("owner_display_value"),
    F.col("support.display_value").alias("support_display_value"),
    "user_group",
    "business_unit")
[Out]:
+------+----+-------------------+---------------------+----------+-------------+
|number|name|owner_display_value|support_display_value|user_group|business_unit|
+------+----+-------------------+---------------------+----------+-------------+
|123   |abc |onwr               |sprt                 |usr       |bu           |
+------+----+-------------------+---------------------+----------+-------------+
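If more struct columns follow the same display_value pattern, the alias calls can be generated in a loop instead of being spelled out one by one; a minimal sketch, assuming only owner and support are structs and reusing the other column names from the example above:

import pyspark.sql.functions as F

struct_cols = ["owner", "support"]                          # structs holding display_value
plain_cols = ["number", "name", "user_group", "business_unit"]

# Build a <name>_display_value column for every struct column in one pass.
df3 = df.select(
    *plain_cols,
    *[F.col(f"{c}.display_value").alias(f"{c}_display_value") for c in struct_cols],
)
df3.show(truncate=False)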
Tags: pandas, pyspark