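I want to write rows to a CSV file, but each file should contain no more than X rows. Once that threshold is exceeded, a new file should be started. So if I have the following data: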
csv_max_rows=3
columns = ["A", "B", "C"]
rows = [
["a1", "b1", "c1"],
["a2", "b2", "c2"],
["a3", "b3", "c3"],
["a4", "b4", "c4"],
["a5", "b5", "c5"],
["a6", "b6", "c6"],
["a7", "b7", "c7"],
["a8", "b8", "c8"],
["a9", "b9", "c9"],
["a10", "b10", "c10"]
]
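I want to end up with 4 files, where files 1, 2, and 3 each have 3 rows and file 4 has only one row. Is there a built-in option in the Python csv writer to do this?
Answer from a uj5u.com user:
I think your requirement is too specific to expect a built-in option in the standard library. The solution below is a bit hacky, but I think it does exactly what you want.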
import csv
csv_max_rows = 3
columns = ["A", "B", "C"]
rows = [
["a1", "b1", "c1"],
["a2", "b2", "c2"],
["a3", "b3", "c3"],
["a4", "b4", "c4"],
["a5", "b5", "c5"],
["a6", "b6", "c6"],
["a7", "b7", "c7"],
["a8", "b8", "c8"],
["a9", "b9", "c9"],
["a10", "b10", "c10"],
]
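fp = None
for i, row in enumerate(rows):
    if i % csv_max_rows == 0:
        # Start a new file every csv_max_rows rows; close the previous one first.
        if fp is not None:
            fp.close()
        fp = open(f"out_{i // csv_max_rows + 1}.csv", "w", newline="")
        writer = csv.writer(fp)
        writer.writerow(columns)
    writer.writerow(row)
if fp is not None:
    fp.close()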
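If the rows arrive one at a time rather than as a ready-made list, the same idea can be wrapped in a small helper that keeps its own counter. The following is only a sketch: `RotatingCsvWriter`, its `path_template` parameter, and the `paths` attribute are hypothetical names invented for this example, not part of the `csv` module.

```python
import csv


class RotatingCsvWriter:
    """Hypothetical helper: starts a new CSV file after max_rows data rows."""

    def __init__(self, path_template, columns, max_rows):
        self.path_template = path_template  # assumed format, e.g. "out_{}.csv"
        self.columns = columns
        self.max_rows = max_rows
        self.paths = []  # files created so far, in order
        self._fp = None
        self._writer = None
        self._rows_in_file = 0

    def writerow(self, row):
        # Open the first file lazily, and rotate once the current one is full.
        if self._fp is None or self._rows_in_file >= self.max_rows:
            self._open_next()
        self._writer.writerow(row)
        self._rows_in_file += 1

    def _open_next(self):
        self.close()
        path = self.path_template.format(len(self.paths) + 1)
        self._fp = open(path, "w", newline="")
        self._writer = csv.writer(self._fp)
        self._writer.writerow(self.columns)  # every file gets its own header
        self.paths.append(path)
        self._rows_in_file = 0

    def close(self):
        if self._fp is not None:
            self._fp.close()
            self._fp = None

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()
```

Used as `with RotatingCsvWriter("out_{}.csv", columns, csv_max_rows) as w:` followed by `w.writerow(row)` for each row, this should produce the same files as the loop above, and the last file is closed even if an exception interrupts the writes.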
Answer from a uj5u.com user:
I'm not sure whether there is a built-in option, but this is clearly not complicated:
from typing import Iterator, List
import csv
import concurrent.futures

def chunks(lst: List, n: int) -> Iterator[List]:
    # Yield successive slices of at most n items.
    while lst:
        chunk = lst[0:n]
        lst = lst[n:]
        yield chunk

def write_csv(csv_file_path: str, columns: List[str], rows: List[List]) -> str:
    # Write the header plus the given rows, then return the path that was written.
    with open(csv_file_path, "w", newline="") as csv_file:
        csv_writer = csv.writer(csv_file)
        csv_writer.writerow(columns)
        for row in rows:
            csv_writer.writerow(row)
    return csv_file_path

def write_csv_parallel(base_csv_file_path: str, columns: List[str], rows: List[List], csv_max_rows: int) -> List[str]:
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        chunked_rows = chunks(rows, csv_max_rows)
        csv_writing_args = ((f"{base_csv_file_path}.{idx + 1}", columns, chunk_of_rows) for idx, chunk_of_rows
                            in enumerate(chunked_rows))
        # executor.map keeps input order; list() waits for every write and surfaces any exception.
        return list(executor.map(lambda f: write_csv(*f), csv_writing_args))

if __name__ == "__main__":
    columns = ["A", "B", "C"]
    rows = [
        ["a1", "b1", "c1"],
        ["a2", "b2", "c2"],
        ["a3", "b3", "c3"],
        ["a4", "b4", "c4"],
        ["a5", "b5", "c5"],
        ["a6", "b6", "c6"],
        ["a7", "b7", "c7"],
        ["a8", "b8", "c8"],
        ["a9", "b9", "c9"],
        ["a10", "b10", "c10"]
    ]
    base_csv_file_path = "/tmp/test_file.csv"
    csv_file_paths = write_csv_parallel(base_csv_file_path, columns, rows, csv_max_rows=3)
    print("data was written into the following files: \n" + "\n".join(csv_file_paths))
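One thing to keep in mind with `chunks` above: `lst[n:]` copies the remainder of the list on every iteration, and the function only works on sequences that support slicing. If the rows come from a generator or the list is large, a variant based on `itertools.islice` may be a better fit. The sketch below is sequential rather than threaded, and the names `iter_chunks` and `write_csv_chunks` are made up for this example.

```python
import csv
from itertools import islice
from typing import Iterable, Iterator, List, Sequence


def iter_chunks(rows: Iterable[Sequence], n: int) -> Iterator[List[Sequence]]:
    # Pull at most n items at a time from any iterable, without copying the rest.
    it = iter(rows)
    while True:
        chunk = list(islice(it, n))
        if not chunk:
            return
        yield chunk


def write_csv_chunks(base_path: str, columns: Sequence[str],
                     rows: Iterable[Sequence], max_rows: int) -> List[str]:
    # Write each chunk to "<base_path>.<index>" with its own header row.
    paths = []
    for idx, chunk in enumerate(iter_chunks(rows, max_rows), start=1):
        path = f"{base_path}.{idx}"
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(columns)
            writer.writerows(chunk)
        paths.append(path)
    return paths
```

Calling `write_csv_chunks("/tmp/test_file.csv", columns, rows, 3)` with the sample data should return four paths, `/tmp/test_file.csv.1` through `/tmp/test_file.csv.4`, following the same naming scheme as the threaded version.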