How to use numpy to check a condition on every combination of 5 rows in a 2D array

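For several reasons I would like to rewrite the (working!) code below with numpy, but I can't find a good way to do it: #1, I have never used numpy before and am a beginner in general; #2, plain Python is too slow; #3, I would like to output the name column with print(m[:,0]); and #4, itertools combinations only yields 2D lists, not 2D numpy arrays, so I can't do that.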

def compCheck(m):          # function to check how many attributes the group shares
    rowsNum = len(m)
    columnsNum = len(m[0])
    sCount = 0             # counts the non empties in a column
    matches = 0            # counts the total number of matches

    for a in range(2, columnsNum):     # skip the two name columns
        for b in range(0, rowsNum):
            if m[b][a]:            # if entry isn't blank
                sCount += 1
        if sCount >= 3:            # at least 3 members share this attribute
            matches += 1
        sCount = 0
    print(matches)

from itertools import combinations
teamSize = 5

for i in combinations(masterList, teamSize):
    compCheck(i)

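To explain what this code does (or should do): it builds every unique combination of 5 rows, without replacement, from a 2D list (called masterList). It looks at each combination and checks the columns (starting at offset 2, so the names are not counted). If at least 3 of the 5 entries in a column are filled in, that column counts as a match. It then reports the total number of matches and moves on to the next combination.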
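The rule described above can be sketched for a single group in a few lines of plain Python. This is only an illustration: the function name count_matches and the parameters min_share and offset are made up here, and the function returns the count instead of printing it.

```python
def count_matches(group, min_share=3, offset=2):
    """Count the attribute columns filled in for at least min_share rows.

    group is a sequence of rows; the first `offset` entries of each row
    are names and are ignored. (Hypothetical helper, not from the post.)
    """
    # zip(*group) transposes the rows into columns
    columns = list(zip(*group))[offset:]
    return sum(
        1
        for column in columns
        if sum(bool(entry) for entry in column) >= min_share
    )


group = [
    ["Alex", "Smith", "Chess", "Skiing", "", ""],
    ["Bob", "Dole", "Chess", "", "", ""],
    ["Charlie", "Chaplin", "Chess", "", "", ""],
    ["Daisy", "Buchanon", "", "", "", "Partying"],
    ["Emily", "Evans", "Chess", "Skiing", "", ""],
]
print(count_matches(group))  # 1: only the Chess column has 3 or more entries
```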

An example check should be:

Input: compCheck([["Alex", "Smith", "Chess", "Skiing", "", ""],
["Bob", "Dole", "Chess", "", "", ""],
["Charlie", "Chaplin", "Chess", "", "", ""],
["Daisy", "Buchanon", "", "", "", "Partying"],
["Emily", "Evans", "Chess", "Skiing", "", ""]])

Output: "1 for ['Alex' 'Bob' 'Charlie' 'Daisy' 'Emily']"

The real input looks much like the list above (but with more rows), so I will only post an example with a 6-row list:

from itertools import combinations
teamSize = 5
masterList = [["Alex", "Smith", "Chess", "Skiing", "", ""],
["Bob", "Dole", "Chess", "", "", ""],
["Charlie", "Chaplin", "Chess", "", "", ""],
["Daisy", "Buchanon", "", "", "", "Partying"],
["Emily", "Evans", "Chess", "Skiing", "", ""],
["Frank", "Ferdinand", "", "Skiing", "", ""]]

for i in combinations(masterList, teamSize):
    compCheck(i)

Output: ["1 for ['Alex' 'Bob' 'Charlie' 'Daisy' 'Emily']",
"1 for ['Alex' 'Bob' 'Charlie' 'Daisy' 'Frank']",
"2 for ['Alex' 'Bob' 'Charlie' 'Emily' 'Frank']",
"2 for ['Alex' 'Bob' 'Daisy' 'Emily' 'Frank']",
"2 for ['Alex' 'Charlie' 'Daisy' 'Emily' 'Frank']",
"1 for ['Bob' 'Charlie' 'Daisy' 'Emily' 'Frank']"]

Answer from a uj5u.com user:

I think pandas.DataFrame is a better fit in this case.

import pandas as pd

masterList = [
    ["Alex", "Smith", "Chess", "Skiing", "", ""],
    ["Bob", "Dole", "Chess", "", "", ""],
    ["Charlie", "Chaplin", "Chess", "", "", ""],
    ["Daisy", "Buchanon", "", "", "", "Partying"],
    ["Emily", "Evans", "Chess", "Skiing", "", ""],
    ["Frank", "Ferdinand", "", "Skiing", "", ""]
]

df = (
    pd.DataFrame(
        masterList,
        columns = ['name','surname','chess','skiing','whatever','partying']
    )
    .drop(columns='surname')   # the surname is not needed for matching
    .set_index('name')         # rows are looked up by first name
    .applymap(bool)            # '' -> False, any hobby -> True (named DataFrame.map from pandas 2.1 on)
)

The transformed data is a boolean table indexed by name, with one column per hobby (the original post showed it as an image).

Keep in mind that combinations yields a sequence of tuples, not a 2D list, so each combination has to be converted to a list before it is used to extract data:

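from itertools import combinations

NGroup = 5      # team size
minShare = 3    # minimum number of members who must share a hobby
for combo in combinations(df.index, NGroup):
    print(
        '{count} for {combo}'.format(
            # sum each hobby over the team, then count hobbies reaching minShare
            count=(df.loc[[*combo]].sum() >= minShare).sum(),
            combo=', '.join(combo)
        )
    )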

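The output (an image in the original post) has one line per combination: the match count followed by the five names, for example 1 for Alex, Bob, Charlie, Daisy, Emily.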
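The tuple-versus-list point above applies to numpy indexing too, and it is easy to trip over. The snippet below is a small sketch on made-up data (the array filled and its values are invented for illustration): a tuple is treated as one index per axis, while a list selects whole rows.

```python
from itertools import combinations

import numpy as np

# Made-up boolean data: 3 people, 2 hobbies.
filled = np.array([[True, False],
                   [True, True],
                   [False, False]])

combo = next(combinations(range(len(filled)), 2))  # the tuple (0, 1)

# A tuple means "one index per axis": row 0, column 1, a single element.
single = filled[combo]

# A list means "these rows": rows 0 and 1, a 2x2 sub-array.
rows = filled[list(combo)]

print(single)       # False
print(rows.shape)   # (2, 2)
```

With a 5-element tuple on a 2D array the same mistake raises an IndexError instead of silently returning one element, which is why the code above unpacks each combination into a list with [*combo].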

If numpy is preferred, this code produces the same output:

import numpy as np

data = np.array(masterList)
captions = data[:, 0]              # first names
hobbies = (data[:, 2:] != '')      # True where a hobby is filled in

for combo in combinations(range(len(hobbies)), NGroup):
    print(
        '{count} for {combo}'.format(
            count=(hobbies[[*combo]].sum(axis=0) >= minShare).sum(),
            combo=', '.join(captions[[*combo]])
        )
    )
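If the Python-level loop is still too slow, the per-team sums can probably be pushed into numpy as well. The following is a minimal sketch, not part of the original answer: it turns all combinations into one integer index array and evaluates every team at once. The names filled, idx, per_hobby, counts and lines are illustrative, and the index array holds every combination in memory, so this only suits inputs where the number of combinations stays manageable.

```python
from itertools import combinations

import numpy as np

master_list = [
    ["Alex", "Smith", "Chess", "Skiing", "", ""],
    ["Bob", "Dole", "Chess", "", "", ""],
    ["Charlie", "Chaplin", "Chess", "", "", ""],
    ["Daisy", "Buchanon", "", "", "", "Partying"],
    ["Emily", "Evans", "Chess", "Skiing", "", ""],
    ["Frank", "Ferdinand", "", "Skiing", "", ""],
]
team_size = 5
min_share = 3

data = np.array(master_list)
names = data[:, 0]
filled = data[:, 2:] != ""      # shape (n_people, n_hobbies)

# One row of person indices per combination: shape (n_combos, team_size).
idx = np.array(list(combinations(range(len(filled)), team_size)))

# filled[idx] has shape (n_combos, team_size, n_hobbies);
# summing over axis 1 counts, per team, how many members have each hobby.
per_hobby = filled[idx].sum(axis=1)
counts = (per_hobby >= min_share).sum(axis=1)

lines = [f"{count} for {names[rows]}" for count, rows in zip(counts, idx)]
for line in lines:
    print(line)
```

Printing a slice of the names array also gives the bracketed, space-separated name format requested in the question.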

Answer from a uj5u.com user:

You can do this easily with numpy.

For example:

def compCheck(m):          # function to check how many attributes the group shares
    rowsNum = len(m)
    columnsNum = len(m[0])
    sCount = 0             # counts the non empties in a column
    matches = 0            # counts the total number of matches

    for a in range(2, columnsNum):     # skip the two name columns
        for b in range(0, rowsNum):
            if m[b][a]:            # if entry isn't blank
                sCount += 1
        if sCount >= 3:
            matches += 1
        sCount = 0
    return matches

from itertools import combinations
import numpy as np
teamSize = 5
masterList =  [['A','B',0,1,0,1],
               ['A','B',0,0,0,1],
               ['A','B',1,1,0,0],
               ['A','B',0,1,1,1],
               ['A','B',0,1,0,0],
               ['A','B',1,1,0,1],
               ['A','B',1,1,0,0],
               ['A','B',0,1,1,1],
               ['A','B',0,0,1,1],
               ['A','B',1,1,0,1], ]
for i in combinations(masterList, teamSize):
    Mat2D = np.array([l[2:] for l in i])
    print(np.sum(np.count_nonzero(np.array(Mat2D),axis=0) >= 3))
    print(compCheck(i))

This assumes the matrix is laid out as you describe, with the first two values of each row being strings.
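The reason the name columns are sliced off before calling np.array is worth spelling out. As a small sketch on made-up rows: when strings and integers are mixed in one array, numpy converts everything to strings, so a 0 becomes the non-empty string '0' and no longer counts as blank.

```python
import numpy as np

rows = [["A", "B", 0, 1],
        ["A", "B", 1, 0]]          # made-up rows in the layout used above

mixed = np.array(rows)             # names and numbers in one array
print(mixed.dtype.kind)            # 'U': every entry is now a string

# '0' is a non-empty string, so all four flags look "filled in".
print(np.count_nonzero(mixed[:, 2:] != ""))   # 4

numeric = np.array([row[2:] for row in rows])  # numbers only
print(np.count_nonzero(numeric))               # 2
```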

It is better to strip the names from masterList up front:

def compCheck(m):          # function to check how many attributes the group shares
    rowsNum = len(m)
    columnsNum = len(m[0])
    sCount = 0             # counts the non empties in a column
    matches = 0            # counts the total number of matches

    for a in range(0, columnsNum):     # names are already removed, so start at 0
        for b in range(0, rowsNum):
            if m[b][a]:            # if entry isn't blank
                sCount += 1
        if sCount >= 3:
            matches += 1
        sCount = 0
    return matches

from itertools import combinations
import numpy as np
teamSize = 5
masterList =  [['A','B',0,1,0,1],
               ['A','B',0,0,0,1],
               ['A','B',1,1,0,0],
               ['A','B',0,1,1,1],
               ['A','B',0,1,0,0],
               ['A','B',1,1,0,1],
               ['A','B',1,1,0,0],
               ['A','B',0,1,1,1],
               ['A','B',0,0,1,1],
               ['A','B',1,1,0,1], ]
masterList =  np.array([l[2:] for l in masterList])
for i in combinations(masterList, teamSize):
    print(np.sum(np.count_nonzero(np.array(i),axis=0) >= 3))
    print(compCheck(i))

Update

Here is the code for your data:

from itertools import combinations
import numpy as np
teamSize = 5
masterList = [["Alex", "Smith", "Chess", "Skiing", "", ""],
              ["Bob", "Dole", "Chess", "", "", ""],
              ["Charlie", "Chaplin", "Chess", "", "", ""],
              ["Daisy", "Buchanon", "", "", "", "Partying"],
              ["Emily", "Evans", "Chess", "Skiing", "", ""],
              ["Frank", "Ferdinand", "", "Skiing", "", ""]]

for i in combinations(masterList, teamSize):
    Mat2D = np.array([l[2:] for l in i])
    check = np.sum(np.count_nonzero(np.array(Mat2D), axis=0) >= 3)
    if check:
        print(check, ' for ', [l[0] for l in i])

which gives:

1  for  ['Alex', 'Bob', 'Charlie', 'Daisy', 'Emily']
1  for  ['Alex', 'Bob', 'Charlie', 'Daisy', 'Frank']
2  for  ['Alex', 'Bob', 'Charlie', 'Emily', 'Frank']
2  for  ['Alex', 'Bob', 'Daisy', 'Emily', 'Frank']
2  for  ['Alex', 'Charlie', 'Daisy', 'Emily', 'Frank']
1  for  ['Bob', 'Charlie', 'Daisy', 'Emily', 'Frank']

Please credit the source when reposting. Link to this article: https://www.uj5u.com/yidong/487603.html
