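I am using a matrix multiplication approach to retrieve the positions of True and False values in an array. This is necessary because I cannot use a for loop (I have thousands of records). The procedure is as follows: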

import numpy as np
# Create a test array
test_array = np.array([[False, True, False, False, False, True]])
# Create a set of unique "tens", each one identifying a position
uniq_tens = [10 ** (i) for i in range(0, test_array.shape[1])]
# Multiply the matrix
print(int(np.dot(test_array, uniq_tens)[0]))
100010

100010 must be read from right to left (0=False, 1=True, 0=False, 0=False, 0=False, 1=True). Everything works fine, except when test_array contains 20 elements.
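To make the right-to-left reading concrete, here is a minimal decoding sketch. The helper name decode_flags and its n parameter are hypothetical (they are not part of the question's code), and it assumes the value was built with one decimal digit per position, as in the snippet above.

```python
def decode_flags(value, n):
    """Recover n booleans from a decimal-coded integer (hypothetical helper).

    Digit i, counted from the right, is 1 when position i was True.
    """
    digits = str(value).zfill(n)  # left-pad so trailing False positions survive
    return [ch == "1" for ch in reversed(digits)]


print(decode_flags(100010, 6))
# [False, True, False, False, False, True]
```

The n argument is needed because leading zeros are lost once the digits are stored as an integer.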

# This works fine - Test with 21 elements
test_array = np.array([[False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, True, True]])
print(test_array.shape[1])
uniq_tens = [10 ** (i) for i in range(0, test_array.shape[1])]
print(int(np.dot(test_array, uniq_tens)[0]))
21
111000000000000000010

# This works fine - Test with 19 elements
test_array = np.array([[False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True]])
print(test_array.shape[1])
uniq_tens = [10 ** (i) for i in range(0, test_array.shape[1])]
print(int(np.dot(test_array, uniq_tens)[0]))
19
1000000000000000010

# This does not work - Test with 20 elements
test_array = np.array([[False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True]])
print(test_array.shape[1])
uniq_tens = [10 ** (i) for i in range(0, test_array.shape[1])]
print(int(np.dot(test_array, uniq_tens)[0]))
20
10000000000000000000

I tested this with numpy versions 1.16.4, 1.19.4 and 1.19.5. Can you help me understand why this happens? I am worried that it may also happen with other sizes, not only 20.

Thank you very much for your help!

A uj5u.com user replied:

Explanation

You are hitting the int64 limit:

print(len(str(2 ** (64 - 1))))
# 19

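This shows up when computing uniq_tens, and it triggers dtype issues in np.dot() that are related to mixed-dtype inputs.

More precisely, this is what happens:

The content of uniq_tens is Python int, which has arbitrary precision.
When you call np.dot(), the uniq_tens list is converted to a NumPy array with an unspecified dtype:
- when the maximum value is at most np.iinfo(np.int64).max, the dtype is inferred as int64
- when the maximum value is between np.iinfo(np.int64).max and np.iinfo(np.uint64).max, the dtype is inferred as uint64
- when the maximum value is above that, the array keeps Python objects and falls back to arbitrary precision
np.dot() may apply an additional cast if the inputs have mixed dtypes. In the case of np.bool_ and np.uint64, the inferred common type is np.float64.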
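NumPy's promotion rules can be inspected directly with np.result_type, without running a computation. The sketch below is only an illustration and checks the signed/unsigned 64-bit pair, which is one combination that is known to promote to float64 because no 64-bit integer type covers both ranges.

```python
import numpy as np

# int64 and uint64 have no common 64-bit integer type, so NumPy picks float64.
common = np.result_type(np.int64, np.uint64)
print(common)  # float64

# float64 stores 52 explicit mantissa bits, so integers this large
# cannot all be represented exactly.
print(np.finfo(np.float64).nmant)  # 52
```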

Now:

max_int64 = np.iinfo(np.int64).max
print(max_int64, len(str(max_int64)))
# 9223372036854775807 19

max_uint64 = np.iinfo(np.uint64).max
print(max_uint64, len(str(max_uint64)))
# 18446744073709551615 20

print(repr(np.array([max_int64])))
# array([9223372036854775807])
print(repr(np.array([max_uint64])))
# array([18446744073709551615], dtype=uint64)
print(repr(np.array([max_uint64 + 1])))
# array([18446744073709551616], dtype=object)

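So everything works up to 19 elements and again from 21 elements upward. With 20 elements, the values are converted to uint64. However, when you call np.dot(), it realizes that it can no longer hold the result in int64 or uint64 and converts everything to np.float64: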

print(np.dot([1], [max_int64]))
# 9223372036854775807
print(np.dot([1], [max_uint64]))
# 1.8446744073709552e+19
print(np.dot([1], [max_uint64 + 1]))
# 18446744073709551616
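The 20-element result in the question (10000000000000000000 instead of 10000000000000000010) is consistent with a round trip through float64. A minimal pure-Python check, assuming only that Python's float is an IEEE 754 double like np.float64:

```python
import math

exact = 10**19 + 10       # value the 20-element example should produce
as_float = float(exact)   # same precision as np.float64

print(int(as_float))           # 10000000000000000000
print(int(as_float) == exact)  # False
print(math.ulp(as_float))      # 2048.0 (math.ulp needs Python 3.9+)
```

Near 1e19, adjacent float64 values are 2048 apart, so the trailing 10 cannot be stored.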

Conversely, when you start from something that is already uint64, it keeps using uint64:

print(np.dot(np.array([1], dtype=np.uint64), [max_uint64]))
# 18446744073709551615
print(np.dot(np.array([4321], dtype=np.uint64), [max_uint64]))
# 18446744073709547295  # wrong result

which has its own overflow problems.
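The wrong result above matches what fixed-width unsigned arithmetic would give, namely the exact product reduced modulo 2**64. A small sketch with Python integers (the constant name UINT64_MODULUS is made up for illustration):

```python
UINT64_MODULUS = 2**64
max_uint64 = UINT64_MODULUS - 1

exact = 4321 * max_uint64          # exact product with arbitrary precision
wrapped = exact % UINT64_MODULUS   # what 64-bit unsigned storage keeps

print(exact > max_uint64)  # True
print(wrapped)             # 18446744073709547295
```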

Mitigation

One way to make sure the code above always works is to force the dtype of uniq_tens to object:

import numpy as np


test_array = np.array([[0, 1] + [0] * 17 + [1]])
uniq_tens = np.array([(10 ** i) for i in range(test_array.shape[1])], dtype=object)

print(test_array.shape[1], int(np.dot(test_array, uniq_tens)[0]))
# 20 10000000000000000010

Other approaches

If we are after the fastest way to compute the integer in a given base, several approaches can be devised:

import numpy as np
import numba as nb


def bools_to_int(arr, base=2):
    return sum(base ** i for i, x in enumerate(arr.tolist()) if x)


def bools_to_int_dot(arr, base=2):
    pows = np.array([base ** i for i in range(len(arr))], dtype=object)
    return np.dot(arr, pows)


def bools_to_int_mul_sum(arr, base=2):
    pows = np.array([base ** i for i in range(len(arr))], dtype=object)
    return np.sum(arr * pows)


@nb.njit
def bools_to_int_nb(arr, base=2):
    n = arr.size
    result = 0
    for i in range(n):
        if arr[i]:
            result += base ** i
    return result

Cython can also speed up the loop approach:

%%cython -c-O3 -c-march=native -a
#cython: language_level=3, boundscheck=False, wraparound=False, initializedcheck=False, cdivision=True, infer_types=True

# cimport numpy as cnp
# cimport cython as ccy

# import numpy as np
# import cython as cy


cpdef bools_to_int_cy(arr, base=2):
    cdef long n = arr.size
    result = 0
    for i in range(n):
        if arr[i]:
            result += base ** i
    return result

Note that the bools_to_int_nb() approach only works up to the int64 limit.
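As a rough guide to where that limit sits, the sketch below counts how many positions can all be set before the sum exceeds the signed 64-bit maximum. The helper max_safe_length is hypothetical and uses plain Python integers, so it only estimates the limit and does not call Numba.

```python
def max_safe_length(base, limit=2**63 - 1):
    """Largest n such that base**0 + ... + base**(n - 1) <= limit."""
    n = 0
    total = 0
    while total + base**n <= limit:
        total += base**n
        n += 1
    return n


print(max_safe_length(2))   # 63
print(max_safe_length(10))  # 19
```

For base 10 this gives 19, which agrees with the question's observation that 19 elements work and 20 do not fit in int64.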

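Since exponentiation is one of the most expensive operations in this kind of computation, the powers can be precomputed to further speed up repeated calls: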

MAX_PRE_VAL = 256
BASES = list(range(2, 16))
POWS = {
    b: np.array([b ** i for i in range(MAX_PRE_VAL)])
    for b in BASES}


def bools_to_int_pre(arr, base=2, pows=POWS):
    return sum(pows[base][i] for i, x in enumerate(arr.tolist()) if x)


def bools_to_int_dot_pre(arr, base=2, pows=POWS):
    return np.dot(arr, pows[base][:len(arr)])


def bools_to_int_mul_sum_pre(arr, base=2, pows=POWS):
    return np.sum(arr * pows[base][:len(arr)])

It is easy to see that all approaches produce the same result (except bools_to_int_nb(), because of the limitation already noted):

funcs = (
    bools_to_int, bools_to_int_pre,
    bools_to_int_dot, bools_to_int_dot_pre,
    bools_to_int_mul_sum, bools_to_int_mul_sum_pre,
    bools_to_int_cy, bools_to_int_nb)


rng = np.random.default_rng(42)
arr = rng.integers(0, 2, 112)
for func in funcs:
    print(f"{func.__name__!s:>32}  {func(arr)}")

                    bools_to_int  3556263535586786347937292461931686
                bools_to_int_pre  3556263535586786347937292461931686
                bools_to_int_dot  3556263535586786347937292461931686
            bools_to_int_dot_pre  3556263535586786347937292461931686
            bools_to_int_mul_sum  3556263535586786347937292461931686
        bools_to_int_mul_sum_pre  3556263535586786347937292461931686
                 bools_to_int_cy  3556263535586786347937292461931686
                 bools_to_int_nb  -4825705174627124058

The timings were generated with the following code:

rng = np.random.default_rng(42)


timings = {}
k = 16
for n in range(1, 128, 3):
    arrs = rng.integers(0, 2, (k, n))
    print(f"n = {n}")
    timings[n] = []
    base = [funcs[0](arr) for arr in arrs]
    for func in funcs:
        res = [func(arr) for arr in arrs]
        is_good = base == res
        timed = %timeit -r 8 -n 16 -q -o [func(arr) for arr in arrs]
        timing = timed.best * 1e6 / k
        timings[n].append(timing if is_good else None)
        print(f"{func.__name__:>24}  {is_good}  {timing:10.3f} μs")

To plot:

import pandas as pd


df = pd.DataFrame(data=timings, index=[func.__name__ for func in funcs]).transpose()
df.plot(marker='o', xlabel='Input size / #', ylabel='Best timing / μs', figsize=(10, 8))

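[Figure: best timing (μs) versus input size for each function]

The plot shows that, until the int64 limit is reached, bools_to_int_nb() is by far the fastest. For larger values, np.dot() with precomputation is the fastest. Without precomputation, a simple manual loop is the fastest, and the Cython version adds a small but appreciable speed-up.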

Note that the power-of-2 case can probably be optimized further.
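One possible direction for base 2, shown only as an unbenchmarked sketch, is to replace exponentiation with bit shifts. The function name bools_to_int_bits is hypothetical and is not one of the functions timed above.

```python
def bools_to_int_bits(flags):
    """Base-2 only: set bit i for every truthy flag i (hypothetical helper)."""
    result = 0
    for i, flag in enumerate(flags):
        if flag:
            result |= 1 << i  # shift instead of computing 2 ** i
    return result


flags = [0, 1, 0, 0, 0, 1]
print(bools_to_int_bits(flags))  # 34
```

Whether this is actually faster than the approaches above would have to be measured.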

A uj5u.com user replied:

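I tested your code, and the error does appear to be caused by the floating-point precision of the value returned by np.dot. You can convert it back to int, but because a float was used as an intermediate step, the conversion is not exact. Moreover, the fact that it works for lengths 18 and 19 is a coincidence: I tried other test_arrays and got wrong results there as well.

In fact, I think you were rather lucky, because your solution does not work for larger numbers. Below is a one-liner that solves your problem and should work for arbitrarily large arrays: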

int(''.join(reversed(test_array.astype(int).astype(str).flatten())))

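Here is what happens to test_array:

it is converted to int to obtain zeros and ones
it is converted to str because we want to concatenate
the array is flattened to make it one-dimensional (or use a one-dimensional input)
the contents are reversed with reversed
all the individual '0' and '1' strings are joined
the output is converted back to int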
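The same steps can be traced without NumPy. This is a minimal sketch on a plain Python list (the variable names are illustrative), using the 20-element pattern from the question:

```python
flags = [False, True] + [False] * 17 + [True]  # 20 elements, as in the question

digits = ["1" if flag else "0" for flag in flags]  # one character per position
value = int("".join(reversed(digits)))             # position 0 becomes the last digit

print(value)  # 10000000000000000010
```

Because the digits never pass through a fixed-width or floating-point type, the result is exact for any length.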

A uj5u.com user replied:

For your use case, I think the best approach is:

int( 
    np.binary_repr( 
                   (2 ** np.where(test_array)[1]).sum()
                  ) 
   )

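(Split over several lines for clarity, because there are many nested parentheses.)

np.binary_repr() returns a string that can be converted directly to int, which sidesteps many of the conversion problems.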
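The idea is to build the number in base 2 first and then reread its binary digits as a decimal number. A pure-Python sketch of that idea, with format(..., "b") standing in for np.binary_repr() and illustrative variable names:

```python
flags = [False, True] + [False] * 17 + [True]  # 20 elements, as in the question

as_binary = sum(1 << i for i, flag in enumerate(flags) if flag)  # 2**1 + 2**19
value = int(format(as_binary, "b"))  # binary digits reread as a decimal number

print(as_binary)  # 524290
print(value)      # 10000000000000000010
```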

Please credit the source when reposting. Link to this article: https://www.uj5u.com/houduan/488954.html
