Python regular expression does not return any matches

I am trying to match a pattern across lines of an HTML file.

Here is a snippet of the file:

<tr>
<td>CBB <span> &bull; </span> CB <span> &bull; </span> LTSB</td>
<td>2022-09-13</td>
<td>14393.5356</td>
<td><a href="https://support.microsoft.com/help/5017305" target="_blank" data-linktype="external">KB5017305</a></td>
</tr>
<tr>
<td>CBB <span> &bull; </span> CB <span> &bull; </span> LTSB</td>
<td>2022-08-09</td>
<td>14393.5291</td>
<td><a href="https://support.microsoft.com/help/5016622" target="_blank" data-linktype="external">KB5016622</a></td>
</tr>
<tr>
<td>CBB <span> &bull; </span> CB <span> &bull; </span> LTSB</td>
<td>2022-07-12</td>
<td>14393.5246</td>
<td><a href="https://support.microsoft.com/help/5015808" target="_blank" data-linktype="external">KB5015808</a></td>
</tr>
<tr>
<td>CBB <span> &bull; </span> CB <span> &bull; </span> LTSB</td>
<td>2022-06-14</td>
<td>14393.5192</td>
<td><a href="https://support.microsoft.com/help/5014702" target="_blank" data-linktype="external">KB5014702</a></td>
</tr>
<tr>

This is the code I am running:

import re

with open('file.html') as htmltext:
    htmldata = htmltext.readlines()

pattern = "([\r\n].*?)(?:=?\r|\n)(.*?(?:14393).*)"

for data in htmldata:
    matchedx = re.search(pattern, data)
    if matchedx:
        print(matchedx)

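The regex pattern is meant to match a string and also return the line before it.
Testing the pattern at https://regex101.com/r/7vI31a/1 returns matches, but no match is found when it runs in Python.

Using this as the pattern does return matches when run in Python: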

pattern = "(14393.*)"

Answer:

As jasonharper commented, you need to apply the regular expression to the whole data at once, not to each line separately.
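
A minimal sketch of why the per-line loop finds nothing, using a three-line sample in place of the file (the names `text`, `lines`, `per_line_matches` and `whole_text_match` are illustrative and not from the original code). Each string returned by `readlines()` holds a single line with at most one trailing newline, while the pattern needs a newline before the first captured line and another one between the two lines:

```python
import re

pattern = re.compile(r"([\r\n].*?)(?:=?\r|\n)(.*?(?:14393).*)")

# Illustrative stand-in for the file contents (assumption: three lines of the table).
text = "<tr>\n<td>2022-09-13</td>\n<td>14393.5356</td>\n"

# What readlines() would give: one string per line, each ending in "\n".
lines = text.splitlines(keepends=True)

# No single line contains the two newlines the pattern needs, so nothing matches.
per_line_matches = [m for m in (pattern.search(line) for line in lines) if m]
print(len(per_line_matches))

# Searching the whole text lets the pattern span the line break.
whole_text_match = pattern.search(text)
print(whole_text_match.groups())
```

The simpler pattern `(14393.*)` from the question works per line because it never needs to cross a line break.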

This works for me:

import re
# data = open('file.html').read()
data = """<tr>
<td>CBB <span> &bull; </span> CB <span> &bull; </span> LTSB</td>
<td>2022-09-13</td>
<td>14393.5356</td>
<td><a href="https://support.microsoft.com/help/5017305" target="_blank" data-linktype="external">KB5017305</a></td>
</tr>
<tr>
<td>CBB <span> &bull; </span> CB <span> &bull; </span> LTSB</td>
<td>2022-08-09</td>
<td>14393.5291</td>
<td><a href="https://support.microsoft.com/help/5016622" target="_blank" data-linktype="external">KB5016622</a></td>
</tr>
<tr>
<td>CBB <span> &bull; </span> CB <span> &bull; </span> LTSB</td>
<td>2022-07-12</td>
<td>14393.5246</td>
<td><a href="https://support.microsoft.com/help/5015808" target="_blank" data-linktype="external">KB5015808</a></td>
</tr>
<tr>
<td>CBB <span> &bull; </span> CB <span> &bull; </span> LTSB</td>
<td>2022-06-14</td>
<td>14393.5192</td>
<td><a href="https://support.microsoft.com/help/5014702" target="_blank" data-linktype="external">KB5014702</a></td>
</tr>
<tr>"""

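# A raw string passes the backslash escapes to the regex engine unchanged.
pattern = re.compile(r"([\r\n].*?)(?:=?\r|\n)(.*?(?:14393).*)")
# Search the whole text at once so a match can span two lines.
matches = pattern.findall(data)
for match in matches:
    print(match)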

which prints:

('\n<td>2022-09-13</td>', '<td>14393.5356</td>')
('\n<td>2022-08-09</td>', '<td>14393.5291</td>')
('\n<td>2022-07-12</td>', '<td>14393.5246</td>')
('\n<td>2022-06-14</td>', '<td>14393.5192</td>')
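
If reading line by line is preferred, for example because the file is large, one alternative is to remember the previous line while iterating. The following is a sketch under assumptions, not part of the original answer: the helper name `lines_before_matches` and the inline sample are made up for illustration, and it reuses the simple `14393` pattern from the question.

```python
import io
import re

def lines_before_matches(lines, needle=r"14393"):
    """Yield (previous_line, matching_line) pairs for lines that contain `needle`."""
    regex = re.compile(needle)
    previous = None
    for line in lines:
        current = line.rstrip("\r\n")
        # A match on the very first line has no previous line and is skipped.
        if previous is not None and regex.search(current):
            yield previous, current
        previous = current

# Illustrative sample; with a real file, pass the open file object instead.
sample = io.StringIO(
    "<tr>\n"
    "<td>2022-09-13</td>\n"
    "<td>14393.5356</td>\n"
    "</tr>\n"
)

pairs = list(lines_before_matches(sample))
for pair in pairs:
    print(pair)
```

Unlike the multi-line pattern, this version returns the previous line without a leading newline, since each line is stripped before it is stored.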

