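I am trying to get the specific text strings below as separate outputs, for example (scraping them from the HTML below):
let text = "That's the first text I need";
let text2 = "The second text I need";
let text3 = "The third text I need";
I really don't know how to get text that is separated by different HTML tags.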
<p>
<span class="hidden-text"><span class="ft-semi">Count:</span>31<br></span>
<span class="ft-semi">Something:</span> That's the first text I need
<span class="hidden-text"><span class="ft-semi">Something2:</span> </span>The second text I need
<br><span class="ft-semi">Something3:</span> The third text I need
</p>
Answer from a uj5u.com user:
You can iterate over the child nodes of the <p> and grab any with nodeType === Node.TEXT_NODE that have non-empty content:
for (const e of document.querySelector("p").childNodes) {
  // Keep only direct text-node children that are not pure whitespace.
  if (e.nodeType === Node.TEXT_NODE && e.textContent.trim()) {
    console.log(e.textContent.trim());
  }
}
// or to make an array:
const result = [...document.querySelector("p").childNodes]
  .filter(e =>
    e.nodeType === Node.TEXT_NODE && e.textContent.trim()
  )
  .map(e => e.textContent.trim());
console.log(result);
<p>
<span class="hidden-text">
<span class="ft-semi">Count:</span>
31
<br>
</span>
<span class="ft-semi">Something:</span>
That's the first text I need
<span class="hidden-text">
<span class="ft-semi">Something2:</span>
</span>
The second text I need
<br>
<span class="ft-semi">Something3:</span>
The third text I need
</p>
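To see why this filter returns only the three wanted strings and skips "Count:" and "31", here is a minimal sketch that runs in plain Node without a browser. It assumes a hypothetical stand-in for the DOM: the direct children of the <p> from the question are modeled as plain objects carrying only nodeType and textContent, and the helpers text and element are made up for illustration. Labels such as "Something:" live inside <span> elements, so they are element children of the <p>, not text-node children, and the filter drops them.

```javascript
// Hypothetical stand-in for the DOM: only the properties the filter reads.
const TEXT_NODE = 3;    // same numeric value as Node.TEXT_NODE
const ELEMENT_NODE = 1; // same numeric value as Node.ELEMENT_NODE

// Illustrative helpers (not DOM APIs).
const text = (s) => ({ nodeType: TEXT_NODE, textContent: s });
const element = (tagName, s) => ({ nodeType: ELEMENT_NODE, tagName, textContent: s });

// Direct children of the <p> in the question's HTML, in document order.
const childNodes = [
  text("\n"),
  element("span", "Count:31"),
  text("\n"),
  element("span", "Something:"),
  text(" That's the first text I need\n"),
  element("span", "Something2: "),
  text("The second text I need\n"),
  element("br", ""),
  element("span", "Something3:"),
  text(" The third text I need\n")
];

// Same filter/map as in the answer: keep non-blank text nodes, trimmed.
const directTexts = childNodes
  .filter((e) => e.nodeType === TEXT_NODE && e.textContent.trim())
  .map((e) => e.textContent.trim());

console.log(directTexts);
```

The whitespace-only text nodes between tags are removed by the trim() check, which is why the filter tests the trimmed content and not just the node type.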
In Cheerio:
const cheerio = require("cheerio"); // 1.0.0-rc.12
const html = `
<p>
<span class="hidden-text">
<span class="ft-semi">Count:</span>
31
<br>
</span>
<span class="ft-semi">Something:</span>
That's the first text I need
<span class="hidden-text">
<span class="ft-semi">Something2:</span>
</span>
The second text I need
<br>
<span class="ft-semi">Something3:</span>
The third text I need
</p>
`;
const $ = cheerio.load(html);
// contents() includes text nodes; keep the non-blank ones, trimmed.
const result = [...$("p").contents()]
  .filter(e => e.type === "text" && $(e).text().trim())
  .map(e => $(e).text().trim());
console.log(result);
Answer from a uj5u.com user:
Try something like this and see if it works:
const html = `your sample html above`;
const domdoc = new DOMParser().parseFromString(html, "text/html");
// Select every text node that has no <span> ancestor.
const result = domdoc.evaluate('//text()[not(ancestor::span)]', domdoc, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
for (let i = 0; i < result.snapshotLength; i++) {
  const target = result.snapshotItem(i).textContent.trim();
  // Skip whitespace-only text nodes.
  if (target.length > 0) {
    console.log(target);
  }
}
With your sample HTML, the output should be:
"That's the first text I need"
"The second text I need"
"The third text I need"
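The XPath expression //text()[not(ancestor::span)] selects text nodes at any depth as long as none of their ancestors is a <span>. As a rough illustration of that rule, the sketch below does the equivalent walk by hand over a hypothetical plain-object tree (the helpers text and el and the function textsOutsideSpans are made up for this example and are not DOM or XPath APIs), so it runs in Node without DOMParser.

```javascript
// Hypothetical stand-in for a parsed DOM tree.
const TEXT_NODE = 3; // same numeric value as Node.TEXT_NODE

// Illustrative helpers (not DOM APIs).
const text = (s) => ({ nodeType: TEXT_NODE, textContent: s });
const el = (tagName, ...childNodes) => ({ nodeType: 1, tagName, childNodes });

// The <p> from the question, including its nested spans.
const tree = el(
  "p",
  text("\n"),
  el("span", el("span", text("Count:")), text("31"), el("br")),
  text("\n"),
  el("span", text("Something:")),
  text(" That's the first text I need\n"),
  el("span", el("span", text("Something2:")), text(" ")),
  text("The second text I need\n"),
  el("br"),
  el("span", text("Something3:")),
  text(" The third text I need\n")
);

// Depth-first walk that remembers whether any ancestor is a <span>.
function textsOutsideSpans(node, insideSpan = false, out = []) {
  if (node.nodeType === TEXT_NODE) {
    const trimmed = node.textContent.trim();
    if (!insideSpan && trimmed) out.push(trimmed);
    return out;
  }
  const nowInside = insideSpan || node.tagName === "span";
  for (const child of node.childNodes) {
    textsOutsideSpans(child, nowInside, out);
  }
  return out;
}

const outsideSpans = textsOutsideSpans(tree);
console.log(outsideSpans);
```

Unlike the childNodes approach, this rule is not limited to direct children of the <p>: it would also pick up text nested inside other non-span elements, which may or may not be what you want on a larger page.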
Tags: javascript jquery node.js web-scraping cheerio