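I didn't know how to word the title, so it is long. Feel free to edit it.
I'm trying to scrape data from this site, but I don't know how to use Beautiful Soup to access the individual keys and values inside "window.data".
For example, I want to get the values of yyuid, birthday, and so on.
The code looks like this: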
import urllib.request
import urllib.error
from bs4 import BeautifulSoup
import re

username = "itsahardday"
url = "https://likee.video/@" + username  # profile url - https://likee.video/account_name


def get_profile_html():
    '''
    Get profile data from HTML - https://likee.video/account_name
    :return:
    '''
    response = urllib.request.urlopen(url)
    soup = BeautifulSoup(response.read(), "html.parser")
    # Select the <script> tag whose text contains 'userinfo' and take its body
    results = soup.select_one("script:-soup-contains('userinfo')").string
    print(results)


get_profile_html()
Ideally I'd like to get it as JSON, but any solution is welcome.
Thanks in advance for your help!
Answer:
I adjusted your code so that the function returns the script text instead of only printing it.
import urllib.request
import urllib.error
from bs4 import BeautifulSoup
import re

username = "itsahardday"
url = "https://likee.video/@" + username  # profile url - https://likee.video/account_name


def get_profile_html():
    '''
    Get profile data from HTML - https://likee.video/account_name
    :return: the text of the <script> tag that contains 'userinfo'
    '''
    response = urllib.request.urlopen(url)
    soup = BeautifulSoup(response.read(), "html.parser")
    results = soup.select_one("script:-soup-contains('userinfo')").string
    print(results)
    return results  # add return


res = get_profile_html()  # save the result
Then convert it to JSON:
import json # import
json.loads(res.split(";")[0].split("window.data =")[1])['userinfo']
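If a value inside the data happens to contain a semicolon, splitting on ";" would cut the JSON short. A possible alternative is sketched below, assuming the script body contains "window.data = " followed by valid JSON (as the json.loads call above also assumes). The helper name extract_window_data and the sample string are made up for illustration and are not taken from the real page.

```python
import json
import re


def extract_window_data(script_text: str) -> dict:
    """Parse the JSON value assigned to window.data in a script body."""
    # Find the assignment and consume any whitespace after the '='.
    match = re.search(r"window\.data\s*=\s*", script_text)
    if match is None:
        raise ValueError("no 'window.data =' assignment found")
    # raw_decode parses one JSON value starting at the given index and
    # ignores whatever follows it (the trailing ';' and later statements).
    data, _end = json.JSONDecoder().raw_decode(script_text, match.end())
    return data


# Hypothetical sample; the real page's payload will differ.
sample = 'window.data = {"userinfo": {"yyuid": "123", "bio": "a;b"}}; window.other = 1;'
userinfo = extract_window_data(sample)["userinfo"]
print(userinfo["yyuid"])  # prints 123
```

With the code from the answer, this would be called as extract_window_data(res)["userinfo"], and individual fields such as yyuid can then be read as ordinary dictionary keys.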