Script Throws An Error When It Is Made To Run Using Multiprocessing
Solution 1:
It does not make sense to reference the global variable row in get_data(), because:
1. It's a global, and it will not be shared between the "threads" in the multiprocessing Pool, because they are actually separate Python processes that do not share globals.
2. Even if they did, you're building the entire ISBN list before executing get_info(), so the loop has already completed and row will always hold its final value (the last row the loop reached), not the row of the ISBN being processed.
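The second point can be seen without a spreadsheet or a process pool. Here is a minimal sketch in which max_row, the isbn-N strings and describe() are made-up stand-ins for the worksheet and the worker function:

```python
# Stand-in for worksheet rows 2..5; no workbook is needed for this demo.
max_row = 5

isbnlist = []
for row in range(2, max_row + 1):
    isbnlist.append(f"isbn-{row}")

# The loop is over, so the global `row` is frozen at the last value it took.


def describe(isbn):
    # Reads the global `row`, just like a worker that relies on it would.
    return f"{isbn} would be written to row {row}"


results = [describe(isbn) for isbn in isbnlist]
for line in results:
    print(line)
```

Every line reports the same row, which is why the row has to travel with each ISBN instead of being read from a global.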
So you would need to provide the row values as part of the data passed as the second argument of p.map(). But even if you were to do that, writing to and saving the spreadsheet from multiple processes is a bad idea due to Windows file locking, race conditions, etc. You're better off just building the list of titles with multiprocessing, and then writing them out once when that's done, as in the following:
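import requests
from bs4 import BeautifulSoup
from openpyxl import load_workbook
from multiprocessing import Pool


def get_info(isbn):
    # Search Amazon for the ISBN and follow the first result, if there is one.
    params = {
        'url': 'search-alias=aps',
        'field-keywords': isbn
    }
    res = requests.get("https://www.amazon.com/s/ref=nb_sb_noss?", params=params)
    soup = BeautifulSoup(res.text, "lxml")
    itemlink = soup.select_one("a.s-access-detail-page")
    if itemlink:
        return get_data(itemlink['href'])
    # Falls through and returns None when the search finds nothing.


def get_data(link):
    # Fetch the product page and return its title.
    res = requests.get(link)
    soup = BeautifulSoup(res.text, "lxml")
    try:
        itmtitle = soup.select_one("#productTitle").get_text(strip=True)
    except AttributeError:
        # select_one() returned None: the page has no #productTitle element.
        itmtitle = "N/A"
    return itmtitle


def main():
    wb = load_workbook('amazon.xlsx')
    ws = wb['content']

    # Collect ISBNs from column A, stopping at the first empty cell.
    isbnlist = []
    for row in range(2, ws.max_row + 1):
        val = ws.cell(row=row, column=1).value
        if val is None:
            break
        isbnlist.append(val)

    # map() blocks until every result is in, and the with block cleans up
    # the pool on exit, so no explicit terminate()/join() is needed.
    with Pool(10) as p:
        titles = p.map(get_info, isbnlist)

    # Write from the parent process only. Iterating over titles (not over
    # ws.max_row) avoids an IndexError if the loop above stopped early.
    for row, title in enumerate(titles, start=2):
        ws.cell(row=row, column=2).value = title

    wb.save("amazon.xlsx")


if __name__ == '__main__':
    main()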
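The other option mentioned above, passing the row along with each ISBN through p.map(), could look roughly like the sketch below. build_jobs(), lookup() and run() are hypothetical names, and lookup() returns a placeholder string where the real code would call get_info(). The workers still never touch the workbook; they only hand the row back with the result.

```python
from multiprocessing import Pool


def build_jobs(column_values, first_row=2):
    """Pair each ISBN with its row number, stopping at the first empty cell."""
    jobs = []
    for row, isbn in enumerate(column_values, start=first_row):
        if isbn is None:
            break
        jobs.append((row, isbn))
    return jobs


def lookup(job):
    """Worker: takes a (row, isbn) tuple and returns (row, title).

    The title here is a placeholder; the real lookup would go over the network.
    """
    row, isbn = job
    return row, f"title for {isbn}"


def run(column_values, workers=4):
    """Fan the jobs out to a pool and return the (row, title) pairs."""
    jobs = build_jobs(column_values)
    with Pool(workers) as p:
        return p.map(lookup, jobs)
```

Call run() from under an if __name__ == '__main__': guard, write each returned title to ws.cell(row=row, column=2) in the parent process, and save the workbook once at the end.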