Urlretrieve Not Working For This Site

May 24, 2024

I'm trying to download an image, however it does not seem to work. Is it being blocked by DDoS protection? Here is the code: urllib.request.urlretrieve('http://archive.is/Xx9t3/scr.pn

Solution 1:

For reasons that I cannot even imagine, the server requires a well-known user agent. So you must pretend to be, for example, Firefox, and it will then agree to send the image:

import urllib.request

# first build a request object with a browser-like User-Agent header
req = urllib.request.Request("http://archive.is/Xx9t3/scr.png",
        headers = {
           'User-agent':
              'Mozilla/5.0 (Windows NT 5.1; rv:43.0) Gecko/20100101 Firefox/43.0'})

# then use it and write the response body to a file
resp = urllib.request.urlopen(req)
with open("test.png", "wb") as fd:
    fd.write(resp.read())

Rather stupid, but when a server admin goes mad, just be as stupid as he is...
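If you would rather keep calling urlretrieve as in the question, one option is to install a global opener that sends a browser-like User-Agent on every request. This is only a minimal sketch: the helper name install_browser_opener and the constant BROWSER_UA are made up for illustration, and it assumes the server accepts the same Firefox string used above, which has not been verified here.

```python
import urllib.request

# Assumption: the server accepts this browser-like string (same one as in Solution 1).
BROWSER_UA = "Mozilla/5.0 (Windows NT 5.1; rv:43.0) Gecko/20100101 Firefox/43.0"


def install_browser_opener(user_agent=BROWSER_UA):
    """Make urlopen() and urlretrieve() send `user_agent` by default.

    Hypothetical helper: it replaces the opener's default header list
    and installs the opener globally for urllib.request.
    """
    opener = urllib.request.build_opener()
    # addheaders holds the default headers added to every request
    opener.addheaders = [("User-Agent", user_agent)]
    urllib.request.install_opener(opener)
    return opener
```

After calling install_browser_opener() once, a plain urllib.request.urlretrieve("http://archive.is/Xx9t3/scr.png", "test.png") goes through the installed opener and carries the browser header. Note that this affects every later urllib.request call in the process.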

Solution 2:

I'd advise you to use requests. Basically, the way you are trying to get the image is forbidden. Try this:

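import requests
import shutil

# stream=True defers downloading the body until it is read
r = requests.get('http://archive.is/Xx9t3/scr.png', stream=True)
if r.status_code == 200:
    with open("test.png", 'wb') as f:
        # let urllib3 decode gzip/deflate content before copying
        r.raw.decode_content = True
        shutil.copyfileobj(r.raw, f)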

This snippet was adapted from here

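The magic behind this is how the resource is retrieved. With requests, stream=True defers downloading the body so it can be copied to the file piece by piece. Some servers are more restrictive about which clients may pull resources such as media; requests also sends its own default User-Agent rather than urllib's, which is likely why it is accepted here and fits the explanation in Solution 1.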
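To see what a server might be filtering on, it can help to look at the User-Agent that urllib sends when you do not set one. The sketch below only inspects the default opener and makes no network request; the function name default_urllib_user_agent is made up for illustration.

```python
import urllib.request


def default_urllib_user_agent():
    """Return the User-Agent a fresh urllib opener would send by default."""
    opener = urllib.request.build_opener()
    # addheaders is a list of (name, value) pairs added to every request
    return dict(opener.addheaders).get("User-agent")


# Prints a string starting with "Python-urllib/" followed by the Python version
print(default_urllib_user_agent())
```

A server that rejects this value but accepts a browser string, or the default that requests sends, would explain why both solutions work while the plain urlretrieve call does not. That is an inference from the two answers, not something confirmed for this site.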
