Urlretrieve Not Working For This Site

I'm trying to download an image, but it doesn't seem to work. Is it being blocked by DDoS protection? Here is the code: urllib.request.urlretrieve('http://archive.is/Xx9t3/scr.pn

Solution 1:

For reasons I can't imagine, the server only answers requests that carry a well-known user agent. So you have to pretend to be a browser, for example Firefox, and the server will then send the image:

import urllib.request

# First build a request object that carries a browser-like User-Agent header
req = urllib.request.Request("http://archive.is/Xx9t3/scr.png",
        headers = {
           'User-agent':
              'Mozilla/5.0 (Windows NT 5.1; rv:43.0) Gecko/20100101 Firefox/43.0'})

# Then open it and write the response body to a file
resp = urllib.request.urlopen(req)
with open("test.png", "wb") as fd:
    fd.write(resp.read())

Rather stupid, but when a server admin goes mad, just be as stupid as he is...
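
If you would rather keep calling urlretrieve, as in the question, one option is to install a global opener that sends the browser-like header on every request. This is only a minimal sketch, assuming the User-Agent is the only thing the server checks. The helper names install_browser_opener and download are made up for illustration:

```python
import urllib.request

# Same Firefox string as in the snippet above; other browser strings may work too (assumption).
BROWSER_UA = "Mozilla/5.0 (Windows NT 5.1; rv:43.0) Gecko/20100101 Firefox/43.0"


def install_browser_opener(user_agent=BROWSER_UA):
    """Hypothetical helper: make urlopen/urlretrieve send `user_agent` by default."""
    opener = urllib.request.build_opener()
    # Overwrite the default "Python-urllib/x.y" identification
    opener.addheaders = [("User-Agent", user_agent)]
    urllib.request.install_opener(opener)
    return opener


def download(url, filename):
    """Hypothetical helper: fetch `url` into `filename` via urlretrieve."""
    install_browser_opener()
    path, _headers = urllib.request.urlretrieve(url, filename)
    return path
```

Calling download("http://archive.is/Xx9t3/scr.png", "test.png") should then save the image, provided the assumption about the user agent holds. Keep in mind that install_opener affects every later urlopen call in the same process.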

Solution 2:

I'd advise you to use requests. Basically, the way you are trying to get the image is forbidden. Check this:

import requests
import shutil

r = requests.get('http://archive.is/Xx9t3/scr.png', stream=True)
if r.status_code == 200:
    with open("test.png", 'wb') as f:
        # Decode any gzip/deflate content encoding while copying the raw stream
        r.raw.decode_content = True
        shutil.copyfileobj(r.raw, f)

This snippet was adapted from here

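With requests, stream=True only defers downloading the response body so that it can be copied to the file in chunks; it does not by itself change how the server treats the request. A more likely reason this works is that requests identifies itself with its own default User-Agent header rather than urllib's, and some servers are restrictive about which clients may pull resources such as media.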
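
To see how the two clients identify themselves by default, which is one plausible explanation for the different server behaviour, you can print their default User-Agent strings. This is a small sketch that sends no request; the function names are illustrative only:

```python
import urllib.request


def default_urllib_user_agent():
    """Return the User-Agent that urllib sends when none is set explicitly."""
    opener = urllib.request.build_opener()
    # A fresh opener carries one default header: ('User-agent', 'Python-urllib/x.y')
    return dict(opener.addheaders)["User-agent"]


def default_requests_user_agent():
    """Return requests' default User-Agent, or None if requests is not installed."""
    try:
        import requests
    except ImportError:
        return None
    return requests.utils.default_user_agent()


print("urllib:  ", default_urllib_user_agent())
print("requests:", default_requests_user_agent())
```

If the server rejects one of these strings but not the other, that would account for urlretrieve failing while requests.get succeeds.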