Download an Image


Here we will see how to:

  • download an image with urllib
  • download an image with urllib2
  • download a protected image with urllib2 and wget
  • get cookies from Firefox 3 and use cookies.txt with wget

Download an image with urllib


import urllib

urllib.urlretrieve("http://japancsaj.com/pic/susu/Susu_2.jpg", "out.jpg")


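Some servers reject the default Python user agent. To send a browser-like one, subclass FancyURLopener:

import urllib

class MyOpener(urllib.FancyURLopener):
    version = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv: Gecko/20071127 Firefox/'

# Create an opener so we can change its user agent.
opener = MyOpener()
urlopen = opener.open
urlretrieve = opener.retrieve

# Call the opener's retrieve(), not urllib.urlretrieve(), so the custom user agent is actually sent.
urlretrieve("http://japancsaj.com/pic/susu/Susu_2.jpg", "out.jpg")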

The latter tip comes from the manwer project.
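The snippets above target Python 2. On Python 3 the urllib module was reorganized, and the same idea can be sketched with urllib.request. This is a minimal sketch rather than the original author's code: the helper names build_request and download and the shortened user-agent string are assumptions made for illustration.

```python
import shutil
import urllib.request

# Hypothetical user-agent string, used only for illustration.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64)"


def build_request(url, user_agent=USER_AGENT):
    """Return a Request object that carries a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def download(url, out_path, user_agent=USER_AGENT):
    """Stream the response body for `url` into the file `out_path`."""
    req = build_request(url, user_agent)
    with urllib.request.urlopen(req) as response, open(out_path, "wb") as out:
        shutil.copyfileobj(response, out)
```

Calling download("http://japancsaj.com/pic/susu/Susu_2.jpg", "out.jpg") would then play the role of the urlretrieve call above.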

Download an image with urllib2

import urllib2

req = urllib2.Request("http://japancsaj.com/pic/susu/Susu_2.jpg")
response = urllib2.urlopen(req)

output = open('out.jpg', 'wb')
output.write(response.read())   # write the image data to disk
output.close()


Download a protected image with urllib2

There is a nice site at http://japancsaj.blog.hu/ with great photos. However, if you try to download one of its images from outside the browser, you get an HTML file instead of the image. The trick is to send a Referer header with the request:

import urllib2

referer = 'http://japancsaj.com'
image   = 'http://japancsaj.com/pic/susu/Susu_2.jpg'

req = urllib2.Request(image)
req.add_header('Referer', referer)   # here is the trick
response = urllib2.urlopen(req)

output = open('out.jpg', 'wb')
output.write(response.read())   # write the image data to disk
output.close()

The same thing with wget:

wget  --referer=http://japancsaj.com   http://japancsaj.com/pic/susu/Susu_2.jpg
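Since a missing referer silently yields an HTML page instead of the image, it can help to check what came back before saving it. The following Python 3 sketch adds the Referer header and refuses to write anything that is not an image. The helper names are hypothetical, and it assumes the server labels its HTML reply with a non-image Content-Type, which has not been verified for this site.

```python
import urllib.request


def build_image_request(image_url, referer):
    """Return a Request for `image_url` that carries a Referer header."""
    req = urllib.request.Request(image_url)
    req.add_header("Referer", referer)
    return req


def is_image(content_type):
    """True if a Content-Type header value names an image type."""
    media_type = content_type.split(";")[0].strip().lower()
    return media_type.startswith("image/")


def fetch_image(image_url, referer, out_path):
    """Download `image_url` to `out_path`, failing if the reply is not an image."""
    req = build_image_request(image_url, referer)
    with urllib.request.urlopen(req) as response:
        content_type = response.headers.get("Content-Type", "")
        if not is_image(content_type):
            raise ValueError("expected an image, got %r" % content_type)
        data = response.read()
    with open(out_path, "wb") as out:
        out.write(data)
```

With the values used above, the call would be fetch_image('http://japancsaj.com/pic/susu/Susu_2.jpg', 'http://japancsaj.com', 'out.jpg').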

Get cookies from Firefox 3 and use cookies.txt with wget

To solve the japancsaj mystery, I first tried to combine cookies with wget. The real solution in the end was to use a referer (as seen before), but I'll share what I learnt about cookies anyway. Maybe it'll be handy one day.

So, to get cookies from Firefox 3, you can use this script (copied from http://0x7be.de/2008/06/19/firefox-3-und-cookiestxt/):

import sqlite3 as db
import sys

# Replace USERNAME and PROFILE with your own user name and Firefox profile directory.
cookiedb = '/home/USERNAME/.mozilla/firefox/PROFILE/cookies.sqlite'
targetfile = '/home/USERNAME/cookies.txt'
what = sys.argv[1]   # substring of the host name to search for
connection = db.connect(cookiedb)
cursor = connection.cursor()
contents = "host, path, isSecure, expiry, name, value"

cursor.execute("SELECT " +contents+ " FROM moz_cookies WHERE host LIKE '%" 
               +what+ "%'")

# Write one tab-separated line per cookie, in the cookies.txt layout.
outfile = open(targetfile, 'w')
index = 0
for row in cursor.fetchall():
  outfile.write("%s\tTRUE\t%s\t%s\t%d\t%s\t%s\n" % (row[0], row[1],
                str(bool(row[2])).upper(), row[3], str(row[4]), str(row[5])))
  index += 1
outfile.close()
connection.close()

print "Searched for: %s" % what
print "Exported: %d" % index
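The query in the script above splices the command-line argument straight into the SQL string. A possible variant, sketched here for Python 3, binds the search term as a parameter and returns the lines instead of writing them, which makes it easy to try against a test database. The function name export_cookies is hypothetical, and the column names are simply those used by the script above, so they may not match other Firefox versions.

```python
import sqlite3


def export_cookies(connection, host_fragment):
    """Return cookies.txt lines for every host containing `host_fragment`."""
    cursor = connection.execute(
        "SELECT host, path, isSecure, expiry, name, value "
        "FROM moz_cookies WHERE host LIKE ?",
        ("%" + host_fragment + "%",),  # bound parameter, not string concatenation
    )
    lines = []
    for host, path, is_secure, expiry, name, value in cursor:
        secure = "TRUE" if is_secure else "FALSE"
        # Same field order as the script above; the second field is always TRUE.
        lines.append("\t".join(
            [host, "TRUE", path, secure, str(expiry), name, value]))
    return lines
```

To write the file, open a connection with sqlite3.connect on your cookies.sqlite path and write "\n".join(lines) plus a trailing newline to cookies.txt.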


Normally you could also use a Firefox plugin called Export Cookies for this purpose, but it doesn't seem to work with Firefox 3.5. Fortunately, we have this nice Python solution. The next step is to run the script:

./export-firefox-cookies.py japancsaj

It produces a file called cookies.txt that contains all the cookies whose host matches 'japancsaj'. Pass this file to wget:

wget --load-cookies=cookies.txt http://what_to_download

Alternatively, you can ask wget to produce the cookie.txt file itself:

wget --cookies=on --keep-session-cookies --save-cookies=cookie.txt http://first_page
wget --referer=http://first_page --cookies=on --load-cookies=cookie.txt http://second_page
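A cookies.txt file can also be used from Python instead of wget. The sketch below relies on http.cookiejar.MozillaCookieJar from the Python 3 standard library. Note that this class expects a "# Netscape HTTP Cookie File" header on the first line, which the export script above does not write, so that line would have to be added by hand. The helper names load_cookie_jar and opener_with_cookies are hypothetical.

```python
import http.cookiejar
import urllib.request


def load_cookie_jar(path):
    """Load a cookies.txt file into a MozillaCookieJar."""
    jar = http.cookiejar.MozillaCookieJar(path)
    # Keep session cookies and expired cookies so nothing is silently dropped.
    jar.load(ignore_discard=True, ignore_expires=True)
    return jar


def opener_with_cookies(jar, referer=None):
    """Build a URL opener that sends the jar's cookies and, optionally, a referer."""
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    if referer is not None:
        opener.addheaders.append(("Referer", referer))
    return opener
```

The resulting opener has an open(url) method that can be used in place of urllib.request.urlopen.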

Page last modified on 2010 September 10, 03:32