The urllib2 module contains a number of utilities for simple access to data on the web.
For instance, the example below lets the user enter a series of URLs, and displays both the metadata and the contents of each document.
    import sys
    import urllib2

    # read a url (strip() removes the newline that readline() leaves on it)
    print "Enter the first url (or just enter to quit):"
    chosenurl = sys.stdin.readline().strip()

    # keep processing urls as long as the user enters them
    while chosenurl:
        urlfile = urllib2.urlopen(chosenurl)
        print "The associated metadata is: ", urlfile.info()
        print "The file content is:"
        for nextline in urlfile:
            # trailing comma: each line already ends with its own newline
            print nextline,
        urlfile.close()
        print "Enter the next url (or just enter to quit):"
        chosenurl = sys.stdin.readline().strip()
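For readers on Python 3, where urllib2 was folded into urllib.request, the fetch-and-display step might look like the sketch below. The helper name fetch and the data: URL used for the demonstration are illustrative assumptions, chosen so the example runs without network access; any http:// URL can be passed instead.

```python
import urllib.request


def fetch(url):
    """Open url and return (metadata, content): a headers object and bytes."""
    # urlopen here is the Python 3 counterpart of urllib2.urlopen;
    # strip() drops the newline left behind when the URL is typed in
    with urllib.request.urlopen(url.strip()) as urlfile:
        return urlfile.info(), urlfile.read()


if __name__ == "__main__":
    # hypothetical demo input: a data: URL carries its own content
    metadata, content = fetch("data:text/plain,Hello%20web\n")
    print("The associated metadata is:")
    print(metadata)
    print("The file content is:")
    print(content.decode("utf-8"))
```

Returning the metadata and the content together keeps the sketch close to the script above, which prints one after the other.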
Similarly, here is a simple Python script that lets you access a URL that requires basic authentication (i.e. the user must provide a username and password when requesting the URL).
    #! /usr/bin/env python
    import urllib2, sys, base64

    # get the url desired, as well as the username/password to be used
    # (strip() removes the newline that readline() leaves on each value)
    print "Enter the url:"
    chosenurl = sys.stdin.readline().strip()
    print "Enter your username:"
    username = sys.stdin.readline().strip()
    print "Enter the password (warning: this is not encrypted)"
    password = sys.stdin.readline().strip()

    # request the url
    request = urllib2.Request(chosenurl)

    # respond to the authentication request with the name/pwd
    base64string = base64.b64encode('%s:%s' % (username, password))
    request.add_header("Authorization", "Basic %s" % base64string)

    # open the resulting resource and read its contents
    htmlFile = urllib2.urlopen(request)
    htmlData = htmlFile.read()
    print htmlData
    htmlFile.close()

Some of the other commonly used routines from urllib include:
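As an illustration, two routines that are frequently used alongside urlopen are quote and urlencode; they are offered here as typical examples, on the assumption that they are representative of what this section has in mind. In Python 2 they live in urllib; the sketch below uses their Python 3 home, urllib.parse. The helper build_url and the example address are hypothetical.

```python
from urllib.parse import quote, urlencode


def build_url(base, path, params):
    """Assemble a URL from a base, a path segment, and query parameters."""
    # quote() percent-encodes characters that are not safe in a path
    # urlencode() turns (name, value) pairs into a query string
    return "%s/%s?%s" % (base, quote(path), urlencode(params))


if __name__ == "__main__":
    # hypothetical host and parameters, for illustration only
    print(build_url("http://example.com", "my notes",
                    [("q", "python cgi"), ("page", 2)]))
```

Note that quote encodes a space as %20, while urlencode encodes it as +, which is the convention for form-style query strings.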
Python and email
While the email module contains tools for handling more sophisticated message structures, even the smtplib module contains a number of mail handling utilities, e.g. to send mail:
    import smtplib

    server = smtplib.SMTP('localhost')
    # the message is a header block, a blank line, then the body
    server.sendmail('someonesending@somewhere', 'someonegetting@somewhereelse',
    """To: someonegetting@somewhereelse
    From: someonesending@somewhere

    The body of the email.
    """)
    server.quit()
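When a message needs more than a couple of hand-typed headers, the email module can build it and smtplib can send it. The following is a minimal Python 3 sketch of that combination; the helper names build_message and send, and the subject line, are illustrative assumptions rather than part of the example above, and send needs a mail server listening on the given host.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Return an EmailMessage with the usual headers and a plain-text body."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send(msg, host="localhost"):
    """Hand msg to the SMTP server on host (assumes one is running there)."""
    with smtplib.SMTP(host) as server:
        # send_message reads the sender and recipients from the headers
        server.send_message(msg)


if __name__ == "__main__":
    msg = build_message("someonesending@somewhere",
                        "someonegetting@somewhereelse",
                        "A test message",  # hypothetical subject
                        "The body of the email.")
    # print instead of sending, so the sketch runs without a mail server
    print(msg.as_string())
```

Building the message as an object means the blank line between headers and body, and any encoding of non-ASCII text, are handled by the library instead of by hand.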
Python for CGI
Output is typically generated by a Python CGI script simply using the print statement, e.g.
    print "Content-type: text/html"
    print    # an empty line marks the end of the headers
    print "<html><body>"
    print "Hi!"
    print "</body></html>"

For obtaining form data passed to the CGI script, life is simplest if we use the Python CGI module:

    import cgi
Note: if, during debugging, you want the browser to display error information that results from bugs in your scripts, include the following additional line:

    import cgitb; cgitb.enable()

If, on the other hand, you'd like to have this information dumped to a file instead of displayed in the browser, include this line instead:

    import cgitb; cgitb.enable(display=0, logdir="/tmp")
Submitted form data is available via the FieldStorage class, so we can capture that information with statements like:

    myformdata = cgi.FieldStorage()
To extract the data from form field "name", we can use statements like:
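    namevalue = myformdata["name"].value

Note that sometimes a field contains multiple values. In that case myformdata["name"] is itself a list rather than a single item, so it is safer to fetch the value with getvalue() and check whether the result is a list:

    namevalue = myformdata.getvalue("name")
    if isinstance(namevalue, list):
        # several values were submitted; handle each one in turn
        for onevalue in namevalue:
            pass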
There are a huge number of other possibilities, but this should at least give you a start at Python CGI.
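To make the single-versus-multiple value check concrete, here is a small Python 3 sketch. Because the cgi module was removed from the standard library in Python 3.13, the sketch swaps in urllib.parse.parse_qs as a stand-in for FieldStorage; the helper names get_field and describe, and the sample query string, are hypothetical.

```python
from urllib.parse import parse_qs


def get_field(query_string, name, default=None):
    """Return a string, a list of strings, or default for the named field.

    This mirrors the string-or-list result described above for form fields.
    """
    values = parse_qs(query_string).get(name)
    if values is None:
        return default
    return values[0] if len(values) == 1 else values


def describe(value):
    """Show how a caller can branch on the two possible shapes."""
    if isinstance(value, list):
        return "multiple values: " + ", ".join(value)
    return "single value: %s" % value


if __name__ == "__main__":
    # hypothetical form submission with one repeated field
    submitted = "name=Ann&colour=red&colour=blue"
    print(describe(get_field(submitted, "name")))
    print(describe(get_field(submitted, "colour")))
```

In a real CGI script the query string would come from the QUERY_STRING environment variable (for GET requests) rather than from a literal.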
Cookie handling in Python
Naturally, Python also provides support for handling cookies. Here is a very simple example showing how to check for existing cookies, modify them if necessary, and create them if they don't exist.
    #! /usr/bin/env python

    # grab the modules we'll use
    import os, cgi, Cookie, time
    from Cookie import SimpleCookie

    # Looking up, creating, and adjusting simple cookies
    # ==================================================

    # a cookie that counts how often the user has visited this page

    # check to see if there is already a defined cookie
    if os.environ.has_key('HTTP_COOKIE'):
        countercookie = SimpleCookie(os.environ['HTTP_COOKIE'])
    # otherwise we'll define one here
    else:
        countercookie = SimpleCookie()

    # if a value already exists for a cookie named 'counter' grab that
    # otherwise we want to set one with value 0
    initialvalues = {'counter': 0 }
    for key in initialvalues.keys():
        if not countercookie.has_key(key):
            countercookie[key] = initialvalues[key]

    # increment the 'counter' component of the cookie by 1
    countercookie['counter'] = int(countercookie['counter'].value) + 1

    # if we want to set an expiry date as N seconds in the future,
    # simply use the following (here with N = 86400 seconds, or 24 hours)
    countercookie['counter']['expires'] = 86400

    # send the updated cookie back to the browser; the Set-Cookie header
    # must be printed before the blank line that ends the headers
    print countercookie

    # print our HTML form, specifying the number of visits
    print "Content-type: text/html"
    print
    print "<html><body>"
    print "Visit number: "
    print countercookie['counter'].value
    print "</body></html>"

    # Aside: here's how to grab the current time
    # if you're planning on using/manipulating that
    currtime = time.gmtime(time.time())
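The same visit counter can be sketched in Python 3, where the Cookie module is named http.cookies. The function names bump_counter and cgi_response are illustrative assumptions, and the logic is split into functions only so that it can be tried without a web server.

```python
import os
from http.cookies import SimpleCookie


def bump_counter(cookie_header, lifetime=86400):
    """Parse a Cookie header and return a cookie with 'counter' incremented."""
    cookie = SimpleCookie(cookie_header)
    # start from 0 if the browser did not send a 'counter' cookie
    if "counter" not in cookie:
        cookie["counter"] = 0
    cookie["counter"] = int(cookie["counter"].value) + 1
    # an integer expiry is interpreted as seconds from now
    cookie["counter"]["expires"] = lifetime
    return cookie


def cgi_response(cookie):
    """Return the full CGI output: headers, a blank line, then the page."""
    visits = cookie["counter"].value
    return "\n".join([
        cookie.output(),            # the Set-Cookie header line
        "Content-type: text/html",
        "",                         # blank line ends the headers
        "<html><body>Visit number: %s</body></html>" % visits,
    ])


if __name__ == "__main__":
    # HTTP_COOKIE is set by the web server; it is absent on a first visit
    print(cgi_response(bump_counter(os.environ.get("HTTP_COOKIE", ""))))
```

Keeping the Set-Cookie line inside the header block is what makes the browser store the new value; anything printed after the blank line is treated as page content.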