
Beautiful Soup in Django


Beautiful Soup is a handy tool for extracting data from XML or HTML streams (files or strings). It proved quite helpful when I used the reference macro with XML data copied directly from Zotero into my edit box.

Everything worked perfectly and was more stable than with the tool I had used before. However, when I tried to retrieve a page through the Apache web server, the system locked up. I found that this bug had already been reported and discussed. It is specific to Beautiful Soup version 4; version 3 works fine with Apache as well.

Beautiful Soup versions 3 and 4 can run side by side on the same system: installing BS3 only requires copying the file BeautifulSoup.py into the working directory and making the following replacement in the code.

#from bs4 import BeautifulSoup as bs
from BeautifulSoup import BeautifulStoneSoup as bs

In addition, find_all, which is valid in BS4, has to be replaced by findAll, virtually the same command in BS3. That's it.
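
If the same code has to run against either version, the method rename can be hidden behind a small helper. The following is only a sketch: the helper name find_all_compat and the two stand-in classes are made up for illustration and are not part of Beautiful Soup. The stand-ins only mimic the method names so that the example runs without either library installed.

```python
def find_all_compat(soup, *args, **kwargs):
    """Call find_all (the BS4 name) if it exists, otherwise findAll (the BS3 name)."""
    method = getattr(soup, "find_all", None)
    # callable() instead of hasattr(): as far as I know, BS3 objects may treat
    # an unknown attribute name as a child-tag lookup and return None for it.
    if not callable(method):
        method = getattr(soup, "findAll")
    return method(*args, **kwargs)


# Hypothetical stand-ins for parsed documents, one per naming style.
class FakeBS4Soup:
    def find_all(self, name):
        return ["bs4:" + name]


class FakeBS3Soup:
    def findAll(self, name):
        return ["bs3:" + name]


print(find_all_compat(FakeBS4Soup(), "title"))
print(find_all_compat(FakeBS3Soup(), "title"))
```

With a real soup object the call would look the same, for example find_all_compat(soup, "title"), whichever of the two import lines above is active.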

Tags: Software


(c) Mato Nagel, Weißwasser 2004-2013, Disclaimer