In this short guide I explain how to build libxml2 for use with Python lxml on Linux (Debian in my case). I had to do this because I wanted to run the Springer Downloader. I'm fairly new to Linux, and especially to compiling anything on it, so I will write down the problems I ran into; maybe they will help someone. Comments on how to improve what I did are of course welcome.
Downloading what we need
lxml and libxml
When you go to the lxml website you will find that under Linux you can download the source of lxml and of the two libraries it depends on, as stated here. Quote:
libxml2 2.6.21 or later. It can be found here: http://xmlsoft.org/downloads.html
libxslt 1.1.15 or later. It can be found here: http://xmlsoft.org/XSLT/downloads.html
- So first of all download lxml itself (in my case this was lxml 3.2.1.tgz) and unpack it.
EDIT: You do not need the next two steps (downloading libxml2 and libxslt). I found a nicer way; thanks to tovotu for pointing it out, which made me try again.
- Follow the link above to the FTP server and download libxml2-2.9.0.tar.gz. Make sure you get the .tar.gz file. I had to use 2.9.0 even though 2.9.1 was already out; with the newer one I got errors because the older version of the library was requested, no idea why. Unpack it.
- Then, also on the FTP, download libxslt-1.1.28.tar.gz (or newer) and unpack it.
gcc, make, python-dev
Now make sure you have a few things installed via apt-get. Open your console and check that you can use the sudo command. Then (for Debian-based systems like Ubuntu) type in:
- sudo apt-get install python2.7
- sudo apt-get install make
- sudo apt-get install gcc
- sudo apt-get install python-dev
- sudo apt-get install libxml2
- sudo apt-get install libxml2-dev
- sudo apt-get install libxslt1.1
You might have some of those packages already installed.
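If you are not sure which of the command-line tools are already there, a small Python sketch like the following can tell you. It only looks for executables on the PATH, so it covers gcc, make and python2.7 but not the header packages (python-dev, libxml2-dev). The function names find_on_path and missing_tools are my own and not part of any library.

```python
import os


def find_on_path(name):
    """Return the full path of an executable called `name`, or None."""
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_tools(names):
    """Return the names from `names` that are not found on PATH."""
    return [name for name in names if find_on_path(name) is None]


if __name__ == "__main__":
    # Only executables can be checked this way; python-dev and
    # libxml2-dev install headers, not commands.
    for tool in missing_tools(["gcc", "make", "python2.7"]):
        print("not found on PATH: %s" % tool)
```

Anything it reports as missing can then be installed with the matching apt-get line from the list.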
Compiling
Now we have to compile the two libraries we downloaded before.
EDIT: As I found out, you do not need the first two steps (compiling libxml2 and libxslt) once you have installed libxml2-dev. Go straight to the last step (lxml 3.2.1.tgz).
- Go to the folder where you unpacked libxml2-2.9.0.tar.gz and start a console there. You can most likely do this via right-click somewhere in the folder; otherwise change directory with 'cd'. In the console enter the following (yes, I know this can be done in one line):
./configure
make
sudo make install
This might produce some errors, but most likely it will work.
- Go to the folder where you unpacked libxslt-1.1.28.tar.gz and start a console there.
Now type in the same commands as above. Do not do this step before the libxml2 one or it will not work!
- Go to the folder where you unpacked lxml 3.2.1.tgz and start a console there. Type in:
sudo python setup.py install
This will install the Python library into the Python directory.
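To check that the install worked, you can ask lxml which library versions it was built against and which ones it actually loaded. This is only a sketch; lxml_versions is a helper name I made up, while the version constants come from lxml.etree. If the compiled and the runtime libxml2 versions differ, that is a hint that the wrong copy of the library is being picked up.

```python
def lxml_versions():
    """Return lxml/libxml2/libxslt version tuples, or None if import fails."""
    try:
        from lxml import etree
    except ImportError as error:
        print("lxml could not be imported: %s" % error)
        return None
    return {
        "lxml": etree.LXML_VERSION,
        "libxml2 (compiled against)": etree.LIBXML_COMPILED_VERSION,
        "libxml2 (loaded at runtime)": etree.LIBXML_VERSION,
        "libxslt (compiled against)": etree.LIBXSLT_COMPILED_VERSION,
        "libxslt (loaded at runtime)": etree.LIBXSLT_VERSION,
    }


if __name__ == "__main__":
    versions = lxml_versions()
    if versions is not None:
        for name in sorted(versions):
            # Each value is a tuple of integers, e.g. (2, 9, 0).
            print("%s: %s" % (name, ".".join(str(n) for n in versions[name])))
```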
Fixing an error
Now you can try to run, for example, the Springer Downloader. At least for me it failed with this error:
ImportError: /usr/lib/i386-linux-gnu/libxml2.so.2: version `LIBXML2_2.9.0' not found (required by /usr/local/lib/python2.7/dist-packages/lxml/etree.so)
This error occurs because libxml2.so.2.9.0 was installed to /usr/local/lib/libxml2.so.2.9.0, while lxml loads the libxml2.so.2 in /usr/lib/i386-linux-gnu.
You can see this by typing
sudo locate libxml2.so
I don't know why this is the case. For me there was a /usr/lib/i386-linux-gnu/libxml2.so.2.8.0, probably because that one was installed via the Debian package(?).
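If locate is not installed, roughly the same information can be collected with a few lines of Python. This is a sketch under assumptions: the two directories are simply the ones that show up in the error message and in my case, and find_libxml2 is a made-up helper name.

```python
import glob
import os

# Directories taken from the error message and the text above; adjust
# them for your system (for example x86_64-linux-gnu on 64-bit Debian).
CANDIDATE_DIRS = ["/usr/lib/i386-linux-gnu", "/usr/local/lib"]


def find_libxml2(directories=CANDIDATE_DIRS):
    """Return (path, resolved target) pairs for every libxml2.so* file found."""
    found = []
    for directory in directories:
        pattern = os.path.join(directory, "libxml2.so*")
        for path in sorted(glob.glob(pattern)):
            # realpath follows symlinks such as libxml2.so.2 -> libxml2.so.2.8.0
            found.append((path, os.path.realpath(path)))
    return found


if __name__ == "__main__":
    for path, target in find_libxml2():
        print("%s -> %s" % (path, target))
```

The second column shows which real file a symlink like libxml2.so.2 points to, which is the file that actually gets loaded.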
So we have to copy the files to the correct location:
sudo cp /usr/local/lib/libxml2.so.2.9.0 /usr/lib/i386-linux-gnu/libxml2.so.2.9.0
sudo cp /usr/local/lib/libxml2.so.2 /usr/lib/i386-linux-gnu/libxml2.so.2
Well, that is it. If you need additional packages, for example pyPdf and cssselect for the Springer Downloader I mentioned, download them and install them as above via 'sudo python setup.py install'.
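Before starting the downloader you can quickly check whether all modules can be imported. A minimal sketch, assuming the import names are lxml.etree, pyPdf and cssselect; missing_modules is a hypothetical helper of mine.

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError as error:
            print("%s: %s" % (name, error))
            missing.append(name)
    return missing


if __name__ == "__main__":
    # lxml.etree is listed explicitly because the shared-library error
    # described above only shows up when etree.so is actually loaded.
    missing_modules(["lxml.etree", "pyPdf", "cssselect"])
```

If it prints nothing, all three modules were imported without an error.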