libxml2 for Nokogiri in JRuby

I’ve been playing with Nokogiri on JRuby 1.3.0, and this message is displayed when I require 'nokogiri':

macbookpro:~ douglas$ jirb
irb(main):001:0> require 'rubygems'
=> true
irb(main):002:0> require 'nokogiri'
HI. You're using libxml2 version 2.6.16 which is over 4 years old and has
plenty of bugs. We suggest that for maximum HTML/XML parsing pleasure, you
upgrade your version of libxml2 and re-install nokogiri. If you like using
libxml2 version 2.6.16, but don't like this warning, please define the constant
=> true

That can’t be right: I’ve just upgraded libxml2 and libxslt in my MacPorts installation. Digging further, I don’t get the message when doing the same thing on MRI. A quick look with lsof reveals that under JRuby the shared libraries are being loaded from /usr/lib instead of /opt/local/lib (MacPorts is installed into /opt/local).

lsof -c java
java 390 douglas txt REG 14,2 290736 898139 /usr/lib/libexslt.0.dylib

lsof -c ruby
ruby 865 douglas txt REG 14,2 72156 1871557 /opt/local/lib/libexslt.0.8.13.dylib
ruby 865 douglas txt REG 14,2 218688 1871571 /opt/local/lib/libxslt.1.1.24.dylib
ruby 865 douglas txt REG 14,2 1251948 1866501 /opt/local/lib/libxml2.2.7.3.dylib

Nokogiri uses Ruby FFI to dynamically load native C code, and FFI uses dlopen to do the actual loading of dynamic libraries. On OS X, dlopen searches the directories listed in several environment variables (LD_LIBRARY_PATH, DYLD_LIBRARY_PATH, DYLD_FALLBACK_LIBRARY_PATH) as well as the current working directory.
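To make that lookup a bit more concrete, here is a rough Ruby sketch that lists the candidate directories and reports the first one containing a given library file. It is only a simplified model of the description above, not dlopen's real algorithm: the helper names (candidate_dirs, first_match) are made up for illustration, and the order in which dlopen actually consults these locations may differ from the order used here.

```ruby
# Simplified model of the lookup described above, NOT dlopen's real algorithm.
# The helper names below are hypothetical and exist only for this sketch.
SEARCH_VARS = %w[LD_LIBRARY_PATH DYLD_LIBRARY_PATH DYLD_FALLBACK_LIBRARY_PATH].freeze

# Collect the directories named by the environment variables, then the
# current working directory, dropping blank entries and duplicates.
def candidate_dirs(env = ENV, cwd = Dir.pwd)
  dirs = SEARCH_VARS.flat_map { |name| env.fetch(name, "").split(":") }
  (dirs + [cwd]).reject(&:empty?).uniq
end

# Return the first existing path for lib_name, or nil if none is found.
def first_match(lib_name, dirs = candidate_dirs)
  dirs.map { |dir| File.join(dir, lib_name) }.find { |path| File.exist?(path) }
end

puts candidate_dirs
puts first_match("libexslt.0.dylib").inspect
```

If none of the variables is set, the list contains only the current directory, which is a hint that the library is being found through some default location instead.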

Setting LD_LIBRARY_PATH to /opt/local/lib worked for me. The environment variables that dlopen consults differ between platforms, so check the dlopen man page for your system if things don’t seem to work.
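If you would rather not change your shell environment globally, one option is to set the variable only for the process you launch. The snippet below is a minimal sketch of that idea using Ruby's Kernel#system with an environment hash. The helper name with_lib_path is hypothetical, and the default directory assumes a MacPorts install under /opt/local as described above.

```ruby
# Hypothetical helper: run a command with LD_LIBRARY_PATH set for that child
# process only. The default assumes MacPorts is installed into /opt/local.
def with_lib_path(*command, lib_dir: "/opt/local/lib")
  env = { "LD_LIBRARY_PATH" => lib_dir }
  # Kernel#system merges env into the child's environment and returns
  # true when the command exits successfully.
  system(env, *command)
end

# Example usage (starts JRuby's irb with the variable set):
# with_lib_path("jirb")
```

The parent process and the rest of your session keep their original environment, so this is easy to undo if it does not help.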