However, setting up the file indexer is a little tricky. In this short guide we present the steps needed to enable Xapian on a CentOS 6.3 machine. Similar steps apply to other Linux distributions. Unfortunately, Windows is not supported for the time being.
In the following we assume that an eFront instance is already up and running on the server. In addition, all commands should be run with root privileges.
- Add the Xapian-related RPM key:
- Add the Xapian-related RPM:
- Update yum:
- yum update
- Install Xapian, along with its PHP bindings [1]:
- yum install xapian-omega xapian-bindings-php
- Install poppler-utils, which provides the PDF manipulation tools:
- yum install poppler-utils
- Enable the Xapian extension by adding the following line to php.ini:
- extension=xapian.so
- Enable the EPEL repository, which is needed for catdoc installation:
- wget http://dl.fedoraproject.org/pub/epel/6/i386/epel-release-6-7.noarch.rpm
- rpm -Uvh epel-release-6-7.noarch.rpm
- Install catdoc:
- yum install catdoc
[1] PHP bindings are not available on Ubuntu from the official repository due to some license restrictions. There are various sites discussing workarounds for this.
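The php.ini step above can be made repeatable with a small shell helper. This is only a sketch: the function name enable_xapian_extension is made up for illustration, and the php.ini path is passed as an argument because its location differs between systems.

```shell
#!/bin/sh
# Hypothetical helper: append "extension=xapian.so" to the given php.ini
# unless an uncommented line enabling it is already present.
enable_xapian_extension() {
    ini_file="$1"
    if grep -Eq '^[[:space:]]*extension[[:space:]]*=[[:space:]]*xapian\.so' "$ini_file"; then
        echo "xapian extension already enabled in $ini_file"
    else
        # Leading newline guards against a file that lacks a final newline.
        printf '\nextension=xapian.so\n' >> "$ini_file"
        echo "xapian extension added to $ini_file"
    fi
}

# Example call (the path is an assumption; adjust it to your system):
# enable_xapian_extension /etc/php.ini
```

Running the helper twice leaves a single extension line in the file. Afterwards, php -m | grep -i xapian should list the module if the PHP CLI picked it up.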
We now need to install unoconv, which depends on OpenOffice or LibreOffice.
For the sake of this tutorial, we’ll be using OpenOffice libraries:
- Enable the RPMForge repo, based on the instructions found at
- Install the OpenOffice parts that we’re interested in:
- yum install openoffice.org-pyuno openoffice.org-headless
- Install unoconv:
- Install the remaining OpenOffice components (Writer, Impress and Calc):
- yum install openoffice.org-writer
- yum install openoffice.org-impress
- yum install openoffice.org-calc
After this, you should have a Xapian installation that is ready to work with PHP. Initialize it by running the following command (assuming that the PHP CLI is at /usr/bin/php and eFront is located in /var/www/efront/):
/usr/bin/php /var/www/efront/libraries/external/xapian_cron.php
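Before initializing, it can help to confirm that the command-line tools installed in the previous steps are actually on the PATH. The following is a minimal sketch; check_tools is a hypothetical helper name, and the tool list (php, pdftotext from poppler-utils, catdoc, unoconv) is our reading of the packages installed above.

```shell
#!/bin/sh
# Hypothetical helper: report which of the given commands are not on the PATH.
# Prints one missing command per line and returns non-zero if any is missing.
check_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}

# pdftotext ships with poppler-utils; the other names match their packages.
if missing=$(check_tools php pdftotext catdoc unoconv); then
    echo "all tools found"
else
    echo "missing tools:" $missing
fi
```

Any tool reported as missing points back to the corresponding yum install step.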
After making sure that Xapian is installed successfully, you can set up a cron job to periodically index new files, say every 10 minutes:
*/10 * * * * /usr/bin/php /var/www/efront/libraries/external/xapian_cron.php
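If your PHP binary or eFront root lives elsewhere, the crontab entry can be generated rather than typed by hand. The sketch below uses a hypothetical helper, xapian_cron_line, and the paths assumed in the initialization step (/usr/bin/php and /var/www/efront/). Only the libraries/external/xapian_cron.php part comes from eFront itself.

```shell
#!/bin/sh
# Hypothetical helper: print a crontab entry that runs the eFront Xapian
# indexer every N minutes.
# Arguments: interval in minutes, path to the php binary, eFront root directory.
xapian_cron_line() {
    minutes="$1"
    php_bin="$2"
    efront_root="${3%/}"   # drop one trailing slash, if present
    printf '*/%s * * * * %s %s/libraries/external/xapian_cron.php\n' \
        "$minutes" "$php_bin" "$efront_root"
}

# Print the entry for a 10-minute interval with the assumed paths.
xapian_cron_line 10 /usr/bin/php /var/www/efront/
```

The printed line can then be pasted into root's crontab with crontab -e.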
From now on, every time you upload a pdf, doc, docx, xls, xlsx or txt file, it will be indexed and a full-text search will be performed on its contents when you search for a string in eFront.
Additional resources and troubleshooting: