March 6, 2009.
Last week I spent some time setting up Solango on my VPS to power full text search on this blog (so where is the full text search? that's a work in progress). I decided to go the Tomcat and Solr route, which ended up being a bit more painful than entirely necessary, largely because I was not knowledgeable about the Java server ecosystem.
If you're not concerned with optimal performance, I would not use these instructions, but would instead use the Jetty server that is packaged with Solr. The Solango tutorial explains how to do this, but you'll have to read to the very end to actually realize that. That said, if you want production performance and believe the hype that Tomcat outperforms Jetty, read on.
This tutorial relies heavily on this tutorial on installing Solr on Ubuntu, as well as the Solango installation docs and Solango tutorial.
(This will not translate seamlessly to OS X.)
Install Java 6 if you don't have it already.
Following these instructions for Ubuntu, the process goes like this:
sudo emacs /etc/apt/sources.list
And add these settings:
deb http://za.archive.ubuntu.com/ubuntu/ intrepid main restricted
deb http://za.archive.ubuntu.com/ubuntu/ intrepid multiverse
Then update and do these installs.
sudo apt-get update
sudo apt-get install sun-java6-bin sun-java6-jre sun-java6-jdk
Next we need to install Tomcat and Solr. There are a couple of Ubuntu packages that purport to handle this for us,
sudo apt-get install solr-common solr-tomcat5.5
but I ran into some peculiar issues with the resulting installation and ended up taking a different approach (this seems to be a common problem, rather than a personal failure).
sudo apt-get install tomcat6
wget http://apache.osuosl.org/lucene/solr/1.3.0/apache-solr-1.3.0.tgz
tar -xzvf apache-solr-1.3.0.tgz
cd apache-solr-1.3.0/dist/
sudo cp apache-solr-1.3.0.war /var/lib/tomcat6/webapps/solr.war
cd ../example/
sudo cp -r solr /var/lib/tomcat6/solr
And let's do some permissioning.
cd /var/lib/tomcat6
sudo chgrp -R tomcat6 solr
sudo chmod -R 775 solr/
Finally we need to add a config file.
cd /var/lib/tomcat6/conf/Catalina/localhost
sudo emacs solr.xml
And add these settings.
<Context docBase="/var/lib/tomcat6/webapps/solr.war" debug="0" crossContext="true" >
<Environment name="solr/home" type="java.lang.String" value="/var/lib/tomcat6/solr" override="true" />
</Context>
And adjust its permissions.
sudo chgrp tomcat6 solr.xml
sudo chmod 775 solr.xml
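Since a typo in solr.xml tends to surface only as a vague Tomcat error, it can be worth checking the file before restarting anything. Here is a minimal sketch, written for a current Python 3 interpreter rather than the Python 2.5 used elsewhere in this post; the function name read_solr_context is my own, and the XML is inlined so the example runs without touching the server.

```python
import xml.etree.ElementTree as ET

# The same context fragment as above, inlined so the sketch is self-contained.
CONTEXT_XML = """
<Context docBase="/var/lib/tomcat6/webapps/solr.war" debug="0" crossContext="true">
  <Environment name="solr/home" type="java.lang.String"
               value="/var/lib/tomcat6/solr" override="true" />
</Context>
"""


def read_solr_context(xml_text):
    """Return (docBase, solr_home) from a Tomcat context fragment.

    Raises xml.etree.ElementTree.ParseError if the XML is malformed.
    """
    root = ET.fromstring(xml_text.strip())
    solr_home = None
    for env in root.findall("Environment"):
        if env.get("name") == "solr/home":
            solr_home = env.get("value")
    return root.get("docBase"), solr_home


doc_base, solr_home = read_solr_context(CONTEXT_XML)
print(doc_base)   # path to the war file Tomcat should deploy
print(solr_home)  # directory Solr should treat as its home
```

On the server itself you could pass the contents of the real solr.xml to the same function and compare the two paths against what is actually on disk.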
And some magic (from here).
sudo /etc/init.d/tomcat6 stop
sudo cp -R /var/lib/tomcat6/conf /usr/share/tomcat6/
sudo cp -R /var/lib/tomcat6/temp /usr/share/tomcat6/
cd /usr/share/tomcat6
sudo ln -s /var/cache/tomcat6 work
sudo mkdir /usr/share/tomcat6/logs
sudo bin/shutdown.sh
sudo bin/startup.sh
Don't even ask me why this is necessary. Upon completing this step (and getting Solr to stop erroring) I have lost faith in far greater things than these instructions.
(editor's note: I might have been having an off day when I wrote this... but these instructions do work).
As a sanity check, let's stop and start Tomcat.
sudo /etc/init.d/tomcat6 stop
sudo /etc/init.d/tomcat6 start
Now go to http://255.255.255.255:8080/solr/admin (replacing 255.255.255.255 with your server's IP) and you should see the Solr admin screen. If you do, then you have successfully installed Solr.
The last step of our configuration is that we probably don't want Solr to be accessible to just anyone, so we'll restrict it to only be accessible from localhost:8080 instead of *:8080. Open up tomcat/conf/server.xml.
sudo emacs /usr/share/tomcat6/conf/server.xml
Search for Engine name="Catalina" (it's at line 101), and add this line beneath it:
<Valve className="org.apache.catalina.valves.RemoteAddrValve" allow="127.0.0.1,0:0:0:0:0:0:0:1%0,::1" />
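As far as I understand Tomcat 6, the allow attribute is read as a comma-separated list of regular expressions, and a request is let through when its remote address fully matches any one of them. The following is only a rough Python 3 model of that matching (not Tomcat's code, and Java and Python regexes are not identical); is_allowed is a name I made up for illustration.

```python
import re

# The allow attribute from the Valve line above.
ALLOW = "127.0.0.1,0:0:0:0:0:0:0:1%0,::1"


def is_allowed(remote_addr, allow=ALLOW):
    """Rough model: split on commas, then require a full regex match."""
    patterns = [p.strip() for p in allow.split(",") if p.strip()]
    return any(re.fullmatch(p, remote_addr) for p in patterns)


print(is_allowed("127.0.0.1"))    # True
print(is_allowed("::1"))          # True
print(is_allowed("203.0.113.7"))  # False
# The dots are unescaped, so each one matches any single character.
print(is_allowed("127x0y0z1"))    # True
```

The last case is harmless in practice, because the remote address is always a real IP, but escaping the dots (127\.0\.0\.1) would be the stricter way to write the pattern.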
For good measure, let's copy these changes back to the Tomcat directories in /var/lib/tomcat6/.
sudo cp -r /usr/share/tomcat6/conf /var/lib/tomcat6/
Then restart Tomcat.
sudo /usr/share/tomcat6/bin/shutdown.sh
sudo /usr/share/tomcat6/bin/startup.sh
And verify that Tomcat isn't externally accessible:
curl http://lethain.com:8080/solr/admin/
And also that it is internally accessible:
curl http://127.0.0.1:8080/solr/
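If you'd rather script those two curl checks, here is a minimal Python 3 sketch of the same idea. The names solr_status and describe are mine, and I'm assuming a request denied by the valve comes back as an HTTP error status (403, as far as I know) rather than a dropped connection. The opener parameter exists only so the function can be exercised without a live server.

```python
import urllib.error
import urllib.request


def solr_status(url, timeout=5, opener=urllib.request.urlopen):
    """Return the HTTP status code for url, or None if nothing answers."""
    try:
        response = opener(url, timeout=timeout)
        try:
            return response.getcode()
        finally:
            response.close()
    except urllib.error.HTTPError as err:
        # Tomcat answered, but with an error status.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timed out, or the name did not resolve.
        return None


def describe(status):
    """Turn a status code (or None) into a short human-readable verdict."""
    if status is None:
        return "unreachable"
    if status == 200:
        return "accessible"
    return "blocked (HTTP %d)" % status


# No network needed to see the three possible verdicts.
for status in (200, 403, None):
    print(describe(status))
```

Run on the server, describe(solr_status("http://127.0.0.1:8080/solr/")) should report "accessible", while the same call against the public hostname from another machine should report anything but.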
And now we're done with that.
Retrieve Solango source from SVN.
cd ~/libs/
svn checkout http://django-solr-search.googlecode.com/svn/trunk/ django-solr-search
Add Solango to your Python path.
cd django-solr-search
ln -s `pwd`/solango /usr/lib/python2.5/site-packages/solango
Copy the solango/initial_settings.py file into your project folder and rename it as solr_settings.py.
cp solango/initial_settings.py ~/git/blog/solr_settings.py
Add Solango to your project's INSTALLED_APPS setting in its settings.py file.
INSTALLED_APPS = (
    'lifeflow',
    'compress',
    'portfolio',
    'solango', # <-- like this
    'django_monetize',
    '...',
)
At the end of your settings.py file, but before importing local_settings.py (if you do so), import everything from solr_settings.
try:
    from solr_settings import *
except ImportError:
    pass
try:
    from local_settings import *
except ImportError:
    pass
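The order of those two imports matters because each star-import overwrites any names that were already defined, so whatever is imported last wins. Here is a small Python 3 sketch of that layering using plain dicts in place of modules; layer_settings and all of the values are made up for illustration.

```python
def layer_settings(*layers):
    """Merge settings left to right; later layers win, like successive star-imports."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Hypothetical stand-ins for the three files; the values are illustrative only.
settings_py = {"DEBUG": False}
solr_settings = {
    "SEARCH_UPDATE_URL": "http://example.invalid/solr/update",
    "LOG_FILENAME": "solango.log",
}
local_settings = {"SEARCH_UPDATE_URL": "http://localhost:8080/solr/update"}

final = layer_settings(settings_py, solr_settings, local_settings)
print(final["SEARCH_UPDATE_URL"])  # http://localhost:8080/solr/update
print(final["LOG_FILENAME"])       # solango.log
```

Importing local_settings last is what lets a per-machine file override the Solango defaults without editing solr_settings.py itself.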
Next create a search.py file for your Django apps that you want to be searchable. The LifeFlow search.py looks like this:
import solango
from lifeflow.models import Comment, Entry

class EntryDocument(solango.SearchDocument):
    date = solango.fields.DateField()
    summary = solango.fields.TextField(copy=True)
    title = solango.fields.CharField(copy=True)
    tags = solango.fields.CharField(copy=True)
    content = solango.fields.TextField(copy=True)

    def transform_summary(self, instance):
        return instance.summary

    def transform_tags(self, instance):
        tags = list(instance.tags.all())
        texts = [ tag.title for tag in tags ]
        return ",".join(texts)

    def transform_date(self, instance):
        return instance.pub_date

    def transform_content(self, instance):
        return instance.body

solango.register(Entry, EntryDocument)
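To see what a transform method actually hands to Solr, here is the tag transform pulled out and run against stand-in objects, with no Django or Solango required. This is a Python 3 sketch; Tag, FakeTagManager, and FakeEntry are hypothetical stand-ins for the LifeFlow models, mimicking only the attributes the transform touches.

```python
class Tag:
    """Stand-in for a tag model; only the title attribute is used."""

    def __init__(self, title):
        self.title = title


class FakeTagManager:
    """Mimics the .all() call on a Django many-to-many manager."""

    def __init__(self, tags):
        self._tags = tags

    def all(self):
        return list(self._tags)


class FakeEntry:
    """Stand-in for an Entry with a tags relation."""

    def __init__(self, tag_titles):
        self.tags = FakeTagManager([Tag(title) for title in tag_titles])


def transform_tags(instance):
    # Same logic as EntryDocument.transform_tags: flatten tags to one string.
    tags = list(instance.tags.all())
    return ",".join(tag.title for tag in tags)


print(transform_tags(FakeEntry(["django", "solr", "search"])))  # django,solr,search
print(repr(transform_tags(FakeEntry([]))))                      # ''
```

Each transform_ method plays the same role for its field: it takes the model instance and returns the plain value that ends up in the Solr document.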
The irony is that this is virtually identical to the search.py that the Solango tutorial creates for another Django blogging app, Coltrane. I don't even have words. (ed: confirmed, it was a bad day.)
Create a Solr schema file in your local directory.
python manage.py solr --schema --path=./
Now move it into the Solr configuration directory.
sudo mv schema.xml /var/lib/tomcat6/solr/conf/
Restart Tomcat.
sudo /usr/share/tomcat6/bin/shutdown.sh
sudo /usr/share/tomcat6/bin/startup.sh
In a local_settings.py file override any of the Solango settings that don't fit your needs.
# DIRNAME is for logs
DIRNAME = '/home/django/domains/lethain.com/log/'
LOG_FILENAME = DIRNAME + "solango.log"
# for SOLR stuff
SEARCH_UPDATE_URL = "http://localhost:8080/solr/update"
SEARCH_SELECT_URL = "http://localhost:8080/solr/select"
SEARCH_PING_URLS = ["http://localhost:8080/solr/admin/ping",]
SOLR_ROOT = '/var/lib/tomcat6/solr'
SOLR_SCHEMA_PATH = SOLR_ROOT + '/conf/schema.xml'
SOLR_DATA_DIR = SOLR_ROOT + '/data'
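All three Solr URLs above share the same prefix, so one option is to derive them from a single base, which means moving Solr to another host or port is a one-line change. A minimal sketch of that variation; SOLR_BASE is a name I'm introducing here, not a Solango setting.

```python
# SOLR_BASE is a hypothetical helper name, not something Solango reads.
SOLR_BASE = "http://localhost:8080/solr"

# The settings Solango does read, derived from the single base above.
SEARCH_UPDATE_URL = SOLR_BASE + "/update"
SEARCH_SELECT_URL = SOLR_BASE + "/select"
SEARCH_PING_URLS = [SOLR_BASE + "/admin/ping"]

# Filesystem paths, derived from SOLR_ROOT exactly as in the settings above.
SOLR_ROOT = "/var/lib/tomcat6/solr"
SOLR_SCHEMA_PATH = SOLR_ROOT + "/conf/schema.xml"
SOLR_DATA_DIR = SOLR_ROOT + "/data"

print(SEARCH_UPDATE_URL)
print(SEARCH_PING_URLS[0])
```

The resulting values are identical to the ones written out by hand above.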
Be careful to ensure that your server can write to the LOG_FILENAME file. If you followed the configuration from the Ubuntu & Django Almanac, you'll need to do this:
chmod 775 ~/domains/lethain.com/log/
cd ~/domains/lethain.com/log/
touch solango.log
chown django solango.log
chgrp www-data solango.log
chmod 775 solango.log
(Only the first two sevens are important, the last digit can be anything depending on your preference.)
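To spell out why only the first two digits matter: the first digit covers the owner (django), the second covers the group (www-data), and the third covers everyone else, who has no reason to write to the log. Here is a short Python 3 sketch that checks the write bits for a few modes; the function name is my own.

```python
import stat


def owner_and_group_can_write(mode):
    """True when both the owner and the group have the write bit set."""
    return bool(mode & stat.S_IWUSR) and bool(mode & stat.S_IWGRP)


for mode in (0o775, 0o770, 0o755):
    print(oct(mode), owner_and_group_can_write(mode))
# 0o775 True
# 0o770 True
# 0o755 False
```

So 770 would work just as well as 775 here, while 755 would not, because the group loses its write bit.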
Now we need to index content for search.
cd ~django/domains/lethain.com/blog/
sudo python manage.py solr --reindex
If you get a ValueError with the message "Invalid or missing XML", then you need to stop and start Tomcat.
sudo /usr/share/tomcat6/bin/shutdown.sh
sudo /usr/share/tomcat6/bin/startup.sh
Now it should be possible to start searching. Fire up your project's shell.
~django/domains/lethain.com/lethain_env/bin/python manage.py shell
And then perform some searches.
>>> from solango import connection
>>> results = connection.select(q='django')
>>> results.success
True
>>> results.documents
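If a search comes back empty and you want to rule out Solango itself, you can query Solr's select handler directly. The sketch below (Python 3, unlike the shell session above) only builds the URL; build_select_url is my own helper name, while q and rows are standard Solr query parameters.

```python
from urllib.parse import urlencode

# Same value as SEARCH_SELECT_URL in the settings above.
SEARCH_SELECT_URL = "http://localhost:8080/solr/select"


def build_select_url(query, rows=10, base=SEARCH_SELECT_URL):
    """Build a Solr select URL for a plain keyword query."""
    params = urlencode([("q", query), ("rows", rows)])
    return base + "?" + params


print(build_select_url("django"))
# http://localhost:8080/solr/select?q=django&rows=10
print(build_select_url("full text search", rows=5))
# http://localhost:8080/solr/select?q=full+text+search&rows=5
```

Fetching one of these URLs with curl on the server returns Solr's raw response, which makes it easier to tell whether a problem lies in the index or in the Django side.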
From here you can reference the Solango tutorial if you're interested in using its pre-packaged search view, or start writing your own wrapper as desired.
Sorry that this tutorial is a bit rough. Some of the layout decisions made in the Tomcat 6 packaging were a bit bewildering to me, perhaps because of my limited Java server background. I've always wondered what it would be like to write a Java web application...