Solango and Tomcat 6 on Ubuntu Intrepid
Last week I spent some time setting up Solango on my VPS to power full text search on this blog (so where is the full text search? that's a work in progress). I decided to go the Tomcat and Solr route, which ended up being a bit more painful than entirely necessary, largely because I was not knowledgeable about the Java server ecosystem.
If you're not concerned with optimal performance, I would skip these instructions and instead use the Jetty server that is packaged with Solr. The Solango tutorial explains how to do this, but you'll have to read to the very end to actually realize that. That said, if you want production performance and believe the hype that Tomcat outperforms Jetty, read on.
This tutorial relies heavily on this tutorial on installing Solr on Ubuntu, as well as the Solango installation docs and Solango tutorial.
(This will not translate seamlessly to OS X.)
Set up Solr on Ubuntu Intrepid
Install Java 6 if you don't have it already.
Following these instructions for Ubuntu, the process goes like this:
sudo emacs /etc/apt/sources.list
And add these settings:
deb http://za.archive.ubuntu.com/ubuntu/ intrepid main restricted
deb http://za.archive.ubuntu.com/ubuntu/ intrepid multiverse
Then update and do these installs.
sudo apt-get update
sudo apt-get install sun-java6-bin sun-java6-jre sun-java6-jdk
Next we need to install Tomcat and Solr. There are a couple of Ubuntu packages that purport to handle this for us,
sudo apt-get install solr-common solr-tomcat5.5
but I ran into some peculiar issues with the resulting installation and ended up with a different approach (this seems to be a common problem, rather than personal failure).
sudo apt-get install tomcat6
wget http://apache.osuosl.org/lucene/solr/1.3.0/apache-solr-1.3.0.tgz
tar -xzvf apache-solr-1.3.0.tgz
cd apache-solr-1.3.0/dist/
sudo cp apache-solr-1.3.0.war /var/lib/tomcat6/webapps/solr.war
cd ../example/
sudo cp -r solr /var/lib/tomcat6/solr
And let's do some permissioning.
cd /var/lib/tomcat6
sudo chgrp -R tomcat6 solr
sudo chmod -R 775 solr/
Finally we need to add a config file.
cd /var/lib/tomcat6/conf/Catalina/localhost
sudo emacs solr.xml
And add these settings.
<Context docBase="/var/lib/tomcat6/webapps/solr.war" debug="0" crossContext="true" >
  <Environment name="solr/home" type="java.lang.String" value="/var/lib/tomcat6/solr" override="true" />
</Context>
And adjust its permissions.
sudo chgrp tomcat6 solr.xml
sudo chmod 775 solr.xml
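If Solr later complains that it can't find its home directory, it can help to confirm that the context file parses and points where you expect. Here is a minimal sketch in Python 3 (not part of the original setup; `solr_home_from_context` is a helper name I made up for illustration, and the XML is inlined rather than read from disk):

```python
import xml.etree.ElementTree as ET

# The same context fragment as above, inlined so the sketch is self-contained.
CONTEXT_XML = """
<Context docBase="/var/lib/tomcat6/webapps/solr.war" debug="0" crossContext="true">
  <Environment name="solr/home" type="java.lang.String"
               value="/var/lib/tomcat6/solr" override="true" />
</Context>
"""


def solr_home_from_context(xml_text):
    """Return the value of the solr/home Environment entry, or None if absent."""
    root = ET.fromstring(xml_text.strip())
    for env in root.iter("Environment"):
        if env.get("name") == "solr/home":
            return env.get("value")
    return None


if __name__ == "__main__":
    print(solr_home_from_context(CONTEXT_XML))
```

To check the real file, you could read /var/lib/tomcat6/conf/Catalina/localhost/solr.xml and pass its contents to the same function.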
And some magic (from here).
sudo /etc/init.d/tomcat6 stop
sudo cp -R /var/lib/tomcat6/conf /usr/share/tomcat6/
sudo cp -R /var/lib/tomcat6/temp /usr/share/tomcat6/
cd /usr/share/tomcat6
sudo ln -s /var/cache/tomcat6 work
sudo mkdir /usr/share/tomcat6/logs
sudo bin/shutdown.sh
sudo bin/startup.sh
Don't even ask me why this is necessary. Upon completing this step (and getting Solr to stop erroring) I have lost faith in far greater things than these instructions.
(editor's note: I might have been having an off day when I wrote this... but these instructions do work).
As a sanity check, let's stop and start Tomcat.
sudo /etc/init.d/tomcat6 stop
sudo /etc/init.d/tomcat6 start
Now go to
http://255.255.255.255:8080/solr/admin
(replacing 255.255.255.255 with your server's IP) and you should see the Solr admin screen. If you do, then you have successfully installed Solr.
The last step of our configuration is that we probably don't want Solr to be accessible to just anyone, so we'll lock it down to only be reachable via localhost:8080 instead of *:8080.
Open up tomcat/conf/server.xml.
sudo emacs /usr/share/tomcat6/conf/server.xml
Search for Engine name="Catalina" (it's at line 101), and add this line beneath it:
<Valve className="org.apache.catalina.valves.RemoteAddrValve" allow="127.0.0.1,0:0:0:0:0:0:0:1%0,::1" />
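To see why that allow value lets local requests through and little else, here is a rough Python 3 model of the check. It assumes, as I understand the Tomcat 6 valve documentation, that allow is a comma-separated list of regular expressions and that a client is admitted when one of them matches its whole address. `is_allowed` is a hypothetical helper for illustration, not anything Tomcat ships.

```python
import re

# The allow attribute from the Valve line above.
ALLOW = "127.0.0.1,0:0:0:0:0:0:0:1%0,::1"


def is_allowed(remote_addr, allow=ALLOW):
    """Return True if remote_addr fully matches any comma-separated pattern.

    Assumption: patterns are regular expressions matched against the
    entire address, which is how I read the Tomcat 6 documentation.
    """
    patterns = [p.strip() for p in allow.split(",") if p.strip()]
    return any(re.fullmatch(p, remote_addr) for p in patterns)


if __name__ == "__main__":
    for addr in ("127.0.0.1", "::1", "203.0.113.5"):
        print(addr, is_allowed(addr))
```

Under that assumption, the unescaped dots in 127.0.0.1 match any character, so the pattern is slightly looser than it looks; escaping them would be stricter.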
For good measure, let's copy these changes back to the Tomcat directories in /var/lib/tomcat6/.
sudo cp -r /usr/share/tomcat6/conf /var/lib/tomcat6/
Then restart Tomcat.
sudo /usr/share/tomcat6/bin/shutdown.sh
sudo /usr/share/tomcat6/bin/startup.sh
And verify that Tomcat isn't externally accessible:
curl http://lethain.com:8080/solr/admin/
And also that it is internally accessible:
curl http://127.0.0.1:8080/solr/
And now we're done with that.
Set up Solango
Retrieve Solango source from SVN.
cd ~/libs/
svn checkout http://django-solr-search.googlecode.com/svn/trunk/ django-solr-search
Add Solango to your Python path.
cd django-solr-search
ln -s `pwd`/solango /usr/lib/python2.5/site-packages/solango
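Before going further, it's worth confirming that the symlink actually made the package importable. On the Python 2.5 used here, running python -c "import solango" is enough. If you're on a modern Python 3, a small sketch like this does the same check without importing the module (`is_importable` is a name I made up for illustration):

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be found on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Prints True once solango is reachable from site-packages.
    print(is_importable("solango"))
```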
Copy the solango/initial_settings.py file into your project folder and rename it as solr_settings.py.
cp solango/initial_settings.py ~/git/blog/solr_settings.py
Add Solango to your project's INSTALLED_APPS setting in its settings.py file.
INSTALLED_APPS = (
    'lifeflow',
    'compress',
    'portfolio',
    'solango', # <-- like this
    'django_monetize',
    '...',
)
At the end of your settings.py file, but before importing local_settings.py (if you do so), import everything from solr_settings.
try:
    from solr_settings import *
except ImportError:
    pass

try:
    from local_settings import *
except ImportError:
    pass
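The import order matters because each star-import overwrites any names defined before it, so whatever is imported last wins. Here is a toy model of that layering in Python 3, using plain dicts in place of modules. This is only an illustration of the override order, not how Django loads settings, and the solr_settings value below is a placeholder I invented.

```python
def layer_settings(*layers):
    """Merge setting layers in order; later layers override earlier ones."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Placeholder default, standing in for whatever solr_settings.py defines.
solr_settings = {"SEARCH_SELECT_URL": "http://example.invalid/solr/select"}
# Override of the same name, as local_settings.py would provide.
local_settings = {"SEARCH_SELECT_URL": "http://localhost:8080/solr/select"}

if __name__ == "__main__":
    print(layer_settings(solr_settings, local_settings)["SEARCH_SELECT_URL"])
```

Swapping the two imports would let the Solango defaults clobber your local overrides, which is why solr_settings has to come first.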
Next create a search.py file for your Django apps that you want to be searchable. The LifeFlow search.py looks like this:
import solango
from lifeflow.models import Comment, Entry

class EntryDocument(solango.SearchDocument):
    date = solango.fields.DateField()
    summary = solango.fields.TextField(copy=True)
    title = solango.fields.CharField(copy=True)
    tags = solango.fields.CharField(copy=True)
    content = solango.fields.TextField(copy=True)

    def transform_summary(self, instance):
        return instance.summary

    def transform_tags(self, instance):
        tags = list(instance.tags.all())
        texts = [ tag.title for tag in tags ]
        return ",".join(texts)

    def transform_date(self, instance):
        return instance.pub_date

    def transform_content(self, instance):
        return instance.body

solango.register(Entry, EntryDocument)
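The transform methods just flatten a model instance into values for indexing. To make the tags transform concrete without needing Django or Solango installed, here is a standalone Python 3 sketch using stand-in objects. `Tag`, `FakeTagManager`, and `FakeEntry` are hypothetical substitutes for the LifeFlow models, not real classes from either project.

```python
from collections import namedtuple

# Stand-in for a LifeFlow tag; only the title attribute is used.
Tag = namedtuple("Tag", ["title"])


class FakeTagManager:
    """Mimics the .all() call on a Django many-to-many manager."""

    def __init__(self, tags):
        self._tags = list(tags)

    def all(self):
        return list(self._tags)


class FakeEntry:
    """Stand-in for an Entry with a tags relation."""

    def __init__(self, tags):
        self.tags = FakeTagManager(tags)


def transform_tags(instance):
    """Join tag titles into one comma-separated string, as in EntryDocument."""
    tags = list(instance.tags.all())
    return ",".join(tag.title for tag in tags)


if __name__ == "__main__":
    print(transform_tags(FakeEntry([Tag("django"), Tag("solr")])))
```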
The irony is that this is virtually identical to the search.py that the Solango tutorial creates for another Django blogging app, Coltrane.
I don't even have words. (ed: confirmed, it was a bad day.)
Create a Solr schema file in your local directory.
python manage.py solr --schema --path=./
Now move it into the Solr configuration directory.
sudo mv schema.xml /var/lib/tomcat6/solr/conf/
Restart Tomcat.
sudo /usr/share/tomcat6/bin/shutdown.sh
sudo /usr/share/tomcat6/bin/startup.sh
In a local_settings.py file override any of the Solango settings that don't fit your needs.
# DIRNAME is for logs
DIRNAME = '/home/django/domains/lethain.com/log/'
LOG_FILENAME = DIRNAME + "solango.log"

# for SOLR stuff
SEARCH_UPDATE_URL = "http://localhost:8080/solr/update"
SEARCH_SELECT_URL = "http://localhost:8080/solr/select"
SEARCH_PING_URLS = ["http://localhost:8080/solr/admin/ping",]

SOLR_ROOT = '/var/lib/tomcat6/solr'
SOLR_SCHEMA_PATH = SOLR_ROOT + '/conf/schema.xml'
SOLR_DATA_DIR = SOLR_ROOT + '/data'
Be careful to ensure that your server can write to the LOG_FILENAME file. If you followed the configuration from the Ubuntu & Django Almanac, you'll need to do this:
chmod 775 ~/domains/lethain.com/log/
touch solango.log
chown django solango.log
chgrp www-data solango.log
chmod 775 solango.log
(Only the first two sevens are important, the last digit can be anything depending on your preference.)
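If the octal digits are opaque, this small Python 3 sketch spells out what each one grants. The first digit is the owner (django), the second is the group (www-data), and the third is everyone else. `describe_mode` is a helper I made up for illustration.

```python
import stat


def describe_mode(mode):
    """Split an octal permission mode into owner/group/other rwx strings."""
    # S_IFREG marks a regular file so filemode renders a leading '-'.
    text = stat.filemode(stat.S_IFREG | mode)
    return {"owner": text[1:4], "group": text[4:7], "other": text[7:10]}


if __name__ == "__main__":
    print(describe_mode(0o775))
    print(describe_mode(0o770))
```

Both modes give the owner and group full access, which is what lets either the django user or the web server's group write to the log; only the permissions for everyone else differ.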
Now we need to index content for search.
cd ~django/domains/lethain.com/blog/
sudo python manage.py solr --reindex
If you get a raise ValueError, "Invalid or missing XML", then you need to stop and start Tomcat.
sudo /usr/share/tomcat6/bin/shutdown.sh
sudo /usr/share/tomcat6/bin/startup.sh
Now it should be possible to start searching. Fire up your project's shell.
~django/domains/lethain.com/lethain_env/bin/python manage.py shell
And then perform some searches.
>>> from solango import connection
>>> results = connection.select(q='django')
>>> results.success
True
>>> results.documents
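If a search comes back empty and you want to rule out Solango, you can query Solr directly. Here is a Python 3 sketch that builds a raw query URL against the select handler using Solr's standard q and rows parameters. I'm not claiming this is the exact request Solango sends; `build_select_url` is a hypothetical helper, and the base URL is the SEARCH_SELECT_URL from the settings above.

```python
from urllib.parse import urlencode

# Same value as SEARCH_SELECT_URL in local_settings.py.
SEARCH_SELECT_URL = "http://localhost:8080/solr/select"


def build_select_url(query, rows=10, base=SEARCH_SELECT_URL):
    """Build a Solr select URL for a query string and a result limit."""
    params = [("q", query), ("rows", rows)]
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_select_url("django"))
```

Passing the printed URL to curl on the server should return Solr's XML response, which tells you whether the index itself has any matching documents.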
From here you can reference the Solango tutorial if you're interested in using its pre-packaged search view, or start writing your own wrapper as desired.
Sorry that this tutorial is a bit rough. Some of the layout decisions made in the Tomcat 6 packaging were a bit bewildering to me, perhaps because of my limited Java server background. I've always wondered what it would be like to write a Java web application...