Irrational Exuberance!

Solango and Tomcat 6 on Ubuntu Intrepid

March 6, 2009. Filed under django, ubuntu, solango

Last week I spent some time setting up Solango on my VPS to power full text search on this blog (so where is the full text search? that's a work in progress). I decided to go the Tomcat and Solr route, which ended up being a bit more painful than entirely necessary, largely because I was not knowledgeable about the Java server ecosystem.

If you're not concerned with optimal performance, I would skip these instructions and instead use the Jetty server that is packaged with Solr. The Solango tutorial explains how to do this, but you'll have to read to the very end to actually realize that. That said, if you want production performance and believe the hype that Tomcat outperforms Jetty, read on.

This tutorial relies heavily on this tutorial on installing Solr on Ubuntu, as well as the Solango installation docs and Solango tutorial.

(This will not translate seamlessly to OS X.)

Set up Solr on Ubuntu Intrepid

  1. Install Java 6 if you don't have it already.

    Following these instructions for Ubuntu, the process goes like this:

    sudo emacs /etc/apt/sources.list
    

    And add these settings:

    deb http://za.archive.ubuntu.com/ubuntu/ intrepid main restricted
    deb http://za.archive.ubuntu.com/ubuntu/ intrepid multiverse
    

    Then update and do these installs.

    sudo apt-get update
    sudo apt-get install sun-java6-bin sun-java6-jre sun-java6-jdk
    
  2. Next we need to install Tomcat and Solr. There are a couple of Ubuntu packages that purport to handle this for us,

    sudo apt-get install solr-common solr-tomcat5.5
    

    but I ran into some peculiar issues with the resulting installation and ended up taking a different approach (this seems to be a common problem rather than a personal failure).

    sudo apt-get install tomcat6
    wget http://apache.osuosl.org/lucene/solr/1.3.0/apache-solr-1.3.0.tgz
    tar -xzvf apache-solr-1.3.0.tgz 
    cd apache-solr-1.3.0/dist/
    sudo cp apache-solr-1.3.0.war /var/lib/tomcat6/webapps/solr.war
    cd ../example/
    sudo cp -r solr /var/lib/tomcat6/solr
    

    And let's do some permissioning.

    cd /var/lib/tomcat6
    sudo chgrp -R tomcat6 solr
    sudo chmod -R 775 solr/
    

    Finally we need to add a config file.

    cd /var/lib/tomcat6/conf/Catalina/localhost
    sudo emacs solr.xml
    

    And add these settings.

    <Context docBase="/var/lib/tomcat6/webapps/solr.war" debug="0" crossContext="true" >
       <Environment name="solr/home" type="java.lang.String" value="/var/lib/tomcat6/solr" override="true" />
    </Context>
    

    And adjust its permissions.

    sudo chgrp tomcat6 solr.xml
    sudo chmod 775 solr.xml 
    

    And some magic (from here).

    sudo /etc/init.d/tomcat6 stop 
    sudo cp -R /var/lib/tomcat6/conf /usr/share/tomcat6/ 
    sudo cp -R /var/lib/tomcat6/temp /usr/share/tomcat6/ 
    cd /usr/share/tomcat6 
    sudo ln -s /var/cache/tomcat6 work 
    sudo mkdir /usr/share/tomcat6/logs
    sudo bin/shutdown.sh 
    sudo bin/startup.sh 
    

    Don't even ask me why this is necessary. Upon completing this step (and getting Solr to stop erroring) I have lost faith in far greater things than these instructions.

    (editor's note: I might have been having an off day when I wrote this... but these instructions do work).

  3. As a sanity check, let's stop and start Tomcat.

    sudo /etc/init.d/tomcat6 stop
    sudo /etc/init.d/tomcat6 start
    

    Now go to http://255.255.255.255:8080/solr/admin (replacing 255.255.255.255 with your server's IP) and you should see the solr admin screen. If you do, then you have successfully installed Solr.

  4. For the last step of our configuration, we probably don't want Solr to be accessible to just anyone, so we'll restrict it to requests coming from localhost instead of accepting requests from anywhere on port 8080.

    Open up tomcat/conf/server.xml.

    sudo emacs /usr/share/tomcat6/conf/server.xml
    

    Search for Engine name="Catalina" (it's at line 101), and add this line beneath it:

    <Valve className="org.apache.catalina.valves.RemoteAddrValve" allow="127.0.0.1,0:0:0:0:0:0:0:1%0,::1" />
    

    For good measure, let's copy these changes back to the Tomcat directories in /var/lib/tomcat6/.

    sudo cp -r /usr/share/tomcat6/conf /var/lib/tomcat6/
    

    Then restart Tomcat.

    sudo /usr/share/tomcat6/bin/shutdown.sh
    sudo /usr/share/tomcat6/bin/startup.sh
    

    And verify that Tomcat isn't externally accessible:

    curl http://lethain.com:8080/solr/admin/
    

    And also that it is internally accessible:

    curl http://127.0.0.1:8080/solr/
    

    And now we're done with that.
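
The two curl checks above can also be scripted. What follows is a minimal sketch rather than part of the original setup: it assumes a Python 3 interpreter is available, the helper names `fetch_status` and `describe` are made up for illustration, and it assumes that Tomcat's RemoteAddrValve answers denied clients with a 403 status.

```python
# Sketch only: fetch_status and describe are hypothetical helper names.
# Assumes Python 3 and that denied clients receive HTTP 403.
import urllib.error
import urllib.request


def fetch_status(url, timeout=5):
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 403.
        return error.code
    except OSError:
        # Covers URLError, refused connections, and timeouts.
        return None


def describe(status):
    """Translate a status code into a plain-language verdict."""
    if status is None:
        return "unreachable"
    if status == 403:
        return "blocked"
    if 200 <= status < 400:
        return "accessible"
    return "error %d" % status


if __name__ == "__main__":
    # Run this on the server itself; swap in your public hostname and run it
    # from another machine to check the external side.
    print(describe(fetch_status("http://127.0.0.1:8080/solr/admin/")))
```

Run on the server, this should report "accessible" when Solr is up. Pointed at the public hostname from another machine, "blocked" or "unreachable" would both suggest that outside clients are being kept out.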

Set up Solango

  1. Retrieve Solango source from SVN.

    cd ~/libs/
    svn checkout http://django-solr-search.googlecode.com/svn/trunk/ django-solr-search
    
  2. Add Solango to your Python path.

    cd django-solr-search
    sudo ln -s `pwd`/solango /usr/lib/python2.5/site-packages/solango
    
  3. Copy the solango/initial_settings.py file into your project folder and rename it to solr_settings.py.

    cp solango/initial_settings.py ~/git/blog/solr_settings.py
    
  4. Add Solango to your project's INSTALLED_APPS setting in its settings.py file.

    INSTALLED_APPS = (
        'lifeflow',
        'compress',
        'portfolio',
        'solango', # <-- like this
        'django_monetize',
        '...',
    )
    
  5. At the end of your settings.py file, but before importing local_settings.py (if you do so), import everything from solr_settings.

    try:
        from solr_settings import *
    except ImportError:
        pass
    try:
        from local_settings import *
    except ImportError:
        pass
    
  6. Next create a search.py file for your Django apps that you want to be searchable. The LifeFlow search.py looks like this:

    import solango
    from lifeflow.models import Comment, Entry
    
    class EntryDocument(solango.SearchDocument):
        date = solango.fields.DateField()
        summary = solango.fields.TextField(copy=True)
        title = solango.fields.CharField(copy=True)
        tags = solango.fields.CharField(copy=True)
        content = solango.fields.TextField(copy=True)
    
        def transform_summary(self, instance):
            return instance.summary
    
        def transform_tags(self, instance):
            tags = list(instance.tags.all())
            texts = [ tag.title for tag in tags ]
            return ",".join(texts)
    
        def transform_date(self, instance):
            return instance.pub_date
    
        def transform_content(self, instance):
            return instance.body
    
    solango.register(Entry, EntryDocument)
    

    The irony is that this is virtually identical to the search.py that the Solango tutorial creates for another Django blogging app, Coltrane.

    I don't even have words. (ed: confirmed, it was a bad day.)

  7. Create a Solr schema file in your local directory.

    python manage.py solr --schema --path=./
    

    Now move it into the Solr configuration directory.

    sudo mv schema.xml /var/lib/tomcat6/solr/conf/
    
  8. Restart Tomcat.

    sudo /usr/share/tomcat6/bin/shutdown.sh 
    sudo /usr/share/tomcat6/bin/startup.sh 
    
  9. In a local_settings.py file override any of the Solango settings that don't fit your needs.

    # DIRNAME is for logs
    DIRNAME = '/home/django/domains/lethain.com/log/'
    LOG_FILENAME = DIRNAME + "solango.log"
    
    # for SOLR stuff
    SEARCH_UPDATE_URL = "http://localhost:8080/solr/update"
    SEARCH_SELECT_URL = "http://localhost:8080/solr/select"
    SEARCH_PING_URLS = ["http://localhost:8080/solr/admin/ping",]
    
    SOLR_ROOT = '/var/lib/tomcat6/solr'
    SOLR_SCHEMA_PATH = SOLR_ROOT + '/conf/schema.xml'
    SOLR_DATA_DIR = SOLR_ROOT + '/data'
    

    Be careful to ensure that your server can write to the LOG_FILENAME file. If you followed the configuration from the Ubuntu & Django Almanac, you'll need to do this:

    chmod 775 ~/domains/lethain.com/log/
    cd ~/domains/lethain.com/log/
    touch solango.log
    chown django solango.log
    chgrp www-data solango.log
    chmod 775 solango.log
    

    (Only the first two sevens are important, the last digit can be anything depending on your preference.)

  10. Now we need to index content for search.

    cd ~django/domains/lethain.com/blog/
    sudo python manage.py solr --reindex
    

    If you get a ValueError with the message "Invalid or missing XML", then you need to stop and start Tomcat.

    sudo /usr/share/tomcat6/bin/shutdown.sh 
    sudo /usr/share/tomcat6/bin/startup.sh 
    
  11. Now it should be possible to start searching. Fire up your project's shell.

    ~django/domains/lethain.com/lethain_env/bin/python manage.py shell
    

    And then perform some searches.

    >>> from solango import connection
    >>> results = connection.select(q='django')
    >>> results.success
    True
    >>> results.documents
    
  12. From here you can reference the Solango tutorial if you're interested in using its pre-packaged search view, or start writing your own wrapper as desired.
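
If the Solango shell session misbehaves, it can help to query Solr directly over HTTP and see whether the index holds anything at all. The sketch below is an illustration, not part of Solango: the function names are hypothetical, it assumes Python 3 (so it runs outside the Python 2.5 project environment), and it assumes Solr's JSON response writer is enabled so that `wt=json` works against the SEARCH_SELECT_URL configured earlier.

```python
# Sketch only: these helpers are hypothetical and not part of Solango.
# Assumes Python 3 and that Solr accepts wt=json on the select handler.
import json
import urllib.parse
import urllib.request


def build_select_url(select_url, query, rows=10):
    """Build a Solr select URL asking for JSON output."""
    params = urllib.parse.urlencode({"q": query, "wt": "json", "rows": rows})
    return select_url + "?" + params


def parse_num_found(body):
    """Pull the total match count out of a Solr JSON response body."""
    return json.loads(body)["response"]["numFound"]


def count_matches(select_url, query, timeout=5):
    """Ask Solr how many indexed documents match the query."""
    # rows=0 requests only the count, not the documents themselves.
    url = build_select_url(select_url, query, rows=0)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_num_found(response.read().decode("utf-8"))


if __name__ == "__main__":
    print(count_matches("http://localhost:8080/solr/select", "django"))
```

A count of zero right after reindexing would suggest the problem is in the indexing step rather than in the search code.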

Sorry that this tutorial is a bit rough. Some of the layout decisions made in the Tomcat 6 packaging were a bit bewildering to me, perhaps because of my limited Java server background. I've always wondered what it would be like to write a Java web application...