jkraemer edited this page Sep 13, 2010 · 4 revisions

acts_as_ferret plugin for Ruby on Rails

About

This is the home of the acts_as_ferret plugin for Ruby on Rails.

Acts_as_ferret is a plugin for Ruby on Rails that makes it simple to add full-text search to a Rails application. It builds on Ferret, a Ruby port of Apache Lucene, and is suitable for nearly any application that requires full-text search.
For more information about Ferret, visit its website, search the mailing list at ruby-forum.com or (highly recommended) buy the book.

If you have any questions please ask them on the Ferret mailing list ([email protected]) or via ruby-forum.com: http://www.ruby-forum.com/forum/5

RDoc-generated API documentation is available for the current trunk
and the latest stable release.

Issue Tracker

Please post issues to Lighthouse

Features

  • High speed full text search across the contents of any Rails model class, without any hassles. The index will be kept up to date automagically while you work with your Rails model classes as usual.
  • Each Model class calling acts_as_ferret gets its own Ferret index on disk, but you can search multiple models at once using the multi_search method.
  • Supports Rails’ single table inheritance mechanism: just declare acts_as_ferret in the base class and you can search across all inheriting classes (see TypoWithFerret for an example).
  • Aaf is not limited to indexing the attributes of your model: You can tell it to index the result of any instance method of your model class.
  • Further customization of the indexing process can be achieved by overriding the to_doc instance method in your model class, which is supposed to return the Ferret document object to be stored in the index.
  • Use my_model_instance#more_like_this to retrieve objects whose contents are similar to those of my_model_instance. Great for suggesting related pages to your readers, or related products to your customers.
  • DRb Server for centralized index access in production environments. This is required as soon as more than one process needs to update the Ferret index, which is true for most deployments with multiple Mongrel / fastCGI instances.
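
To illustrate why a central index server helps, here is a minimal sketch using only Ruby's standard DRb library. It is not acts_as_ferret's actual server: the IndexServer class, its add and search methods, and the in-memory hash standing in for a Ferret index are all hypothetical. The point is only that a single object owns the index and every client goes through it. For brevity the "client" lines run in the same process as the server; in a real deployment they would run in each Mongrel or FastCGI process.

```ruby
require 'drb/drb'

# Toy stand-in for a Ferret index: maps record ids to text.
# IndexServer and its methods are hypothetical, not part of acts_as_ferret.
class IndexServer
  def initialize
    @docs = {}
    @lock = Mutex.new
  end

  # All writes are serialized through one object, so there is a single writer.
  def add(id, text)
    @lock.synchronize { @docs[id] = text.downcase }
    true
  end

  # Returns the sorted ids of all documents containing the term.
  def search(term)
    needle = term.downcase
    @lock.synchronize do
      @docs.select { |_id, text| text.include?(needle) }.keys.sort
    end
  end
end

# Server side: expose the single index owner. Port 0 lets the OS pick a free port.
DRb.start_service('druby://127.0.0.1:0', IndexServer.new)

# Client side: obtain a proxy for the remote object by its URI.
index = DRbObject.new_with_uri(DRb.uri)
index.add(1, 'Full text search with Ferret')
index.add(2, 'Deploying Rails with Mongrel')
hits = index.search('ferret')

DRb.stop_service
```

Because every update passes through the one IndexServer instance, concurrent application processes never write to the index files directly.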

Installation

Prerequisites

Install the ferret gem. In most cases

gem install ferret

should do. More instructions here.

How to get it

Since May 2008 the aaf code base is hosted with git at Rubyforge. Use

git clone git://rubyforge.org/actsasferret.git

to check out the project.

There’s also a mirror on GitHub:

git clone git://github.com/jkraemer/acts_as_ferret.git

I keep both repositories in sync, so it doesn’t matter which one you follow.

System-wide installation

From Version 0.3.1 onwards, acts_as_ferret is available as a gem, too. So you can use

gem install acts_as_ferret

to install the latest version into your local gem repository. In your Rails project, you have to

require 'acts_as_ferret'

in your environment.rb to hook the plugin into Rails. To install the DRb server start/stop scripts and config from your gem repository to your Rails project, call the aaf_install script that comes with the gem inside your RAILS_ROOT:

cd myproject
aaf_install

Inside your Rails project

Git

To install from github use

script/plugin install git://github.com/jkraemer/acts_as_ferret.git

I’m not sure yet how I’ll handle stable releases and versioning with git; we’ll see how it goes when a new release comes out :) Current trunk is pretty stable and I use it on several production sites. YMMV, of course.

If you lack a git client just download a tarball from github and unpack it to vendor/plugins/acts_as_ferret.

SVN (Versions up to 0.4.3 and trunk up to 2008/11/25)

I don’t push updates to SVN anymore, but I’ll keep the repository online.

Please use

script/plugin install svn://code.jkraemer.net/acts_as_ferret/tags/stable/acts_as_ferret

for easy installation of the current stable version of the plugin. At the moment this is version 0.4.3 based on Subversion Rev. 281. This is supposed to work with Ferret Versions 0.11.x, and Rails >= 1.1.

The last version of the plugin compatible with Ferret 0.3.2 is 0.2.0, located at svn://code.jkraemer.net/acts_as_ferret/tags/0.2.0/.

The last version of the plugin compatible with Ferret 0.9.x is 0.2.3, located at svn://code.jkraemer.net/acts_as_ferret/tags/0.2.3/.

If you want to use the bleeding-edge version of the plugin, use

script/plugin install svn://svn.jkraemer.net/acts_as_ferret/trunk/plugin/acts_as_ferret

Please note that the trunk usually requires the latest stable versions of Ferret and a recent version of Rails.

Demo project

There’s also a whole demo Rails (2.0) project (containing the acts_as_ferret test suite,
a simple model class, scaffolded CRUD GUI and a search form) located in the doc/ directory of the plugin.
The demo project requires Rails 2.0 or newer.

Updating your installation (optional)

After the plugin is already installed, you might want to get the latest plugin version. Do this by typing:

script/plugin update

This will automatically get updated versions of your acts_as_ferret plugin (and all plugins that were installed from remote repositories.)

Also, an interesting note from trunk/railties/lib/commands/plugin.rb:

#   * If @vendor/plugins@ is under subversion control, the script will
#     modify the svn:externals property and perform an update. You can
#     use normal subversion commands to keep the plugins up to date.

If you installed the acts_as_ferret gem, upgrading is even easier:

gem update acts_as_ferret

Usage

Basic usage is as follows. In any model class, add acts_as_ferret:

class Foo < ActiveRecord::Base
  acts_as_ferret
end

All CRUD operations will be performed on both ActiveRecord (as usual) and a Ferret index for further searching. Aaf will try to connect to an indexing server via DRb if one is configured in config/ferret_server.yml for the current environment.

*The DRb server is required for acts_as_ferret to work reliably if your application is running with more than one process (read: Mongrel or FastCGI listener).* The built-in DRb server can be controlled with the ferret_server script that comes with the plugin. Most of the time you will want to use the DRb server on your production server, and go without it for test and development environments.

Now you can use

Foo.find_with_ferret(query) # query is a string representing your query

in your controller logic.
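
The per-environment lookup in config/ferret_server.yml described above can be pictured roughly as follows. This is a sketch of the idea, not aaf's actual code: the ferret_server_uri helper is hypothetical, and the host and port keys, the host name and the port number are assumptions made for illustration.

```ruby
require 'yaml'

# Assumed layout of config/ferret_server.yml: one section per Rails
# environment. The host/port keys and their values are assumptions.
FERRET_SERVER_YML = <<~YAML
  production:
    host: ferret.example.com
    port: 9010
YAML

# Hypothetical helper: returns a druby:// URI when the given environment
# has a section in the config, or nil when it does not (meaning: no DRb
# server, use a local in-process index).
def ferret_server_uri(yaml_text, environment)
  settings = (YAML.safe_load(yaml_text) || {})[environment]
  return nil unless settings

  "druby://#{settings['host']}:#{settings['port']}"
end

production_uri  = ferret_server_uri(FERRET_SERVER_YML, 'production')
development_uri = ferret_server_uri(FERRET_SERVER_YML, 'development')
```

With only a production section present, development and test fall through to nil, which matches the recommendation to run the DRb server in production and go without it elsewhere.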

More documentation can be found in the API documentation.

Read about AdvancedUsage with aaf.

Resources on the web

  • 2008-05-13 Andrew Cetinick posted a tutorial in using acts_as_ferret covering installation, configuration, search conditions, pagination, boost, highlighting, re-indexing, and Ajax searching.
  • 2008-04-29 Brandon Keepers made a nice tutorial on using shared indexes
  • 2008-01-17 Although pagination is built into acts_as_ferret, it’s possible to use other approaches to pagination as this post suggests
  • 2007-02-19 Be sure to check out Gregg Pollack’s great tutorial. He covers all the important features of Ferret/ActsAsFerret – from simple searches to custom fields to match highlighting.
  • 2006-10-18 Roman Mackovcak posted a nice Introduction to Acts_as_ferret including info on how to do paging across search results.

Read how to integrate Ferret with the Typo blogging engine: TypoWithFerret

UTF-8 support

With recent Ferret versions (0.9.x) acts_as_ferret should provide UTF-8 support for indexing and searching out of the box. See test_unicode in content_test.rb. Unfortunately this UTF-8 support is not available in the Ruby-only version of Ferret 0.9.x.

Read here about configuring a complete Debian/Rails/MySQL/Ferret stack to use UTF8

Read this for a discussion of non-latin character handling

The following resources describe how to set up and integrate a custom analyzer that replaces accented characters with their ASCII counterparts for use with acts_as_ferret. This approach is useful when using older Ferret versions or the Ruby-only version of Ferret 0.9.x, which aren’t able to handle such characters properly:

For acts_as_ferret unicode support see Albert Delamednolls code example on his blog.

Here’s a sample method to transform Western European accented characters to ASCII (thanks to Bernd Schmeil):

def strip_diacritics(s)
  # latin1 subset only
  s.tr("ÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝàáâãäåçèéêëìíîïñòóôõöøùúûüýÿ",
       "AAAAAACEEEEIIIINOOOOOOUUUUYaaaaaaceeeeiiiinoooooouuuuyy").
    gsub(/Æ/, "AE").
    gsub(/Ð/, "Eth").
    gsub(/Þ/, "THORN").
    gsub(/ß/, "ss").
    gsub(/æ/, "ae").
    gsub(/ð/, "eth").
    gsub(/þ/, "thorn")
end

An easier approach to this would be using the Stringex gem. Just run all your UTF-8 text through its remove_formatting method before giving it to Ferret. Remember to do the same for your queries, or you won’t find anything.
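
The need to normalize queries as well as indexed text can be shown with a small sketch. It uses neither Ferret nor Stringex: the fold_accents and search helpers are hypothetical, a plain hash stands in for the index, and the folding relies on Ruby's built-in String#unicode_normalize (Ruby 2.2 or newer) rather than the methods mentioned above. Characters without a canonical decomposition, such as Æ, Ø or ß, are left untouched by this approach.

```ruby
# Hypothetical helper, not part of Ferret, acts_as_ferret or Stringex:
# decompose accented characters (NFD), then drop the combining marks.
def fold_accents(text)
  text.unicode_normalize(:nfd).gsub(/\p{Mn}/, '')
end

# Toy "index": record id => normalized, lowercased text.
titles = { 1 => 'Crème brûlée', 2 => 'Apple pie' }
index = {}
titles.each { |id, title| index[id] = fold_accents(title).downcase }

# The query goes through the same normalization as the indexed text.
def search(index, query)
  needle = fold_accents(query).downcase
  index.select { |_id, text| text.include?(needle) }.keys
end

accented_hits = search(index, 'BRÛLÉE')   # matches record 1
plain_hits    = search(index, 'brulee')   # matches record 1 as well
```

If the query were compared without folding, 'brûlée' would never match the stored 'brulee', which is the failure the paragraph above warns about.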

Multi language support

Analysis is language dependent. If you have to handle documents containing multilingual text, the MultiLingualFerretTools plugin built by Lingr.com might be useful.

Using Multiple Databases (Database Hijacking) and one ferret index

See MultipleDatabases for information on how to index multiple databases into one ferret index.

Prior Versions of this plugin

Before the creation of this repository there were several different versions of this plugin floating around, mainly inside the Ferret Wiki at http://ferret.davebalmain.com/trac/wiki. Have a look at them: OriginalVersions

Acts_as_ferret combines the efforts of Kasper Weibel, Thomas Lockney and Jens Kraemer, who each made a version of the plugin based on Kasper’s original code from December 2005. The three versions were unified by Jens in February 2006 and put into SVN.

The intention is for the SVN to be the main acts_as_ferret source.

Gotchas

A page of Gotchas and other possible issues.