Skip to content
23/06/2010 / Danresa Consultoria de Informática

MS SQL Server 2005 full-text indexing

Getting started with full-text search

  • Make sure that you download the version of SQL Server Express with the additional full-text search capabilities. This is the larger one of the two downloads possible.
  • Read up a bit on Full Text Search concepts.
  • Making the Catalog
    Now you need to create a full-text catalog. The catalog seems to be a container to put all your full-text indexes into. So, if you are going to need to search the text of three tables, you make three indexes (one for each table) and put all three indexes in the one catalog.
    WARNING: Some of the information of the Microsoft website relates to the full version of SQL Server, such as the “Storage folder” that Express doesn’t have. Don’t panic, you just need to create the catalog by hand, rather than through the Management Interface.
    For my application, I just made one catalog and put two of my indexes into it. From what I understand, if your indexes are going to be huge, you should make a separate catalog for each index. Here is how to make a catalog:

    1. Start SQL Server Management Studio Express, and connect to your database.
    2. Click on New Query to get a box where you can type raw SQL.
    3. Type:
      use mydatabase
      go
      EXEC sp_fulltext_database 'enable'
      go
      CREATE FULLTEXT CATALOG mycatalog
      go
    4. Click “Execute” to run the SQL. This will enable fulltext searching on the database, and create the catalog, which is in fact a folder on your computer called “mycatalog” somewhere within the Microsoft SQL Server folder. If your catalog is going to be HUGE, it may be worth researching some extra parameters for this command to tell it where to put the catalog, e.g. on a different disk to the database itself. This would be for performance improvement more than anything else.
  • // //

    Next we are going to create a full-text index. But if you want to create a full-text index on a table, the table must have a “unique, single-column, non-nullable index“. In most cases this will be your primary key. What Microsoft is saying here is that you must have a non-null index on the table (all primary keys automatically are this), and that the index must be on just one column. Oh Drat! My table has a primary key that is on two columns, so I have to make an index that is on a single unique column in order to use full-text searching. What to do, what to do? Well, my solution was to add a new column to the table called “text_id”. I made this an int column, and also made it an identity column. This way, the text_id field gets automatically filled with a unique integer for each record that is entered, without me needing to change any of my existing aspx pages.
    Also, if you already have a primary key, but it is on a really long field, I suggest you make an int field and use that instead, because the full-text index that gets created will be huge if the primary key you use is big, such as a GUID.

    Ok, here is how to create the full-text index:

    1.  
      1. If you need to add an identity column, do this first.
      2. Make sure you have an index on the identity column. I called mine “myindex”.
      3. Type in and execute this SQL:
        CREATE FULLTEXT INDEX ON mydatabase.dbo.mytable
        (
        column_to_index
        Language 0X0
        )
        KEY INDEX myindex ON mycatalog
        WITH CHANGE_TRACKING AUTO

        Where mytable is the name of the table, column_to_index is the name of the column that is full of text you want to be able to search on, myindex is the name of the index (could be PK_something if you used your primary key), and mycatalog is the name of the catalog you created earlier. The “Language” bit just tells SQL Server to not treat this as being in any specific language. I don’t know how to specify another language here, but you can change it later by using the SQL Server Manager interface. I didn’t notice any difference between a UK English index and a language-neutral index. I suspect it is only important for languages other than English.
      4. I did discover that sometimes when you try to make the index, it doesn’t get fully made until you do a little bit through the user interface. To verify that things have worked, start the SQL Server Manager interface, right-click on the table name and choose “Modify”. Right-click on the column and choose “Full text index…”. In here you need to check that “Columns” is set to the column name you want to index. Also check whether “Active” is Yes — if not, it means the index isn’t built yet.
      5. You may have to wait a while for the index to be created. If you have a large table, the CPU usage on your PC will probably go up to 100% while this happens. If you get problems, see the troubleshooting section below…
    2. Performing a Find
      The actual SELECT statement to find records is then very easy. Here is an example:
      SELECT description
      FROM mytable
      WHERE CONTAINS(description, '"shark attack"')

      (You need to put the search terms in double-quotes, within the single-quotes needed for the SQL)
    3. Eliminating Noise Words
      If your text fields are in HTML, then you should add things lik “BR” and “P” to the list of noise words.

    What I discovered

    At first, full-text search queries were taking about 30 seconds to return results. This was pretty disappointing, and I started looking for ways to tune the system up. But while I was looking, the whole thing sorted itself out, and suddenly queries became almost instant. It is very impressive, I can search for a phrase like “african drumming” on my 20,000 record database almost instantly. I don’t know what caused this delay then speed-up. Presumably the SQL Server was building the indexes, and it took a while. I’m not sure how come I could do queries while the index was being built though. It’s a puzzle.

    Also, I discovered that searching for “drum* lesson*” returns better results than searching for “drum lesson”, because the first one will also find “drumming lessons”. So, I made it so that queries that people enter automatically have a * put after each word.

    A Bit of Troubleshooting

    If it doesn’t work, here are some of the problems I encountered, and the ways of debugging and solving the problems:

    Application Events

    Right Click on My Computer, choose Manage, then Event Viewer, then Application. This shows you the Application Event Log. In here you will see any error messages from the Full Text service. Usually one of these errors will just redirect you to the server logs themselves.

    SQL Server Logfile & Fulltext Index Log-file

    Have a look in the logs. I found mine in C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\LOG. The fulltext logs are called SQLFTxxxxxx.LOG. They are hard to understand, but they may help.

    Try ReBooting

    You need to reboot after you install FullText Indexing, and a reboot is always a good place to start when troubleshooting. I had the error message “Error ‘0x80040e09’ occurred during full-text index population” appearing in my logfiles, and no fulltext indexes created at all until I figured out that I needed to reboot.

    Commands & Queries that you can type into SQL Server 2005:

    How to check the status of the full-text indexes in a catalog:

    USE mydatabase;
    GO
    EXEC sp_help_fulltext_tables mycatalog;
    GO

    Enable fulltext search on a table:

    USE mydatabase;
    GO
    ALTER FULLTEXT INDEX ON mytable ENABLE;
    GO

    Check if full text indexing is enabled:

    USE mydatabase;
    GO
    SELECT DATABASEPROPERTY(‘mydatabase’, ‘IsFullTextEnabled’);
    GO

    Update

    I’ve been looking at this a bit more recently, and worked out a few things:

    • You can have more than one fulltext indexed column per table. Old SQL Server versions prevented this, but 2005 allows it. So, you can have a table with a title and a summary column, and both can be full-text indexed.
    • The language setting is important, despite what I wrote above. I now use the code ‘2057’ to set my indexes to be in UK English. This means that searches on “dogs” will also find “dog”.
    • You can find the status of your fulltext indexes like this:
      exec sp_help_fulltext_columns mytable
      This shows you which columns in a table are set up for full-text searching.
      select * from sys.fulltext_indexes
      This shows you the status of the current indexes and when they were last updated. I think the key is to make sure that change_tracking_state_desc is set to “AUTO” and that has_crawl_completed is “1”. You can also see the time of the last crawl, i.e. when the index was last updated, which can help you see why things aren’t working.

    Here is my example .sql statement, that sets up three columns of a table for fulltext indexing, and sorts out the change tracking state, and starts a full crawl:

    use content
    exec sp_fulltext_database 'enable'
    go
    exec sp_fulltext_catalog 'cmscatalog', 'create'
    go
    exec sp_fulltext_table 'document', 'create', 'cmscatalog', 'PK_document'
    go
    exec sp_fulltext_column 'document', 'title', 'add', '2057'
    exec sp_fulltext_column 'document', 'keywords', 'add', '2057'
    exec sp_fulltext_column 'document', 'search_text', 'add', '2057'
    go
    exec sp_fulltext_table 'document', 'activate'
    go
    EXEC sp_fulltext_table 'document', 'Start_background_updateindex';
    go
    exec sp_fulltext_table 'document', 'start_full'
    go

    Conclusion

    Full Text Searching is easy to set up and use in SQL Server Express. When it works, it works fast!

    Anúncios

    Deixe um comentário

    Preencha os seus dados abaixo ou clique em um ícone para log in:

    Logotipo do WordPress.com

    Você está comentando utilizando sua conta WordPress.com. Sair / Alterar )

    Imagem do Twitter

    Você está comentando utilizando sua conta Twitter. Sair / Alterar )

    Foto do Facebook

    Você está comentando utilizando sua conta Facebook. Sair / Alterar )

    Foto do Google+

    Você está comentando utilizando sua conta Google+. Sair / Alterar )

    Conectando a %s

    %d blogueiros gostam disto: