Robert Douglass's blog

Drupal Search: How indexing works

Robert Douglass's picture

This article explores the process of taking HTML content from Drupal nodes and indexing it for the purpose of search and text retrieval at a later time. The code examples apply to Drupal 6.
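As a rough illustration of the idea (not Drupal's own indexer, which is written in PHP), the sketch below strips HTML from a few made-up nodes and builds a tiny inverted index in Python. The names `TextExtractor`, `html_to_words`, and `build_index`, along with the sample nodes, are hypothetical.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text between tags and discards the markup itself."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_words(html):
    """Strip tags, lowercase the text, and split it into words."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    return re.findall(r"\w+", text.lower())


def build_index(nodes):
    """Map each word to {node id: number of occurrences in that node}."""
    index = defaultdict(lambda: defaultdict(int))
    for nid, html in nodes.items():
        for word in html_to_words(html):
            index[word][nid] += 1
    return index


# Hypothetical node bodies keyed by node id.
nodes = {
    1: "<h1>Drupal search</h1><p>How indexing works.</p>",
    2: "<p>Search is <em>fundamental</em> to Drupal.</p>",
}

index = build_index(nodes)
print(dict(index["search"]))      # which nodes contain "search", and how often
print(sorted(index["indexing"]))  # node ids containing "indexing"
```

Looking up a word at search time then becomes a dictionary access rather than a scan over every node body, which is the main reason to index ahead of time.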

Finding what to index

Minnesota Search Sprint: Your top-five feature requests

In the same way that the Internet itself would not have achieved greatness without the ability to search it easily and efficiently, Drupal’s greatness will always be tied directly to the effectiveness of its core search solution. Improving core search for Drupal 7 will be no small task, however. The current implementation is elegant but complex, robust yet inflexible. The seven coders participating in the Minnesota Search Sprint this weekend have a great challenge as well as a great opportunity. Here are some of the things we hope to achieve:

  • Identify the most important weaknesses in Drupal search and create a project plan for fixing them.
  • Identify the most important new features currently missing from Drupal search and clear the roadblocks for implementing them.
  • Increase the test coverage for Drupal search.
  • Increase general developer awareness and knowledge of search.

A large part of what we will be doing is evaluating and planning. Without a roadmap and a common understanding of what search is to become, little progress will be made in the Drupal 7 development cycle. However, a coding sprint is all about code, and we’ll be writing some of that, too. Specifically, I’m hoping that we’ll be able to fix one of the top-five bugs, increase the search module’s test coverage, and come up with a first attempt at one of the top-five new features.

That’s a lot! No matter what we manage to code during the three days together, we’ll walk away with a high level of agreement about our goals for the next months, and plenty of homework to do.

We’ll post regular updates that you can follow on Planet Drupal, as well as in the search group, and we’re all ears if you have suggestions or wishes. For anyone wanting to catch up on their search-related reading, here are some links:

Drupal's search compared to Google and Yahoo!

When Drupal does a content search, it optionally weights the results using up to four scoring factors: keyword relevancy, recency of the content, number of comments, and (if the statistics module is enabled) the number of page views. Site administrators can adjust the relative weighting of these scoring factors from the example.com/admin/settings/search administration page. Setting any scoring factor to zero disables it.
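To make the weighting idea concrete, here is a minimal Python sketch of combining several scoring factors into one rank. It is not Drupal's actual ranking code: the weighted-sum formula, the normalization of each factor to the range 0 to 1, and the names `Result`, `combined_score`, and `rank` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Result:
    title: str
    relevancy: float  # keyword relevancy, assumed normalized to 0..1
    recency: float    # 1.0 = brand new, approaching 0 as content ages
    comments: float   # comment count, assumed normalized to 0..1
    views: float      # page views, assumed normalized to 0..1


def combined_score(result, weights):
    """Weighted sum of the scoring factors; a weight of 0 disables a factor."""
    total = 0.0
    for factor, weight in weights.items():
        if weight == 0:
            continue  # a disabled factor contributes nothing
        total += weight * getattr(result, factor)
    return total


def rank(results, weights):
    """Return results ordered from highest to lowest combined score."""
    return sorted(results, key=lambda r: combined_score(r, weights), reverse=True)


# Hypothetical admin settings: relevancy matters most, page views are disabled.
weights = {"relevancy": 5, "recency": 2, "comments": 1, "views": 0}

results = [
    Result("Old but on-topic", relevancy=0.9, recency=0.1, comments=0.2, views=0.9),
    Result("Fresh but vague", relevancy=0.3, recency=1.0, comments=0.1, views=0.1),
]

for r in rank(results, weights):
    print(r.title, round(combined_score(r, weights), 2))
```

With these made-up weights the on-topic result ranks first even though it is old. Raising the recency weight far enough would flip the order, which is exactly the trade-off an administrator tunes.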

In this article, which applies primarily to Drupal 6 but is relevant for Drupal 5 as well, I explore how useful these scoring factors really are, and whether they help Drupal search live up to the high standards set by leaders like Google and Yahoo!. This article is part of a series of search-related articles in preparation for the Minnesota Search Sprint.

Drupal 6: Hot new themes

This video highlights the new theming features of Drupal 6. The themes Pixture, Wabi, and Twilight are the work of Hide Ito (Pixture). They use not only the Farbtastic color picker but also the ability to adjust the width of the layout and, in the case of Twilight, a configurable header silhouette.

Drupal's Search Framework: The execution of a search

Drupal’s ambitious search module provides a framework for building searches of all kinds. By isolating the tasks involved in searching, and allowing the actual search implementations to be handled by other modules, the search framework sets the stage for all sorts of creative search applications. This article, which applies to Drupal 6, explores the structure of the search framework by following the steps needed to execute a search.
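The separation between the framework and the modules that implement a search can be sketched in a few lines. The Python below is only an analogy, not Drupal's PHP API: the registry, `register_search`, `execute_search`, and the toy data are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List

# Registry mapping a search type (for example "node" or "user")
# to the callable that actually performs that kind of search.
SearchImpl = Callable[[str], List[str]]
_implementations: Dict[str, SearchImpl] = {}


def register_search(search_type: str, impl: SearchImpl) -> None:
    """Let a 'module' announce that it can handle a given search type."""
    _implementations[search_type] = impl


def execute_search(search_type: str, keys: str) -> List[str]:
    """Framework side: validate the keywords, then delegate to the implementation."""
    keys = keys.strip()
    if not keys:
        return []  # nothing to search for
    impl = _implementations.get(search_type)
    if impl is None:
        raise KeyError(f"no search implementation registered for {search_type!r}")
    return impl(keys)


# Two toy implementations standing in for separate modules.
ARTICLES = ["How indexing works", "Hot new themes", "The execution of a search"]
USERS = ["robert", "dries"]


def search_articles(keys: str) -> List[str]:
    return [title for title in ARTICLES if keys.lower() in title.lower()]


def search_users(keys: str) -> List[str]:
    return [name for name in USERS if name.startswith(keys.lower())]


register_search("node", search_articles)
register_search("user", search_users)

print(execute_search("node", "search"))
print(execute_search("user", "dr"))
```

The framework function never needs to know how articles or users are matched, so a new kind of search only has to register itself.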

Structure of a search

The Minnesota Search Sprint

Continuing the great and growing tradition of bringing people together in small groups to attack focused problems, a search-related code sprint has been planned. From May 9 to 11, in the headquarters of the University of Minnesota Libraries, a small but dedicated group of Drupal coders will be melding minds to bring forth the next generation of Drupal search.

Why Search?

Drupal has a great search module. The search index it builds powers search on Drupal.org and thousands of other sites. It is a critical piece of the Drupal project and fundamental to countless sites built on Drupal. Being able to effectively search for issues and solutions is a cornerstone of keeping the Drupal.org community happy and productive, so investing in making search even better is akin to investing in Drupal’s overall success.

A Certification Success Story

Now that Acquia has officially announced its intentions to provide certification for its Drupal-related products, I'd like to share a personal success story. In 2000, when I decided to leave my position in a German orchestra and become a programmer, I desperately needed an opportunity to gain practical knowledge and prove that I had it. The prospect of pursuing a whole college degree wasn't attractive, as I didn't have enough money to pay tuition for so many classes. I didn't know about or understand open source software, and I wouldn't have been able to contribute anyway, as I was a complete beginner. Installing Drupal would have been too big a task. Furthermore, the amount of knowledge and skills involved in programming seemed endless to me. Every concept hid five other concepts, which in turn depended on even more concepts or information that was completely missing. It was quite difficult to find any material that started at step 1 and proceeded to step 2, in that order.

An honor and a privilege

From the first days of my involvement with Drupal I have looked to Dries Buytaert with deep respect and affection. His guidance of the Drupal project and the Drupal Association has been impeccable, and every opportunity I have had to work with him has been positive and productive. Therefore, when I told him that I was changing jobs, and he asked if I would like to join Acquia, I was pretty darn happy and excited.
