23 Jun 2009 18:25
TAGS: dev lucene search wikidot
This work was done, because searching the whole Wikidot database was far from being fast. At first we used Google Custom Search Engine to solve the problem, but we wanted to be independent from it. Also we wanted to include search results that are accessible by the person that searches but not by search engines (like private sites).
This worked really good. The only problem was long time spent to display the results. It was like 20 seconds, which was far better than when using previous search, but much worse than Google. This was strange, because from previous tests, we calculated the average search time should be about a second or two.
It started to be clear, when we noticed that only 20-30 searches per day are performed. The index is quite big, and it needs to be cached in RAM to work with sufficient performance.
Today I did more tests under heavy load and it seams Lucene can handle big number of queries. When users search often, the index is partially or even fully cached by the filesystem and searches are really quick!
But our main problem to solve today was slow local search a.k.a. "search this site". Moreover many concurrent queries were degrading database performance (not only for searching), so we decided to enable Lucene for local searches as well.
I must say it works really nice, fast and has a nice set of syntax tricks you can do with it, for example you can search for pages with something in tags. Just search for youtube tags:embed. This would search for pages matching youtube (in tags, title or content) and with embed in tags. If no such pages are found, partial matches are also returned, like: pages matching youtube (but with no embed in tags) or pages with embed in tags, (but not matching youtube).
To sum up, new search is faster, gives more accurate results, saves the database performance (which was the main goal) and allows nicer syntax than the old one.