NateGoSearchQuery
public class NateGoSearchQuery
| Field Summary | |
|---|---|
| protected array | $blocked_words Keywords that should not be included in the search performed by this query. |
| protected MDB2_Driver_Common | $db The MDB2 database driver to use. |
| protected array | $document_types The document types searched by this query. |
| protected array | $popular_words Popular keywords that are searched often on the site. |
| protected NateGoSearchSpellChecker | $spell_checker Spell checker used to check the spelling of keywords used in this query. |
| Constructor Summary | |
|---|---|
| | NateGoSearchQuery(MDB2_Driver_Common db) Creates a new NateGoSearch fulltext query. |
| Method Summary | |
|---|---|
| void | addBlockedWords(string\|array words) Adds words to the list of words that are not to be searched. |
| void | addDocumentType(string type_shortname) Adds a document type to be searched by this query. |
| void | addPopularWords(string\|array words) Adds words to the list of popular words that should be suggested. |
| protected array | cleanPopularResults(array results) Clean the popular words list. |
| static array | getDefaultBlockedWords() Gets a default list of words that are not searched by a search query. |
| protected void | getPopularReplacements(string keywords, array misspellings) Get popular replacements for words in the search keywords. |
| protected boolean | isPopularMatch(string word1, string word2) Checks if two words are similar. |
| protected string | normalizeKeywordsForSearching(string text) Performs additional normalization of a query string suitable for searching. |
| protected string | normalizeKeywordsForSpelling(string text) Performs initial normalization of a query string suitable for spell-checking. |
| NateGoSearchResult | query(string keywords) Queries the NateGoSearch index with a set of keywords. |
| private string | quoteArray(array array, string type) Quotes a PHP array into a PostgreSQL array. |
| void | setSpellChecker(NateGoSearchSpellChecker spell_checker) Sets the spell checker used by this query. |
protected array $blocked_words = array()
Keywords that should not be included in the search performed by this query
This is an array of blocked keywords.
protected MDB2_Driver_Common $db
The MDB2 database driver to use
Currently, NateGoSearch only supports PostgreSQL.
protected array $document_types = array()
The document types searched by this query
protected array $popular_words = array()
Popular keywords that are searched often on the site
This is an array of popular keywords.
protected NateGoSearchSpellChecker $spell_checker
Spell checker used to check the spelling of keywords used in this query
Null by default, meaning no spell-checking is done.
public NateGoSearchQuery(MDB2_Driver_Common db)
Creates a new NateGoSearch fulltext query
db - the database driver to use.
public void addBlockedWords(string|array words)
Adds words to the list of words that are not to be searched
These may be words such as 'the', 'and' and 'a'.
words - the list of words not to be searched.
public void addDocumentType(string type_shortname)
Adds a document type to be searched by this query
type_shortname - the shortname of the document type to add.
public void addPopularWords(string|array words)
Adds words to the list of popular words that should be suggested
words - the list of popular words.
protected array cleanPopularResults(array results)
Clean the popular words list
This is used to clean up the results queried in
NateGoSearchQuery::getPopularWords() to a list of unique words
that contains only one word per array entry. Numbers and common words
found in NateGoSearchQuery::blocked_words are also removed from
the list.
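The cleaning step described above can be sketched roughly as follows. This is a hypothetical stand-in rather than the library's own code: the function name, the assumption that each result is a plain string that may hold several words, and the sample data are all illustrative.

```php
<?php
/**
 * Hypothetical stand-in for the cleaning step; not the library's code.
 * Assumes each result is a plain string that may hold several words.
 */
function cleanPopularResultsSketch(array $results, array $blocked_words)
{
    $words = array();

    foreach ($results as $result) {
        // Split multi-word entries so each word gets its own array entry.
        $parts = preg_split('/\s+/', trim($result), -1, PREG_SPLIT_NO_EMPTY);
        foreach ($parts as $word) {
            // Drop numbers and blocked (common) words.
            if (is_numeric($word) || in_array($word, $blocked_words, true)) {
                continue;
            }
            $words[] = $word;
        }
    }

    // Remove duplicates and reindex the array.
    return array_values(array_unique($words));
}

$cleaned = cleanPopularResultsSketch(
    array('the red bicycle', 'bicycle 2008', 'helmet'),
    array('the', 'and', 'a')
);
// $cleaned is array('red', 'bicycle', 'helmet')
```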
results - an array of results to clean.
public static array getDefaultBlockedWords()
Gets a default list of words that are not searched by a search query
These words may be passed directly to the
NateGoSearchQuery::addBlockedWords() method.
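As a brief usage sketch, the default list can be combined with site-specific words. The helper name `buildArticleQuery` and the extra-words parameter are hypothetical; `$db` is assumed to be an already-connected MDB2 driver, and the NateGoSearch classes are assumed to be loaded before the helper is called.

```php
<?php
/**
 * Builds a query that skips the default blocked words plus a few extras.
 *
 * Hypothetical helper, not part of NateGoSearch. $db is assumed to be a
 * connected MDB2 driver.
 */
function buildArticleQuery($db, array $extra_blocked_words = array())
{
    $query = new NateGoSearchQuery($db);
    $query->addDocumentType('article');

    // The static default list may be passed directly to addBlockedWords().
    $query->addBlockedWords(NateGoSearchQuery::getDefaultBlockedWords());

    // addBlockedWords() accepts either a single string or an array.
    if (count($extra_blocked_words) > 0) {
        $query->addBlockedWords($extra_blocked_words);
    }

    return $query;
}
```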
protected void getPopularReplacements(string keywords, array misspellings)
Get popular replacements for words in the search keywords.
This is used to check search keywords, along with their corresponding spelling suggestions, for matches in the popular words list. If a match is found, we either replace the current misspelling, if one exists, or add an entry to the misspelling list with the new popular suggestion.
keywords - the keywords to check for improved suggestions.
misspellings - the misspellings for the given $keywords.
returns misspellings - the misspellings with added suggestions for popular words.
protected boolean isPopularMatch(string word1, string word2)
Checks if two words are similar
This is used to check if one word matches another word from the popular word list. Used to improve search suggestions by confirming whether two words are similar in sound and/or spelling.
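One way to test for similarity in sound and/or spelling is sketched below. The function name, the use of PHP's built-in `metaphone()` and `levenshtein()`, and the one-edit threshold are all assumptions made for illustration; the library's actual comparison may differ.

```php
<?php
/**
 * Hypothetical similarity check; the thresholds and the choice of
 * metaphone() and levenshtein() are assumptions, not the library's code.
 */
function isPopularMatchSketch($word1, $word2)
{
    $word1 = strtolower($word1);
    $word2 = strtolower($word2);

    // Similar in sound: identical metaphone keys.
    $sounds_alike = (metaphone($word1) === metaphone($word2));

    // Similar in spelling: at most one edit apart.
    $spelled_alike = (levenshtein($word1, $word2) <= 1);

    return $sounds_alike || $spelled_alike;
}

// One substitution apart, so this is true.
$match = isPopularMatchSketch('helmet', 'helmat');
```

A stricter or looser threshold changes how aggressively popular words are suggested, so the value would need tuning against real search logs.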
word1 - the first word being compared for similarities.
word2 - the second word being compared for similarities.
protected string normalizeKeywordsForSearching(string text)
Performs additional normalization of a query string suitable for searching
This converts all words to lower-case and removes apostrophe s's from
all words. Keywords should have already been partially normalized
using NateGoSearchQuery::normalizeKeywordsForSpelling().
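The two transformations named above (lower-casing and removing a trailing apostrophe-s) might look roughly like this. The function name and regular expression are illustrative assumptions, not the library's implementation.

```php
<?php
/**
 * Hypothetical sketch of the search normalization; not the library's code.
 */
function normalizeKeywordsForSearchingSketch($text)
{
    // Convert all words to lower-case.
    $text = strtolower($text);

    // Remove a trailing apostrophe-s from each word, e.g. "dog's" -> "dog".
    $text = preg_replace("/'s\\b/", '', $text);

    return $text;
}

$normalized = normalizeKeywordsForSearchingSketch("Nate's Search INDEX");
// $normalized is "nate search index"
```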
text - the string to be normalized.
protected string normalizeKeywordsForSpelling(string text)
Performs initial normalization of a query string suitable for spell-checking
This removes excess punctuation and markup. The resulting string may be
tokenized by spaces. Before searching, query strings should be further
normalized using
NateGoSearchQuery::normalizeKeywordsForSearching().
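A minimal sketch of this first normalization pass follows. Which punctuation counts as "excess" is not stated here, so keeping apostrophes and hyphens is an assumption, as are the function name and the sample input.

```php
<?php
/**
 * Hypothetical sketch of the spell-check normalization; not the
 * library's code. Apostrophes and hyphens are kept by assumption.
 */
function normalizeKeywordsForSpellingSketch($text)
{
    // Remove markup.
    $text = strip_tags($text);

    // Replace punctuation other than apostrophes and hyphens with spaces.
    $text = preg_replace("/[^\\p{L}\\p{N}'\\- ]+/u", ' ', $text);

    // Collapse runs of whitespace so the result can be split on spaces.
    $text = trim(preg_replace('/\s+/', ' ', $text));

    return $text;
}

$normalized = normalizeKeywordsForSpellingSketch('<b>Nate\'s</b>   search, index!!');
$tokens = explode(' ', $normalized);
// $tokens is array("Nate's", 'search', 'index')
```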
text - the string to be normalized.
public NateGoSearchResult query(string keywords)
Queries the NateGoSearch index with a set of keywords
Querying does not directly return a set of results. This is due to the way NateGoSearch is designed. The document ids from this search are stored in a results table and accessed through a unique identifier.
keywords - the search string to query.
private string quoteArray(array array, string type)
Quotes a PHP array into a PostgreSQL array
This is used to quote the list of document types used in the internal SQL query.
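To illustrate what quoting a PHP array into a PostgreSQL array can produce, here is a stand-alone sketch that builds an `ARRAY[...]` constructor. The function name and the manual escaping are assumptions for illustration only; the real method presumably delegates quoting to the MDB2 driver.

```php
<?php
/**
 * Hypothetical sketch of quoting a PHP array as a PostgreSQL ARRAY[...]
 * constructor. Manual escaping here is only for illustration.
 */
function quoteArraySketch(array $array, $type = 'integer')
{
    $quoted = array();

    foreach ($array as $value) {
        if ($type === 'integer') {
            // Force integers so no quoting is needed.
            $quoted[] = (string) intval($value);
        } else {
            // Quote as a string literal, doubling embedded single quotes.
            $quoted[] = "'" . str_replace("'", "''", $value) . "'";
        }
    }

    return 'ARRAY[' . implode(', ', $quoted) . ']';
}

$sql_array = quoteArraySketch(array(1, 2, '3'));
// $sql_array is "ARRAY[1, 2, 3]"
```

Note that PostgreSQL cannot infer the element type of an empty `ARRAY[]`, so a caller of a sketch like this would need to handle the empty case separately, for example with an explicit cast.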
array - the array to quote.
type - the SQL data type to use. The type is 'integer' by default.
public void setSpellChecker(NateGoSearchSpellChecker spell_checker)
Sets the spell checker used by this query
spell_checker - optional. The spell checker to use for this query. If not specified or specified as null, no spell checking is performed.
Perform queries using a NateGoSearch index
This is the class used to actually search indexed keywords. Instances of this class may search the index using the NateGoSearchQuery::query() method. For example, to search a database table called Article indexed with a document type of article, use the following code:
<?php
$query = new NateGoSearchQuery($db);
$query->addDocumentType('article');
$result = $query->query('some keywords');
$sql = 'select id, title from Article ' .
    'inner join %s on Article.id = %s.document_id and ' .
    '%s.unique_id = \'%s\' and %s.document_type = %s';
$sql = sprintf($sql,
    $result->getResultTable(),
    $result->getResultTable(),
    $result->getResultTable(),
    $result->getUniqueId(),
    $result->getResultTable(),
    $result->getDocumentType('article'));
$articles = $db->query($sql);
?>
Because of the specific PL/pgSQL implementation of the search algorithm, the query() method may only be called once per page request.
If the PECL stem package is loaded, English stemming is applied to all query keywords. See http://pecl.php.net/package/stem/ for details about the PECL stem package. Support for stemming in other languages may be added in later releases of NateGoSearch.
Otherwise, if a PorterStemmer class is defined, it is applied to all query keywords. The most commonly available PHP implementation of the Porter-stemmer algorithm is licensed under the GPL, and is thus not distributable with the LGPL-licensed NateGoSearch.
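The order of preference between the two stemmers can be sketched as below. This is a hypothetical helper, not the library's code: the `stem()` function and `STEM_ENGLISH` constant are assumed to be provided by the PECL stem package, and `PorterStemmer::Stem()` is assumed to be the static method exposed by a third-party PorterStemmer class.

```php
<?php
/**
 * Hypothetical helper showing the order of preference described above.
 * stem() and STEM_ENGLISH are assumed to come from the PECL stem package;
 * PorterStemmer::Stem() is assumed to exist on a third-party class.
 */
function stemKeywordSketch($keyword)
{
    if (extension_loaded('stem') && function_exists('stem')
        && defined('STEM_ENGLISH')) {
        // PECL stem package: apply English stemming.
        return stem($keyword, STEM_ENGLISH);
    }

    if (is_callable(array('PorterStemmer', 'Stem'))) {
        // Fallback: a separately installed Porter stemmer class.
        return PorterStemmer::Stem($keyword);
    }

    // Neither stemmer is available; leave the keyword unchanged.
    return $keyword;
}
```

When neither stemmer is installed, the sketch simply returns the keyword untouched.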