search question

Started by popov, October 09, 2004, 06:17:17 PM


popov

Can anybody help me with the following:

The search function in CPG only finds images whose description contains words longer than 3 characters.
How can I search for images with a query that is shorter than that?

Thanks for help.

Best regards,
Alexnadre

kegobeer

Unfortunately, this is more of a limitation with MySQL than Coppermine.  From the MySQL manual:

Quote:
MySQL uses a very simple parser to split text into words. A "word" is any sequence of characters consisting of letters, digits, ' (apostrophe), or _ (underscore). Some words are ignored in full-text searches:

    * Any word that is too short is ignored. The default minimum length of words that will be found by full-text searches is four characters.
    * Words in the stopword list are ignored. A stopword is a word such as "the" or "some" that is so common that it is considered to have zero semantic value. There is a built-in stopword list.

The default minimum word length and stopword list can be changed as described in section 13.6.4 Fine-Tuning MySQL Full-Text Search.

Every correct word in the collection and in the query is weighted according to its significance in the collection or query. This way, a word that is present in many documents has a lower weight (and may even have a zero weight), because it has lower semantic value in this particular collection. Conversely, if the word is rare, it receives a higher weight. The weights of the words are then combined to compute the relevance of the row.

Such a technique works best with large collections (in fact, it was carefully tuned this way). For very small tables, word distribution does not adequately reflect their semantic value, and this model may sometimes produce bizarre results. For example, although the word "MySQL" is present in every row of the articles table, a search for the word produces no results:

mysql> SELECT * FROM articles
    -> WHERE MATCH (title,body) AGAINST ('MySQL');
Empty set (0.00 sec)

The search result is empty because the word "MySQL" is present in at least 50% of the rows. As such, it is effectively treated as a stopword. For large datasets, this is the most desirable behavior: a natural language query should not return every second row from a 1 GB table. For small datasets, it may be less desirable.

A 1- or 2-letter word is shorter than the default minimum word length, so it is ignored outright. Even if that limit were lowered, such a word would almost certainly match over 50% of the rows and be treated as a stopword, killing the search.
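
For reference, the tuning the manual refers to can be sketched roughly as below. This assumes you administer the MySQL server yourself (many shared hosts will not allow it) and that the server supports boolean full-text mode (MySQL 4.0.1 or later). The table name cpg_pictures is only a placeholder for whichever table carries the FULLTEXT index; articles is the table from the manual's example.

```sql
-- Show the current minimum indexed word length (default is 4).
SHOW VARIABLES LIKE 'ft_min_word_len';

-- ft_min_word_len cannot be changed at runtime. It goes into the
-- server option file and takes effect after a server restart, e.g.:
--   [mysqld]
--   ft_min_word_len=2

-- After the restart, existing FULLTEXT indexes must be rebuilt.
-- cpg_pictures is a placeholder table name.
REPAIR TABLE cpg_pictures QUICK;

-- Boolean mode does not apply the 50% threshold, so a word that
-- appears in most rows can still be matched.
SELECT * FROM articles
WHERE MATCH (title,body) AGAINST ('MySQL' IN BOOLEAN MODE);
```

Lowering the minimum length does not remove the stopword list, so very common short words may still be skipped.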

However, cpg1.3.2 allows you to search for a string of any length.  I'd consider upgrading to 1.3.2.
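
One way a search can accept strings of any length is to use LIKE pattern matching instead of a full-text index. Whether 1.3.2 does exactly this is an assumption here, and the table and column names below are placeholders for illustration only.

```sql
-- Substring match: no minimum word length, no stopword list,
-- no 50% threshold. The leading wildcard prevents index use,
-- so MySQL scans the whole table.
SELECT pid, title
FROM cpg_pictures
WHERE title    LIKE '%ab%'
   OR caption  LIKE '%ab%'
   OR keywords LIKE '%ab%';
```

On a gallery with a few thousand images the full scan is usually tolerable; on very large tables it will be noticeably slower than a full-text query.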

popov


Do you perhaps know how to change search.inc.php to allow a search string of any length in CPG 1.2?

Joachim Müller

Do as suggested and upgrade. If you only want to find out the differences between cpg1.2 and cpg1.3.x, get the cpg1.3.2 package, unzip it and use a diff viewer like WinMerge to compare the two.

Joachim