problem with keywords in Greek

Started by nasose, October 30, 2006, 02:14:07 PM

nasose

Hi there

I just upgraded today to version 1.4.10 from an old 1.3.3 version and everything works more than fine. Before upgrading I never wanted to add keywords to my images, but I decided to do so after upgrading. In some cases I use keywords in English and in some cases in Greek. When I search in English all results are fine. When I search in Greek I get no results whatsoever. My language settings are:

Language: english
encoding: Greek (iso 8859-7)

If it makes any difference, I use Firefox 1.5 for surfing.

My gallery is located at http://www.nefsta-photography.com/gallery

Any ideas what is going on? I want to add keywords in Greek but I'm afraid that I'll waste my time since search shows nothing

Joachim Müller

Use utf-8 encoding instead of iso8859-7

agridoc

I have the same problem, and switching to UTF-8 is not desired. I rolled back the upgrade because of the search problem with Greek. There is no such problem with 1.3x.

What other solution is proposed by the CPG Team to overcome this?

Joachim Müller

None. Use utf-8. It's better. It's the future. We want to promote it, because we realized that it's superior. Don't go back to cpg1.3.x, just because you have issues with getting started with utf-8. Use the most recent version and ask us for fixes on all issues you have with utf-8, but don't go back to cpg1.3.x.

Nasose made the mistake of leaving his encoding set to iso8859-7 instead of converting his textual content to utf-8. Don't make the same mistake.

Cpg1.3.x goes unsupported (as in "no support at all"). If you insist on keeping cpg1.3.x, you're on your own - simple as that.
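
For readers weighing the advice above, here is a minimal sketch of what converting a single ISO-8859-7 string to UTF-8 looks like in PHP. It relies on PHP's standard iconv() function; the sample word and the variable names are illustrative assumptions and are not taken from Coppermine. Converting a whole gallery (database contents, language files, config) is a larger job that this sketch does not cover.

```php
<?php
// Illustrative only: the sample word and variable names are assumptions,
// not taken from Coppermine.

// "gata" (cat) written in Greek letters, as ISO-8859-7 bytes.
$iso_text = "\xE3\xE1\xF4\xE1";

// iconv(from, to, string) returns the converted string, or false on failure.
$utf8_text = iconv('ISO-8859-7', 'UTF-8', $iso_text);

if ($utf8_text === false) {
    echo "Conversion failed\n";
} else {
    // Each Greek letter takes two bytes in UTF-8, so 4 bytes grow to 8.
    echo strlen($iso_text), " bytes in, ", strlen($utf8_text), " bytes out\n";
    echo bin2hex($utf8_text), "\n";
}
```

The byte counts make the trade-off visible: the same four letters occupy twice the storage in UTF-8, but the result no longer depends on every component agreeing on a Greek-specific codepage.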

agridoc

Thank you for your reply GauGau.

I didn't write that I am getting started with UTF-8; I wrote that UTF-8 is not desired. I have my reasons, and it is my choice. I would definitely recommend UTF-8 for other cases but not this one. UTF-8 is not always better; it has its pros and cons. CPG seems to run its support forum in ISO-8859-1, and I believe it's the right choice, unless you create language-specific support boards for languages with non-Latin characters (and that needs a good percentage of messages in these languages to be beneficial). I am not arguing for or against UTF-8; I just believe that software should be flexible enough to let the user select the best choice for a specific use.

I believe that software developers should take care of the needs and choices of possible users. UTF-8 can be proposed and promoted with arguments, and there are also arguments against it, but it shouldn't and can't be forced on users.

CPG is excellent software and should not lose language flexibility. Version 1.4x seems to work quite well with other languages and codepages and a latin-1 database, except for the search problem. I think it could be overcome; it's a matter of will for the team.

Nibbler

It's not a matter of will at all; it just needs someone who is familiar with localised encodings to analyse the problem and post a fix. The problem might be in the first 40 lines of include/search.inc.php, which deal with charsets.

agridoc

Yes Nibbler, you are right, it needs work.

I wrote:
Quote: I think it could be overcome; it's a matter of will for the team.

It will be difficult, or unlikely, to find a solution if the CPG Team is not interested. It took some time and tests in SMF to overcome problems and find solutions for various languages and multilingual approaches.

CPG 1.3x worked well in this respect, so there is a base to start from.

Joachim Müller

There are i18n issues in cpg1.3.x that made us switch to utf-8 as a default. Going back to how things used to be in cpg1.3.x is not an option. As Nibbler suggested: i18n could be improved, but there are many things in coppermine that could be improved. It's merely a question of time of the devs and the individual needs of the devs. Currently, most devs use English for their personal gallery, so they simply don't have the need to look into i18n even more. If you want to contribute, please post actual suggestions for code changes.

agridoc

I examined the code of include/search.inc.php.

I am not a coder and I know little of PHP commands and syntax; however, I was active enough in the '80s to be able to follow a program.

I just omitted a command and Greek (ISO-8859-7) search works now.  :)

It's in line 38 of include/search.inc.php
if (!$mb_charset)
        $search_string = preg_replace('/[^0-9a-z %]/i', '', $search_string);


And the stripped code
if (!$mb_charset);

The above could probably be completely omitted, not tested though.

I know it's a crude operation but it works. I believe it could be fixed better but I don't know how.

The test was done with an older 1.4.6 test installation of CPG, but the code in this area is the same in 1.4.10, so it should work there too.
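
To see why removing that line helps, here is a small sketch of what the quoted filter does to Greek input. It runs the same regular expression from the post on a few sample strings. The function name strip_non_ascii() and the sample strings are made up for illustration, and the sketch assumes, as the thread implies, that $mb_charset is false when the gallery uses a single-byte charset such as ISO-8859-7.

```php
<?php
// Hypothetical wrapper around the regex quoted from include/search.inc.php.
// The pattern keeps only ASCII digits, ASCII letters, spaces and '%'.
function strip_non_ascii(string $search_string): string
{
    return preg_replace('/[^0-9a-z %]/i', '', $search_string);
}

// "gata" written in Greek letters, as ISO-8859-7 bytes (all bytes are >= 0x80).
$greek = "\xE3\xE1\xF4\xE1";
$mixed = "sunset " . $greek;

var_dump(strip_non_ascii('sunset 2006')); // ASCII input passes through unchanged
var_dump(strip_non_ascii($greek));        // every Greek byte is removed
var_dump(strip_non_ascii($mixed));        // only the ASCII part survives
```

Because every Greek letter in ISO-8859-7 is a byte outside the ASCII range, the filter reduces a purely Greek search term to an empty string before the query is built, which matches the "no results" symptom reported at the start of the thread.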

Nibbler

That code change should not have made any difference. What do you have set as the charset in config ?

agridoc

Quote from: Nibbler on December 11, 2006, 02:28:39 PM
That code change should not have made any difference.

It's true, it seemed so to me at first glance. However, I had a feeling that a forced UTF-8 input might be applied somewhere else in the code and that the search input string was being handled as UTF-8. So I thought it costs nothing to try: strip the parsing and see what happens.

It really works, and it works with word combinations too. I tried the original file: no Greek search. I switched to the edited file: Greek search OK.

I use Greek ISO-8859-7 as the codepage in config. I also converted the Greek language file to ANSI and set ISO-8859-7 as the codepage there too, and this is the language used for input. The database has latin1_swedish_ci collation.

I have to do more testing though, especially input and see how strings are stored now. This works with database as it was.

Nibbler

I see now, I misunderstood your post. That code probably should be removed.

agridoc

I am not sure, Nibbler, whether it should be removed or edited. The Team knows CPG's code better. Where should I look for input parsing?

There are also some other problems, such as SMF membergroup names in Greek not showing in CPG. I will open another topic when I am ready.

I must say I like the feeling that there is interest in non-UTF-8 codepages; the Team has shown really good will.  ;)

agridoc

Works with newly added Greek strings.

agridoc

I have upgraded and I am really amazed by the new capabilities of CPG. Search in Greek without UTF-8 works very well with the modification mentioned above and is not case sensitive.

The really exciting part is the use of keywords to make an album. It works well with Greek characters and ISO-8859-7, but I noticed a small problem.

On entering the search screen, the Greek keywords are not displayed; only a single dash is shown in place of the two keywords added so far, both of which work. Also, under "Edit Keywords" they display correctly but can't be edited; they seem to be ignored.

Could you please give some advice or guidelines on where to look in the code for this, so a solution can be found?

agridoc

I found the solution.

In /include/keyword.inc.php, at line 32 in CPG 1.4.10, change this code
  // Find unique keywords
  $keywords_array = array();

  while (list($keywords) = mysql_fetch_row($result)) {
      $array = explode(" ",$keywords);

      foreach($array as $word)
      {
       if (!in_array($word = utf_strtolower($word),$keywords_array)) $keywords_array[] = $word;
      }
  }

to this
  // Find unique keywords
  $keywords_array = array();

  while (list($keywords) = mysql_fetch_row($result)) {
      $array = explode(" ",$keywords);

      foreach($array as $word)
      {
       if (!in_array($word = strtolower($word),$keywords_array)) $keywords_array[] = $word;
      }
  }


The only difference is in line 40, where utf_strtolower() is replaced by the original PHP strtolower().

When I find time I will check for more instances of utf_strtolower() and utf_ucfirst(), as another problem with Greek ISO was solved with a similar approach in http://forum.coppermine-gallery.net/index.php?topic=39243.0

Perhaps slightly different code for these functions in the next version of Coppermine could give the flexibility to use the desired codepage without having to make modifications.
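
One way the flexibility suggested above might look is a lowercasing helper that takes the configured charset into account. The following is only a sketch under stated assumptions: lower_for_charset() and unique_keywords() are hypothetical names, not Coppermine functions, and the charset-aware branch depends on PHP's optional mbstring extension being installed and supporting the given charset. The extra check that skips empty words is also an addition that the original loop does not have.

```php
<?php
// Hypothetical helper, not part of Coppermine: lowercase $text according to
// the gallery's configured charset instead of always assuming UTF-8.
function lower_for_charset(string $text, string $charset): string
{
    if (function_exists('mb_strtolower')) {
        // mb_list_encodings() lists the encodings this mbstring build supports.
        $known = array_map('strtoupper', mb_list_encodings());
        if (in_array(strtoupper($charset), $known, true)) {
            return mb_strtolower($text, $charset);
        }
    }
    // Fallback: plain strtolower(). In PHP 8.2 and later this only
    // lowercases ASCII letters and leaves all other bytes untouched.
    return strtolower($text);
}

// The same de-duplication idea as the loop in the post, using the helper.
// Empty words (from double spaces) are skipped; that check is an addition.
function unique_keywords(array $keyword_rows, string $charset): array
{
    $keywords_array = array();
    foreach ($keyword_rows as $keywords) {
        foreach (explode(' ', $keywords) as $word) {
            $word = lower_for_charset($word, $charset);
            if ($word !== '' && !in_array($word, $keywords_array, true)) {
                $keywords_array[] = $word;
            }
        }
    }
    return $keywords_array;
}

// "GATA" in Greek capitals and "gata" in Greek small letters, ISO-8859-7 bytes.
$rows = array("Sunset \xC3\xC1\xD4\xC1", "sunset \xE3\xE1\xF4\xE1");
print_r(array_map('bin2hex', unique_keywords($rows, 'ISO-8859-7')));
```

With mbstring available, the two Greek spellings should collapse into one keyword, just as the two ASCII spellings do. Without it, the fallback still handles ASCII keywords but treats Greek capitals and small letters as different words on current PHP versions.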