Search will not work with special characters (æ,ø,å) added automatically - IPTC
 

Started by mboesen, November 04, 2009, 10:11:37 AM

mboesen

Link to database:
http://www.fodboldfotografen.dk/coppermine14x/index.php

I have searched but found only old topics and no solutions. I found a topic I created myself a long time ago, for which no solution was ever found. I didn't know whether I should add to that or start a new topic. When I tried to add to it, the forum told me it was very old and that I should consider starting a new topic. So here we go, and I hope there are some new people who might have a solution.

I am from Denmark and unfortunately we have odd characters ø, æ, å. I use Adobe Lightroom/Photoshop to add title, description and keywords and Coppermine automatically adds these to the database when mass uploading. The weird thing is that keywords are added and the database seems to work fine with æ,ø,å.

BUT..... when I search - test with BRØNDBY - nothing is found - NO IMAGES TO DISPLAY

I then added a test photo and entered the title, description etc. manually (Brøndby logo) and that image is found

So somehow Coppermine works with manually entered æ,ø,å, but not when they are added by the "read IPTC/EXIF" function

Any suggestions?

Joachim Müller

The IPTC library that Coppermine currently uses does not support Unicode encoding. That's why the import of special chars like the Scandinavian "ø" fails: the encodings differ. I'm sorry, there is currently nothing to circumvent or improve the situation except the lame recommendation not to use special chars.
You could do a search&replace on the content of the database and manually convert the chars, but that's hardly an option for a dynamic gallery either.
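
To illustrate the "encoding differences" mentioned above, here is a minimal sketch. It assumes the IPTC field holds ISO-8859-1 bytes and the search term arrives as UTF-8; neither is confirmed for Coppermine or Lightroom, and `iptc_to_utf8` is a hypothetical helper name, not a Coppermine function. It needs the mbstring extension.

```php
<?php
// Sketch only: assumes ISO-8859-1 input from IPTC and a UTF-8 search term.

// Hypothetical helper: convert a single-byte IPTC string to UTF-8.
function iptc_to_utf8(string $raw, string $sourceCharset = 'ISO-8859-1'): string
{
    return mb_convert_encoding($raw, 'UTF-8', $sourceCharset);
}

$fromIptc   = "BR\xD8NDBY";    // "BRØNDBY" as ISO-8859-1: Ø is the single byte 0xD8
$searchTerm = "BR\u{D8}NDBY";  // "BRØNDBY" as UTF-8: Ø is the two bytes 0xC3 0x98

// Same visible text, different bytes, so a plain comparison fails.
var_dump($fromIptc === $searchTerm);                // bool(false)

// After converting the imported value, the two strings are identical.
var_dump(iptc_to_utf8($fromIptc) === $searchTerm);  // bool(true)
```

If the files were tagged on a Mac, the source bytes may be MacRoman instead, in which case the charset assumed here would be wrong.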

mboesen

Thanks for your answer.... what a shame, and yes, it would be so much easier just using English, but customers/users probably won't agree  ;)

I just tried to open a photo and select EDIT DESCRIPTION, and without changing anything I just hit the "OK" button (it says ANVEND RETTELSER in Danish), and now the characters (the photo) are found in the search......

Is there any smart way of doing that automatically for all images in the archive?

mboesen

My own solution for now......

Found out that if I enter EDIT DESCRIPTION and just press USE/OK, then the characters will be found in the searches.

I can do 100 at a time with EDIT FILES for each album - better than one at a time  ;)

zeppo

Hello Friends,

I had a working modification for encoding problems (for Mac OS). It worked OK with previous (1.4) versions.
So far I have not managed to make it work properly with my 1.5 test site.

Here's the mod:

iptc.inc.php

The beginning of the original file:

function strip_IPTC($data) {
    if (is_array($data)) {
        foreach ($data as $key=>$item) {
             $data[$key]=strip_IPTC($item);
        }
    } else {
         $data=htmlentities(strip_tags(trim($data,"\x7f..\xff\x0..\x1f")),ENT_QUOTES); //sanitize data against sql/html injection; trim any nongraphical non-ASCII character:
    }
    return $data;
}


and I replaced it with:

function strip_IPTC($data) {
    if (is_array($data)) {
        foreach ($data as $key=>$item) {
             $data[$key]=strip_IPTC($item);
        }
    } else {
        $data = htmlentities(strip_tags($data),ENT_QUOTES); // the trim function below removes some MacRoman chars if they are in the beginning/end of the string.
        //$data=htmlentities(strip_tags(trim($data,"\x7f..\xff\x0..\x1f")),ENT_QUOTES); //sanitize data against sql/html injection; trim any nongraphical non-ASCII character:
       
        // replace MacRoman chars
        $data=ereg_replace(128, "Ä",$data);
        $data=ereg_replace(138, "ä",$data);
        $data=ereg_replace(133, "Ö",$data);
        $data=ereg_replace(154, "ö",$data);
        $data=ereg_replace(134, "Ü",$data);
        $data=ereg_replace(159, "ü",$data);
        $data=ereg_replace(205, "Õ",$data);
        $data=ereg_replace(155, "õ",$data);
        $data=ereg_replace(129, "Å",$data);
        $data=ereg_replace(140, "å",$data);
        $data=ereg_replace(175, "Ø",$data);
        $data=ereg_replace(191, "ø",$data);
        $data=ereg_replace(190, "æ",$data);
        $data=ereg_replace(174, "Æ",$data);
        $data=ereg_replace(169, "©",$data);
    }
    return $data;
}


In version 1.5.12, the mod fixed special characters in keywords, but not in the title.

I would be happy to learn how the title encoding could be fixed as well.

Cheers Seppo
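
For anyone trying the mod above on a current PHP version: `ereg_replace` was removed in PHP 7.0. Below is a rough sketch of the same byte mapping using `strtr`. The helper name `macroman_to_utf8` is hypothetical, it covers only the 15 characters listed in the mod, and it assumes the input really is MacRoman (running it on text that is already UTF-8 would corrupt it).

```php
<?php
// Hypothetical helper, not part of Coppermine: maps the MacRoman bytes from
// the mod above to UTF-8. Bytes that are not in the map pass through unchanged.
function macroman_to_utf8(string $data): string
{
    static $map = [
        "\x80" => "\u{C4}", // 128 Ä
        "\x8A" => "\u{E4}", // 138 ä
        "\x85" => "\u{D6}", // 133 Ö
        "\x9A" => "\u{F6}", // 154 ö
        "\x86" => "\u{DC}", // 134 Ü
        "\x9F" => "\u{FC}", // 159 ü
        "\xCD" => "\u{D5}", // 205 Õ
        "\x9B" => "\u{F5}", // 155 õ
        "\x81" => "\u{C5}", // 129 Å
        "\x8C" => "\u{E5}", // 140 å
        "\xAF" => "\u{D8}", // 175 Ø
        "\xBF" => "\u{F8}", // 191 ø
        "\xBE" => "\u{E6}", // 190 æ
        "\xAE" => "\u{C6}", // 174 Æ
        "\xA9" => "\u{A9}", // 169 ©
    ];
    // With an array argument, strtr does not rescan text it has already replaced.
    return strtr($data, $map);
}

$raw = "Br\xBFndby";  // "Brøndby" as MacRoman bytes
echo htmlentities(strip_tags(macroman_to_utf8($raw)), ENT_QUOTES, 'UTF-8'), "\n";
// prints Br&oslash;ndby
```

In this sketch the conversion runs before `htmlentities`, and the charset is passed explicitly, so the entity encoding sees valid UTF-8. Whether the title field goes through `strip_IPTC` at all in 1.5 is not something this sketch answers.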

talou

Hello all,

Using the latest cpg1.5.12 with a lot of keywords, I discovered that accented characters do not work in keywords and search, not only after an IPTC import, but in ALL searches with special characters.

The reason is that keywords are saved in htmlentities format.

So a query like:
http://www.example.com/thumbnails.php?album=search&keywords=on&search=d%C3%A9couverte
returns nothing.

To get something, it should be
http://www.example.com/thumbnails.php?album=search&keywords=on&search=d&eacute;couverte
But then there are too many false results, because only the "d" is searched...

This is very problematic for languages that use accented characters, and they are numerous...

What kind of workaround would you advise? Maybe an html_entity_decode in the search module, but I don't like changing local files. It screws up any update...

Thanks for your answers
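
The mismatch described in this post can be reproduced in a few lines. This is only a sketch: it assumes keywords are stored entity-encoded as stated above, and `encode_search_term` is a hypothetical helper, not a function from the Coppermine search module.

```php
<?php
// Sketch only: assumes keywords are stored in htmlentities form.

// Hypothetical helper: bring a UTF-8 search term into the stored form.
function encode_search_term(string $term): string
{
    return htmlentities($term, ENT_QUOTES, 'UTF-8');
}

$term   = "d\u{E9}couverte";          // "découverte" in UTF-8
$stored = encode_search_term($term);  // "d&eacute;couverte"

// The raw term is not contained in the stored value ...
var_dump(strpos($stored, $term) !== false);                            // bool(false)
// ... but decoding the stored value (or encoding the term) makes them match.
var_dump(html_entity_decode($stored, ENT_QUOTES, 'UTF-8') === $term);  // bool(true)

// In a URL the entity must be percent-encoded, because a bare "&" ends the parameter.
echo rawurlencode($stored), "\n";     // d%26eacute%3Bcouverte
parse_str('search=d&eacute;couverte', $query);
echo $query['search'], "\n";          // d
```

Encoding the search term before it is compared keeps the stored data untouched, while decoding would have to be applied to every stored keyword. Either way it means touching the search code, so the concern about updates still applies.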