MySQL Collation mix problem, after clean install MySQL Collation mix problem, after clean install
 

News:

CPG Release 1.6.26
Correct PHP8.2 issues with user and language managers.
Additional fixes for PHP 8.2
Correct PHP8 error with SMF 2.0 bridge.
Correct IPTC supplimental category parsing.
Download and info HERE

Main Menu

MySQL Collation mix problem, after clean install

Started by Varrah, October 19, 2006, 07:28:01 PM

Previous topic - Next topic

0 Members and 1 Guest are viewing this topic.

Varrah

Hi
Just installed CPG 1.4.9
It printed me Fatal error, so I switched to Debug mode and here's what it says:
Quote
While executing query "select user_id from `coppermine`.photo_sessions where session_id=md5("b46fcc325af383fffb78bbf99a963d06d6020768e4617f73d93f5346f84a4141");" on 0

mySQL error: Illegal mix of collations (cp1251_general_ci,IMPLICIT) and (koi8r_general_ci,COERCIBLE) for operation '='

Configuration:
MySQL - 4.1.9; The default mysql charset is set koi8r by my hoster.
PHP - 5.0

Tried already to add the mysql_query("SET CHARACTER SET cp1251") after each mysql_connect call, but this didn't help.

What should I do?

thanks in advance

Varrah

Seems to be solved by setting the koi8r_general_ci collation for the session_id column in the _sessions table

At least it let me through the login screen and didn't break yet...

Varrah

GauGau - You've made it far too quick, my marking the issue as solved. It was only seeming to be so.

After connecting the cpg to the phpbb the problem appeared once again. This time MySQL was not happy with the phpbb_session table configuration. After that I finally understood, that the problem is really in the code of the cpg not in my hosting configuration or DB. Since if phpbb works OK with cp1251 - then the cpg should also. My mistake in the first case was to use a wrong MySQL query to provide the MySQL with information about used charset. I wrote "SET CHARACTER SET cp1251", but should do it in the same way it is done in phpbb - "SET NAMES cp1251". Of course, in phpbb they do it even better - they check the version of the MySQL and the charset, that is chosen by the user and then make the query, if it's neccessary, substituting the correct encoding name.

So that's the case - after each mysql_connect (I found 4 apperances of it!) one should insert the line: mysql_query("SET NAMES <mysql-charset-name>") in order to be sure, that the data, that is written and then searched and displayed is really in the charset that was selected in the configuration tab of the cpg. (By now the situation is like: DB is set by default to e.g. KOI8-r but the data is stored in cp1251. So when searching in the DB for the string in cp1251 - you'll not find anything actually, since DB thinks it's KOI8).

The issue also affects the information taken from the bridge conncetions to other systems, e.g. the same phpbb. If you could really read the questions in the russian part of the support forum, you could find out that many users report about [problems of displaying of the users list, after connection to phpbb. And that's it - phpbb really stores the data in the cp1251, but cpg doesn't inform the DB that it needs data in that encoding, so the DB give it all out in the default charset (like koi8-r set on many russian unix-based hosting services) or even latin-1 (if that's a foreign hosting).

I'm not sure how to report this to the bug-tracker, could moderators do this please?

Joachim Müller

Quote from: Varrah on October 22, 2006, 01:22:06 PM
GauGau - You've made it far too quick, my marking the issue as solved.
I marked it as "solved" because you reported the case to be solved.


Quote from: Varrah on October 22, 2006, 01:22:06 PM
Since if phpbb works OK with cp1251 - then the cpg should also.
Who says so? Don't make false assumptions.

Quote from: Varrah on October 22, 2006, 01:22:06 PMOf course, in phpbb they do it even better
Well, that's a matter of what you consider to be better.

Quote from: Varrah on October 22, 2006, 01:22:06 PMIf you could really read the questions in the russian part of the support forum, you could find out that many users report about [problems of displaying of the users list, after connection to phpbb.
We recommend using utf-8. The russian-speaking community appears to be reluctant to use it, as most of them don't appear to be understanding the great benefits utf-8 brings in comparison to proprietary encodings. Imo not a bug, but expected behaviour. Even the moderator of the Russian support board is reluctant to understand the benefits of utf-8 and keeps on recommending proprietary encodings.

Quote from: Varrah on October 22, 2006, 01:22:06 PMI'm not sure how to report this to the bug-tracker, could moderators do this please?
Moderators will move threads accordingly if they consider issues an actual bug. Yours is imo not a bug: use utf-8 as recommended and everything will be dandy.
Maybe other devs want to have a say on this.

Varrah

I really do not like arguing on forums, especially in a subject-oriented topic, but I really should say something...

Quote from: GauGau on October 22, 2006, 06:33:24 PM
I marked it as "solved" because you reported the case to be solved.
Well, OK, I'll ask for an excuse about this fact, since it was hard to notice the "seems to be" and "didn't break yet" words, and also hard for you to check the setup of MySQL with KOI8-R as default charset...

Quote
Who says so? Don't make false assumptions.
So, then, how many dozens of links you want me to write here for you to believe?
I say dozens because I can't remember all those hundreds of sites, that use PHPBB with Windows-1251 encoding for quite a long time, including those 4 forums that I had set up and still administer by myself. But if you really do not believe me, OK, I'll use google and will go and find you more links...

Quote
Well, that's a matter of what you consider to be better.
I consider the code can be called good if it's aware that some settings on a server, placed on the other side of the Earth, can differ from settings that a developer has on his own testing machine.

Quote
We recommend using utf-8. The russian-speaking community appears to be reluctant to use it, as most of them don't appear to be understanding the great benefits utf-8 brings in comparison to proprietary encodings. Imo not a bug, but expected behaviour. Even the moderator of the Russian support board is reluctant to understand the benefits of utf-8 and keeps on recommending proprietary encodings.
Surprisingly, huh? :-)
That's because not all the computers in the world know anything about UTF-8. But you can say for sure, that on the computers of 70% of internet users Windows is installed. And so if saying about Russian users, what encoding, do you think, will windows set up? Let me guess - Windows Cyrillic (cp-1251)?

You're german. That's good, but your extended latin chars are not so widelly used in your language for you to understand the big difference between loosing just some umlaut-chars (that can be easily replaced with ae, ue, oe) and loosing all the chars in a second of changing from one encoding to another. UTF-8 was made especially so that the first 128 chars are on the same places as they always were. So you wouldn't want to bother yourself with setting up an apropriate encoding upon connecting to the DB. Notice - it's a NEW version of DB, and it supports MORE different encodings for full-text search and better optimization of programming and searching. Not just the magic UTF-8.

And so the guys from PHPBB are prepeared to the fact that the DB can be set up in a differrent way, as well as user prefferences can be somethat different as well and they do the check and setup. And that's why their system works fine on every server I tried to set it up, not like coppermine, that could say only "Fatal Error:" and nothing more, just on the first run after the "Setup was successful" tabloid appeared.

Quote
Moderators will move threads accordingly if they consider issues an actual bug. Yours is imo not a bug: use utf-8 as recommended and everything will be dandy.
Maybe other devs want to have a say on this.
That's because you're thinking of it as if I'm just a geek of those "proprietary encodings". I made the coversion to the cp-1251 (by the way - it's used most often on the russian servers, but is not in the encoding setup of the cpg yet, only koi8-r is there already, thought that one is used mostly on *nix-based systems) only when I understood, that a connection to my ready and running phpbb is necessary. Since the whole site, all services and the forum (including, of course, the usernames) were in cp-1251 I made the conversion.

But in the first case I set it all up with the recommended UTF-8 settings. Who could tell me, that I'll get it all broken right after install only due to the KOI8-R setting on the hoster side? I'll tell you even more - the error was appearing only in Internet Explorer (v 6.0 + SP2). If trying it all in FireFox (1.5.2) or Opera (8.5) - it was OK. The whole thing was just about the encoding of the session ID, got from the cookie. I spent hours of investigation - why did IE made the cookie in one encoding but in DB it was checked with the data, got in another encoding? I got through the whole code of coppermine and checked, that EVERYWHERE the encoding was correct. The onliest thing was wrong - there wasn't a word about the connection to MySQL.

And also, man, if with UTF-8 it's so easy and fun, why is THIS forum using the Western Latin-1? You can not see and understand this, but the same Russian part of the forum looks weird, since half of posts are submited with the wrong encoding, preset by server side, and so to read the post one has to switch from Latin-1 to cp-1251 and vice versa.

Joachim Müller

Quote from: Varrah on October 23, 2006, 07:48:22 PM
Quote
Who says so? Don't make false assumptions.
So, then, how many dozens of links you want me to write here for you to believe?
I say dozens because I can't remember all those hundreds of sites, that use PHPBB with Windows-1251 encoding for quite a long time, including those 4 forums that I had set up and still administer by myself. But if you really do not believe me
I have not the slightest doubt that phpBB does exactly what you claim it does. The false assumption is that just because phpBB works fine with your setup coppermine will work fine with it as well. Coppermine is not phpBB, nor is phpBB the same as Coppermine.

Quote from: Varrah on October 23, 2006, 07:48:22 PMThat's because not all the computers in the world know anything about UTF-8.
That's just plain wrong: although I have never used a russian Windows install, both my German, Spanish, Turkish and English Windows installs performed fine with utf-8, as both IE as well as Firefox and Opera come with utf-8 support. The OS doesn't necessarily have to support utf-8 as long as the browser supports it, and I'm pretty sure all modern browser do support utf-8. Anyway, you're just trading one issue with another: with proprietary M$-fonts, you lock out all other operating systems. Yes, they are a minority, but wrong things don't get right just because the world's market leader for operating systems does things wrong and everybody accepts it as god-given. Don't be part of the problem, but be part of the solution.


Quote from: Varrah on October 23, 2006, 07:48:22 PMYou're german.
Correct
Quote from: Varrah on October 23, 2006, 07:48:22 PMThat's good, but your extended latin chars are not so widelly used in your language for you to understand the big difference between loosing just some umlaut-chars (that can be easily replaced with ae, ue, oe) and loosing all the chars in a second of changing from one encoding to another.
That's wrong: using utf-8, I don't lose them - they work as expected, as well as special chars from many other languages - on one and the same page.
Russian works fine as well: I have tested this (although I don't speak Russian) quite often already (with the help of a Russian friend) if you use the recommended encoding (utf-8) both for your db tables and for Coppermine's encoding.

Quote from: Varrah on October 23, 2006, 07:48:22 PMAnd also, man, if with UTF-8 it's so easy and fun, why is THIS forum using the Western Latin-1? You can not see and understand this, but the same Russian part of the forum looks weird, since half of posts are submited with the wrong encoding, preset by server side, and so to read the post one has to switch from Latin-1 to cp-1251 and vice versa.
I noticed this as well - I'm not blind. The forum runs in iso8859-1 for historical reasons (we had loads of English content already when we decided to allow a Russian sub-board to be started). I'm not to blame if half of the russian users posts using one encoding and the other half uses the other. Surprisingly the users on the Chinese sub-board don't have those issues. However, we're not discussing SMF, are we? This can only serve as some proof of concept that the iso8859-1 encoded SMF board does work fine with the bridged demo in utf-8.

Bottom line: if you want, call the issue you brought up a missing feature: Coppermine has not been tested with all kinds of database encodings, and our main focus is on using utf-8, simply because we're convinced that it's the right thing to do and we're ready to promote the use of Unicode. Maybe we should add this line into the docs:
QuoteWarning: Coppermine will work fine when using utf-8 both as encoding for the app as well as the database. All other encodings may or may not work. Try it for yourself.

You know, as I suggested earlier: everybody has a chance to participate: just contribute your suggested code changes and we'll see what can be done. Volunteer as tester for the dev version, check using all kinds of different setups and platforms. Come up with good, valid, usefull reports. I'm convinced that things can be improved in coppermine as well, but currently, the issue you brought up has little or no priority, simply because none of the devs could possibly test your issues in detail and come up with fixes. The Russian community (with many million potential users) has so far contributed little to the coppermine project - why don't you make a change and stop nagging and come up with actual code contributions instead?

Varrah

I've explained the chahes already: after each mysql_connect (I found 4 apperances of it!) one should insert the line: mysql_query("SET NAMES <mysql-charset-name>");
Where the <mysql-charset-name> is actually a MySQL-proven name for the charset, selected by the user in the coppermine configuration tab.
So if you need the actual instructions, here they are:

I'm assuming here, that a user uses cp1251 as the encoding by default. Problem here is a bit more complicated - I do no know the code of cpg good enough right now to substitute the neccessary variables right now. What's more important, that when dealing with the bridge connections cpg should connect to foreign tables using not the settings of cpg, but settings of the system, it's connected to. I suppose, that there should be some kind of charset chooser in the bridge connection for that - since not all systems do the right connection, when writing to the DB.
Now, knowing all the above we can do the following:

assuming also, that /coppermine/ is the install folder

1. Find file /coppermine/bridgemgr.php
find the line:
$link = mysql_connect($_POST['db_hostname'], $_POST['db_username'], $_POST['db_password']);

right after it add a line:
mysql_query("SET NAMES cp1251", $link);

in the same file search for the line:
$link = @mysql_connect($BRIDGE['db_hostname'], $BRIDGE['db_username'], $BRIDGE['db_password']);

and again add a new line right after:
mysql_query("SET NAMES cp1251", $link);

2. In the file /coppermine/include/functions.inc.php find the line:
$result = @mysql_connect($CONFIG['dbserver'], $CONFIG['dbuser'], $CONFIG['dbpass']);

add a new line:
mysql_query("SET NAMES cp1251", $result);

3. In /coppermine/bridge/udb_base.inc.php search for:
$this->link_id = mysql_connect($this->db['host'], $this->db['user'], $this->db['password']);

add a new line:
mysql_query("SET NAMES cp1251", $this->link_id);

That's it.
Actually, why didn't you make the connection to the DB in some external module?
Then you could just call it from all those places (here actually there're a few more - in the checks and set ups in files install.php and update.php).
In PHPBB they also do a MySQL version check on each connect call and set the charset/collation as follows (extract from /phpbb/includes/db.php):

$db = new sql_db($dbhost, $dbuser, $dbpasswd, $dbname, false);
if(!$db->db_connect_id)
{
    message_die(CRITICAL_ERROR, "Could not connect to the database");
}
// Set connection charset & collation for MySQL (based on phpMyAdmin)
if( preg_match("/^mysql/i", SQL_LAYER) )
{
    $sql = 'SELECT VERSION() AS mysql_version';

    $result = $db->sql_query($sql);
    $row = $db->sql_fetchrow($result);
    $match = explode('.', $row['mysql_version']);

    if (!isset($row))
    {
        define('MYSQL_INT_VERSION', 32332);
        define('MYSQL_STR_VERSION', '3.23.32');
    }
    else
    {
        define('MYSQL_INT_VERSION', (int)sprintf('%d%02d%02d', $match[0], $match[1], intval($match[2])));
        define('MYSQL_STR_VERSION', $row['mysql_version']);
    }
    unset($result, $row, $match);

    if ( MYSQL_INT_VERSION >= 40100 && isset($mysql_charset) )
    {
        if (empty($collation_connection) || ( strpos($collation_connection, '_') ? substr($collation_connection, 0, strpos($collation_connection, '_')) : $collation_connection) == $mysql_charset )
        {
            $sql = "SET NAMES $mysql_charset";
            if ( !($db->sql_query($sql)) )
            {
                message_die(CRITICAL_ERROR, 'Could not set MySQL charset', '', __LINE__, __FILE__, $sql);
            }
        }
        else
        {
            $sql = "SET CHARACTER SET $mysql_charset";

            if ( !($db->sql_query($sql)) )
            {
                message_die(CRITICAL_ERROR, 'Could not set MySQL charset', '', __LINE__, __FILE__, $sql);
            }
        }
        if ( !empty($collation_connection) )
        {
            $sql = "SET collation_connection = '$collation_connection'";

            if ( !($db->sql_query($sql)) )
            {
                message_die(CRITICAL_ERROR, 'Could not set MySQL collation connection', '', __LINE__, __FILE__, $sql);
            }
        }
    }
}

Varrah

Oh, yes, and in the above instructions in assumed, that you have MySQL 4.1 or higher.

Makc666

I want to say that this issue is more relaid to Coppermine Bridging support
http://coppermine-gallery.net/demo/cpg14x/docs/index.htm#integrating_support_start

When you configure bridge for phpBB why not to add one more option for character set?

The idea I liked in Varrah's post is:
QuoteActually, why didn't you make the connection to the DB in some external module?

Joachim Müller

Quote from: Varrah on October 23, 2006, 10:29:20 PM
Oh, yes, and in the above instructions in assumed, that you have MySQL 4.1 or higher.
There's a significant number of Coppermine users who don't fullfill this. Are you asking to drop support for people with older mySQL versions just for the sake of your suggestions?

QuoteActually, why didn't you make the connection to the DB in some external module?
Database abstraction has been discussed in detail already. There are advantages as well as disadvantages that come with db abstraction. Dev team has voted not to introduce db abstraction in the cpg1.x.x series.

Varrah

Quote
There's a significant number of Coppermine users who don't fullfill this. Are you asking to drop support for people with older mySQL versions just for the sake of your suggestions?
That's why I've posted here the extract from the phpbb code - that one makes the full version check, charset check, collation check, compatibility check, and then makes all the neccessary modifications to the connection to the DB, my means of version-dependant SQL queries.

You asked me for a solution for the issue I found - I wrote it, as I had it for the case I had, that is also suitable to a number of other differrent situations. In order to make it universal developers should look the code through and make the changes to the code, add a universal version/charset check. I can't do this right now. I could, but I need some more time to examine the code throughly, but I don't have this time right now - I have my job and 5 different sites to administer and develop. I believe, since working in PHP and MySQL is my job profile and all the sites I write now use these technologies, that to make such type of univesal change is not a hard job, especially for the developer that knows the pluses and minuses of encapsulation of DB-access procedures, and so is supposed to know which of the db-connection calls is made for what.

The whole question of arguing about "to set or not to set the connection collation" looks ugly. It's like arguing about "to set or not to set the proper time, when warming a pizza in a microwave oven". From your side it looks like - "Most microwave ovens know the proper time for warming the pizza - just use the 'autocook' functionality and it'll be okie-dokie". What I'm saying is "Time of warming the pizza depends on the size of the pizza, type of the microwave oven and power of the microwaves". See the difference? Then what you asked me for? "What was your solution of warming the pizza?" And I say "In the case of microwaveovens with mechanical timer set, grill and low-power, one shold set 3 minutes of heating, in case if anyone has another type of oven or just want to have a universla recipe - try the unversal recipe I got for the pies warming". And your answer was: "Hey! Not so many ovens has a grill!". Should I continue?

vietnamholiday.com

http://www.vietnamholiday.com - Hanoi Red Tours offers Vietnam Travel and Indochina Travel services including tour packages, travel to vietnam
http://www.thegioidulich.org - du lich nuoc ngoai, trong nuoc, dat phong khach san, ve may bay, ve tau, cho thue xe, visa - ho chieu