Weird characters at the beginning of the utf-8 files?

Started by artefact, September 25, 2004, 01:38:43 AM

artefact

I have noticed that every utf-8 file for each supported language in the lang directory starts with the following characters:

ï»¿

Is that a bug? If not, what is it used for?

Charles.

Tranz

Do you see this with all of them? I sampled some and they all started with
<?php
regardless of language.

artefact

I see this on all the utf-8 versions.
So english.php is fine but english-utf-8.php has the problem.

I am using 1.3.2 but 1.3.1 already had the problem. Doesn't seem to hurt though.

Tranz

I see no weird characters in english-utf-8.php, v. 1.3.2. What are you using to read the files?

artefact

Hmm, that's bizarre.

With Eclipse (PHPEclipse) or Visual SlickEdit I see them, but not with Notepad...
All this under Windows XP in English.
Maybe your editor does not display them  ;)

Look in Hex mode, the file starts with:
0xEF 0xBB 0xBF
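
For anyone who wants to repeat that hex check without a hex editor, here is a minimal Python sketch. It tests for the three bytes and shows why an editor that reads the file as Latin-1 displays them as ï»¿. The helper name `starts_with_utf8_bom` and the sample bytes are made up for illustration; they are not part of Coppermine.

```python
import codecs


def starts_with_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes begin with the UTF-8 byte order mark."""
    # codecs.BOM_UTF8 is the byte sequence EF BB BF.
    return data.startswith(codecs.BOM_UTF8)


# Hypothetical first bytes of a language file saved with a BOM.
sample = b"\xef\xbb\xbf<?php\n"

print(starts_with_utf8_bom(sample))
# Decoding the same three bytes as Latin-1 gives the "weird characters".
print(sample[:3].decode("latin-1"))
```

An editor that understands the BOM consumes it silently, which would explain why some editors show nothing.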

Tranz

I am using Dreamweaver, which should be able to read php files such as the language files.

Casper

They are there in all utf-8 files, and visible when using a plain text editor.

They are normal, no need to worry about it.
It has been a long time now since I did my little bit here, and have done no coding or any other such stuff since. I'm back to being a noob here

Tranz

I guess I have some setting in Dreamweaver that suppresses those characters.

Casper

I must admit, they are not visible in all plain text editors. In fact, Notepad does not show them either, but the editor I use, 1st Page 2000, does.

artefact

Quote from: Casper on September 25, 2004, 09:55:05 AM
They are there in all utf-8 files, and visible when using a plain text editor.

They are normal, no need to worry about it.

Good, thanks for the info.
I came to the same conclusion after reading about the BOM (Byte Order Mark) for Unicode data streams.
http://www.unicode.org/unicode/faq/utf_bom.html#BOM

Charles.

Joel

Quote from: Casper on September 25, 2004, 09:55:05 AM
They are there in all utf-8 files, and visible when using a plain text editor.

They are normal, no need to worry about it.

Do you think so? I don't.
Why would a PHP script have to be marked with a BOM?

IE won't display the characters, but Firefox does, in the upper left corner. After I removed them everything looks fine.
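
Removing the mark can be scripted. The following is a rough Python sketch, not a Coppermine tool: `strip_utf8_bom` and `strip_bom_in_file` are hypothetical helper names, and the second one rewrites the file in place, so try it on a copy first.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 BOM, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_in_file(path: str) -> bool:
    """Rewrite the file without its BOM; return True if the file changed."""
    file_path = Path(path)
    original = file_path.read_bytes()
    cleaned = strip_utf8_bom(original)
    if cleaned != original:
        file_path.write_bytes(cleaned)
        return True
    return False
```

Working on raw bytes keeps the rest of the file exactly as it was, so no re-encoding of the accented text takes place.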

Joel

jasendorf

Quote from: Casper on September 25, 2004, 09:55:05 AM
They are there in all utf-8 files, and visible when using a plain text editor.

They are normal, no need to worry about it.

It does break the XP Publishing Wizard registry file download.
Read the Online DOCs,FAQ, and SEARCH the board BEFORE posting questions for help.

Joachim Müller

John,

good to hear from you again :). There are some issues with XP Publisher that have been reported in the testing/bugs board which are still open (others may have been fixed) - I can't tell for sure, since I don't have Windows XP. Did you check out the CVS to see if those issues have been fixed there? I wouldn't mind releasing a bugfix version cpg1.3.3 if there are enough fixes to make it worth the bother, just let me know (maybe we should discuss this on the dev board though).

@devs: there are a lot of reported bugs in the XP Publisher file. Could someone actually running Windows XP take care of them please?

Joachim

Casper

On the utf-8 characters issue, they appear to be put there during the conversion process.  If they cause problems, let's remove them, if that will not create other problems.
I set my test stable gallery to utf-8, and cannot see these characters on the screen at any point.

Also, my xp_publish works fine when I'm in this mode. But I am using the version I did 2 weeks ago, which fixed a problem for a few people whose systems did not like the file, caused by an empty line in the code.

As far as I'm aware, there are no outstanding issues with the xp_publish file.  There have been reports of users unable to upload to public albums with this, but that has always been the case, as the instructions point out.  This was to prevent flooding.

It would help if John tested the latest version of xp_publish and let us know whether he still has issues. His is the first report I know of that links any xp_publish problem with the use of utf-8 files.

thinduke

I am using Coppermine on Linux with Mozilla. Although I set the language to french-utf-8, and the generated pages do declare the correct charset in the HTML header, Mozilla auto-detects iso-8859-1. The three characters ï»¿ are displayed (they are the first 3 chars in the HTML source), and of course the accented letters are wrong. If I manually set the character encoding in Mozilla to Unicode, the three characters disappear and the accented letters are correct.

I ran "iconv -t=UTF-8" on all UTF-8 lang files, which removed the three characters, but the behaviour did not change: iso-8859-1 is still detected. Does it have something to do with the 3 characters? Are there some other files which should be re-converted?

But this happens only on my computer; on my website at my ISP the pages are correctly detected as UTF-8.

Also, some accented letters are wrong in the french-utf-8 file, I guess someone used a non-unicode text editor...

chtito

Quote from: thinduke on November 25, 2004, 11:20:56 PM
But this happens only on my computer; on my website at my ISP the pages are correctly detected as UTF-8.

There may be one tricky explanation. A user agent looks in three places to determine the encoding to be used:
1) The HTTP header
2) The XML encoding
3) The encoding in the meta tag

The first one encountered supersedes the others. So if your personal web server is configured to serve HTML files as iso-8859-1, the utf-8 pages generated by cpg will be all messed up.

The reason is that Coppermine currently does not override this setting (because it doesn't use the HTTP header to specify the encoding). But if you want cpg to override the server settings, this can easily be done by adding header('Content-Type: text/html; charset=utf-8'); in the pageheader function of the theme.php file that you're using.
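
The precedence described above can be sketched in a few lines of Python. This only models the order given in this post, not any particular browser's actual detection code, and the function name `effective_charset` and its parameters are made up for the example.

```python
from typing import Optional


def effective_charset(
    http_charset: Optional[str],
    xml_encoding: Optional[str],
    meta_charset: Optional[str],
) -> Optional[str]:
    """Pick the first declared encoding, in the order described in the post."""
    # Order: HTTP header, then XML declaration, then the meta tag.
    for candidate in (http_charset, xml_encoding, meta_charset):
        if candidate:
            return candidate.lower()
    return None


# Server forces iso-8859-1 in the HTTP header; the page's meta tag says utf-8.
print(effective_charset("ISO-8859-1", None, "utf-8"))
# No charset in the HTTP header, so the meta tag decides.
print(effective_charset(None, None, "utf-8"))
```

In the first call the server's header wins, which matches the symptom on the personal machine. Sending the header from theme.php, as suggested above, makes the first argument utf-8 as well.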

Hope this helps!

cheers!
You can ask your questions in French on the French-language forum!