URL RegExp and Russian letters
 


Started by kvv213, October 05, 2005, 01:56:15 PM


kvv213

I have been using CPG for a year and it looks great. But I found that it doesn't handle Russian names as part of URLs. For example, take the URL http://ru.wikipedia.org/wiki/Ленин. It leads to a web site and works fine in a browser.
But if I use the following construction in Coppermine (Lenin), it doesn't work. It shows the link and the [url] construction as plain text. I think this is because the RegExp pattern defined for [url] constructions has no rule that allows Russian characters in the URL.
The line with the pattern looks like this:
             $patterns[3] = "#\[url=([a-z]+?://){1}([a-z0-9\-\.,\?!%\*_\#:;~\\&$@\/=\+\(\)]+)\](.*?)\[/url\]#si";

I don't know which characters to add, or where, to fix this behaviour.
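
For illustration, here is a minimal sketch of why the match fails and how the character class could be widened. It assumes the full pattern opens with `\[url=([a-z]+?://){1}`, that the text is UTF-8, and that the names `$ascii`, `$strict`, `$relaxed` and `$text` are hypothetical. The `u` modifier and the `\x{0400}-\x{04FF}` range (the basic Cyrillic block) are additions for this example, not part of Coppermine.

```php
<?php
// Character class from the quoted pattern: ASCII letters, digits and URL punctuation only.
$ascii = 'a-z0-9\-\.,\?!%\*_\#:;~\\&$@\/=\+\(\)';

// Assumed opening "\[url=([a-z]+?://){1}"; the rest follows the quoted pattern.
$strict = '#\[url=([a-z]+?://){1}([' . $ascii . ']+)\](.*?)\[/url\]#si';

// Hypothetical variant: also accept the basic Cyrillic block (U+0400..U+04FF).
// The "u" modifier makes PCRE treat both pattern and subject as UTF-8.
$relaxed = '#\[url=([a-z]+?://){1}([' . $ascii . '\x{0400}-\x{04FF}]+)\](.*?)\[/url\]#siu';

$text = '[url=http://ru.wikipedia.org/wiki/Ленин]Lenin[/url]';

var_dump(preg_match($strict, $text));   // int(0): "Ленин" is outside the ASCII class
var_dump(preg_match($relaxed, $text));  // int(1)

// Replacement in the spirit of a BBCode decoder; the HTML template is illustrative only.
echo preg_replace($relaxed, '<a href="$1$2">$3</a>', $text), "\n";
```

Widening the class only changes what the parser accepts. Anything that gets through would still need the usual output escaping before it is written into a page.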

Joachim Müller

You're right - Coppermine wasn't built to allow non-Latin characters in URLs. There are validity checks in several places. I'm aware that there has been some hype around internationalized URLs, and although I understand the need for i18n, I'm reluctant to allow them even in future versions, as it makes hardening Coppermine against XSS very hard. Not sure what to recommend.

kvv213

I managed to fix that. I changed the URL itself :)
Now it looks like http://ru.wikipedia.org/w/index.php?title=%D0%9B%D0%B5%D0%BD%D0%B8%D0%BD%2C_%D0%92%D0%BB%D0%B0%D0%B4%D0%B8%D0%BC%D0%B8%D1%80_%D0%98%D0%BB%D1%8C%D0%B8%D1%87&oldid=442956
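
The workaround above amounts to percent-encoding the non-ASCII part of the URL, so that only characters the [url] pattern already accepts remain. A minimal sketch, assuming a UTF-8 string and using PHP's `rawurlencode()`; the variable names and the `/wiki/` form of the address are chosen for illustration:

```php
<?php
// Percent-encode only the page title, not the whole URL, so "://" and "/" stay intact.
$title = 'Ленин';   // assumed to be UTF-8
$url   = 'http://ru.wikipedia.org/wiki/' . rawurlencode($title);

echo $url, "\n";
// http://ru.wikipedia.org/wiki/%D0%9B%D0%B5%D0%BD%D0%B8%D0%BD

// The encoded form is pure ASCII, so it fits inside a [url=...]...[/url] construction.
echo '[url=' . $url . ']Lenin[/url]', "\n";
```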