coppermine-gallery.com/forum

Support => Older/other versions => cpg1.3.x Support => Topic started by: kvv213 on October 05, 2005, 01:56:15 PM

Title: URL RegExp and Russian letters
Post by: kvv213 on October 05, 2005, 01:56:15 PM
I have used CPG for a year and it works great. But I found that it doesn't handle Russian names as part of URLs. For example, there is this URL: http://ru.wikipedia.org/wiki/Ленин. It leads to a web site and works fine in a browser.
But if I use the following construction in Coppermine ([url=http://ru.wikipedia.org/wiki/Ленин]Lenin[/url]), it doesn't work. It shows the link and the [url] construction as plain text. I think this is because the RegExp pattern defined for [url] constructions has no rules that allow Russian characters in the URL.
The line with the pattern looks like this:
             $patterns[3] = "#\[url=([a-z]+?://){1}([a-z0-9\-\.,\?!%\*_\#:;~\\&$@\/=\+\(\)]+)\](.*?)\[/url\]#si";

I don't know where I would have to add characters to fix this behaviour.
Title: Re: URL RegExp and Russian letters
Post by: Joachim Müller on October 06, 2005, 08:19:48 AM
You're right - Coppermine hasn't been built to allow non-Latin chars in the URL. There are validity checks in several places. I'm aware that there has been some hype around such stuff, but although I understand the need for i18n I'm reluctant to allow it even in future versions, as it makes hardening Coppermine against XSS very hard. Not sure what to recommend.
Title: Re: URL RegExp and Russian letters
Post by: kvv213 on October 06, 2005, 11:50:35 AM
I managed to fix that. I changed the URL itself :)
Now it looks like http://ru.wikipedia.org/w/index.php?title=%D0%9B%D0%B5%D0%BD%D0%B8%D0%BD%2C_%D0%92%D0%BB%D0%B0%D0%B4%D0%B8%D0%BC%D0%B8%D1%80_%D0%98%D0%BB%D1%8C%D0%B8%D1%87&oldid=442956
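For anyone who finds this thread later, here is a rough sketch of why the percent-encoded URL gets through: after encoding, the URL contains only "%", digits and the letters A-F, and all of those are already in the character class of the pattern. The pattern below is my reconstruction of the line quoted in the first post (I assume it starts with \[url=([a-z]+?://){1}), and make_url_tag() is a made-up helper for illustration, not something that exists in Coppermine. It also assumes the script file is saved as UTF-8.

```php
<?php
// Assumed reconstruction of the [url=...] pattern quoted in the first post.
$pattern = '#\[url=([a-z]+?://){1}([a-z0-9\-\.,\?!%\*_\#:;~\\&$@\/=\+\(\)]+)\](.*?)\[/url\]#si';

// Hypothetical helper (not part of Coppermine): percent-encode a single
// path segment with rawurlencode() and wrap the result in a [url] tag.
function make_url_tag($base, $segment, $label)
{
    return '[url=' . $base . rawurlencode($segment) . ']' . $label . '[/url]';
}

$raw     = '[url=http://ru.wikipedia.org/wiki/Ленин]Lenin[/url]';
$encoded = make_url_tag('http://ru.wikipedia.org/wiki/', 'Ленин', 'Lenin');

// The raw Cyrillic bytes are outside the character class, so nothing matches.
$raw_matches = preg_match($pattern, $raw);

// The percent-encoded form uses only characters the class already allows.
$encoded_matches = preg_match($pattern, $encoded);

echo $encoded, "\n";
```

Only the path segment is passed to rawurlencode() here. Encoding the whole URL would also escape the "://" and "/" parts, and then the first group of the pattern would no longer match.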