SEF_URLs plugin for cpg1.5.x - Page 10

Started by Joachim Müller, March 26, 2007, 06:56:46 PM

flapane

#180
Hello,
actually, Googlebot is crawling weird URLs, like the one I posted above. If it's crawling them, there must be a reason (and they're showing up as thousands of HTTP 404s, as shown in the attached image, even though they look like HTTP 200s in the access log).
http://www.test.com/gallery/displayimage-1-8.htmlalbums/userpics/10001/albums/viaggio-newyork/mostra-cerca-0-12-_Via_Caracciolo_dinverno_.html isn't a URL Googlebot should crawl, because it doesn't exist and seems to be a mixture of 2 URLs.
Thanks
Flapane
www.flapane.com

Gallery
www.gallery.flapane.com

Niecher

#181
The log may be written that way because Googlebot accessed 4 different URLs at the same instant:

http://www.test.com/gallery/displayimage-1-8.html
http://www.test.com/gallery/albums/userpics/10001/
http://www.test.com/gallery/albums/viaggio-newyork/
http://www.test.com/gallery/mostra-cerca-0-12-_Via_Caracciolo_dinverno_.html

If you read what is written there as a single URL and try to access it, it obviously does not exist.

Regards.

[Edit André: replaced URL by request]

flapane

I already asked my web host about that some time ago, and they confirmed that the log shows only one access per line (in fact there are several simultaneous accesses from "real" users and other bots, and each URL is displayed on its own line). Also, this wouldn't explain all those errors in Google Webmaster Tools that I posted in the previous message.
I don't know what Google changed in the Googlebot code, why it's happening, or how to fix it, because it never happened until mid-August.
Thanks

Niecher

It may be as you say.

For a long time now I have ignored anything to do with Google.

Regards.

flapane

Some consider that the wisest choice. ;)
Regards

Walkinman

flapane

I had exactly the same problem with this plugin .. http://forum.coppermine-gallery.net/index.php?topic=74808.0

The major error is that the rewrite allows for any text or characters in the url, and doesn't end anything.. Test it. You can basically write your correct url and add whatever you want in the url after the par # and it'll come up fine.

replace domain.com with skolaiimages.com

domain.com/stock/thumbnails-2-Camping-photos.html

correct.

Change it to, for example:

domain.com/stock/thumbnails-2-Caing-photos.htmltesta

shows the same page. Google would then go and try to find every url on the site with that kind of line as a prefix. I had thousands of 404s, and ended up having server usage issues, and on and on.
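
To illustrate the point about patterns that do not end anywhere, here is a minimal PHP sketch. The plugin's real rewrite rules are not shown in this thread, so both patterns below are hypothetical; they only assume that the live rule has no end anchor.

```php
<?php
// Hypothetical patterns for illustration only; the plugin's real rewrite
// rules are not shown in this thread.
const LOOSE_PATTERN  = '/^thumbnails-([0-9]+)/';                          // no end anchor
const STRICT_PATTERN = '/^thumbnails-([0-9]+)(-[A-Za-z0-9_-]*)?\.html$/'; // must end in .html

// Returns true when $path would be routed by the given pattern.
function route_matches(string $pattern, string $path): bool
{
    return preg_match($pattern, $path) === 1;
}

$paths = [
    'thumbnails-2-Camping-photos.html',    // the real page
    'thumbnails-2-Caing-photos.htmltesta', // mangled title plus trailing junk
];

foreach ($paths as $path) {
    printf(
        "%-38s loose=%s strict=%s\n",
        $path,
        var_export(route_matches(LOOSE_PATTERN, $path), true),
        var_export(route_matches(STRICT_PATTERN, $path), true)
    );
}
```

The anchored pattern only rejects text after .html. A URL with a wrong title but a correct album number would still match, so rejecting those would need a lookup against the real album title.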

Another issue is that the plugin doesn't actually rewrite anything ... you can still go to your dynamic cpg urls and browse the albums that way. Which is what should happen.

I really wish someone would fix this so it works correctly. I'd definitely not recommend the plugin to anyone. I'm somewhat stuck with it now, as I have links everywhere to the 'seo friendly' urls, and on and on .. but I can assure you the plugin is most definitely NOT seo friendly. My site continues to go down on google rankings, and I'm about 95% sure all of the problems lie with this plugin.

This page here (static html page)
domain.com/alaska/wrangell-st-elias-photos.html

WAY outranks this one (cpg)
domain.com/stock/thumbnails-25-Wrangell-St-Elias-National-Park-Photos.html

I've also tested with wordpress

This page:
domain.com/alaska-polar-bear-photos/

WAYYYYY outranks this one
domain.com/stock/thumbnails-122-Polar-Bear-Photos.html

Yet that cpg page is older, has many more images, and so on. Both are in the site map, linked to correctly, and so on. The wordpress galleries outrank the cpg pages on my site virtually every time.

My site was loved by google before these kinds of problems started up. It sees double content,  and bad urls and so on and this hurts the site.

Why not a plugin that allows the user to set a url path when s/he creates a gallery? And then it's set, with any incorrect url returning a 404 (which is what should happen). It could even allow full size images to be manually set as urls as well, but really, I think for most folks, just rewriting the main album is what's most important.

One of the reasons this stuff happens a lot now is because google crawls and looks for any url it finds, even truncated urls that are not actual links. Many site aggregators now truncate the text of a url, and so google will look for yoursite.com/albums/whatever-the-nam.... instead of whatever-the-name-of-the-page-is.html

And with this plugin, those do not return 404s, but google crawls them, and sees duplicate content.

I'm not a coder, so I can't begin to fix this. But here are a couple of rules I wrote to try to solve the problem through the back door, and return 410s (no impact on server load)


ErrorDocument 410 "File no longer exists"
This was suggested but didn't fix it.
# RewriteRule ^(.*)\.htmlhttp\:\/ - [G,L]

I believe this below is what actually fixed it
RedirectMatch 410 ^/stock/thumbnails-18-Whitetail-Deer-Photos.htmlhttp:/

And here are 2 others I added
RedirectMatch 410 ^/stock/thumbnails-53-Alaska-Stock-Phot/
RedirectMatch 410 ^/stock/thumbnails-106-Juvenile-Bald-Eagle-Photos.html/

I'll reiterate, to anyone wanting to use the plugin, I advise against it.

Cheers

Carl

flapane

Quote from: Walkinman on November 13, 2013, 08:08:09 AM
The major error is that the rewrite allows for any text or characters in the url, and doesn't end anything.. Test it. You can basically write your correct url and add whatever you want in the url after the par # and it'll come up fine.

Another issue is that the plugin doesn't actually rewrite anything ... you can still go to your dynamic cpg urls and browse the albums that way. Which is what should happen.

Hello Carl,
yes, I noticed that.

Quote from: Walkinman on November 13, 2013, 08:08:09 AM
My site continues to go down on google rankings, and I'm about 95% sure all of the problems lie with this plugin.

My site was loved by google before these kinds of problems started up. It sees double content,  and bad urls and so on and this hurts the site.

Don't tell me... the number of visits to my gallery has dropped almost to zero since I first noticed the problem.

I'll try to modify and test the .htaccess rules you proposed and will report the changes.

In my humble opinion, a modern gallery should (well, in theory it "must") offer the option to flawlessly rewrite URLs (more or less the same way WP does); nowadays it's a crucial feature.
Unfortunately, I don't have the skills to investigate the problems with this plugin any further, and the development of coppermine has been really slow in recent years (mainly because of devs dropping out of the project). That's a pity, because I have loved cpg since v1.3.
If I find a way of batch converting every photo/title/description into another open source gallery which handles URL rewrites, I may sadly consider doing it.

Thanks,
Flavio

fekasatete

Hello,
I'm a new user. I installed the plugin, and when I open an image I get this URL: http://www.ldft.fr/photo-6.html. I would like to know how to replace photo-6.html with name-of-the-photo.html.

I apologize for my language; I'm French and use Google Translate.

flapane

In codebase.php you may want to comment out the if statement on lines 198, 199 and 207, so that the umlaut transliterations work with every language (just in case you want to use Germanic words).
I noticed that I was getting "Rmerberg" instead of "Roemerberg".
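
For reference, a minimal sketch of an umlaut transliteration that runs regardless of the language setting. The helper name sef_transliterate is hypothetical and this is not the plugin's actual code; it also assumes that the source file and the titles are UTF-8 encoded.

```php
<?php
// Hypothetical helper, not the plugin's actual code.
// Assumes this file and the input string are UTF-8 encoded.
function sef_transliterate(string $title): string
{
    $map = [
        'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue',
        'Ä' => 'Ae', 'Ö' => 'Oe', 'Ü' => 'Ue',
        'ß' => 'ss',
    ];
    // Expand umlauts first so they are not simply dropped later.
    $title = strtr($title, $map);
    // Replace every run of remaining non-alphanumeric bytes with one hyphen.
    $title = preg_replace('/[^A-Za-z0-9]+/', '-', $title);
    return trim($title, '-');
}

echo sef_transliterate('Römerberg'), "\n"; // prints "Roemerberg"
```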

fekasatete


Sorry, I don't understand, because in my file the lines you mentioned are:

198 :         </span>
199 : there is nothing on this line
207 : there is nothing on this line

Here is my codebase.php. What should I change?

<?php
/**************************************************
  Coppermine 1.5.x Plugin - sef_urls
  *************************************************
  Copyright (c) 2003-2007 Coppermine Dev Team
  *************************************************
  This program is free software; you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation; either version 3 of the License, or
  (at your option) any later version.
  ********************************************
  $HeadURL: https://coppermine.svn.sourceforge.net/svnroot/coppermine/branches/cpg1.5.x/plugins/sef_urls/codebase.php $
  $Revision: 7208 $
  $LastChangedBy: timoswelt $
  $Date: 2010-02-06 11:07:11 +0100 (Sa, 06 Feb 2010) $
  **************************************************/

if (!defined('IN_COPPERMINE')) { die('Not in Coppermine...');}

// Add plugin_install action
$thisplugin->add_action('plugin_install','sef_urls_install');

// Add plugin_uninstall action
$thisplugin->add_action('plugin_uninstall','sef_urls_uninstall');

// Add plugin_configure action
$thisplugin->add_action('plugin_configure','sef_urls_configure');

// Add plugin_cleanup action
$thisplugin->add_action('plugin_cleanup','sef_urls_cleanup');

// Add page_html filter
$thisplugin->add_filter('page_html','sef_urls_convert');


/**
 * Convert urls to search-engine friendly (SEF) urls
 */
function sef_urls_convert($html) {

    $sef_language = 'french';

    // Language translation
    if ($sef_language == 'german')
    {
        $str_thumbnails = 'uebersicht';
        $str_displayimage = 'bild';
        $str_toprated = 'beste';
        $str_topn = 'beliebteste';
        $str_lastcomby = 'kommentiertvon';
        $str_lastcom = 'kommentierte';
        $str_page = 'seite';
        $str_profile = 'benutzer';
        $str_lastupby = 'neuestevon';
        $str_lastup = 'neueste';
        $str_search = 'suche';
        $str_contact = 'kontakt';
        $str_tdm = 'oben';
        $str_usermgr = 'benutzerliste';
    }
    else if ($sef_language == 'french')
    {
        $str_thumbnails = 'apercu';
        $str_displayimage = 'photo';
        $str_toprated = 'tresbien';
        $str_topn = 'populaire';
        $str_lastcomby = 'comentairede';
        $str_lastcom = 'comentaire';
        $str_page = 'page';
        $str_profile = 'client';
        $str_lastupby = 'neuvede';
        $str_lastup = 'neuve';
        $str_search = 'recherche';
        $str_contact = 'contacter';
        $str_tdm = 'enhaut';
        $str_usermgr = 'fichierclient';
    }
    else if ($sef_language == 'spanish')
    {
        $str_thumbnails = 'resumen';
        $str_displayimage = 'cromo';
        $str_toprated = 'mejor';
        $str_topn = 'querido';
        $str_lastcomby = 'comentarde';
        $str_lastcom = 'comentar';
        $str_page = 'pagina';
        $str_profile = 'usuario';
        $str_lastupby = 'nuevode';
        $str_lastup = 'nuevo';
        $str_search = 'busca';
        $str_contact = 'contacto';
        $str_tdm = 'alto';
        $str_usermgr = 'usuariolistado';
    }
    else
    {
        $str_thumbnails = 'thumbnails';
        $str_displayimage = 'displayimage';
        $str_toprated = 'toprated';
        $str_topn = 'topn';
        $str_lastcomby = 'lastcomby';
        $str_lastcom = 'lastcom';
        $str_page = 'page';
        $str_profile = 'profile';
        $str_lastupby = 'lastupby';
        $str_lastup = 'lastup';
        $str_search = 'search';
        $str_contact = 'contact';
        $str_tdm = 'top_display_media';
        $str_usermgr = 'usermgr';
    }

    
    // Rewrite usermgr.php
    $html = preg_replace('/usermgr\.php\?page=([0-9]+)/i',$str_usermgr.'-'.$str_page.'-$1.html',$html);
    $html = preg_replace('/usermgr\.php/i',$str_usermgr.'.html',$html);

    // Rewrite index.php
    $html = preg_replace('/index\.php\?cat=([0-9]+)(\&|\&amp;)page=([0-9]+)/i','index-$1-'.$str_page.'-$3.html',$html);
    $html = preg_replace('/index\.php\?cat=0/i','index.html',$html);
    $html = preg_replace('/index\.php\?cat=([0-9]+)/i','index-$1.html',$html);
    $html = preg_replace('/index\.php/i','index.html',$html);

    // Rewrite thumbnails.php
    $html = preg_replace('/thumbnails\.php\?album=lastupby(\&|\&amp;)uid=([0-9]+)/i',$str_thumbnails.'-'.$str_lastupby.'-$2.html',$html);
    $html = preg_replace('/thumbnails\.php\?album=lastcomby(\&|\&amp;)uid=([0-9]+)/i',$str_thumbnails.'-'.$str_lastcomby.'-$2.html',$html);
    $html = preg_replace('/thumbnails\.php\?album=([a-z0-9]+)(\&|\&amp;)cat=([\-0-9]+)(\&|\&amp;)page=([0-9]+)/i',$str_thumbnails.'-$1-$3-'.$str_page.'-$5.html',$html);
    $html = preg_replace('/thumbnails\.php\?album=([a-z0-9]+)(\&|\&amp;)cat=([\-0-9]+)/i',$str_thumbnails.'-$1-$3.html',$html);
    $html = preg_replace('/thumbnails\.php\?album=([a-z0-9]+)(\&|\&amp;)page=([0-9]+)/i',$str_thumbnails.'-$1-'.$str_page.'-$3.html',$html);
    $html = preg_replace('/thumbnails\.php\?search=([^"]+)(\&|\&amp;)album=search/i',$str_thumbnails.'-'.$str_search.'-$1.html',$html);
    $html = preg_replace('/thumbnails\.php\?album=search(\&|\&amp;)search=([^"]+)/i',$str_thumbnails.'-'.$str_search.'-$2.html',$html);
    $html = preg_replace('/thumbnails\.php\?album=([a-z0-9]+)/i',$str_thumbnails.'-$1.html',$html);

    // Rewrite displayimage.php
    $html = preg_replace('/displayimage\.php\?album=lastcom(\&|\&amp;)cat=([\-0-9]+)(\&|\&amp;)pid=([\-0-9]+)(\&|\&amp;)msg_id=([\-0-9]+)(\&|\&amp;)page=([\-0-9]+)/i',$str_displayimage.'-lastcom-$2-$4-$6-'.$str_page.'-$8.html',$html);
    $html = preg_replace('/displayimage\.php\?album=lastcomby(\&|\&amp;)cat=([\-0-9]+)(\&|\&amp;)pid=([\-0-9]+)(\&|\&amp;)uid=([\-0-9]+)(\&|\&amp;)msg_id=([\-0-9]+)(\&|\&amp;)page=([\-0-9]+)/i',$str_displayimage.'-'.$str_lastcomby.'-$2-$4-$6-$8-$10.html',$html);
    $html = preg_replace('/displayimage\.php\?album=lastupby(\&|\&amp;)cat=([\-0-9]+)(\&|\&amp;)pid=([\-0-9]+)(\&|\&amp;)uid=([\-0-9]+)/i',$str_displayimage.'-'.$str_lastupby.'-$2-$4-$6.html',$html);
    $html = preg_replace('/displayimage\.php\?album=([a-z0-9]+)(\&|\&amp;)cat=([\-0-9]+)(\&|\&amp;)pid=([\-0-9]+)/i',$str_displayimage.'-$1-$3-$5.html',$html);
    $html = preg_replace('/displayimage\.php\?album=search(\&|\&amp;)pid=([\-0-9]+)/i',$str_displayimage.'-'.$str_search.'-$2.html',$html);
    $html = preg_replace('/displayimage\.php\?album=([a-z0-9]+)(\&|\&amp;)pid=([\-0-9]+)/i',$str_displayimage.'-$1-$3.html',$html);
    $html = preg_replace('/displayimage\.php\?pid=([0-9]+)/i',$str_displayimage.'-$1.html',$html);

    // Rewrite profile.php
    $html = preg_replace('/profile\.php\?uid=([0-9]+)/i',$str_profile.'-$1.html',$html);
    $html = preg_replace('/profile\.php\?op=([a-z0-9_]+)/i',$str_profile.'-op-$1.html',$html);

    // language specific replacements
    if ($sef_language != 'french')
    {
        $html = preg_replace('/-toprated/i','-'.$str_toprated,$html);
        $html = preg_replace('/-topn/i','-'.$str_topn,$html);
        $html = preg_replace('/-lastcom/i','-'.$str_lastcom,$html);
        $html = preg_replace('/-lastup/i','-'.$str_lastup,$html);
        $html = preg_replace('/top_display_media/i',$str_tdm,$html);
        $html = preg_replace('/'.$str_displayimage.'-search-/i',$str_displayimage.'-'.$str_search.'-',$html);
    }

    // contact and search.php
    $html = preg_replace('/contact.php/i',$str_contact.'.html',$html);
    $html = preg_replace('/search.php/i',$str_search.'.html',$html);

    // albums in cat=0
    $html = preg_replace('/'.$str_thumbnails.'-([a-z0-9]+)-0\.html/i',$str_thumbnails.'-$1.html',$html);

    // Return modified HTML
    return $html;
}


/**
 * Configure plugin for install
 */
function sef_urls_configure($action) {
    global $thisplugin;

    if ($action===1) {
        $code = implode('',file($thisplugin->fullpath.'/ht.txt'));
        echo <<< EOT
    <form name="cpgform" id="cpgform" action="{$_SERVER['REQUEST_URI']}" method="post">
        <p>
            You already have a .htaccess file in your root Coppermine folder.<br />
            Is it ok to overwrite it?
        </p>
        <div style="margin:25;">
        <table border="0" cellspacing="0" cellpadding="0">
            <tr>
                <td><input type="radio" name="create" value="1" /></td>
                <td>Yes</td>
            </tr>
            <tr>
                <td><input type="radio" name="create" checked="checked" value="0" /></td>
                <td>No</td>
            </tr>
        </table>
        </div>
        <span>
           <input type="submit" name="submit" value="Submit" /> &nbsp;&nbsp;&nbsp;
            <input type="button" name="cancel" onClick="window.location='pluginmgr.php';" value="Cancel" />
        </span>

        <p>&nbsp;</p>

        <p style="color:red;font-weight:bold;">STOP! READ THE FOLLOWING!</p>

        <p>
            If you don't want your .htaccess file to be overwritten, you'll have to insert the following code:
        </p>

        <div align="right">
            <a class="link" href="{$thisplugin->fullpath}/ht.txt" target="_blank">Open in a seperate window</a>
        </div>
        <pre style="border:1;border-color:black;background-color:white;font-family:fixedsys;">
            $code
        </pre>
    </form>
EOT;
    }
}


/**
 * Display cleanup options for uninstall
 */
function sef_urls_cleanup($action) {
    if ($action===1) {
        echo <<< EOT
    <form name="cpgform" id="cpgform" action="{$_SERVER['REQUEST_URI']}" method="post">
        <p>
            Delete the .htaccess file in your Coppermine root? (If this file was created by this plugin,
            It's ok to delete it.)
        </p>
        <div style="margin:25;">
        <table border="0" cellspacing="0" cellpadding="0">
            <tr>
                <td><input type="radio" name="delete" value="1" /></td>
                <td>Yes</td>
            </tr>
            <tr>
                <td><input type="radio" name="delete" checked="checked" value="0" /></td>
                <td>No</td>
            </tr>
        </table>
        </div>
        <span>
           <input type="submit" name="submit" value="Submit" /> &nbsp;&nbsp;&nbsp;
            <input type="button" name="cancel" onClick="window.location='pluginmgr.php';" value="Cancel" />
        </span>
    </form>
EOT;
    }
}


/**
 * Install the plugin'
 */
function sef_urls_install() {
    global $thisplugin;
    $sef_language = 'french';
    $sef_superCage = Inspekt::makeSuperCage();
    if ($sef_superCage->post->keyExists('create'))
    {
      $create = $sef_superCage->post->getInt('create');
    }

    // There's no .htaccess file or user has clicked 'yes' on the create form
    if (!file_exists('.htaccess') || $create) {
        copy($thisplugin->fullpath.'/ht-'.$sef_language.'.txt','.htaccess');
        return true;

    // An htaccess file exists; display the configure form
    } elseif (!isset($create)) {
        return 1;

    // User has clicked 'no' on the configure form. Install plugin. Don't create .htaccess file
    } else {
        return true;
    }
}


/**
 * Uninstall the plugin
 */
function sef_urls_uninstall() {
    global $thisplugin;

    $sef_superCage = Inspekt::makeSuperCage();
    if ($sef_superCage->post->keyExists('delete'))
    {
      $delete = $sef_superCage->post->getInt('delete');
    }

    // There's an .htaccess file and user has clicked 'yes' on the cleanup form; delete the .htaccess file
    if (file_exists('.htaccess') && $delete) {
        unlink('.htaccess');
        return true;

    // An .htaccess file exists; display the cleanup form
    } elseif (file_exists('.htaccess') && !isset($delete)) {
        return 1;

    // User has clicked 'no' on the cleanup form. Uninstall plugin. Don't delete '.htaccess' file
    } else {
        return true;
    }
}
?>

flapane

Sorry for the misunderstanding, I wasn't referring to your problem. It was just general advice.
By the way, the lines are:
198-199
            if ($sef_language == 'german') // my changes for transliterations
            {


207
}

fekasatete

please upload your codebase.php

flapane

It comes from the latest SVN version linked in the first post of the thread.

fekasatete


Yes, I downloaded version 1.8, but apparently it is not the same as yours, because I cannot get name-of-the-image.html at the end of my URL.

See my website: http://www.ldft.fr

For example, for this picture I get this URL:

http://www.ldft.fr/displayimage-22.html

but I would like the title of the picture at the end of the URL instead of displayimage-22.html.


fekasatete

In the end I managed to get everything working. I copied the files from this URL:
http://sourceforge.net/p/coppermine/code/HEAD/tree/branches/cpg1.5.x/plugins/sef_urls/

and updated my plugin by changing two lines in codebase.php, replacing

$sef_language = 'english';

with my language:

$sef_language = 'french';

Everything works great now. Thank you for the help anyway.

fekasatete

I still have a problem: when I upload an image to an album, at the end of the upload I get this message:

albums/userpics/10001/thumb_shot_001.jpg

But when I return to the home page after getting this message, the image has been uploaded.

How can I fix the problem?

fekasatete

Sorry, the full message is:

Download the failed
albums/userpics/10001/thumb_shot_001.jpg

SimonG

Sorry for bumping a topic that has been inactive for over 120 days; I just want to underline the importance of this plugin.

There are two kinds of coppermine users: those who run private galleries and those who have public galleries.
For the latter group, it is very important to have meaningful URLs for their gallery. Not only because of Google rankings, but also because people share links on social networks, and it's kinda creepy to open "blind" links like gallery.com/displayimage822-7281.html instead of gallery.com/animals/cute_dogs/happypuppy-8892.html

So it would be very useful to keep the development of this plugin active. I wish I could support the developer in some way. My coding skills are sadly crap. Would a donation be helpful?

flapane

Please check that Googlebot and Bingbot are not crawling broken random URLs, as happened to me.
Something like:
66.249.66.62 - - [06/Oct/2013:22:05:04 +0200] "GET /gallery/displayimage-1-8.htmlalbums/userpics/10001/albums/viaggio-newyork/mostra-cerca-0-12-_Via_Caracciolo_dinverno_.html HTTP/1.1" 200 5435 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

It's taking months to recover from this issue (over 100,000 broken links), with search engine spiders finally starting to remove those broken links from their databases.
Almost all of my images had been removed from Google. It's a pity, because the plugin, in theory, is great, but I had to remove it.
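
One way to run that check is a small script over the access log. This is a minimal sketch, assuming the combined log format shown above; the function names are hypothetical, and it simply flags any request path in which .html is followed by more text.

```php
<?php
// Hypothetical helpers for illustration; assumes combined log format.

// Extracts the request path (without query string) from one log line,
// or returns null when the line has no recognisable request.
function request_path(string $logLine): ?string
{
    if (preg_match('/"(?:GET|HEAD|POST) (\S+) HTTP\/[0-9.]+"/', $logLine, $m) !== 1) {
        return null;
    }
    return explode('?', $m[1], 2)[0];
}

// True when ".html" appears in the path with further text after it,
// which is the signature of two URLs fused together.
function is_fused_request(string $logLine): bool
{
    $path = request_path($logLine);
    return $path !== null && preg_match('/\.html./', $path) === 1;
}

$sample = '66.249.66.62 - - [06/Oct/2013:22:05:04 +0200] "GET /gallery/displayimage-1-8.htmlalbums/userpics/10001/albums/viaggio-newyork/mostra-cerca-0-12-_Via_Caracciolo_dinverno_.html HTTP/1.1" 200 5435 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"';

var_dump(is_fused_request($sample)); // bool(true)
```

To scan a whole log, the same function can be called on each line returned by file() for your own log path.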