How to prevent entire site downloads
 


Started by Sogeri, October 02, 2003, 06:22:51 AM

Sogeri

My site http://www.orchidspng.com gets an average of 25,000 page hits a day, peaking on some days at over 100,000. My web stats suggest that some people are downloading the entire site. Yesterday's traffic was over 1 GB, and that costs money. It is a hobby site.

Other than restricting access to registered users, throttling (I find it hard to guess what a reasonable number of hits per hour/day would be), or .htaccess, is there any other method to block a single IP from accessing the site too often within a given time frame?

hyperion

Yes, but it could get complicated. Basically, you store each visitor's IP address along with a timestamp and delete entries once they are older than a certain time. You then count how many times an IP address appears in the list (or increment a counter, etc.) and redirect to an explanation page when the count exceeds the allowed number of hits in the time frame. You put the call to the function at the beginning of every page by placing it in the theme.php file.
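
The bookkeeping described here can be sketched roughly as follows. This is only an illustration of the logic in Python, not Coppermine code: the class name, the limits, and the in-memory storage are all assumptions. A real version called from theme.php would be written in PHP and would need to keep the hits in a database table or file, because nothing stays in memory between page requests.

```python
from collections import defaultdict, deque


class HitLimiter:
    """Per-IP hit counter over a sliding time window (hypothetical helper)."""

    def __init__(self, max_hits, window_seconds):
        self.max_hits = max_hits              # assumed limit, tune from your own stats
        self.window_seconds = window_seconds  # assumed time frame
        self._hits = defaultdict(deque)       # IP address -> timestamps of recent hits

    def allow(self, ip, now):
        """Record a hit at time `now` (seconds) and report whether to serve the page."""
        stamps = self._hits[ip]
        # Delete entries that have aged out of the time frame.
        while stamps and now - stamps[0] >= self.window_seconds:
            stamps.popleft()
        # Blocked hits are recorded too, so a client that keeps hammering stays blocked.
        stamps.append(now)
        return len(stamps) <= self.max_hits


# Example: at most 3 hits per 60 seconds for each IP.
limiter = HitLimiter(max_hits=3, window_seconds=60)
for second in range(5):
    served = limiter.allow("192.0.2.10", now=second)
    print(second, "serve page" if served else "redirect to explanation page")
```

The fourth and fifth hits inside the same minute get the redirect. Once the old entries age out of the window, the same IP is served again.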

Some of those downloaders might be spiders or robots that obey commands.  Use meta tags and robot files to try and keep them under control.
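
As a rough illustration of the robots file idea, the sketch below holds a hypothetical robots.txt in a string and uses Python's standard urllib.robotparser to show how a well-behaved crawler would read it. The rules, the /albums/ path, and the example.com URLs are assumptions for illustration only. A downloader that ignores robots.txt is not affected by any of this.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: bar one named downloader from everything and
# keep all other robots out of the albums directory (path is an assumption).
ROBOTS_TXT = """\
User-agent: Teleport
Disallow: /

User-agent: *
Disallow: /albums/
"""


def may_fetch(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if a robot that obeys robots.txt would request `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(may_fetch("Teleport Pro/1.29", "http://www.example.com/index.php"))
print(may_fetch("Googlebot/2.1", "http://www.example.com/albums/pic.jpg"))
print(may_fetch("Googlebot/2.1", "http://www.example.com/index.php"))
```

Only the last request is allowed: the named downloader is shut out of the whole site, and every other obedient robot is kept out of the albums directory while still being able to fetch the regular pages.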

Great orchid shots, BTW. :)
"Then, Fletch," that bright creature said to him, and the voice was very kind, "let's begin with level flight . . . ."

-Richard Bach, Jonathan Livingston Seagull


Jim

webmasterworld thread is for members only :(
Quadra Hosting

Sogeri

:D  Thanks for that. I will upload the .htaccess file as suggested.

gtroll

Here you go, Jim. These are the contents of the post there:
Quote
# From toolman of webmasterworld
<Files .htaccess>
deny from all
</Files>
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [OR]
RewriteCond %{HTTP_USER_AGENT} ^Crescent [OR]
RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [OR]
RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebEMailExtrac.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus.*Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^LinkWalker [OR]
RewriteCond %{HTTP_USER_AGENT} ^sitecheck.internetseer.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^DIIbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^psbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector
RewriteRule ^.* - [F]
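# Note: a RewriteRule pattern is tested against the URL path, never the full
# http://... URL, so the negated pattern below is true for practically every request.
# As written, this pair returns 403 whenever the Referer is just http://www.your-site.com
RewriteCond %{HTTP_REFERER} ^http://www.your-site.com$
RewriteRule !^http://[^/.]\.your-site.com.* - [F]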

Tarique Sani

Don't want to rain on the parade, but spoofing of USER_AGENT is built into most new URL fetchers. I guess the correct way is to have Apache configured using mod_throttle or mod_bandwidth.
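
To make the spoofing point concrete, here is a small Python sketch that applies a few of the patterns from the .htaccess rules quoted earlier in the thread to two User-Agent strings. The function name and the sample strings are assumptions, and Python's re module stands in for Apache's regex engine, which behaves the same way for patterns this simple.

```python
import re

# A few of the User-Agent patterns from the quoted .htaccess rules.
BLOCKED_AGENTS = [r"^Teleport", r"^Wget", r"^EmailSiphon"]


def is_blocked(user_agent):
    """Return True if any blocked pattern matches the User-Agent header."""
    return any(re.search(pattern, user_agent) for pattern in BLOCKED_AGENTS)


# Wget announcing itself honestly is caught by the ^Wget pattern.
print(is_blocked("Wget/1.8.2"))

# The same tool run with its --user-agent option set to a browser string
# sends this header instead, and no pattern matches it.
print(is_blocked("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"))
```

The rules only see what the client chooses to send, so they stop honest tools and default settings. Limiting by IP address or bandwidth does not depend on that header.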
SANIsoft PHP applications for E Biz

Sogeri

I found an even more extensive .htaccess file here:

http://tech.ratmachines.com/downloads/sample_wbmw.txt

So, which file would be best to use??

epsilon

In which directory must I put this .htaccess? In the albums dir only?

Joachim Müller

Quote from: "epsilon"
In which directory must I put this .htaccess? In the albums dir only?

Yes.

epsilon

Quote from: "Tarique Sani"
Don't want to rain on the parade, but spoofing of USER_AGENT is built into most new URL fetchers. I guess the correct way is to have Apache configured using mod_throttle or mod_bandwidth.

How can I do that? I have mod_rewrite turned on so that I can use the .htaccess commands. Once I activate mod_throttle and mod_bandwidth, what must I do?

Thanks

Tarique Sani
