System dying after CPG1.3 upgrade - Memory leak? (mysql?)
 


Started by DeadKenny, June 18, 2004, 01:11:14 AM


DeadKenny

Since upgrading from standalone CPG1.2.1 to standalone CPG 1.3.0 (stable versions, not beta), mysqld is running away with memory very quickly and crashes with an Out of Memory error. This often kills the entire system and a reboot is required! (we're talking Linux here). In one case I couldn't even telnet in or get any console activity and had to do a hard reset on the system!! :(

There's nothing in the mysqld.log that indicates a problem, but 'top' shows it's definitely eating memory quickly, and I get the following kind of messages in the system log...

Jun 17 03:32:15 mail kernel: Out of Memory: Killed process 1914 (mysqld).
Jun 17 03:32:55 mail kernel: Out of Memory: Killed process 2112 (mysqld).
Jun 17 03:33:02 mail kernel: Out of Memory: Killed process 2086 (gdmgreeter).
Jun 17 03:34:44 mail kernel: Out of Memory: Killed process 2127 (gdmgreeter).
Jun 17 03:35:07 mail kernel: Out of Memory: Killed process 1289 (httpd).


mysqld is the first to die and then a bunch of others die because of lack of memory, but using top I can see it's mysqld eating memory.
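As a rough illustration of how the order of kills can be read out of the log, here is a minimal Python sketch that tallies the OOM-killer lines quoted above. The regular expression is an assumption based only on the five lines shown; other kernels word this message differently.

```python
import re
from collections import Counter

# Sample lines copied from the system log quoted above.
LOG = """\
Jun 17 03:32:15 mail kernel: Out of Memory: Killed process 1914 (mysqld).
Jun 17 03:32:55 mail kernel: Out of Memory: Killed process 2112 (mysqld).
Jun 17 03:33:02 mail kernel: Out of Memory: Killed process 2086 (gdmgreeter).
Jun 17 03:34:44 mail kernel: Out of Memory: Killed process 2127 (gdmgreeter).
Jun 17 03:35:07 mail kernel: Out of Memory: Killed process 1289 (httpd).
"""

# Assumed pattern: matches only the message format seen in this thread.
OOM_RE = re.compile(r"Out of Memory: Killed process (\d+) \(([^)]+)\)")


def oom_kills(text):
    """Return (pid, process name) tuples in the order they appear in the log."""
    return [(int(pid), name) for pid, name in OOM_RE.findall(text)]


kills = oom_kills(LOG)
counts = Counter(name for _, name in kills)

print("first victim:", kills[0])
print("kills per process:", counts.most_common())
```

The first victim is not necessarily the process that leaked; the kernel picks a process to kill by its own heuristics, so this only confirms the order, not the cause.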

My system spec...

AMD K6-2 400MHz
128MB RAM, 256MB Swap
Fedora Core 1
Apache 2.49
PHP 4.3.6
MySQL 3.23.58


I've restarted mysqld again after I discovered it had been down all day (again), and it started at 54MB RAM usage and has already climbed to 98MB in about 15 minutes without any activity.

Any ideas? Everything was okay with CPG 1.2 and the same spec system.

How can I roll back the database to 1.2 if it turns out 1.3 is unstable? (it's the one thing I didn't back up unfortunately).

GGallery


DeadKenny

Not sure what they should be. I've just checked with MySQL Administrator and found the following variables which I assume are the key and sort buffer sizes?...

key_buffer_size = 8388600
sort_buffer = 2097144

Is that good/bad?

Under the Memory Health graph, it says "Key buffer usage 15,360", but I've just restarted mysqld (died again overnight).

I don't know if this helps, but here's my 'my.cnf' ...

Quote
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
innodb_data_file_path=ibdata1:10M:autoextend
set-variable=innodb_buffer_pool_size=70M
set-variable=innodb_additional_mem_pool_size=10M
set-variable=innodb_log_file_size=20M
set-variable=innodb_log_buffer_size=8M

[mysql.server]
user=mysql
basedir=/var/lib

[safe_mysqld]
err-log=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid


I originally had it running without InnoDB enabled, and the same problem occurred. I saw entries in the log about InnoDB, so I enabled it to see if it would help, but it didn't. I don't know if I've got sensible values for InnoDB though.


Update, I'm now getting some more detail in the mysqld.log file...

Quote
InnoDB: Fatal error: cannot allocate 73416704 bytes of
InnoDB: memory with malloc! Total allocated memory
InnoDB: by InnoDB 14797112 bytes. Operating system errno: 12
InnoDB: Cannot continue operation!
InnoDB: Check if you should increase the swap file or
InnoDB: ulimits of your operating system.
InnoDB: On FreeBSD check you have compiled the OS with
InnoDB: a big enough maximum process size.
InnoDB: We now intentionally generate a seg fault so that
InnoDB: on Linux we get a stack trace.
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail

key_buffer_size=8388600
record_buffer=131072
sort_buffer=2097144
max_used_connections=0
max_connections=100
threads_connected=0
It is possible that mysqld could use up to
key_buffer_size + (record_buffer + sort_buffer)*max_connections = 225791 K
bytes of memory
Hope that's ok, if not, decrease some variables in the equation

Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Bogus stack limit or frame pointer, fp=0xbfece8e8, stack_bottom=0xfac0b3a0, thread_stack=65536, aborting backtrace.
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0xf9fee720  is invalid pointer
thd->thread_id=16777472

Successfully dumped variables, if you ran with --log, take a look at the
details of what thread 16777472 did to cause the crash.  In some cases of really
bad corruption, the values shown above may be invalid

The manual page at http://www.mysql.com/doc/C/r/Crashing.html contains
information that should help you find out what is causing the crash
040619 04:55:56  mysqld ended



Doesn't mean much to me. As I say, all this started as soon as I upgraded CPG to 1.3 and ran update.php.

P.S. I'm running two galleries on two databases using virtual hosts in Apache (two domain names). The galleries have a low amount of photos and no users other than the admin user. The actual databases take nothing more than about 300K of disc space.

DeadKenny

I'm really getting nowhere with this. It's dying frequently (every few hours) and starting to annoy me. I just can't see what's changed to cause this.


P.S. This is the output from 'free'...


# free -m
             total       used       free     shared    buffers     cached
Mem:           123        110         13          0          5         30
-/+ buffers/cache:         73         49
Swap:          250         99        151


Which is telling me there is 13MB of physical RAM free (this is after a restart of mysqld), but loads of swap free (151MB).

Could it be that mysqld is not using the swap at all? I still don't understand why this has just happened, especially as the databases are so tiny (a few hundred KB). ???
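A small Python sketch for reading the 'free -m' output above. In this older layout, the "-/+ buffers/cache" row reports used and free memory with buffers and page cache excluded, so the 13 in the "Mem:" row understates what applications can actually get. The parsing below assumes exactly the column layout pasted in this post.

```python
# Output of `free -m` copied from the post above.
FREE_OUTPUT = """\
             total       used       free     shared    buffers     cached
Mem:           123        110         13          0          5         30
-/+ buffers/cache:         73         49
Swap:          250         99        151
"""


def parse_free(text):
    """Map each row label (text before the colon) to its list of integer columns."""
    rows = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # the header line has no colon
        rows[label.strip()] = [int(field) for field in rest.split()]
    return rows


rows = parse_free(FREE_OUTPUT)
mem = rows["Mem"]  # total, used, free, shared, buffers, cached

# free + buffers + cached approximates what applications could claim.
reclaimable = mem[2] + mem[4] + mem[5]
app_free = rows["-/+ buffers/cache"][1]
swap_free = rows["Swap"][2]

print("free + buffers + cached:", reclaimable, "MB")
print("free per -/+ row:", app_free, "MB")
print("RAM + swap headroom:", app_free + swap_free, "MB")
```

The two RAM figures differ by 1 MB only because 'free -m' rounds each column separately. Note that 99 MB of swap is already in use here, so the system is paging.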

hyperion

It may be that a MySQL file corrupted itself (a power outage can do this).  Would you be willing to upgrade to the latest MySQL?

"Then, Fletch," that bright creature said to him, and the voice was very kind, "let's begin with level flight . . . ."

-Richard Bach, Jonathan Livingston Seagull


DeadKenny

I haven't had a power outage recently and it's been working fine until CPG1.3 was installed. When MySQL is running, Coppermine works fine, but it just dies after a while and MySQL needs restarting (if it hasn't killed the system entirely).

I could upgrade to MySQL 4, but it's complicated on Fedora from what I understand and would mean I have to maintain it for updates rather than relying on Fedora's auto update. Fedora ships with v3 and they appear to have no plans to ship v4 (some legal issue I think).

I've just tried upgrading to Fedora Core 2 also to see if that made a difference, but it didn't.

hyperion

1. Well, are you certain that this only happens when using top?

2. To roll back the database, you can look at update.sql and the other files in the /sql folder and reverse the changes they made by hand using a tool like phpMyAdmin.

DeadKenny

It's not happening only when using top, just that with top I can see how much memory it's taking. I can't tell how much it's taking when it crashes though as the mysqld processes are killed at that point and I've not been monitoring top all the time. All I know is the memory creeps up on the mysqld processes and then after a while I get 'out of memory' errors on them in the 'messages' log.

The other thought is that this is perhaps happening when coppermine is accessed but I can't be sure. Maybe there are queries going on in 1.3 which are pushing my system to the limit. Is 128MB RAM too little? !!

When I've got time I'll try and diagnose it more thoroughly but I don't have the time at the moment, I just want it running without mysqld falling over every few hours. Maybe for the moment it's easier for me to just write a script that monitors mysqld and restarts it when it dies. Not a good solution I know but at least it keeps it running. The only problem is sometimes the whole system dies with it.
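A stopgap watchdog along those lines could look roughly like the Python sketch below. It assumes that `pgrep -x mysqld` is available to detect the process and that the init script can be started with `service mysqld start`; both commands and the 60-second interval are assumptions, not something confirmed in this thread, and the script would need to run as root.

```python
import subprocess
import time


def mysqld_running():
    """Assumption: `pgrep -x mysqld` exits with status 0 when mysqld is running."""
    result = subprocess.run(["pgrep", "-x", "mysqld"], stdout=subprocess.DEVNULL)
    return result.returncode == 0


def restart_mysqld():
    """Assumption: the service can be started with `service mysqld start`."""
    subprocess.run(["service", "mysqld", "start"], check=False)


def check_once(is_running=mysqld_running, restart=restart_mysqld):
    """Restart the service if it is down; return True when a restart was attempted."""
    if is_running():
        return False
    restart()
    return True


def watch(interval_seconds=60):
    """Poll forever, logging each restart attempt with a timestamp."""
    while True:
        if check_once():
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print(stamp, "mysqld was down, restart attempted")
        time.sleep(interval_seconds)
```

Calling `watch()` starts the loop. This only papers over the symptom: it cannot help when the whole machine locks up, and it does nothing about whatever is exhausting memory.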

hyperion

Quote
It is possible that mysqld could use up to
key_buffer_size + (record_buffer + sort_buffer)*max_connections = 225791 K
bytes of memory
Hope that's ok, if not, decrease some variables in the equation

225791 K = 220.5 MB of memory allocation.

220.5 MB > 128 MB.
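The arithmetic can be checked with a short Python sketch using the values from the crash log. The formula is the one mysqld prints itself. The 64 MB budget in the second half is purely a hypothetical figure chosen for illustration, not a recommendation.

```python
def worst_case_bytes(key_buffer_size, record_buffer, sort_buffer, max_connections):
    """Upper bound printed by mysqld: one shared key buffer plus per-connection buffers."""
    return key_buffer_size + (record_buffer + sort_buffer) * max_connections


def max_connections_for(budget_bytes, key_buffer_size, record_buffer, sort_buffer):
    """Largest max_connections whose worst case still fits in budget_bytes."""
    return (budget_bytes - key_buffer_size) // (record_buffer + sort_buffer)


# Values from the crash log earlier in the thread.
KEY_BUFFER = 8388600
RECORD_BUFFER = 131072
SORT_BUFFER = 2097144

total = worst_case_bytes(KEY_BUFFER, RECORD_BUFFER, SORT_BUFFER, 100)
print(total // 1024, "K")                 # 225791 K
print(round(total / 1024**2, 1), "MB")    # 220.5 MB

# Hypothetical budget: leave roughly half of the 128 MB box to everything else.
budget = 64 * 1024**2
print(max_connections_for(budget, KEY_BUFFER, RECORD_BUFFER, SORT_BUFFER))  # 26
```

This is a worst case: the per-connection buffers are only allocated when connections actually need them, so an idle server sits well below this figure.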


Read up on the configuration variables:

key_buffer_size
record_buffer
sort_buffer
max_connections

and decide what you want to change to bring that number more in line with your system's capabilities.

DeadKenny

Quote from: hyperion on June 23, 2004, 01:22:41 AM
Quote
It is possible that mysqld could use up to
key_buffer_size + (record_buffer + sort_buffer)*max_connections = 225791 K
bytes of memory
Hope that's ok, if not, decrease some variables in the equation

225791 K = 220.5 MB of memory allocation.

220.5 MB > 128 MB.

Surely my max available memory is 128MB(RAM) + 256MB(swap) = 384MB, and thus 220.5MB is < 384MB?  ???

This is what I don't understand. It seems it's not paging.

I don't mind so much if it's slowing down, but the thing is just plain running out of memory and yet there's plenty of swap space.

Also, the thing is under virtually no load at all. I get very few hits and it's not like there are tens of thousands of photos. It's a very lightweight gallery and MySQL is being used exclusively for it.

Quote
Read up on the configuration variables:

key_buffer_size
record_buffer
sort_buffer
max_connections

and decide what you want to change to bring that number more in line with your system's capabilities.

I'm just doing that now, but I was under the impression that for a small Coppermine install all you needed was the defaults in MySQL. I can't see any recommended settings for MySQL here.


Edit: I've just updated it with these settings...

set-variable=max_connections=100
set-variable=sort_buffer=1M
set-variable=key_buffer_size=7M
set-variable=record_buffer=132K


Going by what I've read these seem to be recommended basic settings, with the key buffer being just about 5% of system RAM.
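For comparison, here is a minimal Python sketch that plugs these four settings into the worst-case formula mysqld prints in its crash log (key_buffer_size + (record_buffer + sort_buffer) * max_connections). It treats K and M as multiples of 1024, which is how MySQL reads those suffixes. The parser is a simplification that only understands `set-variable=name=value` lines like the ones above.

```python
# The four lines added to the config file, copied from the post above.
SETTINGS = """\
set-variable=max_connections=100
set-variable=sort_buffer=1M
set-variable=key_buffer_size=7M
set-variable=record_buffer=132K
"""

SUFFIXES = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(value):
    """Convert '7M', '132K' or a plain integer string to a number."""
    value = value.strip()
    suffix = value[-1].upper()
    if suffix in SUFFIXES:
        return int(value[:-1]) * SUFFIXES[suffix]
    return int(value)


def parse_set_variables(text):
    """Collect name -> numeric value from set-variable=name=value lines."""
    prefix = "set-variable="
    settings = {}
    for line in text.splitlines():
        if line.startswith(prefix):
            name, _, value = line[len(prefix):].partition("=")
            settings[name.strip()] = parse_size(value)
    return settings


s = parse_set_variables(SETTINGS)
total = s["key_buffer_size"] + (s["record_buffer"] + s["sort_buffer"]) * s["max_connections"]

print(total // 1024, "K")
print(round(total / 1024**2, 1), "MB")
```

With max_connections still at 100, the worst case comes out at roughly 120 MB, which is close to the whole 128 MB of physical RAM. Lowering max_connections shrinks that figure far more than trimming the key buffer does.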

I've turned off InnoDB which drastically reduces the amount of memory mysqld is taking, but all this was happening before I enabled InnoDB in the first place. Anyway, I'll see what happens.

DeadKenny

No luck. I've changed the settings as above and disabled InnoDB. MySQL appears to be taking very little memory while running and yet after a short amount of time and especially during the daily cron (around 4am), the thing just dies with loads of Out of Memory errors.

If I stop mysqld and leave the system running for 48 hours, nothing dies at all, so it's definitely MySQL at fault. As I say, the only change was to upgrade CPG to 1.3.

I'm at a loss what to do.  ???

I don't want to trash the database and start again either as I need the gallery links to work as I post them on numerous sites, so I need the indexes to be correct.

hyperion

Hmm.  I would suggest you take this up with the Fedora group.  The crash when cron runs is very suspicious.  Do you have anything unusual in the cron?

Anyway, to explain how CPG and MySQL interact with the memory on your system:

Memory use due to CPG will occur whenever CPG collects a result set from MySQL.  At the end of the script's execution, PHP automatically cleans up the memory. Coppermine doesn't have anything to do with managing memory use except when it calls free result when several large queries are being made within the same script. Thus, a memory leak in Coppermine would kill the system at the same time a script runs, every time.  

As Fedora is Redhat, I went to Redhat's Bugzilla, and found this right off the bat:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=59808

That's fairly old, so you might want to look around in the new stuff:

https://bugzilla.redhat.com/bugzilla/

Perhaps you have found a new form of this creeping memory problem, or perhaps the patch never made it into the Fedora core. As you are the only one reporting this problem and your DB is very small, I'm fairly certain that CPG is not at fault. What may have happened is that upgrading to CPG 1.3 combined with increased traffic on your site exacerbated the problem to the point where it was noticeable.

Please let us know if you discover anything else that might change this assessment, and I wish you the best of luck in finding the source of the problem.

-Hyperion

DeadKenny

Thanks. I'll look into bugzilla.

I started MySQL last night and played about with the cron scripts and moved one of the larger entries (a full system virus scan using f-prot, done every night), so it doesn't trigger at the same time as all the daily entries (i.e. 4am). MySQL still died with the virus scan (around 6am).

It could be f-prot but I've had no problem with that before, and what's odd is it has died before during the day, although maybe there's something in that as my mail server also forwards inbound mail to f-prot for a scan. Maybe the last update on f-prot is conflicting somehow.

It could be a combination of things, like a kernel update, f-prot, and maybe the small changes in CPG's database just threw it over the edge, but it's highlighting another problem. It's been the same on Fedora Core 1 and on Fedora Core 2, which have totally different kernels and mostly updated packages, so I'm thinking it's a configuration and/or f-prot conflict with MySQL.

The weird thing is I've run the full system scan for f-prot by hand with MySQL running and it's not a problem, but it seems to be when running under cron (and possibly when procmail calls it). I wonder if it's related to the user it runs under?

Oh well, I'll dig further, but for now my gallery remains offline :(