Storing pictures directly in mysql database Storing pictures directly in mysql database
 

News:

cpg1.5.48 Security release - upgrade mandatory!
The Coppermine development team is releasing a security update for Coppermine in order to counter a recently discovered vulnerability. It is important that all users who run version cpg1.5.46 or older update to this latest version as soon as possible.
[more]

Main Menu

Storing pictures directly in mysql database

Started by ipolo, May 31, 2005, 02:11:16 PM

Previous topic - Next topic

0 Members and 1 Guest are viewing this topic.

ipolo

Hello evrybody,


sorry for this newbie question  but i have been harvesting the forum and didnt find the answer to my question which is :



Can coppermine save the pictures directly to a mysql database?


I mean that i dont want the uploaded pictures in my web space, but directly stored into the mysql database in binary format.


Is there a way to do so? Does it need some hacks? Or is it simply a feature like in the xoops gallery?


Thanxs in advance for your answers.


ipolo

Nibbler


ipolo

thanxs for your answer, then is it possible to store the pictures on a remote webspace location or a remote ftp. By remote, i mean a different one than the one who host coppermine

nol33t

There is no option to do this neither  ;D

However, a hack to store the pic in the db wouldn't be too tough to code imo...but i have no idea about the consequences in term of performance..

Tranz

Yeah, I think you should want to have your database be as lean as possible, and adding photos to it will likely impact performance. Not just that, it's harder to deal with backups/restores of large databases as opposed to individual files/folders.

donnoman

You could actually store images in a database, but with horrific conseqences by doing so you won't find much support for it.

As you've been told Coppermine certainly won't do this out of the box.  I can't recommend spending any time trying to MAKE it do it. The solution simply won't scale well, and your ISP and database host won't be real thrilled with you either.

It's best to use the right tool for the job, use the database to be a database, use the webserver to serve static content like it was designed. If you don't like the limitations of your webhost, get a different webhost.  The price of decent hosting is down to one good meal a month. I pay 8.95 for 200mb of storage, 20gb of transfer, and it includes Mysql, among a bunch of other crap I'll never use.  Theres a board meant to discuss such things search around. I don't want to start a price war, other than to say there is very good, very cheap hosting available nowadays, its not worth manipulating the system to squeeze everything out of 10 free acounts scattered all over the place.

buy a domain for like 9.99 a year, hosting from 7-10/mo and your all set. Ask for a years worth of hosting for your birthday from your parents or something... it makes a pretty cheap and cool gift.

Tranz

Web hosting is one of those "costs" that can pay for itself. I am in an affiliate program with my webhost and my commissions have paid for a couple of years of hosting. Initially, I didn't want to put any ads on my site, but Adsense is quite tempting since the ads aren't so obnoxious. If I am in dire circumstances, I would sign up for it.

ipolo


dpark

Hi folks... I'm also new to CPG.

Since this is a general discussion forum, may I present some reasons for storing pics in the RDBMS?  It's something I'm heavily in favor of and I was kind of dismayed to find CPG didn't have this feature after having already committed to it.  Not surprising though.. it seems most open source PHP+MySQL apps end up being this way.

Where I'm coming from: I just recently started using CPG for myself just a couple days ago... loaded about 6000+ images into it.  Aside from that though, I admin a server (co-located, but we own the machine) that provides stuff to a dozen or so friends / small organizations: shell accounts, web hosting, installs of things like wordpress, coppermine, phpbb, vbulletin, etc.  There's probably 6-10 coppermine installations on this server right now.  I also happen to have both the web server and the DB server on the same physical machine, but there are many situations (if we were a larger operation) where this would not be true.  As for all that stuff earlier in this thread about finding good webhosts and stuff--it doesn't really apply to me... though I'm guessing that a majority of coppermine users do that and have those concerns.. I do not.

The primary reason one would want to store pics in the DB is for backups.  CPG has no built-in backup mechanism.

On the MySQL side, that's not a problem, backing up every single mysql database on the server (all the coppermine installations plus other apps) is incredibly easy and elegant to do as an admin with a few well-written perl scripts.  The nice thing is that you can set up a good backup schema of "occasional full dumps" + "daily incremental dumps" by using the mysql binary update log (or if you want to be hard core, you can use the replication features).  This reduces the amount of bandwidth we'd have to use because, honestly, doing a full network dump of our system takes forever.  Anyway, because we (and pretty much anyone else in a similar situation to us) already have a mysql backup thing going, backing up the mysql-end of Coppermine (all of them) is effortless.  It really does kind of make you wish that everything could just live in the DB.

There are two other components to coppermine and both are stored in the same area of the file system: the photo data (original images plus thumbs + intermediates) and the CPG code.  Of course, the code must live in files and on the web server.  That's always a given.  In my case, the code is easy to back up because we put them into a version control system (plus any themes/mods/hacks they've applied).  The photo data could either live in files on the webserver, or it could live in tables on the db server.  Currently, I dont' have a way of easily backing up the albums directory for the half-dozen (and more to come) installations...

One thing I'd like to point out in the area of backup though: it is not necessary for all the files in albums to be backed up--just the original images (because, in theory, the thumbs / intermediates can be restored from the originals + some other config settings that are stored in mysql).

The secondary reason comes down to server design / appropriation considerations.  In larger setups (or even small ones if you're thinking on a corporate level), you will often purchase a separate web server and database server.  The web server will often be a fast and lean machine, 1U, not a lot of disk space--I mean, all it has to hold is OS + web server + app code.  The db server will often be a large machine with lotsa disk, backed up often, perhaps RAIDed, etc.  If the web server blows up, dropping in a spare is almost trivial and probably wouldn't even involve much of a "restore" process.  But if the web app is relying on the local file system to store some very large files, then your web server starts needing a lot of storage space--either that or you have to do some undesireable thing like NFS-mount a disk partition and symlink everyone's "albums" dir over to it.  In any case, it becomes a pain, and suddenly your web server has to be much more than it originally was and it's questionable whether separating the two machines held any advantage anymore.

Again, for the secondary issue, I'll point out that, since the thumbs + intermediates are orders of magnitude smaller in file size than the originals, it really only matters that the originals get put on the DB.

The primary concern that I wish to contest is typically that putting large items in the RDBMS will slow the database down.  I can't agree with that... why would that be true?  It would be like saying that putting large files in the file system would slow down your access to any other file on the same partition.  Databases aren't that dumb.. the whole point is that they index things well, and when it comes to storing big things like this, it's pretty much the same as putting it on the FS except for the fact that you can't seek to random points in the file (you have to slurp the whole thing down at once if it's in the DB)--but that isn't an issue for things that are just going to be served over the web.  It is an issue for programs that have to read over a file and jump to different parts of it and read it again.

The secondary concern that I will partially agree / disagree with is that moving these items to the RDBMS will slow the overall operation of Coppermine down.  Yes and no... for example, if you want to access an image that lives in the DB, then yes, Coppermine (PHP) must pull it out of the DB (and do all that communication w/mysql) before sending it out--instead of pulling it out of the FS (and doing all that communication w/the kernel FS code) before sending it out.... it may be a tad slower, especially if the DB server is a physically distinct machine from the web server.  This is true.  However, if all you're storing in there are the originals and not the thumbs/intermediates, this becomes barely true at all--it only becomes a concern when people click on the thing to see a large image.

Another concern I alluded to earlier is that you can't easily run arbitrary programs (ie, ImageMagik) on something that's not a file.  Sometimes you can run them on pipes, but not always... anyway, the point is that if you had to do any re-generation of thumbnails/intermediates after your original file got sucked up into the RDBMS mothership by the batch add scripts, the CPG code would probably need to slurp down the image from the DB, write it to a file, run the program, read it back, etc, etc.  I mean, in general, making CPG work in this way seems like you'd have to change a decent number of PHP files...

Well, thus ends my argument for storing pics in the DB.  While it's important enough to me that I'd normally go try to hack this myself... I'm busy enough as it is hacking other software packages.  Let me know what you think though.

kegobeer

http://www.datanamic.com/support/storeimagesinthedatabase.html

Images/files stored in a database cannot have manipulations done on them, ie: getimagesize.  If the thumbs and intermediate images are lost, you can't rebuild those files until you download the image from the database, save it, and then perform your operations on it.  You also lose the ability to resize/rotate your images.  There is also a sizable performance impact on your database when storing thousands of images in it.  Examples I've read say performance is worse when your webserver and databases are located on different servers.  Queries for those images will tie up your database - when 10 people are viewing images, you will definitely see delays and lags when someone else is just trying to render a normal page.  All the connections will be attempting to stream thousands of KB.  Is that really the job of a database?  Not in my opinion.  File systems are designed to store files.  Database corruption will also be a factor.  Compare how often a database table gets corrupted versus a file on a webserver getting corrupted.

I have yet to see any web image app (actually, any web app that does uploads and downloads) provide some sort of file system backup.  If you want to back up files, you should set up a cron job to perform nightly file backups.  If on a Windows box, set up nightly backups via scheduled tasks.  Certainly this is also faster and more efficient that attempting to back up a database filled with thousands of images?

Overall, IMO there are no advantages to storing documents in a database.  If you come up with a working hack, feel free to post it.  This is not something the development team will pursue.
Do not send me a private message unless I ask for one.  Make your post public so everyone can benefit.

There are no stupid questions
But there are a LOT of inquisitive idiots

donnoman

Dpark,  none of the points you advocate make sense to put images in a db. It's the absolute worse thing you can do including from a backup and restore perspective.

A filesystem is built to hold large chunks of sequential data. That is the best place for images. Backup has nothing to do with it. You can use filezilla and backup the image contents from disk easier than you can a database if you need a client side backup solution in a shared hosting environment.

From a system admin perspective its easier and faster to backup a filesystem than an online database.  Search the boards for folks than run really huge coppermine databases and you'll see that they have problems bringing the sql dump files down from phpmyadmin because PHP times out, and thats not thier images, just the raw tables. The solution of adding images to the db scales very very poorly from the get go.

If you read mysql best practices for blob fields and where they talk about storing images, they are talking about storing things like icons and very small bits of information, like a 16x16 image.  Theres a reason for that.

Like I said in my first post on the subject, use the right tool for the job. The webserver excels at serving images from a filesystem, mysql handles blob fields because it has to, not because it's any good at it.

Look at it from a resources and lag view.

request image from filesystem: make one filesystem call, begin streaming data.

request image from sql: make a db connection; transmit sql statement; server runs query; server moves query results into its buffer, php script requests contents from the buffer, and has to move it into ram, when complete can begin to stream the image to the client.

How many cpu cycles did you just burn to transmit the same image?  How much latency did you add by requesting from the db, instead of a hard drive with an 8 millisecond (0.008 seconds) seek time, 8mb cache block. FYI even a quick sql query takes on the order of 1/10 or (0.1) second.

msyql users are reporting slowdowns with databases of 300mb. Any self respecting image gallery can exceed 300mb from the get go. My development gallery exceeds 400mb, and I wasn't even trying. 

If your really REALLY serious about persuing this, I'd set up a test scenario using any of the php guides like this one "http://www.phpbuilder.com/columns/florian19991014.php3".

And play with it.

Joachim Müller

To back up what has been said before: your main reason for wanting to store pics in the db is easy backup. I can't understand how this is supossed to be easy: I'm a server admin by profession (mostly Windows machines though), doing regular backups of all the servers is part of my job. However, backing up files (and I mean large numbers) is always easier, faster and more reliable than backing up databases (be it mySQL, MS-SQL, Access, Exchange, Progess, which all exist on the network of the company I work for). Just my 2 cent.

dpark

Wow, you guys reply quickly, even on a "no support" forum!  Well thanks for the ample feedback--I don't intend to make any converts here, if anything I just wanted to poll what people thought.

As for where I might go from here wrt backup... so my co-admins and I have been scratching our heads about this one and some other alternative solutions would look like 1) writing little hooks in CPG code to commit changes to the album directory into a subversion repository (the albums directory would then be a "working copy" of the repository.. every one of our users already has their own repository).. kind of odd, but it works, because those repos are easy to back up with our existing scripts.. and it would be a much easier change to make to CPG than making it store stuff in the DB.  Or 2) look into a file system backup for all the albums directories only.  A possible route here would be rdiff-backup or some similar variant of that.  That wouldn't be too bad.  There's also, of course, 3) hack CPG to put original images in the DB.  I think we lean towards #1, as strange as it is.  Yeah, it's kind of weird.  Actually, the commit hooks could be loose--maybe just run once an hour or something.  But then #1 is really just a specific way to implement #2, so perhaps it's all the same.

The backup schema has to be incremental (which is why I liked the DB approach)... I've only been using CPG for under a week and I've got 5GB in albums already.. multiply that by 6-10 other users on our system and try to push that over the network daily or weekly and.. well it just won't work.  Even if we did full dumps monthly.. I think that'd be too often for our bandwidth setup.

A few other responses to items above... they're not really pertinent, but I thought I'd write anyway...

- DB query speeds are going to be a function of table size (# of rows) and how well the tables are indexed for the particular query that was called.  If a table is 300MB because it has a billion rows of unindexed data, then yes, querying it will be incredibly slow.  But even then, at work, I use an open source apache+php+mysql package that runs a database that's grown to 3.8GB.. it's genomic expression data.  It's got 22.8 million rows in the largest table, and querying the thing (large  isn't too bad at all.  Anyway, the point is that if it's only a few hundred rows at 1MB a piece, why would it be slow?  File systems on the other hand.. if you put a couple thousand files in one directory, sometimes on semi-old unicies you can't even "rm *" because the argument list is too long.

- I've never used phpmyadmin (I stick to the command line or scripts) but it doesn't seem like the best way to back up databases... aside from the fact that it's going through the web server and php to do it (and all the problems with that).. I mean... it's a web app... wouldn't you rather have some non-interactive, cron-able script to do backup (whether it's database, file system, repositories, or anything else)?  Filezilla.. the sftp/scp program?  Sounds like a pain if it involves me having to actively run some app on my personal computer every day / week.

- As to the issue of data integrity... well, on one hand, I've seen files corrupt much more often than database tables--but I've also used files much much more than i've used database tables, so that statistic is rather useless.  Corruption mostly happens when something unusual happens (like power outages)... I think I'd probably rank my trust as something like this: reiserfs/oracle/mysql-innodb/etc > ext3 > ext2/mysql-myisam/etc (that could be a very wrong way to rank them).  Of course, most web apps that use mysql use myisam tables so.... yet another reason to back stuff up.  But data integrity is an issue no matter what storage approach you choose, so pick what you fear least, I guess, but realize that others may legitimately pick something else.  The beauty of living in a heterogeneous computing world.

- Running programs / manipulations on images in the database: yes, this is true.  I simply figured that the majority of Coppermine's activity in most heavy-usage scenarios would be displaying thumbs and intermediates the vast majority of the time, so the performance impact here would only be occasional.

Anyway, I'll probably be looking into a file-based backup mechanism (either involving subversion or rdiff-backup or some other thing).  Thanks to all...

Nibbler

Can't you just setup a cron to do a `cp -au` to copy new files to a backup location every day/hour ? seems the simplest solution.

donnoman

dpark, I think your making something really simple way too complex.

It's a filesystem, you have tons of tools available to you.

I've never really considered cvs/subversion a backup tool, but it would work, and be great for versioning. You don't need to hack cpg at all, just do a nightly commit of the albums directory via cron. Matter of fact... script a sql dump and commit it as well.  Problem is the creep in cvs as changes grow, you have no way to get rid of really old images, they remain in your repository.  You might be building a huge behemoth.

Rsync is a great way to make synced copies.

Tar -z to a remote machine with a log rolling mechanism of the tarballs would be easy to setup with cron as well.