Sean DeMerchant said:
Backups are a very poor choice of tools for version control. ....
Hi Sean,
I understand precisely what you're saying and agree with it. On top of that, I'd like to make a few personal observations, if I may.
There are a few aspects of backing up data that some of us tend to confuse, so I'd like to clarify how I see them.
1) Backups for covering any contingencies such as hard disk crashes (version 1).
This issue might be resolved by setting up a RAID 1/5/6/10 array. Many novices tend to think that having a mirrored RAID1 array is enough for backup purposes. Well, it is not. It only helps you survive a disk crash, nothing more, nothing less. RAID0 is in any case useless for backup purposes, since it offers no redundancy at all, only a speed improvement.
In this situation, if you do something stupid like deleting an essential file, or if a virus destroys your data, RAID offers no protection at all. It is just short-term insurance against hardware failure.
2) Backups for covering any contingencies such as hard disk crashes (version 2).
One can typically duplicate (i.e. copy) the same file onto different hard disks (internal or external) or onto DVDs. This method offers a similar level of contingency to a RAID array, only the recovery process is slower. The advantage over version 1 is that it also protects against operator errors and/or viruses (except in the case of on-line secondary disk drives in your PC, which are just as prone to virus attacks).
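For what it's worth, the duplication described above could be scripted roughly as follows. This is only a minimal sketch in Python, not a recommendation of any particular tool: the function names and the idea of verifying each copy with a SHA-256 checksum are my own assumptions, and the destination folders are hypothetical placeholders for wherever your second disk or external drive is mounted.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def duplicate_file(source: Path, destinations: List[Path]) -> List[Path]:
    """Copy source into every destination folder and verify each copy.

    The destination folders are hypothetical; point them at other disks.
    """
    expected = sha256_of(source)
    copies = []
    for folder in destinations:
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / source.name
        shutil.copy2(source, target)  # copy2 also tries to keep timestamps
        if sha256_of(target) != expected:
            raise IOError(f"Copy at {target} does not match the original")
        copies.append(target)
    return copies
```

Calling duplicate_file(Path("photo.raw"), [Path("E:/backup"), Path("F:/backup")]) would leave one verified copy on each drive (the file name and drive letters are made up for the example).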
3) Backups for making sure that you can have access to your files in case your house burns down or your computer (including the RAID array) gets stolen, etc.
Similar to version 2, this can be achieved in many ways. Use external hard disks as the backup medium and store them physically at a different location. Or use DVDs and store them off-site. Whatever.
4) Backups for restoring your system (OS) if it crashes.
Firstly, one should always use a separate hard disk or partition to keep one's data files. This way, one can back up and restore the OS independently of any data. One can use various backup tools to back up the OS: Ghost, Drive Image, Partition Copy, MS Backup, etc.
This is an essential activity and must be carried out every time something essential changes on your OS, such as installing a new (major) program for the first time.
The backups should preferably be kept on separate, external hard disks or DVDs. The best option is to keep them off-site for maximum security.
5) Backups for achieving version control.
One might want to keep various versions of a certain file while it evolves. In that case, just backing up daily/weekly is not a solution, as Sean has pointed out. He has given a few pointers as to how one can achieve real version control. There is also much good advice to be found in publications/sites such as the DAM book. Keep in mind that nothing you do within version control counts as backing up data. Even if one does version control, one should still do the above-mentioned steps 1-3.
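To illustrate the difference from a plain backup, here is a minimal sketch of keeping numbered versions of a file as it evolves. It is not what Sean described and not a replacement for a real version control tool; the function name, the ".v0001" naming scheme and the archive folder are all assumptions of mine, purely for illustration.

```python
import shutil
from pathlib import Path


def save_version(source: Path, archive_dir: Path) -> Path:
    """Store the current state of source as the next numbered version.

    Files are named <stem>.v<NNNN><suffix>, a hypothetical naming scheme.
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    numbers = []
    for path in archive_dir.glob(f"{source.stem}.v*{source.suffix}"):
        # Cut out the part between "<stem>.v" and the file suffix.
        start = len(source.stem) + 2
        end = len(path.name) - len(source.suffix)
        tag = path.name[start:end]
        if tag.isdigit():
            numbers.append(int(tag))
    next_number = max(numbers, default=0) + 1
    target = archive_dir / f"{source.stem}.v{next_number:04d}{source.suffix}"
    shutil.copy2(source, target)  # earlier versions are never overwritten
    return target
```

Each call adds a new file next to the older versions instead of replacing them, which is exactly what a daily/weekly backup does not give you. The archive folder itself still needs to be covered by steps 1-3.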
<begins personal rambling>
So now I'd like to address my pet dislike. I don't understand why so many folks out there are so obsessed with RAID arrays. Sure, if you are running applications on your system that your business depends on, a RAID array is an absolute necessity in order to minimise downtime in case of hard disk crashes. For anybody else, especially for people like myself who are doing this as a hobby, a RAID array can be counterproductive. Let me mention a few (not so obvious) disadvantages:
- all the disks spin at the same time, so the power supply must be able to handle higher peak currents (can cause stability issues)
- all the disks spin at the same time, so more noise is generated and more heat is to be dissipated
- all the disks spin at the same time, so the running time per disk is much higher than with individual disks, which will statistically lead to more frequent/earlier disk crashes
- Oh, did I mention more expensive (i.e. less available disk space)?
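To put rough numbers on the disk-space point, the usual usable-capacity rules for the RAID levels mentioned earlier can be sketched like this. It assumes identical disks and ignores file system overhead; the function name and the example sizes are my own, made up for illustration.

```python
def usable_capacity(level: str, disks: int, size_gb: float) -> float:
    """Return the usable space in GB of an array of identical disks.

    Rules assumed: RAID0 = n, RAID1 = 1, RAID5 = n-1, RAID6 = n-2,
    RAID10 = n/2 disks' worth of space.
    """
    minimum_disks = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}
    if level not in minimum_disks:
        raise ValueError(f"Unsupported RAID level: {level}")
    if disks < minimum_disks[level]:
        raise ValueError(f"RAID{level} needs at least {minimum_disks[level]} disks")
    if level == "10" and disks % 2 != 0:
        raise ValueError("RAID10 needs an even number of disks")
    if level == "0":
        return disks * size_gb  # striping only, no redundancy
    if level == "1":
        return size_gb  # every disk holds the same data
    if level == "5":
        return (disks - 1) * size_gb  # one disk's worth of parity
    if level == "6":
        return (disks - 2) * size_gb  # two disks' worth of parity
    return disks * size_gb / 2  # RAID10: striped mirrored pairs
```

For example, four 500 GB disks give 1500 GB in RAID5 but only 1000 GB in RAID6 or RAID10, against 2000 GB if the same disks are used individually.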
The advantages are obviously:
- less downtime due to disk crashes
- better read/write performance and data bandwidth
For me, the advantages of RAID are not important enough to use one. What I do instead is use many individual on-line and external hard disks across which my data are distributed and duplicated. Besides, modern SATA drives achieve a throughput of about 60 MB/s, which is more than enough to handle any workload I can throw at them (such as non-linear DV editing, HD editing, RAW processing, etc).
So, now I've got it off my chest ;-)
</ends personal rambling>
Cheers,
Cem