Archiving old content (tar gzip)

In this article we'll explain how and why you should archive your old content - particularly website code - when it is no longer required.

More often than not, we find developers leave old code lying around publicly accessible on servers, in directories such as "/public_html/oldsite/" and similar. This is bad practice, because the old code is likely unpatched against many vulnerabilities, yet it often has just as many privileges over your live site data as your current code does. One clever bot, or attacker, is all it takes to damage your brand and company image!

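Instead, we recommend you archive this old content using tar (with gzip, to save space) and make sure the archive is stored outside of any publicly accessible directory, for example at /home/yourusername/oldsitearchive.tar.gz. Once you've confirmed the archive is intact, remove the original directory from public_html so the old code can no longer be reached from the web.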
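
As a rough sketch of that workflow, the commands below archive an old site folder, check that the archive can be read back, and only then delete the public copy. The scratch directory and the folder name oldsite are placeholders we've made up so the example is safe to run anywhere; on a real account you would use your own paths.

```shell
# Work in a throwaway directory so nothing real is touched (assumption for the demo).
demo=$(mktemp -d)
mkdir -p "$demo/public_html/oldsite"
echo "legacy page" > "$demo/public_html/oldsite/index.html"

# Create the archive outside public_html; -C switches into public_html first
# so the archive contains oldsite/ rather than the full path.
tar -czf "$demo/oldsitearchive.tar.gz" -C "$demo/public_html" oldsite

# Remove the public copy only if tar can list the archive without errors.
if tar -tzf "$demo/oldsitearchive.tar.gz" > /dev/null; then
    rm -rf "$demo/public_html/oldsite"
fi
```

Checking the archive before running rm means a failed or truncated archive never costs you the only copy of the old site.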

Quick and easy public_html backup
You can generally run this as soon as you've logged in to your SSH session without changing directory, because the session normally starts in your home directory, which is where public_html lives:
tar -czvf public_html_backup.tar.gz public_html/

So, tar is the program we're running, -c creates a new archive, -z compresses it with gzip, -v (verbose) lists each file in the terminal as it is added, and -f lets us specify the archive name. public_html_backup.tar.gz is the name we're using for this archive, but you can change it to anything you like - just keep .tar.gz on the end so it's obvious what kind of file it is! Finally, public_html/ is the directory we're asking to be archived, and this is recursive by default, so all files and subdirectories are included.
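
If you'd like to check what ended up in an archive without extracting it, tar's -t (list) option can be used in place of -c. The sketch below builds a small sample public_html in a scratch directory (the file names are made up for illustration), archives it, and lists the contents.

```shell
# Build a tiny sample site in a throwaway directory (hypothetical files).
demo=$(mktemp -d)
mkdir -p "$demo/public_html/css"
echo "hello" > "$demo/public_html/index.html"
echo "body {}" > "$demo/public_html/css/site.css"

# Same options as in the article, minus -v; -C runs tar from inside $demo.
tar -czf "$demo/public_html_backup.tar.gz" -C "$demo" public_html/

# -t lists the archive's contents instead of creating or extracting.
listing=$(tar -tzf "$demo/public_html_backup.tar.gz")
echo "$listing"
```

Every path in the listing starts with public_html/, which confirms that the subdirectories were picked up recursively.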

How to extract the archive?
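Thankfully, extracting the contents of an archive is just as easy, and only a couple of options change. Here's an example:
mkdir -p public_html_backup
tar -xzvf public_html_backup.tar.gz -C public_html_backup/

Here -x (extract) replaces -c, and -C tells tar which directory to switch into before extracting. tar won't create that directory for you, which is why we run mkdir first. You can specify a different folder after -C if you prefer. Bear in mind that the archive contains public_html/ as its top-level folder, so the files above end up in public_html_backup/public_html/. To restore back to the main public_html folder, run the tar command from your home directory without -C; any existing files with the same names will be overwritten.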
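
To see how extraction into another folder behaves, here is a small end-to-end sketch. It assumes a scratch directory and a sample index.html that we've invented for the demo; the folder name restore is likewise just an example.

```shell
# Create a sample archive in a throwaway directory (hypothetical content).
demo=$(mktemp -d)
mkdir -p "$demo/public_html"
echo "hello" > "$demo/public_html/index.html"
tar -czf "$demo/public_html_backup.tar.gz" -C "$demo" public_html/

# The target of -C has to exist before extracting, so create it first.
mkdir -p "$demo/restore"
tar -xzf "$demo/public_html_backup.tar.gz" -C "$demo/restore"

# The archive's top-level public_html/ folder is recreated inside the target.
restored=$(cat "$demo/restore/public_html/index.html")
echo "$restored"
```

Extracting into a separate folder like this lets you inspect the old files before deciding whether to copy anything back into the live site.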
