Managing Disk Space
Lesson 5: File archives
Objective: Describe an archive.

File archives

One way of managing disk space is to create an archive. An archive is a set of files packaged into a single, larger file. An archive can contain a few files, one or more directories, or an entire directory tree. Archives are useful for making backup copies of your data. For example, you can store an archive[1] on a different computer or on removable media such as magnetic tape. An archive is easy to manage because you treat it as a single file. In addition, compressing an archive usually saves more space than compressing the same files individually, because the compressor can exploit redundancy across files.
You may be familiar with archives from a Windows or Macintosh environment. Programs such as WinZip or pkzip create archives that end with a .zip extension. You might hear these archives referred to as zip files.
On UNIX, you create archives by using the tar command, whose name is short for tape archive. tar was designed for archiving data to tape. You can also use tar to archive data to a file, which is often called a tar file. Tar files typically end with a .tar extension. The extension is not required, but this convention lets people identify the file as an archive.
Zip files are created in a compressed format by default, but tar files are not. To compress a tar file, you run a separate compression command, such as compress or gzip, on it.
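The two-step approach can be sketched as follows. This is a minimal sketch, assuming gzip is installed (it stands in here for compress, which many current systems do not ship). The directory and file names are hypothetical.

```shell
# Work in a scratch directory so nothing else is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: a directory with two small text files.
mkdir reports
printf 'first report\n' > reports/a.txt
printf 'second report\n' > reports/b.txt

# Step 1: package the directory into an uncompressed tar file.
tar -cf reports.tar reports

# Step 2: compress the tar file as a separate step.
# gzip replaces reports.tar with reports.tar.gz.
gzip reports.tar

ls -l reports.tar.gz
```

After the second step only reports.tar.gz remains, because gzip removes the original file once it has written the compressed copy.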
You can use the tar command to create an archive, to list the file names in an archive, or to extract[2] files from an archive. The next three lessons describe these tasks.
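As a preview, the three tasks might look like the sketch below. It assumes a tar that accepts the -C option (GNU tar and bsdtar both do), and the project directory and its files are hypothetical sample data.

```shell
# Work in a scratch directory so nothing else is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data.
mkdir project
printf 'notes\n' > project/notes.txt
printf 'todo\n' > project/todo.txt

# c = create an archive, f = write it to the named file.
tar -cf project.tar project

# t = list the file names stored in the archive.
listing=$(tar -tf project.tar)
printf '%s\n' "$listing"

# x = extract the files; -C names the directory to extract into.
mkdir restored
tar -xf project.tar -C restored

cat restored/project/notes.txt
```

Extracting into a separate directory keeps the restored copies from overwriting the originals.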

How File Archives Are Used in Unix

File archives in Unix are used to package multiple files and directories into a single file for easy distribution or backup. The most common archive formats on Unix are tar and zip; gzip and bzip2 are compression tools that are often applied to tar archives. The tar command creates a tar archive, which can then be compressed with gzip or bzip2 to produce a .tar.gz or .tar.bz2 file. The zip command creates a .zip archive, which is compressed as it is built. Each kind of file is later uncompressed and extracted with the corresponding tool (for example, gunzip, bunzip2, or unzip).
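Many tar implementations can also run gzip for you in the same command. The sketch below assumes a tar that supports the -z option (GNU tar and bsdtar do, though it is not part of every historical UNIX tar). The log file is hypothetical, generated only to make the size difference visible.

```shell
# Work in a scratch directory so nothing else is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: repetitive text compresses well.
mkdir logs
i=0
while [ "$i" -lt 2000 ]; do
    printf 'line %d: service started without errors\n' "$i"
    i=$((i + 1))
done > logs/app.log

# Plain tar archive, then a gzip-compressed one created in a single step.
tar -cf logs.tar logs
tar -czf logs.tar.gz logs

# Compare the two sizes in bytes.
plain_size=$(wc -c < logs.tar)
gz_size=$(wc -c < logs.tar.gz)
echo "uncompressed: $plain_size bytes, gzip-compressed: $gz_size bytes"

# The compressed archive can be listed without unpacking it first.
tar -tzf logs.tar.gz
```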

Filesystem Types

Before any disk partition can be used, a filesystem must be built on it. When a filesystem is made, certain data structures are written to disk that will be used to access and organize the physical disk space into files. Table 6-5 lists the most important filesystem types available on the various systems we are considering.

Table 6-5. Important filesystem types

Unix Filesystems: Moments from History
In the beginning, there was the System V filesystem, and that is where we will start. This filesystem type once dominated System V–based operating systems. The superblock of standard System V filesystems contained information about currently available free space in the filesystem, in addition to information about how the space in the filesystem is allocated. It held:
  1. the number of free inodes and data blocks,
  2. the first 50 free inode numbers, and
  3. the addresses of the first 100 free disk blocks.
After the superblock came the inodes, followed by the data blocks. The System V filesystem was designed for storage efficiency. It generally used a small filesystem block size: 2 KB or less. Traditionally, a block is the basic unit of disk storage; all files consume space in multiples of the block size, and any excess space in the last block cannot be used by other files and is therefore wasted. If a filesystem has a lot of small files, a small block size minimizes waste. However, small block sizes are much less efficient when transferring large files.
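The waste caused by rounding up to whole blocks can be worked out with a little arithmetic. The sketch below is illustrative only: the function name, the 2500-byte file, and the block sizes are hypothetical values chosen for the example.

```shell
# Space allocated to a file when storage is handed out in whole blocks.
# Arguments: file size in bytes, block size in bytes.
allocated_bytes() {
    size=$1
    block=$2
    # Round the block count up so a partial last block still costs a full block.
    blocks=$(( (size + block - 1) / block ))
    echo $(( blocks * block ))
}

file_size=2500   # hypothetical small file
for block_size in 512 2048 8192; do
    used=$(allocated_bytes "$file_size" "$block_size")
    echo "block=$block_size allocated=$used wasted=$(( used - file_size ))"
done
```

For this file the wasted space grows from 60 bytes with 512-byte blocks to 5692 bytes with 8192-byte blocks, which is why a small block size suits filesystems that hold many small files.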

System V filesystem

The System V filesystem type is obsolete at this point. It is still supported on some systems for backward compatibility purposes only.
The BSD Fast File System (FFS) was designed to remedy the performance limitations of the System V filesystem. It supports filesystem block sizes of up to 64 KB. Because merely increasing the block size to this level would have had a horrendous effect on the amount of wasted space, the designers introduced a subunit to the block known as the fragment. While the block remains the I/O transfer unit, the fragment becomes the disk storage unit (although only the final chunk of a file can be a fragment). Each block may be divided into one, two, four, or eight fragments. Whatever its absolute performance status, the BSD filesystem is an unequivocal improvement over System V. For this reason, it was included in the System V.4 standard as the UFS filesystem type. This is its name on Solaris and Tru64 systems (as well as under FreeBSD). For a while, this filesystem dominated in the Unix arena.
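The effect of fragments on wasted space can be estimated with a simplified model. The sketch below is not the actual FFS allocator: it ignores indirect blocks and other allocation rules, and the function name and the 70000-byte file are hypothetical.

```shell
# Simplified model: whole blocks hold the body of the file, and the final
# partial chunk is stored in fragments instead of a whole block.
# Arguments: file size, block size, fragments per block (1, 2, 4, or 8).
allocated_with_fragments() {
    size=$1
    block=$2
    frags_per_block=$3
    frag=$(( block / frags_per_block ))
    full_blocks=$(( size / block ))
    rest=$(( size % block ))
    # Round the leftover bytes up to a whole number of fragments.
    rest_frags=$(( (rest + frag - 1) / frag ))
    echo $(( full_blocks * block + rest_frags * frag ))
}

block_size=65536   # 64 KB block
file_size=70000    # hypothetical file slightly larger than one block
echo "one fragment per block:    $(allocated_with_fragments "$file_size" "$block_size" 1) bytes"
echo "eight fragments per block: $(allocated_with_fragments "$file_size" "$block_size" 8) bytes"
```

Under this model the file occupies 131072 bytes when the last chunk must fill a whole block, but 73728 bytes when the last chunk can be stored in a single 8 KB fragment.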

In the next lesson, you will learn to create an archive.
[1] archive: An archive is a set of files that are packaged as a single, large file.
[2] extract: To extract files from an archive means to copy them out of an archive and onto the filesystem.