Detecting defective tarballs (from a list of files)

There’s this build system where tar files are downloaded into a specific folder, then extracted, compiled and so on. There are several threads, each of which downloads, extracts, compiles and installs, and suddenly there is no space left on the hard drive. After a little cleanup to free some space, the rebuild fails: some threads were stopped abruptly in the middle of a download, so a few tar files are corrupted. We need to delete them before going on.

Now, there are dozens of files. Checking them one by one would take a long time and be very tedious, so it seems better to clean everything and rebuild, even if it takes a day 🙂

However, we can use a bash script to detect defective tarballs!

First, how can we tell if a tar file is defective?

Tar’s “-t” option gives us a list of the stored files. It’s pretty quick, and when the file is corrupted, it fails, like so:

$ tar -tf not_a_tar_file.tar.gz 
tar: This does not look like a tar archive

gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now

$ echo $?

The bash variable “$?” holds the exit status of the last command. “Pass” is (obviously) zero, and any other value is “Fail”. So we can use this information.

To keep the check quiet for a single file, we’ll send all the output, including stderr, to /dev/null, using a redirection: “&> /dev/null”.
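As a minimal sketch of that single-file check, the test can be wrapped in a small function. The name `is_tar_ok` is a hypothetical helper, not part of the original script, and the redirection is spelled `> /dev/null 2>&1`, which has the same effect as `&> /dev/null` in bash and also works in plain POSIX shells:

```shell
#!/bin/bash

# is_tar_ok: hypothetical helper name, introduced here for illustration.
# Returns 0 when tar can list the archive, non-zero otherwise.
is_tar_ok() {
    # Discard both the file listing (stdout) and the error messages (stderr).
    # The function's return value is the exit status of this tar command.
    tar -tf "$1" > /dev/null 2>&1
}
```

With this in place, something like `is_tar_ok not_a_tar_file.tar.gz || echo "defective"` should report the broken file without printing any of tar's own messages.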

Then all we have to do is put it in a loop, take the folder to check from the command line, and we get this short script:


#!/bin/bash

if [ -z "$1" ] ; then
    echo "Usage: $0 <folder>"
    exit 1
fi

# Make sure the path exists before looping over it
ls "$1" &> /dev/null
if [ "$?" -ne "0" ] ; then
    echo "No such path: $1"
    exit 1
fi

for FULLNAME in "$1"/* ; do
    # Listing fails (non-zero status) when the archive is corrupted
    tar -tf "$FULLNAME" &> /dev/null
    if [ "$?" -eq "0" ] ; then
        echo "$FULLNAME is OK"
    else
        echo "[!] $FULLNAME is defective"
    fi
done

And its execution looks like this:

$ ./ ./list

./list/backup.tar.gz is OK
[!] ./list/not_a_tar_file.tar.gz is defective
./list/one_tar.tar.gz is OK
./list/onther_one.tar.gz is OK
./list/project1.tar.gz is OK
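
Since the whole point is to delete the corrupted downloads before rebuilding, a variant could print only the defective paths, so the list can be reviewed first. This is a sketch under assumptions: `list_defective` is a hypothetical function name, and it treats every regular file in the folder as a tarball.

```shell
#!/bin/bash

# list_defective: hypothetical helper, not part of the original script.
# Prints the path of every regular file in folder $1 that tar cannot list.
list_defective() {
    for f in "$1"/* ; do
        # Skip sub-folders and the unexpanded pattern of an empty folder
        [ -f "$f" ] || continue
        if ! tar -tf "$f" > /dev/null 2>&1 ; then
            echo "$f"
        fi
    done
}
```

Once the printed list looks right, it could be fed to a removal step, for example `list_defective ./list | while IFS= read -r f ; do rm -- "$f" ; done`.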

Note that this script handles only tar files. We can probably extend it to zip files and other archive types by using the same concept.
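
As a rough sketch of that idea, the check could be chosen by file extension. The function name `is_archive_ok` and the list of extensions are assumptions made for illustration; `unzip -t` and `gzip -t` are the integrity tests those tools provide, and `unzip` has to be installed for the zip branch to work.

```shell
#!/bin/bash

# is_archive_ok: hypothetical helper, chosen here for illustration.
# Returns 0 when the archive passes its tool's integrity check,
# non-zero when the check fails, and 2 for an unrecognised extension.
is_archive_ok() {
    case "$1" in
        *.tar|*.tar.*|*.tgz) tar -tf "$1" > /dev/null 2>&1 ;;    # list the tarball
        *.zip)               unzip -tq "$1" > /dev/null 2>&1 ;;  # test the zip file
        *.gz)                gzip -t "$1" > /dev/null 2>&1 ;;    # test a plain gzip file
        *)                   return 2 ;;                         # unknown type
    esac
}
```

The tar patterns come first so that a name like `backup.tar.gz` is checked by tar rather than by the plain gzip branch.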

* No tar files were harmed during the making of this post.

