Getting rid of those files

I've been doing some maintenance on my server and wanted to do some spring cleaning: deleting all the spam files inside users' directories. These files are created automatically by SpamAssassin. I also wanted to get rid of Rails production.log files.

Doing everything manually is no fun, and I have to admit, I completely suck at shell scripting. But if you never try, you'll never learn, so here's what I came up with.

Calculating their size

First, I wanted to find out exactly how much space the files called spam, in all directories under /home, take up. This is the command I came up with (of course, I first had to cd /home):

[root@me-ja home]# find * -name spam -type f -exec echo {} \; | xargs du -ks | awk '{total += $1} END {print total}'

2952880

A little explanation. I use 3 commands, each piping its output to the next one (and the last one prints its result to standard output, which is the screen).

First command - find * -name spam -type f -exec echo {} \; - finds all files ( -type f part ) under the current directory that are named spam ( -name spam part ), and prints their relative paths with the echo command ( -exec echo {} \; part ). Of course, since the output is piped into another command, nothing gets printed on the screen.
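
For what it's worth, -exec echo {} \; starts one echo process per file just to print its path, which find's own -print action already does. A small sketch on a throwaway directory (the alice and bob names are made up for the demo) comparing the two:

```shell
# Throwaway tree standing in for /home; the user names are hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/alice" "$demo/bob"
printf 'junk\n' > "$demo/alice/spam"
printf 'junk\n' > "$demo/bob/spam"
printf 'keep\n' > "$demo/bob/notes"

# The form used in the post: one echo process per matching file.
with_echo=$(cd "$demo" && find . -name spam -type f -exec echo {} \; | sort)

# Same list using find's built-in -print action.
with_print=$(cd "$demo" && find . -name spam -type f -print | sort)

echo "$with_print"
rm -rf "$demo"
```

Both variables should end up holding the same two paths; the notes file is skipped because its name doesn't match.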

Second command - xargs du -ks - takes the list of files from the find command and prints each file's size in kilobytes (1st column; that's what -k asks for) and its name (2nd column) to standard output (which, again, is piped into the next command).

Third command - awk '{total += $1} END {print total}' - is a little awk script, which takes first column of the output (file sizes in kilobytes), sums them up into the total variable, and then prints that variable out.
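
To see the awk part on its own, here it is fed two hand-made lines in the same size-then-path layout that du -k produces. The sizes and paths are invented for illustration:

```shell
# Two sample lines: size in KB, a tab, then the path (made-up values).
total=$(printf '4\talice/spam\n8\tbob/spam\n' | awk '{total += $1} END {print total}')
echo "$total"
```

awk adds the first column of every line into total (4 + 8 here) and prints it once, after the last line has been read.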

The output is in kilobytes, of course.
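
The same total can probably be had with fewer moving parts. A sketch, assuming a find that supports the -exec ... {} + form (GNU and BSD find both do): it hands the paths straight to du, so names containing spaces survive, whereas plain xargs splits its input on whitespace. The demo tree and user names are invented.

```shell
# Throwaway tree standing in for /home; one directory name contains a space.
demo=$(mktemp -d)
mkdir -p "$demo/alice" "$demo/bob smith"
printf 'junk\n' > "$demo/alice/spam"
printf 'junk\n' > "$demo/bob smith/spam"

# {} + batches the matching paths into as few du invocations as possible.
# "total + 0" makes awk print 0 rather than an empty line when nothing matches.
total=$(find "$demo" -name spam -type f -exec du -k {} + | awk '{total += $1} END {print total + 0}')

# du prints one line per file, so this counts how many files were measured.
count=$(find "$demo" -name spam -type f -exec du -k {} + | wc -l)

echo "$count spam files use ${total} KB"
rm -rf "$demo"
```

On the real server the starting point would be /home instead of the demo directory. The exact KB figure depends on the filesystem's block size, so it is not shown here.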

So, we have what? Like almost 3Gigs of spam? This is uncool. Time to get rid of it!
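
For the record, here is the conversion behind that figure, as a quick sketch. It assumes the du -k units are 1024-byte kilobytes, so dividing by 1024 twice gives gigabytes (GiB, strictly speaking):

```shell
# 2952880 is the total printed by the pipeline above.
kb=2952880

# LC_ALL=C keeps the decimal separator a dot regardless of locale.
gib=$(LC_ALL=C awk -v kb="$kb" 'BEGIN {printf "%.2f", kb / 1024 / 1024}')
echo "${gib} GiB"
```

That comes out at about 2.82, so "almost 3" is fair.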

Getting rid of them

Actually, I don't want to remove those files entirely. If I just delete them, they may be recreated with different permissions (not likely in this case, but possible), and any programs that keep these files open will be unpleasantly surprised by their sudden disappearance :)

So what I wanted to do is to truncate these files to zero size, keeping their owner/group/permissions as is.
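
A quick way to convince yourself that truncating in place really leaves the file itself alone is to try it on a scratch file. This sketch uses : > file, which has the same effect as cat /dev/null > file without starting cat; the 640 mode is just an arbitrary example value:

```shell
# Scratch file with some content and a non-default mode.
f=$(mktemp)
printf 'junk mail\n' > "$f"
chmod 640 "$f"

before=$(ls -l "$f" | cut -c1-10)              # mode string, e.g. -rw-r-----
inode_before=$(ls -i "$f" | awk '{print $1}')  # inode number identifies the file

# ":" is the shell's no-op command; redirecting its empty output
# truncates the existing file instead of replacing it.
: > "$f"

after=$(ls -l "$f" | cut -c1-10)
inode_after=$(ls -i "$f" | awk '{print $1}')
size=$(( $(wc -c < "$f") ))

echo "mode $before -> $after, inode $inode_before -> $inode_after, size $size"
rm -f "$f"
```

The mode and inode should be identical before and after, with the size down to zero: same file, just emptied.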

Here's the command I came up with:

find * -name spam -type f -exec sh -c 'echo "truncating $0"; cat /dev/null > "$0"' {} \;

Basically, it is mostly the same as the previous command, with the exception of the parameters passed to -exec. The quotes around $0 keep paths containing spaces in one piece. One might be tempted to just write something like this (and I tried it myself as well):

find * -name spam -type f -exec  cat /dev/null >  {} \; (THIS WON'T WORK!!)

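However, this approach won't work. The > redirection is handled by the shell that launches find, before find even starts, so the output goes to a file literally named {} in the current directory and the spam files are left untouched :)

And so, basically, I just went with executing a small shell script, which prints out "truncating <filepath>" and then trims the file to zero size with the command cat /dev/null > /file/path.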
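
If you want to watch it fail safely, here is a small sketch that runs the broken command inside a throwaway directory (the alice name is made up) and then checks what happened:

```shell
demo=$(mktemp -d)
mkdir -p "$demo/alice"
printf 'junk mail\n' > "$demo/alice/spam"

# Run the broken command in a subshell so the cd does not leak out.
# The shell opens a file named "{}" for output, then starts find with
# only "-exec cat /dev/null \;" left as the -exec arguments.
(cd "$demo" && find . -name spam -type f -exec cat /dev/null > {} \;)

# Did a stray file called {} appear, and is the spam file still intact?
if [ -e "$demo/{}" ]; then stray=yes; else stray=no; fi
size=$(( $(wc -c < "$demo/alice/spam") ))

echo "stray {} file created: $stray, spam file still $size bytes"
rm -rf "$demo"
```

The spam file keeps its content, and a new empty file called {} shows up next to the directories, which is exactly what happens when the command is run by hand.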
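
One possible refinement, sketched here on a demo tree with invented user names: let a single sh process handle a whole batch of files via -exec ... {} +, and pass the paths as regular arguments so they can be quoted. This assumes a find with the {} + form, which GNU and BSD find both provide.

```shell
# Throwaway tree standing in for /home; one directory name contains a space.
demo=$(mktemp -d)
mkdir -p "$demo/alice" "$demo/bob smith"
printf 'junk\n' > "$demo/alice/spam"
printf 'junk\n' > "$demo/bob smith/spam"

# The extra "sh" after the script fills $0; the matched paths arrive as "$@".
find "$demo" -name spam -type f -exec sh -c '
  for f in "$@"; do
    echo "truncating $f"
    : > "$f"
  done
' sh {} +

# Files named spam that still exist, and those that still have content.
kept=$(( $(find "$demo" -name spam -type f | wc -l) ))
remaining=$(( $(find "$demo" -name spam -type f -size +0 | wc -l) ))

echo "$kept spam files kept, $remaining still non-empty"
rm -rf "$demo"
```

Both files should still be there, and none should have any content left.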

So now all my users' spam-holding files are back to zero, life is good, and I have 3 gigs less stuff to back up.

There might be better, more hardcore approaches to the problem, so if you are a god of shell scripting, do sound off in comments :) As for other folks, hopefully somebody learned something useful here.. I definitely did while producing these hardly readable lines of code :)

