mv_smart

While doing a bit of spring cleaning on my filesystems, I found myself with several directories containing mostly the same files, which I wanted to merge.  Of the files the directories had in common, a few differed and would need examination, but the rest were identical and could simply be discarded.

Basically I needed a more powerful version of `mv -n` that also compares md5sums.  Here’s the script I cobbled together to solve this:

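#!/bin/bash
# Usage: mv_smart FILE... DEST_DIR
# Move each FILE into DEST_DIR.  If a file of the same name is already there,
# delete the source when the md5sums match and skip it when they differ.

if [ $# -lt 2 ]; then
    echo "Usage: $0 FILE... DEST_DIR" >&2
    exit 1
fi

# The last argument is the destination; everything before it is a source.
# FILES is an array so that names containing spaces survive intact.
DEST="${@: -1}"
FILES=("${@:1:$#-1}")

if [ ! -d "${DEST}" ]; then
    echo "${DEST} is not a directory" >&2
    exit 1
fi

for file_path in "${FILES[@]}"; do
    if [ -d "${file_path}" ]; then
        echo "${file_path} is a directory"
        continue
    fi

    file=$(basename -- "${file_path}")
    echo "${file}"
    dest_file="${DEST}/${file}"
    if [ ! -e "${dest_file}" ]; then
        mv -i -- "${file_path}" "${dest_file}"
        continue
    fi

    # Reading from stdin makes md5sum print "<hash>  -"; keep only the hash
    a=$(md5sum < "${file_path}" | cut -d' ' -f1)
    b=$(md5sum < "${dest_file}" | cut -d' ' -f1)

    if [ "$a" = "$b" ]; then
        # files are the same; keep destination copy
        #echo "$file is already in $DEST - $a = $b"
        rm -- "${file_path}"
    else
        echo "$dest_file exists and differs from $file.  Skipping."
    fi
done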

Nothing terribly special, but since then I’ve found this handy in more than a few cases, so I’m posting in case others might find it useful too, or could suggest improvements.

3 Responses to mv_smart

  1. Benjamin says:

    Why don’t you compare both files directly without generating a hash?

    if cmp "$file_path" "$dest_file" > /dev/null; then

  2. Did you look at rsync to do this? It’s not just for remote file transfers…

    • bryce says:

      Yep. rsync works acceptably if you know the source is newer than the destination, but if it’s the other way around you’ll end up overwriting newer files with older versions.

      That said, there is probably some way to do it with rsync. rsync has a huge number of command line options. If anyone figures out an equivalent to my script using just rsync, please post!
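
The `cmp` suggestion in the first comment could be folded into the script roughly as follows. This is only a sketch, not a tested replacement for the script above: `merge_one` is a made-up helper name, and it assumes the destination directory already exists.

```shell
#!/bin/bash
# merge_one SRC DEST_DIR   (hypothetical helper, not part of the original script)
# Moves SRC into DEST_DIR, deletes SRC if an identical copy is already there,
# and leaves SRC untouched (returning 1) if the existing copy differs.
merge_one() {
    local src=$1 dest_dir=$2
    local dest_file
    dest_file="${dest_dir}/$(basename -- "${src}")"

    if [ ! -e "${dest_file}" ]; then
        mv -- "${src}" "${dest_file}"
    elif cmp -s -- "${src}" "${dest_file}"; then
        # cmp -s prints nothing and exits 0 only when the files are byte-identical
        rm -- "${src}"
    else
        echo "${dest_file} exists and differs from ${src}.  Skipping."
        return 1
    fi
}
```

It would be called once per file, for example `for f in old_dir/*; do merge_one "$f" new_dir; done`. Compared with hashing, `cmp` can stop at the first differing byte, whereas `md5sum` always reads both files in full.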
