Import Soul Wouldn’t it be nice if things just worked


Cleaning up the iTunes media directory

In the process of moving my iTunes library to my new laptop (well, it's not so new anymore), iTunes somehow managed to duplicate my entire library, creating a copy of every track, movie, TV show, etc., and giving each duplicate an obfuscated four-letter name, e.g. ABCD.mp4.

Besides being a large bunch of unnecessary files, they eat up a lot of HDD space: around 100 GB.

After finally finding out what was eating up my HDD space, it was easy to write a quick little Python script to find and remove all of these files.

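import os
import re
import shutil

# Folder to scan, and folder that receives the suspected duplicates.
CLEAN_FROM = r"C:\Users\Hugh\Music\iTunes\iTunes Media"
CLEAN_TO = r"C:\Users\Hugh\Desktop\junk"

FILETYPES = ["mp4", "m4a"]

# Create the destination folder if it does not exist yet.
if not os.path.exists(CLEAN_TO):
    os.makedirs(CLEAN_TO)

# Duplicates are named with exactly four capital letters plus a media extension, e.g. ABCD.mp4.
pat = re.compile(r"[A-Z]{4}[.](%s)$" % "|".join(FILETYPES))

print("Starting")
for dirpath, dirnames, filenames in os.walk(CLEAN_FROM):
    for fname in filenames:
        if pat.match(fname):
            # Move rather than delete, so the files can be checked first.
            shutil.move(os.path.join(dirpath, fname), os.path.join(CLEAN_TO, fname))
            print("Moved: " + fname)
print("Done")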

I can't guarantee this script will work for you, so check what you are about to delete before you actually delete it.
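If you want to see what would be caught before anything gets touched, a dry run along these lines should do it. This is only a sketch: find_candidates and report are names made up for this example, and it assumes the duplicates follow the same four-capital-letter pattern as in the script above.

```python
import os
import re

# Same naming pattern as the script above: four capital letters plus a media extension.
FILETYPES = ["mp4", "m4a"]
PATTERN = re.compile(r"[A-Z]{4}\.(%s)$" % "|".join(FILETYPES))


def find_candidates(root):
    """Return (path, size_in_bytes) for every file under root whose name matches PATTERN."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for fname in filenames:
            if PATTERN.match(fname):
                path = os.path.join(dirpath, fname)
                found.append((path, os.path.getsize(path)))
    return found


def report(root):
    """Print each candidate and the total space used, without moving or deleting anything."""
    candidates = find_candidates(root)
    for path, size in candidates:
        print("%s (%.1f MB)" % (path, size / 1e6))
    total = sum(size for _, size in candidates)
    print("%d files, %.2f GB in total" % (len(candidates), total / 1e9))
```

Calling report on the iTunes Media folder prints the list and the total size, which is a quick sanity check that the pattern is not catching real tracks.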



While messing around getting some ebooks onto my nice new Kindle, I ran into the problem that some books have been poorly OCRed and have lost a few spaces along the way, e.g. "Monty python" may become "Montypython".

To combat this I have written a little script that searches through the text and tries to identify and correct the places where this has happened.
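To give a rough idea of how this kind of repair can work, here is a minimal sketch. It is not necessarily what the script on GitHub does: it assumes a set of known words (KNOWN_WORDS here is a tiny made-up vocabulary, a real run would load a proper word list) and only tries to split an unknown token into exactly two known words. The names split_joined and fix_spaces are invented for this example.

```python
import re

# Hypothetical tiny vocabulary; a real run would load a full word list.
KNOWN_WORDS = {"monty", "python", "the", "holy", "grail", "and"}


def split_joined(token, known=KNOWN_WORDS):
    """Split token into two known words, unless it is already a known word."""
    lower = token.lower()
    if lower in known:
        return token
    # Try every split point and take the first where both halves are known words.
    for i in range(1, len(token)):
        if lower[:i] in known and lower[i:] in known:
            return token[:i] + " " + token[i:]
    # No valid split found: leave the token alone.
    return token


def fix_spaces(text, known=KNOWN_WORDS):
    """Apply split_joined to every run of letters, leaving punctuation and spacing untouched."""
    return re.sub(r"[A-Za-z]+", lambda m: split_joined(m.group(0), known), text)


print(fix_spaces("Montypython and the holygrail."))
```

Taking the first valid split is the crude part. A word list containing short words like "a" or "an" will produce some false splits, so it is worth reviewing the changes rather than trusting them blindly.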

Currently I have it all set up and hard-coded for use on ZIP/HTML exports from calibre, returning a new zip that can be re-imported. If you are interested in using the functionality on plain text, feel free to download it and use the clean_book function.
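The zip round trip itself is simple enough to sketch with the standard zipfile module. This is an illustration only, not the code from the repository: rewrite_zip is a made-up name, the cleaning step is passed in as a plain function (clean_text) because the real clean_book signature is not shown here, and the HTML files are assumed to be UTF-8.

```python
import zipfile


def rewrite_zip(src_path, dst_path, clean_text):
    """Copy src_path to dst_path, passing the text of every HTML member through clean_text."""
    with zipfile.ZipFile(src_path, "r") as src:
        with zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
            for info in src.infolist():
                data = src.read(info.filename)
                if info.filename.lower().endswith((".html", ".htm")):
                    # Assumption: the exported HTML is UTF-8 encoded.
                    data = clean_text(data.decode("utf-8")).encode("utf-8")
                # Everything else (images, stylesheets) is copied through unchanged.
                dst.writestr(info.filename, data)
```

Note that clean_text receives the raw HTML, tags included, so whatever function is passed in has to be careful not to rewrite markup.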

All the code and bug tracker can be found HERE on GitHub