Reduce Bandwidth By Shrinking Images

A friend has a simple WordPress-based website that I set up for him a few year ago - and he likes to post a lot of images. For various reasons I have it hosted on my VPS in Australia, which only comes with 15GB of bandwidth, most likely because I got a crazy good deal on it.

The problem is, that isn’t much bandwidth. Each page of his site was previously using almost 10MB of bandwidth per view - which meant the 15GB was running out. I deduced the high usage in Firebug and noticed it was gasp from large images, some almost 1MB large. The images on the server are contained in subdirectories under a gallery folder - and I wanted to resize them. What I actually wanted to do was:

  1. If the image was over 1024x900, resize it.
  2. If the image size (bytes) was over X, lower the quality.

I ended up using ImageMagick, which is what first obviously came to mind for image manipulation. While there is likely a better way to resize all the images, this is what I came up with. Maybe somebody else will find it useful too.

for dir in /home/vhosts/website.com/html/wp-content/gallery/*
do
cd $dir
for filename in *.jpg
do
echo Filename: "$filename"
width=`identify -ping -format %w "$filename"`
height=`identify -ping -format %h "$filename"`
if [ $width -ge 1024 -a $height -ge 900 ]; then
convert "$filename" -resize "%80>" "$filename"
echo New Dimensions: $width x $height
fi
filesize=`stat -c%s "$filename"`
if [ $filesize -ge 414981 ]; then
convert "$filename" -quality 90 "$filename"
echo Old Filesize: $filesize
echo New Filesize: `stat -c%s "$filename"`
fi
done
done

Save it as “resize.sh” and run it with “bash resize.sh”. The output:

Filename: img_5364.jpg
Old Filesize: 950748
New Filesize: 331426
Filename: img_5366.jpg
Filename: img_5367.jpg
Filename: 1c10_1.jpg
Filename: 1.jpg
Filename: 2752_1.jpg
Filename: 42920014.jpg
New Dimensions: 1544 x 1024
Filename: 42920017.jpg
New Dimensions: 1544 x 1024

Renaming Apache Log Locations

I realized a few of my log files were growing unusually large, and even worse, logrotate was skipping them. I took a look in logrotate.d and straight away realized why: I had created silly names for the log file. logrotate look for .log files, but I had specified mine as .log – e.g. kelvinism_access_log. I was as familiar with logrotate when I set up the domains, so set forth to get them in the rotation.

Firstly, I had to rename the actual log files. So, to rename kelvinism_access_log to kelvinism_access.log, a one-liner:

for x in *_log; do mv $x `basename $x _log`.log; done;

Next, I needed to rename the log location inside each of the Apache config files. While a one-liner might be possible, I used the following tiny script:

#!/bin/sh
 
for x in *
do
sed 's/_log/\.log/' $x > /tmp/tmpfile.tmp
mv /tmp/tmpfile.tmp $x
done

Beginning Scripting ESXi

I’m not impressed too often with much software, especially the closed source kind. I find a leaning preference to all things FOSS. If I had a million dollars, I’d likely spend all day contributing to all the projects I wish I had time to contribute to. Regardless, there are a select few closed-source products that I believe are truly excellent. I mean, the type of software where you aren’t asking “I wish this could do this” and start asking “I wonder what else this can do.”

While I’ve played around with most types of virtualization out there (OpenVZ, Xen, V-Server, qemu…), I’ve really found a soft spot for VMWare.

Don’t get me wrong, if I was going to host a heap of Linux web servers I would absolutely use Xen, but for a heterogeneous environment, I haven’t used anything as easy as VMWare’s products. Not that I judge a product by how easy it is to use, not by a long shot, but ease of use sure makes judging other factors easier.

Regardless, this isn’t a post trumpeting VMWare. I just realized tonight that some of the VMs I have running don’t need to be except for certain hours of the day, or if condition A is true. The first example is my backup mail server; I really don’t need it even powered on unless my main server is down. The second example is my Server 2003 instance, which has VI3 on it; I don’t need this running unless I’m asleep. One of the most useful resources I’ve seen for the vmrun command is over at VirtualTopia – loaded with examples.

Turn off via time

On my “monitoring” instance, which is always up, I’ve decided to install the script that controls my VM. I’ve opted to use a soft shutdown.

192.168.0.10 = ESXi box

datastore1 = name of datastore that hosts VMs

#!/bin/sh
 
vmrun -t esx -h https://192.168.0.10/sdk -u root -p root_password stop "[datastore1] Server 2003 R2/Server 2003 R2.vmx" soft

I have that saved in a file called stop_2003.sh in /opt/vmware/bin; make sure it isn’t world readable. I also have a start_2003.sh:

#!/bin/sh
 
vmrun -t esx -h https://192.168.0.10/sdk -u root -p root_password start "[datastore1] Server 2003 R2/Server 2003 R2.vmx"

Next, edit root’s crontab (crontab -e):

# m h  dom mon dow   command
0 8 * * * /opt/vmware/bin/start_2003.sh
0 23 * * * /opt/vmware/bin/stop_2003.sh

The conditional task is a tad bit more tricky, but just a tad. Ping won’t do, since the mailserver could go down itself, so install nmap. Create a script:

#!/bin/bash

if nmap -p25 -PN -sT -oG - mail.kelvinism.com | grep 'Ports:.*/open/' >/dev/null ; then
echo \`time\` >> mailserver.log
else
/opt/vmware/bin/start_mail.sh
fi

And sticking with our theme, start_mail.sh:

#!/bin/sh

vmrun -t esx -h https://192.168.0.10/sdk -u root -p root_password start "[datastore1] Mail Server/Mail Server.vmx"

This of course changes the crontab entry to:

#!/bin/bash
 
if nmap -p25 -PN -sT -oG - mail.kelvinism.com | grep 'Ports:.*/open/' >/dev/null ; then
echo `time` >> mailserver.log
else
/opt/vmware/bin/start_mail.sh
fi

So, that’s it. detect_port.sh is lacking any type of error detection or redundancy - if one packet/scan is dropped, the mail server will turn on. I’ll re-work this at some point, but it works for now.

Update: Vmware has also released a decent blog entry about using vmrun: on their blog.