Date Sent: February 15, 2012
Subject: It Only Takes A Minute (to Break a Website)
From: John Bondon
I am not a web coder/programmer by trade. My expertise with computers is more behind the scenes: managing the servers and network infrastructure that power our Information Age economy. I would be classified as a Network Administrator or Server Operator, though I do have some limited coding knowledge. Enough to build and maintain this hiking website, for example. Computer programming is something I do from time to time, either when I have the time to tackle something new on my "wishlist" for the hiking website, or more commonly when I am forced to because of an ugly error or problem with the site. The latter being the case today.
Some members have reported to me that they have not received any of my weekly hiking emails lately. Others who have been receiving them may have noticed that the same email has gone out 3 weeks in a row! Oops!
You may recall that back in 2009 I automated this process of sending out the Weekly RSVP (Wednesday) email. Now I just need to upload my content (the body of the email) and it fires automatically. Or if I fail to get my content uploaded in time, it fires anyway, but sends a generic message to you. This is better than the old system where I always had to send the email manually and you could never predict when that would happen, IF at all!
Late last month I moved our hiking website to a new server. At the same time I upgraded the underlying software that the website runs on. I have since discovered some differences that explain BOTH complaints ... either no emails or the same email for weeks!
You see, I always test when I move the website to a new home, before making that final cutover. But as with most tests, there is a difference between sending out a few test emails and generating this Weekly email that reaches thousands of members at once. And not testing for that second scenario was my downfall.
So long story short, what I discovered today as the cause of some people not receiving my emails was a simple timeout setting on the new server. A limit I either never reached before, or one that was set differently on the old server. The limit stopped my code from executing after a certain period of time if it had not finished yet. This is a failsafe in case there is a problem with my code (or a hacking attempt) so that one script doesn't tie up the entire server. (Like when ONE program on your computer hangs, causing your ENTIRE computer to appear as if it crashed and become non-responsive.) Only in this case, my script was actually fine, but because our membership list has been growing steadily over the years, my automated email script now legitimately takes longer to run. Today's email, for example, is being sent to over 3,400 recipients, which takes the server a full minute or two to generate. That full minute of processing time was enough to trigger a timeout and force an interruption of my code.
So what was happening is, the members at the front of the list received my January 25th email each time this script ran, but the script would always bomb in the middle, so the second batch of members would never receive it. And that email would never get marked as having completed, so the next week it would try to send the exact same email again!
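For readers who code, the failure described above, and one way to guard against it, can be sketched roughly as follows. This is an illustrative Python sketch only, not the actual script behind the hiking website (whose language and code are not shown here). Every name in it (`send_weekly_email`, `already_sent`, `send_one`, `TimeLimitExceeded`) is hypothetical. The idea is to record progress per recipient rather than per run, so that a run cut short by a time limit can pick up where it left off instead of starting over with the same people.

```python
import time


class TimeLimitExceeded(Exception):
    """Hypothetical stand-in for the server cutting off a long-running script."""


def send_weekly_email(recipients, already_sent, send_one, time_limit_s,
                      clock=time.monotonic):
    """Send to every recipient not yet in already_sent, recording each success.

    Raises TimeLimitExceeded once the run has gone past time_limit_s,
    imitating a server-side timeout that interrupts the script mid-list.
    Returns True only when the whole list has been covered.
    """
    start = clock()
    for address in recipients:
        if address in already_sent:
            continue  # delivered during an earlier, interrupted run
        if clock() - start > time_limit_s:
            raise TimeLimitExceeded(
                f"interrupted with {len(already_sent)} recipients done"
            )
        send_one(address)
        already_sent.add(address)  # progress is saved per recipient
    return True  # safe to mark this week's email as completed
```

In a real setup the `already_sent` set would presumably live in the database so it survives between runs, and the email would be marked as completed only after the function returns `True`. With that in place, a rerun after an interruption skips the members who already got the email and continues with the rest.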
So today I finally took some time to review my code, the data in the database, and the errors in the server logs, to figure out what was going on. And in the process I learned about a timeout setting I never knew existed, and never thought would affect my small website. And I learned a valuable lesson about testing for real world conditions before cutting over the hiking website to a new server.
And now hopefully you are once again receiving my emails ... a NEW email to boot!
Copyright © 2004-2021 John Bondon all rights reserved