Selma, Nelson, blacklists and shared server SSLs.

Written by Tyler Jacobson on February 26, 2013

It’s been quite a week and it’s only Tuesday.

I apologize up front for the issues, no matter how big or small, that you may have been experiencing this week. We do our absolute best to keep things running shipshape and problem-free, but there are times when all of the proper maintenance & attention to detail in the world won’t stop issues from arising. Now is one of those times. When it rains, it pours – or, given that it’s winter, when it snows, it blizzards.

Selma

UPDATE: (2/26/13 – 2:15 MST) Our techs were able to get incoming and outgoing mail up during this migration. 

Two days ago, Selma began sending us notices that the S.M.A.R.T. status of her hard drive indicated that it may fail soon. This is the purpose of the S.M.A.R.T. status: to give us a heads-up when the drive is legitimately experiencing issues. 9 times out of 10 when this happens, we have at least a few days to make a smooth transition to new hardware, with the ability to offer some forewarning to customers on the server that a migration to new hardware will take place. We are also usually able to keep the server running as the files are copied to the new hardware.

Unfortunately, this is not currently the case. The S.M.A.R.T. warning that occurred signaled that the issue was severe and required immediate attention.

Good news: Out of sheer coincidence, we were in the process of provisioning a new server specifically to replace Selma within the next week when this happened. Bad news: The current server required a reboot into rescue mode, which disables all access and file modifications. From the moment the server came back up in rescue mode, a copy of its contents to the new hardware began. It is a long process and until it completes, there will not be access to email. This is inconvenient and very close to a worst-case scenario for us, and we know it is for you too. We will continue to pursue other paths to migration that bring the server up in the meantime, but to be frank, it’s not looking good.

Don’t you have a backup?

We do, but it’s an archival backup and not an exact copy of the main drive. Unfortunately, there’s no way to actually run the server software off of the backup. That’s the bad news.

The good news is that the new hardware will have more redundancy so this should be the last time we experience an issue such as this.

Can’t you just move me to a new server, then?

Yes. That’s what we are doing at this very moment with every account on Selma. At the moment, this can’t be done without downtime due to the severity of the hard drive errors.

Can you re-route my mail to a different server?

Unfortunately, no, we’re not able to do this given the current status of the server. While we know it’s not ideal, we’d recommend setting up a Gmail account and notifying your contacts to reach you there for the time being.

Nelson

Nelson has twice experienced an issue where it has run out of memory, causing the server to stop responding. When this happens, we have rebooted the server and examined the logs looking for the errant source. Both times, so far, the findings were inconclusive.

As a result, we have added additional logging to the server to help us identify the culprit and resolve the issue. I wouldn’t say that we are out of the woods yet, (and luckily these woods are inconvenient rather than disabling), but we see some daylight and the issues on Nelson are close to being resolved.

Blacklists

Where to start on blacklists? Let me start by telling you what MacHighway does to prevent spam:

We don’t offer an open relay. This means that all mail sent through our server must be sent with a valid username and password for an email account. A spammer can’t just send through our server without either hacking an email account (to prevent this, make sure all email accounts are sending and receiving via SSL – click here for instructions) or hacking a site (to prevent this, keep your WordPress, Joomla, phpBB, Drupal, et al. regularly up to date).

We are on feedback loops with major providers. This means that when someone on AOL, for example, reports a message that originated from our server as spam, we’re aware of it. We can see who the sending account was and determine if it is indeed spam or if it is just an email that was incorrectly marked as spam by the recipient.

Once we can verify that spam is legitimately being sent from our servers….

We stop the spam that is being sent from our servers. We investigate all spam complaints to determine which account it came from, and whether it was indeed a hack or a misguided attempt by a customer to market their site. We then take the appropriate action to immediately stop the spam messages from leaving the server.

Ending up on blacklists is a fact of shared hosting in 2013. While slow to respond to removal requests, most blacklist proprietors are at least responsive to such requests and understand that even legitimate shared hosts who take all of the necessary precautions still land on lists from time to time. Other list providers act in an almost vindictive manner, assuming that if a server ended up on the list, it must be because its operators are either spammy or irresponsible. These particular blacklist operators refuse to be open to communication in an effort to resolve the problem and, ultimately, any blacklisted server is at their mercy.
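For the curious, here is a rough sketch of how a DNS-based blacklist lookup generally works: the IPv4 address is reversed octet by octet, the blacklist's zone name is appended, and the result is looked up in DNS. If an answer comes back, the address is listed. This is not our internal tooling, only an illustration. The function names are made up for this example, and `dnsbl.example.org` is a placeholder zone, not a real blacklist.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name used to look up an IPv4 address in a blacklist zone.

    Hypothetical helper for illustration; raises ValueError on a malformed address.
    """
    addr = ipaddress.IPv4Address(ip)
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the zone answers with an A record for the address."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        # No record found: not listed (or the lookup itself failed).
        return False
    return True


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address; the zone is a placeholder.
    print(dnsbl_query_name("192.0.2.10", "dnsbl.example.org"))
```

Note that a failed lookup and a clean result look the same in this sketch, so a real check would want to tell those two cases apart.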

We take the responsibility of caring for your site and email very seriously. We are committed to making your hosting experience with MacHighway fuss free. When problems do arise (which they will – because… COMPUTERS) we do our absolute best to resolve them quickly and effectively. Being unable to remove ourselves from blacklists is as much a source of frustration for us as it is for you.

SSL Certificate issues.

It took a little bit of searching, but we’ve concluded that a cPanel update had disabled the shared SSLs for many servers, resulting in certificate errors (mostly on mail) for several customers. Once we identified the source of the issue, the fix was relatively painless. Everything should be working on this front. We don’t expect this to happen again…. unless, of course, cPanel pushes out another update that does this again. Fingers crossed.

Thanks so much for your patience on all of these matters. We recognize that it is never a good time for your email or hosting to be down and we are doing (and will continue to do) everything in our power to minimize downtime and keep your servers running without problems.

If you need us, please contact our 24/7 support department.