I’m glad that Becky has been blogging quite a bit recently, because even though there are a lot of things I have to blog about, I haven’t been able to get any of them worked into an actual post. So here’s an attempt at covering just one of those things: some of my frustrations with [Site5](http://www.site5.com/?affiliateid=9), the web hosting provider for prwdot.org.
I’ve been a customer of [Site5](http://www.site5.com/?affiliateid=9) for several years now. When I originally signed up, their motto was “Web hosting for web developers by web developers” (or something along those lines). The idea was that they were a small company that offered knowledgeable support, rock-solid web servers, and cutting-edge technology. They were a company that understood what web developers needed, and being a web developer, I was sold on the idea. I liked that I knew the names of the system administrators, and that they respected my intelligence and actually worked *with* me to solve problems.
Over the past year or so, I’ve seen their level of service degrade significantly. Perhaps because of their incredibly great deals (I’m only paying $8.77/month for 11 gigabytes of disk space), they have experienced a steady influx of customers, and have grown to the limits of their capacity, both in terms of hardware resources and personnel resources. System load averages on their servers have increased, customer support response times have increased, and things have generally become very frustrating for me. I depend on Site5 to host all of the subdomains running at prwdot.org, and Becky and I also depend on them for our personal email service. In addition, I do all of the coding for our website directly on the server by means of SSH, the secure shell protocol. This allows me to directly connect to the Site5 servers and edit files on the server itself. I avoid the hassle of editing files locally and then uploading them in order to make changes to our site. SSH is not a service that many web hosts provide, and I am grateful to Site5 for allowing its customers this type of access. But when Site5’s servers have problems, then our SSH, email access, and website don’t work, and that’s a problem.
I am always vigilant about submitting technical support requests. As soon as I realize there is an issue, I email Site5’s support team. The time it takes for them to respond varies based on how many support technicians are on staff at a given time, the backlog they have to work through, and the difficulty of a given request. Often I receive a response within 5 minutes, but there are also many times when it takes a half hour or more to respond. Quite frequently, there will be a transient issue with our server, and by the time the support tech has responded, the issue has already passed, so there is nothing for them to look at. Part of the problem is that there aren’t enough support techs to handle the load, but the other part of the problem is that Site5’s servers are not as rock-solid as they need to be, and thus require a lot more intervention by support staff than should be necessary. In addition, Site5’s system monitoring tools are not sophisticated or comprehensive enough to adequately monitor and respond to all possible issues that could occur on a server.
Because of these issues, and because of my growing concern, I recently sent Site5 management a very long email detailing the problems I had with their services and my hopes for resolution. It took several weeks before I received a full written response, but that is because Site5 was busy handling several other issues, including the launch of a new customer backend website. In addition, before actually responding to my email, they discussed its contents during a management meeting and actually began addressing some of my concerns. I was really hoping that they would respond to me first before taking any action, but I can’t blame them for wanting to do things their way and for attending to their top business priorities first.
During this intervening period, there were several blog posts by Site5 staff that I assume were either directly in response to my email, or tangentially related. For example, Todd Mitchell, Site5’s COO, wrote a post on [system load](http://www.westifelia.com/2006/04/03/what-is-system-load-how-does-it-affect-me/) on the evening of the day I sent my original email. In this post, he seemed to be deflecting my concerns about high system load by stating that the system load average is not the best indicator of site performance. Instead, he recommends looking at the time it takes to load a webpage as the best metric of site performance. While I agree that would be fine if website performance were the only thing you’re concerned about, I look at system performance in a more holistic manner. Yes, webpage load time is one important factor, but other factors important to me include email subsystem performance (POP/IMAP/SMTP – how fast can I send and receive email messages?) and shell performance (how quickly can I log in via SSH, how quickly does the vi editor respond when I am editing a file?). It appears to me that Site5’s servers give preference to web server performance, and that even during times of relatively high system load, web page load time is quite good. But during those times, the other services take a hit. And though your average Joe web hosting customer may not care about them, I do. Since Site5 never said anything to make me believe that website performance was paramount, I fully expected the rest of their services to perform just as well.
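To make the "holistic" view a bit more concrete, here is a minimal sketch of how one might compare responsiveness across services rather than looking at web page load time alone. It is only an illustration under stated assumptions: the service list, the function names, and the use of TCP connect time as a stand-in for responsiveness are all my own choices, not anything Site5 uses. Connect time is a crude proxy at best; it says nothing about how quickly vi redraws over SSH or how long a large message takes to send.

```python
import socket
import time

# Assumed service list, using the standard port for each protocol.
SERVICES = {"http": 80, "smtp": 25, "imap": 143, "pop3": 110, "ssh": 22}


def timed(fn):
    """Run fn() and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def connect_time(host, port, timeout=5.0):
    """Seconds needed to open a TCP connection, or None if it fails."""
    def attempt():
        with socket.create_connection((host, port), timeout=timeout):
            return True

    try:
        _, elapsed = timed(attempt)
        return elapsed
    except OSError:  # refused, unreachable, timed out, DNS failure
        return None


def probe_all(host, services=SERVICES, probe=connect_time):
    """Map each service name to its connect time (None if unreachable)."""
    return {name: probe(host, port) for name, port in services.items()}


def slowest(results):
    """Name of the slowest reachable service, or None if none responded."""
    reachable = {k: v for k, v in results.items() if v is not None}
    return max(reachable, key=reachable.get) if reachable else None
```

Calling `probe_all` with a hostname returns one timing per service, and `slowest` picks out the laggard. Run periodically, something like this would show whether mail and shell access slow down during busy periods even while the web server stays quick.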
Based on Site5’s response to my email, I do believe that they intend for their servers to be operating up to my discriminating standards, but they haven’t had the systems in place to adequately monitor and respond to increased load on their systems. It appears that they have started to take some steps in the right direction. Site5 CEO Matt Lightner says in [Rock Solid Web Hosting](http://weblog.site5.com/articles/2006/04/24/rock-solid-web-hosting) that they are working on “A new intelligent system resource management system, automatic outage response and resolution and increased transparency of our systems administration goings-on” and that they “will be bolstering our customer service and support teams to ensure that when you need assistance, you get it in record time”. Todd Mitchell states that they’re [hiring some new sysadmins](http://www.westifelia.com/2006/04/29/new-team-member-more-to-come/) to help develop better server monitoring technologies, and customer service manager Kevin Hazard (who was actually the person to write the response to my email) writes about their [current state of meta-planning](http://weblog.site5.com/articles/2006/04/12/she-turned-me-into-a-newt-i-got-better).
So I’m semi-optimistic about Site5’s progress. Although I am always evaluating other options for web hosting, I haven’t made the decision to switch just yet. I am going to stick with Site5 for a while – even though things have gotten a bit rough, I still believe that at their core, they are a good company that provides a good service, and I want to help them improve. That’s not something you find a lot in a [commodity market like web hosting](http://weblog.site5.com/articles/2006/02/23/the-death-of-web-hosting)… most people would just switch to whichever host was fastest and cheapest. I guess it’s the same kind of loyalty I have to companies like [Apple](http://www.apple.com/). Site5 does some truly cool stuff, and from reading over their weblogs, it’s obvious that they have some very bright software engineers, system administrators, and managers. They’re probably one of the first web hosting companies to fully embrace the [Ruby on Rails](http://www.rubyonrails.org/) programming environment, and they’ve done a lot of good things for the RoR community. They have some very active customer forums, and are constantly trying out interesting and innovative programs and contests for their employees and customers. Again, there really are a lot of great things I can say about Site5.
However, just like with Apple, I’m not looking at everything through rose-colored glasses. I am well aware of the past, and in fact I’d like to reflect on a few posts from various Site5 blog entries of almost a year ago. Some of them are eerily similar to what I’ve been reading lately. I know that nobody likes their skeletons to be dragged out of the closet, but I just wanted to give everyone else the benefit of looking at current events in the context of the past.
On June 26, 2005, in [Bulletproof Hosting](http://blogs.eng5.com/~mlightner/?p=16), Matt Lightner wrote:
It has come to my attention that Site5 could benefit from some additional attention to our operational infrastructure. With the number of servers currently in our fleet, making sure that everything on every server is in perfect condition at all times (and that we know within seconds if it isn’t) is now a requirement. Up until now, we have been working with several different systems to accomplish the goal of fleet health monitoring, however that method is quickly showing its inability to scale effectively.
Performance issues like those recently seen with the Collosex server are simply unacceptable, and I am taking it upon myself to ensure that the health of a Site5 server is never again permitted to fall outside an acceptable range more than evanescently.
Later, on August 2, 2005, in [On the horizon](http://blogs.eng5.com/~khazard/?p=7), Kevin Hazard wrote:
Server performance: The Site5 server structure is being improved, and our engineering team is working with our systems administrators to smooth out the process of migrating accounts between servers seamlessly, so you will never complain about high loads again.
Eight months after that date, I wrote my email to Site5 to complain, among other things, about high loads…
Twenty days later, on August 22, 2005, Matt Lightner wrote in [A Brief Site5 Update](http://the.fivefoldpath.com/?p=26):
3: Performance Upgrades
We will be performing some account transfers this week (actually they started last week) to ensure that we deliver a consistently reliable service at all hours of the day, across all servers. In addition to these moves, we will be establishing a public information center where customers can quickly and easily view and compare the load averages (with history) on all Site5 servers.
A few days after that, Site5 CTO Adam Greenfield wrote some more about their [load issues](http://www.adamgreenfield.com/articles/2005/08/25/load-issues-and-spectre-slippage).
On September 9, 2005, Matt Lightner wrote his [State of the Union](http://the.fivefoldpath.com/?p=29), stating:
Our systems administration and operations group, led by Site5 COO Todd Mitchell, is making some very large changes to the way things work behind the scenes. We brought eight new servers online two weeks ago, which were used for nothing other than helping to increase performance for existing customers. Todd is pulling out all the stops and really doing some incredible things to show Site5 customers what the word fast means in our neck of the woods!
So there you have it… a bit of history from last year to consider when looking at the events of this year. Hopefully, with some new people, new hardware, new software, new procedures, some mindfulness of history, and a bit of prodding from customers like me, Site5 will be able to make the improvements they need to become a truly awesome all-around host.