As I've mentioned a few times already, most of our site hardware is in NYC. On Friday, our tech team began preparations for moving our infrastructure out of the city in case Hurricane Sandy took it down. First things first, we began mirroring our database at an outside facility. We have a big freakin' database, so it was literally a days-long process.
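For a sense of why that takes days, here's a back-of-the-envelope sketch. The database size and link speed below are purely hypothetical figures for illustration, not our actual numbers:

```python
# Back-of-the-envelope: how long does it take to copy a large database
# offsite? All figures below are hypothetical, for illustration only.

db_size_tb = 2    # assumed database size: 2 TB
link_mbps = 50    # assumed sustained transfer rate: 50 Mbit/s

db_size_bits = db_size_tb * 1e12 * 8  # terabytes -> bits
transfer_seconds = db_size_bits / (link_mbps * 1e6)
transfer_days = transfer_seconds / 86400

print(f"~{transfer_days:.1f} days to copy {db_size_tb} TB at {link_mbps} Mbit/s")
# -> ~3.7 days
```

And that's just the initial copy, before the mirror starts keeping up with new writes.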
As NY-based sites started going down left and right last night (HuffPo, Gawker, Mediaite, others), our data center held up. Its power went out early in the evening, but its generators kept the place running. Even then, given the flooding, our tech team assumed we'd have to move out -- the generators had fuel for a couple of days, and while the data center had plans for fuel resupply, we know how those things go awry in a disaster zone.
Our tech team had broken for dinner when we got an ominous email from the data center -- their generators were several floors up and safe from flooding, but their diesel storage and fuel pumps were in the basement, and that had flooded out. Not only were the generators cut off from refueling, but the building started filling up with deadly fumes. They had to evacuate the building, but they expected the generators to have 5-7 hours of fuel left before giving out. I wrote about it here.
We assumed the worst, five hours, and got to work on moving the site to our alternate location. In reality, we had 10 minutes. The site went down suddenly. Best we can tell, our servers never actually lost power. Instead, the infrastructure that tells the world that "DailyKos.com" points to our servers in that building just disappeared. It wasn't a server problem, it was a network one -- kind of like your home Wi-Fi going down while your computer itself works fine.
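In other words, the machines were fine, but turning the name into a reachable address broke. A minimal illustration of that lookup step, using Python's standard library (the hostname here is just an example):

```python
import socket

# Resolving a name is a separate step from the server itself being up.
# If this lookup (or the routing behind it) fails, browsers can't reach
# the site even though the servers are humming along fine.
def resolve(hostname):
    try:
        return socket.gethostbyname(hostname)
    except socket.gaierror:
        return None  # name doesn't resolve -- the site appears "down"

print(resolve("localhost"))  # -> 127.0.0.1 on most systems
```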
Had that network gone down 15 minutes later, it wouldn't have been a problem -- we would have already switched to our alternate site. But when the network went down, it took with it our ability to quickly redirect the site. That locked us out of our own site for several hours while the tech team routed around the problem.
After several starts and stops, the original data center came back online at 4 a.m. PT. So right now, we're on our usual hardware. The data center still has no grid power, but they claim to have enough fuel for three days before needing to resupply.
Best case, nothing changes, power comes back, we continue on our merry way to next Tuesday. Worst case, the NY data center has additional problems, or runs out of gas, or the network mysteriously goes down, or who knows what, and we have to switch to our alternate site. In such a situation, the site would be down for about 30 minutes, and the alternate infrastructure is nowhere near as powerful as our current setup. But at least we'd have a working site.
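The switch-over logic itself is conceptually simple. Here's a rough sketch of the kind of check-and-fall-back decision involved -- the hostnames and timings are made up, and the real cutover also involves redirecting traffic in ways that take time to propagate, which is where that ~30 minutes of downtime comes from:

```python
import socket

PRIMARY = ("primary.example.com", 443)      # hypothetical NY data center
ALTERNATE = ("alternate.example.com", 443)  # hypothetical alternate site

def is_reachable(host, port, timeout=5.0):
    """Crude health check: can we open a TCP connection at all?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def pick_site():
    # Prefer the primary; fall back to the weaker alternate
    # only when the primary is unreachable.
    return PRIMARY if is_reachable(*PRIMARY) else ALTERNATE
```

In practice you'd want the check to require several consecutive failures before switching, so a brief blip doesn't dump everyone onto the less powerful alternate hardware.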
So as we prepare for the worst, let's hope for the best. Not just for us, but for everyone else affected by this disaster, either directly or indirectly.