Sidekick: another lesson about Microsoft security

Microsoft's Cloud has no silver lining for T-Mobile's Sidekick phone customers

It’s funny how we make assumptions about things. Like PCs crashing. But phones, well, they’re just always on, right?

Err, no. If the back-end’s built on a Microsoft infrastructure, think again. T-Mobile’s Cloud-based Sidekick phone customers had to do just that when the entire infrastructure crashed, losing all their phone data. Ouch!

Microsoft upgraded its Danger infrastructure last weekend, but the flaky platform fell over. And now all T-Mobile’s Sidekick customers have lost everything. Forever.

T-Mobile, like many mid-players, assumed Microsoft could be trusted with enterprise data. The reality is a lot different. So how could a routine upgrade go wrong?

What’s T-Mobile Sidekick?

Sidekick’s a neat-looking mobile phone with a screen that kicks out to the side, revealing a keyboard (sidekick – get it?). But the clever thing is that all the data normally stored on the phone is actually stored on servers run by a Microsoft subsidiary called Danger.

So, instead of keeping your personal phone information on, err, your phone, it’s on servers – Microsoft servers – in the Cloud. Or was…

Guesses, wild assumptions and conspiracy theories

It’s not clear how the crash happened – Microsoft’s saying nothing until it can come up with a plausible spin to put on it. But if we sieve through the reports, from the guesswork to the simply wrong and the frankly laughable, something does come out.

The fact is, a routine server upgrade in Microsoft’s data centre failed and the phone data of all T-Mobile’s Sidekick service customers vanished for ever.

Properly designed Cloud infrastructures simply don’t present a scenario in which this could happen. That’s not the same as saying such a loss can never happen, but it means that fail-safes should kick in to prevent it. But to understand this, we have to understand how such environments work.

It’s all about distribution – of resources, processing and, significantly, storage. It appears here that it was the storage aspect that failed. It wasn’t a security failure: no information was released. Nor was it a transactional issue that corrupted the information. No, everything was lost. The information was no longer stored.

Microsoft-based servers too weak to trust?

Once upon a time, I used to like Microsoft technologies. They placed processing power into the hands of the man on the street. Anyone could unpack a server, do a short course at a local night school or via correspondence and there you are, you’re a Microsoft Certified Geek. But they never told you just how fragile it all is.

If you were lucky, come that inevitable crash, nothing too significant was lost and you kept your job, a wiser geek. And if you were really wise, you moved over to Unix and Open Source.

Obscured by Clouds

Now when Microsoft moves into the Cloud space, things get a lot more complicated. Because big infrastructures don’t have their own storage, they aren’t like PCs any more. They have storage silos – separate dedicated arrays of storage disks.

These are great, you can expand them easily, something that’s vital for the Cloud where you never know how many users – or how much customer data – you will have.

But hand in hand with this great power comes great responsibility, as Spider-Man tells us. And unlike Spider-Man, Microsoft’s server technology simply has no right to be on the Web. Microsoft’s just not that responsible. Or great.

A big pot of customer information

OK so here’s my spin on things. Some time ago, I was consulting for a well-known high street bank. At the very top, doing the big number crunching, everything ran on Unix from Sun. But at the lower levels, where the less informed ran things, Microsoft had the monopoly. The bank had just installed all this “unbreakable” remote storage, called a SAN, a Storage Area Network.

Don’t worry what that means, it’s just a big area for storing data, attached to the network. The servers are told to forget about managing their own storage. They assume the storage knows what it’s doing and the storage assumes that the servers know what they’re doing. So there’s a kind of mutual trust thing going on.

Very soon, something terrible happened. That trust relationship failed. The storage fell out with the servers and the servers stopped talking to the storage. When everything appeared again, the servers saw what they thought was a new area of storage and wiped it clean, deleting everything that was actually on it, which was the real live data.
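For the geeks among you, here is roughly the kind of sanity check that would have stopped those servers in their tracks. It is only a sketch, and the names (`looks_blank`, `decide_initialise`) are my own invention for illustration, not anything from a real SAN product. The idea is simply that a server should never initialise storage that still has something on it unless a human has said so.

```python
def looks_blank(header: bytes) -> bool:
    """Return True only if the sampled header exists and is all zero bytes."""
    return len(header) > 0 and not any(header)


def decide_initialise(header: bytes, confirmed_by_operator: bool = False) -> str:
    """Decide whether a volume that has just (re)appeared may be wiped.

    `header` stands in for the first few sectors read from the volume.
    This is a hypothetical guard, not how any particular vendor does it.
    """
    if looks_blank(header):
        # Nothing recognisable on the disk, so initialising is harmless.
        return "initialise"
    if confirmed_by_operator:
        # Data is present, but a human has explicitly accepted the loss.
        return "initialise"
    # Data is present and nobody has signed it off: leave it alone.
    return "refuse"
```

A volume full of live data gives `"refuse"`, which is the answer the bank’s servers should have come up with on their own.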

That bank was down for a number of days while they scrabbled to recover what they could, which was some but not all of what was lost. Fortunately, that bank didn’t trust the Microsoft guys that much so the damage was limited to login information and user accounts. So everyone had to be given new accounts to log on with. Bad, but not terminal.

It looks like in T-Mobile’s case, the storage relationship may well have failed. But this time, it was real customer information (phone numbers, photos, music, video) that was wiped off the face of the Earth. Not just bad. Terminal.

Will the truth about this be buried, along with the data?

Maybe we’ll never really know what happened here. Microsoft won’t want people believing its Cloud strategy is flawed, so expect some excuse that somehow doesn’t involve them. It’s already trying to play this down by offering all T-Mobile’s Sidekick users a $100 payoff, but if you’d just lost the only photo of your new-born baby, would you be happy with that?

From our banks to Microsoft themselves: A failure to assess risk

OK, I’ve worked a bit with storage design and can claim credit for the first Network Attached Storage (an alternative technology to SAN) to go into a bank I was consulting for a few years ago. That doesn’t make me a leading authority on storage design by any means. But I know this much. For every eventuality there should be a proper risk assessment. Everything should be met with the “What if?” question.

When this upgrade was proposed, why didn’t someone ask “What if it goes wrong?” And why wasn’t there something in place to deal with the resulting failure?
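To show what asking “What if?” looks like in practice, here is a minimal sketch of an upgrade that refuses to start without a verified backup and falls back to that backup if things go wrong. The function names and the in-memory `bytes` standing in for customer data are assumptions of mine for illustration. I have no idea what procedure Microsoft or Danger actually followed.

```python
import hashlib
from typing import Callable, Optional


def checksum(data: bytes) -> str:
    """SHA-256 fingerprint used to compare live data with its backup."""
    return hashlib.sha256(data).hexdigest()


def backup_is_verified(live: bytes, backup: Optional[bytes]) -> bool:
    """A backup only counts if it exists and matches the live data."""
    return backup is not None and checksum(backup) == checksum(live)


def run_upgrade(live: bytes,
                backup: Optional[bytes],
                upgrade: Callable[[bytes], bytes]) -> bytes:
    """Apply `upgrade` to the data, rolling back to the backup on failure."""
    if not backup_is_verified(live, backup):
        # The "What if?" question answered up front: no backup, no upgrade.
        raise RuntimeError("no verified backup; upgrade refused")
    try:
        return upgrade(live)
    except Exception:
        # The upgrade fell over, so hand back the verified copy instead.
        return backup
```

Nothing clever there. The point is that the check comes before the upgrade, not after the data has gone.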

To me, the crime was not the fact that T-Mobile’s infrastructure collapsed in the first place. The unforgivable blunder in all this is that there was no one capable of picking it up again. And that’s why you can’t trust Microsoft in the Cloud.

2 Responses to 'Sidekick: another lesson about Microsoft security'

  1. contextfree said,

    on October 15th, 2009 at 9:43 am

    the back end in this (Sidekick) case wasn’t built on a Microsoft infrastructure, it was all Unix-based. so MSFT technology really has nothing to do with this, though MSFT management might. it appears to have been a hardware failure rather than software anyway.


  2. on October 15th, 2009 at 10:40 am

    Hi, Contextfree,

    The web front end of Danger is certainly running Apache on UNIX. You can see this using a Netcraft enquiry. But I’m not talking about the web front end here, I’m talking about the back-end and storage management.

    Most insiders agree that this has been recently migrated to a Microsoft back-end attached to a SAN infrastructure in preparation for Microsoft’s new Pink (or Zune) phone launch. This can’t be confirmed, I admit, as there is a severe embargo on any news surrounding this.

    But think about this. If you were Steve Ballmer, would you be launching a new marketing campaign on a non-Microsoft back-end?

    It could also be that Microsoft were trying to virtualise the entire service – they had an agreement with Red Hat to provide virtualised Unix / Linux instances, which they announced recently… But there is nothing to corroborate this currently.


LANZen IT Strategy and Design Consultants Phone me on 01260 290 592 Contact me - click HERE