You may have heard that Bromium is one of the top 5 cloud companies to watch in cloud security and was named startup of the year, but we’ve generally been trying to keep a low profile – mostly because what we are doing is difficult technically, and we wanted to make substantial progress before talking about it.
Bromium is developing a second-generation virtualization technology that offers profound benefits in the trustworthiness, security and manageability of computer systems. [“First-gen virtualization” includes the well-established use cases with which you are familiar. “Second-gen” includes these and substantially extends the value of virtualization.]
The Bromium system architecture is very different from anything available today. So much so that simply describing the technology would not be particularly useful – it needs to be set in the context of real world use cases and value propositions. So this and my next few blogs will set the stage for the forthcoming Bromium full monty.
It is important to clarify that Bromium is not attempting to add security to today’s clouds or virtual infrastructure deployments. We are developing a software system that has the potential to make any computer system (client or cloud) secure by design. Our goal is to transform the trustworthiness and security of computer systems and thereby enable enterprises to embrace the key trends affecting IT: consumerization, work-shifting, device diversity, and cloud computing.
We use hardware features for virtualization and security to deliver a huge leap forward in systems architecture, by tackling a “grand problem” – Trustworthy Computing. The Committee on Information Systems Trustworthiness’ publication, Trust in Cyberspace, defines such a system as one which
“…does what people expect it to do – and not something else – despite environmental disruption, human user, and operator errors, and attacks by hostile parties. Design and implementation errors must be avoided, eliminated, or somehow tolerated. It is not sufficient to address only some of these dimensions, nor is it sufficient simply to assemble components that are themselves trustworthy. Trustworthiness is holistic and multi-dimensional.”
This is a challenging goal, and one which it is probably impossible to achieve in practice, but we are confident that we can deliver an improvement of many orders of magnitude by comparison with current systems. And unlike most vendors, it is our goal to declare our limitations up front, rather than making claims that require customers to take a leap of faith.
Our first product will be enterprise client focused, though the technology has broad applicability. Why clients? We are witnessing explosive growth in public cloud services and applications. Combined with the incredible adoption of mobile device form factors this has led to a profusion of new applications and challenges for IT: Whose device is it? Where is it? Who chose the application? Is the user authorized to access data or applications from it? What network is it on? Can enterprise data be secure in a world of empowered users? (Lest you’re inclined to scoff, have you ever used your PC in a hotel room?)
It is depressing to see IT leaders so focused on private versus public clouds – for so-called “security reasons” – yet they appear blissfully unaware that every single enterprise access point – PC, mobile or virtual desktop/app – offers the bad guys a direct route into the heart of the enterprise.
While trustworthy computing is a laudable goal on its own, our architecture is the result of a need to solve IT challenges at a time of profound change. IT is charged with compliance and security at a time when users – as consumers – are dictating the future. Personal usage, device and application choices, and the growth of employee mobility raise concerns about identity management, the security of data and access, the cost of support, issues of compliance, application compatibility and much more. From the user’s perspective “What matters is me!” but IT has no choice but to respond: “Users don’t get to choose!”.
It is impossible to empower the user without dramatically increasing risk to the enterprise. It’s equivalent to letting the bad guys in.
We need a radically new approach to securing access to enterprise applications and data, starting with PCs and mobile devices, but including hosted and virtualized access (RDS, VDI). We need to transform IT practice for endpoint security just as we needed a new approach for data center management before server virtualization catalyzed IT’s metamorphosis from dull cost center to an agile, service-centric, strategic business capacity. In the ensuing change, silo-ed work practices and legacy tools were swept aside by an integrated, powerful, automated virtual infrastructure management framework.
The desktop revolution calls for a profound change in the trustworthiness of our infrastructure. We need systems that are inherently trustworthy – by design. If such a thing already existed, then the mess of VDI, Patch Management, Data Loss Protection, End Point Security and Identity and Access Management practices would not exist. The infrastructure would shrug off attacks, protect enterprise assets at all times, and guarantee the privacy and confidentiality of the user. And we’d save about $10BN on useless software per year. That would be real progress.
So, I’d like to invite you to join me on a quest for the desktop Holy Grail. We aim to deliver a thing of beauty that is well matched to our human nature, that is affordable, reliable and secure by design, that empowers users and democratizes IT while preserving control and compliance.
Our quest will result in an architecture that can make trustworthy computing a reality, today.
I know what you’re thinking. Not a peep from Bromium in months and now here’s a blog about DaaS. What the heck are these guys doing over there? Well, as a person who’s lived and breathed the stuff nearly since its inception, I feel that there’s still a lot of misinformation out there about VDI – possibly resulting in detrimental strategic decisions, and DaaS is quickly rising to the top of that information entropy word cloud. So let’s talk about the scope of this blog: I’m not going to write so much about VDwhy here; Simon’s already done a pretty good job of it on our blog. The focus of this blog is VDIaaS – that is, the concept of a service provider offering remote Windows 7 desktops for a monthly fee.
We’ve all heard the pitch about how centralized desktops are easier to secure and administer than conventional ones. And in some cases they are. Those cases usually involve someone who has a fairly nominal workflow – they need Windows, Office, and maybe a CRM app, and a productivity app or two. They likely also don’t really need Windows 7. It’s my position that someone who truly is in a position to use a service-provider leased hosted desktop would more likely be satisfied with Terminal Services desktops as a service than with VDI desktops as a service. But I’m not going to write about that here – my former peer Calvin Hsu did a bang-up job talking about it on his blog.
You see, Terminal Services-based Windows Server desktops have an available Service Provider License Agreement. That is, if you started a company tomorrow and wanted to provide DaaS for $5 a month by delivering a Windows Server desktop skinned to look like a Windows 7 desktop, you could legally do it. But there is no SPLA for Windows 7, so it is illegal for service providers to deliver Windows 7 desktops as a service. This doesn’t mean that your company can’t pay a service provider to host your VDI desktops, it just means that you need to buy the licenses for those desktops yourself. The service provider can’t buy the licenses and rent them back to you.
Joe Matz, Corporate Vice President, Worldwide Licensing and Pricing at Microsoft recently posted a blog which I equate to a giant dancing human arrow pointing at an elephant in a small brightly lit room: Microsoft doesn’t want service providers delivering Windows 7 desktops as a service.
Up until that blog there was this sort of make-believe grey area in the VDIaaS arena. A hope that Microsoft would “come to its senses” and allow service providers to deliver VDI as a service. But no more.
Guise Bule, CEO of tuCloud was so upset about this that he wrote a manifesto about how Microsoft’s licensing actions are akin to a threat to national security (I’ll get to the security bit at the end). In this opus he also attacked TS desktops as follows:
“Stop calling them desktops, its false advertising at best and flat out lying to yourselves and to your customers at worst. If you really believe in your model then call your ‘desktops’ what they are, shared slices of server designed to trick the user into thinking they are using a desktop, yet you insist to your customers that they are Windows 7 desktops and try to disguise them as such.”
Desktop Virtualization pundit Brian Madden got so upset that he quit the MVP program, sending Microsoft an “I quit you” letter sealed with salty tears.
What could Microsoft possibly be thinking??
I believe Microsoft’s perspective on VDI is exactly this: Hosted VDI desktops are no match for the rich local Windows 7 experience. If you want a throwaway desktop experience, go use Windows Server, but the Windows 7 experience is special. You probably know they have an index for it. They even capitalize the “E” in experience. When users see the Windows Experience Index on their VDI desktops they are seeing a lie, because there is no guarantee that the user is actually experiencing the promised Experience. Just like Guise believes TS desktops aren’t “real”, so does Microsoft believe remotely hosted VDI desktops aren’t “real” without a mechanism to appropriately set the end-user’s expectations for the “Experience” they are about to experience. I think the only reason they support VDI in the first place is because someone convinced them that end-users would always be connecting to VDI over LAN.
But what about the SMEs/SMBs who don’t want to have to worry about administration? Is it Microsoft’s intention to completely eliminate service providers from the Windows 7 delivery market? Nope. Hello Intune! If you are Microsoft and you dearly care about end-user experience and the only way to ensure a predictable, positive end-user experience is to put Windows on the end-point, then you find a way for Windows 7 to be on the end-point but abstract all the management components such that a service provider could provide all the administrative capabilities while ensuring a positive end-user experience. If you really need a Windows 7 desktop, and you need it managed by a service provider, Intune is likely the way to go for your use case. Pricing is $11 per PC per month (much cheaper than $100 for VDA). Add Office 365 to that for as low as $4 a month. If you need more apps there’s always App-V.
Let’s dig deeper. As you may know, while at Citrix I was a big champion of using end-user experience at scale as the primary metric for VDI POC’s. Desktops as workflows have very unique requirements in order to be provisioned as a cloud-based service, the net of which I feel makes most use cases untenable.
To begin with, VDI desktops have a very demanding IOPS (input/output operations per second) requirement, which is very expensive to maintain in both public and private clouds. In the private cloud each random IO is a spindle head movement. Take 1,000 virtual desktops: with an average of 20 IOPS per desktop, the total random IO required of a SAN is 20,000 IOPS. This translates to roughly 300 spindles without accounting for RAID. With RAID 5 or 6, the number of disks required is 600-800 just to support the steady-state random IO coming from these 1,000 virtual desktops. While the hardware cost may be abstracted in the public cloud, the service cost could easily outweigh it: consider a going rate of $6 per IOPS per month; at 20 IOPS per desktop, the cost of 1,000 desktops on a public cloud would be $120,000 per month! It stands to reason that even if we discount for the licensing limitations, the cost of ensuring “as good as local” end-user experience for DaaS service providers would be immense and passed on to the customer. My experience has been that service providers tend to skirt these costs at the expense of their customers’ end-user experience, and the economics dictate that any cost savings in delivering VDI usually net out as poor end-user experience.
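If you want to rerun that back-of-the-envelope math with your own numbers, here is a minimal Python sketch. The desktop count, IOPS per desktop and price per IOPS come from the paragraph above; the figure of 67 random IOPS per spindle is my assumption, picked only because it reproduces the "roughly 300 spindles" estimate, and the function name is purely illustrative. RAID overhead is deliberately left out, since the 600-800 disk range above is an estimate rather than a formula.

```python
import math


def vdi_storage_estimate(desktops, iops_per_desktop,
                         iops_per_spindle, dollars_per_iops_month):
    """Rough sizing for steady-state random IO from a pool of VDI desktops.

    Returns (total IOPS, spindles needed before RAID overhead,
    monthly public-cloud cost in dollars).
    """
    total_iops = desktops * iops_per_desktop
    # Each spindle can only serve so many random IOs per second,
    # so round up to a whole number of disks.
    spindles = math.ceil(total_iops / iops_per_spindle)
    monthly_cost = total_iops * dollars_per_iops_month
    return total_iops, spindles, monthly_cost


# 1,000 desktops, 20 IOPS each, $6 per IOPS per month (from the text);
# 67 IOPS per spindle is an assumed value, not a measured one.
total_iops, spindles, monthly_cost = vdi_storage_estimate(1000, 20, 67, 6)
print(f"{total_iops} IOPS, ~{spindles} spindles before RAID, "
      f"${monthly_cost:,} per month")
```

Swap in your own per-desktop IOPS and your provider's actual rate; the point is only that the cost scales linearly with both the desktop count and the IO each desktop generates.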
The notion of multi-tenant VDI desktop administration is a pipe dream. Enabling true multi-tenancy is close to impossible (read: unsupported by Microsoft, ridiculously expensive and complicated). The ability for a cloud tenant to have single-pane-of-glass visibility and control over the instances, data, and networks in their cloud-hosted solution certainly sounds appealing. In terms of a DaaS solution this would mean the desktops, the master images, patching, user data, networks, access policies, etc. would be available for multiple isolated virtual desktop silos. In addition, the multi-tenant management solution would need to have the ability to securely provide this level of access to multiple tenants. None of this functionality exists in any of the desktop virtualization offerings available today for VDI. Manifesting multi-tenancy in a manner that wouldn’t negatively impact end-user experience seems unlikely.
Lastly: The security benefits of virtual desktops are overhyped. Persistent or not, every “war games” scenario I’ve seen with VDI ends with one of the virtual desktops getting compromised and the attacker gaining access to the datacenter subnet. Sure, one can try to add additional network layers between the virtual desktops and the infrastructure behind them, but the net of such exercises only serves to increase rather than decrease the attack surface. One needs to do a serious calculation of the benefit of zero data at the end point vs. the risk of putting desktops on your server subnet. Revisiting the “Experience” bit – VDI is basically just a desktop populated by whitelisted apps. Users will be productive at any cost, and so if they need to go outside of the “safe” VDI desktop to do their job, they’ll do it. The poor end-user experience endemic to VDIaaS is in itself a security threat because it drives users to look outside of VDI in search of productivity.
It’s Halloween, and what could be more appropriate than dressing in some ghoulish garb, and trick-or-treating the neighborhood? This year I thought I’d go as a VDI virtual desktop. I encased myself in a getup looking like an old TV, and then planned to magically unfurl a picture of a Windows 7 desktop on the front, to spook the neighbors (many of whom work for Microsoft), illustrating a key value prop of VDI. Taking a leap of faith, I thought that out on the streets, where I’d have no pesky firewall seeking to drop all UDP, I might be able to make use of PCoIP. But I was disappointed – my desktop kept falling off, and when it did work, one neighbor asked if I was trying to look like Windows XP. So I decided instead to highlight the benefits of an SSL VPN for desktop delivery using a red string to attach me to my house. But I quickly found that each time I turned a corner, I had to reconnect. And in a final coup de grace, my VPN connection was seized by a neighbor’s dog, who wound it and me around a tree.
I retired to the deck to hand out candy to the ghouls and ghosts of the neighborhood, still determined to work out where VDI makes sense. I recalled some examples of extremely beneficial VDI deployments that I have encountered in my visits to customers world-wide:
- Large software development house in China: The employers don’t trust the employees with locally cached source code on their machines. Moreover, developer desktops are a great example of a user category that demands a full desktop and custom apps. Finally, as developers move between project teams they can simply move to another thin terminal rather than have to lug their PC along with them.
- Japanese manufacturer with offshore workers: Moving software development and traditional back-end processes to an offshore location in Asia, this vendor cut costs enormously and managed to grow faster because it had been unable to find enough highly skilled workers in Japan. VDI based desktops for specific high-skill tasks made most sense, with the desktops being delivered over a high capacity dedicated link to an office offshore. No company IP is ever local to the employees in the remote office.
- Banking Support: Credit card company used a hot-desking workforce running 3×8 hour shifts in India. Office space rented from a third party, with no ability to install traditional bump-in-the-wire network devices, or enforce corporate policies. All they could do was terminate a dedicated link at a router shared by all tenants. At the end of each shift a new user takes the seat in front of the support desktop. No new log-in, just a new user at the keyboard.
- F50 Bank: The traders are allowed to trade on the 4th floor, where traders work, but not on the 5th, where the bankers work. If a trader logs on from the 5th floor, he gets a generic “surf the intranet” desktop, but none of his trading apps. Again, a great example of the need for a customized desktop/app experience for the user, and excruciating control. Oh, and all web browsing has to hit an internal proxy first, where policies related to trustworthy external sites are enforced. Gmail is not on the list of valid external sites – so the only way to access personal mail while at work is on your personal device, and a carrier network. All desktop and app access is logged for compliance. There are no rich clients, and in emergencies, users are permitted to access their VDI desktops from home (though they dislike this, in case a key logger has been installed on a home PC).
- F50 Bank (another one): Different banking groups are on different, isolated networks. The network management console for administering the internal network is hosted on a VDI based desktop that has access to all networks, and that can only be accessed from a specific internal thin client device associated with a named administrator. All accesses are logged for compliance.
The common themes are these:
- The user must be connected “by default”, and preferably on a decent network
- She is typically a power-user, with custom apps and a need for a full desktop
- The enterprise needs extraordinary control over the user’s activity to limit/empower them and to meet regulatory requirements.
There was a lively attempt at a debate on twitter last week between myself, @VirtualTal, @ShawnBass, @brianmadden, @cswolf, @bsonposh, @harrylabana and others about the value of VDI. Because this needs more precision than 140 chars, let’s be crystal clear: I mean Virtual Desktop Infrastructure as invented by VMware but offered by many vendors: Windows client OS-based desktops, hosted centrally on a hypervisor, with a remoted desktop user experience (RDS, HDX, ICA, PCoIP…). Optionally (eg: View) there is the opportunity to “check out” a virtual desktop VM to run it locally on a client hypervisor (VMware’s type-2 ACE, Workstation or Player for Windows, or Mac-based Fusion) then “check it in” at some later point. [This is an utterly daft idea - and not discussed in this post - if I have a decent laptop, why would I ever check-in my desktop and go back to VDI?].
If you’re after the short version, here’s the summary: I have not found a single desktop virtualization expert (that does not work for one of the DV vendors) who will put their err… cred, on the table to recommend VDI over other desktop virtualization technologies (other than for a narrow set of use cases). In fact, the opposite is true: The leading voices in desktop virtualization think customers are being misled, and that it’s high time for the truth to out. A leading Wall Street Bank CIO told me “I charge back VDI desktops at $150/month/user. It’s a nightmare. We should spell it vDIE.” VDI is a lot more expensive (than anything else), users don’t love it, it causes gray hairs for desktop admins, and it isn’t more secure. Brian Madden recently wrote a “you use it, so pay for it, sucker” piece, which is excellent, though he is wrong on the presumed security benefits.
When I talk to CIOs, they all agree on two key drivers for desktop virtualization:
- The need for better desktop security, and
- Support for mobile devices.
Note that they don’t all say they need a better way to manage the desktop or even distribute apps. Existing tools do pretty well, and after all why shouldn’t they? It’s not a new problem.
There are three key arguments against VDI:
- It’s expensive, complex, and vastly complicates the role of desktop admins
- Technology exists that delivers the centralization benefits, at a fraction of the cost, in a way that is more useful to end-users: Microsoft RDS (Terminal Services) either as an app or a desktop abstraction.
- VDI isn’t more secure (than… anything else). (Nor is RDS, though it is better than distributed desktops).
Before I dig in, let’s agree that there’s no way to deliver Windows apps to non-Windows clients other than centralized execution and a remoted experience. And let’s agree that Windows Remote Desktop Services (RDS, Citrix XenApp, Quest vWorkSpace…) delivers apps just fine to a tablet, using the app metaphor that users expect. Moreover a remoted Win 7 desktop on a tablet is … not fun (a finger is not a good mouse), so ignore it and give users the apps they want. The data say this: The overwhelming majority of virtual desktops today are delivered using TS/RDS. Very few enterprises have successfully rolled out VDI at a scale beyond a few thousand users, and those that have are beginning to wonder why.
The security arguments advanced in favor of VDI warrant a closer look:
- Centralized desktop execution & data: Less enterprise data roaming unprotected on laptops is a good thing. And if data is client-side cached, it is encrypted at rest. But client-side encryption has been around forever, and VDI vendors appear to have belatedly re-invented it. If you don’t have client security & backup procedures in place already, you deserve the pain that VDI will bring.
- Secure centralized access control & auth: Rather than rely on a password on the device, the enterprise can use an access-time credential check. The access/auth gateway can also provide single sign on to legacy, web and SaaS apps (see Citrix Cloud Gateway, VMware project Mirage). Granular control of access and identity are powerful tools for enhancing security. But they aren’t a feature of VDI. Do it, and use RDS instead.
- Single Golden Image desktops: Every employee runs “the same” approved golden OS image, and the desktop layers (OS, apps, user) are composed at login-time to deliver a “new PC every day”, with the right apps and user customizations & data. Each desktop starts clean, for each logon. However, although this layer-cake story is seductive, it is basically untrue. And even if it were true, you would still need to manage scores of images for the vast majority of your desktops, which are not VDI. But let’s be generous: The vendors are investing heavily, so let’s say they pull it off. The result: a bunch of new desktop layers to manage, store and dynamically compose in the hope that it all works. More management – more oversight – more people, and technology that rips apart an OS in ways it was not designed to be. Security – yes, job security for desktop admins. Also late nights trying to make it all work. And no more secure… (more below)
- Audit control and compliance: Log everything. Great, good stuff, no arguments.
The goal of the VDI vendors is to persuade customers that they will have more control and therefore more security, courtesy of more layers of virtualization. There is a significant downside though: it requires new tools, infrastructure, and IT management skill-sets to separately manage the lifecycle of each desktop component and each layer of the infrastructure that runs them (servers, hypervisors, storage, networks, and IDAM). Finally, it completely stymies the helpdesk.
Nonetheless, VDI is not more secure: Even if I log on to a pristine golden Windows desktop each day, the enterprise is still vulnerable to common vectors of attack: users click on bad links and open bad attachments – in an execution context where enterprise state is un-encrypted. And a smart attacker will target VDI desktops specifically to get inside the enterprise data center.
VDI can deliver Win7 to the CEO’s iPad. So can Windows RDS. And I bet that an app experience is preferred. Let’s be clear on cost too: From the hundreds of customers I’ve spoken to, I’d guess that RDS infrastructure costs about one fifth to one tenth the cost of VDI to purchase and run. And it works great.
New desktops and apps (x86 & mobile) need a GPU. Unless you fancy racking server side GPUs so your users can use IE9, recognize that Microsoft’s path is clear, and responds to user demands: rich graphical apps & “desktops” are here to stay. Deliver legacy 2D apps using TS/RDS, and let a rich client (x86 & mobile) with a GPU deliver what the user wants.
My recommendation: there’s enough discussion about the future of the desktop to argue for no change for now. Most enterprises use XenApp/TS. Continue. Grow the footprint. Consult vendors, industry analysts and solution vendors. I’d recommend you go down a TS/RDS path to deliver apps to iPads, thin clients and desktops. Use AppSense or RES to guarantee consistency (and more). Use TS/RDS with Win7 UI for users that need a remoted desktop. Meanwhile, figure out a plan for the 70-80% of users that will never use VDI, and start prototyping next-gen touch-enabled apps for your mobile clients. Let the VDI mess sort itself out, and look to Win8 (which my friends at Microsoft call a “VDI killer”) to discern the Microsoft strategy (yup, years away, but that’s fine).
And after all that, what about those pressing security challenges? Right, back to work…
In my last post I argued that it’s time to get over our love affair with hypervisors. The enterprise journey toward private cloud is one of progressive automation of today’s human centric IT Operations – IT_Ops. This market is skill and tool set centric: Admins managing virtualized infrastructure. But it is now table-stakes for public IaaS clouds to offer rich, hypervisor independent, app-centric services directly to developers. And in the SaaS world, the hypervisor is at best a useful component of the SaaS provider’s stack. Even in the enterprise, as the consumption of infrastructure becomes more app-centric, the hypervisor, VM format and other vendor battlegrounds of the last few years will recede. Humans will manage VMs only in IT_Ops environments, which will be needed (forever) to run legacy apps.
So, what’s beyond the hypervisor? Sure, there will be plenty of hype around next gen mobile app platforms, and I for one would not want to be in Oracle’s shoes when WebLogic faces off against CloudFoundry. But I am increasingly convinced that when we look back in 5 years we will be surprised to see that the primary value of virtualization technology generally will be security, and all the “ilities” of first gen virtualization will seem rather passé. There will be clouds, and secure clouds (courtesy of virtualization), and the former will go out of business. So beyond the hype will emerge another kind of hypervisor.
To understand why, let’s turn to a leader drawn from the small group of virtualization pioneers that hail from the southern tip of Africa: Paul Maritz. In his recent VMworld keynote Paul identified four key components of the infrastructure that the hypervisor has to control in order to deliver the benefits of virtualization: CPU/memory, storage, networking and security. The first three are obvious to anyone familiar with managing virtual infrastructure:
- Storage vendors have had to adapt to the challenges of virtualized environments, and storage management is now simply one more task in the management of virtual infrastructure.
- The last hop switch in the data center infrastructure is on the server – in the form of the virtual switch in the hypervisor or SRIOV NICs. As a result a powerful new, distributed, software network control plane can be delivered as a service of the virtual infrastructure.
But what of security? To my mind the most succinct dissection of the security challenges raised by virtualization remains Chris Hoff’s Four Horsemen of the Virtualization and Cloud Security Apocalypse.
His conclusion: It’s a mess, and we need to fix it or too many kittens will die. It’s “all change” for the security industry:
- To start with, the traditional “bump in the wire” network appliance vendors are scrambling to interface their products to hypervisor-based virtual switches, and though today’s vSwitches and hypervisor I/O stack are software based, SRIOV hardware virtual functions can easily be programmed to implement ACLs in firmware, and there are several NICs with built in hardware switches. Flow setup and exception handling will be a control plane service of the virtual infrastructure and the data path will run at line rate. Most traditional “bump in the wire” security capabilities will need to interface to the vSwitch API, and hardware devices will be replaced by virtual appliance implementations of infrastructure services. Expect these to become increasingly hidden – surfacing at a management console as they do in the AWS Console. And so many vendors will disappear.
- The End Point Security (EPS) vendors are scrambling too. Having an instance of AV per VM kills I/O, increases storage costs and decreases VM density. This is nowhere more acute than in Desktop Virtualization, where implementation of a re-factored EPS such as McAfee’s MOVE, Trend’s Virtual Desktop Security or Symantec’s EPS 12.1 can make the difference between success and failure of a VDI project. In these virt-ready EPS implementations, one instance of the EPS capability exists per server, and lightweight instrumentation per-guest permits the security function to be consolidated.
Back to Hoff’s Security Apocalypse – he correctly states that while security functions need to migrate onto/interface to virtualized infrastructure, this is far from enough. In the future lies another kind of hypervisor-related security feature set that is solely focused on security as a property of the runtime platform. I am confident that it will radically re-shape our view of security architecture for the cloud, and it will shift the battle lines in the constantly evolving “Cyber threat” landscape (No, not the Daleks).
To get a hint of what is possible, we need to go back to VMware’s acquisition of Determina and their demo and claim at VMworld Cannes in 2007 that virtualized environments would be more secure than physical. They showed how the hypervisor actively identified a malware infected code-page in a compromised VM. The CTOs of key EPS vendors were on stage, proclaiming that this would be the next strategic bastion in the fight against the bad guys (as VMware gently pulled the rug out from under their businesses). And though the technology was immature, the goal and VMware’s vision were on target.
Fast forward to 2011 and the new hype – the role of the hypervisor in deliberately and specifically enhancing security – is finally becoming real. Two recent milestones are listed below.
- In May, Citrix announced XenClient XT – a small, flash-embeddable, hardened type 1 Xen based client hypervisor that leverages hardware root of trust to build a highly secure client platform that can host Multi-Level Secure (MLS) workloads. This project is, in my view (and yes, I’m biased), one of the great achievements of open source. Never, before XenClient, has an open source code base had to pry so deeply into client hardware – for which drivers are simply unavailable. And, courtesy of open source, collaboration with the most security conscious customers was possible – all in the open. A mountain, summited by a small, agile, determined team of the world’s best.
- And at IDF in September, McAfee announced DeepSAFE – Intel and McAfee have seized upon VMware’s failure to deliver on the Determina promise, and (as I read the announcements) appear to have developed a hypervisor that offers a McAfee a new locus in which to embed security capabilities – outside the guest. DeepSAFE appears to be a type-1, late-load hypervisor that runs alongside the (single) Windows guest on a PC. It appears to hand all devices through to the guest, leaving the hypervisor in charge of CPU & memory only, but with the ability to detect and remediate against (at least) kernel memory corruption by malware.
[Added] Do these two approaches compete? No. XenClient is the first step along a road towards making systems secure by virtue of hardware based root of trust, and a better overall system design – really about separation of domains of trust (using the hypervisor). DeepSAFE is a new way to “find the bad guys”. One could envisage a combination of the two approaches in future systems.
In conclusion, whoever “owns” the hypervisor has absolute control over the platform, whether server or client. That vendor has an incredible opportunity to change the game in endpoint security, for client and cloud. VMware dropped the ball on the client; Citrix and McAfee responded. What will Symantec, Trend, Sophos and others do? What is the future of all of these vendors in the context of mobile and tablets? On the server side, the advantage would appear to lie with the incumbents. But open source code bases exist that could transform the competitive landscape as customers and vendors alike begin to realize that there’s no point in building or using a cloud service unless you can secure it.
There’s an infrastructure storm brewing that, when it finally unleashes its fury, will catch off guard every enterprise CIO steadily making progress towards their Private Cloud. The argument: It’s time to get over our love of Hype(rvisors) and question the value of virtualization in the cloud. What kind of cloud? Oh, all kinds. And in case you want a nutshell summary, here is my thesis: The hypervisor as you know it is useful for hosting legacy IT workloads in private clouds. But beyond today’s Hypervisors, a profoundly important new building block for the cloud will emerge – an even more powerful kind of Hypervisor.
I: The Schism: IT_Ops vs Dev_Ops
I am always surprised by how few IT organizations developing a cloud strategy understand the difference between the typical Enterprise IT infrastructure and (say) Amazon Web Services. In the enterprise, every legacy technology ever built remains in service, coddled by an army of dedicated IT professionals whose challenges grow with each new technology acquisition, merger or strategic initiative. For these poor folk, x86 server virtualization and its evolution into virtual infrastructure has been a godsend. Legacy workloads, bundled up as VMs, can be efficiently and dynamically spun up on any server, on demand. IT gets to be greener, more highly available, and more responsive to business needs. But most importantly, since most workloads are seriously long in the tooth, it allows IT to take advantage of Moore’s Law – replacing old gear with more efficient, faster, smaller devices without changing software. Legacy workloads can live forever, happily ensconced and managed in VM bubbles, and sophisticated management frameworks provide insight, control and automation of traditionally manual tasks.
But this so-called IT_Ops flavor of cloud lacks important attributes found in public clouds: You have to buy equipment up front, instead of paying as you go, and it is inelastic because your capacity is fixed. By contrast, the “Cloud in your pocket” (the clouds that run the apps on your smartphone or tablet) runs on big public IaaS clouds such as AWS. There are no IT folk involved here, and the focus is on providing a set of service interfaces to support app developers who will never encounter an IT person. These Dev_Ops clouds offer rich toolsets for developing, testing, provisioning and automatically scaling a web-services based app and its storage and networking infrastructure, atop a “pay as you go” business model. Examples include Heroku, Engine Yard (which runs Groupon), PiCloud, Node.js, VMware’s CloudFoundry or even Red Hat’s OpenShift. By focusing solely on making the developer’s life simpler, and through the use of powerful automation frameworks such as Chef, Puppet or tools such as RightScale, the abstraction (service interfaces) of the cloud can quickly move beyond the concept of a VM instance. VMs may well be used under the covers (they are by the frameworks listed above) as units of workload that can be elastically provisioned using the basic VM-centric primitives of the cloud, but to the user of the cloud they are hidden. Welcome to the world of next-gen apps, where VMs are best described in PaaSt tense.
The bottom line: The public cloud is PaaS centric. Though you can certainly spin up VMs if you need to, richer app-centric service interfaces let you forget all about them. As better instrumentation of the PaaS layer becomes available, your need to be involved with VMs will steadily decline (good blog here). Finally, if you are an enterprise user of VMs in an IaaS cloud, you probably use an OS instance provided by the cloud provider (another value-added service) – once again freeing you from any concern about the hypervisor.
II: Big SaaS Couldn’t Give a Hoot about VMs
The Big SaaS properties – consumer and enterprise focused – have historically been built according to highly app-specific needs. A few notable exceptions come to mind, such as Netflix, whose journey to using the AWS cloud to host their apps is superbly chronicled in Adrian Cockcroft’s blog and on the Netflix tech blog. But many, including Salesforce and Facebook, believe they don’t need a hypervisor in their infrastructures. The argument is simple (and a bit naive): they operate at a scale where the “server virt” consolidation arguments in favor of running multiple VM instances per server simply don’t make sense. Saving 50% of a server when you have 100,000 of them is kind-of-meaningless. But there are a couple of reasons why a hypervisor does make sense in large web shops: First, if the infrastructure hosts multiple apps, multi-tenancy of the hardware infrastructure makes sense. A good example is Yahoo, which has over 250 web properties sharing about 500,000 servers in 26 data centers worldwide, and because they grew so rapidly and applied the naive approach, used to run at a shocking 8% average utilization before opting to build a large virtualized private cloud. The other reason to use a hypervisor (MySpace springs to mind) is provisioning cleanliness: each server runs a single VM (which contains an instance of the app) and the operators can insulate their software from different kinds of hardware by using the hypervisor and its virtual hardware as a clean abstraction layer. Historically this caused overhead since the hypervisor introduces at the very least an I/O overhead, but with the emergence of SRIOV, or with simple PCI pass-through of devices to a VM or VMs, the performance overhead of the hypervisor is tiny.
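To put rough numbers on the multi-tenancy argument, here is a back-of-the-envelope sketch in Python using the Yahoo figures above (about 500,000 servers at 8% average utilization). The 40% target utilization and the function name are my own assumptions, chosen only for illustration; they are not figures from Yahoo.

```python
def servers_needed(current_servers: int, current_util_pct: int, target_util_pct: int) -> int:
    """Servers required to carry the same total load at a different average utilization.

    Utilizations are whole percentages so the arithmetic stays exact.
    """
    if not (0 < current_util_pct <= 100 and 0 < target_util_pct <= 100):
        raise ValueError("utilization must be between 1 and 100 percent")
    total_load = current_servers * current_util_pct  # load in server-percent units
    return -(-total_load // target_util_pct)         # ceiling division


fleet = 500_000        # from the text
measured_util = 8      # percent, from the text
assumed_target = 40    # percent, hypothetical target for illustration only

needed = servers_needed(fleet, measured_util, assumed_target)
print(f"{needed:,} servers at {assumed_target}% instead of {fleet:,} at {measured_util}%")
```

Under that assumed target, the same load fits on a fifth of the fleet. The real saving depends on how well workloads pack together, which this sketch ignores.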
As more and more next gen apps (both SaaS and PaaS-hosted) are developed, we will quickly move beyond the era of relevance of the hypervisor. A powerful abstraction, for sure. Used liberally – everywhere where the lowest levels of infrastructure require multi-tenancy, dynamic provisioning, optimal packing, manageability and high availability – but unimportant to the users of the cloud. A mere capability in the IaaS stack. Only in the enterprise, where IT is slowly automating its traditional practices and where the traditional single-server-OS based units of work as VMs remain, will the “big brand” hypervisors command a following. Why? Manual procedures for VM management that are vendor-specific. The Enterprise Private Cloud market is growing at about 30% per year, but IaaS and PaaS clouds, already well beyond the hypervisor-as-service-interface, are growing at a breakneck pace of about 70% per year, driven by the staggering growth of mobile apps and our insatiable consumer appetite for services.
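The gap between those two growth rates compounds quickly. A small Python sketch, assuming (purely for illustration) that both markets start at the same size and that the 30% and 70% annual rates hold constant; neither assumption comes from the figures above, and the function name is hypothetical.

```python
def relative_size(rate_a: float, rate_b: float, years: int) -> float:
    """How many times larger market B is than market A after `years`,
    if both start equal and grow at constant annual rates."""
    return ((1 + rate_b) / (1 + rate_a)) ** years


# 0.30 = enterprise private cloud, 0.70 = IaaS/PaaS clouds (rates from the text)
for years in (1, 3, 5):
    print(years, round(relative_size(0.30, 0.70, years), 2))
```

On these assumptions the faster-growing market would be a little under four times the size of the slower one after five years.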
III: But Wait! That’s all Wrong!
Yes, it is. There’s something critically important that I’ve left out, that completely changes the relevance of the hypervisor. It’s so important that I believe the hypervisor will become ubiquitous. You wouldn’t dare to have a server or a client without one. More on that in Part II.
American poet Don Marquis once said that “Every cloud has its silver lining but it is sometimes a little difficult to get it to the mint”. This has certainly been true for so-called cloud infrastructure vendors – those crafting the orchestration and management functions that are required to turn a large number of servers, storage and networking gear, and a hypervisor, into a scalable, elastic Infrastructure as a Service fabric. VMware dominates the enterprise server virtualization market (which is aspiring to cloud-dom), and service providers have been slow to ramp up alternative service offerings to vCloud. But today’s announcement by Citrix that it is acquiring cloud.com is a significant shot in the arm for the Citrix cloud effort, as well as for the OpenStack community at large.
Those who follow the so-called “clouderati” on Twitter will be quick to realize that Citrix has in one fell swoop acquired the leading independent cloud orchestration stack, cloud.com’s customers, including Zynga, Korea Telecom, Edmunds.com and many others (for which typically XenServer is the preferred hypervisor, but not exclusively so), the “cloud.com” domain name, a great leadership team, and (as the cherry on the top) Christian Reilly, until recently chief architect at Bechtel, and latterly of cloud.com, and now taking on the role of steering the Citrix cloud architecture and strategy.
The last year has seen an almost meteoric rise in the fortunes of OpenStack, as it has emerged as the only viable alternative to proprietary products such as VMware’s vCloud. The latter has grown up from an enterprise pedigree, and is in many ways ill suited to the task of delivering low-cost, massively scalable Infrastructure as a Service. By contrast, the cloud.com CloudStack product has shown itself in large production environments such as Zynga’s private / hybrid cloud, to scale superbly, and at very low cost – leveraging any (though typically free: XenServer/Hyper-V) hypervisor or even doing bare metal provisioning for large web 2.0 style clouds. Cloud.com is to be congratulated for forging a hard-earned reputation for delivering a highly scalable product to customers whose entire business depends on it. If CloudStack fails, Frontierville fails. And yet, everyone knows that the future for proprietary cloud stacks looks rather bleak, given the enormous industry focus on developing a community-owned, massively scalable open source cloud stack – OpenStack. Cloud.com was therefore quick to jump aboard the OpenStack community development model, and has led some of the key contributions to OpenStack, including support for Hyper-V. Citrix can use the cloud.com acquisition to accelerate its own Project Olympus, which will be OpenStack based, and in so doing it can offer existing cloud.com customers a roadmap that is far richer than could ever be created by a single vendor following its own development path. Future versions of CloudStack (or whatever it ends up being called) will be able to scale better and offer a far richer networking model, storage infrastructure and so on, courtesy of the incredible contributions being made by over 50 vendors to OpenStack.
At the same time, the acquisition is a major shot in the arm for the OpenStack community, who will be able to benefit from a substantially beefed-up development effort at Citrix, and from the oversight of Ewan Mellor (OpenStack Architecture Board Member, Xen god, and developer of the OpenStack ESX support) and Christian Reilly.
The Citrix cloud business will now report to Sameer Dholakia, who joined Citrix with the acquisition of VMLogix about a year ago. He’s off to a great start!
I’m back in the startup world again. Bromium has (more or less) 5,000 fewer employees than Citrix, and instead of a rich portfolio of products, a sales force, 8,000 channel partners and 250,000 customers, we have a dream, a working demo, an architecture diagram, and a hiring plan. Life is very simple at this stage of a company’s development. People keep asking me when we’re going to announce our first product, and the answer is, likewise, very simple: “After we’ve built it – and are you looking for a job?”
So why the startup thing again? As a technologist seeking to deliver the potential value from technology to the real world, there are several compelling reasons to do so from a startup. Here are a few that spring to mind:
- Focus: In a startup you only have one goal that demands absolute focus. No time to waste going down adjacent paths; no rationale for the existence of the business other than to deliver that first product and wow those initial customers.
- Reduced risk: We can reduce risk of failure by being extraordinarily selective in building our team. I want to hire people who are far better than me at everything they do (That ought to be pretty easy!), and I want them to succeed beyond their wildest dreams. By assembling a team that is highly motivated, extra smart, and by telling them everything about the business, every day, I believe we have a far greater chance of success than by building a hierarchical organization with restricted information flows. No idea is so important that it cannot be challenged, and every challenge opens the possibility of a better solution to a problem. At Bromium we all dive into challenging technical problems that appear to be roadblocks. And every employee has a veto right on a hiring decision, whether for a VP or an office assistant. It is empowering, and when people feel empowered, they become powerful, creative and productive.
- Trust: In an organization that is small, every individual has a vital role to play. The entire organization depends completely on each person. We are as vulnerable to failure through poor architecture or coding or testing as we are through poor customer service and shoddy sales. We are therefore interdependent, and since each of us is a mere human, we therefore each have to step up to support our peers, sharing the load in a very active way. For example, I dislike the entire idea of a time-off policy. In my view if we ever need to discuss an employee wanting to take leave, then that is probably the wrong employee for us. Of course I expect people to take time off – we all have families and real lives outside work that need support and support us. So we each make sure that we’ve delivered to the team and then simply take whatever time we need, returning to work refreshed and ready to contribute again.
- Quicker learning: All startups are experiments, and along the way the team needs to discover what failure means, and how to morph that into a stronger organization, better products and better support of the customer. In a small team where people trust each other, we are much more likely to share our failures, so that others can learn from them. Moreover, large organizations tend to get mired in their cultures, and innovation stalls. Every once in a while you need to throw out everything and start from scratch, otherwise you cannot escape narrow group-think.
- New tech, new ideas: We escape the confines of traditional enterprise technology think. We don’t use legacy apps, and our organization lives, breathes and runs in the cloud. We use new development methodologies and tools (git is a thing of beauty and a joy forever), we run at the pace of open source, and every line of code is open for scrutiny by all.
