Resource Management for Isolation Enhanced Cloud Services

Himanshu Raj
Ripal Nathuji
Abhishek Singh
Paul England

Microsoft Corporation
1 Microsoft Way
Redmond, WA

Services run on leased computation platforms, decoupling the service provider (SP) from the platform owner, or cloud infrastructure provider (CIP). Such a model has the potential for huge cost savings for service providers: they avoid the significant financial overheads associated with deploying, maintaining, and managing datacenter environments, and instead pay only for the usage of these resources. However, this benefit comes at a price - the separation of service and infrastructure providers implies that the service provider has less control over the service deployment, and must trust the CIP to uphold the guarantees provided in the service level agreement (SLA). An SLA works as a contract between the CIP and the service provider, and provides guarantees either in terms of low level resource provisioning, e.g., X CPUs with some computational capacity and Y Mbps network throughput, or in terms of higher-level goodput value(s) for the service, such as Z ops/second.

A service provider must also trust the infrastructure provider's ability to properly isolate services from each other. Isolating a service from other services includes both performance and security isolation. This implies that the infrastructure provider must employ mechanisms so that it is not possible for one service to interfere with the execution of another service. A typical method to achieve isolation is to enforce physical isolation, e.g., to ensure that different services execute on distinct physical machines and use isolated network infrastructures. Although these mechanisms achieve excellent isolation, strict physical isolation is costly, and in many cases the dedicated resources will be under-utilized. Hence, many cloud computing platforms, including Amazon EC2 [1] and Microsoft Azure [2], use virtualization to encapsulate services inside virtual machines (VMs). This approach allows the CIP to better utilize resources, while still providing adequate isolation.
The adoption of virtualization also enables other benefits, including ease of service deployment and the flexibility of VM migration for fault tolerance and improved consolidation. The isolation properties of a virtualized platform, however, are weaker than those of physical isolation. In particular, resources that may be implicitly shared among VMs, such as the last level cache (LLC) on multicore processors and memory bandwidth, present opportunities for security or performance interference. For example, it has been shown that an otherwise isolated process can compromise the confidentiality of another process [14].

ABSTRACT

The cloud infrastructure provider (CIP) in a cloud computing platform must provide security and isolation guarantees to the service providers (SPs) who build services for the platform. We identify last level cache (LLC) sharing as one of the impediments to the finer grain isolation required by a service, and advocate two resource management approaches to provide performance and security isolation in the shared cloud infrastructure: cache hierarchy aware core assignment, and page coloring based cache partitioning. Experimental results demonstrate that these approaches are effective in isolating the cache interference impact one VM may have on another. We also incorporate these approaches into the resource management (RM) framework of our example cloud infrastructure, which enables the deployment of VMs with isolation enhanced SLAs.

Categories and Subject Descriptors
D.4.6 [Operating Systems]: Security and Protection

General Terms
Security, Performance, Measurement

Keywords
Isolation Attributes, Cache Isolation, Cache Coloring

1. INTRODUCTION

Cloud computing environments provide an Infrastructure as a Service (IaaS) model to host services built by independent service providers. This vision of cloud computing enables services to run on leased computation platforms.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. CCSW’09, November 13, 2009, Chicago, Illinois, USA. Copyright 2009 ACM 978-1-60558-784-4/09/11 ...$10.00.


These VMs, belonging to various independent SPs, are then deployed on the infrastructure based on the decisions of the resource manager (RM) of the CIP. We make the simplifying assumption that an SP will deploy all its components on a single CIP; moreover, any dependencies an SP might have on other SPs are also limited to the same CIP.

A specific example of such a cloud based service is the Virtual Desktop Experience (VDE) (refer to Figure 1). In this scenario the SP provides a virtual machine for each customer. The customer may access the virtual machine from another PC, or possibly from a dumb terminal or set top box. The service provider adds value by allowing roaming access to the VDE, and possibly by providing centralized professional management. To realize this architecture, the VDE cloud service, provided by one or more SPs, forms the middle layer. The SP has some number of session VMs and service VMs. A session VM is specific to a client, and works as her personal computer. These SPs in turn use cloud infrastructure services from a CIP, such as Microsoft Azure. This scenario shows an implicit dependency between a client and a single service provider. However, this is scenario specific, and may not be representative of cloud based services in general.

Another example is Live Mesh [3], where the client stores data with an SP, which in turn uses resources provided by the CIP. In this example, the SP may implement the service using a pool of VMs, which provide storage to the client via a web interface. These VMs then utilize some name space partitioning mechanism, such as a distributed file system, to store data for different clients. The file system may be implemented by the same SP, or it may use another SP that provides the file service. Ultimately, the disk storage required by this SP is provisioned by the CIP. In all these examples, there exist constraints at each boundary between adjoining layers that form the SLA between those layers.
For example, in our sample VDE scenario, a client is usually concerned about experience (including performance and ease of use), security of the service, and privacy of user data. Constraints based on these concerns, such as "screen resolution", "data encryption quality", and "need for anti-virus agent", are then presented to an SP as part of the SLA between the client and the SP. Similar security and privacy concerns are also applicable to the Live Mesh scenario. In this case SPs implement appropriate mechanisms in order to meet the concerns of their clients, such as encapsulating each client in a separate VM, or using access control mechanisms provided by file systems. These mechanisms, in turn, create concerns about isolation and resource management for the SP (e.g., whether a VM from another SP can adversely impact the performance of a Session VM, or how much CPU resource should be allocated to a Session VM), which are then passed on as constraints to the CIP as part of the SLA between the SP and the CIP. Table 1 outlines some of the SLA constraints typical of an SP; an SLA specification may use a subset of these at a given time.

The CIP actively manages resources in order to meet the SLA constraints specified by SPs. For our specific cloud infrastructure scenario, this includes assigning physical resources to the VMs. This resource assignment problem can be posed as a Constraint Satisfaction Problem (CSP) for the CIP. However, the constraints of the CSP are different from the SLA constraints, since they need to

Similar attacks are possible in the virtualized environment. Similarly, the use of shared caches makes it difficult to properly isolate performance [7], possibly allowing a malicious VM to launch a denial of service attack on another VM, and certainly threatening a service provider's ability to guarantee an SLA to its clients.

Isolation issues are further complicated by the fact that large scale Internet services will likely be developed by building services on top of existing underlying services, and not just the hosting VMM. In this case, a service might rely on services from multiple different service providers, and complete isolation among dependent services may not be possible (or indeed desirable). Currently, however, any interdependence among services implies a trust relationship: the dependent service trusts the other service to do the right thing (e.g., securely communicate and store any user related data), with severe repercussions for a service provider that trusts another service that is then compromised.

Isolation attributes for a service, defined as part of the SLA between a service provider and the infrastructure provider, serve two purposes: 1) to capture the degree of isolation demanded by a service (from both the performance and security points of view), and 2) to allow a service to authoritatively report its isolation characteristics so that service consumers can decide whether to trust it. We call the latter feature "isolation attestation"; it is specifically useful when one SP depends on another SP for some service, or when a cloud service user (client) is deciding whether to use a service. Based on the isolation attestation from the CIP, an SP can choose to trust another SP rather than being forced to trust both that SP and the cloud service provider. Similarly, this attestation can be provided to the client as a measure to gain trust in the SP. Our focus in this work is limited to the first item.
In particular, isolation related SLA constraints are used by the CIP's resource manager to manage the various service components encapsulated in VMs, including their instantiation, migration, and termination. We also present mechanisms to enforce some of the isolation constraints, focusing on the last-level cache as the shared resource in the multicore environment prevalent in today's cloud infrastructures. We explore two such mechanisms in the paper: cache hierarchy aware core assignment, and cache-aware memory assignment through page coloring. These techniques prevent cache-based side channel attacks and provide performance isolation from other VMs. We also present an example formulation of a constraint satisfaction problem (CSP) for VM placement in the cloud environment based on the enhanced SLA constraints. We conclude with future directions on further enhancing SPs' SLAs by integrating trusted computing techniques, such as attestation, into cloud computing platforms.

2. ISOLATION ATTRIBUTES FOR CLOUD SERVICES: EXAMPLE SCENARIOS

In the cloud services architecture, the service provider implements the service logic and presents it to clients over the Internet (the cloud). The service logic itself is typically composed of multiple components. The cloud infrastructure provider uses some container abstraction, e.g., a virtual machine, for service deployment, and it is up to the service provider to package the various components of the service into these containers. Several of these VMs belong to various independent SPs.


Figure 1: Example interaction between different entities in the virtual desktop experience cloud service

Constraint | Classification | Type | Comment
Number of processors | QoS attribute | Integer | Specified by the service based on the parallelism desired.
Goodput of service | QoS attribute | Float | Service-specific QoS metric. Can be used in lieu of specifying low level resource allocation, such as CPU share, RAM size, storage capacity, and network bandwidth [wood 2008, choi 2008].
Replication factor (r) | QoS attribute | Integer | Defines how many replicas of the VM are needed.
H/w fault domain (n) | Isolation attribute | Integer | Defines the number of physical nodes across which replicas should be scattered.
Cache based DoS attack avoidance | Isolation attribute | Boolean | Defines whether the VM must be safeguarded against a malicious VM that might cause cache thrashing.
Cache based side channel attack avoidance | Isolation attribute | Boolean | Defines whether the VM must be safeguarded against a malicious VM that might try to steal secrets (e.g., encryption keys) via cache-based attacks [6, 14, 13].

Table 1: Example SLA constraints specified by an SP

The shared cache can thus serve as the medium for side channel attacks. Addressing cache isolation issues is especially important in cloud computing scenarios, since interfering threads may belong to different SPs (more precisely, to VMs owned by different SPs). This can impact the ability of an SP to uphold the SLA to its clients, or make it more expensive for the SP to provision for it. To this end, we present two techniques that we have implemented - cache hierarchy aware core allocation and page coloring based cache partitioning - that provide better isolation.
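To make the structure of Table 1 concrete, the constraint set an SP hands to the CIP could be represented as a simple record. This is an illustrative sketch only; the field names are ours, not from the paper:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SLAConstraints:
    num_processors: int                     # QoS attribute, Integer
    goodput: Optional[float] = None         # QoS attribute, service-specific metric
    replication_factor: int = 1             # QoS attribute (r)
    hw_fault_domains: int = 1               # isolation attribute (n)
    avoid_cache_dos: bool = False           # isolation attribute
    avoid_cache_side_channel: bool = False  # isolation attribute

# The SLA used in the running example of Section 5:
sla = SLAConstraints(num_processors=2, replication_factor=5,
                     hw_fault_domains=5, avoid_cache_dos=True,
                     avoid_cache_side_channel=True)
```

An SLA specification would typically populate only a subset of these fields at a time, as noted in the text.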

be stated in terms of resources and mechanisms meaningful to the CIP, such as CPU share, size of RAM, storage capacity, and network bandwidth. We defer to Section 5 to describe one such formulation; first we describe mechanisms employed by the CIP to enforce some of the isolation attributes. In the following section we describe two such mechanisms for cache based isolation, along with a prototype implementation and evaluation.

3. ENFORCING CACHE ISOLATION IN MULTICORE SYSTEMS

3.1 Cache hierarchy aware core assignment
Multicore systems are prevalent in today's large scale data centers, which form the core of the cloud infrastructure, and this multicore trend is expected to continue. Shared caches are commonly used in such multicore architectures. A drawback of these designs, though, is that it is difficult to guarantee performance to a thread whose active working set spills out of its local non-shared caches into the last level cache (LLC), since other threads can simultaneously access the LLC, resulting in active interference. Hence, memory-bound threads that thrash the cache (maliciously or otherwise) can severely impact the performance of other applications sharing the LLC. Moreover, it is possible to impact the confidentiality of another thread via the shared cache.

Current generation multicore systems usually share the LLC at the package level; however, many server class machines deployed in data centers today are configured with multiple packages. These machines provide opportunities for placing VMs in a manner that excludes any cache sharing. Traditionally, details of how processing cores, caches, and memory are organized are exposed to software so that computational thread placement can be optimized. We group cores on a machine based on their LLC organization - all cores sharing the LLC are put in a single group. Next, if a VM's SLA defines isolation attributes related to cache, the cache hierarchy aware core assignment algorithm tries to satisfy these constraints by choosing a group that is currently not assigned to any other VM. Cores from this group are then assigned to different virtual processors of the VM. Depending on the number of virtual processors required by the VM, one or more groups may be used. This approach is simple to implement, although its biggest drawback is that it may result in under-utilization of the platform. In particular, if a VM requires cache isolation and uses fewer cores than the sum of all the cores in the groups assigned to it, the unassigned cores cannot be used.
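As a sketch, the group selection described above might look like the following. This is a simplified model we wrote for illustration; the group and VM structures are hypothetical, not from the paper's Hyper-V implementation:

```python
def assign_isolated_cores(vm_vcpus, groups):
    """Assign cores to a VM whose SLA requires cache isolation.

    groups: list of dicts like {"cores": [0, 1, 2, 3], "vms": set()},
    one per set of cores sharing an LLC.  Returns the list of assigned
    core ids, or None if not enough exclusively-owned groups are free.
    (Simplification: groups touched before a failure are not rolled back.)
    """
    assigned = []
    for group in groups:
        if group["vms"]:                 # group already used by some VM: skip
            continue
        take = min(vm_vcpus - len(assigned), len(group["cores"]))
        assigned.extend(group["cores"][:take])
        group["vms"].add("vm")           # the whole group is now reserved
        if len(assigned) == vm_vcpus:
            return assigned
    return None                          # isolation constraint cannot be met
```

Note how a single-VP VM still reserves a whole group, which is exactly the under-utilization drawback discussed above.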

3.2 Page-coloring based cache partitioning

Page coloring is a software method to control how the physical memory used by an application maps to the cache hardware. The number of colors that a cache can support is determined by its organization, and is obtained by multiplying the cache line size by the number of sets and dividing by the page size. In virtualized systems, the manner in which the hypervisor allocates memory pages to back a VM can influence the cache usage of threads in the VM. We utilize page coloring as a software mechanism for cache isolation by isolating the color sets used to back individual VMs running on CPU cores that share the LLC (i.e., belonging to the same group). Page coloring has the advantage over the cache hierarchy aware core allocation technique that it does not result in processor under-utilization. However, it is still possible to under-utilize the memory available on the platform. For example, if the memory required by a VM does not consume an integral number of the colors currently unassigned to other VMs, the remaining pages of those colors may not be assignable to any other VM. Also, since the cache is exclusively partitioned using colors, the performance of a VM may suffer if the VM's working set does not fit in its cache partition. Such performance degradation may be acceptable, as long as the QoS SLA constraints of the VM are satisfied.
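The color computation described above is easy to state in code. Note that the paper does not give the LLC associativity; the 16-way figure below is our assumption, chosen because it reproduces the 128 colors reported for the 8MB Nehalem LLC in Section 4:

```python
def num_page_colors(cache_bytes, line_bytes, associativity, page_bytes=4096):
    # number of sets = cache size / (line size * associativity)
    sets = cache_bytes // (line_bytes * associativity)
    # colors = (line size * number of sets) / page size
    return (line_bytes * sets) // page_bytes

# 8MB shared LLC, 64-byte lines, assumed 16-way associativity, 4KB pages:
print(num_page_colors(8 * 1024 * 1024, 64, 16))  # → 128
```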

4. EXPERIMENTAL EVALUATION OF CACHE ISOLATION TECHNIQUES

4.1 Implementation Details and Methodology
Figure 2: Nehalem cache hierarchy

The experimental platform we use is an 8-core Intel Nehalem based machine with 6GB RAM. The Nehalem cache hierarchy includes a local L1 and L2 per core, as well as an 8MB shared LLC, as shown in Figure 2. The machine consists of two such packages, organized as two NUMA nodes. All three levels of the cache have 64-byte cache lines. Hence, there are two core groups, and each group shares an LLC supporting 128 colors (based on a 4Kbyte page size). For initial results, we utilized a synthetic application running inside the VMs. The application allocates an array of a specified working set size, and then accesses it in a regular pattern. The Nehalem processor includes multiple hardware prefetching mechanisms to improve performance. For the included measurements, we have disabled some of these mechanisms to better isolate the effects of page coloring; in particular, we have disabled the Data Prefetch Logic (DPL) prefetcher. The impact of different hardware prefetching mechanisms on page coloring is part of our future work. The target VM running the synthetic workload is a single virtual processor (VP) VM. To observe the impact on performance, we run a perturbing VM comprising three VPs. For cache hierarchy aware placement, the perturbing VM is placed on a separate group (i.e., a separate package). For page coloring based cache partitioning, the perturbing VM is assigned cores from the same group as the target VM. The perturbing VM runs a memory intensive application with a varying number of threads that repeatedly access memory and cause cache thrashing.
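The synthetic application itself is not published; a minimal stand-in that allocates a working set and sweeps it in a regular pattern might look like this (the cache-line stride and the timing harness are our choices, not details from the paper):

```python
import time

def sweep_working_set(size_bytes, iterations, stride=64):
    """Allocate `size_bytes` and read it repeatedly at a 64-byte
    (cache-line) stride, returning elapsed wall-clock time and a
    checksum that prevents the reads from being optimized away."""
    buf = bytearray(size_bytes)
    checksum = 0
    start = time.perf_counter()
    for _ in range(iterations):
        for i in range(0, size_bytes, stride):
            checksum += buf[i]   # one read touches one cache line
    return time.perf_counter() - start, checksum
```

Running one copy in the target VM while several copies with 8MB working sets run in the perturbing VM would mimic the contention pattern of the experiments below.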

Our prototype implementation of the cache isolation techniques is based on the Hyper-V virtualization infrastructure [5]. Hyper-V consists of a micro-kernel hypervisor, which manages CPU and memory resources, and a privileged VM, called the primary partition, for overall management of the platform. In particular, the primary partition manages the life-cycle of other VMs, and also hosts a virtualization stack for I/O virtualization. As part of VM creation, the memory management component of Hyper-V running in the primary partition allocates physical memory pages to back the memory pages of the VM. We modified this component to use a variant of the Windows NT kernel's memory allocation API that allows the caller to specify an address range and stride factor for allocated pages. We used these parameters to limit a VM's pages to a set of colors so that only a percentage of the LLC would be accessible. The actual number of colors supported by a platform is specific to its cache organization, and is described later in the section. Next, we enhanced the configuration of every physical machine with two pieces of information - the group information for cores, and the numbers of page colors and their current sizes.
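Restricting a VM's backing pages to a color set can be sketched with a simplified model in which a page's color is its physical frame number modulo the number of colors. This is an illustration only, not the actual Windows NT allocation API mentioned above:

```python
def pages_with_colors(pfns, allowed_colors, num_colors=128):
    """Filter physical frame numbers down to an allowed color set.

    In this simplified model a page's color is its physical frame
    number modulo the number of colors; the real implementation uses
    a kernel allocation API with an address range and stride factor,
    as described in the text.
    """
    return [pfn for pfn in pfns if pfn % num_colors in allowed_colors]

# Frames 0..511 restricted to colors {0, 1} out of 128:
print(pages_with_colors(range(512), {0, 1}))
# → [0, 1, 128, 129, 256, 257, 384, 385]
```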

4.2 Experimental Results

Our first set of experiments considers the impact of cache sharing by consolidating the target VM and the perturbing VM on the same core group (quad-core package) with no page coloring. Here we measure the execution time of the benchmark application in the target VM. Figure 3 provides the execution time from our experiments. When the target VM executes alone, we observe that once the working set fits within the hardware LLC size (8MB), the execution time drops to a baseline value of approximately 40 seconds. Subsequently, we introduce threads in the perturbing VM that have a working set of 8MB. As expected, as such threads execute concurrently on the remaining three cores in the group, we see increased performance interference with the measured application, requiring smaller working set sizes before the execution time drops to the baseline value. These initial results without any cache isolation technique highlight that interference from other threads in the shared LLC can have a significant performance impact (up to 400%).


We observe that in the unconsolidated case there is a penalty from coloring (up to 3.6x for 50% coloring). This is expected, since coloring limits the ability of the application in the target VM to make use of the entire LLC where it otherwise would have. Once threads from the perturbing VM are included, however, we observe that the execution time can be cut by up to a factor of three with coloring. These numbers highlight that although page coloring can impact performance, it is an effective means of providing cache isolation. As long as the performance degradation does not violate any QoS SLA constraint, this approach can be used to provide performance and security isolation to a VM.

We next look at how cache management can impact isolation for the target VM.

Figure 3: Performance of target VM with varying degrees of perturbation

For the cache hierarchy aware core assignment approach, the target and perturbation VMs are placed in different core groups. In particular, the VMs are assigned cores and memory from separate NUMA nodes; hence, the target VM does not share cache resources with the perturbation VM. When the target VM executes without the perturbing VM, its performance is similar to that in the previous experiment without cache isolation, as shown by the lowest curve in Figure 3. Execution of the perturbation VM does not result in any visible loss of performance for the target VM; for brevity, we have omitted these results from the paper. We evaluate page coloring by using it to segment the LLC. We assign the target VM a preferential share of 50% (by using half of the total number of colors available, which in our case is 8), and color the perturbing VM with the remaining 50%. Figure 4 depicts the performance of the measured application inside the target VM with page coloring turned on. We observe that with coloring, the performance of the target application is fairly consistent as additional threads are added to the perturbing VM. We next look at the performance impact of this static isolation when compared to the non-colored case.

Figure 5: Comparative performance of target VM with and without page coloring

5. BRINGING IT TOGETHER: AN SLA DRIVEN APPROACH TO RESOURCE MANAGEMENT IN THE CLOUD INFRASTRUCTURE

In this section, we demonstrate how to utilize various resource management techniques in the cloud infrastructure so as to satisfy the various QoS and isolation SLA constraints put forth by an SP to the CIP. As specified earlier, the SLA constraints are converted into a set of CIP specific constraints, defined in terms of attributes related to resources available at the CIP. The problem then reduces to a constraint satisfaction problem (CSP). Formally, a CSP is defined by a set of constraints C over a set of variables X, where the variables in X take values in the domain D. The goal is to find value assignments to X such that all the constraints in C are satisfied. For the example cloud infrastructure, the CSP can be informally stated as: given a set of VMs with CIP specific constraints, is it possible to place these VMs on a subset of physical nodes in the infrastructure such that all CIP specific constraints related to these VMs are satisfied? We will present the CSP formulation with the help of a running example.
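For intuition, the (X, D, C) form above can be solved for toy instances by exhaustive search. This illustrative brute-force solver is ours, not the formulation used by the RM:

```python
from itertools import product

def solve_csp(variables, domains, constraints):
    """Return the first assignment of values (drawn from each
    variable's domain) satisfying every constraint, else None."""
    for values in product(*(domains[v] for v in variables)):
        assignment = dict(zip(variables, values))
        if all(check(assignment) for check in constraints):
            return assignment
    return None

# Toy instance: two variables over {1, 2} that must differ.
print(solve_csp(["x", "y"], {"x": [1, 2], "y": [1, 2]},
                [lambda a: a["x"] != a["y"]]))  # → {'x': 1, 'y': 2}
```

Exhaustive search is exponential in the number of variables, which is why the CIP would use a greedy pass or a dedicated solver instead.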

Figure 4: Performance of target VM with page coloring

Figure 5 compares the execution time of the various scenarios with and without coloring, with the Y-axis shown on a logarithmic scale.


FOREACH vm IN VMs
  FOREACH blade IN Blades
    FOREACH D IN blade.ProcessorDomains
      FOREACH P IN D.PageColorDomains
        vm.Blade = blade
        vm.ProcessorDomain = D
        vm.PageColorDomain = P
        IF all constraints evaluate to true
          Jump to next vm
        ELSE
          vm.Blade = NULL
IF THERE EXISTS vm IN VMs : vm.Blade == NULL
  PRINT "FAILED"
ELSE
  PRINT "SUCCEEDED"

the greedy approach does not guarantee finding a solution even if one exists. We are currently investigating a more generic formulation of this problem using the Microsoft Solver Foundation (or other CSP solvers).
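A direct transliteration of the Figure 7 pseudo-code into Python might look as follows, with constraint evaluation abstracted behind a caller-supplied `satisfies` predicate (our placeholder; the blade/domain dictionaries are likewise illustrative):

```python
def place_vm(vm, blades, satisfies):
    """Try every (blade, processor domain, page color domain) triple
    for one VM, keeping the first that satisfies all constraints."""
    for blade in blades:
        for d in blade["proc_domains"]:
            for p in d["page_color_domains"]:
                vm["blade"], vm["proc_domain"], vm["color_domain"] = blade, d, p
                if satisfies(vm):
                    return True
    vm["blade"] = None      # no feasible triple found for this VM
    return False

def greedy_place(vms, blades, satisfies):
    for vm in vms:
        place_vm(vm, blades, satisfies)
    if any(vm["blade"] is None for vm in vms):
        print("FAILED")
        return False
    print("SUCCEEDED")
    return True
```

Like the pseudo-code, this greedy pass never backtracks, which is exactly why it can miss placements that a full CSP solver would find.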

Figure 7: Pseudo-code of a greedy algorithm for the CSP formulation

Suppose the SP specifies the following SLA for its service:

Number of processors = 2
Replication factor (r) = 5
H/w fault domain (n) = 5
6. RELATED WORK

There is little prior work on security and isolation specific SLA constraints. Monahan et al. define example security related SLA constraints that are applicable in cloud computing scenarios [10]. However, they only broadly define isolation among multiple services. To our knowledge, this is the first attempt at characterizing specific isolation related attributes for SLAs between the CIP and SPs. Specifically, we define attributes to thwart cache based side channel attacks in a shared cloud computing infrastructure. We also extend the resource management framework to include these isolation based constraints when deploying and managing services in the infrastructure. Cache based interference has given rise to many isolation problems in multicore systems, impacting both performance [7] and security [6, 14, 13]. Prior research on cache based isolation includes many software [8, 15] and hardware techniques [16, 9], with the software techniques focusing mostly on performance isolation. In this work, our focus is on using software approaches for both security and performance isolation in the virtualized environment. Further hardware support [11] may be necessary to provide better performance isolation guarantees.

Cache based DoS attack avoidance = True
Cache based side channel attack avoidance = True

The goal for this specific example is to place 5 VMs (based on the replication factor, r = 5) on physical machines in the cloud such that the SLA is satisfied. In our example cloud infrastructure model, a physical node is identified as a Blade object, and the complete set of these objects is the set Blades. Figure 6 and Table 2 describe the various attributes associated with a Blade object. Let VMs be the set of virtual machines, corresponding to the five replicas vm1, . . . , vm5, that need to be placed on the set Blades. Each VM object has the following decision variables that need to be solved:

7. CONCLUSIONS AND FUTURE WORK

We envision that in future cloud computing environments, service providers (SPs) will also specify security and performance isolation constraints as part of their Service Level Agreement (SLA). One such set of constraints advocated in this paper is based on cache sharing in contemporary multicore systems, where a VM may severely impact another VM's performance, and compromise its secrecy and integrity, using cache based interference. To this end, we present two approaches to provide cache-based security and performance isolation: cache hierarchy aware core assignment, and page-coloring based cache partitioning. Experimental results from our prototype implementation on the Hyper-V virtualization platform demonstrate that both of these techniques are effective in providing the required isolation properties. We also provide a generic Constraint Satisfaction Problem (CSP) formulation that incorporates these approaches into the general resource management framework of our example cloud infrastructure. We are currently implementing our CSP formulation using the Microsoft Solver Foundation [4], and plan to evaluate the impact of SLA isolation attributes on the overall cost of VM placement in a typical cloud infrastructure.

In the future, we plan to incorporate attestation of an SP's isolation attributes by the CIP. Such "isolation attestation" will enable clients and other SPs in the cloud services platform to make an informed trust decision regarding whether, and how much, to depend on a particular service. Another class of isolation issues that we plan to address in the future arises from the fact that cloud administrators or other management related entities can impact a service's confidentiality and/or integrity. Although we do not consider denial of service attacks by an entity in the CIP - we assume that a

∙ Blade, the mapping to a blade object;
∙ ProcessorDomain, the processor domain object; and
∙ PageColorDomain, the page color domain object.

The domain of the Blade decision variable is the set of blades, Blades. Similarly, the domain of the ProcessorDomain variable is the set of ProcessorDomain objects present in the cloud. However, there is an added constraint that a VM's ProcessorDomain must belong to the same Blade that corresponds to the value of the decision variable Blade. A similar constraint applies to the PageColorDomain. The goal of the resource manager, then, is to find a placement that satisfies the constraints presented in Table 3. A solution to this placement problem is a valid assignment of the objects Blade, ProcessorDomain, and PageColorDomain such that all the constraints are satisfied. We currently use a simple greedy approach to find the solution, as described below (refer to Figure 7). Our current formulation, and the greedy algorithm, do not yet consider using multiple ProcessorDomains or multiple PageColorDomains to satisfy a VM's resource requirements (number of processors and amount of memory, respectively). Also,


Blade
  AvailableProcessors
  FaultDomain
  ProcDomains
    D1
      Capacity
      CurrentVMs
      Available
      PageColorDomains
        P1
          Capacity
          CurrentVMs
          Available
        ...
    ...
Figure 6: Hierarchical attributes of a Blade object

Attribute | Type | Comment
AvailableProcessors | Integer | The number of processors currently available for reservation.
FaultDomain | Integer | Identifies the hardware fault domain number assigned to this blade. A different number implies a different fault domain. Currently each blade is in its own unique fault domain.
ProcessorDomains | Set | Set of ProcessorDomain objects. Each processor domain's cache is independent of the others. Each ProcessorDomain object in turn has the attributes Capacity, CurrentVMs, Available, and PageColorDomains, described next.
.Capacity | Integer | Number of processors in this domain.
.CurrentVMs | Set | Set of VMs assigned to this domain.
.Available | Integer | Number of processors available in this domain.
.PageColorDomains | Set | Set of PageColorDomain objects. The memory pages in different page colors do not intersect. Each PageColorDomain object in turn has the attributes Capacity, CurrentVMs, and Available, described next.
..Capacity | Integer | Number of pages in this domain.
..CurrentVMs | Set | Set of VMs assigned to this domain.
..Available | Integer | Number of pages available in this domain.

Table 2: Details of attributes of a Blade object service provider can simply switch to another CIP if this is the case, the other security problems are more pernicious in that SPs may never discover the compromise. This problem is present for non-virtualized cloud environments, but is perhaps more worrisome in virtualized environments where the infrastructure provider has quick and easy access to the hosting software. Thus a malicious administrator, who is an integral part of current virtualized environments, can compromise the confidentiality/integrity of a service. Recent attempts at disaggregation in virtualized environments deal with some of these issues [12]. These disaggregation techniques, combined with the remote platform attestation of virtualization software that employs them, will form the root of trust for SPs/client and a basis for trust in “isolation attestation” provided by the CIP.

SLA Constraint        | Value | Translated constraint (evaluates to either true or false)
Number of processors  | 2     | ∀vm ∈ VMs, vm.Blade ∈ Blades : vm.Blade.AvailableProcessors ≥ 2
H/w fault domain (n)  | 5     | ∀vm1, vm2 ∈ VMs, vm1 ≠ vm2 : vm1.Blade.FaultDomain ≠ vm2.Blade.FaultDomain
Cache based isolation | True  | ∀vm ∈ VMs : (vm.ProcessorDomain ∈ vm.Blade.ProcessorDomains AND (vm.ProcessorDomain.CurrentVMs = ∅ OR (vm.PageColorDomain ∈ vm.ProcessorDomain.PageColorDomains AND vm.PageColorDomain.CurrentVMs = ∅)))

Table 3: CSP Formulation

8. REFERENCES

[1] Amazon Elastic Compute Cloud (EC2). http://aws.amazon.com/ec2/.
[2] Microsoft Azure Services Platform. http://www.microsoft.com/azure/default.mspx.
[3] Microsoft Live Mesh. www.mesh.com.
[4] Microsoft Solver Foundation. http://code.msdn.microsoft.com/solverfoundation.
[5] Virtualization with Hyper-V. http://www.microsoft.com/windowsserver2008/en/us/hypervmain.aspx.
[6] D. J. Bernstein. Cache-timing attacks on AES. http://cr.yp.to/antiforgery/cachetiming-20050414.pdf.
[7] D. Chandra, F. Guo, S. Kim, and Y. Solihin. Predicting inter-thread cache contention on a chip multi-processor architecture. In HPCA '05: Proceedings of the 11th International Symposium on High-Performance Computer Architecture, pages 340–351, Washington, DC, USA, 2005. IEEE Computer Society.
[8] A. Fedorova and M. Seltzer. Improving performance isolation on chip multiprocessors via an operating system scheduler. In PACT 2007: 16th International Conference on Parallel Architecture and Compilation Techniques, pages 25–38, Sept. 2007.
[9] S. Kim, D. Chandra, and Y. Solihin. Fair cache sharing and partitioning in a chip multiprocessor architecture. In PACT '04: Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques, pages 111–122, Washington, DC, USA, 2004. IEEE Computer Society.
[10] B. Monahan and M. Yearworth. Meaningful security SLAs. Technical report, HP Labs, 2008.
[11] T. Moscibroda and O. Mutlu. Memory performance attacks: denial of memory service in multi-core systems. In SS'07: Proceedings of the 16th USENIX Security Symposium, pages 1–18, 2007.
[12] D. G. Murray, G. Milos, and S. Hand. Improving Xen security through disaggregation. In VEE '08: Proceedings of the Fourth ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, pages 151–160, 2008.
[13] D. A. Osvik, A. Shamir, and E. Tromer. Cache attacks and countermeasures: the case of AES. In Topics in Cryptology – CT-RSA 2006, The Cryptographers' Track at the RSA Conference 2006, pages 1–20. Springer-Verlag, 2005.
[14] C. Percival. Cache missing for fun and profit. http://www.daemonology.net/papers/htt.pdf.
[15] D. Tam, R. Azimi, L. Soares, and M. Stumm. Managing shared L2 caches on multicore systems in software. In Workshop on the Interaction between Operating Systems and Computer Architecture, 2007.
[16] Z. Wang and R. B. Lee. New cache designs for thwarting software cache-based side channel attacks. In ISCA '07: Proceedings of the 34th Annual International Symposium on Computer Architecture, pages 494–505, New York, NY, USA, 2007. ACM.