Meltdown/Spectre impact on SGX

Could we have an official statement from Intel (or an educated guess) regarding the impact of the newly discovered Meltdown/Spectre attacks on SGX? Are secrets stored inside an enclave at risk? This has been asked at https://security.stackexchange.com/questions/176635/how-does-meltdown-sp... (by someone else) as well, but no replies so far.

Could someone please clarify this?

I direct technical development for an SGX research and development group. We develop an alternative implementation of the PSW focused on the needs of intelligent network endpoint devices and embedded systems. So I believe we have a moderate to strong understanding of all the moving pieces involved in making SGX work.

We have been up all night studying the papers, trying to estimate the impact of Meltdown on SGX. Since SGX is designed to be secure under an Iago threat model, it would be most problematic if unprivileged user code, or privileged code for that matter, were able to dump out the contents of initialized enclaves at 500 KB/s.

Here is what we are currently worried about, absent official notification from Intel. Hopefully official word will be coming out soon given the fact that this type of scenario is what the technology is designed to protect against, but I'm sure they are pretty busy right now.

The covert channel that Meltdown uses is based on probing a region of ostensibly access-protected memory by studying the cache impacts of byte differential displacements of reference attempts to the protected address. An access exception is properly posted when the instruction is 'complete', i.e. retired from the re-order buffer; however, since the memory access actually succeeded, its impact can be discerned from the perturbations the 'failed' lookup caused.
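To make the probing step concrete, here is a toy model of that covert channel. It is a simulation, not an attack: the `ToyCache` class, the `FAST`/`SLOW` cycle counts and the 256-slot probe array are assumptions chosen for illustration, and nothing here touches real cache state. The only point is that one cached slot out of 256 is enough to recover a byte.

```python
import random

FAST, SLOW = 40, 200  # assumed cycle counts for a cache hit / miss (illustrative only)


class ToyCache:
    """Tracks which of the 256 probe slots are currently 'cached'."""

    def __init__(self):
        self.cached = set()

    def flush_all(self):
        self.cached.clear()

    def touch(self, slot):
        self.cached.add(slot)

    def timed_read(self, slot, rng):
        base = FAST if slot in self.cached else SLOW
        self.cached.add(slot)            # reading a slot also brings it into the cache
        return base + rng.randint(0, 20)  # small measurement jitter


def transient_access(cache, secret_byte):
    # Stands in for the never-retired dependent load: the probe slot selected
    # by the secret byte ends up in the cache even though the access 'failed'.
    cache.touch(secret_byte)


def recover_byte(cache, rng):
    # Time every probe slot once; the fastest one reveals the byte.
    timings = [cache.timed_read(slot, rng) for slot in range(256)]
    return min(range(256), key=timings.__getitem__)


def leak(secret, seed=0):
    rng = random.Random(seed)
    cache = ToyCache()
    recovered = bytearray()
    for byte in secret:
        cache.flush_all()
        transient_access(cache, byte)
        recovered.append(recover_byte(cache, rng))
    return bytes(recovered)
```

Calling `leak(b"SGX")` walks the flush, transient access and timed reload steps once per byte. On real hardware the hit/miss gap is much noisier than in this model, so each byte would normally be sampled many times.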

An attempt to access initialized enclave memory from an address outside of the enclave address space obviously generates an exception. One of the operative issues is whether or not the MM{U,E} infrastructure elects to generate the exception in a different manner for this type of event. We have reasonable concerns that this may not be the case which would be the first crack in the armor.

We obviously know that SGX enclave data is encrypted while in main memory. The covert channels are dependent on forcing data reads from main memory into the cache architecture in order to discern the timing differences. There is therefore some reason to believe that while Meltdown could be used to discern the contents of enclave memory the best that could be accomplished would be to discern the value of the encrypted bytes, which would by definition be as secure as the quality of the encryption.

This is all decidedly suppositional; only Intel will be able to provide suitable guidance.

We are going to be poring over the x86 architecture documents in fine detail; the answer may be there, but it needs to be found. We will post back any further analysis from those efforts if something official doesn't come out.

Hopefully all of the above is helpful.

Have a good day.

Dr. Greg

Hi all,

My name is Dan O'Keeffe from the LSDS group at Imperial College London. We've just released a proof-of-concept Spectre-like attack on Intel SGX enclaves on GitHub: https://github.com/lsds/spectre-attack-sgx. We'd be interested to hear your opinion.

Thanks!

Dan

Good morning to everyone.

We spent the weekend studying the architecture manuals and the SGX security proofs. We had some initial thoughts that SGX may be more resistant to micro-architectural cache probing attacks, but we became more pessimistic about the situation as time went on. As Dan's group went on to confirm, there is really no reason to assume that SGX enclaves would be resistant to this approach given the architectural model it is based on.

Some thoughts and reflections for other readers.

First of all, a clarification on exactly where SGX committed data is encrypted. On review, Devadas' paper confirms that the Memory Encryption Engine (MEE) sits at the edge of the on-chip memory controllers, below the caches. This implies that enclave data is decrypted on memory fetch and populated into the caches as plaintext. This makes perfect sense when one considers that it would not be possible for speculative execution to operate unless it was working on plaintext.

We concluded the situation may be pessimistic when we noted that the architectural documents confirm that access protections are applied by the Page Miss Handler (PMH) when a Translation Lookaside Buffer (TLB) refill is requested; this is AFTER page table attribute checks are performed. This is almost exactly analogous to why the Meltdown attack is effective: an exception is delivered only when the instruction is retired from the re-order buffer. This is after the necessary memory fetches and cache fills have been executed; as a result, the micro-architectural state of the platform is affected and is capable of being probed. The SGX access exceptions are invoked even LATER than the access protections which failed to thwart the data disclosure in Meltdown.

In reading the architectural documents it is tempting to assume that enclaves would not be vulnerable since they talk about ENCLU[EENTER], ENCLU[ERESUME] and ENCLU[EEXIT] instructions being serialization points where the TLB cache and out-of-order execution pipelines are flushed. The primary security premise is that the SGX modifications to the hardware PMH are effective in blocking a page fetch from being reflected as a TLB fill. Unfortunately, it appears as if this check, as in the case of page access protection checks, occurs too late to defend against a micro-architectural state probe.

The invariant case one premise for the SGX security induction proof requires that when a processor is outside of enclave mode the TLB can only contain physical addresses belonging to DRAM pages outside the enclave. The induction moves forward on the premise that any attempts to access enclave memory will be rejected and replaced with an abort page by the PMH. Unfortunately, this supposition appears to be invalid with respect to the actual micro-architectural state of memory.

If all of this reasoning is correct, it is extremely interesting that the modified PMH in the SGX architecture was allowed to proceed forward with the actual memory fill. The architecture documents specifically indicate that the Enclave Page Cache Map (EPCM) is consulted by the PMH so that a decision can be made to reject TLB slot population when the processor is operating in non-enclave mode. The information should be there for the system to completely deny the memory access, which would have had a significant impact on SGX security posture.

So this line of reasoning would be in support of the LSDS group findings that a modified SPECTRE attack could be conducted against an enclave.

If, as the LSDS findings indicate, the attack is successful, it provides a positive demonstration that enclaves are susceptible to micro-architectural probing by the process which is running the enclave. In its current implementation two ECALLs are required: one for the enclave to return the offset from one of the probe arrays to the 'secret' value to be exfiltrated, and a second to conduct the necessary indirection references in order to exploit the speculation vulnerability. One could argue that an enclave is always vulnerable to its application asking it to disclose sensitive information.

Of much more concern would be the ability of an arbitrary external process, without ECALL access to the enclave, being able to effectively provoke and extract micro-architectural effects of enclave cache state to exfiltrate confidential data. The SGX community, I'm sure, would be very interested and concerned about this level of attack. Your findings do indicate that enclave developers need to apply defensive practices inside of their enclave and in their enclave interface practices.
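As one illustration of a defensive practice at the enclave interface, a commonly discussed approach for bounds-check bypass is to clamp an untrusted index with arithmetic rather than rely only on a branch. The sketch below shows just that arithmetic. The names `clamp_index` and `ecall_lookup` are hypothetical, and Python itself gives no speculation guarantees; in real enclave code this would be written in C, typically together with a speculation barrier such as LFENCE.

```python
def clamp_index(index, length):
    """Return index if 0 <= index < length, otherwise 0, using a mask.

    Illustrative only: the comparison below is still a branch inside the
    Python interpreter. The point is the mask arithmetic, not the timing.
    """
    in_range = int(0 <= index < length)  # 1 when in range, 0 otherwise
    mask = -in_range                      # -1 (all ones) or 0
    return index & mask


def ecall_lookup(table, index):
    """Hypothetical ECALL body: every table access uses the clamped index."""
    safe = clamp_index(index, len(table))
    value = table[safe]        # the access itself never leaves the table
    if safe != index:          # report the bad index architecturally
        raise IndexError("index outside enclave table")
    return value
```

The design choice is that the memory access is always issued with the clamped index, so a mispredicted bounds check has no out-of-range address to fetch from; the error is only reported afterwards.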

What we find as an interesting threat scenario is that enclaves have unfettered access to physical memory. One can picture a model where an enclave is the 'receiver' in a covert side-channel attack. All the necessary architecture is present for an enclave to serve as a conduit for spiriting the contents of memory off a platform in a confidential fashion in the face of these micro-architecture vulnerabilities.

Intel has arguably disabled access to precision timing information inside of the enclave. I don't have the reference immediately at hand but one group has demonstrated the ability of enclave code to implement code based timing primitives with sufficient resolution for cache timing attacks.
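For readers unfamiliar with the idea, one well-known way to build a code-based timing primitive is a counting thread: a second thread increments a shared counter in a tight loop and the measuring code reads the counter before and after an operation. Whether this is the construction the group in question used is an assumption. The sketch below only shows the structure; the `CounterClock` name is hypothetical, and a Python thread has nowhere near the resolution needed for cache timing.

```python
import threading
import time


class CounterClock:
    """A software 'clock': a thread that increments a counter in a tight loop."""

    def __init__(self):
        self.ticks = 0
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        while not self._stop.is_set():
            self.ticks += 1

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, *exc_info):
        self._stop.set()
        self._thread.join()

    def measure(self, operation):
        """Return how many ticks elapsed while operation() ran."""
        start = self.ticks
        operation()
        return self.ticks - start


def compare_durations():
    # A longer operation should accumulate more ticks than a shorter one.
    with CounterClock() as clock:
        short = clock.measure(lambda: time.sleep(0.01))
        long_ = clock.measure(lambda: time.sleep(0.2))
    return short, long_
```

In an enclave the same structure would be native code running on a sibling thread, which is why removing the hardware timestamp counter alone does not rule out timing measurements.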

It would have complicated the model, but we have always felt that it would have been an interesting capability to specify constraints so that only certain enclave addresses/pages would have the ability to access main memory. Arguably only the ECALL and OCALL interface code actually has the need to reference main memory. This would have limited the ability of enclaves to be used to mask nefarious memory probing and similar scenarios.

Very interesting findings by the LSDS group, thanks for bringing them to the attention of the forum. The last couple of weeks have caused us to be very thankful that we depend on enclaves for integrity rather than confidentiality guarantees.

One of the outstanding questions with respect to SGX confidentiality guarantees will be the level of microcode modifications that Intel can make to the SGX technology. SGX was designed around the model of a microcode implementation so that security modifications could be made. If PMH behavior could be modified it would be theoretically possible for Intel to apply security protections against these types of attacks to enclave based data.

Hopefully all of this is helpful to others working in the field.

Have a good day.

Dr. Greg

Thank you for that detailed post Greg W.

I think I agree with you on all the points after reading those sources. I am envisioning a Meltdown attack on enclaves as a scenario where non-enclave code tries to access memory in the enclave ELRANGE. I have two questions:

  1. From reading the architectural specification documents and Devadas's paper, is your conclusion that the MEE will be independent of any access checks, i.e. the data from ELRANGE will be decrypted regardless, and the access checks will happen subsequent to that, meaning that decrypted data will be available in the cache for the attack to happen?
  2. The Meltdown paper says that the attack depends on leaking the micro-architectural effects of reading otherwise inaccessible memory by executing a set of data-dependent instructions that will never be retired, because the original illegal memory access will lead to an exception. In the case of mounting a Meltdown attack on enclaves, there wouldn't really be an exception; rather, the illegal access would result in Abort Page semantics, which will replace the data read with all 1s. So the subsequent data-dependent instructions will continue to be executed with this replaced data and be retired. Might that add noise to, or negate, the micro-architectural effects?
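The second question can be pictured with a small model. It assumes, as the question does, that abort page semantics hand an all-ones byte (0xFF) to the dependent instructions; whether the substitution really happens before those instructions run is exactly the open point, so the `abort_page_applies` flag is a hypothetical switch and not a statement about the hardware.

```python
ABORT_PAGE_BYTE = 0xFF  # abort page reads return all 1s


def transient_value(secret_byte, abort_page_applies):
    """Byte seen by the dependent instructions after a non-enclave read of ELRANGE."""
    if abort_page_applies:
        return ABORT_PAGE_BYTE  # substituted value, unrelated to the secret
    return secret_byte          # plaintext forwarded before the check takes effect


def probe_slots_touched(secret, abort_page_applies):
    """Which probe slot gets cached for each secret byte in the toy model."""
    return [transient_value(byte, abort_page_applies) for byte in secret]
```

With the substitution in effect, `probe_slots_touched(b"key", True)` is `[255, 255, 255]`: the same slot is cached regardless of the secret, so the probe carries no information. Without it, the slots track the secret bytes directly.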

Quick note to say that we would also be interested in a statement as to the impact of Spectre on SGX.

Regards,

-Arthur

Quote:

Greg W. wrote:

 

In reading the architectural documents it is tempting to assume that enclaves would not be vulnerable since they talk about ENCLU[EENTER], ENCLU[ERESUME] and ENCLU[EEXIT] instructions being serialization points where the TLB cache and out-of-order execution pipelines are flushed. The primary security premise is that the SGX modifications to the hardware PMH are effective in blocking a page fetch from being reflected as a TLB fill. Unfortunately, it appears as if this check, as in the case of page access protection checks, occurs too late to defend against a micro-architectural state probe.

The invariant case one premise for the SGX security induction proof requires that when a processor is outside of enclave mode the TLB can only contain physical addresses belonging to DRAM pages outside the enclave. The induction moves forward on the premise that any attempts to access enclave memory will be rejected and replaced with an abort page by the PMH. Unfortunately, this supposition appears to be invalid with respect to the actual micro-architectural state of memory.

 

 

Are you saying that the PMH is actually allowed to read and fill the TLB with the 'victim' entry, then consult the EPCM to check whether the access is valid, and eventually remove the entry from the TLB if it is not? Could you provide any references for that, please?

 

Good morning to everyone, I hope the week is going well, blizzard conditions here with -34C wind chills.

We wanted to get a note back to everyone who may be interested in following these issues. We are putting together more detailed briefing material as well, on what we consider, and are already implementing, as best practices for SGX development in light of our findings and analysis. When we get those up on our web-site we will provide a link.

We can now state with full determinism that we have been able to reproduce and demonstrate, in our labs, the conditional branch misprediction attack from the Spectre paper that Dan O'Keeffe and the group at LSDS in London posted SGX application and enclave code for. The demonstration was done on Linux with the alternative SGX PSW and SDK environments that we develop and maintain for minimum footprint SGX environments. The LSDS code was done with the standard SGX SDK, so our reproduction provides confirmation of the security regression in a completely different implementation environment.

We believe the posted code has an error that would prevent it from demonstrating the security regression. This may be due to an error, either intended or accidental, in the code that was released. We will refrain from discussing what we believe to be the issue, in case the former was their intention. We would stress that we found what seems to be the error only through code inspection, since the two runtime environments are significantly different.

We believe it is important to stress that this vulnerability is not so much of interest due to its potential applicability or immediate security impact. Rather it helps confirm our reasoning and hypothesis, based on analysis of the SGX architectural documents, of the susceptibility of properly initialized and non-debug enclaves to micro-architecturally induced cache disturbance attacks.

We are currently working to refine our understanding of all this with respect to rogue cache load attacks, which are the basis for the Meltdown style vulnerabilities. Based on the SGX architectural confirmations provided by the branch misprediction findings, we would advise extreme caution on platforms which are not using split-VM configurations, i.e. KPTI remediations, and possibly beyond that. We have significant resources invested in all of this, so we are not heavily invested in demonstrating attacks; we are laser focused on understanding these issues and developing effective responses. We will address some of the issues noted by Divya and others in further briefing documents.

Hopefully all of this is useful to the SGX community.

We will try to get more information released in the next day or so. Right now I need to get the skid-steer loader going and dig my wife's car out of the four-foot drift it is currently sitting in and then get back to work.

Have a good afternoon.

Dr. Greg

Have we still not got an official statement from Intel regarding the impact of the Meltdown/Spectre attacks on SGX?

Come on Intel, we are waiting. I have directors asking if it is worth investing in SGX or whether we should be investing in alternative technologies. The longer Intel remains quiet, the harder it becomes to make the case for SGX.

We have now been waiting for 2 weeks, we need an official statement.

Regards,

-Arthur

There is no engineering rationale to support the conclusion that data consigned to an enclave would have an expectation of confidentiality protections in the face of these micro-architectural cache probing attacks.  Unless opposing rationale can be afforded, the operative security premise must be that there is no guarantee for enclaves providing confidentiality or runtime encryption protections for 'data in motion' in the face of this particular threat scenario.

In support of this we have an operational implementation of the Spectre conditional branch misprediction attack running in our labs which is demonstrating an 88% accuracy rate with respect to exfiltration of data inside of an enclave. This attack has the standard limitation of needing to be conducted by a process on itself, but it does provide the single instance proof needed to demonstrate that untrusted code can reliably infer the contents of data protected by an enclave encryption envelope and virtual memory protections. There is a high probability that demonstrating a rogue cache fill attack, i.e. Meltdown, is just a matter of putting in the necessary time.

The engineering rationale for this makes sense given the design of SGX.  We are working to put together a white paper which explains all of this, including immediately practical mitigations.  We will defer greater detail to that in order to avoid a TLDR phenomenon in this forum.

Before everyone throws out the baby with the bathwater it is important to note there is currently no indication of threat to enclave integrity, which is an extremely important tool to have in the toolbox when building high security assurance systems.

Best wishes for a pleasant weekend to everyone.

Dr. Greg

Disclaimer: We do not speak for Intel and there is probably remarkably little probability that Intel would want us to speak for them... :-)

Hi Greg W

Thanks for your (well written) thoughts.

I agree that based on the information that we currently have SGX enclaves don’t provide confidentiality guarantees, however integrity guarantees appear to be unaffected.

On this basis additional measures need to be used with the SGX to provide the confidentiality that Intel advertise the SGX as providing. I note that Intel do state that they don’t protect against side channel attacks, but the PoC code shows how permeable the cache is between the enclave and the calling code.

My question is: can Intel address this in microcode, do they recommend software measures (and if so, which), or do we have to wait for a new hardware architecture before SGX provides the advertised confidentiality?

The failure of Intel to comment leads me to conclude that we need to wait for a new hardware architecture.

Regards,

-Arthur

Good morning, I hope the week has gone well for everyone.

It took a bit of time to get all of the material organized but we now have a comprehensive review available of what we believe is the current state of SGX security in the face of the Meltdown and Spectre vulnerabilities.

The review is linked as a blog post off our web-site at http://www.idfusion.net

The document is also available in PDF form through the following link, in case anyone needs something printable or passable:

ftp://ftp.idfusion.net/pub/sgx/sgx-spectre-meltdown.pdf

Speaking to Gordon's concerns, we have included a discussion of potential mitigations that can be immediately implemented to protect SGX confidentiality guarantees.

We also believe there is a strong possibility that Intel will be able to modify the behavior of enclaves on current hardware to provide significant strengthening of enclaves to speculative exfiltration attacks. Those fixes may already be there and be testable but we haven't had the bandwidth for that. Given the stability concerns over the current crop of microcode we have taken a go slow approach to testing it.

It doesn't seem likely that Intel will be in a position to implement major operational level changes in the Page Miss Handler with microcode updates, but that is only speculation... :-) As our paper discusses, the security design of SGX is based on a virtual memory model and the inherent problem with these vulnerabilities is that they are extracting information from the operational level of the processor where, by design, the microcode/micro-ops have unconditional ability to fetch memory pages through the MM{U,E} infrastructure.

Hopefully all of this will be helpful.

Have a good weekend.

Dr. Greg
