IA-32 PAE in SMM context

The Intel Architecture Software Developer's Manual has the following to say about
using PAE (Physical Address Extension) while in SMM (System Management Mode):
    "The physical address extension (PAE) mechanism introduced in the
    P6 family processors is not supported when a processor is in SMM.
    The IA-32e mode address-translation mechanism is not supported in
    SMM. See Section 3.10 of Intel 64 and IA-32 Architectures
    Software Developer's Manual, Volume 3A."
and:
    "The addressable SMRAM address space ranges from 0 to FFFFFFFFH
    (4 GBytes). (The physical address extension (enabled with the
    PAE flag in control register CR4) is not supported in SMM.)"

Yet, how is system firmware supposed to scrub ECC errors in high memory
(such as on the 5000P chipset) if this is true?

I have done a small amount of testing on PIII, P4 Xeon, and Core 2 processors
and found that I am able to set up PAE while in SMM and have tested access
to high memory beyond 4 GB by changing the appropriate page table entries.
Can Intel confirm whether the above text is still valid for current processors
or provide insight as to how we can scrub ECC correctables?
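
For readers unfamiliar with "changing the appropriate page table entries," here is a minimal sketch of the arithmetic involved, assuming 32-bit PAE paging with a 2 MiB page used as a movable window onto high physical memory. The helper names (make_pde_2m, window_linear, and so on) and the choice of window address are hypothetical and not taken from the firmware discussed here. The sketch only computes values; it does not load CR3 or touch control registers.

```c
#include <assert.h>
#include <stdint.h>

/* Page-directory-entry flag bits used for a 2 MiB PAE page. */
#define PDE_PRESENT  (1ULL << 0)   /* P: entry is valid */
#define PDE_WRITABLE (1ULL << 1)   /* R/W: writes allowed */
#define PDE_PAGE_2M  (1ULL << 7)   /* PS: entry maps a 2 MiB page */
#define PAGE_2M_MASK 0x1FFFFFULL   /* low 21 bits: offset within the page */

/* In 32-bit PAE paging, linear bits 31:30 select one of 4 PDPT entries. */
static unsigned pdpt_index(uint32_t linear) { return linear >> 30; }

/* Linear bits 29:21 select one of 512 page-directory entries. */
static unsigned pd_index(uint32_t linear) { return (linear >> 21) & 0x1FF; }

/* Build a PDE that maps the 2 MiB frame containing 'phys'.
 * 'phys' may be above 4 GB because PAE entries are 64 bits wide. */
static uint64_t make_pde_2m(uint64_t phys)
{
    return (phys & ~PAGE_2M_MASK) | PDE_PRESENT | PDE_WRITABLE | PDE_PAGE_2M;
}

/* Linear address at which 'phys' becomes visible once the PDE covering
 * 'window_base' (a hypothetical 2 MiB-aligned window) points at its frame. */
static uint32_t window_linear(uint32_t window_base, uint64_t phys)
{
    assert((window_base & PAGE_2M_MASK) == 0);
    return window_base + (uint32_t)(phys & PAGE_2M_MASK);
}
```

With a window at linear 0xC0000000, firmware would store make_pde_2m(phys) into page-directory slot pd_index(0xC0000000) of the directory selected by pdpt_index(0xC0000000), invalidate the stale TLB entry, and then access window_linear(0xC0000000, phys).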


I am not entirely sure whether this is practical or not, but is it possible to scrub memory while you are in Long Mode? As far as I am aware, the IA-32e method isn't really necessary in long mode. 64-bit processors can access more than 4GB anyway, so you really should not have much difficulty testing those areas. Assuming, of course, that is what you are trying to do. In Long Mode you can use the full range of memory without needing PAE. Many modern 64-bit x86 processors have a physical limitation of 128GB, as far as I am aware. But maybe I am wrong.

I'm not the best on x86, but I do have a good idea of a few things that are no longer supported in long mode. I know that the hardware-based context switch and segmentation are two of those.

As far as I am aware, however, these features are supported in the processor compatibility modes to maintain compatibility with 32-bit and 16-bit x86 platforms. They are not necessary in long mode.

Correct me if I am wrong about this one, but that's what I interpreted from the last time I read those manuals.

And I may not be interpreting the question properly either (I don't have the best comprehension on the planet) so my answer here is based on my interpretation of your question.

The reason your test was successful is that you are not in Long Mode yet.

Hi Adam,

Thank you for the follow up. "Long mode" as you refer to it is also called "IA-32e" or "Intel 64" or "EM64T" by Intel. In order to transition to this mode, the processor must first transition through PAE if I am not mistaken. It would be perfectly reasonable to scrub memory while in "long mode," but I need to get there first. When the chipset detects a fatal, non-fatal, or correctable error, the entry mechanism for our BIOS is via an SMI. An SMI enters in 16-bit real mode, so software must first transition to protected mode, then enable PAE, and finally transition to "Intel 64."
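
The ordering described above can be sketched as a small software model. This is only an illustration of the architectural enable bits and their ordering rules outside of any SMM consideration; it does not run on bare metal and says nothing about whether the sequence is supported inside SMM, which is the open question in this thread. The struct cpu_state and the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_PE        (1u << 0)     /* protected mode enable */
#define CR0_PG        (1u << 31)    /* paging enable */
#define CR4_PAE       (1u << 5)     /* physical address extension */
#define IA32_EFER_MSR 0xC0000080u   /* MSR address of IA32_EFER */
#define EFER_LME      (1u << 8)     /* IA-32e mode enable */
#define EFER_LMA      (1u << 10)    /* IA-32e mode active (set by the CPU) */

/* Hypothetical model of the registers involved in the transition. */
struct cpu_state { uint32_t cr0, cr4, efer; };

/* Model of a CR0 write: returns false where real hardware would fault. */
static bool write_cr0(struct cpu_state *s, uint32_t value)
{
    bool pg = (value & CR0_PG) != 0;
    if (pg && !(value & CR0_PE))
        return false;               /* paging requires protected mode */
    if (pg && (s->efer & EFER_LME) && !(s->cr4 & CR4_PAE))
        return false;               /* IA-32e paging requires PAE first */
    s->cr0 = value;
    if (pg && (s->efer & EFER_LME))
        s->efer |= EFER_LMA;        /* CPU activates IA-32e mode */
    else
        s->efer &= ~EFER_LMA;
    return true;
}

/* Real mode -> protected mode -> PAE -> IA-32e, in the order described. */
static bool enter_ia32e(struct cpu_state *s)
{
    if (!write_cr0(s, s->cr0 | CR0_PE))   /* 1. protected mode */
        return false;
    s->cr4 |= CR4_PAE;                    /* 2. enable PAE (CR3 set elsewhere) */
    s->efer |= EFER_LME;                  /* 3. request IA-32e via IA32_EFER */
    return write_cr0(s, s->cr0 | CR0_PG); /* 4. paging on, LMA becomes set */
}
```

Stopping after step 2 and enabling paging without touching IA32_EFER corresponds to plain 32-bit PAE paging, which is the mode used in the test described in this thread.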

I think my test was successful because the processor actually allowed entry into PAE mode. I'm fairly confident that it did transition to protected mode and then enabled PAE. I was able to modify and verify physical addresses which would not otherwise be accessible.

Regardless of whether it's PAE mode (which I want to use) or long mode (which you suggest), I still need to get there. Intel's documentation specifically mentions both PAE and IA-32e as not supported when a processor is in SMM. So, I guess I need to know: if neither is supported, then how do I access high physical addresses to scrub?

I am not entirely familiar with SMM, but if SMM can only execute real mode code, then my understanding is that SMM doesn't actually enable protected mode or PAE mode (you need to do this in software, if I am correct), so it does not support PAE or IA-32e under SMM. So if you can get into Protected Mode, then into PAE, and then into Long Mode (I was under the impression PAE wasn't required for long mode), then you have effectively exited SMM and gone into Long Mode. You should then be able to test the memory that way.

I would suggest using EFI instead of BIOS wherever possible. BIOS is a little out of date. Do any of the systems that the code runs on support EFI?

Adam, SMM is a separate context of the processor. You can think of it as something like an NMI that can interrupt other NMIs, if that helps. Please read Chapter 26 of the IA-32 SDM Vol 3A to learn more. In order to exit SMM, an RSM instruction or a reset is required. As I've said previously, I am quoting the SDM which specifically says, "The physical address extension (PAE) mechanism introduced in the P6 family processors is not supported when a processor is in SMM. The IA-32e mode address-translation mechanism is not supported in SMM."

I was using BIOS as a generic term. Our x86 firmware is neither a legacy BIOS nor EFI BIOS. That's not relevant to this discussion. An EFI BIOS would have the same challenge.

OK. Sorry about that. I see exactly what you saw. I have now noticed that SMRAM is an entirely separate address space from normal RAM. It is inaccessible to the OS in the main RAM space. I would assume, though, that ECC checks are done autonomously by the hardware itself. I may be wrong, though. But my understanding at the moment, based on the Intel documentation, is that SMRAM is treated as a separate address space, so what you are probably testing is in fact SMRAM.

I have noticed that Intel doesn't seem to provide the most extensive documentation for *some* processor features.

Congratulations, Adam, on becoming a brown belt. If you read more on SMRAM, you will discover it is usually implemented as a memory region which can be made inaccessible to non-SMI context code. This protection of the SMRAM space does not preclude SMM code from being able to access normal system RAM. Implementation of SMRAM addressing is between the chipset and part of the firmware I help maintain, so I am very familiar with this.

I am quite certain that writes made during my testing to physical addresses higher than 4 GB were written successfully. I am able to verify that those writes remain after exiting SMM code and switching on PAE in non-SMM context. In our system, we also have PCI devices whose memory spaces we can map above 4 GB, and I am able to verify access to those devices as well (such as by knowing what values to expect in configuration registers).

ECC checking is done "autonomously" by the hardware, but many hardware implementations require scrubbing of the detected ECC correctables be done by software (firmware). Scrubbing means writing the repaired data back out to the address which caused the error. It must be done in an atomic fashion in order to prevent race conditions (such as when the address is a DMA destination).
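
As a rough illustration of an atomic scrub, the sketch below performs a locked read-modify-write that adds zero to each 32-bit word, so the value read is written back unchanged in a single atomic operation. It assumes GCC or Clang; scrub_word and scrub_range are hypothetical names, not this firmware's routines. Whether the write actually reaches DRAM depends on how the memory is cached, so real firmware would likely also need an uncached mapping or a cache-line flush, which is not shown here.

```c
#include <stddef.h>
#include <stdint.h>

/* Atomically read one word and write the same value back.
 * On x86 a LOCK-prefixed ADD of zero is a read-modify-write that leaves
 * the data unchanged, so a concurrent writer (for example a DMA engine)
 * cannot slip in between the read and the write-back. */
static inline void scrub_word(volatile uint32_t *p)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__("lock; addl $0, %0" : "+m"(*p) : : "cc", "memory");
#else
    /* Fallback for other targets; a compiler may lower this differently. */
    (void)__atomic_fetch_add(p, 0u, __ATOMIC_SEQ_CST);
#endif
}

/* Scrub 'bytes' bytes starting at 'base'; returns the number of words
 * touched. 'base' is assumed to be 4-byte aligned and already mapped. */
static size_t scrub_range(volatile uint32_t *base, size_t bytes)
{
    size_t words = bytes / sizeof(uint32_t);
    for (size_t i = 0; i < words; i++)
        scrub_word(&base[i]);
    return words;
}
```

In the scenario described here, base would be the linear address at which the failing physical address has been mapped, and the range would typically cover the ECC word or cache line reported by the chipset.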

I don't mind you asking questions to learn more, but you seem to be going about it in a way where you make incorrect assumptions and statements which lead to a false answer. This is harmful to the developer community as a whole as the text of what you write is searchable on the Internet. If someone doesn't follow the whole thread, they might leave happy thinking your attempted answer is verbatim or go away disappointed thinking your attempted answer is the best this forum has to offer.

I've obviously chosen the wrong venue to get a technical answer out of Intel and will instead take this question up with our Intel Representative.

Thank you. Now, about the support issue: perhaps it is supported only on some processors (that I am not aware of). I will post another thread about this documentation issue soon. That being said, I presume that you were able to do all of that within SMM alone, without having to exit SMM. If this is correct, then I would say it is perhaps a documentation clarity issue. I have noticed similar issues with the IPF documentation too, and in some respects it is rather ambiguous.

Sorry I wasn't much help.
