Building the GNU* Multiple Precision* library for Intel® Software Guard Extensions

GNU* Multiple Precision Arithmetic Library* for Intel SGX

File(s): Download
License: GNU Lesser General Public License v3 and GNU General Public License v2

Demo programs

File(s): Download
License: BSD 3-Clause "New" or "Revised" License
Optimized for...
Operating System: Ubuntu* Linux* 16.04, 18.04; CentOS* Linux 7.4
Hardware: 6th gen Intel® Core™ or later, Intel® Xeon® E3 v6
Software (programming language, tool, IDE, framework): Linux*: gcc, Intel® Software Guard Extensions SDK for Linux* (Intel® SGX SDK for Linux*)
Prerequisites: C/C++ programming

Introduction

One of the restrictions placed on an Intel® Software Guard Extensions enclave is that it cannot have dependencies on dynamically linked libraries. An enclave's security is based on a measurement of all of the code and data that make up the enclave when it is first loaded into memory, which is compared to the measurement that was made when the enclave was first compiled. Dynamically linked components would violate this constraint. This means that all libraries that are used by an enclave must be statically linked.

Another restriction on enclaves is that not all CPU instructions are legal inside an enclave. As of hardware version 1.x, instructions that generate a VMEXIT, change privilege levels, or perform I/O will generate a #UD fault if executed in an enclave (a complete list of illegal instructions can be found in the Intel Software Guard Extensions Developer Guide).

Developers often need to use third-party libraries when building an application, and these restrictions can pose a challenge when adapting libraries for use in an enclave. Developers must turn them into what Intel SGX defines as a trusted library: a statically linked library that does not contain enclave-illegal instructions and does not depend on dynamically linked libraries. A final challenge is that the developer must integrate the Intel SGX SDK tools and build procedures into the library's build system.

This article guides developers in adapting libraries for use in an Intel SGX enclave by stepping through a real-world example. It describes how the GNU* Multiple Precision Arithmetic Library*, commonly referred to as libgmp or just GMP, can be modified to build as a trusted library.

About GMP

The GMP library is a well-known arithmetic library that provides arbitrary-precision arithmetic for integer, rational, and floating point numbers. Unlike native integer and float types, the numeric types provided by GMP have no fixed size limit: integers are bounded only by available memory, and the precision of floating point values is chosen by the program rather than fixed by the hardware. The primary applications for GMP are scientific research, computational algebra, cryptography, and internet security.

GMP was chosen as the target library because:

  • Its security applications make it a useful and relevant library for use in an enclave.
  • It’s a non-trivial example due to its complexity and size.
  • The core functionality of the library does not involve instructions or operations that are illegal in an enclave (such as file I/O).
  • It uses the GNU build system, which is popular with open source Linux applications.
  • The library is well-designed with compartmentalized source modules that do not mix multiple functions together. Specifically, it separates its I/O routines from its core functionality, making it easier to isolate operations that are illegal in an enclave.
  • The code base incorporates hand-written assembly routines for increased performance, which adds complexity to both the build system and the porting effort.

In summary, GMP was chosen because it is a practical and complex example with surmountable challenges.

Sample Code

Two pieces of sample code are provided as downloads:

  • sgx-gmp, a GMP distribution that has been modified to optionally build as an enclave trusted library.
  • sgx-gmp-demo, a set of demonstration programs that utilize the enclave build of GMP.

The code has been built and tested under:

  • Ubuntu* Linux* 16.04 LTS Server 64-bit
  • CentOS* 7.4 64-bit

Useful Utilities

These code samples make use of the Intel SGX templates for the GNU build system. This package is a set of M4 macros and Automake includes that let you easily integrate Intel SGX into software projects that use GNU Automake and Autoconf.

Goal

The goal is to produce a preliminary port of the GMP library that:

  • implements as much functionality as possible
  • minimizes edits to the source code
  • implements the high-performance assembly loops for maximum performance

This initial port is intended to be a starting point for a more comprehensive porting effort.

Assessing the Project

The first step in the process is to assess the work that’s required. This requires that you familiarize yourself with the library’s basic operation, programming interface, and build system.

Basic Operation and Programming Interface

You don’t need expert-level or detailed knowledge of the library’s full API, but you should be able to answer the following questions:

  • Does it perform any device I/O, whether it be networking, printing to stdout or stderr, reading from the terminal, or file manipulation?
  • Does it invoke any system calls?
  • Does it depend on instructions that are illegal in an enclave, such as CPUID or RDTSC?
  • How does it manage memory, and are there any hooks for its memory management functions?
  • What custom data types does it define, and how are they passed or invoked?

GMP has the following properties that will affect the Intel SGX port:

  • There are I/O modules for printing GMP values, as well as scanning and parsing values from user input. These modules are in isolated code modules, and can be excluded from the build.
  • Fatal errors such as memory allocation failures print messages to stderr. These messages are not critical, and can be excluded from the initial port.
  • Fatal errors end with a call to abort(). These calls can be left as-is.
  • CPUID is used if you build a “fat” library (by supplying --enable-fat to configure). According to the GMP documentation, the fat build is intended to provide runtime detection of CPU features so a single binary can run efficiently on multiple architectures. This is an optional feature that is disabled by default, so the Intel SGX version of the library should be built for the architecture where it will be used. (GMP provides support for a fat library build that does not depend on CPUID, and that may be an option for a future revision).
  • The library defines custom data types that are pointers to opaque structures that reference dynamically allocated memory. There are also hooks for implementing memory allocation routines. Though it won’t impact the library porting effort, it will impact developers when they pass GMP variables to and from an enclave.
  • The library provides a C++ interface. For this initial port, C++ will be excluded as it is an optional feature.
  • The library also includes assembly code. Programs that mix C/C++ with assembly may result in text relocations in the binary, and relocations are not allowed in enclaves for security reasons.

The Build System

GMP uses the GNU build system, specifically Automake and Autoconf. To build the library, developers first run the configure script to set compile-time options and features. The Intel SGX port should add an option to configure to enable the SGX trusted library build.

One complication with GMP is that the build system uses GNU* libtool. Libtool provides a system-independent interface to building shared libraries, but does so by choosing compiler and linker flags automatically based on the target architecture. This actually poses a problem for an Intel SGX build because enclave builds need specific compiler flags that conflict with libtool’s assumptions. That means libtool will need to be removed from the build chain when compiling the library for SGX. Any compiler flags that libtool sets will need to be reviewed, too, as some of them may be necessary for the SGX build.

Another complication is that the automake build definitions reference object modules instead of source files, as shown below:

RANDOM_OBJECTS =                                                        \
  rand/rand$U.lo rand/randclr$U.lo rand/randdef$U.lo rand/randiset$U.lo \
  rand/randlc2s$U.lo rand/randlc2x$U.lo rand/randmt$U.lo                \
  rand/randmts$U.lo rand/rands$U.lo rand/randsd$U.lo rand/randsdui$U.lo \
  rand/randbui$U.lo rand/randmui$U.lo

This is unconventional, and the filenames assume libtool is in use (see the “.lo” file extensions).

Identify Target Files

Once the library has been assessed, you can identify files that will most likely need modification.

For the SGX port of GMP, the following source code files need to be modified as they call fprintf() to print fatal error messages:

assert.c
invalid.c
memory.c
mpz/init2.c
realloc.c
realloc2.c

The following build configuration files need to be modified to add the Intel SGX build options:

acinclude.m4
configure.ac
Makefile.am
mpf/Makefile.am
mpn/Makeasm.am
mpn/Makefile.am
mpq/Makefile.am
mpz/Makefile.am
rand/Makefile.am

The header file gmp-h.in also needs to be modified to prevent the build system from incorrectly assuming that the SGX libraries define the FILE type.

Build System Integration

With the list of target files identified, it’s time to integrate Intel SGX support into GMP’s build system. As a general rule, you should always start with the build system before making changes to source code because it is difficult to predict what issues will be uncovered in the source code without first compiling it. The build system will also give you the opportunity to define symbols for the preprocessor that can be used when source modifications become necessary.

For GMP, the build system integration begins with incorporating the Intel SGX templates for the GNU build system. A new directory, called m4, is created and the m4 macro definition files are copied there. The automake definitions for an SGX trusted library are in sgx_tlib.am, and this is copied to the root of the source directory.

configure.ac and acinclude.m4

The following lines must be added to configure.ac to include the macros from the Intel SGX GNU build system templates, and provide the SGX build option:

AC_CONFIG_MACRO_DIRS([m4])

SGX_INIT_OPTIONAL
SGX_IF_ENABLED([
    AC_DEFINE(HAVE_SGX, 1, [Build with SGX support])
    GMP_WITH_SGX="#define GMP_WITH_SGX 1"
    O=o
],[
    GMP_WITH_SGX="/* #undef GMP_WITH_SGX */"
    O=lo
])

AC_SUBST(GMP_WITH_SGX)

The SGX_INIT_OPTIONAL macro adds the option --enable-sgx to the configure script and allows you to use SGX_IF_ENABLED to make decisions based on whether or not an SGX build was requested. The last line, AC_SUBST(GMP_WITH_SGX), defines an output variable that configure substitutes wherever @GMP_WITH_SGX@ appears in a generated file. For an SGX build the substituted text defines the C preprocessor symbol GMP_WITH_SGX; otherwise it is a comment. This will be needed for gmp-h.in, which is discussed later on.

Also note that there’s a shell variable $O being defined: for a traditional GMP build it’s set to “lo”, and for an Intel SGX build it’s set to “o”. This variable specifies the object file extension at build time, and is a partial solution to the libtool issue mentioned previously. It’s needed here because configure.ac also specifies some object files that will be added to the Makefiles. For example, this line:

CALLING_CONVENTIONS_OBJS="x86call.lo x86check$U.lo"

Becomes:

CALLING_CONVENTIONS_OBJS="x86call.$O x86check$U.$O"

This is a good reminder that libtool must be excluded from the SGX build:

SGX_IF_ENABLED([],[
AC_PROG_LIBTOOL
])

A typical function of the configure script is to dynamically detect what headers, C pre-processor symbols and library symbols are available on the target system. This is problematic for Intel SGX because these checks use the standard C library and headers. When compiling for an enclave, the standard C library is replaced with a trusted C library which does not include illegal functions.

These header and library tests need to be modified to search the trusted C library instead. This is done with the SGX macros for Autoconf:

SGX_IF_ENABLED([
  SGX_TSTDC_CHECK_DECLS([fgetc, fscanf, optarg, ungetc, vfprintf])
  SGX_TSTDC_CHECK_DECLS([sys_errlist, sys_nerr], , ,
  [#include <stdio.h>
#include <errno.h>])
],[
  AC_CHECK_DECLS([fgetc, fscanf, optarg, ungetc, vfprintf])
  AC_CHECK_DECLS([sys_errlist, sys_nerr], , ,
  [#include <stdio.h>
#include <errno.h>])

  AC_TYPE_SIGNAL
])

SGX_IF_ENABLED([
  SGX_TSTDC_CHECK_TYPES([intmax_t, long double, long long, ptrdiff_t, quad_t,
                uint_least32_t, intptr_t])
],[
  AC_CHECK_TYPES([intmax_t, long double, long long, ptrdiff_t, quad_t,
                uint_least32_t, intptr_t])
])

Removing libtool from the SGX build creates another issue that must be addressed. GMP relies on compiler flags that optimize the code for a target architecture, and libtool is responsible for passing these to the compiler. The configure script passes these flags to libtool, but the SGX builds can’t use libtool so these flags must be written to the Makefiles directly.

The solution is to find where in configure.ac those flags are set, and define a Makefile substitution variable to pass them along. For GMP, the flags we need are set in the shell variable $gcc_cflags_cpu:

SGX_IF_ENABLED([
        AC_SUBST(CFLAGS_CPU, $gcc_cflags_cpu)
])

These are only some of the changes to the configure.ac file. Review the source code to see all of them, as well as the changes in acinclude.m4.

Top-Level Makefile.am

The automake definitions in the top-level Makefile.am come next. It needs to include the Intel SGX automake definitions for building a trusted library:

include sgx_tlib.am

The automake definitions for a trusted library set a number of automake flags that are needed for an enclave build, including AM_CFLAGS and AM_CPPFLAGS.

The SGX_INIT_OPTIONAL macro in configure.ac also defines an automake conditional, SGX_ENABLED. In Makefile.am, this conditional is used to modify build settings when an SGX build is requested. The top-level Makefile.am specifies a number of build targets, such as the library itself, the C++ extensions, and test programs.

The SGX build only needs the library, and should exclude unnecessary source modules. Here, the conditional is used to exclude the subdirectories containing the printf and scanf subsystems:

if !SGX_ENABLED
	SUBDIRS += printf scanf
endif

This next change names the target library libsgx_tgmp.a when building for SGX, and sets a Makefile variable $O to either “o” or “lo”, similar to what was done in configure.ac.

if SGX_ENABLED
  lib_LIBRARIES = libsgx_tgmp.a $(GMPXX_LIBRARIES_OPTION)
  O=o
else 
  lib_LTLIBRARIES = libgmp.la $(GMPXX_LTLIBRARIES_OPTION)
  O=lo
endif

The next change sets specific compiler flags for the SGX build. Note that this is where $(CFLAGS_CPU) is brought in from configure.ac. This ensures the SGX build of GMP includes the same architectural compiler flags that would normally be passed to libtool.

if SGX_ENABLED

AM_CFLAGS += $(CFLAGS_CPU)

libsgx_tgmp_a_CPPFLAGS=$(AM_CPPFLAGS) -D__GMP_WITHIN_GMP
libsgx_tgmp_a_CFLAGS= $(AM_CFLAGS) -fno-builtin-memset

else !SGX_ENABLED

AM_CPPFLAGS+=-D__GMP_WITHIN_GMP

endif

Note that this solution makes use of per-target compiler flags. This may not be strictly necessary now, but it is designed to be forward-thinking. In the future, it might be desirable to add a companion untrusted library to the GMP build to provide memory allocators for GMP variables passed into the enclave (see Future Work). Per-target flags would let the build process compile both the trusted and untrusted libraries at the same time.

To use per-target flags, the following line has to be added to configure.ac:

AM_PROG_CC_C_O

Next, the object modules that need to be excluded from the build are also handled via the automake conditional. Note that the file extensions are also replaced with the $O variable.

if SGX_ENABLED
MPQ_OBJECTS = mpq/libmpq_a-abs$U.$O mpq/libmpq_a-aors$U.$O				\
  mpq/libmpq_a-canonicalize$U.$O mpq/libmpq_a-clear$U.$O mpq/libmpq_a-clears$U.$O			\
  mpq/libmpq_a-cmp$U.$O mpq/libmpq_a-cmp_si$U.$O mpq/libmpq_a-cmp_ui$U.$O mpq/libmpq_a-div$U.$O		\
  mpq/libmpq_a-get_d$U.$O mpq/libmpq_a-get_den$U.$O mpq/libmpq_a-get_num$U.$O mpq/libmpq_a-get_str$U.$O	\
  mpq/libmpq_a-init$U.$O mpq/libmpq_a-inits$U.$O mpq/libmpq_a-inv$U.$O		\
  mpq/libmpq_a-md_2exp$U.$O mpq/libmpq_a-mul$U.$O mpq/libmpq_a-neg$U.$O 	\
  mpq/libmpq_a-set$U.$O mpq/libmpq_a-set_den$U.$O mpq/libmpq_a-set_num$U.$O			\
  mpq/libmpq_a-set_si$U.$O mpq/libmpq_a-set_str$U.$O mpq/libmpq_a-set_ui$U.$O			\
  mpq/libmpq_a-equal$U.$O mpq/libmpq_a-set_z$U.$O mpq/libmpq_a-set_d$U.$O				\
  mpq/libmpq_a-set_f$U.$O mpq/libmpq_a-swap$U.$O

else !SGX_ENABLED

MPQ_OBJECTS = mpq/abs$U.$O mpq/aors$U.$O				\
  mpq/canonicalize$U.$O mpq/clear$U.$O mpq/clears$U.$O			\
  mpq/cmp$U.$O mpq/cmp_si$U.$O mpq/cmp_ui$U.$O mpq/div$U.$O		\
  mpq/get_d$U.$O mpq/get_den$U.$O mpq/get_num$U.$O mpq/get_str$U.$O	\
  mpq/init$U.$O mpq/inits$U.$O mpq/inv$U.$O		\
  mpq/md_2exp$U.$O mpq/mul$U.$O mpq/neg$U.$O 	\
  mpq/set$U.$O mpq/set_den$U.$O mpq/set_num$U.$O			\
  mpq/set_si$U.$O mpq/set_str$U.$O mpq/set_ui$U.$O			\
  mpq/equal$U.$O mpq/set_z$U.$O mpq/set_d$U.$O				\
  mpq/set_f$U.$O mpq/swap$U.$O mpq/inp_str$U.$O mpq/out_str$U.$O

endif

Also note that the object files have been renamed from file.$O to target-file$U.$O in the SGX build. This is a side effect of using per-target flags in Automake, combined with GMP’s use of object files rather than source files in the build definitions: Automake renames the object file to include the target name as a prefix. This also prevents Automake from complaining that object files are “created both with libtool and without”.

The final build line is shown below:

if SGX_ENABLED

libsgx_tgmp_a_SOURCES = gmp-impl.h longlong.h				\
  assert.c compat.c errno.c extract-dbl.c invalid.c memory.c		\
  mp_bpl.c mp_clz_tab.c mp_dv_tab.c mp_minv_tab.c mp_get_fns.c mp_set_fns.c \
  version.c nextprime.c primesieve.c
EXTRA_libsgx_tgmp_a_SOURCES = tal-notreent.c tal-reent.c
libsgx_tgmp_a_DEPENDENCIES = @TAL_OBJECT@		\
  $(MPF_OBJECTS) $(MPZ_OBJECTS) $(MPQ_OBJECTS)	\
  $(MPN_OBJECTS) @mpn_objs_in_libgmp@		\
  $(RANDOM_OBJECTS)
libsgx_tgmp_a_LIBADD = $(libsgx_tgmp_a_DEPENDENCIES)

else 

libgmp_la_SOURCES = gmp-impl.h longlong.h				\
  assert.c compat.c errno.c extract-dbl.c invalid.c memory.c		\
  mp_bpl.c mp_clz_tab.c mp_dv_tab.c mp_minv_tab.c mp_get_fns.c mp_set_fns.c \
  version.c nextprime.c primesieve.c
EXTRA_libgmp_la_SOURCES = tal-debug.c tal-notreent.c tal-reent.c
libgmp_la_DEPENDENCIES = @TAL_OBJECT@		\
  $(MPF_OBJECTS) $(MPZ_OBJECTS) $(MPQ_OBJECTS)	\
  $(MPN_OBJECTS) @mpn_objs_in_libgmp@		\
  $(PRINTF_OBJECTS)  $(SCANF_OBJECTS) $(RANDOM_OBJECTS)
libgmp_la_LIBADD = $(libgmp_la_DEPENDENCIES)
libgmp_la_LDFLAGS = $(GMP_LDFLAGS) $(LIBGMP_LDFLAGS) \
  -version-info $(LIBGMP_LT_CURRENT):$(LIBGMP_LT_REVISION):$(LIBGMP_LT_AGE)

endif

Note that there is no LDFLAGS variable for the SGX library. LDFLAGS contains flags that are passed to the linker, and the linker is not invoked when building a static library.

Per-Subdirectory Makefile.am

Each Makefile.am in the GMP source tree must also be modified to support the Intel SGX build. As with the top-level Makefile.am, each one must include sgx_tlib.am, though note that they must explicitly reference the path:

include $(top_srcdir)/sgx_tlib.am

GMP compiles each of these subdirectories into intermediate library files that are combined into the full GMP library. As they are configured to be built using libtool, the build target assumes the “.la” suffix instead of “.a”.

Automake does not allow the use of variables in build target names, so the Automake conditional is used to specify the target. The common source files for both the SGX and non-SGX libraries can be defined in a variable.

As in the root directory, it’s necessary to prevent Automake from complaining that objects are “created both with libtool and without” by forcing the SGX build to use per-target compiler flags. The following excerpt is from rand/Makefile.am:

RAND_SOURCES = randmt.h                                         \
  rand.c randclr.c randdef.c randiset.c randlc2s.c randlc2x.c randmt.c  \
  randmts.c rands.c randsd.c randsdui.c randbui.c randmui.c

if SGX_ENABLED

AM_CFLAGS += $(CFLAGS_CPU)
AM_CPPFLAGS += -D__GMP_WITHIN_GMP -I$(top_srcdir)

noinst_LIBRARIES = librandom.a

librandom_a_SOURCES = $(RAND_SOURCES)
librandom_a_CFLAGS= $(AM_CFLAGS)

else !SGX_ENABLED

AM_CPPFLAGS = -D__GMP_WITHIN_GMP -I$(top_srcdir)

noinst_LTLIBRARIES = librandom.la
librandom_la_SOURCES = $(RAND_SOURCES)

endif

mpn/Makeasm.am

The mpn routines are low-level functions designed for high performance and serve as the base for the higher-level routines in the mpf, mpq, and mpz subsystems. Much of the mpn subsystem is written in assembly, and Makeasm.am defines the build procedures.

Because the SGX build is not using libtool, a few edits are needed to this file to ensure the proper flags are passed to the assembler. This is complicated by the fact that m4 is used to preprocess the assembly routines and produce the final assembly source code.

First, we want to ensure that the assembler always produces position independent code. This is a security requirement for enclaves.

if SGX_ENABLED
ASMFLAGS += -fPIC -DPIC
endif

Second, we need to ensure the PIC symbol is also defined when m4 is run, as the source files include conditionals that check for it. This is done by creating a new Makefile variable, $M4FLAGS:

if SGX_ENABLED
M4FLAGS=-DPIC
else
M4FLAGS=
endif

With that defined, the rule for generating object files has to be updated to include the new variable:

.asm.o:
    $(M4) $(M4FLAGS) -DOPERATION_$* `test -f '$<' || echo '$(srcdir)/'`$< >tmp-$*.s
    $(CCAS) $(COMPILE_FLAGS) tmp-$*.s -o $@
    $(RM_TMP) tmp-$*.s

Regenerate the Build System

The files that were edited are the source files for the build system. To apply these changes, it’s necessary to run aclocal, automake, and autoconf.

aclocal comes first, and it will build aclocal.m4:

$ aclocal

This only prints output if there’s an error.

automake and autoconf can be run in a single step by executing autoreconf instead:

$ autoreconf

(Note that some versions of Automake may print spurious error messages about the libsgx_tgmp.a build target. You can ignore these: the definitions are correct).

Source Code Modifications

GMP needs some minor source code changes to account for the Intel SGX build, and all of them are simple C preprocessor directives.

gmp-h.in

The build system creates the header file gmp.h from gmp-h.in. This header defines the symbol _GMP_H_HAVE_FILE if the underlying OS supports the FILE type in C, but the logic is not correct for an Intel SGX build. Fixing this is a two-step process.

First, there needs to be a symbol defined that can be used by the preprocessor as a conditional:

#if ! defined (__GMP_WITHIN_CONFIGURE)
#define __GMP_HAVE_HOST_CPU_FAMILY_power   @HAVE_HOST_CPU_FAMILY_power@
#define __GMP_HAVE_HOST_CPU_FAMILY_powerpc @HAVE_HOST_CPU_FAMILY_powerpc@
#define GMP_LIMB_BITS                      @GMP_LIMB_BITS@
#define GMP_NAIL_BITS                      @GMP_NAIL_BITS@
@GMP_WITH_SGX@
#endif

You might recognize @GMP_WITH_SGX@ from the modifications to configure.ac. If the build is for Intel SGX, it’s replaced with:

#define GMP_WITH_SGX 1 

If it’s not an Intel SGX build, it becomes:

/* #undef GMP_WITH_SGX */

Note that it is placed within an #if block that checks that __GMP_WITHIN_CONFIGURE is not defined. This is necessary because the GMP configure script compiles test programs that include gmp.h before substitutions have been done. It defines this symbol to prevent compile-time errors.

Next, the logic for setting _GMP_H_HAVE_FILE must be modified. The simplest approach is to wrap the entire logic block with an #ifndef block.

#ifndef GMP_WITH_SGX
#if defined (FILE)                                              \
  || defined (H_STDIO)                                          \
  || defined (_H_STDIO)               /* AIX */                 \
  || defined (_STDIO_H)               /* glibc, Sun, SCO */     \
  || defined (_STDIO_H_)              /* BSD, OSF */            \
  || defined (__STDIO_H)              /* Borland */             \
  || defined (__STDIO_H__)            /* IRIX */                \
  || defined (_STDIO_INCLUDED)        /* HPUX */                \
  || defined (__dj_include_stdio_h_)  /* DJGPP */               \
  || defined (_FILE_DEFINED)          /* Microsoft */           \
  || defined (__STDIO__)              /* Apple MPW MrC */       \
  || defined (_MSL_STDIO_H)           /* Metrowerks */          \
  || defined (_STDIO_H_INCLUDED)      /* QNX4 */                \
  || defined (_ISO_STDIO_ISO_H)       /* Sun C++ */             \
  || defined (__STDIO_LOADED)         /* VMS */                 \
  || defined (__DEFINED_FILE)         /* musl */
#define _GMP_H_HAVE_FILE 1
#endif
#endif
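The effect of the outer guard can be seen in isolation with a small sketch. It is not the actual header: DEMO_HAVE_FILE is a hypothetical stand-in for _GMP_H_HAVE_FILE, the list of stdio detection symbols is abbreviated, and GMP_WITH_SGX is defined by hand to play the role of the text that configure substitutes.

```c
#include <assert.h>
#include <stdio.h>   /* defines FILE and, on glibc, _STDIO_H */

/* Stand-in for the line that configure substitutes for @GMP_WITH_SGX@. */
#define GMP_WITH_SGX 1

/* Abbreviated copy of the detection logic, using a hypothetical macro name. */
#ifndef GMP_WITH_SGX
#if defined (FILE) || defined (_STDIO_H) || defined (_STDIO_H_)
#define DEMO_HAVE_FILE 1
#endif
#endif

/* Returns 1 if FILE-based prototypes would be declared, 0 otherwise. */
int demo_file_api_visible (void)
{
#ifdef DEMO_HAVE_FILE
  return 1;
#else
  return 0;
#endif
}
```

Even though stdio.h has been included, demo_file_api_visible() returns 0 here because the outer #ifndef skips the detection block entirely. Removing the GMP_WITH_SGX definition restores the normal behavior on systems whose stdio.h defines one of the listed symbols.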

There is a problem here, however, that requires a change to the build system: the gmp.h file is the header file for both the GMP build and for end developers. The SGX-specific definition should not apply to the standard build of GMP. There needs to be a separate header file for use in enclaves.

This is accomplished by adding the following line to configure.ac:

SGX_IF_ENABLED([ sgxgmph="sgx_tgmp.h:gmp-h.in" ])

And, in the same file, adding this new shell variable to AC_OUTPUT:

AC_OUTPUT(Makefile                                                      \
  mpf/Makefile mpn/Makefile mpq/Makefile                                \
  mpz/Makefile printf/Makefile scanf/Makefile rand/Makefile cxx/Makefile \
  tests/Makefile tests/devel/Makefile                                   \
  tests/mpf/Makefile tests/mpn/Makefile tests/mpq/Makefile              \
  tests/mpz/Makefile tests/rand/Makefile tests/misc/Makefile            \
  tests/cxx/Makefile                                                    \
  doc/Makefile tune/Makefile                                            \
  demos/Makefile demos/calc/Makefile demos/expr/Makefile                \
  gmp.h:gmp-h.in $sgxgmph)

This instructs the build system to create sgx_tgmp.h from gmp-h.in (this is in addition to building gmp.h, which is needed when compiling the library itself).

To install the correct header file when “make install” is executed, the following changes are made to Makefile.am:

if SGX_ENABLED
nodist_includeexec_HEADERS = sgx_tgmp.h
else !SGX_ENABLED
nodist_includeexec_HEADERS = gmp.h
endif

And:

BUILT_SOURCES = gmp.h
if SGX_ENABLED
BUILT_SOURCES += sgx_tgmp.h
endif

This allows both the regular and SGX-specific header files to co-exist in the package directory. Enclaves that wish to use the SGX-enabled build of GMP simply include the sgx_tgmp.h header file instead of gmp.h.

C sources

The following files print a message to stderr if a fatal error occurs:

  • assert.c
  • invalid.c
  • memory.c
  • mpz/init2.c
  • realloc.c
  • realloc2.c

The easy solution for now is to wrap these with #ifndef blocks. A sample change from assert.c is shown below.

void
__gmp_assert_fail (const char *filename, int linenum,
                   const char *expr)
{
  __gmp_assert_header (filename, linenum);
#ifndef HAVE_SGX
  fprintf (stderr, "GNU MP assertion failed: %s\n", expr);
#endif
  abort();
}
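Because the same guard is repeated in six files, one possible alternative is to centralize it in a single macro. The sketch below is only an illustration of that idea and is not how the sample code is written: DEMO_FATAL_MSG and demo_report_failure are hypothetical names, and HAVE_SGX is defined by hand here rather than by configure.

```c
#include <assert.h>
#include <stdio.h>

/* Defined by hand to mimic an SGX build; remove it for a normal build. */
#define HAVE_SGX 1

/* Hypothetical helper, not part of GMP: prints in a normal build and
   expands to the constant 0 (nothing printed) in an SGX build. */
#ifdef HAVE_SGX
#define DEMO_FATAL_MSG(...) 0
#else
#define DEMO_FATAL_MSG(...) fprintf (stderr, __VA_ARGS__)
#endif

/* Returns the number of characters written to stderr. */
int demo_report_failure (const char *expr)
{
  assert (expr != NULL);
  (void) expr;   /* unused when the macro expands to 0 */
  return DEMO_FATAL_MSG ("GNU MP assertion failed: %s\n", expr);
}
```

With HAVE_SGX defined, demo_report_failure("x > 0") writes nothing and returns 0. The trade-off is one more macro to maintain, against fewer conditional blocks scattered through the sources.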

Building the Trusted Library

The modified package can be used to build both the original and SGX versions of the GMP library. If you are downloading the sample source code, the changes to the source code and build system have already been made for you. All that’s needed is to configure the package and build it.

To configure the package for an Intel SGX build, provide the --enable-sgx option to configure. The --enable-assembly option is strongly recommended to enable the high-performance code. Other useful flags for the SGX build are --disable-shared and --with-pic.

For convenience, the code sample includes a shell script wrapper around configure called sgx-configure which sets these flags as well as an installation directory in /opt:

#! /bin/sh
#
# Wrapper around configure to build libgmp as an Intel SGX trusted enclave
# library.

./configure --prefix=/opt/gmp/6.1.2 \
        --enable-assembly \
        --disable-shared \
        --enable-static \
        --with-pic \
        --enable-sgx

You can then run make as usual:

$ make
$ sudo make install

Note: The Intel SGX build does not support additional make targets, such as ‘make check’.

Both the standard and Intel SGX versions of the library can be installed to the same location since both the static library and header file have unique names.

Using the Trusted Library in an Enclave

An earlier version of this article suggested that GMP variables could be shared between the untrusted application and the enclave. Intel no longer recommends this approach as it has been demonstrated to be vulnerable to malicious software. The dynamic memory management of the GMP library complicates efforts to properly check whether a GMP variable in untrusted space only references and allocates memory in untrusted space. To thoroughly validate an untrusted GMP variable prior to accessing it inside an enclave, the developer would have to analyze the variable's internal pointers. The issue with this approach is that the GMP data types are intended to be opaque; code that requires knowledge of their internals would introduce compatibility issues with future releases.

Intel now recommends that developers serialize GMP variables into an intermediate buffer when passing them across enclave boundaries. An enclave must not reference or operate on GMP variables that originate from untrusted space.

GMP Memory Management

GMP variables are pointers to opaque data structures which include pointers to memory that is dynamically allocated to store the variable’s value. If the variable needs to be resized, GMP will reallocate that pointer. For safety reasons, developers should install handlers for realloc() and free() in the enclave that examine the pointer being resized or freed to ensure it’s referencing trusted memory. If the memory being freed or resized is not contained entirely within the enclave, the operation should abort.

The following is an excerpt from the enclave code in the SGX GMP Demo application, showing the memory management hooks in EnclaveGmpTest.c. Note that it’s necessary to add an ECALL to perform these initialization steps before any GMP variables are accessed in the enclave.

void *(*gmp_realloc_func)(void *, size_t, size_t);
void (*gmp_free_func)(void *, size_t);

void *reallocate_function(void *, size_t, size_t);
void free_function(void *, size_t);

void tgmp_init()
{
	mp_get_memory_functions(NULL, &gmp_realloc_func, &gmp_free_func);
	mp_set_memory_functions(NULL, &reallocate_function, &free_function);
}

void free_function (void *ptr, size_t sz)
{
	if ( sgx_is_within_enclave(ptr, sz) ) gmp_free_func(ptr, sz);
	else abort();
}

void *reallocate_function (void *ptr, size_t osize, size_t nsize)
{
	if ( ! sgx_is_within_enclave(ptr, osize) ) abort();
	return gmp_realloc_func(ptr, osize, nsize);
}

Both reallocate_function() and free_function() call the SGX function sgx_is_within_enclave(). This function takes a pointer and a size as arguments, and is used to determine whether or not the pointer that GMP needs to reallocate or free references memory that is completely contained within the enclave.

The corresponding enclave definition language (EDL) looks like this:

enclave {
	trusted {
		public void tgmp_init();
	};
};
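The hooks above depend on a check that the whole range, not just the starting address, is inside the enclave. The following sketch illustrates that kind of check outside of SGX. It is not the SDK implementation: demo_is_within_enclave is a hypothetical function, and two ordinary buffers stand in for trusted and untrusted memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins: one buffer plays the enclave, the other plays
   untrusted memory. A real enclave gets its bounds from the SGX runtime. */
unsigned char demo_enclave[4096];
unsigned char demo_untrusted[64];

/* Returns 1 only if [ptr, ptr + size) lies entirely inside demo_enclave. */
int demo_is_within_enclave (const void *ptr, size_t size)
{
  uintptr_t start = (uintptr_t) ptr;
  uintptr_t base  = (uintptr_t) demo_enclave;
  uintptr_t limit = base + sizeof demo_enclave;

  assert (limit > base);   /* the demo buffer itself must not wrap around */
  if (start < base || start > limit)
    return 0;
  /* Compare against the remaining space so start + size cannot overflow. */
  return size <= (size_t) (limit - start);
}
```

A block that starts inside the buffer but runs past its end is rejected, which is the case a start-address-only check would miss. For example, demo_is_within_enclave(demo_enclave + 4090, 16) returns 0, while the same call at offset 4080 returns 1.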

Serializing GMP variables

An easy way to serialize GMP values is to pass them as strings. The GMP library also has functions for packing and unpacking values in a manner that is more compact than writing a number out in decimal. GMP integers are easily serialized and deserialized using the mpz_get_str() and mpz_set_str() functions, but floating-point variables require extra processing because mpf_get_str() does not produce a string that can be immediately parsed by mpf_set_str(). A further complication is that the serialized data buffer can be of an arbitrary length, and when passing the buffer out of an ECALL or OCALL (designated by the "out" keyword in EDL) the edge routines can't know the size of the buffer to be checked in advance. To solve this issue, the developer must use two functions when passing the serialized data out of the ECALL or OCALL: one to transfer the buffer size, and another to transfer the buffer once the length is known.
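The two-function pattern can be sketched in plain C, without the SGX edge routines. Everything here is hypothetical and for illustration only: demo_result stands in for a serialized value held by the enclave, demo_result_size and demo_result_copy stand in for the two ECALLs, and demo_app_buffer stands in for a buffer owned by the untrusted application.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical enclave-side state: a serialized result awaiting pickup. */
static const char *demo_result = "123456789012345678901234567890";

/* Hypothetical application-side buffer used to receive the result. */
char demo_app_buffer[64];

/* Call 1: report how many bytes the caller must provide, including the
   terminating NUL. */
size_t demo_result_size (void)
{
  return strlen (demo_result) + 1;
}

/* Call 2: copy the result into a caller-supplied buffer of known length.
   Returns 1 on success, 0 if the buffer is missing or too small. */
int demo_result_copy (char *buf, size_t len)
{
  size_t need = demo_result_size ();

  if (buf == NULL || len < need)
    return 0;
  memcpy (buf, demo_result, need);
  return 1;
}
```

The caller first asks for the size, allocates at least that many bytes, and then requests the copy with the length it actually allocated. In an EDL file the second function would typically carry the length as a parameter so the edge routine can size its check, but the exact attributes depend on the application.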

The sample programs in the sgx-gmp-demo project demonstrate the marshaling procedures. The functions for serializing and de-serializing floating-point values are shown below, and can be found in serialize.c.

char *mpf_serialize (mpf_t val, int digits)
{
        mp_exp_t mpe= 0;
        mpz_t e;
        char *smant, *se, *s;
        size_t len;

        /* Get our free function. */

        if ( gmp_free_func == NULL || gmp_alloc_func == NULL )
                mp_get_memory_functions(&gmp_alloc_func, NULL, &gmp_free_func);

        smant= mpf_get_str(NULL, &mpe, S_BASE, digits, val);
        if ( smant == NULL ) return NULL;

        mpz_init_set_si(e, mpe);

        se= mpz_get_str(NULL, S_BASE, e);
        mpz_clear(e); /* Only the string form of the exponent is needed. */
        if ( se == NULL ) {
                gmp_free_func(smant, 0);
                return NULL;
        }

        len= strlen(smant)+strlen(se)+3;
        s= gmp_alloc_func(len); /* .M@N + NULL */
        if ( s == NULL ) {
                gmp_free_func(smant, 0);
                gmp_free_func(se, 0);
                return NULL;
        }

        /*
         * mpf_get_str produces strings that can't be directly consumed by
         * mpf_set_str, so deal with that.
         */

        if ( smant[0] == '-' ) {
                strncpy(s, "-.", 3); /* 3 bytes: include the terminator */
                strncat(s, &smant[1], strlen(smant)-1);
        } else {
                strncpy(s, ".", 2);
                strncat(s, smant, strlen(smant));
        }
        strncat(s, "@", 1);
        strncat(s, se, strlen(se));
        s[len-1]= '\0'; /* s holds len bytes, so len-1 is the last index */
        gmp_free_func(smant, 0);
        gmp_free_func(se, 0);

        return s;
}

int mpf_deserialize(mpf_t *val, char *s, int digits)
{
        /* Bits per decimal digit. Not static: log2(10) is not a constant expression in C. */
        const double bits= log2(10);
        mpf_set_prec(*val, (mp_bitcnt_t) (digits*bits)+1);
        return mpf_set_str(*val, s, S_BASE);
}

The function mpf_serialize() takes the output of mpf_get_str() and uses it to produce a string that mpf_set_str() will accept when called in mpf_deserialize(). Both the serialization and de-serialization functions require that the developer specify the number of digits of accuracy that should be preserved.
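The string transformation can be illustrated without GMP. For the value 3.1415 in base 10, mpf_get_str() returns the digit string "31415" and the exponent 1, with the radix point implied to the left of the first digit. The sketch below joins such a pair into the ".M@N" form that mpf_set_str() accepts, and computes the bits of precision needed for a given number of decimal digits. The names join_mantissa_exponent(), digits_to_bits(), and LOG2_10 are hypothetical and do not appear in serialize.c. It also uses malloc() and snprintf() for brevity where the demo uses the GMP allocation functions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define LOG2_10 3.321928094887362 /* bits per decimal digit */

/*
 * Hypothetical helper: join a digit string and exponent, in the form
 * returned by mpf_get_str(), into ".M@N". The caller frees the result.
 */
char *join_mantissa_exponent(const char *mant, long exp)
{
        const char *sign = "";
        size_t len;
        char *s;

        assert(mant != NULL);
        if ( mant[0] == '-' ) {
                sign = "-";
                ++mant; /* the sign moves in front of the radix point */
        }

        /* sign + '.' + digits + '@' + up to 20 exponent chars + NUL */
        len = strlen(sign) + 1 + strlen(mant) + 1 + 20 + 1;
        s = malloc(len);
        if ( s == NULL ) return NULL;

        snprintf(s, len, "%s.%s@%ld", sign, mant, exp);
        return s;
}

/* Bits of precision needed to hold the given number of decimal digits. */
unsigned long digits_to_bits(int digits)
{
        assert(digits >= 0);
        return (unsigned long) (digits * LOG2_10) + 1;
}
```

With these helpers, the pair ("-31415", 1) becomes "-.31415@1", and 1000 decimal digits call for 3322 bits of precision.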

Note that both the enclave and the untrusted application must compile this source file separately, as each must use the corresponding GMP header file. This is reflected at the top of serialize.c:

#ifdef HAVE_SGX
# include <sgx_tgmp.h>
#else
# include <gmp.h>
#endif

This excerpt from the sgxgmpmath demo program shows how two integers are passed into the enclave, multiplied together, and the result passed back out. First, the source variables must be serialized to character buffers.

        str_a= mpz_serialize(a);
        str_b= mpz_serialize(b);
        if ( str_a == NULL || str_b == NULL ) {
                fprintf(stderr, "could not convert mpz to string\n");
                return 1;
        }

Next, it invokes the ECALL to multiply them together. The ECALL is designed to return the length of the serialized result, which is stored in the len variable. This length is used to allocate a buffer large enough to store the result string and its terminating NULL.

        status= e_mpz_mul(eid, &len, str_a, str_b);
        if ( status != SGX_SUCCESS ) {
                fprintf(stderr, "ECALL e_mpz_mul: 0x%04x\n", status);
                return 1;
        }
        if ( !len ) {
                fprintf(stderr, "e_mpz_mul: invalid result\n");
                return 1;
        }

        str_c= realloc(str_c, len+1);
        if ( str_c == NULL ) {
                fprintf(stderr, "out of memory\n");
                return 1;
        }

The result is then fetched using another ECALL which places the serialized data in the buffer referenced by str_c.

        status= e_get_result(eid, &rv, str_c, len);
        if ( status != SGX_SUCCESS ) {
                fprintf(stderr, "ECALL e_get_result: 0x%04x\n", status);
                return 1;
        }
        if ( rv == 0 ) {
                fprintf(stderr, "e_get_result: bad parameters\n");
                return 1;
        }

Finally, the data is deserialized into a GMP variable and the result is printed.

        if ( mpz_deserialize(&c, str_c) == -1 ) {
                fprintf(stderr, "mpz_deserialize: bad integer string\n");
                return 1;
        }

        gmp_printf("imul : %Zd * %Zd = %Zd\n\n", a, b, c);

The enclave function that performs the multiplication takes the serialized inputs, deserializes them to GMP variables, multiplies them together, serializes the result, and then returns the length of the serialized form. For simplicity, this enclave uses a single global variable to store the result. A production application would need a more sophisticated mechanism for storing and retrieving values.

size_t e_mpz_mul(char *str_a, char *str_b)
{
        mpz_t a, b, c;
        size_t len= 0;

        /* Clear the last, serialized result */

        if ( result != NULL ) {
                gmp_free_func(result, 0);
                result= NULL;
        }

        mpz_inits(a, b, c, NULL);

        /* Deserialize the untrusted inputs into enclave variables */

        if ( mpz_deserialize(&a, str_a) == -1 ) goto cleanup;
        if ( mpz_deserialize(&b, str_b) == -1 ) goto cleanup;

        mpz_mul(c, a, b);

        /* Serialize the result */

        result= mpz_serialize(c);
        if ( result != NULL ) len= strlen(result);

cleanup:
        /* Release the GMP variables on every path, including errors */
        mpz_clears(a, b, c, NULL);

        return len;
}

The ECALL that returns the result writes to the raw pointer that is passed into the enclave. This avoids a double-copy of the data but requires that the enclave ensure the target buffer is wholly outside the enclave, and the source data is wholly inside of it.

int e_get_result(char *str, size_t len)
{
        if ( result == NULL || str == NULL || len == 0 ) return 0;

        if ( ! sgx_is_within_enclave(result, len) ) return 0;

        if ( sgx_is_outside_enclave(str, len+1) ) { /* Include terminating NULL */
                strncpy(str, result, len);
                str[len]= '\0';

                gmp_free_func(result, 0);
                result= NULL;

                return 1;
        }

        return 0;
}

The EDL for these functions is as follows:

public size_t e_mpz_mul( [string, in] char *str_a, [string, in] char *str_b );

public int e_get_result( [user_check] char *str_c, size_t len );
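The length-then-fetch protocol can be seen in isolation in the sketch below, which runs as ordinary C with no enclave. The functions fake_compute() and fake_get_result() are hypothetical stand-ins for the two ECALLs, and string concatenation stands in for the GMP arithmetic. Only the calling pattern is meant to carry over: the first call stores a result and reports its length, the caller allocates len+1 bytes, and the second call copies the result out and discards the stored copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static char *result = NULL; /* stands in for the enclave's global result */

/* Stand-in for the first ECALL: compute, keep the result, return its length. */
size_t fake_compute(const char *a, const char *b)
{
        size_t len;

        assert(a != NULL && b != NULL);
        free(result); /* discard any earlier result */
        len = strlen(a) + strlen(b);
        result = malloc(len + 1);
        if ( result == NULL ) return 0;
        memcpy(result, a, strlen(a));
        memcpy(result + strlen(a), b, strlen(b) + 1); /* copies the NUL too */
        return len;
}

/* Stand-in for the second ECALL: copy the result into the caller's buffer. */
int fake_get_result(char *str, size_t len)
{
        if ( result == NULL || str == NULL || len == 0 ) return 0;
        if ( len != strlen(result) ) return 0; /* caller's length must match */
        memcpy(str, result, len + 1);
        free(result);
        result = NULL;
        return 1;
}

/* Caller side: ask for the length, allocate, then fetch. */
char *fetch_result(const char *a, const char *b)
{
        size_t len = fake_compute(a, b);
        char *buf;

        if ( len == 0 ) return NULL;
        buf = malloc(len + 1); /* room for the terminator */
        if ( buf == NULL ) return NULL;
        if ( ! fake_get_result(buf, len) ) {
                free(buf);
                return NULL;
        }
        return buf;
}
```

As in the demo, a length of zero doubles as the error indicator, so a valid result can never be the empty string.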

Building the Sample Code

Before you can build the sample, you’ll need to build GMP both with and without SGX support, and install it to a convenient location.

To build the sample, run the configure script and supply the location of your GMP libraries. The following configuration options let you specify the location of your trusted and untrusted GMP libraries:

  --with-gmpdir=PATH           specify the libgmp directory
  --with-trusted-gmpdir=PATH   the trusted libgmp directory (default: gmp
                               directory)

For example, if you installed both versions of GMP to /opt/gmp/6.1.2, then you would run:

$ ./configure --with-gmpdir=/opt/gmp/6.1.2

Once configuration is done, you can run “make” to perform the build. It will create two programs: sgxgmpmath and sgxgmppi.

Demo: sgxgmpmath

This program takes two numbers on the command line, and then calls into the enclave to perform addition, multiplication, integer division, and floating point division. Each of these results is printed to stdout.

Usage is:

sgxgmpmath num1 num2

Sample output is shown below:

$ ./sgxgmpmath 12345678901234567890 9876543210
Enclave launched
libtgmp initialized
iadd : 12345678901234567890 + 9876543210 = 12345678911111111100

imul : 12345678901234567890 * 9876543210 = 121932631124828532111263526900

idiv : 12345678901234567890 / 9876543210 = 1249999988

fdiv : 12345678901234567890 / 9876543210 = 1249999988.734374999000

Demo: sgxgmppi

This program is a more advanced example of using the GMP library in an enclave, and it exercises several GMP capabilities including factorials, exponentiation, nth roots, floating-point division, and bits of precision. It makes an ECALL to calculate the value of pi to the specified number of digits using the Chudnovsky algorithm and places the value in a GMP variable that is passed to the ECALL as a parameter.

Usage is:

sgxgmppi ndigits

Note that the implementation of Chudnovsky’s algorithm in this demo application emphasizes clarity over performance.

$ ./sgxgmppi 1000
Enclave launched
libtgmp initialized
pi : 3.1415926535897932384626433832795028841971693993751
05820974944592307816406286208998628034825342117067982148
08651328230664709384460955058223172535940812848111745028
41027019385211055596446229489549303819644288109756659334
46128475648233786783165271201909145648566923460348610454
32664821339360726024914127372458700660631558817488152092
09628292540917153643678925903600113305305488204665213841
46951941511609433057270365759591953092186117381932611793
10511854807446237996274956735188575272489122793818301194
91298336733624406566430860213949463952247371907021798609
43702770539217176293176752384674818467669405132000568127
14526356082778577134275778960917363717872146844090122495
34301465495853710507922796892589235420199561121290219608
64034418159813629774771309960518707211349999998372978049
95105973173281609631859502445945534690830264252230825334
46850352619311881710100031378387528865875332083814206171
77669147303598253490428755468731159562863882353787593751
9577818577805321712268066130019278766111959092164201989
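For readers unfamiliar with the Chudnovsky series, the sketch below sums it in ordinary double precision. It is not the demo's implementation, which uses GMP variables inside the enclave. The names chudnovsky_pi() and newton_sqrt() are hypothetical, and the square root is computed by Newton's method only so that the sketch does not depend on the math library. Each term contributes about 14 digits, so a double is already saturated after the first two terms.

```c
#include <assert.h>

/* Square root by Newton's method, so the sketch does not need libm. */
static double newton_sqrt(double x)
{
        double r = x;
        int i;

        assert(x > 0.0);
        for ( i = 0; i < 64; ++i ) r = 0.5 * (r + x / r);
        return r;
}

/* Sum the first `terms` terms of the Chudnovsky series and return pi. */
double chudnovsky_pi(int terms)
{
        const double c = 640320.0;
        const double c3 = c * c * c;
        double sum = 0.0;
        double m = 1.0; /* (-1)^k (6k)! / ((3k)! (k!)^3 c^(3k)) */
        int k;

        assert(terms > 0);
        for ( k = 0; k < terms; ++k ) {
                double n = 6.0 * k;
                double d = 3.0 * k;
                double j = k + 1.0;

                sum += m * (13591409.0 + 545140134.0 * k);

                /* Step the multiplier from term k to term k+1. */
                m *= -(n+1)*(n+2)*(n+3)*(n+4)*(n+5)*(n+6)
                     / ((d+1)*(d+2)*(d+3) * j*j*j * c3);
        }

        /* 1/pi = 12 * sum / c^(3/2) */
        return c * newton_sqrt(c) / (12.0 * sum);
}
```

In an arbitrary-precision version, the factorial ratio, the power of 640320, and the square root are the steps that map onto the GMP factorial, exponentiation, and root functions the demo exercises.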

Future Work

This port of GMP as an SGX trusted library is a proof-of-concept, and serves as an example of how to adapt libraries for use in SGX enclaves. The changes made to GMP were kept to a minimum, and mostly limited to the build system.

To turn this into a production-ready product, some additional work is needed:

  • C++ support needs to be implemented.
  • More extensive validation is needed to ensure the library functions properly in an enclave.
  • The assembly code needs to be reviewed to ensure it doesn’t contain instructions that are illegal in an enclave.
  • Fat library support should be investigated for feasibility.
  • A more graceful solution than “comment it out” should be implemented for those code modules which print messages to stderr when a fatal error occurs.
  • Enclave support needs to be added to the test suite (run via make check).
  • The changes to the build system emphasized clarity over efficiency. Some consolidation of the conditionals in the Automake files would probably make these changes easier to maintain.
  • Configuration options that are not compatible with an enclave build should produce an error when configure is run. In a similar vein, the SGX option should also turn on configuration options that are required for the enclave build.

Conclusion

This article walked through the process of porting the GNU Multiple Precision Arithmetic Library (GMP) to a trusted library. It's intended to serve as a model for developers to follow when adapting libraries for an Intel SGX enclave.

The majority of the work spent porting GMP to Intel SGX was in build system integration, primarily because GMP’s compartmentalized design meant that very little source code had to be modified. While other libraries might require varying levels of source code changes, build system integration will likely always be a significant factor in any porting effort. The Automake and Autoconf configuration templates and macros for Intel SGX provide a convenient framework for incorporating the Intel SGX build procedures while preserving the original build definitions. This lets the package produce both the original and Intel SGX capable versions of the binaries.

Using the trusted library can be complicated if GMP variables must be passed between untrusted and trusted memory. Care must be taken to ensure that the enclave does not leak secrets to untrusted space, or call functions like realloc() on an untrusted memory pointer while inside an enclave. Libraries such as GMP need to document their data marshalling procedures to protect against runtime errors and accidental exposure of secrets.

For more information, refer to the Intel SGX, Intel SGX SDK, and the GMP documentation.
