Intel-Specific Pragmas

The Intel-specific C++ compiler pragmas described in the Intel-Specific Pragma reference are listed below.

Some pragmas are available for both Intel® and non-Intel microprocessors, but they may perform more optimizations for Intel® microprocessors than for non-Intel microprocessors.

See the reference entry for each pragma for a more detailed description.

Pragma

Description

alloc_section

Allocates one or more variables in the specified section. Controls section attribute specification for variables.

cilk grainsize

Specifies the grain size for one cilk_for loop.

distribute_point

Instructs the compiler to prefer loop distribution at the location indicated.

inline

Specifies inlining of all calls in a statement. This entry also covers the forceinline and noinline pragmas.

inline-max-per-routine

Controls the number of times inlining may be applied to a routine.

inline-max-total-size

Controls how much an individual routine can grow through inlining.

intel_omp_task

For Intel legacy tasking, specifies a unit of work, potentially executed by a different thread.

intel_omp_taskq

For Intel legacy tasking, specifies an environment for the while loop in which to queue the units of work specified by the enclosed task pragma.

ivdep

Instructs the compiler to ignore assumed vector dependencies.

loop_count

Specifies the number of iterations expected for the for loop.

nofusion

Prevents a loop from fusing with adjacent loops.

novector

Specifies that a particular loop should never be vectorized.

offload

Executes the statements on the target. This pragma only applies to Intel® MIC Architecture and Intel® Graphics Technology. Intel® Graphics Technology is a preview feature.

offload_attribute

Specifies that all functions and variables declared subsequent to the pragma are available on the target. This pragma only applies to Intel® MIC Architecture and Intel® Graphics Technology.

offload_transfer

Initiates and completes a synchronous data transfer. If used with the signal clause, initiates an asynchronous data transfer. This pragma only applies to Intel® MIC Architecture.

offload_wait

Specifies a wait for a previously initiated asynchronous data transfer. This pragma only applies to Intel® MIC Architecture.

omp atomic

Specifies a computation that must be executed atomically.

omp barrier

Specifies a point in the code where each thread must wait until all threads in the team arrive.

omp critical

Specifies a code block that is restricted to access by only one thread at a time.

omp declare simd

Creates a version of a function that can process multiple arguments using Single Instruction Multiple Data (SIMD) instructions from a single invocation in a SIMD loop.

omp declare target

Creates a device-specific version of a function that can be called from a target region. This pragma only applies to Intel® MIC Architecture.

omp distribute

Specifies that the iterations of one or more loops should be shared among the master threads of all thread teams in a league.

omp flush

Identifies a point at which the view of the memory by the thread becomes consistent with the memory.

omp for

Specifies a parallel loop. Each iteration of the loop is executed by one of the threads in the team.

omp for simd

Specifies the beginning of a loop that can be executed concurrently using Single Instruction Multiple Data (SIMD) instructions. Each iteration of the loop is executed by one of the threads in the team.

omp parallel for simd

Specifies a parallel region that contains a loop to execute with Single Instruction Multiple Data (SIMD) instructions.

omp master

Specifies the beginning of a code block that must be executed only once by the master thread of the team.

omp ordered

Specifies a code block in a worksharing loop that will be run in the order of the loop iterations.

omp parallel

Specifies that a structured block should be run in parallel by a team of threads.

omp parallel for

Specifies a parallel construct containing one or more associated loops.

omp parallel sections

Specifies a parallel construct that contains a single sections construct.

omp sections

Defines a region of structured blocks that will be distributed among the threads in a team.

omp simd

Transforms the loop into a loop that will be executed concurrently using Single Instruction Multiple Data (SIMD) instructions.

omp single

Specifies a structured block that will be executed only once by a single thread in the team.

omp target

Creates a device data environment and then executes the construct on that device.

omp target data

Creates a device data environment by making a mapping of variables from the host to the target device.

omp target update

Makes the items listed in the device data environment consistent between the device and host, in accordance with the motion clauses on the pragma.

omp task

Specifies the beginning of a code block whose execution may be deferred.

omp taskgroup

Causes the program to wait until the completion of all enclosed and descendant tasks.

omp taskyield

Specifies that the current task can be suspended at this point, in favor of execution of a different task.

omp taskwait

Specifies a wait on the completion of child tasks generated since the beginning of the current task.

omp teams

Creates a league of thread teams to execute the structured block in the master thread of each team.

omp threadprivate

Specifies a list of globally-visible variables that will be allocated private to each thread.

optimize

Enables or disables optimizations for code after this pragma, until another optimize pragma or the end of the translation unit.

optimization_level

Controls optimization for one function or all functions after its first occurrence.

optimization_parameter

Tells the compiler to generate code specialized for a particular processor, at the function level, similar to the -m (/arch) options.

parallel/noparallel

Resolves dependencies to facilitate auto-parallelization of the immediately following loop (parallel) or prevents auto-parallelization of the immediately following loop (noparallel).

prefetch/noprefetch

Invites the compiler to issue or disable requests to prefetch data from memory. This pragma only applies to Intel® MIC Architecture.

simd

Enforces vectorization of loops.

unroll/nounroll

Tells the compiler to unroll or not to unroll a counted loop.

unroll_and_jam/nounroll_and_jam

Enables or disables loop unrolling and jamming. These pragmas can only be applied to iterative for loops.

unused

Identifies variables that are unused, so that no warnings are generated for them.

vector

Indicates to the compiler that the loop should be vectorized according to the argument keywords.
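
As a rough illustration of where a few of the loop-level pragmas from the table sit in source code, the sketch below places ivdep, loop_count, novector, omp simd, and omp parallel for directly before the loops they apply to. The function names, the data, and the loop_count value of 1024 are hypothetical and chosen only for this example; they do not come from the pragma reference. Compilers that do not recognize a pragma typically ignore it (possibly with a warning), and the omp pragmas only take effect when OpenMP support is enabled, so the functions should compute the same results either way.

```cpp
#include <cstddef>

// Hypothetical helper: dst[i] = a * src[i] + b.
// ivdep asks the compiler to ignore assumed (unproven) vector dependencies,
// for example possible aliasing between dst and src.
void scale_shift(float* dst, const float* src, std::size_t n, float a, float b) {
#pragma ivdep
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = a * src[i] + b;
    }
}

// Hypothetical helper: sums an array.
// loop_count gives the compiler a hint about the expected trip count;
// the value 1024 is an assumption made for this sketch.
long long sum_array(const int* v, int n) {
    long long total = 0;
#pragma loop_count(1024)
    for (int i = 0; i < n; ++i) {
        total += v[i];
    }
    return total;
}

// Hypothetical helper: counts positive entries.
// novector requests that this particular loop is not vectorized.
int count_positive(const int* v, int n) {
    int count = 0;
#pragma novector
    for (int i = 0; i < n; ++i) {
        if (v[i] > 0) {
            ++count;
        }
    }
    return count;
}

// Hypothetical helper: dot product.
// omp simd asks for SIMD execution of the loop; the reduction clause
// makes the accumulation into sum well defined.
float dot(const float* x, const float* y, int n) {
    float sum = 0.0f;
#pragma omp simd reduction(+ : sum)
    for (int i = 0; i < n; ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}

// Hypothetical helper: sum of squares of 0..n-1.
// omp parallel for shares the loop iterations among the threads of a team.
long long sum_squares(int n) {
    long long total = 0;
#pragma omp parallel for reduction(+ : total)
    for (int i = 0; i < n; ++i) {
        total += static_cast<long long>(i) * i;
    }
    return total;
}
```

Each pragma here applies only to the loop that immediately follows it. Whether a hint such as ivdep or loop_count changes the generated code depends on the compiler, the target, and the optimization level, so treat the placement as the point of the example rather than any particular performance effect.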

Please refer to the Optimization Notice page for more detailed information regarding performance and optimization in Intel software products.