For best performance in every memory address space, a kernel should access data in quantities of at least 32 bits, at addresses aligned to 32-bit boundaries. A 32-bit quantity can consist of any type, for example char4, ushort2, int, or float.
These data types can be accessed with identical memory performance. If possible, access up to four 32-bit quantities at a time (for example, one float4 or int4) to improve performance. Accessing more than four 32-bit quantities at a time may reduce performance.
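
The guidance above can be sketched with a few OpenCL C kernels. This is only an illustration under stated assumptions: the kernel names (invert_rgba, scale_scalar, scale_vec4) and the parameter k are hypothetical, the buffers are assumed to be created by the host with suitable alignment (4 bytes for uchar4 and float, 16 bytes for float4), and the element count for the float4 version is assumed to be a multiple of four. Actual speedups depend on the device and are not measured here.

```opencl
// Hypothetical kernels for illustration only; none of these names come
// from the guide.

// Treats each packed RGBA pixel as one uchar4, so every work-item issues
// a single aligned 32-bit load and a single 32-bit store instead of four
// separate 8-bit accesses.
__kernel void invert_rgba(__global const uchar4 *in,
                          __global uchar4 *out)
{
    size_t i = get_global_id(0);
    out[i] = (uchar4)(255) - in[i];   // component-wise subtraction
}

// Baseline: one 32-bit quantity (a float) per work-item.
// Launch with a global size equal to the number of floats.
__kernel void scale_scalar(__global const float *in,
                           __global float *out,
                           const float k)
{
    size_t i = get_global_id(0);
    out[i] = in[i] * k;
}

// Four 32-bit quantities (one float4, 128 bits) per work-item.
// Assumes 16-byte-aligned buffers and a float count that is a multiple
// of 4; launch with a global size of (number of floats) / 4.
__kernel void scale_vec4(__global const float4 *in,
                         __global float4 *out,
                         const float k)
{
    size_t i = get_global_id(0);
    out[i] = in[i] * k;               // the scalar k is applied to each component
}
```

In this sketch scale_vec4 does the same work as scale_scalar with a quarter of the work-items. Widening further (for example to float8 or float16) would exceed four 32-bit quantities per access, which the text above notes may reduce performance.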