Jan 12, 2014

[C++ / C][NOTE] The Lost Art of C Structure Packing by Eric S. Raymond

quote from:
The Lost Art of C Structure Packing

Also reference:
Why does a single integer assignment statement consume all of my CPU?

ANSI C provides an offsetof() macro which can be used to read out structure member offsets.

Storage for the basic C datatypes on an x86 or ARM processor doesn’t normally start at arbitrary byte addresses in memory. Each type except char has an alignment requirement:

* chars can start on any byte address; they’re equally expensive to access from anywhere they live inside a single machine word, which is why they don’t have a preferred alignment
* 2-byte shorts must start on an even address
* 4-byte ints or floats must start on an address divisible by 4
* 8-byte longs or doubles must start on an address divisible by 8
* Signed or unsigned makes no difference
* Basic C types on x86 and ARM are self-aligned
* Pointers, whether 32-bit (4-byte) or 64-bit (8-byte), are self-aligned

You can coerce the compiler into not using the processor’s normal alignment rules with a pragma, usually #pragma pack. Do not do this casually, as it forces the generation of more expensive and slower code.

Pointer alignment is the strictest possible:
 
char *p;
 
char *p;      /* 4 or 8 bytes */
char c;       /* 1 byte */
char pad[3];  /* 3 bytes */
int x;        /* 4 bytes */

//----
char *p;      /* 4 or 8 bytes */
char c;       /* 1 byte */
char pad[1];  /* 1 byte */
short x;      /* 2 bytes */

//----
char *p;     /* 8 bytes */
char c;      /* 1 byte */
char pad[7]; /* 7 bytes */
long x;      /* 8 bytes */

//----

char c;
char pad1[M]; //M is unpredictable
char *p;
char pad2[N]; //N is 0
int x;

//---- Make it predictable ----
char *p;     /* 8 bytes */
long x;      /* 8 bytes */
char c;      /* 1 byte */

In general, a struct instance will have the alignment of its widest scalar member. Compilers do this as the easiest way to ensure that all the members are self-aligned for fast access.
 
struct foo1 {
    char *p;     /* 8 bytes */
    char c;      /* 1 byte */
    char pad[7]; /* 7 bytes */
    long x;      /* 8 bytes */
};

//--- the padding is locked in, unlike with non-struct variables ---
struct foo2 {
    char c;      /* 1 byte */
    char pad[7]; /* 7 bytes; predictable, because foo2 as a whole is
    * pointer-aligned, so c always starts on an 8-byte boundary */
    char *p;     /* 8 bytes */
    long x;      /* 8 bytes */
};

On a 64-bit x86 or ARM machine:
 
struct foo3 {
    char *p;     /* 8 bytes */
    char c;      /* 1 byte */
    /* 7 bytes of trailing padding */
};

struct foo3 singleton; // sizeof(singleton) is 16 bytes
struct foo3 quad[4];

//------
struct foo4 {
    short s;     /* 2 bytes */
    char c;      /* 1 byte */
    /* 1 byte of trailing padding */
};  // sizeof(struct foo4) is 4 bytes

//------

struct foo5 {
    short s;       /* 2 bytes */
    char c;        /* 1 byte */
    // No alignment padding is inserted after char c: the bit-fields
    // follow it directly, and the padding comes last.
    int flip:1;    /* total 1 bit */
    int nybble:4;  /* total 5 bits */
    int septet:7;  /* total 12 bits */
    int pad1:4;    /* total 16 bits = 2 bytes */
    char pad2;     /* 1 byte */
};

//------

struct foo6 {
    char c;           /* 1 byte */
    char pad1[7];     /* 7 bytes */
    struct foo6_inner {
        char *p;      /* 8 bytes; this pointer gives the inner struct,
         * and therefore the outer one, 8-byte alignment */
        short x;      /* 2 bytes */
        char pad2[6]; /* 6 bytes */
    } inner;
};

Rule of thumb: order members by decreasing alignment. Put all the pointer-aligned subfields first, because on a 64-bit machine they will be 8 bytes; then the 4-byte ints; then the 2-byte shorts; then the character fields. For example:
 
struct foo7 {
    char c;         /* 1 byte */
    char pad1[7];   /* 7 bytes */
    struct foo7 *p; /* 8 bytes */
    short x;        /* 2 bytes */
    char pad2[6];   /* 6 bytes */
};

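//--- becomes ---
struct foo8 {
    struct foo8 *p; /* 8 bytes */
    short x;        /* 2 bytes */
    char c;         /* 1 byte */
    /* 5 bytes of trailing padding; sizeof(struct foo8) is 16, down from 24 */
};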

Excerpt:

Using enumerated types instead of #defines is a good idea, if only because symbolic debuggers have those symbols available and can show them rather than raw integers. But while enums are guaranteed to be compatible with an integral type (in C), the C standard does not specify which underlying integral type is to be used for them. (In C++11, the underlying type of an enum can be specified.) Be aware when repacking your structs that while enumerated-type variables are usually ints, this is compiler-dependent; they could be shorts, longs, or even chars by default. Your compiler may have a pragma or command-line option to force the size.

The long double type is a similar trouble spot. Some C platforms implement this in 80 bits, some in 128, and some of the 80-bit platforms pad it to 96 or 128 bits. In both cases it’s best to use sizeof() to check the storage size.

Finally, under x86 Linux doubles are sometimes an exception to the self-alignment rule: an 8-byte double may require only 4-byte alignment within a struct even though standalone double variables have 8-byte self-alignment. This depends on compiler and options.
