Structures
The main tool a C programmer has for creating new data types is the declaration of structures, using the struct keyword. Here, for example, is a type representing a point in a Cartesian plane.
struct point {
    double x;
    double y;
};
After this declaration, struct point is a type, and you can declare variables of this type, which in turn have members named x and y, accessed through the ‘dot’ member-access operator (.).
struct point p;
p.x = 1.2;
p.y = 3.4;
printf("(%f, %f)\n", p.x, p.y);
A structure can be initialized using a list of values in curly brackets, corresponding to the initial values of each member in the order they appear in the structure’s declaration.
struct point p = {1.2, 3.4};
printf("(%f, %f)\n", p.x, p.y); /* prints (1.200000, 3.400000) */
For clarity, it is possible to list the initial values in any order with each given a ‘member designator’, as in the following.
struct point p = {.x = 1.2, .y = 3.4};
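As a small sketch of how designators behave, the following uses two hypothetical helpers (point_on_y_axis and same_point_either_order are illustrative names, not part of any library). It relies on two rules of the language: members left out of an initializer list are initialized to zero, and designators may be given in any order.

```c
#include <assert.h>

struct point {
    double x;
    double y;
};

/* Hypothetical helper: returns the point (0, y). */
struct point point_on_y_axis(double y)
{
    /* Only .y is named; the omitted member .x is initialized to zero. */
    struct point p = {.y = y};
    assert(p.x == 0.0);
    return p;
}

/* Hypothetical helper: returns 1 if both initializer orders give the same point. */
int same_point_either_order(void)
{
    struct point a = {.x = 1.2, .y = 3.4};
    struct point b = {.y = 3.4, .x = 1.2};
    return a.x == b.x && a.y == b.y;
}
```

Naming the members also protects the initializer against a later reordering of the members in the structure's declaration.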
The members of a structure can be of any scalar or aggregate type, including arrays, pointers, and other structures. Unlike arrays, which are homogeneous, a structure is heterogeneous, meaning the members are not necessarily all of one type.
struct circle {
    struct point center;
    double radius;
};

struct color {
    uint8_t red;    /* uint8_t is declared in <stdint.h> */
    uint8_t green;
    uint8_t blue;
};

struct node {
    struct node *next; /* must be a pointer */
    int value;
};

struct person {
    char *name;
    time_t age;     /* time_t is declared in <time.h> */
};

struct grade {
    int scores[8];
    double average;
};

/* from <time.h> */
struct timespec {
    time_t tv_sec;  /* seconds */
    long tv_nsec;   /* nanoseconds */
};
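The "must be a pointer" comment in struct node deserves a word: a structure cannot contain a member of its own type, since its size would be unbounded, but it can contain a pointer to its own type, which is how linked data structures are built. Here is a minimal sketch, assuming hypothetical helpers named list_prepend and list_sum. It uses the -> operator, which is shorthand for dereferencing a pointer and then applying the dot operator.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    struct node *next; /* pointer to the following node, or NULL at the end */
    int value;
};

/* Hypothetical helper: links n in front of head and returns the new head. */
struct node *list_prepend(struct node *head, struct node *n)
{
    assert(n != NULL);
    n->next = head;    /* same as (*n).next = head */
    return n;
}

/* Hypothetical helper: adds up the values of every node in the list. */
int list_sum(const struct node *head)
{
    int sum = 0;
    for (const struct node *n = head; n != NULL; n = n->next)
        sum += n->value;
    return sum;
}
```

The nodes themselves can live anywhere (local variables, a static array, or allocated memory); the list only records how they point to one another.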
If a structure contains a member that is itself a structure, that member can be initialized either from an already-existing value of that type or with a nested initializer. Nested members are accessed using repeated member-access operators.
struct circle c;
c.center.x = 1.2;
c.center.y = 3.4;
c.radius = 1.0;
struct point p = {1.2, 3.4};
struct circle c1 = {p, 1.0};
struct circle c2 = {{1.2, 3.4}, 1.0};
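Nested initializers combine with member designators too. The sketch below assumes two hypothetical helpers, unit_circle_at and circle_contains, to show a nested designated initializer and repeated member access working together.

```c
#include <assert.h>
#include <stdbool.h>

struct point {
    double x;
    double y;
};

struct circle {
    struct point center;
    double radius;
};

/* Hypothetical helper: a circle of radius 1 centered at (x, y). */
struct circle unit_circle_at(double x, double y)
{
    /* The inner braces initialize the nested struct point member. */
    struct circle c = {.center = {.x = x, .y = y}, .radius = 1.0};
    return c;
}

/* Hypothetical helper: true if p lies inside or on the circle c. */
bool circle_contains(struct circle c, struct point p)
{
    assert(c.radius >= 0.0);
    double dx = p.x - c.center.x;   /* repeated member access */
    double dy = p.y - c.center.y;
    return dx * dx + dy * dy <= c.radius * c.radius;
}
```

Both helpers take and return structures by value, so each call works on its own copy of the members.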
More about declaring struct variables
The full version of a structure declaration allows the programmer to optionally declare objects of that type at the same time.
struct point {
    double x;
    double y;
} p;
This declares both the type struct point and an object, p, of that type. Both the tag and the object declaration(s) are optional, so you can declare a structured variable while leaving the type anonymous (hence not conveniently reusable).
struct { double x; double y; } p;
(This is why the declaration always ends with a semicolon, while the curly brackets that end function definitions or bracketed blocks do not have semicolons after them. Even if you don’t include any variable names after the curly bracket, the syntax is treated just like any other line declaring variables.)
It is even allowed to leave off both parts and write struct { double x; double y; }; by itself, but since that creates neither a type that can be used later nor a variable of that type, it is of only very limited use.
Stylistically, I prefer to declare the type separately from any variables I will want of that type, but you will see both styles.
Tags vs typenames
A structure type’s identifier, e.g. point in struct point, is called a ‘tag’. Structure tags are in their own namespace, that is, they do not conflict with identifiers for variables, so even in a program with struct foo there is no problem with declaring a variable or function named foo. To form a typename, it is necessary to spell out struct foo in full.
It is common to use typedef to abbreviate structure types. A typedef, syntactically, looks like a variable declaration preceded by the keyword typedef, as in the following.
typedef unsigned long size_t;
typedef struct point point;
The identifier being declared, rather than naming an object of the given type, instead becomes an alias for that type. So the first type definition above allows size_t to be used interchangeably with unsigned long (as might appear in stddef.h on some architectures). The second introduces point as a type that can be used interchangeably with struct point.
It is common practice, though by no means required, for types introduced in this way to have a _t suffix, read as ‘type’, as in size_t, ‘size type’. This prevents conflicts with variable names, so that a declaration such as size_t size; is allowed. It is equally common for abbreviated structure names to be the same as the structure’s tag, or even for the structure not to have a tag and only be named through the typedef.
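The three naming situations described in this section can be sketched side by side. The names counter, fraction_t, fraction_value and midpoint below are hypothetical, chosen only for illustration.

```c
#include <assert.h>

/* A tag and a function may share a name: tags live in their own namespace. */
struct counter {
    int count;
};

int counter(struct counter c)
{
    return c.count;
}

/* A typedef whose name is the same as the structure's tag. */
struct point {
    double x;
    double y;
};
typedef struct point point;

/* struct point and point name the same type and can be mixed freely. */
point midpoint(struct point a, point b)
{
    point m = {(a.x + b.x) / 2.0, (a.y + b.y) / 2.0};
    return m;
}

/* A structure with no tag, named only through its typedef. */
typedef struct {
    int numerator;
    int denominator;
} fraction_t;

double fraction_value(fraction_t f)
{
    assert(f.denominator != 0);
    return (double)f.numerator / f.denominator;
}
```

One consequence of the untagged form is that the structure cannot refer to itself, as struct node does, because there is no tag to name inside its own member list.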
In C++, structure tags can implicitly be used as type names in declarations under somewhat complicated rules. So, after declaring struct node in C++, it is possible to declare a variable as node n; with no need for the explicit typedef struct node node; that C would require.
Memory representation of structures
To learn quite a lot about how structures are laid out in memory, read The Lost Art of Structure Packing by Eric S. Raymond; the key notions to take away are alignment and padding. Even though the language standard doesn’t promise much, knowing about these essentially universal implementation details should enable you to reason about the layout of structures in order to make them smaller or to make them correspond correctly to the layout of information in a file format or network protocol.
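Alignment and padding can be observed directly with sizeof and the offsetof macro from stddef.h. The following is a sketch only: struct loose and struct tight are hypothetical types holding the same three members in different orders, and the numbers printed depend on the platform's ABI rather than on the language standard.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */
#include <stdio.h>

/* Hypothetical type: a small member on each side of a large one. */
struct loose {
    char   tag;     /* usually followed by padding so that value is aligned */
    double value;
    char   flag;    /* usually followed by trailing padding */
};

/* Same members, with the most strictly aligned one first. */
struct tight {
    double value;
    char   tag;
    char   flag;
};

/* Bytes in each structure that are not occupied by any member. */
size_t loose_padding(void)
{
    return sizeof(struct loose) - (2 * sizeof(char) + sizeof(double));
}

size_t tight_padding(void)
{
    return sizeof(struct tight) - (2 * sizeof(char) + sizeof(double));
}

/* Prints sizes and member offsets for both layouts. */
void print_layouts(void)
{
    /* One thing the standard does promise: the first member is at offset 0. */
    assert(offsetof(struct loose, tag) == 0);

    printf("loose: size %zu, value at %zu, flag at %zu, padding %zu\n",
           sizeof(struct loose),
           offsetof(struct loose, value),
           offsetof(struct loose, flag),
           loose_padding());
    printf("tight: size %zu, tag at %zu, flag at %zu, padding %zu\n",
           sizeof(struct tight),
           offsetof(struct tight, tag),
           offsetof(struct tight, flag),
           tight_padding());
}
```

On a typical x86-64 Linux ABI, where double is 8-byte aligned, struct loose occupies 24 bytes and struct tight 16, although both hold only 10 bytes of data. Other platforms may report different figures.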