Declarations, definitions, and numbers
Consider this line of C code.
int x = 2 + 2;
This is called a ‘declaration’. It declares the existence of an
object: in this case, it is named ‘x’, it holds values that are
integers (int), and it starts out with the value 4 (i.e. the result
of evaluating the expression \(2 + 2\)). It is possible to declare
that something exists in general without immediately finding a place for it to
actually exist in memory, but this declaration does indeed cause x
to exist as soon as the main function is invoked. A declaration
that causes storage to be allocated is called a ‘definition’.
In fact, the code in twoplustwo.c that contains this line has
two definitions: one for the function main, and one for the
variable x inside of it. (We will talk more about the scope and
duration of things, but it suffices to say for now that our x
is confined within main.)
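For reference, the full file isn't reproduced here, but twoplustwo.c presumably looks something like this minimal sketch, with the one definition of main containing the one definition of x:

int main()
{
    int x = 2 + 2;
}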
Unlike in math, variables in C can be changed to hold different values at
different instants in the computation. Somewhat confusingly, C uses the
same symbol, =, both to declare that a variable is equal to some
value, and to say that a variable is to be changed to hold some value.
int main()
{
    int x = 2 + 2;
    x = 2 + 3;
}
In this program, x is initialized to 4, and then set to 5
(i.e. the result of evaluating the expression \(2 + 3\)). It is not
strictly necessary to initialize a variable; to calculate \(2 + 2\),
one could write the following.
int main()
{
    int x;
    x = 2 + 2;
}
The definition int x; declares that a variable named ‘x’
that holds integers exists, and sets aside a suitable part of memory
for it. Then the assignment x = 2 + 2 calculates a new value
and changes the memory for x to represent the result. This is
worse style, because between defining x and assigning it a
meaningful value, it does still exist and represent an integer; it
is just undefined which one! Such uninitialized variables are a common
source of bugs. Still, there are often situations where it makes sense
to define a variable before there is a meaningful value to put in it;
so, just strive to put the definition as close as possible to giving it
a value, and the ideal is to put them in the exact same place using an
initializer in the definition.
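To make the hazard concrete, here is a small hypothetical snippet (the names y and z are invented for illustration) that reads a variable before giving it a value; the result is undefined, so z could end up holding anything:

int main()
{
    int y;          /* defined, but never given a value */
    int z = y + 1;  /* bug: y's value is indeterminate, so z's is too */
}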
Here is a program that calculates \(2 + 2\) in yet another way.
int main()
{
    int x = 2;
    x = x + 2;
}
First, this program defines an integer x and initializes it to 2.
Then, it evaluates the expression \(x + 2\), which since \(x\)
is currently 2 means \(2 + 2\) or \(4\), and that result is
stored into x. You can follow along in the debugger.
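If you save the program above as, say, plustwo.c (a hypothetical name) and have gcc and gdb installed, a session might look like this:

gcc -g plustwo.c -o plustwo
gdb ./plustwo
(gdb) break main
(gdb) run
(gdb) next
(gdb) print x
$1 = 2
(gdb) next
(gdb) print x
$2 = 4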
The code I’ve shown you so far has included two types, int
and double. They correspond to types of numbers, but there are
some interesting details to discuss because those numbers must be
stored in a finite amount of computer memory.
Integral types
An int isn’t exactly the same as an integer. For one thing,
mathematically the integers are infinite, while a computer with finite
mass can only represent a finite amount of information. Furthermore,
the way computing hardware is designed, there are certain fixed-size
chunks of information (e.g. 32 or 64 bits) that it can handle directly,
and anything bigger or smaller must generally be simulated in software.
The C language is designed to respect that sort of design decision and give the programmer an architecture-agnostic way to work within such limitations. So, C has a family of related types, called the integral types, which all store integers but with different architectural characteristics.
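One consequence of that finiteness is that an int has definite minimum and maximum values, which the standard library exposes; a quick sketch to inspect them (assuming a hosted implementation with stdio.h and limits.h available):

#include <limits.h>
#include <stdio.h>

int main()
{
    /* INT_MIN and INT_MAX are the smallest and largest values an int can hold */
    printf("int ranges from %d to %d\n", INT_MIN, INT_MAX);
}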
A C integral type is either ‘signed’ or ‘unsigned’. A signed
integer can store negative, zero, or positive numbers; an unsigned integer
stores nonnegative (i.e. zero or positive) numbers. (Thus an unsigned
integer is more like a natural number in math than an integer.) You
can specify signed int or unsigned int, but if you don’t
specify, the default is signed, i.e. int is equivalent
to signed int.
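For example (illustrative declarations only; the names are invented):

signed int balance = -250;  /* signed: may be negative */
unsigned int count = 3;     /* unsigned: only zero or positive */
int total = -7;             /* plain int means signed int */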
Integral types are also distinguished by width, i.e. how many bits
are used to represent a value. The int type is chosen for each
architecture to be the most natural width for the hardware’s bus, math
instructions, etc.; on many current systems that is 32 bits (4 bytes),
but the C language specification leaves it implementation-defined, so an
int could be a different size on, say, an Arduino.
The char type is an integral type that is the smallest unit
the architecture can subdivide memory into, while being large enough to
store the numeric code for a character of text (hence the abbreviated
name ‘char’). Thus, char is synonymous with ‘byte’,
and it follows that although the standard allows different widths on
different architectures, on every known system it has a width of 8 bits.
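Since a char is just a small integer, a character and its numeric code are two views of the same value; this little sketch (assuming an ASCII-compatible system, where 'A' is code 65) shows both:

#include <stdio.h>

int main()
{
    char c = 'A';                      /* 'A' is just the number 65 in ASCII */
    printf("%c has code %d\n", c, c);  /* prints: A has code 65 */
}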
A short int, or just short, is at least as wide as
a char and at most as wide as an int. That is all
the standard guarantees (once again, it is implementation-defined),
but on a system with an 8-bit char and a 32-bit int,
a short is usually 16 bits (2 bytes). Going the other way,
a long int, or just long, is at least as wide as an
int but possibly wider, and a long long int or just
long long is at least as wide as a long but possibly
wider. Most architectures don’t actually have five different widths
of integers (char, short, int, long, long long) built into the
hardware, so typically some of them will actually be the same size;
however, each wider C type is at least as wide as the previous,
giving the programmer some control over width while remaining portable.
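To see which widths coincide on your machine, you can print each type's size with sizeof, which reports it in bytes (this sketch assumes a C99-or-later compiler for the %zu format):

#include <stdio.h>

int main()
{
    /* sizeof yields each type's width in bytes on this architecture */
    printf("char:      %zu\n", sizeof(char));
    printf("short:     %zu\n", sizeof(short));
    printf("int:       %zu\n", sizeof(int));
    printf("long:      %zu\n", sizeof(long));
    printf("long long: %zu\n", sizeof(long long));
}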
When in doubt, use int for whole numbers.
Floating-point types
To represent rational and real numbers, which aren’t necessarily whole
numbers, C provides the float type. The name float is an
abbreviation of ‘floating point’, as opposed to ‘fixed point’. As
an example of a fixed-point scheme, consider US currency: when it is
written with a decimal point, there are conventionally 2 places after
it, like 1.99. So one could, rather than trying to represent fractions
of dollars at all, just use an int for the whole number of
pennies. However, in general, you might want 1.99, or 19.9, or 0.199; the
point ‘floats’ around based on the value’s order of magnitude.
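The contrast, in illustrative declarations (the names are invented):

int price_cents = 199;  /* fixed point: whole pennies, 2 implied decimal places */
float price = 1.99f;    /* floating point: the point moves with the magnitude */
float ratio = 0.199f;   /* same digits, different order of magnitude */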
Unlike the integral types, real types in C are always signed; however, there are multiple widths available. Being stored in a fixed number of bits limits a floating-point variable not only in magnitude but also in precision; there is much more to explore here, but the main idea is that floats attempt to represent real numbers, and a programmer must remember that they are only finite approximations.
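A quick sketch of that approximation in action: the decimal value 0.1 has no exact binary representation, so a float stores only the nearest value it can (on typical IEEE 754 hardware, this prints something like 0.1000000015):

#include <stdio.h>

int main()
{
    float f = 0.1f;        /* stored as the nearest representable float */
    printf("%.10f\n", f);  /* reveals the approximation error */
}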
The double type is a larger floating point type, so named because
it has double the width of a float. There is also a long double
type, at least as wide as a double but even bigger on architectures that
can support that.
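As with the integral types, sizeof will show the floating-point widths on your machine; on a typical x86-64 system this prints 4, 8, and 16 (the 16 bytes for long double often include padding around an 80-bit value):

#include <stdio.h>

int main()
{
    printf("float: %zu, double: %zu, long double: %zu\n",
           sizeof(float), sizeof(double), sizeof(long double));
}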
When in doubt, use double for real numbers.