In computer science, a union is a value that may have any of several representations or formats within the same position in memory, or a data structure that consists of a variable that may hold such a value. Some programming languages support special data types, called union types, to describe such values and variables.
In other words, a union type definition specifies which of a number of permitted primitive types may be stored in its instances, e.g., "float or long integer". In contrast with a record (or structure), which could be defined to contain both a float and an integer, a union holds only one value at any given time.
A union can be pictured as a chunk of memory that is used to store variables of different data types. Once a new value is assigned to a field, the existing data is overwritten with the new data. The memory area storing the value has no intrinsic type (other than just bytes or words of memory), but the value can be treated as one of several abstract data types, having the type of the value that was last written to the memory area.
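The overwriting behaviour described above can be sketched in C. The union name `number` and the function `float_write_overwrites_long` are hypothetical names chosen for illustration, and the sketch assumes the usual IEEE 754 representation of `float`:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical union for illustration: a float or a long integer
   sharing the same storage. */
union number {
    float f;
    long  l;
};

/* Stores a long, then a float, in the same union object and compares
   the raw bytes before and after. Returns 1 if the bytes changed. */
int float_write_overwrites_long(void)
{
    union number n;
    unsigned char before[sizeof n], after[sizeof n];

    /* The union is at least as large as its largest member. */
    assert(sizeof n >= sizeof(long));

    memset(&n, 0, sizeof n);
    n.l = 42L;                      /* the long is the value last written */
    memcpy(before, &n, sizeof n);

    n.f = 1.5f;                     /* reuses the same bytes for a float */
    memcpy(after, &n, sizeof n);

    return memcmp(before, after, sizeof n) != 0;
}
```

After the second assignment only `n.f` holds a meaningful value; the bytes that previously encoded 42 have been replaced.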
Syntax and example
In C, the syntax is:
union <name>
{
<datatype> <1st variable name>;
<datatype> <2nd variable name>;
.
.
.
<datatype> <nth variable name>;
} <union variable name>;
A structure can also be a member of a union, as the following example shows:
union name1
{
struct name2
{
int a;
float b;
char c;
} svar;
int d;
} uvar;
This example defines a variable uvar as a union (tagged as name1), which contains two members: a structure (tagged as name2) named svar (which in turn contains three members), and an integer variable named d.
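As a minimal sketch of how these members might be accessed, the declaration is repeated below with two helper functions. The function names `sum_struct_fields` and `store_d` are hypothetical and are not part of the original example:

```c
#include <assert.h>

union name1 {
    struct name2 {
        int a;
        float b;
        char c;
    } svar;
    int d;
} uvar;

/* Writes the three fields of the structure member, then reads two back. */
int sum_struct_fields(void)
{
    /* The union must be able to hold its largest member. */
    assert(sizeof uvar >= sizeof uvar.svar);

    uvar.svar.a = 3;
    uvar.svar.b = 2.0f;
    uvar.svar.c = 'x';
    return uvar.svar.a + (int)uvar.svar.b;
}

/* Writes the integer member; this reuses storage that svar occupied,
   so svar should no longer be relied on afterwards. */
int store_d(int value)
{
    uvar.d = value;
    return uvar.d;
}
```

Nested members are reached by chaining the member operator, as in `uvar.svar.a`, while `uvar.d` names the other alternative directly.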
Unions may occur within structures and arrays, and vice versa:
struct
{
int flags;
char *name;
int utype;
union {
int ival;
float fval;
char *sval;
} u;
} symtab[NSYM];
The number ival is referred to as symtab[i].u.ival and the first character of string sval by either of *symtab[i].u.sval or symtab[i].u.sval[0].
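One plausible use of the utype field is to record which member of u is currently valid, so that code reading the table knows which member to access. The sketch below assumes that interpretation. The tag constants `UT_INT`, `UT_FLOAT` and `UT_STRING`, the value chosen for `NSYM`, and the function `format_entry` are all hypothetical, since the original declaration does not define them:

```c
#include <assert.h>
#include <stdio.h>

#define NSYM 4   /* assumed table size; the original leaves NSYM undefined */

/* Hypothetical tag values for utype. */
enum { UT_INT, UT_FLOAT, UT_STRING };

struct {
    int flags;
    char *name;
    int utype;          /* assumed to say which member of u is valid */
    union {
        int ival;
        float fval;
        char *sval;
    } u;
} symtab[NSYM];

/* Formats entry i into buf according to its tag.
   Returns the snprintf result, or -1 for an unknown tag. */
int format_entry(int i, char *buf, size_t size)
{
    assert(i >= 0 && i < NSYM);

    switch (symtab[i].utype) {
    case UT_INT:
        return snprintf(buf, size, "%d", symtab[i].u.ival);
    case UT_FLOAT:
        return snprintf(buf, size, "%.2f", symtab[i].u.fval);
    case UT_STRING:
        return snprintf(buf, size, "%s", symtab[i].u.sval);
    default:
        return -1;
    }
}
```

Reading only the member selected by the tag avoids interpreting the stored bytes as the wrong type.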
Difference between union and structure
As an example illustrating the difference, the declaration
struct foo { int a; float b; };
defines a data object with two members occupying consecutive memory locations:
                 ┌─────┬─────┐
            foo  │  a  │  b  │
                 └─────┴─────┘
                 ↑     ↑
Memory address:  0150  0154
In contrast, the declaration
union bar { int a; float b; };
defines a data object with two members occupying the same memory location:
                 ┌─────┐
            bar  │  a  │
                 │  b  │
                 └─────┘
                 ↑
Memory address:  0150
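The two layouts can be checked at run time by measuring where member b starts in each object. This is a sketch only: the helper names `foo_b_offset`, `bar_b_offset` and `check_layout` are hypothetical, and the exact sizes and any padding depend on the platform:

```c
#include <assert.h>
#include <stddef.h>

struct foo { int a; float b; };
union  bar { int a; float b; };

/* Byte offset of member b from the start of a struct foo object. */
size_t foo_b_offset(void)
{
    struct foo s;
    return (size_t)((char *)&s.b - (char *)&s);
}

/* Byte offset of member b from the start of a union bar object. */
size_t bar_b_offset(void)
{
    union bar u;
    return (size_t)((char *)&u.b - (char *)&u);
}

/* Checks the layout properties that hold on any conforming platform. */
void check_layout(void)
{
    /* In the structure, b is placed after a. */
    assert(foo_b_offset() >= sizeof(int));
    /* In the union, every member starts at the beginning of the object. */
    assert(bar_b_offset() == 0);
    /* The structure needs room for both members, the union for the largest. */
    assert(sizeof(struct foo) >= sizeof(int) + sizeof(float));
    assert(sizeof(union bar) >= sizeof(float));
}
```

On a typical platform with a 4-byte int and a 4-byte float, b would start 4 bytes into the structure, matching the addresses 0150 and 0154 in the diagram, while both union members start at the same address.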