Interesting quirk of union

I just got bit by this, and thought I might tell you all about it in the
hope that it might help someone else avoid the mistake. I don’t recall
seeing any mention of anything like it before, so it might be compiler
dependent.

The following struct should be a single byte, but on MSVC6, it becomes two
bytes. The first 5 bits are in the first byte, and the last 3 are in the
second byte. So I conclude that for MSVC6, at least, union has a minimum
sizeof() of 1.

struct
{
    unsigned char signal1:1;
    unsigned char signal2:1;
    unsigned char signal3:1;
    unsigned char signal4:1;
    unsigned char signal5:1;
    union
    {
        unsigned char signalGroup:3;
        struct
        {
            unsigned char signal6:1;
            unsigned char signal7:1;
            unsigned char signal8:1;
        };
    };
} struct1;
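
A minimal sketch of how the layout could be checked, assuming a C11 compiler that accepts unsigned char bit-fields (an implementation-defined extension). The type name `signals`, the union name `u`, and the function `report_signals_layout` are hypothetical; the union is named only so that `offsetof` can be applied to it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Same shape as the struct above, but with the inner union named `u`
   (a hypothetical name) so that offsetof can be applied to it. */
struct signals {
    unsigned char signal1:1;
    unsigned char signal2:1;
    unsigned char signal3:1;
    unsigned char signal4:1;
    unsigned char signal5:1;
    union {
        unsigned char signalGroup:3;
        struct {
            unsigned char signal6:1;
            unsigned char signal7:1;
            unsigned char signal8:1;
        };
    } u;
};

/* Prints the size of the whole struct and the offset where the union starts. */
void report_signals_layout(void)
{
    /* The union is an object of its own, so it cannot share a byte
       with the five bit-fields that precede it. */
    assert(offsetof(struct signals, u) >= 1);
    printf("sizeof(struct signals) = %zu\n", sizeof(struct signals));
    printf("offsetof(u)            = %zu\n", offsetof(struct signals, u));
}
```

Calling `report_signals_layout()` from `main` prints both numbers. The exact values are up to the compiler, but the union is not expected to start at offset 0.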

Hoping this helps someone else,

Phil
Phil Barila
Seagate Technology, LLC
(720) 684-1842

That is correct behavior. Refer to the C/C++ bibles, namely K&R. The minimum
size of any structure/union/enum is 1 byte.

Greg

But what about doing things in the opposite order?
Put the union first and have it contain two different
structs, each of them one byte in size, so the
union will also be only one byte.

union
{
    struct
    {
        unsigned char signal1:1;
        unsigned char signal2:1;
        unsigned char signal3:1;
        unsigned char signal4:1;
        unsigned char signal5:1;
        unsigned char signal6:1;
        unsigned char signal7:1;
        unsigned char signal8:1;
    };
    struct
    {
        unsigned char dummy:5;
        unsigned char signalGroup:3;
    };
} struct1;
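
A minimal sketch of how this suggestion might be tested, assuming a C11 compiler (for the anonymous structs) that accepts unsigned char bit-fields and allocates bit-fields in the same order in both structs. The names `signal_byte` and `group_overlays_signals` are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The union-first layout: two one-byte views of the same storage. */
union signal_byte {
    struct {
        unsigned char signal1:1;
        unsigned char signal2:1;
        unsigned char signal3:1;
        unsigned char signal4:1;
        unsigned char signal5:1;
        unsigned char signal6:1;
        unsigned char signal7:1;
        unsigned char signal8:1;
    };
    struct {
        unsigned char dummy:5;
        unsigned char signalGroup:3;
    };
};

/* Sets all three bits of signalGroup and returns nonzero if
   signal6, signal7 and signal8 read back as set. */
int group_overlays_signals(void)
{
    union signal_byte s;
    memset(&s, 0, sizeof s);
    s.signalGroup = 7;
    return s.signal6 && s.signal7 && s.signal8;
}
```

Writing 7 and checking all three bits keeps the test independent of whether the compiler allocates bit-fields from the low or the high end of the byte.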

Does it make sense?

Paul

Actually, the C++ bible is Stroustrup, and it neglects to mention the
minimum size requirement. Since I was using C++, I consider it an
oversight in the documentation.

Phil

I believe the issue isn’t one of minimum size, but one of alignment.
Harbison and Steele (Section 5.7) state that “An object of a union type
will begin on a storage alignment boundary appropriate for any contained
component”. So, the union aligns on a char boundary. I went to MSVC and
replaced all “char” with “long”, and, what do you know, the union is now at
offset +4 instead of +1. It’s a question of alignment, and it seems to be
standard C behavior.
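
A minimal sketch of the alignment effect, assuming a C11 compiler. It uses ordinary members rather than bit-fields so that the result does not depend on implementation-defined bit-field layout. The struct names and `report_union_offsets` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One leading byte, then a union whose members are all char. */
struct with_char_union {
    unsigned char lead;
    union { unsigned char a; unsigned char b; } u;
};

/* One leading byte, then a union that also contains a long. */
struct with_long_union {
    unsigned char lead;
    union { unsigned char a; unsigned long b; } u;
};

/* Prints where each union starts inside its enclosing struct. */
void report_union_offsets(void)
{
    /* A union must be aligned for its most strictly aligned member. */
    assert(offsetof(struct with_long_union, u) % _Alignof(unsigned long) == 0);
    printf("char union at offset %zu\n",
           offsetof(struct with_char_union, u));
    printf("long union at offset %zu (alignment of unsigned long: %zu)\n",
           offsetof(struct with_long_union, u), _Alignof(unsigned long));
}
```

The second offset depends on the platform's alignment for `unsigned long`, so it may be 4 or 8 rather than the +4 seen on MSVC6.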

Alberto.

Paul,

That does appear to be correct. I’m not going to test it, because I solved
the problem and moved on, but it looks correct.

Thanks for the suggestion.

Phil

I agree it is an alignment issue. MSDN says the following (Storage and
Alignment of Structures):

Every data object has an alignment-requirement. For structures, the
requirement is the largest of its members. Every object is allocated an
offset so that
offset % alignment-requirement == 0

and

Bit fields default to size long for the Microsoft C compiler. Structure
members are aligned on the size of the type or the /Zp[n] size, whichever is
smaller. The default size is 4.
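
The rule quoted above can be sketched as plain arithmetic. This is only an illustration of the quoted text, not the compiler's actual implementation; the function names `align_up`, `member_alignment` and `alignment_examples` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest offset >= `offset` for which offset % alignment == 0.
   `alignment` must be nonzero. */
size_t align_up(size_t offset, size_t alignment)
{
    size_t remainder = offset % alignment;
    return remainder == 0 ? offset : offset + (alignment - remainder);
}

/* Effective member alignment as the quoted text describes it:
   the smaller of the type's size and the /Zp[n] packing value. */
size_t member_alignment(size_t type_size, size_t pack)
{
    return type_size < pack ? type_size : pack;
}

/* Worked examples for a member that would otherwise start at offset 1. */
void alignment_examples(void)
{
    assert(align_up(1, member_alignment(1, 4)) == 1); /* 1-byte type */
    assert(align_up(1, member_alignment(4, 4)) == 4); /* 4-byte type */
    assert(align_up(1, member_alignment(4, 1)) == 1); /* 4-byte type, /Zp1 */
}
```

With a one-byte type the next free offset after the first byte is +1, and with a four-byte type it is +4, which matches the offsets reported earlier in the thread.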

Best regards,

Michal Vodicka
STMicroelectronics Design and Application s.r.o.
[michal.vodicka@st.com, http:://www.st.com]


The minimum size that C++ can allocate is a char. sizeof(char) is defined
as being 1. (expr.sizeof).

This unit is called a byte; a byte is the unit of memory capable of
storing any one character of the basic character set (intro.memory).

According to ANSI/ISO C++. Surely this constitutes the bible? Certainly
not K&R (K&R C stands a good chance of being illegal ANSI C and will
almost certainly be illegal ANSI C++).

Though as far as I can see, the basic character set often only requires
seven bits, not eight.
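
A minimal sketch, in C rather than C++, of how to see what a byte is on a given implementation. `report_byte_size` is a hypothetical name. Note that `<limits.h>` requires `CHAR_BIT` to be at least 8, whatever the basic character set itself needs.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Prints the size of char and the number of bits in a byte. */
void report_byte_size(void)
{
    /* sizeof(char) is 1 by definition; CHAR_BIT gives the width of that unit. */
    assert(sizeof(char) == 1);
    printf("sizeof(char) = %zu, CHAR_BIT = %d\n", sizeof(char), CHAR_BIT);
}
```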

Peter xxxxx@inkvine.fluff.org
http://www.inkvine.fluff.org/~peter/

logic kicks ass:
(1) Horses have an even number of legs.
(2) They have two legs in back and fore legs in front.
(3) This makes a total of six legs, which certainly is an odd number of
legs for a horse.
(4) But the only number that is both odd and even is infinity.
(5) Therefore, horses must have an infinite number of legs.

As is often the case in this field, it would appear that one needs to
aggregate the information in multiple references to get a complete picture.
As you say, it is aligned on a boundary appropriate for what it contains.
But mine contained nothing but bit-fields, so one could argue from that
definition that my implementation should work as expected. One also needs
to know that the minimum size is one byte to get the complete picture. In
my case, but probably not in general, it *was* an issue of size.

Thanks to all for the informative responses.

Phil

> Though as far as I can see, the basic character set often only requires
> seven bits, not eight.

In ancient times there were punched tapes with five holes
in a row. So those bytes were 5 bits wide.
Probably that character set is even more 'basic'?

:-)

However, explain this !

I did four trials.

=============================
Trial 1,

int main(void)
{
    struct
    {
        unsigned char a:1;
        unsigned char b:1;
        unsigned char c:1;
        unsigned char d:1;
        union
        {
            unsigned char x:3;   /* overlays f, g and h */
            struct
            {
                unsigned char f:1;
                unsigned char g:1;
                unsigned char h:1;
            };
        };
    } s;

    s.a = 1;
    s.f = 1;
    return 0;
}

In this trial, s.f compiles to _s$[ebp+1]. This looks ok.

Trial 2,

int main(void)
{
    struct
    {
        unsigned char a:1;
        unsigned char b:1;
        unsigned char c:1;
        unsigned char d:1;
        union
        {
            unsigned long x:3;   /* overlays f, g and h */
            struct
            {
                unsigned long f:1;
                unsigned long g:1;
                unsigned long h:1;
            };
        };
    } s;

    s.a = 1;
    s.f = 1;
    return 0;
}

In this trial, s.f compiles to _s$[ebp+4]. That also looks ok.

=============================
Trial 3,

int main(void)
{
    struct
    {
        unsigned char a:1;
        unsigned char b:1;
        unsigned char c:1;
        unsigned char d:1;
        union
        {
            unsigned char x:3;   /* overlays f, g and h */
            struct
            {
                unsigned long f:1;
                unsigned long g:1;
                unsigned long h:1;
            };
        };
    } s;

    s.a = 1;
    s.f = 1;
    return 0;
}

In this trial, s.f compiles to _s$[ebp+4]. This is consistent with the
alignment view.

However, Trial 4,

int main(void)
{
    struct
    {
        unsigned char a:1;
        unsigned char b:1;
        unsigned char c:1;
        unsigned char d:1;
        union
        {
            unsigned long x:3;   /* overlays f, g and h */
            struct
            {
                unsigned char f:1;
                unsigned char g:1;
                unsigned char h:1;
            };
        };
    } s;

    s.a = 1;
    s.f = 1;
    return 0;
}

In this trial, s.f compiles to _s$[ebp+1] again ! So, I’m not too sure about
the rationale any longer. Can anyone explain what’s going on ?
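
A minimal sketch for comparing the four trials on another compiler without reading the generated assembly. It assumes a C11 compiler that accepts unsigned char and unsigned long bit-fields; the macro `DEFINE_TRIAL`, the union name `u` and `report_trial_offsets` are hypothetical, and the union is named only so that `offsetof` can be used. Bit-field layout is implementation-defined, so the numbers need not match the MSVC6 results above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Declares a struct shaped like the trials above: four 1-bit fields,
   then a union of a 3-bit field of type XT and a struct of three
   1-bit fields of type FT. */
#define DEFINE_TRIAL(name, XT, FT)            \
    struct name {                             \
        unsigned char a:1, b:1, c:1, d:1;     \
        union {                               \
            XT x:3;                           \
            struct { FT f:1, g:1, h:1; };     \
        } u;                                  \
    }

DEFINE_TRIAL(trial1, unsigned char, unsigned char);
DEFINE_TRIAL(trial2, unsigned long, unsigned long);
DEFINE_TRIAL(trial3, unsigned char, unsigned long);
DEFINE_TRIAL(trial4, unsigned long, unsigned char);

/* Prints the offset of the union in each of the four variants. */
void report_trial_offsets(void)
{
    /* The union can never share the byte holding a, b, c and d. */
    assert(offsetof(struct trial1, u) >= 1);
    printf("trial 1: union at +%zu\n", offsetof(struct trial1, u));
    printf("trial 2: union at +%zu\n", offsetof(struct trial2, u));
    printf("trial 3: union at +%zu\n", offsetof(struct trial3, u));
    printf("trial 4: union at +%zu\n", offsetof(struct trial4, u));
}
```

If trial 4 reports a different offset here than the +1 seen on MSVC6, that would suggest the behavior is specific to how that compiler treats bit-field types when computing a union's alignment.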

Alberto.

Phil,

Your fields were defined as “unsigned char”, hence the +1. If you change it
all to “unsigned long”, you get +4, even though they’re still bit fields !

Alberto.

> ----------
> From: xxxxx@compuware.com[SMTP:xxxxx@compuware.com]
> Reply To: xxxxx@lists.osr.com
> Sent: Friday, March 08, 2002 10:42 PM
> To: xxxxx@lists.osr.com
> Subject: [ntdev] RE: Interesting quirk of union
>
> However, Trial 4,
>
> int main(void)
> {
>     struct
>     {
>         unsigned char a:1;
>         unsigned char b:1;
>         unsigned char c:1;
>         unsigned char d:1;
>         union
>         {
>             unsigned long x:3;
>             struct
>             {
>                 unsigned char f:1;
>                 unsigned char g:1;
>                 unsigned char h:1;
>             };
>         };
>     } s;
>
>     s.a = 1;
>     s.f = 1;
>     return 0;
> }
>
> In this trial, s.f compiles to _s$[ebp+1] again ! So, I’m not too sure about
> the rationale any longer. Can anyone explain what’s going on ?

The docs say:

The following guidelines ensure proper alignment for processors targeted by
Win32:

    structures   Largest alignment requirement of any member
    unions       Alignment requirement of the first member

so it should be 4… I tried changing x:3 to x:32 and the result is the
same. When changed to unsigned long (no bit field), it is 4. It seems
bit fields are handled differently. But the docs also say:

Unnamed bit fields with base type long, short, or char (signed or unsigned)
force alignment to a boundary appropriate to the base type.

Compiler or docs bug?

Best regards,

Michal Vodicka

> ----------
> From: xxxxx@veridicom.cz.nospam[SMTP:xxxxx@veridicom.cz.nospam]
> Reply To: xxxxx@lists.osr.com
> Sent: Friday, March 08, 2002 11:58 PM
> To: xxxxx@lists.osr.com
> Subject: [ntdev] RE: Interesting quirk of union
>
> Unnamed bit fields with base type long, short, or char (signed or unsigned)
> force alignment to a boundary appropriate to the base type.
>
> Compiler or docs bug?

Nope, I missed the word “unnamed” :-(
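
A minimal sketch of what an unnamed bit-field does, assuming a C11 compiler that accepts unsigned char and unsigned long bit-field types. It uses the zero-width form, which standard C defines as ending the current storage unit; the struct names and `report_bitfield_sizes` are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Two named bit-fields: normally packed into the same storage unit. */
struct packed_bits {
    unsigned char a:1;
    unsigned char b:1;
};

/* An unnamed zero-width bit-field between them: b must start a new unit. */
struct split_bits {
    unsigned char a:1;
    unsigned long :0;
    unsigned char b:1;
};

/* Prints the size of each struct so the effect of the unnamed field shows. */
void report_bitfield_sizes(void)
{
    assert(sizeof(struct split_bits) > sizeof(struct packed_bits));
    printf("sizeof(struct packed_bits) = %zu\n", sizeof(struct packed_bits));
    printf("sizeof(struct split_bits)  = %zu\n", sizeof(struct split_bits));
}
```

How far `b` is pushed out depends on the compiler and on the base type of the unnamed field, so only the relative sizes are checked here.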

Best regards,

Michal Vodicka