[0/6] Support use of -fstrict-flex-arrays in user code

Message ID 20230710150454.5490-1-cristian@rodriguez.im
Headers
Series Support use of -fstrict-flex-arrays in user code |

Message

Cristian Rodríguez July 10, 2023, 3:04 p.m. UTC
  Currently GCC 13  -fstrict-flex-arrays cannot be used safely with user
code because the library makes use of pre-c99 flexible arrays.

This series introduces a macro to annotate such uses and grandfathers-in
this legacy use.

This series only cover the user-visible *library* types. the linux kernel types
are subject to a different patch set and discussion since the kernel has
moved on to c11 leaving pre-c99 compatibilty behind.


Cristian Rodríguez (6):
  Introduce __decl_is_flex_array macro
  dlfcn : annotate dls_serpath with __decl_is_flex_array
  inet: annotate  tu_data and tu_stuff with __decl_is_flex_array
  inet: annotate ip6r0_addr with __decl_is_flex_array
  io: annotate fts_name with __decl_is_flex_array
  misc: annotate q_data with __decl_is_flex_array

 dlfcn/dlfcn.h      | 2 +-
 inet/arpa/tftp.h   | 6 ++++--
 inet/netinet/ip6.h | 2 +-
 io/fts.h           | 4 ++--
 misc/search.h      | 2 +-
 misc/sys/cdefs.h   | 6 ++++++
 6 files changed, 15 insertions(+), 7 deletions(-)
  

Comments

Cristian Rodríguez July 19, 2023, 12:40 p.m. UTC | #1
On Mon, Jul 10, 2023 at 11:05 AM Cristian Rodríguez <cristian@rodriguez.im>
wrote:

> Currently GCC 13  -fstrict-flex-arrays cannot be used safely with user
> code because the library makes use of pre-c99 flexible arrays.
>
> This series introduces a macro to annotate such uses and grandfathers-in
> this legacy use.
>
> This series only cover the user-visible *library* types. the linux kernel
> types
> are subject to a different patch set and discussion since the kernel has
> moved on to c11 leaving pre-c99 compatibilty behind.
>
>
>
>
Any comments about this series.. ? as is it should introduce no
regressions.
  
Adhemerval Zanella Netto July 19, 2023, 2:27 p.m. UTC | #2
On 19/07/23 09:40, Cristian Rodríguez via Libc-alpha wrote:
> On Mon, Jul 10, 2023 at 11:05 AM Cristian Rodríguez <cristian@rodriguez.im>
> wrote:
> 
>> Currently GCC 13  -fstrict-flex-arrays cannot be used safely with user
>> code because the library makes use of pre-c99 flexible arrays.
>>
>> This series introduces a macro to annotate such uses and grandfathers-in
>> this legacy use.
>>
>> This series only cover the user-visible *library* types. the linux kernel
>> types
>> are subject to a different patch set and discussion since the kernel has
>> moved on to c11 leaving pre-c99 compatibilty behind.
>>
>>
>>
>>
> Any comments about this series.. ? as is it should introduce no
> regressions.


Can't we use the already define __flexarr/__glibc_c99_flexarr_available macro 
instead? Or do we really need to handle the usage of  -fstrict-flex-arrays
with pre-c99 usage?
  
Cristian Rodríguez July 19, 2023, 2:57 p.m. UTC | #3
On Wed, Jul 19, 2023 at 10:27 AM Adhemerval Zanella Netto <
adhemerval.zanella@linaro.org> wrote:

>
>
> On 19/07/23 09:40, Cristian Rodríguez via Libc-alpha wrote:
> > On Mon, Jul 10, 2023 at 11:05 AM Cristian Rodríguez <
> cristian@rodriguez.im>
> > wrote:
> >
> >> Currently GCC 13  -fstrict-flex-arrays cannot be used safely with user
> >> code because the library makes use of pre-c99 flexible arrays.
> >>
> >> This series introduces a macro to annotate such uses and grandfathers-in
> >> this legacy use.
> >>
> >> This series only cover the user-visible *library* types. the linux
> kernel
> >> types
> >> are subject to a different patch set and discussion since the kernel has
> >> moved on to c11 leaving pre-c99 compatibilty behind.
> >>
> >>
> >>
> >>
> > Any comments about this series.. ? as is it should introduce no
> > regressions.
>
>
> Can't we use the already define __flexarr/__glibc_c99_flexarr_available
> macro
> instead?


If I did only that, sizeof public structures may change causing an abi
break.
if breaking the ABi is an option then one can switch completely to c99
flexible arrays and using/importing DECL_FLEX_ARRAY macro from the linux
kernel and move on.

I concluded it was better to grandfather them in and leave them as special
cases is the less painful solution for user visible parts, internals can
just  use standard flex arrays.
  
Adhemerval Zanella Netto July 19, 2023, 5:48 p.m. UTC | #4
On 19/07/23 11:57, Cristian Rodríguez wrote:
> 
> 
> On Wed, Jul 19, 2023 at 10:27 AM Adhemerval Zanella Netto <adhemerval.zanella@linaro.org <mailto:adhemerval.zanella@linaro.org>> wrote:
> 
> 
> 
>     On 19/07/23 09:40, Cristian Rodríguez via Libc-alpha wrote:
>     > On Mon, Jul 10, 2023 at 11:05 AM Cristian Rodríguez <cristian@rodriguez.im <mailto:cristian@rodriguez.im>>
>     > wrote:
>     >
>     >> Currently GCC 13  -fstrict-flex-arrays cannot be used safely with user
>     >> code because the library makes use of pre-c99 flexible arrays.
>     >>
>     >> This series introduces a macro to annotate such uses and grandfathers-in
>     >> this legacy use.
>     >>
>     >> This series only cover the user-visible *library* types. the linux kernel
>     >> types
>     >> are subject to a different patch set and discussion since the kernel has
>     >> moved on to c11 leaving pre-c99 compatibilty behind.
>     >>
>     >>
>     >>
>     >>
>     > Any comments about this series.. ? as is it should introduce no
>     > regressions.
> 
> 
>     Can't we use the already define __flexarr/__glibc_c99_flexarr_available macro
>     instead? 
> 
> 
> If I did only that, sizeof public structures may change causing an abi break. 
> if breaking the ABi is an option then one can switch completely to c99 flexible arrays and using/importing DECL_FLEX_ARRAY macro from the linux kernel and move on.
> 
> I concluded it was better to grandfather them in and leave them as special cases is the less painful solution for user visible parts, internals can just  use standard flex arrays.
>  

Fair enough, it is unfortunate that we had to use old flexible array syntax
on public headers.
  
Cristian Rodríguez Aug. 11, 2023, 3:34 p.m. UTC | #5
On Wed, Jul 19, 2023 at 1:48 PM Adhemerval Zanella Netto <
adhemerval.zanella@linaro.org> wrote:

>
> Fair enough, it is unfortunate that we had to use old flexible array syntax
> on public headers.
>

Any hope on getting this merged or get other feedback ?
  
Paul Eggert Aug. 11, 2023, 10:10 p.m. UTC | #6
On 2023-07-19 07:57, Cristian Rodríguez via Libc-alpha wrote:
>> Can't we use the already define __flexarr/__glibc_c99_flexarr_available
>> macro
>> instead?
> 
> If I did only that, sizeof public structures may change causing an abi
> break.

Why would it cause an ABI break?

For example, no program's machine code should depend on sizeof (FTSENT). 
fts.h's FTSENT is supposed to be an incomplete data type, like FILE: the 
system allocates FTSENT objects, not user code. So I don't see the harm 
in replacing [1] with [] in the definition of FTSENT's fts_name member, 
if the compiler supports flex arrays.
  
Alejandro Colomar Aug. 12, 2023, 11:07 a.m. UTC | #7
Hi Paul, Cristian,

On 2023-08-12 00:10, Paul Eggert wrote:
> On 2023-07-19 07:57, Cristian Rodríguez via Libc-alpha wrote:
>>> Can't we use the already define __flexarr/__glibc_c99_flexarr_available
>>> macro
>>> instead?
>>
>> If I did only that, sizeof public structures may change causing an abi
>> break.
> 
> Why would it cause an ABI break?
> 
> For example, no program's machine code should depend on sizeof (FTSENT).

+1


> fts.h's FTSENT is supposed to be an incomplete data type, like FILE: the 
> system allocates FTSENT objects, not user code. So I don't see the harm 
> in replacing [1] with [] in the definition of FTSENT's fts_name member, 
> if the compiler supports flex arrays.
> 

I'll raise the bid, and claim that no valid program can depend on
sizeof() of any struct with a flexible array member, of any kind (old
`[1]`, zero-length `[0]`, nor standard `[]`).

It is conceptually wrong to calculate sizeof() it, and the right thing to
do is to call offseof(s, fam).

In the best case, sizeof() will return the same as offsetof(3), and the
program will happen to work.  But in some cases, it will return a value
slightly bigger (ISO C guarantees that it will not be smaller, for no
reason I think), and in those cases, the program must have a bug.  That
bug can either be benign, and the program will just be wasting a few
bytes, or it can be bad, and read memory a few bytes off from where it
was written (still within the allocation bounds, so it won't crash; at
least not directly due to that, but maybe by a secondary bug).

See this GCC thread where I propose adding a warning when
sizeof(flexi_struct) is used in arithmetics.  There's an example program
showing why sizeof() is (almost) always a bad thing with these structures.
In the only use case where sizeof() is valid, reducing it wouldn't
change program behavior (it's within a malloc(MAX(sizeof(s), ...));).

Cheers,
Alex
  
Alejandro Colomar Aug. 12, 2023, 1:29 p.m. UTC | #8
On 2023-08-12 13:07, Alejandro Colomar wrote:
> Hi Paul, Cristian,
> 
> On 2023-08-12 00:10, Paul Eggert wrote:
>> On 2023-07-19 07:57, Cristian Rodríguez via Libc-alpha wrote:
>>>> Can't we use the already define __flexarr/__glibc_c99_flexarr_available
>>>> macro
>>>> instead?
>>>
>>> If I did only that, sizeof public structures may change causing an abi
>>> break.
>>
>> Why would it cause an ABI break?
>>
>> For example, no program's machine code should depend on sizeof (FTSENT).
> 
> +1
> 
> 
>> fts.h's FTSENT is supposed to be an incomplete data type, like FILE: the 
>> system allocates FTSENT objects, not user code. So I don't see the harm 
>> in replacing [1] with [] in the definition of FTSENT's fts_name member, 
>> if the compiler supports flex arrays.
>>
> 
> I'll raise the bid, and claim that no valid program can depend on
> sizeof() of any struct with a flexible array member, of any kind (old
> `[1]`, zero-length `[0]`, nor standard `[]`).
> 
> It is conceptually wrong to calculate sizeof() it, and the right thing to
> do is to call offseof(s, fam).
> 
> In the best case, sizeof() will return the same as offsetof(3), and the
> program will happen to work.  But in some cases, it will return a value
> slightly bigger (ISO C guarantees that it will not be smaller, for no
> reason I think), and in those cases, the program must have a bug.  That
> bug can either be benign, and the program will just be wasting a few
> bytes, or it can be bad, and read memory a few bytes off from where it
> was written (still within the allocation bounds, so it won't crash; at
> least not directly due to that, but maybe by a secondary bug).
> 
> See this GCC thread where I propose adding a warning when

Oops, I forgot to paste the link.

<https://inbox.sourceware.org/gcc/dac8afb7-5026-c702-85d2-c3ad977d9a48@kernel.org/T/>

> sizeof(flexi_struct) is used in arithmetics.  There's an example program
> showing why sizeof() is (almost) always a bad thing with these structures.
> In the only use case where sizeof() is valid, reducing it wouldn't
> change program behavior (it's within a malloc(MAX(sizeof(s), ...));).
> 
> Cheers,
> Alex
>
  
Cristian Rodríguez Aug. 12, 2023, 2:53 p.m. UTC | #9
On Fri, Aug 11, 2023 at 6:10 PM Paul Eggert <eggert@cs.ucla.edu> wrote:

> On 2023-07-19 07:57, Cristian Rodríguez via Libc-alpha wrote:
> >> Can't we use the already define __flexarr/__glibc_c99_flexarr_available
> >> macro
> >> instead?
> >
> > If I did only that, sizeof public structures may change causing an abi
> > break.
>
> Why would it cause an ABI break?
>
> For example, no program's machine code should depend on sizeof (FTSENT).
> fts.h's FTSENT is supposed to be an incomplete data type, like FILE: the
> system allocates FTSENT objects, not user code. So I don't see the harm
> in replacing [1] with [] in the definition of FTSENT's fts_name member,
> if the compiler supports flex arrays.
>

I , unsurprisingly, agree with your rationale. been just extremely
conservative to avoid breaking even wrong programs..all this would be
easier if there was a way to "error: sizeof cannot be used correctly on
struct blabla"  and fail compile in the first place.
  
Paul Eggert Aug. 12, 2023, 6:20 p.m. UTC | #10
On 2023-08-12 07:53, Cristian Rodríguez wrote:
> I , unsurprisingly, agree with your rationale.

Then let's do that instead.

> been just extremely
> conservative to avoid breaking even wrong programs.

I can't offhand see what plausible-but-wrong program would break.

One can imagine implausible scenarios where these invalid programs would 
break. But they're so unlikely we needn't worry about them.