place `const volatile' objects in read-only sections

Message ID 87sfmbxyyj.fsf@oracle.com
State New
Headers
Series place `const volatile' objects in read-only sections |

Commit Message

Jose E. Marchesi Aug. 5, 2022, 1:26 a.m. UTC
  Hi people!

First of all, a bit of context.

It is common for C BPF programs to use variables that are implicitly set
by the underlying BPF machinery and not by the program itself.  It is
also necessary for these variables to be stored in read-only storage so
the BPF verifier recognizes them as such.  This leads to declarations
using both `const' and `volatile' qualifiers, like this:

  const volatile unsigned char is_allow_list = 0;

Where `volatile' is used to avoid the compiler to optimize out the
variable, or turn it into a constant, and `const' to make sure it is
placed in .rodata.

Now, it happens that:

- GCC places `const volatile' objects in the .data section, under the
  assumption that `volatile' somehow voids the `const'.

- LLVM places `const volatile' objects in .rodata, under the
  assumption that `volatile' is orthogonal to `const'.

So there is a divergence, and this divergence has practical
consequences: it makes BPF programs compiled with GCC to not work
properly.

When looking into this, I found this bugzilla:

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25521
  "change semantics of const volatile variables"

which was filed back in 2005.  This report was already asking to put
`const volatile' objects in .rodata, questioning the current behavior.

While discussing this in the #gcc IRC channel I was pointed out to the
following excerpt from the C18 spec:

   6.7.3 Type qualifiers / 5 The properties associated with qualified
         types are meaningful only for expressions that are
         lval-values [note 135]


   135) The implementation may place a const object that is not
        volatile in a read-only region of storage. Moreover, the
        implementation need not allocate storage for such an object if
        its $ address is never used.

This footnote may be interpreted as if const objects that are volatile
shouldn't be put in read-only storage.  Even if I was not very convinced
of that interpretation (see my earlier comment in BZ 25521) I filed the
following issue in the LLVM tracker in order to discuss the matter:

  https://github.com/llvm/llvm-project/issues/56468

As you can see, Aaron Ballman, one of the LLVM hackers, asked the WG14
reflectors about this.  He reported back that the reflectors consider
footnote 135 has not normative value.

So, not having a normative mandate on either direction, there are two
options:

a) To change GCC to place `const volatile' objects in .rodata instead
   of .data.

b) To change LLVM to place `const volatile' objects in .data instead
   of .rodata.

Considering that:

- One target (bpf-unknown-none) breaks with the current GCC behavior.

- No target/platform relies on the GCC behavior, that we know.  (And it
  is unlikely there is any, at least for targets also supported by
  LLVM.)

- Changing the LLVM behavior at this point would be very severely
  traumatic for the BPF people and their users.

I think the right thing to do is a).
Therefore this patch.

A note about the patch itself:

I am not that familiar with the middle-end and in this patch I am
assuming that a `var|constructor + SIDE_EFFECTS' is the result of
`volatile' (or an equivalent language construction) and nothing else.
It would be good if some middle-end wizard could confirm this.

Regtested in x86_64-linux-gnu and bpf-unknown-none.
No regressions observed.

gcc/ChangeLog:

	PR middle-end/25521
	* varasm.cc (categorize_decl_for_section): Place `const volatile'
	objects in read-only sections.
	(default_select_section): Likewise.
---
 gcc/varasm.cc | 3 ---
 1 file changed, 3 deletions(-)
  

Comments

Richard Biener Aug. 5, 2022, 6:52 a.m. UTC | #1
On Fri, Aug 5, 2022 at 3:27 AM Jose E. Marchesi via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
>
> Hi people!
>
> First of all, a bit of context.
>
> It is common for C BPF programs to use variables that are implicitly set
> by the underlying BPF machinery and not by the program itself.  It is
> also necessary for these variables to be stored in read-only storage so
> the BPF verifier recognizes them as such.  This leads to declarations
> using both `const' and `volatile' qualifiers, like this:
>
>   const volatile unsigned char is_allow_list = 0;
>
> Where `volatile' is used to avoid the compiler to optimize out the
> variable, or turn it into a constant, and `const' to make sure it is
> placed in .rodata.
>
> Now, it happens that:
>
> - GCC places `const volatile' objects in the .data section, under the
>   assumption that `volatile' somehow voids the `const'.
>
> - LLVM places `const volatile' objects in .rodata, under the
>   assumption that `volatile' is orthogonal to `const'.
>
> So there is a divergence, and this divergence has practical
> consequences: it makes BPF programs compiled with GCC to not work
> properly.
>
> When looking into this, I found this bugzilla:
>
>   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25521
>   "change semantics of const volatile variables"
>
> which was filed back in 2005.  This report was already asking to put
> `const volatile' objects in .rodata, questioning the current behavior.
>
> While discussing this in the #gcc IRC channel I was pointed out to the
> following excerpt from the C18 spec:
>
>    6.7.3 Type qualifiers / 5 The properties associated with qualified
>          types are meaningful only for expressions that are
>          lval-values [note 135]
>
>
>    135) The implementation may place a const object that is not
>         volatile in a read-only region of storage. Moreover, the
>         implementation need not allocate storage for such an object if
>         its $ address is never used.
>
> This footnote may be interpreted as if const objects that are volatile
> shouldn't be put in read-only storage.  Even if I was not very convinced
> of that interpretation (see my earlier comment in BZ 25521) I filed the
> following issue in the LLVM tracker in order to discuss the matter:
>
>   https://github.com/llvm/llvm-project/issues/56468
>
> As you can see, Aaron Ballman, one of the LLVM hackers, asked the WG14
> reflectors about this.  He reported back that the reflectors consider
> footnote 135 has not normative value.
>
> So, not having a normative mandate on either direction, there are two
> options:
>
> a) To change GCC to place `const volatile' objects in .rodata instead
>    of .data.
>
> b) To change LLVM to place `const volatile' objects in .data instead
>    of .rodata.
>
> Considering that:
>
> - One target (bpf-unknown-none) breaks with the current GCC behavior.
>
> - No target/platform relies on the GCC behavior, that we know.  (And it
>   is unlikely there is any, at least for targets also supported by
>   LLVM.)
>
> - Changing the LLVM behavior at this point would be very severely
>   traumatic for the BPF people and their users.
>
> I think the right thing to do is a).
> Therefore this patch.
>
> A note about the patch itself:
>
> I am not that familiar with the middle-end and in this patch I am
> assuming that a `var|constructor + SIDE_EFFECTS' is the result of
> `volatile' (or an equivalent language construction) and nothing else.
> It would be good if some middle-end wizard could confirm this.

Yes, for decls that sounds correct.  For a CTOR it just means
re-evaluation is not safe.

> Regtested in x86_64-linux-gnu and bpf-unknown-none.
> No regressions observed.

I think this warrants a testcase.

I'm not sure I agree about the whole thing though, I'm leaving it
to Joseph.

> gcc/ChangeLog:
>
>         PR middle-end/25521
>         * varasm.cc (categorize_decl_for_section): Place `const volatile'
>         objects in read-only sections.
>         (default_select_section): Likewise.
> ---
>  gcc/varasm.cc | 3 ---
>  1 file changed, 3 deletions(-)
>
> diff --git a/gcc/varasm.cc b/gcc/varasm.cc
> index 4db8506b106..7864db11faf 100644
> --- a/gcc/varasm.cc
> +++ b/gcc/varasm.cc
> @@ -6971,7 +6971,6 @@ default_select_section (tree decl, int reloc,
>      {
>        if (! ((flag_pic && reloc)
>              || !TREE_READONLY (decl)
> -            || TREE_SIDE_EFFECTS (decl)
>              || !TREE_CONSTANT (decl)))
>         return readonly_data_section;
>      }
> @@ -7005,7 +7004,6 @@ categorize_decl_for_section (const_tree decl, int reloc)
>        if (bss_initializer_p (decl))
>         ret = SECCAT_BSS;
>        else if (! TREE_READONLY (decl)
> -              || TREE_SIDE_EFFECTS (decl)
>                || (DECL_INITIAL (decl)
>                    && ! TREE_CONSTANT (DECL_INITIAL (decl))))
>         {
> @@ -7046,7 +7044,6 @@ categorize_decl_for_section (const_tree decl, int reloc)
>    else if (TREE_CODE (decl) == CONSTRUCTOR)
>      {
>        if ((reloc & targetm.asm_out.reloc_rw_mask ())
> -         || TREE_SIDE_EFFECTS (decl)
>           || ! TREE_CONSTANT (decl))
>         ret = SECCAT_DATA;
>        else
> --
> 2.30.2
>
  
Jose E. Marchesi Aug. 5, 2022, 8:26 a.m. UTC | #2
Hi Richard.

> On Fri, Aug 5, 2022 at 3:27 AM Jose E. Marchesi via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>>
>> Hi people!
>>
>> First of all, a bit of context.
>>
>> It is common for C BPF programs to use variables that are implicitly set
>> by the underlying BPF machinery and not by the program itself.  It is
>> also necessary for these variables to be stored in read-only storage so
>> the BPF verifier recognizes them as such.  This leads to declarations
>> using both `const' and `volatile' qualifiers, like this:
>>
>>   const volatile unsigned char is_allow_list = 0;
>>
>> Where `volatile' is used to avoid the compiler to optimize out the
>> variable, or turn it into a constant, and `const' to make sure it is
>> placed in .rodata.
>>
>> Now, it happens that:
>>
>> - GCC places `const volatile' objects in the .data section, under the
>>   assumption that `volatile' somehow voids the `const'.
>>
>> - LLVM places `const volatile' objects in .rodata, under the
>>   assumption that `volatile' is orthogonal to `const'.
>>
>> So there is a divergence, and this divergence has practical
>> consequences: it makes BPF programs compiled with GCC to not work
>> properly.
>>
>> When looking into this, I found this bugzilla:
>>
>>   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25521
>>   "change semantics of const volatile variables"
>>
>> which was filed back in 2005.  This report was already asking to put
>> `const volatile' objects in .rodata, questioning the current behavior.
>>
>> While discussing this in the #gcc IRC channel I was pointed out to the
>> following excerpt from the C18 spec:
>>
>>    6.7.3 Type qualifiers / 5 The properties associated with qualified
>>          types are meaningful only for expressions that are
>>          lval-values [note 135]
>>
>>
>>    135) The implementation may place a const object that is not
>>         volatile in a read-only region of storage. Moreover, the
>>         implementation need not allocate storage for such an object if
>>         its $ address is never used.
>>
>> This footnote may be interpreted as if const objects that are volatile
>> shouldn't be put in read-only storage.  Even if I was not very convinced
>> of that interpretation (see my earlier comment in BZ 25521) I filed the
>> following issue in the LLVM tracker in order to discuss the matter:
>>
>>   https://github.com/llvm/llvm-project/issues/56468
>>
>> As you can see, Aaron Ballman, one of the LLVM hackers, asked the WG14
>> reflectors about this.  He reported back that the reflectors consider
>> footnote 135 has not normative value.
>>
>> So, not having a normative mandate on either direction, there are two
>> options:
>>
>> a) To change GCC to place `const volatile' objects in .rodata instead
>>    of .data.
>>
>> b) To change LLVM to place `const volatile' objects in .data instead
>>    of .rodata.
>>
>> Considering that:
>>
>> - One target (bpf-unknown-none) breaks with the current GCC behavior.
>>
>> - No target/platform relies on the GCC behavior, that we know.  (And it
>>   is unlikely there is any, at least for targets also supported by
>>   LLVM.)
>>
>> - Changing the LLVM behavior at this point would be very severely
>>   traumatic for the BPF people and their users.
>>
>> I think the right thing to do is a).
>> Therefore this patch.
>>
>> A note about the patch itself:
>>
>> I am not that familiar with the middle-end and in this patch I am
>> assuming that a `var|constructor + SIDE_EFFECTS' is the result of
>> `volatile' (or an equivalent language construction) and nothing else.
>> It would be good if some middle-end wizard could confirm this.
>
> Yes, for decls that sounds correct.  For a CTOR it just means
> re-evaluation is not safe.

Thanks for confirming.

>> Regtested in x86_64-linux-gnu and bpf-unknown-none.
>> No regressions observed.
>
> I think this warrants a testcase.

Sure, will add one.
What would be the right testsuite?  gcc.dg?

> I'm not sure I agree about the whole thing though, I'm leaving it
> to Joseph.
>
>> gcc/ChangeLog:
>>
>>         PR middle-end/25521
>>         * varasm.cc (categorize_decl_for_section): Place `const volatile'
>>         objects in read-only sections.
>>         (default_select_section): Likewise.
>> ---
>>  gcc/varasm.cc | 3 ---
>>  1 file changed, 3 deletions(-)
>>
>> diff --git a/gcc/varasm.cc b/gcc/varasm.cc
>> index 4db8506b106..7864db11faf 100644
>> --- a/gcc/varasm.cc
>> +++ b/gcc/varasm.cc
>> @@ -6971,7 +6971,6 @@ default_select_section (tree decl, int reloc,
>>      {
>>        if (! ((flag_pic && reloc)
>>              || !TREE_READONLY (decl)
>> -            || TREE_SIDE_EFFECTS (decl)
>>              || !TREE_CONSTANT (decl)))
>>         return readonly_data_section;
>>      }
>> @@ -7005,7 +7004,6 @@ categorize_decl_for_section (const_tree decl, int reloc)
>>        if (bss_initializer_p (decl))
>>         ret = SECCAT_BSS;
>>        else if (! TREE_READONLY (decl)
>> -              || TREE_SIDE_EFFECTS (decl)
>>                || (DECL_INITIAL (decl)
>>                    && ! TREE_CONSTANT (DECL_INITIAL (decl))))
>>         {
>> @@ -7046,7 +7044,6 @@ categorize_decl_for_section (const_tree decl, int reloc)
>>    else if (TREE_CODE (decl) == CONSTRUCTOR)
>>      {
>>        if ((reloc & targetm.asm_out.reloc_rw_mask ())
>> -         || TREE_SIDE_EFFECTS (decl)
>>           || ! TREE_CONSTANT (decl))
>>         ret = SECCAT_DATA;
>>        else
>> --
>> 2.30.2
>>
  
Richard Biener Aug. 5, 2022, 8:44 a.m. UTC | #3
On Fri, Aug 5, 2022 at 10:30 AM Jose E. Marchesi
<jose.marchesi@oracle.com> wrote:
>
>
> Hi Richard.
>
> > On Fri, Aug 5, 2022 at 3:27 AM Jose E. Marchesi via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >>
> >>
> >> Hi people!
> >>
> >> First of all, a bit of context.
> >>
> >> It is common for C BPF programs to use variables that are implicitly set
> >> by the underlying BPF machinery and not by the program itself.  It is
> >> also necessary for these variables to be stored in read-only storage so
> >> the BPF verifier recognizes them as such.  This leads to declarations
> >> using both `const' and `volatile' qualifiers, like this:
> >>
> >>   const volatile unsigned char is_allow_list = 0;
> >>
> >> Where `volatile' is used to avoid the compiler to optimize out the
> >> variable, or turn it into a constant, and `const' to make sure it is
> >> placed in .rodata.
> >>
> >> Now, it happens that:
> >>
> >> - GCC places `const volatile' objects in the .data section, under the
> >>   assumption that `volatile' somehow voids the `const'.
> >>
> >> - LLVM places `const volatile' objects in .rodata, under the
> >>   assumption that `volatile' is orthogonal to `const'.
> >>
> >> So there is a divergence, and this divergence has practical
> >> consequences: it makes BPF programs compiled with GCC to not work
> >> properly.
> >>
> >> When looking into this, I found this bugzilla:
> >>
> >>   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25521
> >>   "change semantics of const volatile variables"
> >>
> >> which was filed back in 2005.  This report was already asking to put
> >> `const volatile' objects in .rodata, questioning the current behavior.
> >>
> >> While discussing this in the #gcc IRC channel I was pointed out to the
> >> following excerpt from the C18 spec:
> >>
> >>    6.7.3 Type qualifiers / 5 The properties associated with qualified
> >>          types are meaningful only for expressions that are
> >>          lval-values [note 135]
> >>
> >>
> >>    135) The implementation may place a const object that is not
> >>         volatile in a read-only region of storage. Moreover, the
> >>         implementation need not allocate storage for such an object if
> >>         its $ address is never used.
> >>
> >> This footnote may be interpreted as if const objects that are volatile
> >> shouldn't be put in read-only storage.  Even if I was not very convinced
> >> of that interpretation (see my earlier comment in BZ 25521) I filed the
> >> following issue in the LLVM tracker in order to discuss the matter:
> >>
> >>   https://github.com/llvm/llvm-project/issues/56468
> >>
> >> As you can see, Aaron Ballman, one of the LLVM hackers, asked the WG14
> >> reflectors about this.  He reported back that the reflectors consider
> >> footnote 135 has not normative value.
> >>
> >> So, not having a normative mandate on either direction, there are two
> >> options:
> >>
> >> a) To change GCC to place `const volatile' objects in .rodata instead
> >>    of .data.
> >>
> >> b) To change LLVM to place `const volatile' objects in .data instead
> >>    of .rodata.
> >>
> >> Considering that:
> >>
> >> - One target (bpf-unknown-none) breaks with the current GCC behavior.
> >>
> >> - No target/platform relies on the GCC behavior, that we know.  (And it
> >>   is unlikely there is any, at least for targets also supported by
> >>   LLVM.)
> >>
> >> - Changing the LLVM behavior at this point would be very severely
> >>   traumatic for the BPF people and their users.
> >>
> >> I think the right thing to do is a).
> >> Therefore this patch.
> >>
> >> A note about the patch itself:
> >>
> >> I am not that familiar with the middle-end and in this patch I am
> >> assuming that a `var|constructor + SIDE_EFFECTS' is the result of
> >> `volatile' (or an equivalent language construction) and nothing else.
> >> It would be good if some middle-end wizard could confirm this.
> >
> > Yes, for decls that sounds correct.  For a CTOR it just means
> > re-evaluation is not safe.
>
> Thanks for confirming.
>
> >> Regtested in x86_64-linux-gnu and bpf-unknown-none.
> >> No regressions observed.
> >
> > I think this warrants a testcase.
>
> Sure, will add one.
> What would be the right testsuite?  gcc.dg?

That sounds good.

> > I'm not sure I agree about the whole thing though, I'm leaving it
> > to Joseph.
> >
> >> gcc/ChangeLog:
> >>
> >>         PR middle-end/25521
> >>         * varasm.cc (categorize_decl_for_section): Place `const volatile'
> >>         objects in read-only sections.
> >>         (default_select_section): Likewise.
> >> ---
> >>  gcc/varasm.cc | 3 ---
> >>  1 file changed, 3 deletions(-)
> >>
> >> diff --git a/gcc/varasm.cc b/gcc/varasm.cc
> >> index 4db8506b106..7864db11faf 100644
> >> --- a/gcc/varasm.cc
> >> +++ b/gcc/varasm.cc
> >> @@ -6971,7 +6971,6 @@ default_select_section (tree decl, int reloc,
> >>      {
> >>        if (! ((flag_pic && reloc)
> >>              || !TREE_READONLY (decl)
> >> -            || TREE_SIDE_EFFECTS (decl)
> >>              || !TREE_CONSTANT (decl)))
> >>         return readonly_data_section;
> >>      }
> >> @@ -7005,7 +7004,6 @@ categorize_decl_for_section (const_tree decl, int reloc)
> >>        if (bss_initializer_p (decl))
> >>         ret = SECCAT_BSS;
> >>        else if (! TREE_READONLY (decl)
> >> -              || TREE_SIDE_EFFECTS (decl)
> >>                || (DECL_INITIAL (decl)
> >>                    && ! TREE_CONSTANT (DECL_INITIAL (decl))))
> >>         {
> >> @@ -7046,7 +7044,6 @@ categorize_decl_for_section (const_tree decl, int reloc)
> >>    else if (TREE_CODE (decl) == CONSTRUCTOR)
> >>      {
> >>        if ((reloc & targetm.asm_out.reloc_rw_mask ())
> >> -         || TREE_SIDE_EFFECTS (decl)
> >>           || ! TREE_CONSTANT (decl))
> >>         ret = SECCAT_DATA;
> >>        else
> >> --
> >> 2.30.2
> >>
  

Patch

diff --git a/gcc/varasm.cc b/gcc/varasm.cc
index 4db8506b106..7864db11faf 100644
--- a/gcc/varasm.cc
+++ b/gcc/varasm.cc
@@ -6971,7 +6971,6 @@  default_select_section (tree decl, int reloc,
     {
       if (! ((flag_pic && reloc)
 	     || !TREE_READONLY (decl)
-	     || TREE_SIDE_EFFECTS (decl)
 	     || !TREE_CONSTANT (decl)))
 	return readonly_data_section;
     }
@@ -7005,7 +7004,6 @@  categorize_decl_for_section (const_tree decl, int reloc)
       if (bss_initializer_p (decl))
 	ret = SECCAT_BSS;
       else if (! TREE_READONLY (decl)
-	       || TREE_SIDE_EFFECTS (decl)
 	       || (DECL_INITIAL (decl)
 		   && ! TREE_CONSTANT (DECL_INITIAL (decl))))
 	{
@@ -7046,7 +7044,6 @@  categorize_decl_for_section (const_tree decl, int reloc)
   else if (TREE_CODE (decl) == CONSTRUCTOR)
     {
       if ((reloc & targetm.asm_out.reloc_rw_mask ())
-	  || TREE_SIDE_EFFECTS (decl)
 	  || ! TREE_CONSTANT (decl))
 	ret = SECCAT_DATA;
       else