[RFC,libgcc] Fix two issues in libgcc/unwind-dw2-fde.c.
Checks
Commit Message
Hi,
One of our big application segv in libgcc code while unwinding the stack.
This is a random crash while the application throws a c++ exception and
unwinds the stack. Incidents are random and never can be reproduced by any
test case.
The libgcc that is used is based on GCC 8.
Fortunately, we got some information through the stack trace, and narrowed
down the crash is in the routine "init_object" of libgcc/unwind-dw2-pde.c:
777 count = classify_object_over_fdes (ob, ob->u.single);
when loading the 2nd parameter of the call to "classify_object_over_fdes".
i.e, the address that is pointed by "ob->u.single" is invalid.
And the outer caller we can track back is the following line 1066 of the routine
"_Unwind_Find_FDE":
1060 /* Classify and search the objects we've not yet processed. */
1061 while ((ob = unseen_objects))
1062 {
1063 struct object **p;
1064
1065 unseen_objects = ob->next;
1066 f = search_object (ob, pc);
Then we suspected that the construction of the static variable "unseen_objects"
might have some bug.
Then after studying the source code that constructs "unseen_objects" in
libgcc/unwind-dw2-pde.c, I found an issue in the routines "__register_frame"
and "__register_frame_table" as following:
127 void
128 __register_frame (void *begin)
129 {
130 struct object *ob;
131
132 /* If .eh_frame is empty, don't register at all. */
133 if (*(uword *) begin == 0)
134 return;
135
136 ob = malloc (sizeof (struct object));
137 __register_frame_info (begin, ob);
138 }
181 void
182 __register_frame_table (void *begin)
183 {
184 struct object *ob = malloc (sizeof (struct object));
185 __register_frame_info_table (begin, ob);
186 }
187
In the above, one obvious issue in the source code is at line 136, line 137,
and line 184 and line 185: the return value of call to "malloc" is not checked
against NULL before it was passed as the second parameter of the follwoing call.
This might cause unpredicted random behavior.
I checked the latest trunk GCC libgcc, the same issue is there too.
This patch is for the latest trunk GCC.
Please let me know any comments on this?
thanks.
Qing
=======================
Fix two issues in libgcc/unwind-dw2-fde.c:
A. The returned value of call to malloc is not checked against NULL.
This might cause unpredicted behavior during stack unwinding.
B. Check for null begin parameter (as well as pointer to null) in
__register_frame_info_table_bases and __register_frame_table.
libgcc/ChangeLog:
* unwind-dw2-fde.c (__register_frame): Check the return value of call
to malloc.
(__register_frame_info_table_bases): Check for null begin parameter.
(__register_frame_table): Check the return value of call to malloc,
Check for null begin parameter.
---
libgcc/unwind-dw2-fde.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
Comments
On Tue, Feb 4, 2025 at 10:12 PM Qing Zhao <qing.zhao@oracle.com> wrote:
>
> Hi,
>
> One of our big application segv in libgcc code while unwinding the stack.
>
> This is a random crash while the application throws a c++ exception and
> unwinds the stack. Incidents are random and never can be reproduced by any
> test case.
>
> The libgcc that is used is based on GCC 8.
>
> Fortunately, we got some information through the stack trace, and narrowed
> down the crash is in the routine "init_object" of libgcc/unwind-dw2-pde.c:
>
> 777 count = classify_object_over_fdes (ob, ob->u.single);
>
> when loading the 2nd parameter of the call to "classify_object_over_fdes".
> i.e, the address that is pointed by "ob->u.single" is invalid.
>
> And the outer caller we can track back is the following line 1066 of the routine
> "_Unwind_Find_FDE":
>
> 1060 /* Classify and search the objects we've not yet processed. */
> 1061 while ((ob = unseen_objects))
> 1062 {
> 1063 struct object **p;
> 1064
> 1065 unseen_objects = ob->next;
> 1066 f = search_object (ob, pc);
>
> Then we suspected that the construction of the static variable "unseen_objects"
> might have some bug.
>
> Then after studying the source code that constructs "unseen_objects" in
> libgcc/unwind-dw2-pde.c, I found an issue in the routines "__register_frame"
> and "__register_frame_table" as following:
>
> 127 void
> 128 __register_frame (void *begin)
> 129 {
> 130 struct object *ob;
> 131
> 132 /* If .eh_frame is empty, don't register at all. */
> 133 if (*(uword *) begin == 0)
> 134 return;
> 135
> 136 ob = malloc (sizeof (struct object));
> 137 __register_frame_info (begin, ob);
> 138 }
>
> 181 void
> 182 __register_frame_table (void *begin)
> 183 {
> 184 struct object *ob = malloc (sizeof (struct object));
> 185 __register_frame_info_table (begin, ob);
> 186 }
> 187
>
> In the above, one obvious issue in the source code is at line 136, line 137,
> and line 184 and line 185: the return value of call to "malloc" is not checked
> against NULL before it was passed as the second parameter of the follwoing call.
>
> This might cause unpredicted random behavior.
>
> I checked the latest trunk GCC libgcc, the same issue is there too.
>
> This patch is for the latest trunk GCC.
>
> Please let me know any comments on this?
I think I've seen this elsewhere -- the issue is the unwind register API does
not allow for failures but I also think calling abort() is bad.
Are you calling this from a JIT context or so? We're assuming that at program
start malloc() will not fail.
The proper solution is probably to add an alternate ABI which has a way to fail
during registration.
Richard.
> thanks.
>
> Qing
>
>
> =======================
>
> Fix two issues in libgcc/unwind-dw2-fde.c:
>
> A. The returned value of call to malloc is not checked against NULL.
> This might cause unpredicted behavior during stack unwinding.
>
> B. Check for null begin parameter (as well as pointer to null) in
> __register_frame_info_table_bases and __register_frame_table.
>
> libgcc/ChangeLog:
>
> * unwind-dw2-fde.c (__register_frame): Check the return value of call
> to malloc.
> (__register_frame_info_table_bases): Check for null begin parameter.
> (__register_frame_table): Check the return value of call to malloc,
> Check for null begin parameter.
> ---
> libgcc/unwind-dw2-fde.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/libgcc/unwind-dw2-fde.c b/libgcc/unwind-dw2-fde.c
> index cdfd3974c99..f0bc29d682a 100644
> --- a/libgcc/unwind-dw2-fde.c
> +++ b/libgcc/unwind-dw2-fde.c
> @@ -169,6 +169,8 @@ __register_frame (void *begin)
> return;
>
> ob = malloc (sizeof (struct object));
> + if (ob == NULL)
> + abort ();
> __register_frame_info (begin, ob);
> }
>
> @@ -180,6 +182,10 @@ void
> __register_frame_info_table_bases (void *begin, struct object *ob,
> void *tbase, void *dbase)
> {
> + /* If .eh_frame is empty, don't register at all. */
> + if ((const uword *) begin == 0 || *(const uword *) begin == 0)
> + return;
> +
> ob->pc_begin = (void *)-1;
> ob->tbase = tbase;
> ob->dbase = dbase;
> @@ -210,7 +216,13 @@ __register_frame_info_table (void *begin, struct object *ob)
> void
> __register_frame_table (void *begin)
> {
> + /* If .eh_frame is empty, don't register at all. */
> + if (*(uword *) begin == 0)
> + return;
> +
> struct object *ob = malloc (sizeof (struct object));
> + if (ob == NULL)
> + abort ();
> __register_frame_info_table (begin, ob);
> }
>
> --
> 2.31.1
>
On Wed, Feb 05, 2025 at 01:46:02PM +0100, Richard Biener wrote:
> > Please let me know any comments on this?
>
> I think I've seen this elsewhere -- the issue is the unwind register API does
> not allow for failures but I also think calling abort() is bad.
>
> Are you calling this from a JIT context or so? We're assuming that at program
> start malloc() will not fail.
>
> The proper solution is probably to add an alternate ABI which has a way to fail
> during registration.
Note, e.g. the __register_frame_info_table_bases changes look just wrong.
This function is normally called by code created by collect2.cc and
that actually guarantees that it is only called if the table is not empty.
If it is some JIT and it is trying to register something that is empty or
NULL, just don't call that.
Jakub
> On Feb 5, 2025, at 07:46, Richard Biener <richard.guenther@gmail.com> wrote:
>
> On Tue, Feb 4, 2025 at 10:12 PM Qing Zhao <qing.zhao@oracle.com> wrote:
>>
>> Hi,
>>
>> One of our big application segv in libgcc code while unwinding the stack.
>>
>> This is a random crash while the application throws a c++ exception and
>> unwinds the stack. Incidents are random and never can be reproduced by any
>> test case.
>>
>> The libgcc that is used is based on GCC 8.
>>
>> Fortunately, we got some information through the stack trace, and narrowed
>> down the crash is in the routine "init_object" of libgcc/unwind-dw2-pde.c:
>>
>> 777 count = classify_object_over_fdes (ob, ob->u.single);
>>
>> when loading the 2nd parameter of the call to "classify_object_over_fdes".
>> i.e, the address that is pointed by "ob->u.single" is invalid.
>>
>> And the outer caller we can track back is the following line 1066 of the routine
>> "_Unwind_Find_FDE":
>>
>> 1060 /* Classify and search the objects we've not yet processed. */
>> 1061 while ((ob = unseen_objects))
>> 1062 {
>> 1063 struct object **p;
>> 1064
>> 1065 unseen_objects = ob->next;
>> 1066 f = search_object (ob, pc);
>>
>> Then we suspected that the construction of the static variable "unseen_objects"
>> might have some bug.
>>
>> Then after studying the source code that constructs "unseen_objects" in
>> libgcc/unwind-dw2-pde.c, I found an issue in the routines "__register_frame"
>> and "__register_frame_table" as following:
>>
>> 127 void
>> 128 __register_frame (void *begin)
>> 129 {
>> 130 struct object *ob;
>> 131
>> 132 /* If .eh_frame is empty, don't register at all. */
>> 133 if (*(uword *) begin == 0)
>> 134 return;
>> 135
>> 136 ob = malloc (sizeof (struct object));
>> 137 __register_frame_info (begin, ob);
>> 138 }
>>
>> 181 void
>> 182 __register_frame_table (void *begin)
>> 183 {
>> 184 struct object *ob = malloc (sizeof (struct object));
>> 185 __register_frame_info_table (begin, ob);
>> 186 }
>> 187
>>
>> In the above, one obvious issue in the source code is at line 136, line 137,
>> and line 184 and line 185: the return value of call to "malloc" is not checked
>> against NULL before it was passed as the second parameter of the follwoing call.
>>
>> This might cause unpredicted random behavior.
>>
>> I checked the latest trunk GCC libgcc, the same issue is there too.
>>
>> This patch is for the latest trunk GCC.
>>
>> Please let me know any comments on this?
>
> I think I've seen this elsewhere —
Do you mean that you saw this problem (malloc return NULL during register_frame) previously?
> the issue is the unwind register API does
> not allow for failures but I also think calling abort() is bad.
Okay, I see.
>
> Are you calling this from a JIT context or so?
I will ask the bug submitter on this to make sure.
> We're assuming that at program
> start malloc() will not fail.
So, the routine __register_frame is only called at the _start_ of a program?
>
> The proper solution is probably to add an alternate ABI which has a way to fail
> during registration.
Any suggestion on this (or any other routine I can refer to?)
Thanks a lot for the help.
Qing
>
> Richard.
>
>> thanks.
>>
>> Qing
>>
>>
>> =======================
>>
>> Fix two issues in libgcc/unwind-dw2-fde.c:
>>
>> A. The returned value of call to malloc is not checked against NULL.
>> This might cause unpredicted behavior during stack unwinding.
>>
>> B. Check for null begin parameter (as well as pointer to null) in
>> __register_frame_info_table_bases and __register_frame_table.
>>
>> libgcc/ChangeLog:
>>
>> * unwind-dw2-fde.c (__register_frame): Check the return value of call
>> to malloc.
>> (__register_frame_info_table_bases): Check for null begin parameter.
>> (__register_frame_table): Check the return value of call to malloc,
>> Check for null begin parameter.
>> ---
>> libgcc/unwind-dw2-fde.c | 12 ++++++++++++
>> 1 file changed, 12 insertions(+)
>>
>> diff --git a/libgcc/unwind-dw2-fde.c b/libgcc/unwind-dw2-fde.c
>> index cdfd3974c99..f0bc29d682a 100644
>> --- a/libgcc/unwind-dw2-fde.c
>> +++ b/libgcc/unwind-dw2-fde.c
>> @@ -169,6 +169,8 @@ __register_frame (void *begin)
>> return;
>>
>> ob = malloc (sizeof (struct object));
>> + if (ob == NULL)
>> + abort ();
>> __register_frame_info (begin, ob);
>> }
>>
>> @@ -180,6 +182,10 @@ void
>> __register_frame_info_table_bases (void *begin, struct object *ob,
>> void *tbase, void *dbase)
>> {
>> + /* If .eh_frame is empty, don't register at all. */
>> + if ((const uword *) begin == 0 || *(const uword *) begin == 0)
>> + return;
>> +
>> ob->pc_begin = (void *)-1;
>> ob->tbase = tbase;
>> ob->dbase = dbase;
>> @@ -210,7 +216,13 @@ __register_frame_info_table (void *begin, struct object *ob)
>> void
>> __register_frame_table (void *begin)
>> {
>> + /* If .eh_frame is empty, don't register at all. */
>> + if (*(uword *) begin == 0)
>> + return;
>> +
>> struct object *ob = malloc (sizeof (struct object));
>> + if (ob == NULL)
>> + abort ();
>> __register_frame_info_table (begin, ob);
>> }
>>
>> --
>> 2.31.1
> On Feb 5, 2025, at 07:49, Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Wed, Feb 05, 2025 at 01:46:02PM +0100, Richard Biener wrote:
>>> Please let me know any comments on this?
>>
>> I think I've seen this elsewhere -- the issue is the unwind register API does
>> not allow for failures but I also think calling abort() is bad.
>>
>> Are you calling this from a JIT context or so? We're assuming that at program
>> start malloc() will not fail.
>>
>> The proper solution is probably to add an alternate ABI which has a way to fail
>> during registration.
>
> Note, e.g. the __register_frame_info_table_bases changes look just wrong.
> This function is normally called by code created by collect2.cc and
> that actually guarantees that it is only called if the table is not empty.
> If it is some JIT and it is trying to register something that is empty or
> NULL, just don't call that.
Thanks for the info.
I will ask the bug submitter to make sure it’s not called from JIT code.
Qing
>
> Jakub
>
On Wed, Feb 5, 2025 at 4:40 PM Qing Zhao <qing.zhao@oracle.com> wrote:
>
>
>
> > On Feb 5, 2025, at 07:46, Richard Biener <richard.guenther@gmail.com> wrote:
> >
> > On Tue, Feb 4, 2025 at 10:12 PM Qing Zhao <qing.zhao@oracle.com> wrote:
> >>
> >> Hi,
> >>
> >> One of our big application segv in libgcc code while unwinding the stack.
> >>
> >> This is a random crash while the application throws a c++ exception and
> >> unwinds the stack. Incidents are random and never can be reproduced by any
> >> test case.
> >>
> >> The libgcc that is used is based on GCC 8.
> >>
> >> Fortunately, we got some information through the stack trace, and narrowed
> >> down the crash is in the routine "init_object" of libgcc/unwind-dw2-pde.c:
> >>
> >> 777 count = classify_object_over_fdes (ob, ob->u.single);
> >>
> >> when loading the 2nd parameter of the call to "classify_object_over_fdes".
> >> i.e, the address that is pointed by "ob->u.single" is invalid.
> >>
> >> And the outer caller we can track back is the following line 1066 of the routine
> >> "_Unwind_Find_FDE":
> >>
> >> 1060 /* Classify and search the objects we've not yet processed. */
> >> 1061 while ((ob = unseen_objects))
> >> 1062 {
> >> 1063 struct object **p;
> >> 1064
> >> 1065 unseen_objects = ob->next;
> >> 1066 f = search_object (ob, pc);
> >>
> >> Then we suspected that the construction of the static variable "unseen_objects"
> >> might have some bug.
> >>
> >> Then after studying the source code that constructs "unseen_objects" in
> >> libgcc/unwind-dw2-pde.c, I found an issue in the routines "__register_frame"
> >> and "__register_frame_table" as following:
> >>
> >> 127 void
> >> 128 __register_frame (void *begin)
> >> 129 {
> >> 130 struct object *ob;
> >> 131
> >> 132 /* If .eh_frame is empty, don't register at all. */
> >> 133 if (*(uword *) begin == 0)
> >> 134 return;
> >> 135
> >> 136 ob = malloc (sizeof (struct object));
> >> 137 __register_frame_info (begin, ob);
> >> 138 }
> >>
> >> 181 void
> >> 182 __register_frame_table (void *begin)
> >> 183 {
> >> 184 struct object *ob = malloc (sizeof (struct object));
> >> 185 __register_frame_info_table (begin, ob);
> >> 186 }
> >> 187
> >>
> >> In the above, one obvious issue in the source code is at line 136, line 137,
> >> and line 184 and line 185: the return value of call to "malloc" is not checked
> >> against NULL before it was passed as the second parameter of the follwoing call.
> >>
> >> This might cause unpredicted random behavior.
> >>
> >> I checked the latest trunk GCC libgcc, the same issue is there too.
> >>
> >> This patch is for the latest trunk GCC.
> >>
> >> Please let me know any comments on this?
> >
> > I think I've seen this elsewhere —
>
> Do you mean that you saw this problem (malloc return NULL during register_frame) previously?
It was probably reported to us (SUSE) from a customer/partner, also
from a (llvm?) JIT context.
> > the issue is the unwind register API does
> > not allow for failures but I also think calling abort() is bad.
>
> Okay, I see.
>
> >
> > Are you calling this from a JIT context or so?
>
> I will ask the bug submitter on this to make sure.
>
> > We're assuming that at program
> > start malloc() will not fail.
>
> So, the routine __register_frame is only called at the _start_ of a program?
> >
> > The proper solution is probably to add an alternate ABI which has a way to fail
> > during registration.
>
> Any suggestion on this (or any other routine I can refer to?)
>
>
> Thanks a lot for the help.
>
> Qing
> >
> > Richard.
> >
> >> thanks.
> >>
> >> Qing
> >>
> >>
> >> =======================
> >>
> >> Fix two issues in libgcc/unwind-dw2-fde.c:
> >>
> >> A. The returned value of call to malloc is not checked against NULL.
> >> This might cause unpredicted behavior during stack unwinding.
> >>
> >> B. Check for null begin parameter (as well as pointer to null) in
> >> __register_frame_info_table_bases and __register_frame_table.
> >>
> >> libgcc/ChangeLog:
> >>
> >> * unwind-dw2-fde.c (__register_frame): Check the return value of call
> >> to malloc.
> >> (__register_frame_info_table_bases): Check for null begin parameter.
> >> (__register_frame_table): Check the return value of call to malloc,
> >> Check for null begin parameter.
> >> ---
> >> libgcc/unwind-dw2-fde.c | 12 ++++++++++++
> >> 1 file changed, 12 insertions(+)
> >>
> >> diff --git a/libgcc/unwind-dw2-fde.c b/libgcc/unwind-dw2-fde.c
> >> index cdfd3974c99..f0bc29d682a 100644
> >> --- a/libgcc/unwind-dw2-fde.c
> >> +++ b/libgcc/unwind-dw2-fde.c
> >> @@ -169,6 +169,8 @@ __register_frame (void *begin)
> >> return;
> >>
> >> ob = malloc (sizeof (struct object));
> >> + if (ob == NULL)
> >> + abort ();
> >> __register_frame_info (begin, ob);
> >> }
> >>
> >> @@ -180,6 +182,10 @@ void
> >> __register_frame_info_table_bases (void *begin, struct object *ob,
> >> void *tbase, void *dbase)
> >> {
> >> + /* If .eh_frame is empty, don't register at all. */
> >> + if ((const uword *) begin == 0 || *(const uword *) begin == 0)
> >> + return;
> >> +
> >> ob->pc_begin = (void *)-1;
> >> ob->tbase = tbase;
> >> ob->dbase = dbase;
> >> @@ -210,7 +216,13 @@ __register_frame_info_table (void *begin, struct object *ob)
> >> void
> >> __register_frame_table (void *begin)
> >> {
> >> + /* If .eh_frame is empty, don't register at all. */
> >> + if (*(uword *) begin == 0)
> >> + return;
> >> +
> >> struct object *ob = malloc (sizeof (struct object));
> >> + if (ob == NULL)
> >> + abort ();
> >> __register_frame_info_table (begin, ob);
> >> }
> >>
> >> --
> >> 2.31.1
>
>
> On Feb 6, 2025, at 04:36, Richard Biener <richard.guenther@gmail.com> wrote:
>
> On Wed, Feb 5, 2025 at 4:40 PM Qing Zhao <qing.zhao@oracle.com> wrote:
>>
>>
>>
>>> On Feb 5, 2025, at 07:46, Richard Biener <richard.guenther@gmail.com> wrote:
>>>
>>> On Tue, Feb 4, 2025 at 10:12 PM Qing Zhao <qing.zhao@oracle.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> One of our big application segv in libgcc code while unwinding the stack.
>>>>
>>>> This is a random crash while the application throws a c++ exception and
>>>> unwinds the stack. Incidents are random and never can be reproduced by any
>>>> test case.
>>>>
>>>> The libgcc that is used is based on GCC 8.
>>>>
>>>> Fortunately, we got some information through the stack trace, and narrowed
>>>> down the crash is in the routine "init_object" of libgcc/unwind-dw2-pde.c:
>>>>
>>>> 777 count = classify_object_over_fdes (ob, ob->u.single);
>>>>
>>>> when loading the 2nd parameter of the call to "classify_object_over_fdes".
>>>> i.e, the address that is pointed by "ob->u.single" is invalid.
>>>>
>>>> And the outer caller we can track back is the following line 1066 of the routine
>>>> "_Unwind_Find_FDE":
>>>>
>>>> 1060 /* Classify and search the objects we've not yet processed. */
>>>> 1061 while ((ob = unseen_objects))
>>>> 1062 {
>>>> 1063 struct object **p;
>>>> 1064
>>>> 1065 unseen_objects = ob->next;
>>>> 1066 f = search_object (ob, pc);
>>>>
>>>> Then we suspected that the construction of the static variable "unseen_objects"
>>>> might have some bug.
>>>>
>>>> Then after studying the source code that constructs "unseen_objects" in
>>>> libgcc/unwind-dw2-pde.c, I found an issue in the routines "__register_frame"
>>>> and "__register_frame_table" as following:
>>>>
>>>> 127 void
>>>> 128 __register_frame (void *begin)
>>>> 129 {
>>>> 130 struct object *ob;
>>>> 131
>>>> 132 /* If .eh_frame is empty, don't register at all. */
>>>> 133 if (*(uword *) begin == 0)
>>>> 134 return;
>>>> 135
>>>> 136 ob = malloc (sizeof (struct object));
>>>> 137 __register_frame_info (begin, ob);
>>>> 138 }
>>>>
>>>> 181 void
>>>> 182 __register_frame_table (void *begin)
>>>> 183 {
>>>> 184 struct object *ob = malloc (sizeof (struct object));
>>>> 185 __register_frame_info_table (begin, ob);
>>>> 186 }
>>>> 187
>>>>
>>>> In the above, one obvious issue in the source code is at line 136, line 137,
>>>> and line 184 and line 185: the return value of call to "malloc" is not checked
>>>> against NULL before it was passed as the second parameter of the follwoing call.
>>>>
>>>> This might cause unpredicted random behavior.
>>>>
>>>> I checked the latest trunk GCC libgcc, the same issue is there too.
>>>>
>>>> This patch is for the latest trunk GCC.
>>>>
>>>> Please let me know any comments on this?
>>>
>>> I think I've seen this elsewhere —
>>
>> Do you mean that you saw this problem (malloc return NULL during register_frame) previously?
>
> It was probably reported to us (SUSE) from a customer/partner, also
> from a (llvm?) JIT context.
Okay.
So, that bug has not been resolved yet?
Qing
>
>>> the issue is the unwind register API does
>>> not allow for failures but I also think calling abort() is bad.
>>
>> Okay, I see.
>>
>>>
>>> Are you calling this from a JIT context or so?
>>
>> I will ask the bug submitter on this to make sure.
>>
>>> We're assuming that at program
>>> start malloc() will not fail.
>>
>> So, the routine __register_frame is only called at the _start_ of a program?
>>>
>>> The proper solution is probably to add an alternate ABI which has a way to fail
>>> during registration.
>>
>> Any suggestion on this (or any other routine I can refer to?)
>>
>>
>> Thanks a lot for the help.
>>
>> Qing
>>>
>>> Richard.
>>>
>>>> thanks.
>>>>
>>>> Qing
>>>>
>>>>
>>>> =======================
>>>>
>>>> Fix two issues in libgcc/unwind-dw2-fde.c:
>>>>
>>>> A. The returned value of call to malloc is not checked against NULL.
>>>> This might cause unpredicted behavior during stack unwinding.
>>>>
>>>> B. Check for null begin parameter (as well as pointer to null) in
>>>> __register_frame_info_table_bases and __register_frame_table.
>>>>
>>>> libgcc/ChangeLog:
>>>>
>>>> * unwind-dw2-fde.c (__register_frame): Check the return value of call
>>>> to malloc.
>>>> (__register_frame_info_table_bases): Check for null begin parameter.
>>>> (__register_frame_table): Check the return value of call to malloc,
>>>> Check for null begin parameter.
>>>> ---
>>>> libgcc/unwind-dw2-fde.c | 12 ++++++++++++
>>>> 1 file changed, 12 insertions(+)
>>>>
>>>> diff --git a/libgcc/unwind-dw2-fde.c b/libgcc/unwind-dw2-fde.c
>>>> index cdfd3974c99..f0bc29d682a 100644
>>>> --- a/libgcc/unwind-dw2-fde.c
>>>> +++ b/libgcc/unwind-dw2-fde.c
>>>> @@ -169,6 +169,8 @@ __register_frame (void *begin)
>>>> return;
>>>>
>>>> ob = malloc (sizeof (struct object));
>>>> + if (ob == NULL)
>>>> + abort ();
>>>> __register_frame_info (begin, ob);
>>>> }
>>>>
>>>> @@ -180,6 +182,10 @@ void
>>>> __register_frame_info_table_bases (void *begin, struct object *ob,
>>>> void *tbase, void *dbase)
>>>> {
>>>> + /* If .eh_frame is empty, don't register at all. */
>>>> + if ((const uword *) begin == 0 || *(const uword *) begin == 0)
>>>> + return;
>>>> +
>>>> ob->pc_begin = (void *)-1;
>>>> ob->tbase = tbase;
>>>> ob->dbase = dbase;
>>>> @@ -210,7 +216,13 @@ __register_frame_info_table (void *begin, struct object *ob)
>>>> void
>>>> __register_frame_table (void *begin)
>>>> {
>>>> + /* If .eh_frame is empty, don't register at all. */
>>>> + if (*(uword *) begin == 0)
>>>> + return;
>>>> +
>>>> struct object *ob = malloc (sizeof (struct object));
>>>> + if (ob == NULL)
>>>> + abort ();
>>>> __register_frame_info_table (begin, ob);
>>>> }
>>>>
>>>> --
>>>> 2.31.1
On Thu, Feb 6, 2025 at 2:57 PM Qing Zhao <qing.zhao@oracle.com> wrote:
>
>
>
> > On Feb 6, 2025, at 04:36, Richard Biener <richard.guenther@gmail.com> wrote:
> >
> > On Wed, Feb 5, 2025 at 4:40 PM Qing Zhao <qing.zhao@oracle.com> wrote:
> >>
> >>
> >>
> >>> On Feb 5, 2025, at 07:46, Richard Biener <richard.guenther@gmail.com> wrote:
> >>>
> >>> On Tue, Feb 4, 2025 at 10:12 PM Qing Zhao <qing.zhao@oracle.com> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> One of our big application segv in libgcc code while unwinding the stack.
> >>>>
> >>>> This is a random crash while the application throws a c++ exception and
> >>>> unwinds the stack. Incidents are random and never can be reproduced by any
> >>>> test case.
> >>>>
> >>>> The libgcc that is used is based on GCC 8.
> >>>>
> >>>> Fortunately, we got some information through the stack trace, and narrowed
> >>>> down the crash is in the routine "init_object" of libgcc/unwind-dw2-pde.c:
> >>>>
> >>>> 777 count = classify_object_over_fdes (ob, ob->u.single);
> >>>>
> >>>> when loading the 2nd parameter of the call to "classify_object_over_fdes".
> >>>> i.e, the address that is pointed by "ob->u.single" is invalid.
> >>>>
> >>>> And the outer caller we can track back is the following line 1066 of the routine
> >>>> "_Unwind_Find_FDE":
> >>>>
> >>>> 1060 /* Classify and search the objects we've not yet processed. */
> >>>> 1061 while ((ob = unseen_objects))
> >>>> 1062 {
> >>>> 1063 struct object **p;
> >>>> 1064
> >>>> 1065 unseen_objects = ob->next;
> >>>> 1066 f = search_object (ob, pc);
> >>>>
> >>>> Then we suspected that the construction of the static variable "unseen_objects"
> >>>> might have some bug.
> >>>>
> >>>> Then after studying the source code that constructs "unseen_objects" in
> >>>> libgcc/unwind-dw2-pde.c, I found an issue in the routines "__register_frame"
> >>>> and "__register_frame_table" as following:
> >>>>
> >>>> 127 void
> >>>> 128 __register_frame (void *begin)
> >>>> 129 {
> >>>> 130 struct object *ob;
> >>>> 131
> >>>> 132 /* If .eh_frame is empty, don't register at all. */
> >>>> 133 if (*(uword *) begin == 0)
> >>>> 134 return;
> >>>> 135
> >>>> 136 ob = malloc (sizeof (struct object));
> >>>> 137 __register_frame_info (begin, ob);
> >>>> 138 }
> >>>>
> >>>> 181 void
> >>>> 182 __register_frame_table (void *begin)
> >>>> 183 {
> >>>> 184 struct object *ob = malloc (sizeof (struct object));
> >>>> 185 __register_frame_info_table (begin, ob);
> >>>> 186 }
> >>>> 187
> >>>>
> >>>> In the above, one obvious issue in the source code is at line 136, line 137,
> >>>> and line 184 and line 185: the return value of call to "malloc" is not checked
> >>>> against NULL before it was passed as the second parameter of the follwoing call.
> >>>>
> >>>> This might cause unpredicted random behavior.
> >>>>
> >>>> I checked the latest trunk GCC libgcc, the same issue is there too.
> >>>>
> >>>> This patch is for the latest trunk GCC.
> >>>>
> >>>> Please let me know any comments on this?
> >>>
> >>> I think I've seen this elsewhere —
> >>
> >> Do you mean that you saw this problem (malloc return NULL during register_frame) previously?
> >
> > It was probably reported to us (SUSE) from a customer/partner, also
> > from a (llvm?) JIT context.
>
> Okay.
> So, that bug has not been resolved yet?
It was resolved from our side IIRC as being a problem in the
application with the JIT and
otherwise a feature request (better unwind API for this use case,
allowing a failure mode).
Richard.
> Qing
> >
> >>> the issue is the unwind register API does
> >>> not allow for failures but I also think calling abort() is bad.
> >>
> >> Okay, I see.
> >>
> >>>
> >>> Are you calling this from a JIT context or so?
> >>
> >> I will ask the bug submitter on this to make sure.
> >>
> >>> We're assuming that at program
> >>> start malloc() will not fail.
> >>
> >> So, the routine __register_frame is only called at the _start_ of a program?
> >>>
> >>> The proper solution is probably to add an alternate ABI which has a way to fail
> >>> during registration.
> >>
> >> Any suggestion on this (or any other routine I can refer to?)
> >>
> >>
> >> Thanks a lot for the help.
> >>
> >> Qing
> >>>
> >>> Richard.
> >>>
> >>>> thanks.
> >>>>
> >>>> Qing
> >>>>
> >>>>
> >>>> =======================
> >>>>
> >>>> Fix two issues in libgcc/unwind-dw2-fde.c:
> >>>>
> >>>> A. The returned value of call to malloc is not checked against NULL.
> >>>> This might cause unpredicted behavior during stack unwinding.
> >>>>
> >>>> B. Check for null begin parameter (as well as pointer to null) in
> >>>> __register_frame_info_table_bases and __register_frame_table.
> >>>>
> >>>> libgcc/ChangeLog:
> >>>>
> >>>> * unwind-dw2-fde.c (__register_frame): Check the return value of call
> >>>> to malloc.
> >>>> (__register_frame_info_table_bases): Check for null begin parameter.
> >>>> (__register_frame_table): Check the return value of call to malloc,
> >>>> Check for null begin parameter.
> >>>> ---
> >>>> libgcc/unwind-dw2-fde.c | 12 ++++++++++++
> >>>> 1 file changed, 12 insertions(+)
> >>>>
> >>>> diff --git a/libgcc/unwind-dw2-fde.c b/libgcc/unwind-dw2-fde.c
> >>>> index cdfd3974c99..f0bc29d682a 100644
> >>>> --- a/libgcc/unwind-dw2-fde.c
> >>>> +++ b/libgcc/unwind-dw2-fde.c
> >>>> @@ -169,6 +169,8 @@ __register_frame (void *begin)
> >>>> return;
> >>>>
> >>>> ob = malloc (sizeof (struct object));
> >>>> + if (ob == NULL)
> >>>> + abort ();
> >>>> __register_frame_info (begin, ob);
> >>>> }
> >>>>
> >>>> @@ -180,6 +182,10 @@ void
> >>>> __register_frame_info_table_bases (void *begin, struct object *ob,
> >>>> void *tbase, void *dbase)
> >>>> {
> >>>> + /* If .eh_frame is empty, don't register at all. */
> >>>> + if ((const uword *) begin == 0 || *(const uword *) begin == 0)
> >>>> + return;
> >>>> +
> >>>> ob->pc_begin = (void *)-1;
> >>>> ob->tbase = tbase;
> >>>> ob->dbase = dbase;
> >>>> @@ -210,7 +216,13 @@ __register_frame_info_table (void *begin, struct object *ob)
> >>>> void
> >>>> __register_frame_table (void *begin)
> >>>> {
> >>>> + /* If .eh_frame is empty, don't register at all. */
> >>>> + if (*(uword *) begin == 0)
> >>>> + return;
> >>>> +
> >>>> struct object *ob = malloc (sizeof (struct object));
> >>>> + if (ob == NULL)
> >>>> + abort ();
> >>>> __register_frame_info_table (begin, ob);
> >>>> }
> >>>>
> >>>> --
> >>>> 2.31.1
>
>
> On Feb 6, 2025, at 09:25, Richard Biener <richard.guenther@gmail.com> wrote:
>
> On Thu, Feb 6, 2025 at 2:57 PM Qing Zhao <qing.zhao@oracle.com> wrote:
>>
>>
>>
>>> On Feb 6, 2025, at 04:36, Richard Biener <richard.guenther@gmail.com> wrote:
>>>
>>> On Wed, Feb 5, 2025 at 4:40 PM Qing Zhao <qing.zhao@oracle.com> wrote:
>>>>
>>>>
>>>>
>>>>> On Feb 5, 2025, at 07:46, Richard Biener <richard.guenther@gmail.com> wrote:
>>>>>
>>>>> On Tue, Feb 4, 2025 at 10:12 PM Qing Zhao <qing.zhao@oracle.com> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> One of our big application segv in libgcc code while unwinding the stack.
>>>>>>
>>>>>> This is a random crash while the application throws a c++ exception and
>>>>>> unwinds the stack. Incidents are random and never can be reproduced by any
>>>>>> test case.
>>>>>>
>>>>>> The libgcc that is used is based on GCC 8.
>>>>>>
>>>>>> Fortunately, we got some information through the stack trace, and narrowed
>>>>>> down the crash is in the routine "init_object" of libgcc/unwind-dw2-pde.c:
>>>>>>
>>>>>> 777 count = classify_object_over_fdes (ob, ob->u.single);
>>>>>>
>>>>>> when loading the 2nd parameter of the call to "classify_object_over_fdes".
>>>>>> i.e, the address that is pointed by "ob->u.single" is invalid.
>>>>>>
>>>>>> And the outer caller we can track back is the following line 1066 of the routine
>>>>>> "_Unwind_Find_FDE":
>>>>>>
>>>>>> 1060 /* Classify and search the objects we've not yet processed. */
>>>>>> 1061 while ((ob = unseen_objects))
>>>>>> 1062 {
>>>>>> 1063 struct object **p;
>>>>>> 1064
>>>>>> 1065 unseen_objects = ob->next;
>>>>>> 1066 f = search_object (ob, pc);
>>>>>>
>>>>>> Then we suspected that the construction of the static variable "unseen_objects"
>>>>>> might have some bug.
>>>>>>
>>>>>> Then after studying the source code that constructs "unseen_objects" in
>>>>>> libgcc/unwind-dw2-pde.c, I found an issue in the routines "__register_frame"
>>>>>> and "__register_frame_table" as following:
>>>>>>
>>>>>> 127 void
>>>>>> 128 __register_frame (void *begin)
>>>>>> 129 {
>>>>>> 130 struct object *ob;
>>>>>> 131
>>>>>> 132 /* If .eh_frame is empty, don't register at all. */
>>>>>> 133 if (*(uword *) begin == 0)
>>>>>> 134 return;
>>>>>> 135
>>>>>> 136 ob = malloc (sizeof (struct object));
>>>>>> 137 __register_frame_info (begin, ob);
>>>>>> 138 }
>>>>>>
>>>>>> 181 void
>>>>>> 182 __register_frame_table (void *begin)
>>>>>> 183 {
>>>>>> 184 struct object *ob = malloc (sizeof (struct object));
>>>>>> 185 __register_frame_info_table (begin, ob);
>>>>>> 186 }
>>>>>> 187
>>>>>>
>>>>>> In the above, one obvious issue in the source code is at line 136, line 137,
>>>>>> and line 184 and line 185: the return value of call to "malloc" is not checked
>>>>>> against NULL before it was passed as the second parameter of the follwoing call.
>>>>>>
>>>>>> This might cause unpredicted random behavior.
>>>>>>
>>>>>> I checked the latest trunk GCC libgcc, the same issue is there too.
>>>>>>
>>>>>> This patch is for the latest trunk GCC.
>>>>>>
>>>>>> Please let me know any comments on this?
>>>>>
>>>>> I think I've seen this elsewhere —
>>>>
>>>> Do you mean that you saw this problem (malloc return NULL during register_frame) previously?
>>>
>>> It was probably reported to us (SUSE) from a customer/partner, also
>>> from a (llvm?) JIT context.
>>
>> Okay.
>> So, that bug has not been resolved yet?
>
> It was resolved from our side IIRC as being a problem in the
> application with the JIT and
> otherwise a feature request (better unwind API for this use case,
> allowing a failure mode).
Okay, I see.
Thanks.
Qing
>
> Richard.
>
>> Qing
>>>
>>>>> the issue is the unwind register API does
>>>>> not allow for failures but I also think calling abort() is bad.
>>>>
>>>> Okay, I see.
>>>>
>>>>>
>>>>> Are you calling this from a JIT context or so?
>>>>
>>>> I will ask the bug submitter on this to make sure.
>>>>
>>>>> We're assuming that at program
>>>>> start malloc() will not fail.
>>>>
>>>> So, the routine __register_frame is only called at the _start_ of a program?
>>>>>
>>>>> The proper solution is probably to add an alternate ABI which has a way to fail
>>>>> during registration.
>>>>
>>>> Any suggestion on this (or any other routine I can refer to?)
>>>>
>>>>
>>>> Thanks a lot for the help.
>>>>
>>>> Qing
>>>>>
>>>>> Richard.
>>>>>
>>>>>> thanks.
>>>>>>
>>>>>> Qing
>>>>>>
>>>>>>
>>>>>> =======================
>>>>>>
>>>>>> Fix two issues in libgcc/unwind-dw2-fde.c:
>>>>>>
>>>>>> A. The returned value of call to malloc is not checked against NULL.
>>>>>> This might cause unpredicted behavior during stack unwinding.
>>>>>>
>>>>>> B. Check for null begin parameter (as well as pointer to null) in
>>>>>> __register_frame_info_table_bases and __register_frame_table.
>>>>>>
>>>>>> libgcc/ChangeLog:
>>>>>>
>>>>>> * unwind-dw2-fde.c (__register_frame): Check the return value of call
>>>>>> to malloc.
>>>>>> (__register_frame_info_table_bases): Check for null begin parameter.
>>>>>> (__register_frame_table): Check the return value of call to malloc,
>>>>>> Check for null begin parameter.
>>>>>> ---
>>>>>> libgcc/unwind-dw2-fde.c | 12 ++++++++++++
>>>>>> 1 file changed, 12 insertions(+)
>>>>>>
>>>>>> diff --git a/libgcc/unwind-dw2-fde.c b/libgcc/unwind-dw2-fde.c
>>>>>> index cdfd3974c99..f0bc29d682a 100644
>>>>>> --- a/libgcc/unwind-dw2-fde.c
>>>>>> +++ b/libgcc/unwind-dw2-fde.c
>>>>>> @@ -169,6 +169,8 @@ __register_frame (void *begin)
>>>>>> return;
>>>>>>
>>>>>> ob = malloc (sizeof (struct object));
>>>>>> + if (ob == NULL)
>>>>>> + abort ();
>>>>>> __register_frame_info (begin, ob);
>>>>>> }
>>>>>>
>>>>>> @@ -180,6 +182,10 @@ void
>>>>>> __register_frame_info_table_bases (void *begin, struct object *ob,
>>>>>> void *tbase, void *dbase)
>>>>>> {
>>>>>> + /* If .eh_frame is empty, don't register at all. */
>>>>>> + if ((const uword *) begin == 0 || *(const uword *) begin == 0)
>>>>>> + return;
>>>>>> +
>>>>>> ob->pc_begin = (void *)-1;
>>>>>> ob->tbase = tbase;
>>>>>> ob->dbase = dbase;
>>>>>> @@ -210,7 +216,13 @@ __register_frame_info_table (void *begin, struct object *ob)
>>>>>> void
>>>>>> __register_frame_table (void *begin)
>>>>>> {
>>>>>> + /* If .eh_frame is empty, don't register at all. */
>>>>>> + if (*(uword *) begin == 0)
>>>>>> + return;
>>>>>> +
>>>>>> struct object *ob = malloc (sizeof (struct object));
>>>>>> + if (ob == NULL)
>>>>>> + abort ();
>>>>>> __register_frame_info_table (begin, ob);
>>>>>> }
>>>>>>
>>>>>> --
>>>>>> 2.31.1
@@ -169,6 +169,8 @@ __register_frame (void *begin)
return;
ob = malloc (sizeof (struct object));
+ if (ob == NULL)
+ abort ();
__register_frame_info (begin, ob);
}
@@ -180,6 +182,10 @@ void
__register_frame_info_table_bases (void *begin, struct object *ob,
void *tbase, void *dbase)
{
+ /* If .eh_frame is empty, don't register at all. */
+ if ((const uword *) begin == 0 || *(const uword *) begin == 0)
+ return;
+
ob->pc_begin = (void *)-1;
ob->tbase = tbase;
ob->dbase = dbase;
@@ -210,7 +216,13 @@ __register_frame_info_table (void *begin, struct object *ob)
void
__register_frame_table (void *begin)
{
+ /* If .eh_frame is empty, don't register at all. */
+ if (*(uword *) begin == 0)
+ return;
+
struct object *ob = malloc (sizeof (struct object));
+ if (ob == NULL)
+ abort ();
__register_frame_info_table (begin, ob);
}