libgomp/nvptx: Prepare for reverse-offload callback handling, resolve spurious SIGSEGVs (was: [Patch][v5] libgomp/nvptx: Prepare for reverse-offload callback handling)
Hi!
On 2022-10-24T21:11:04+0200, I wrote:
> On 2022-10-24T21:05:46+0200, I wrote:
>> On 2022-10-24T16:07:25+0200, Jakub Jelinek via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>>> On Wed, Oct 12, 2022 at 10:55:26AM +0200, Tobias Burnus wrote:
>>>> libgomp/nvptx: Prepare for reverse-offload callback handling
>>
>>> Ok, thanks.
>>
>> Per commit r13-3460-g131d18e928a3ea1ab2d3bf61aa92d68a8a254609
>> "libgomp/nvptx: Prepare for reverse-offload callback handling",
>> I'm seeing a lot of libgomp execution test regressions. Random
>> example, 'libgomp.c-c++-common/error-1.c':
>>
>> [...]
>> GOMP_OFFLOAD_run: kernel main$_omp_fn$0: launch [(teams: 1), 1, 1] [(lanes: 32), (threads: 8), 1]
>>
>> Thread 1 "a.out" received signal SIGSEGV, Segmentation fault.
>> 0x00007ffff793b87d in GOMP_OFFLOAD_run (ord=<optimized out>, tgt_fn=<optimized out>, tgt_vars=<optimized out>, args=<optimized out>) at [...]/source-gcc/libgomp/plugin/plugin-nvptx.c:2127
>> 2127 if (__atomic_load_n (&ptx_dev->rev_data->fn, __ATOMIC_ACQUIRE) != 0)
>> (gdb) print ptx_dev
>> $1 = (struct ptx_device *) 0x6a55a0
>> (gdb) print ptx_dev->rev_data
>> $2 = (struct rev_offload *) 0xffffffff00000000
>> (gdb) print ptx_dev->rev_data->fn
>> Cannot access memory at address 0xffffffff00000000
>>
>> Why is it even taking this 'if (reverse_offload)' code path, which isn't
>> applicable to this test case (as far as I understand)? (Well, the answer
>> is 'bool reverse_offload = ptx_dev->rev_data != NULL;', but why is that?)
>
> Well.
>
> --- a/libgomp/plugin/plugin-nvptx.c
> +++ b/libgomp/plugin/plugin-nvptx.c
>
> @@ -329,6 +332,7 @@ struct ptx_device
> pthread_mutex_t lock;
> } omp_stacks;
>
> + struct rev_offload *rev_data;
> struct ptx_device *next;
> };
>
> ... but as far as I can tell, this is never initialized in
> 'nvptx_open_device', which does 'ptx_dev = GOMP_PLUGIN_malloc ([...]);'.
> Would the following be the correct fix (currently testing)?
>
> --- libgomp/plugin/plugin-nvptx.c
> +++ libgomp/plugin/plugin-nvptx.c
> @@ -546,6 +546,8 @@ nvptx_open_device (int n)
> ptx_dev->omp_stacks.size = 0;
> pthread_mutex_init (&ptx_dev->omp_stacks.lock, NULL);
>
> + ptx_dev->rev_data = NULL;
> +
> return ptx_dev;
> }
That did clean up the libgomp execution test regressions; pushed to
master branch as commit 205538832b7033699047900cf25928f5920d8b93
"libgomp/nvptx: Prepare for reverse-offload callback handling, resolve spurious SIGSEGVs",
see attached.
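
For illustration only, here is a minimal standalone sketch of why the missing initialization matters. It is not the plugin code: 'struct device_sketch', 'struct rev_offload_sketch', 'open_device_sketch' and 'wants_reverse_offload' are hypothetical stand-ins, and plain 'malloc' is assumed in place of 'GOMP_PLUGIN_malloc'. Because reading truly uninitialized memory is undefined, the sketch fills the allocation with 0xff bytes to simulate stale heap contents, assuming (as on common platforms) that an all-ones pointer can be compared against NULL.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the plugin's types; not the real definitions.  */
struct rev_offload_sketch
{
  unsigned long fn;
};

struct device_sketch
{
  size_t stacks_size;
  struct rev_offload_sketch *rev_data;
};

/* Allocate a device the way a plain malloc wrapper would, optionally
   initializing 'rev_data'.  The memset simulates stale heap contents in a
   well-defined way; memory returned by malloc is simply indeterminate.  */
static struct device_sketch *
open_device_sketch (bool init_rev_data)
{
  struct device_sketch *dev = malloc (sizeof *dev);
  if (dev == NULL)
    abort ();
  memset (dev, 0xff, sizeof *dev);

  dev->stacks_size = 0;
  if (init_rev_data)
    dev->rev_data = NULL;

  return dev;
}

/* Mirrors the 'ptx_dev->rev_data != NULL' check discussed above.  */
static bool
wants_reverse_offload (const struct device_sketch *dev)
{
  return dev->rev_data != NULL;
}
```

Without the explicit assignment, 'wants_reverse_offload' reports true for the simulated garbage pointer, so a later dereference like 'dev->rev_data->fn' would fault; with it, the reverse-offload path is skipped. Zero-initializing the whole structure at allocation time would presumably have the same effect, but the pushed fix sets only the one field.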
Grüße
Thomas
From 205538832b7033699047900cf25928f5920d8b93 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Mon, 24 Oct 2022 21:11:47 +0200
Subject: [PATCH] libgomp/nvptx: Prepare for reverse-offload callback handling,
resolve spurious SIGSEGVs
Per commit r13-3460-g131d18e928a3ea1ab2d3bf61aa92d68a8a254609
"libgomp/nvptx: Prepare for reverse-offload callback handling",
I'm seeing a lot of libgomp execution test regressions. Random
example, 'libgomp.c-c++-common/error-1.c':
[...]
GOMP_OFFLOAD_run: kernel main$_omp_fn$0: launch [(teams: 1), 1, 1] [(lanes: 32), (threads: 8), 1]
Thread 1 "a.out" received signal SIGSEGV, Segmentation fault.
0x00007ffff793b87d in GOMP_OFFLOAD_run (ord=<optimized out>, tgt_fn=<optimized out>, tgt_vars=<optimized out>, args=<optimized out>) at [...]/source-gcc/libgomp/plugin/plugin-nvptx.c:2127
2127 if (__atomic_load_n (&ptx_dev->rev_data->fn, __ATOMIC_ACQUIRE) != 0)
(gdb) print ptx_dev
$1 = (struct ptx_device *) 0x6a55a0
(gdb) print ptx_dev->rev_data
$2 = (struct rev_offload *) 0xffffffff00000000
(gdb) print ptx_dev->rev_data->fn
Cannot access memory at address 0xffffffff00000000
libgomp/
* plugin/plugin-nvptx.c (nvptx_open_device): Initialize
'ptx_dev->rev_data'.
---
libgomp/plugin/plugin-nvptx.c | 2 ++
1 file changed, 2 insertions(+)
@@ -546,6 +546,8 @@ nvptx_open_device (int n)
ptx_dev->omp_stacks.size = 0;
pthread_mutex_init (&ptx_dev->omp_stacks.lock, NULL);
+ ptx_dev->rev_data = NULL;
+
return ptx_dev;
}
--
2.35.1