riscv: configure.ac bug fix for misaligned access when __riscv_misaligned_slow

Message ID CAC82fA0bEoVPA3mSGT5WpupseOa9odE2N-EeuBtmL5ii_ejt4A@mail.gmail.com
State New

Commit Message

Gedare Bloom May 1, 2026, 5:28 a.m. UTC
memcmp triggers data alignment faults on RISC-V targets where the
hw does not allow unaligned access. I observed this on a
Microchip PolarFire running RTEMS. Here is the generated code:

000000008000bb34 <memcmp>:
    8000bb34:   469d                    li      a3,7
    8000bb36:   00c6fb63                bgeu    a3,a2,8000bb4c <memcmp+0x18>
    8000bb3a:   6118                    ld      a4,0(a0)
    8000bb3c:   619c                    ld      a5,0(a1)

The ld at 8000bb3a faults if a0 is not 8-byte aligned and the hw does
not support misaligned loads; the same holds for a1 at 8000bb3c.
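
For readers less familiar with this failure mode, here is a minimal C
sketch, not taken from newlib. The helper names load_u64_any and
is_aligned_8 are hypothetical. Copying through memcpy imposes no
alignment requirement, so the compiler is free to pick an access
sequence that is valid for any address, whereas the ld above assumes
an 8-byte aligned pointer.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from newlib): read 8 bytes from an address
 * that may not be 8-byte aligned. memcpy has no alignment requirement,
 * so the compiler may lower this to narrower loads when the target
 * forbids misaligned accesses. */
static uint64_t load_u64_any(const void *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Report whether p is 8-byte aligned, i.e. whether a single 64-bit
 * load from p would be an aligned access. */
static int is_aligned_8(const void *p)
{
    return ((uintptr_t)p & 7u) == 0;
}
```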

riscv-rtems7-gcc -dM -E - < /dev/null | grep aligned
#define __riscv_misaligned_slow 1

Attached fix corrects this. Generated code is now:
000000008000baf6 <memcmp>:
    8000baf6:   469d                    li      a3,7
    8000baf8:   04c6f063                bgeu    a3,a2,8000bb38 <memcmp+0x42>
    8000bafc:   00a5e7b3                or      a5,a1,a0
    8000bb00:   8b9d                    andi    a5,a5,7
    8000bb02:   c395                    beqz    a5,8000bb26 <memcmp+0x30>
    8000bb04:   167d                    addi    a2,a2,-1
    8000bb06:   0605                    addi    a2,a2,1
    8000bb08:   962a                    add     a2,a2,a0
    8000bb0a:   a019                    j       8000bb10 <memcmp+0x1a>

Now the check at 8000bafc-8000bb02 tests whether either pointer is
misaligned. If so, execution falls through to the byte-by-byte
comparison; otherwise it jumps to the aligned long comparisons.
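
In C terms, the control flow above corresponds roughly to the sketch
below. This is an illustration only, not newlib's memcmp source; the
name memcmp_sketch is hypothetical. It assumes sizeof(long) is 8, as
on RV64, which matches the mask of 7 in the disassembly.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative only; this is not newlib's memcmp source. */
int memcmp_sketch(const void *s1, const void *s2, size_t n)
{
    const unsigned char *a = s1;
    const unsigned char *b = s2;

    /* Mirrors "or a5,a1,a0; andi a5,a5,7; beqz": enter the word loop
     * only when both pointers are aligned and at least one full word
     * remains. */
    if (n >= sizeof(long) &&
        (((uintptr_t)a | (uintptr_t)b) & (sizeof(long) - 1)) == 0) {
        while (n >= sizeof(long)) {
            long wa, wb;
            memcpy(&wa, a, sizeof wa);  /* both addresses are aligned here */
            memcpy(&wb, b, sizeof wb);
            if (wa != wb)
                break;                  /* byte loop finds the difference */
            a += sizeof(long);
            b += sizeof(long);
            n -= sizeof(long);
        }
    }

    /* Byte-by-byte comparison: handles the tail, a mismatching word,
     * and misaligned inputs. */
    while (n-- > 0) {
        if (*a != *b)
            return *a - *b;
        a++;
        b++;
    }
    return 0;
}
```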

Gedare
  

Comments

Pincheng Wang May 7, 2026, 12:06 p.m. UTC | #1
Hi Gedare,

I have a question regarding this inconsistency between the 
compiler-defined macro and actual hardware behavior. Please do not take 
this message as a review comment.

On 2026/5/1 13:28, Gedare Bloom wrote:
> memcmp is generating data alignment errors on risc-v targets where the
> hw does not allow unaligned access. This behavior was observed on
> microchip polarfire with rtems. Here is the code generated:
> 
> 000000008000bb34 <memcmp>:
>      8000bb34:   469d                    li      a3,7
>      8000bb36:   00c6fb63                bgeu    a3,a2,8000bb4c <memcmp+0x18>
>      8000bb3a:   6118                    ld      a4,0(a0)
>      8000bb3c:   619c                    ld      a5,0(a1)
> 
> You can see that 8000bb3a will cause a fault if a0 is not aligned and
> hw does not support it.
> 
> riscv-rtems7-gcc -dM -E - < /dev/null | grep aligned
> #define __riscv_misaligned_slow 1
> 

According to the riscv-c-api-doc[1], the "__riscv_misaligned_slow" macro 
should be defined when scalar misaligned accesses *are supported* but are 
slower than aligned accesses. For hardware that does not allow unaligned 
access at all, "__riscv_misaligned_avoid" seems to be the more appropriate macro.

So, I am wondering whether this is actually a compiler issue rather than 
a C library issue?
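
To make the distinction concrete, here is a small C sketch of how code
could branch on the three macros. The macro names are the ones
discussed in this thread; the function name and the returned labels
are made up for illustration, and the comments reflect my reading
above rather than the exact wording of the document.

```c
/* Return a label for the misaligned-access class the compiler
 * advertises. Function name and labels are illustrative only. */
const char *misaligned_access_class(void)
{
#if defined(__riscv_misaligned_fast)
    return "fast";      /* misaligned scalar accesses are fine to emit */
#elif defined(__riscv_misaligned_slow)
    return "slow";      /* supported, but slower than aligned accesses */
#elif defined(__riscv_misaligned_avoid)
    return "avoid";     /* misaligned accesses should not be emitted */
#else
    return "unknown";   /* none defined, e.g. not a RISC-V target */
#endif
}
```

Under this reading, a routine that must never fault would take its
word-at-a-time path without an alignment check only in the "fast"
case.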

Best regards,
Pincheng Wang

[1] 
https://github.com/riscv-non-isa/riscv-c-api-doc/blob/main/src/c-api.adoc
  
Gedare Bloom May 11, 2026, 4:33 p.m. UTC | #2
On Thu, May 7, 2026 at 6:06 AM Pincheng Wang
<pincheng.plct@isrc.iscas.ac.cn> wrote:
>
> Hi Gedare,
>
> I have a question regarding this inconsistency between the
> compiler-defined macro and actual hardware behavior. Please do not take
> this message as a review comment.
>
> According to the riscv-c-api-doc[1], the "__riscv_misaligned_slow" macro
> should be defined when scalar misaligned accesses *are supported* but are
> slower than aligned accesses. For hardware that does not allow unaligned
> access at all, "__riscv_misaligned_avoid" seems to be the more appropriate macro.
>
> So, I am wondering whether this is actually a compiler issue rather than
> a C library issue?
>

This is a good point. I took my inspiration from the
implementations of memcpy and memmove, which only check for
__riscv_misaligned_fast.
I'm not sure what the right answer is for the optimization.

Regarding my alignment error, you are correct that it
is probably better fixed by using a multilib built with -mstrict-align
so that I get the _avoid variant.

  
Gedare Bloom May 11, 2026, 5:32 p.m. UTC | #3
On Mon, May 11, 2026 at 10:33 AM Gedare Bloom <gedare@rtems.org> wrote:
>
> On Thu, May 7, 2026 at 6:06 AM Pincheng Wang
> <pincheng.plct@isrc.iscas.ac.cn> wrote:
> >
> > Hi Gedare,
> >
> > I have a question regarding this inconsistency between the
> > compiler-defined macro and actual hardware behavior. Please do not take
> > this message as a review comment.
> >
> > According to the riscv-c-api-doc[1], the "__riscv_misaligned_slow" macro
> > should be defined when scalar misaligned accesses *are supported* but are
> > slower than aligned accesses. For hardware that does not allow unaligned
> > access at all, "__riscv_misaligned_avoid" seems to be the more appropriate macro.
> >
> > So, I am wondering whether this is actually a compiler issue rather than
> > a C library issue?
> >
>
> This is a good point. I took my inspiration from the
> implementations of memcpy and memmove, which only check for
> __riscv_misaligned_fast.
> I'm not sure what the right answer is for the optimization.
>
> Regarding my alignment error, you are correct that it
> is probably better fixed by using a multilib built with -mstrict-align
> so that I get the _avoid variant.
>
Using a -mstrict-align multilib seems to work for me now. However, it
does raise the question of whether the other optimized variants should
also be checking for __riscv_misaligned_slow.

  

Patch

From 74564292ffd4c0f2baa4d99715cce614dd4572f6 Mon Sep 17 00:00:00 2001
From: Gedare Bloom <gedare@rtems.org>
Date: Thu, 30 Apr 2026 11:10:53 -0600
Subject: [PATCH] riscv: avoid misaligned accesses if __riscv_misaligned_slow

The configure check for hardware misaligned access is too broad: it
enables misaligned access when either __riscv_misaligned_fast or
__riscv_misaligned_slow is defined. Restrict it to
__riscv_misaligned_fast.
---
 newlib/configure.ac | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/newlib/configure.ac b/newlib/configure.ac
index a4807830e..4640e69c5 100644
--- a/newlib/configure.ac
+++ b/newlib/configure.ac
@@ -555,7 +555,7 @@  if test "x${newlib_hw_misaligned_access}" = "x"; then
   AC_CACHE_CHECK([if $CC has enabled misaligned hardware access],
               [newlib_cv_hw_misaligned_access], [dnl
   cat > conftest.c <<EOF
-#if __riscv_misaligned_fast || __riscv_misaligned_slow
+#if __riscv_misaligned_fast
 void misalign_access_supported(void) {}
 #else
 #error "misaligned access is not supported"
-- 
2.47.3