[RFC,v3,05/12] C-SKY: Atomic and Locking Routines

Message ID 52862eb735de64be1bd5a61b21ef9585e665558d.1530246556.git.han_mao@c-sky.com
State New, archived

Commit Message

毛晗 June 29, 2018, 7:58 a.m. UTC
  This patch implements various atomic and locking routines on C-SKY.

	* sysdeps/csky/atomic-machine.h: New file.
	* sysdeps/csky/nptl/bits/pthreadtypes-arch.h: Likewise.
	* sysdeps/csky/nptl/bits/semaphore.h: Likewise.
---
 sysdeps/csky/atomic-machine.h              | 97 ++++++++++++++++++++++++++++++
 sysdeps/csky/nptl/bits/pthreadtypes-arch.h | 70 +++++++++++++++++++++
 sysdeps/csky/nptl/bits/semaphore.h         | 35 +++++++++++
 3 files changed, 202 insertions(+)
 create mode 100644 sysdeps/csky/atomic-machine.h
 create mode 100644 sysdeps/csky/nptl/bits/pthreadtypes-arch.h
 create mode 100644 sysdeps/csky/nptl/bits/semaphore.h
  

Comments

Joseph Myers June 29, 2018, 4:36 p.m. UTC | #1
On Fri, 29 Jun 2018, Mao Han wrote:

> +#define USE_ATOMIC_COMPILER_BUILTINS 0

Does this mean the compiler expands the atomic built-in function calls to 
out-of-line calls to libgcc or libatomic, but you wish to use inline 
expansions in glibc?  (That's the main case when 
USE_ATOMIC_COMPILER_BUILTINS 0 is appropriate.  If there are suitable 
atomic instructions, typically the compiler would expand the calls inline 
and USE_ATOMIC_COMPILER_BUILTINS 1 is most appropriate rather than using 
inline asm for atomics.)
  
毛晗 July 3, 2018, 7:56 a.m. UTC | #2
On Fri, Jun 29, 2018 at 04:36:19PM +0000, Joseph Myers wrote:
> On Fri, 29 Jun 2018, Mao Han wrote:
> 
> > +#define USE_ATOMIC_COMPILER_BUILTINS 0
> 
> Does this mean the compiler expands the atomic built-in function calls to 
> out-of-line calls to libgcc or libatomic, but you wish to use inline 
> expansions in glibc?  (That's the main case when 
> USE_ATOMIC_COMPILER_BUILTINS 0 is appropriate.  If there are suitable 
> atomic instructions, typically the compiler would expand the calls inline 
> and USE_ATOMIC_COMPILER_BUILTINS 1 is most appropriate rather than using 
> inline asm for atomics.)
> 
> -- 
> Joseph S. Myers
> joseph@codesourcery.com
Thanks a lot for the feedback.
The macro seems to be something that reduces the architecture-specific
code needed to support C11-like atomics? If the compiler can expand the
atomic built-ins to the correct call, either libgcc or libatomic (inline),
is it suggested to define USE_ATOMIC_COMPILER_BUILTINS to 1?
I don't know how the difference between an out-of-line call and an
inline call affects the Linux system. Efficiency?
I'm also not sure about the macro ATOMIC_EXCHANGE_USES_CAS.
We don't have any exchange instruction.
We've only got a CAS loop implemented with the Linux helper on ck807/ck810
in libgcc, and one implemented with load/store exclusive on ck860.
According to the comment in pthread_spin_trylock.c, ATOMIC_EXCHANGE_USES_CAS
should be defined to 1 if exchange is not supported? I did not find any
difference while running the testsuite.
Would it be better to have atomic-machine.h like this?

#define __HAVE_64B_ATOMICS 0
#define USE_ATOMIC_COMPILER_BUILTINS 1
#define ATOMIC_EXCHANGE_USES_CAS 1

#define __arch_compare_and_exchange_bool_8_int(mem, newval, oldval, model) \
  (abort (), 0)

#define __arch_compare_and_exchange_bool_16_int(mem, newval, oldval, model) \
  (abort (), 0)

#define __arch_compare_and_exchange_bool_32_int(mem, newval, oldval, model) \
  ({                                                                    \
    typeof (*mem) __oldval = (oldval);                                  \
    !__atomic_compare_exchange_n (mem, (void *) &__oldval, newval, 0,   \
                                  model, __ATOMIC_RELAXED);             \
  })
#define __arch_compare_and_exchange_bool_64_int(mem, newval, oldval, model) \
  (abort (), 0)

#define __arch_compare_and_exchange_val_8_int(mem, newval, oldval, model) \
  (abort (), (__typeof (*mem)) 0)

#define __arch_compare_and_exchange_val_16_int(mem, newval, oldval, model) \
  (abort (), (__typeof (*mem)) 0)

#define __arch_compare_and_exchange_val_32_int(mem, newval, oldval, model) \
  ({                                                                    \
    typeof (*mem) __oldval = (oldval);                                  \
    __atomic_compare_exchange_n (mem, (void *) &__oldval, newval, 0,    \
                                 model, __ATOMIC_RELAXED);              \
    __oldval;                                                           \
  })

#define __arch_compare_and_exchange_val_64_int(mem, newval, oldval, model) \
  (abort (), (__typeof (*mem)) 0)

#define atomic_compare_and_exchange_bool_acq(mem, new, old)             \
  __atomic_bool_bysize (__arch_compare_and_exchange_bool, int,          \
                        mem, new, old, __ATOMIC_ACQUIRE)

#define atomic_compare_and_exchange_val_acq(mem, new, old)              \
  __atomic_val_bysize (__arch_compare_and_exchange_val, int,            \
                       mem, new, old, __ATOMIC_ACQUIRE)
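
For readers comparing the two macro families proposed above: they differ only in what they return. The sketch below restates the 32-bit cases as plain functions so the return conventions are easier to see. The names cas_bool_acq and cas_val_acq are hypothetical and are not part of glibc or of this patch; only the __atomic_compare_exchange_n built-in is taken from the message above.

```c
#include <stdint.h>

/* Hypothetical stand-in for the 32-bit _bool_ macro: returns 0 if the
   swap happened and nonzero otherwise, hence the negation.  */
static inline int
cas_bool_acq (int32_t *mem, int32_t newval, int32_t oldval)
{
  return !__atomic_compare_exchange_n (mem, &oldval, newval, 0,
                                       __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}

/* Hypothetical stand-in for the 32-bit _val_ macro: returns the value
   that was observed in *mem.  On failure the built-in stores the
   observed value into oldval; on success oldval is left unchanged, so
   the result equals the expected value exactly when the swap happened.  */
static inline int32_t
cas_val_acq (int32_t *mem, int32_t newval, int32_t oldval)
{
  __atomic_compare_exchange_n (mem, &oldval, newval, 0,
                               __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
  return oldval;
}
```

A caller of the _val_ form therefore detects success by comparing the returned value with the expected one, while a caller of the _bool_ form tests the result against zero.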
  
Joseph Myers July 17, 2018, 8:03 p.m. UTC | #3
On Tue, 3 Jul 2018, Mao Han wrote:

> The macro seems to be something that reduces the architecture-specific
> code needed to support C11-like atomics? If the compiler can expand the
> atomic built-ins to the correct call, either libgcc or libatomic (inline),
> is it suggested to define USE_ATOMIC_COMPILER_BUILTINS to 1?

If the compiler expands the calls inline, defining 
USE_ATOMIC_COMPILER_BUILTINS to 1 is appropriate unless you have a clear 
reason not to do so (and such a reason would need to have detailed 
comments in glibc explaining it).

If the compiler generates out-of-line libatomic calls for atomic 
operations used in glibc, you need to use inline asm there instead of 
USE_ATOMIC_COMPILER_BUILTINS; glibc is not linked with libatomic.

If the compiler generates out-of-line libgcc calls for atomic operations 
used in glibc, it's possible inline asm will be more efficient.

The main reason for the compiler to generate out-of-line calls is if those 
calls need to use kernel helpers (vDSO / syscalls / etc.).  If the 
processors supported by the port always support suitable atomic 
instructions that can be called directly without needing such kernel 
helpers, the compiler should expand the calls inline, and you should 
define USE_ATOMIC_COMPILER_BUILTINS to 1.
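
One possible way to tell which of the cases above applies to a given compiler is to compile a small probe file to assembly and read the result. The sketch below is only an illustration: the function names probe_cas and probe_exchange are made up for this example, and what counts as an out-of-line call depends on the target and compiler version.

```c
/* Hypothetical probe functions; each wraps one 32-bit atomic built-in
   so that the generated assembly is easy to read.  */

/* Returns 1 if *mem held EXPECTED and was replaced by DESIRED,
   0 otherwise.  */
int
probe_cas (int *mem, int expected, int desired)
{
  return __atomic_compare_exchange_n (mem, &expected, desired, 0,
                                      __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}

/* Stores VALUE into *mem and returns the previous contents.  */
int
probe_exchange (int *mem, int value)
{
  return __atomic_exchange_n (mem, value, __ATOMIC_ACQUIRE);
}
```

Compiling this with the target's GCC using -O2 -S and searching the output for calls to symbols such as __atomic_compare_exchange_4 (libatomic) or __sync_val_compare_and_swap_4 (libgcc) should show whether the built-ins were expanded inline; if no such call appears, the expansion is inline.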

> According to the comment in pthread_spin_trylock.c, ATOMIC_EXCHANGE_USES_CAS
> should be defined to 1 if exchange is not supported? I did not find any

Yes.
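
To illustrate why the setting matters on a target without an exchange instruction: there, an atomic exchange has to be built as a CAS loop, roughly as in the sketch below, so callers that only need a single attempt are better off issuing one CAS directly. The function name exchange_via_cas is hypothetical and this is not the glibc or libgcc implementation, just a minimal sketch using the GCC built-ins.

```c
/* Hypothetical exchange built from compare-and-swap.  Returns the value
   that *mem held immediately before NEWVAL was stored.  */
static inline int
exchange_via_cas (int *mem, int newval)
{
  int old = __atomic_load_n (mem, __ATOMIC_RELAXED);

  /* A weak CAS may fail spuriously, which is fine inside a retry loop.
     On every failure the built-in refreshes OLD with the current
     contents of *mem, so the next attempt uses an up-to-date value.  */
  while (!__atomic_compare_exchange_n (mem, &old, newval, 1,
                                       __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
    ;

  return old;
}
```

With ATOMIC_EXCHANGE_USES_CAS set to 1, code such as the spin lock trylock path can skip this loop and use a single CAS attempt instead.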
  

Patch

diff --git a/sysdeps/csky/atomic-machine.h b/sysdeps/csky/atomic-machine.h
new file mode 100644
index 0000000..b64674f
--- /dev/null
+++ b/sysdeps/csky/atomic-machine.h
@@ -0,0 +1,97 @@ 
+/* Atomic operations.  C-SKY version.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef __CSKY_ATOMIC_H_
+#define __CSKY_ATOMIC_H_
+
+#include <stdint.h>
+
+typedef int32_t atomic32_t;
+typedef uint32_t uatomic32_t;
+
+typedef intptr_t atomicptr_t;
+typedef uintptr_t uatomicptr_t;
+typedef intmax_t atomic_max_t;
+typedef uintmax_t uatomic_max_t;
+
+#define __HAVE_64B_ATOMICS 0
+#define USE_ATOMIC_COMPILER_BUILTINS 0
+#define ATOMIC_EXCHANGE_USES_CAS 0
+
+#define __arch_compare_and_exchange_bool_8_int(mem, newval, oldval)	\
+  (abort (), 0)
+
+#define __arch_compare_and_exchange_bool_16_int(mem, newval, oldval)	\
+  (abort (), 0)
+
+#define __arch_compare_and_exchange_bool_32_int(mem, newval, oldval)	\
+  ({ __typeof (mem) _mem = (mem);					\
+     __typeof (oldval) _oldval = oldval;				\
+     __typeof (newval) _newval = newval;				\
+     register int _a0 asm ("a0") = (int)_oldval;			\
+     register int _a1 asm ("a1") = (int)_newval;			\
+     register int _a2 asm ("a2") = (int)_mem;				\
+     __asm__ __volatile__ ("trap   2"					\
+			   : "+r" (_a0) : "r" (_a1) , "r" (_a2)		\
+			   : "a3", "memory");				\
+     (int) _a0; })
+
+#define __arch_compare_and_exchange_bool_64_int(mem, newval, oldval)	\
+  (abort (), 0)
+
+#define __arch_compare_and_exchange_val_8_int(mem, newval, oldval)	\
+  (abort (), (__typeof (*mem)) 0)
+
+#define __arch_compare_and_exchange_val_16_int(mem, newval, oldval)	\
+  (abort (), (__typeof (*mem)) 0)
+
+#define __arch_compare_and_exchange_val_32_int(mem, newval, oldval)	\
+  ({ __typeof (mem) _mem = (mem);					\
+     __typeof (*mem) __gret = *_mem;					\
+     unsigned int _tmp = 0;						\
+     __typeof (oldval) _oldval = oldval;				\
+     __typeof (newval) _newval = newval;				\
+     register int _a0 asm ("a0") = (int)_oldval;			\
+     register int _a1 asm ("a1") = (int)_newval;			\
+     register int _a2 asm ("a2") = (int)_mem;				\
+     __asm__ __volatile__ ("1:\tldw %1, (%4, 0x0)\n\t"			\
+			   "cmpne %1, %0\n\t"				\
+			   "bt 2f\n\t"					\
+			   "mov %2, %0\n\t"				\
+			   "trap 2\n\t"					\
+			   "cmpnei %0, 0\n\t"				\
+			   "mov %0, %2\n\t"				\
+			   "bt 1b\n"					\
+			   "2:"						\
+			   : "+r" (_a0), "+r"(__gret), "+r" (_tmp)	\
+			   : "r" (_a1) , "r" (_a2)			\
+			   : "a3", "memory");				\
+     __gret; })
+
+#define __arch_compare_and_exchange_val_64_int(mem, newval, oldval)	\
+  (abort (), (__typeof (*mem)) 0)
+
+#define atomic_compare_and_exchange_bool_acq(mem, new, old)		\
+  __atomic_bool_bysize (__arch_compare_and_exchange_bool, int,		\
+			mem, new, old)
+
+#define atomic_compare_and_exchange_val_acq(mem, new, old)		\
+  __atomic_val_bysize (__arch_compare_and_exchange_val, int,		\
+		       mem, new, old)
+
+#endif /* atomic-machine.h */
diff --git a/sysdeps/csky/nptl/bits/pthreadtypes-arch.h b/sysdeps/csky/nptl/bits/pthreadtypes-arch.h
new file mode 100644
index 0000000..d3423fb
--- /dev/null
+++ b/sysdeps/csky/nptl/bits/pthreadtypes-arch.h
@@ -0,0 +1,70 @@ 
+/* Machine-specific pthread type layouts.  C-SKY version.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _BITS_PTHREADTYPES_ARCH_H
+#define _BITS_PTHREADTYPES_ARCH_H	1
+
+#include <endian.h>
+
+#define __SIZEOF_PTHREAD_ATTR_T			36
+#define __SIZEOF_PTHREAD_MUTEX_T		24
+#define __SIZEOF_PTHREAD_MUTEXATTR_T		4
+#define __SIZEOF_PTHREAD_COND_T			48
+#define __SIZEOF_PTHREAD_CONDATTR_T		4
+#define __SIZEOF_PTHREAD_RWLOCK_T		32
+#define __SIZEOF_PTHREAD_RWLOCKATTR_T		8
+#define __SIZEOF_PTHREAD_BARRIER_T		20
+#define __SIZEOF_PTHREAD_BARRIERATTR_T		4
+
+/* Data structure for mutex handling. */
+#define __PTHREAD_COMPAT_PADDING_MID
+#define __PTHREAD_COMPAT_PADDING_END
+#define __PTHREAD_MUTEX_LOCK_ELISION		0
+#define __PTHREAD_MUTEX_NUSERS_AFTER_KIND	1
+#define __PTHREAD_MUTEX_USE_UNION		1
+
+#define __LOCK_ALIGNMENT
+#define __ONCE_ALIGNMENT
+
+/* Paddings in this structure are not strictly necessary on C-SKY.
+   They are left for extensibility, as most other architectures do.  */
+struct __pthread_rwlock_arch_t
+{
+  unsigned int __readers;
+  unsigned int __writers;
+  unsigned int __wrphase_futex;
+  unsigned int __writers_futex;
+  unsigned int __pad3;
+  unsigned int __pad4;
+#if __BYTE_ORDER == __BIG_ENDIAN
+  unsigned char __pad1;
+  unsigned char __pad2;
+  unsigned char __shared;
+  unsigned char __flags;
+#else
+  unsigned char __flags;
+  unsigned char __shared;
+  unsigned char __pad1;
+  unsigned char __pad2;
+#endif
+  int __cur_writer;
+};
+
+#define __PTHREAD_RWLOCK_ELISION_EXTRA 0
+
+#endif
diff --git a/sysdeps/csky/nptl/bits/semaphore.h b/sysdeps/csky/nptl/bits/semaphore.h
new file mode 100644
index 0000000..0b13f59
--- /dev/null
+++ b/sysdeps/csky/nptl/bits/semaphore.h
@@ -0,0 +1,35 @@ 
+/* Machine-specific POSIX semaphore type layouts.  C-SKY version.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _SEMAPHORE_H
+# error "Never use <bits/semaphore.h> directly; include <semaphore.h> instead."
+#endif
+
+
+#define __SIZEOF_SEM_T	16
+
+
+/* Value returned if `sem_open' failed.  */
+#define SEM_FAILED	((sem_t *) 0)
+
+
+typedef union
+{
+  char __size[__SIZEOF_SEM_T];
+  long int __align;
+} sem_t;
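
As a side note on the layout headers in this patch, the declared sizes can be cross-checked against the type definitions at compile time. The sketch below copies the two layouts under hypothetical names (sketch_sem_t, struct sketch_rwlock, and the SKETCH_ macros are not part of the patch) and assumes a 4-byte unsigned int and a long int of at most 16 bytes; only the little-endian field order of the rwlock is reproduced, since the byte order does not affect the size.

```c
/* Sizes as declared in the patch, under hypothetical names.  */
#define SKETCH_SIZEOF_SEM_T 16
#define SKETCH_SIZEOF_PTHREAD_RWLOCK_T 32

/* Same shape as the sem_t union in bits/semaphore.h.  */
typedef union
{
  char size[SKETCH_SIZEOF_SEM_T];
  long int align;
} sketch_sem_t;

/* Same fields as struct __pthread_rwlock_arch_t (little-endian order).  */
struct sketch_rwlock
{
  unsigned int readers;
  unsigned int writers;
  unsigned int wrphase_futex;
  unsigned int writers_futex;
  unsigned int pad3;
  unsigned int pad4;
  unsigned char flags;
  unsigned char shared;
  unsigned char pad1;
  unsigned char pad2;
  int cur_writer;
};

/* Compilation fails if a layout does not match its declared size.  */
_Static_assert (sizeof (sketch_sem_t) == SKETCH_SIZEOF_SEM_T,
                "sem_t layout does not match its declared size");
_Static_assert (sizeof (struct sketch_rwlock)
                == SKETCH_SIZEOF_PTHREAD_RWLOCK_T,
                "rwlock layout does not match its declared size");
```

Six 4-byte counters, four single-byte fields, and one 4-byte int add up to the 32 bytes declared for the rwlock, with no implicit padding needed.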