PowerPC: memset optimization for POWER8/PPC64

Message ID 53C920CD.8030506@linux.vnet.ibm.com
State Committed
Delegated to: Adhemerval Zanella Netto

Commit Message

Adhemerval Zanella Netto July 18, 2014, 1:27 p.m. UTC
  This patch adds an optimized memset implementation for POWER8.  For 
sizes from 0 to 255 bytes, a word/doubleword algorithm similar to the
POWER7 optimized one is used.

For sizes higher than 255, two strategies are used:

1. If the constant is different from 0, the memory is written with
   Altivec vector instructions;

2. If the constant is 0, dcbz instructions are used.  The loop is unrolled
   to clear 512 bytes at a time.

Using vector instructions increases throughput considerably, roughly doubling
performance for sizes larger than 1024 bytes.  The unrolled dcbz loop also
shows a performance improvement, doubling throughput for sizes larger than
8192 bytes.
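
In rough C terms the strategy selection looks like the sketch below (illustrative
only: the helper names are made up, and the real code is the hand-written
assembly in the patch):

  #include <stddef.h>

  /* Made-up helpers standing in for the assembly paths.  */
  void *set_small  (void *s, int c, size_t n);  /* word/doubleword stores       */
  void *set_vector (void *s, int c, size_t n);  /* 16-byte stvx, 128 B per iter */
  void *set_dcbz   (void *s, size_t n);         /* dcbz loop, 512 B per iter    */

  void *
  memset_power8_sketch (void *s, int c, size_t n)
  {
    if (n <= 255)
      return set_small (s, c, n);
    if (c != 0)
      return set_vector (s, c, n);
    return set_dcbz (s, n);
  }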

Tested on powerpc64 and powerpc64le (POWER8), GLIBC benchmark output attached.

--

	* benchtests/bench-memset.c (test_main): Add more tests for sizes
	from 32 to 512 bytes.
	* sysdeps/powerpc/powerpc64/multiarch/Makefile [sysdep_routines]:
	Add POWER8 memset object.
	* sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add POWER8 memset and bzero implementations.
	* sysdeps/powerpc/powerpc64/multiarch/bzero.c (__bzero): Add POWER8
	implementation.
	* sysdeps/powerpc/powerpc64/multiarch/memset.c (__libc_memset):
	Likewise.
	* sysdeps/powerpc/powerpc64/multiarch/memset-power8.S: New file:
	multiarch POWER8 memset optimization.
	* sysdeps/powerpc/powerpc64/power8/memset.S: New file: optimized
	memset for POWER8.

---
  

Comments

Richard Henderson July 18, 2014, 4:20 p.m. UTC | #1
On 07/18/2014 06:27 AM, Adhemerval Zanella wrote:
> +	andi.	r11,r10,r15	/* Check alignment of DST.  */

s/r15/15/

I had to read that line several times before I noticed the I in ANDI, and that
this wasn't in fact a read of the uninitialized r15.  (Stupid ppc
non-enforcement of registers vs. integers syntax...)

> +	mtocrf	0x01,r0
> +	clrldi	r0,r0,60
> +
> +	/* Get DST aligned to 16 bytes.  */
> +1:	bf	31,2f
> +	stb	r4,0(r10)
> +	addi	r10,r10,1
> +
> +2:	bf	30,4f
> +	sth	r4,0(r10)
> +	addi	r10,r10,2
> +
> +4:	bf	29,8f
> +	stw	r4,0(r10)
> +	addi	r10,r10,4
> +
> +8:	bf      28,16f
> +	std     r4,0(r10)
> +	addi    r10,r10,8
> +
> +16:	subf	r5,r0,r5

As clever as this is, surely it is less efficient than using the unaligned
store hardware.  You know that there are at least 32 bytes to be written; you
could just do two unaligned std and then realign.
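
In C terms the suggested alternative would be roughly the following (an
illustrative sketch only, not code from the patch; c8 is assumed to be the set
byte already replicated to 8 bytes):

  #include <stddef.h>
  #include <stdint.h>
  #include <string.h>

  /* Cover the first 16 bytes with two (possibly unaligned) 8-byte stores,
     then round DST up to the next 16-byte boundary and continue aligned.  */
  static unsigned char *
  head_unaligned (unsigned char *p, uint64_t c8, size_t *n)
  {
    memcpy (p, &c8, 8);            /* stands in for an unaligned std */
    memcpy (p + 8, &c8, 8);
    unsigned char *q = (unsigned char *)
      (((uintptr_t) p + 16) & ~(uintptr_t) 15);
    *n -= (size_t) (q - p);        /* 1..16 bytes already covered */
    return q;
  }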

> +	/* Write remaining 1~31 bytes.  */
> +	.align  4
> +L(tail_bytes):
> +	beqlr   cr6
> +
> +	srdi    r7,r11,4
> +	clrldi  r8,r11,60
> +	mtocrf  0x01,r7

Likewise.


r~
  
Segher Boessenkool July 21, 2014, 5:40 a.m. UTC | #2
Hi,

Some minor spellos...  Looks fine otherwise.


> +	andi.	r11,r10,r15	/* Check alignment of DST.  */

s/r15/15/

> +	/* Size betwen 32 and 255 bytes with constant different than 0, use
> +	   doubleword store instruction to achieve best throughput.  */

s/betwen/between/

> +	/* Replicate set byte to quardword in VMX register.  */

s/quard/quad/

> +	addi	10,r10,64

s/10/r10/

> +	/* Special case when value is 0 and we have a long length to deal
> +	   with.  Use dcbz to zero out a full cacheline of 128-bytes at a time.
> +	   Before using dcbz though, we need to get the destination 128-bytes
> +	   aligned.  */

s/128-bytes/128 bytes/  both times.  Or "128-byte" the second time?

> +L(write_LT_32):
> +	cmpldi	cr6,5,8
> +	mtocrf	0x01,5

s/5/r5/  both times.


Segher
  
Adhemerval Zanella Netto July 21, 2014, 1:17 p.m. UTC | #3
Hi Richard, 

Thanks for the review.


On 18-07-2014 13:20, Richard Henderson wrote:
> On 07/18/2014 06:27 AM, Adhemerval Zanella wrote:
>> +	andi.	r11,r10,r15	/* Check alignment of DST.  */
> s/r15/15/
>
> I had to read that line several times before I noticed the I in ANDI, and that
> this wasn't in fact a read of the uninitialized r15.  (Stupid ppc
> non-enforcement of registers vs. integers syntax...)

Thanks, I have fixed. 

>
>> +	mtocrf	0x01,r0
>> +	clrldi	r0,r0,60
>> +
>> +	/* Get DST aligned to 16 bytes.  */
>> +1:	bf	31,2f
>> +	stb	r4,0(r10)
>> +	addi	r10,r10,1
>> +
>> +2:	bf	30,4f
>> +	sth	r4,0(r10)
>> +	addi	r10,r10,2
>> +
>> +4:	bf	29,8f
>> +	stw	r4,0(r10)
>> +	addi	r10,r10,4
>> +
>> +8:	bf      28,16f
>> +	std     r4,0(r10)
>> +	addi    r10,r10,8
>> +
>> +16:	subf	r5,r0,r5
> As clever as this is, surely it is less efficient than using the unaligned
> store hardware.  You know that there are at least 32 bytes to be written; you
> could just do two unaligned std and then realign.

In fact, in this case it only needs to write 1-15 bytes, based on the 'clrldi' result.
And for POWER8, although unaligned stores are handled with performance equivalent to
aligned ones, in some cases POWER8 will either:

* break the unaligned store into multiple internal operations (misaligned stores
  crossing a 128-byte cache-line boundary or a 4KB small-page boundary cause flushes);

* trigger an alignment interrupt in caching-inhibited storage.  This is why I
  pushed the patch 87868c2418fb74357757e3b739ce5b76b17a8929 for memcpy: if you use
  memcpy on DMA-mapped memory (from a GPU for instance), doing *any* unaligned
  store will result in an alignment interrupt.  And I got reports that the X server
  is doing exactly that (that's why the patch).

So I think the performance difference here, to avoid such traps, is worth it.
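
For reference, the bit-tested prologue quoted above amounts to roughly this in C
terms (an illustrative sketch only; c2, c4 and c8 stand for the set byte
replicated to 2, 4 and 8 bytes):

  #include <stddef.h>
  #include <stdint.h>

  /* Align DST to 16 bytes using at most one store of each size, so no store
     ever crosses an alignment boundary.  */
  static unsigned char *
  align_dst_16 (unsigned char *p, uint16_t c2, uint32_t c4, uint64_t c8,
                size_t *n)
  {
    size_t pad = (-(uintptr_t) p) & 15;   /* 0..15 bytes to 16-byte alignment */
    if (pad & 1) { *p = (unsigned char) c8; p += 1; }
    if (pad & 2) { *(uint16_t *) p = c2; p += 2; }
    if (pad & 4) { *(uint32_t *) p = c4; p += 4; }
    if (pad & 8) { *(uint64_t *) p = c8; p += 8; }
    *n -= pad;                            /* matches the final subf r5,r0,r5 */
    return p;
  }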


>
>> +	/* Write remaining 1~31 bytes.  */
>> +	.align  4
>> +L(tail_bytes):
>> +	beqlr   cr6
>> +
>> +	srdi    r7,r11,4
>> +	clrldi  r8,r11,60
>> +	mtocrf  0x01,r7
> Likewise.
>
>
> r~
>
  
Adhemerval Zanella Netto July 21, 2014, 1:17 p.m. UTC | #4
Hi Segher,


On 21-07-2014 02:40, Segher Boessenkool wrote:
> Hi,
>
> Some minor spellos...  Looks fine otherwise.
>
>
>> +	andi.	r11,r10,r15	/* Check alignment of DST.  */
> s/r15/15/
>
>> +	/* Size betwen 32 and 255 bytes with constant different than 0, use
>> +	   doubleword store instruction to achieve best throughput.  */
> s/betwen/between/
>
>> +	/* Replicate set byte to quardword in VMX register.  */
> s/quard/quad/
>
>> +	addi	10,r10,64
> s/10/r10/
>
>> +	/* Special case when value is 0 and we have a long length to deal
>> +	   with.  Use dcbz to zero out a full cacheline of 128-bytes at a time.
>> +	   Before using dcbz though, we need to get the destination 128-bytes
>> +	   aligned.  */
> s/128-bytes/128 bytes/  both times.  Or "128-byte" the second time?
>
>> +L(write_LT_32):
>> +	cmpldi	cr6,5,8
>> +	mtocrf	0x01,5
> s/5/r5/  both times.
>
>
> Segher
>
I have fixed all, thanks.
  
Richard Henderson July 21, 2014, 7:16 p.m. UTC | #5
On 07/21/2014 03:17 AM, Adhemerval Zanella wrote:
> In fact, in this case it only needs to write 1-15 bytes, based on the 'clrldi' result.
> And for POWER8, although unaligned stores are handled with performance equivalent to
> aligned ones, in some cases POWER8 will either:
> 
> * break the unaligned store into multiple internal operations (misaligned stores
>   crossing a 128-byte cache-line boundary or a 4KB small-page boundary cause flushes);

This, I assume, isn't actually a big deal.  Internal operations floating around
the execute queue are certainly better than the pipeline flush caused by a
mis-predicted branch.

> * trigger an alignment interrupt in caching-inhibited storage.  This is why I
>   pushed the patch 87868c2418fb74357757e3b739ce5b76b17a8929 for memcpy: if you use
>   memcpy on DMA-mapped memory (from a GPU for instance), doing *any* unaligned
>   store will result in an alignment interrupt.  And I got reports that the X server
>   is doing exactly that (that's why the patch).

However, this is certainly a good reason.  Thanks for the pointer.


r~
  
Adhemerval Zanella Netto July 22, 2014, 12:59 p.m. UTC | #6
Hi Allan,

How are the plans for the code freeze?  Do we still have time to push it and
the bzero cleanup [1]?

[1] https://sourceware.org/ml/libc-alpha/2014-07/msg00447.html

On 18-07-2014 10:27, Adhemerval Zanella wrote:
> This patch adds an optimized memset implementation for POWER8.  For
> sizes from 0 to 255 bytes, a word/doubleword algorithm similar to the
> POWER7 optimized one is used.
>
> For sizes higher than 255, two strategies are used:
>
> 1. If the constant is different from 0, the memory is written with
>    Altivec vector instructions;
>
> 2. If the constant is 0, dcbz instructions are used.  The loop is unrolled
>    to clear 512 bytes at a time.
>
> Using vector instructions increases throughput considerably, roughly doubling
> performance for sizes larger than 1024 bytes.  The unrolled dcbz loop also
> shows a performance improvement, doubling throughput for sizes larger than
> 8192 bytes.
>
> Tested on powerpc64 and powerpc64le (POWER8), GLIBC benchmark output attached.
>
>
  
Allan McRae July 23, 2014, 12:34 a.m. UTC | #7
On 22/07/14 22:59, Adhemerval Zanella wrote:
> Hi Allan,
> 
> How are the plans for the code freeze?  Do we still have time to push it and
> the bzero cleanup [1]?
> 
> [1] https://sourceware.org/ml/libc-alpha/2014-07/msg00447.html
> 

I think the benchtest additions should be a separate patch.

As far as the memset patch goes, it was submitted very late for 2.20...
 But I give machine maintainers decision power for patches like this.
So it is up to you.

Allan
  
Adhemerval Zanella Netto Sept. 10, 2014, 11:47 a.m. UTC | #8
On 22-07-2014 21:34, Allan McRae wrote:
> On 22/07/14 22:59, Adhemerval Zanella wrote:
>> Hi Allan,
>>
>> How are the plans for the code freeze?  Do we still have time to push it and
>> the bzero cleanup [1]?
>>
>> [1] https://sourceware.org/ml/libc-alpha/2014-07/msg00447.html
>>
> I think the benchtest additions should be a separate patch.
>
> As far as the memset patch goes, it was submitted very late for 2.20...
>  But I give machine maintainers decision power for patches like this.
> So it is up to you.
>
> Allan
>
Pushed upstream as 71ae86478edc7b21872464f43fb29ff650c1681a.
  

Patch

From cbd995ed00ca74befbd2ecab26956b90ae627bcd Mon Sep 17 00:00:00 2001
From: Adhemerval Zanella <azanella@linux.vnet.ibm.com>
Date: Tue, 15 Jul 2014 12:19:09 -0400
Subject: [PATCH] PowerPC: memset optimization for POWER8/PPC64

This patch adds an optimized memset implementation for POWER8 by using
vector instructions to write into memory, showing a throughput boost
for sizes larger than 256 bytes (with sizes of 8192 showing double the
performance).

For constant 0 (bzero) the dcbz loop is unrolled to issue 4 instructions
each iteration (512 bytes).  It doubles throughput for sizes larger
than 2048 bytes.
---
 sysdeps/powerpc/powerpc64/multiarch/Makefile       |   2 +-
 sysdeps/powerpc/powerpc64/multiarch/bzero.c        |  11 +-
 .../powerpc/powerpc64/multiarch/ifunc-impl-list.c  |   6 +
 .../powerpc/powerpc64/multiarch/memset-power8.S    |  43 +++
 sysdeps/powerpc/powerpc64/multiarch/memset.c       |  11 +-
 sysdeps/powerpc/powerpc64/power8/memset.S          | 389 +++++++++++++++++++++
 6 files changed, 453 insertions(+), 9 deletions(-)
 create mode 100644 sysdeps/powerpc/powerpc64/multiarch/memset-power8.S
 create mode 100644 sysdeps/powerpc/powerpc64/power8/memset.S

diff --git a/sysdeps/powerpc/powerpc64/multiarch/Makefile b/sysdeps/powerpc/powerpc64/multiarch/Makefile
index 82722fb..aeab813 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc64/multiarch/Makefile
@@ -19,7 +19,7 @@  sysdep_routines += memcpy-power7 memcpy-a2 memcpy-power6 memcpy-cell \
 		   strpbrk-power7 strpbrk-ppc64 strncpy-power7 strncpy-ppc64 \
 		   stpncpy-power7 stpncpy-ppc64 strcmp-power7 strcmp-ppc64 \
 		   strcat-power7 strcat-ppc64 memmove-power7 memmove-ppc64 \
-		   bcopy-ppc64
+		   bcopy-ppc64 memset-power8
 
 CFLAGS-strncase-power7.c += -mcpu=power7 -funroll-loops
 CFLAGS-strncase_l-power7.c += -mcpu=power7 -funroll-loops
diff --git a/sysdeps/powerpc/powerpc64/multiarch/bzero.c b/sysdeps/powerpc/powerpc64/multiarch/bzero.c
index ed83541..298cf00 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/bzero.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/bzero.c
@@ -26,14 +26,17 @@  extern __typeof (bzero) __bzero_ppc attribute_hidden;
 extern __typeof (bzero) __bzero_power4 attribute_hidden;
 extern __typeof (bzero) __bzero_power6 attribute_hidden;
 extern __typeof (bzero) __bzero_power7 attribute_hidden;
+extern __typeof (bzero) __bzero_power8 attribute_hidden;
 
 libc_ifunc (__bzero,
-            (hwcap & PPC_FEATURE_HAS_VSX)
-            ? __bzero_power7 :
-	      (hwcap & PPC_FEATURE_ARCH_2_05)
+            (hwcap2 & PPC_FEATURE2_ARCH_2_07)
+            ? __bzero_power8 :
+	      (hwcap & PPC_FEATURE_HAS_VSX)
+	      ? __bzero_power7 :
+		(hwcap & PPC_FEATURE_ARCH_2_05)
 		? __bzero_power6 :
 		  (hwcap & PPC_FEATURE_POWER4)
-		? __bzero_power4
+		  ? __bzero_power4
             : __bzero_ppc);
 
 weak_alias (__bzero, bzero)
diff --git a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
index a574487..06d5be9 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
@@ -34,6 +34,8 @@  __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
   size_t i = 0;
 
   unsigned long int hwcap = GLRO(dl_hwcap);
+  unsigned long int hwcap2 = GLRO(dl_hwcap2);
+
   /* hwcap contains only the latest supported ISA, the code checks which is
      and fills the previous supported ones.  */
   if (hwcap & PPC_FEATURE_ARCH_2_06)
@@ -69,6 +71,8 @@  __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
 
   /* Support sysdeps/powerpc/powerpc64/multiarch/memset.c.  */
   IFUNC_IMPL (i, name, memset,
+	      IFUNC_IMPL_ADD (array, i, memset, hwcap2 & PPC_FEATURE2_ARCH_2_07,
+			      __memset_power8)
 	      IFUNC_IMPL_ADD (array, i, memset, hwcap & PPC_FEATURE_HAS_VSX,
 			      __memset_power7)
 	      IFUNC_IMPL_ADD (array, i, memset, hwcap & PPC_FEATURE_ARCH_2_05,
@@ -134,6 +138,8 @@  __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
 
   /* Support sysdeps/powerpc/powerpc64/multiarch/bzero.c.  */
   IFUNC_IMPL (i, name, bzero,
+	      IFUNC_IMPL_ADD (array, i, bzero, hwcap2 & PPC_FEATURE2_ARCH_2_07,
+			      __bzero_power8)
 	      IFUNC_IMPL_ADD (array, i, bzero, hwcap & PPC_FEATURE_HAS_VSX,
 			      __bzero_power7)
 	      IFUNC_IMPL_ADD (array, i, bzero, hwcap & PPC_FEATURE_ARCH_2_05,
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memset-power8.S b/sysdeps/powerpc/powerpc64/multiarch/memset-power8.S
new file mode 100644
index 0000000..70d83f0
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/multiarch/memset-power8.S
@@ -0,0 +1,43 @@ 
+/* Optimized memset implementation for PowerPC64/POWER8.
+   Copyright (C) 2014 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+
+#undef EALIGN
+#define EALIGN(name, alignt, words)				\
+  .section ".text";						\
+  ENTRY_2(__memset_power8)					\
+  .align ALIGNARG(alignt);					\
+  EALIGN_W_##words;						\
+  BODY_LABEL(__memset_power8):					\
+  cfi_startproc;						\
+  LOCALENTRY(__memset_power8)
+
+#undef END_GEN_TB
+#define END_GEN_TB(name, mask)					\
+  cfi_endproc;							\
+  TRACEBACK_MASK(__memset_power8,mask)				\
+  END_2(__memset_power8)
+
+#undef libc_hidden_builtin_def
+#define libc_hidden_builtin_def(name)
+
+#undef __bzero
+#define __bzero __bzero_power8
+
+#include <sysdeps/powerpc/powerpc64/power8/memset.S>
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memset.c b/sysdeps/powerpc/powerpc64/multiarch/memset.c
index aa2ae70..9c7ed10 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memset.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/memset.c
@@ -32,16 +32,19 @@  extern __typeof (__redirect_memset) __memset_ppc attribute_hidden;
 extern __typeof (__redirect_memset) __memset_power4 attribute_hidden;
 extern __typeof (__redirect_memset) __memset_power6 attribute_hidden;
 extern __typeof (__redirect_memset) __memset_power7 attribute_hidden;
+extern __typeof (__redirect_memset) __memset_power8 attribute_hidden;
 
 /* Avoid DWARF definition DIE on ifunc symbol so that GDB can handle
    ifunc symbol properly.  */
 libc_ifunc (__libc_memset,
-            (hwcap & PPC_FEATURE_HAS_VSX)
-            ? __memset_power7 :
-	      (hwcap & PPC_FEATURE_ARCH_2_05)
+            (hwcap2 & PPC_FEATURE2_ARCH_2_07)
+            ? __memset_power8 :
+	      (hwcap & PPC_FEATURE_HAS_VSX)
+	      ? __memset_power7 :
+		(hwcap & PPC_FEATURE_ARCH_2_05)
 		? __memset_power6 :
 		  (hwcap & PPC_FEATURE_POWER4)
-		? __memset_power4
+		  ? __memset_power4
             : __memset_ppc);
 
 #undef memset
diff --git a/sysdeps/powerpc/powerpc64/power8/memset.S b/sysdeps/powerpc/powerpc64/power8/memset.S
new file mode 100644
index 0000000..ce41a63
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/power8/memset.S
@@ -0,0 +1,389 @@ 
+/* Optimized memset implementation for PowerPC64/POWER8.
+   Copyright (C) 2010-2014 Free Software Foundation, Inc.
+   Contributed by Luis Machado <luisgpm@br.ibm.com>.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+
+/* __ptr_t [r3] memset (__ptr_t s [r3], int c [r4], size_t n [r5]);
+   Returns 's'.  */
+
+	.machine power8
+EALIGN (memset, 5, 0)
+	CALL_MCOUNT 3
+
+L(_memset):
+	cmpldi	cr7,r5,31
+	neg	r0,r3
+	mr	r10,r3
+
+	/* Replicate byte to word.  */
+	insrdi	r4,r4,8,48
+	insrdi	r4,r4,16,32
+	ble	cr7,L(copy_LT_32)
+
+	andi.	r11,r10,r15	/* Check alignment of DST.  */
+	insrdi	r4,r4,32,0	/* Replicate word to double word.  */
+
+	mr	r12,r5
+	beq	L(big_aligned)
+
+	mtocrf	0x01,r0
+	clrldi	r0,r0,60
+
+	/* Get DST aligned to 16 bytes.  */
+1:	bf	31,2f
+	stb	r4,0(r10)
+	addi	r10,r10,1
+
+2:	bf	30,4f
+	sth	r4,0(r10)
+	addi	r10,r10,2
+
+4:	bf	29,8f
+	stw	r4,0(r10)
+	addi	r10,r10,4
+
+8:	bf      28,16f
+	std     r4,0(r10)
+	addi    r10,r10,8
+
+16:	subf	r5,r0,r5
+
+	.align	4
+L(big_aligned):
+	cmpldi	cr5,r5,255
+	li	r0,32
+	dcbtst	0,r10
+	cmpldi	cr6,r4,0
+	srdi	r9,r5,3	/* Number of full doublewords remaining.  */
+	crand	27,26,21
+	mtocrf	0x01,r9
+	bt	27,L(huge)
+
+	/* From this point on, we'll copy 32+ bytes and the value
+	   isn't 0 (so we can't use dcbz).  */
+
+	/* Replicate set byte to quardword in VMX register.  */
+	mtvsrd	 v1,r4
+	xxpermdi 32,v0,v1,0
+	vspltb	 v2,v0,15
+
+	/* Main aligned write loop: 128 bytes at a time.  */
+	li	r6,16
+	li	r7,32
+	li	r8,48
+	mtocrf	0x02,r5
+	srdi	r12,r5,7
+	cmpdi	r12,0
+	beq	L(aligned_tail)
+	mtctr	r12
+	b	L(aligned_128loop)
+
+	.align  4
+L(aligned_128loop):
+	stvx	v2,0,r10
+	stvx	v2,r10,r6
+	stvx	v2,r10,r7
+	stvx	v2,r10,r8
+	addi	10,r10,64
+	stvx	v2,0,r10
+	stvx	v2,r10,r6
+	stvx	v2,r10,r7
+	stvx	v2,r10,r8
+	addi	r10,r10,64
+	bdnz	L(aligned_128loop)
+
+	/* Write remaining 1~127 bytes.  */
+L(aligned_tail):
+	mtocrf	0x01,r5
+	bf	25,32f
+	stvx	v2,0,r10
+	stvx	v2,r10,r6
+	stvx	v2,r10,r7
+	stvx	v2,r10,r8
+	addi	r10,r10,64
+
+32:	bf	26,16f
+	stvx	v2,0,r10
+	stvx	v2,r10,r6
+	addi	r10,r10,32
+
+16:	bf	27,8f
+	stvx	v2,0,r10
+	addi	r10,r10,16
+
+8:	bf	28,4f
+	std     r4,0(r10)
+	addi	r10,r10,8
+
+	/* Copies 4~7 bytes.  */
+4:	bf	29,L(tail2)
+	stw     r4,0(r10)
+	bf      30,L(tail5)
+	sth     r4,4(r10)
+	bflr	31
+	stb     r4,6(r10)
+	/* Return original DST pointer.  */
+	blr
+
+	/* Special case when value is 0 and we have a long length to deal
+	   with.  Use dcbz to zero out 128-bytes at a time.  Before using
+	   dcbz though, we need to get the destination 128-bytes aligned.  */
+	.align	4
+L(huge):
+	andi.	r11,r10,127
+	neg	r0,r10
+	beq	L(huge_aligned)
+
+	clrldi	r0,r0,57
+	subf	r5,r0,r5
+	srdi	r0,r0,3
+	mtocrf	0x01,r0
+
+	/* Write 1~128 bytes until DST is aligned to 128 bytes.  */
+8:	bf	28,4f
+
+	std	r4,0(r10)
+	std	r4,8(r10)
+	std	r4,16(r10)
+	std	r4,24(r10)
+	std	r4,32(r10)
+	std	r4,40(r10)
+	std	r4,48(r10)
+	std	r4,56(r10)
+	addi	r10,r10,64
+
+	.align	4
+4:	bf	29,2f
+	std	r4,0(r10)
+	std	r4,8(r10)
+	std	r4,16(r10)
+	std	r4,24(r10)
+	addi	r10,r10,32
+
+	.align	4
+2:	bf	30,1f
+	std	r4,0(r10)
+	std	r4,8(r10)
+	addi	r10,r10,16
+
+	.align	4
+1:	bf	31,L(huge_aligned)
+	std	r4,0(r10)
+	addi	r10,r10,8
+
+L(huge_aligned):
+	srdi	r8,r5,9
+	clrldi	r11,r5,55
+	cmpldi	cr6,r11,0
+	li	r9,128
+	cmpdi	r8,0
+	beq     L(huge_tail)
+	li	r7,256
+	li	r6,384
+	mtctr	r8
+
+	.align	4
+L(huge_loop):
+	/* Sets 512 bytes to zero in each iteration, the loop unrolling shows
+	   a throughput boost for large sizes (2048 bytes or higher).  */
+	dcbz	0,r10
+	dcbz	r9,r10
+	dcbz	r7,r10
+	dcbz	r6,r10
+	addi	r10,r10,512
+	bdnz	L(huge_loop)
+
+	beqlr	cr6
+
+L(huge_tail):
+	srdi    r6,r11,8
+	srdi    r7,r11,4
+	clrldi  r8,r11,4
+	cmpldi  cr6,r8,0
+	mtocrf  0x01,r6
+
+	beq	cr6,L(tail)
+
+	/* We have 1~511 bytes remaining.  */
+	.align	4
+32:	bf	31,16f
+	dcbz	0,r10
+	dcbz	r9,r10
+	addi	r10,r10,256
+
+	.align	4
+16:	mtocrf  0x01,r7
+	bf	28,8f
+	dcbz	0,r10
+	addi	r10,r10,128
+
+	.align 	4
+8:	bf	29,4f
+	std	r4,0(r10)
+	std	r4,8(r10)
+	std	r4,16(r10)
+	std	r4,24(r10)
+	std	r4,32(r10)
+	std	r4,40(r10)
+	std	r4,48(r10)
+	std	r4,56(r10)
+	addi	r10,r10,64
+
+	.align	4
+4:	bf	30,2f
+	std	r4,0(r10)
+	std	r4,8(r10)
+	std	r4,16(r10)
+	std	r4,24(r10)
+	addi	r10,r10,32
+
+	.align	4
+2:	bf	31,L(tail)
+	std	r4,0(r10)
+	std	r4,8(r10)
+	addi	r10,r10,16
+	.align	4
+
+	/* Remaining 1~15 bytes.  */
+L(tail):
+	mtocrf  0x01,r8
+
+	.align	4
+8:	bf	28,4f
+	std	r4,0(r10)
+	addi	r10,r10,8
+
+	.align	4
+4:	bf	29,2f
+	stw	r4,0(r10)
+	addi	r10,r10,4
+
+	.align	4
+2:	bf	30,1f
+	sth	r4,0(r10)
+	addi	r10,r10,2
+
+	.align	4
+1:	bflr	31
+	stb	r4,0(r10)
+	blr
+
+	/* Handle copies of 0~31 bytes.  */
+	.align	4
+L(copy_LT_32):
+	cmpldi	cr6,5,8
+	mtocrf	0x01,5
+	ble	cr6,L(copy_LE_8)
+
+	/* At least 9 bytes to go.  */
+	neg	r8,r4
+	andi.	r0,r8,3
+	cmpldi	cr1,r5,16
+	beq	L(copy_LT_32_aligned)
+
+	/* Force 4-byte alignment for DST.  */
+	mtocrf	0x01,r0
+	subf	r5,r0,r5
+
+2:	bf	30,1f
+	sth	r4,0(r10)
+	addi	r10,r10,2
+
+1:	bf	31,L(end_4bytes_alignment)
+	stb	r4,0(r10)
+	addi	r10,r10,1
+
+	.align	4
+L(end_4bytes_alignment):
+	cmpldi	cr1,r5,16
+	mtocrf	0x01,r5
+
+L(copy_LT_32_aligned):
+	/* At least 6 bytes to go, and DST is word-aligned.  */
+	blt	cr1,8f
+
+	/* Copy 16 bytes.  */
+	stw	r4,0(r10)
+	stw	r4,4(r10)
+	stw	r4,8(r10)
+	stw	r4,12(r10)
+	addi	r10,r10,16
+
+8:	/* Copy 8 bytes.  */
+	bf	28,L(tail4)
+	stw	r4,0(r10)
+	stw	r4,4(r10)
+	addi	r10,r10,8
+
+	.align	4
+	/* Copies 4~7 bytes.  */
+L(tail4):
+	bf	29,L(tail2)
+	stw	r4,0(r10)
+	bf	30,L(tail5)
+	sth	r4,4(r10)
+	bflr	31
+	stb	r4,6(r10)
+	blr
+
+	.align	4
+	/* Copies 2~3 bytes.  */
+L(tail2):
+	bf	30,1f
+	sth	r4,0(r10)
+	bflr	31
+	stb	r4,2(r10)
+	blr
+
+	.align	4
+L(tail5):
+	bflr	31
+	stb	r4,4(r10)
+	blr
+
+	.align	4
+1: 	bflr	31
+	stb	r4,0(r10)
+	/* Return original DST pointer.  */
+	blr
+
+	/* Handles copies of 0~8 bytes.  */
+	.align	4
+L(copy_LE_8):
+	bne	cr6,L(tail4)
+
+	/* Though we could've used ld/std here, they are still
+	   slow for unaligned cases.  */
+	stw	r4,0(r10)
+	stw	r4,4(r10)
+	blr
+END_GEN_TB (memset,TB_TOCLESS)
+libc_hidden_builtin_def (memset)
+
+/* Copied from bzero.S to prevent the linker from inserting a stub
+   between bzero and memset.  */
+ENTRY (__bzero)
+	CALL_MCOUNT 3
+	mr	r5,r4
+	li	r4,0
+	b	L(_memset)
+END (__bzero)
+#ifndef __bzero
+weak_alias (__bzero, bzero)
+#endif
-- 
1.8.2.1