From patchwork Thu Jan 11 18:25:39 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella X-Patchwork-Id: 25344 Received: (qmail 97174 invoked by alias); 11 Jan 2018 18:25:48 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 97078 invoked by uid 89); 11 Jan 2018 18:25:47 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-25.9 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE, SPF_PASS autolearn=ham version=3.3.2 spammy= X-HELO: mail-qk0-f195.google.com X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=7VowTjxHvQIxzknL5FRriKSFAgEtvQSl4L5rZls/dQU=; b=Tf/DcQH5Lxjq3Wl7CBJg47FbASpSJi9nlHvCFu3DWNpkmG3d9bXqbjKLXKjjwjTnFm J9rVZhJpwzgokGfLRbk3RphjuT+HVnQSnW8G8jCyS5wYefvFzHstkXs/VfilAyKyNpro G5MMJyH7JCr93iQWplkbh2wugSRNIhpeYyAS4bbKQ290onm5YW/sRmDW6A7o8ZPtVCW3 BVCXe+wnZM6Ftq6fcKa2lcpJKwtPz2sQYn/C4DAT1ayoIY8V2qmJUcNL1W7ST92wd4zf /8V+fInfwa7t2qXed233whbjLpYj6bG39tpa/9x2Ga+nL8OQZ1TNbCxxTRFLPl1c5a95 HMWA== X-Gm-Message-State: AKwxytctqzIQ6leLawwKzEj/7+wVCWndt2Bm/JMPwtinmzrQpeEpnUqb 5xTiU1zaUe50VyJvxriHFILxGN4jjNQ= X-Google-Smtp-Source: ACJfBovxhCJkHXkxRsUCbH5NZB+MdGVz1MCvHQrJTuePX/kpWCjJZWDdzvWTyVdops1X3SpSXsfijg== X-Received: by 10.55.80.7 with SMTP id e7mr34503276qkb.230.1515695142058; Thu, 11 Jan 2018 10:25:42 -0800 (PST) Subject: Re: [PATCH v3 04/18] Add string vectorized find and detection functions To: Joseph Myers Cc: libc-alpha@sourceware.org References: <1515588482-15744-1-git-send-email-adhemerval.zanella@linaro.org> <1515588482-15744-5-git-send-email-adhemerval.zanella@linaro.org> From: Adhemerval Zanella Message-ID: Date: Thu, 11 Jan 2018 16:25:39 -0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.0 MIME-Version: 1.0 In-Reply-To: On 11/01/2018 11:34, Joseph Myers wrote: > On Wed, 10 Jan 2018, Adhemerval Zanella wrote: > >> + >> +static inline unsigned char >> +extractbyte (op_t x, unsigned idx) > > Missing comment on the semantics of this function. "unsigned int". I applied this change locally: diff --git a/sysdeps/generic/string-extbyte.h b/sysdeps/generic/string-extbyte.h index 69a78ce..c4693b2 100644 --- a/sysdeps/generic/string-extbyte.h +++ b/sysdeps/generic/string-extbyte.h @@ -23,8 +23,10 @@ #include #include +/* Extract the byte at index IDX from word X, with index 0 being the + least significant byte. */ static inline unsigned char -extractbyte (op_t x, unsigned idx) +extractbyte (op_t x, unsigned int idx) { if (__BYTE_ORDER == __LITTLE_ENDIAN) return x >> (idx * CHAR_BIT); > >> +/* For architecture which only provides __builtin_clz{l} (HAVE_BUILTIN_CLZ) >> + and/or __builtin_ctz{l} (HAVE_BUILTIN_CTZ) which uses external libcalls >> + (for intance __c{l,t}z{s,d}i2 from libgcc) the following wrapper provides >> + inline implementation for both count leading zeros and count trailing >> + zeros using branchless computation. */ > > I think the comments need to say a bit more about the semantics of these > functions. In particular, do they follow the same rule as the built-in > functions that behavior is undefined if the argument is zero? If they do, > then I'd expect the comments on the functions that call them to specify > that they must not be called with a zero argument (zero arguments in this > case generally corresponding to words that are not at the end of the > string etc., so the functions indeed don't get called in that case). > I think it is reasonable to set the same semantic as the builtins and I did not pay attention to outline this mainly because they are not indeed used called in such cases (which would be UB for the builtin case in fact). I applied this patch locally: diff --git a/sysdeps/generic/string-fzi.h b/sysdeps/generic/string-fzi.h index 57101f2..534babb 100644 --- a/sysdeps/generic/string-fzi.h +++ b/sysdeps/generic/string-fzi.h @@ -29,7 +29,7 @@ [1] https://chessprogramming.wikispaces.com/Kim+Walisch */ -static inline unsigned +static inline unsigned int index_access (const op_t i) { static const char index[] = @@ -53,13 +53,14 @@ index_access (const op_t i) return index[i]; } -/* For architecture which only provides __builtin_clz{l} (HAVE_BUILTIN_CLZ) +/* For architectures which only provides __builtin_clz{l} (HAVE_BUILTIN_CLZ) and/or __builtin_ctz{l} (HAVE_BUILTIN_CTZ) which uses external libcalls (for intance __c{l,t}z{s,d}i2 from libgcc) the following wrapper provides inline implementation for both count leading zeros and count trailing - zeros using branchless computation. */ + zeros using branchless computation. As for builtin, if x is 0 the + result is undefined.*/ -static inline unsigned +static inline unsigned int __ctz (op_t x) { #if !HAVE_BUILTIN_CTZ @@ -78,7 +79,7 @@ __ctz (op_t x) #endif }; -static inline unsigned +static inline unsigned int __clz (op_t x) { #if !HAVE_BUILTIN_CLZ