From patchwork Mon Oct 17 15:44:03 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Carlos O'Donell
X-Patchwork-Id: 16580
Subject: Re: [PATCH] Bug 201689: Belt-and-suspenders detection of FMA.
To: "H.J. Lu"
References: <90af3efa-aaca-2dce-e433-1df7e5dbcfbd@redhat.com>
Cc: GNU C Library
From: Carlos O'Donell
Message-ID: <85c5a00e-583b-da48-0bac-66ec6231be5a@redhat.com>
Date: Mon, 17 Oct 2016 11:44:03 -0400

On 10/14/2016 03:12 PM, H.J. Lu wrote:
> On Fri, Oct 14, 2016 at 11:09 AM, Carlos O'Donell wrote:
>> In the Intel Architecture Instruction Set Extensions Programming reference
>> the recommended way to test for FMA in section '2.2.1 Detection of FMA'
>> is:
>>
>> "Application Software must identify that hardware supports AVX as explained
>> in ... after that it must also detect support for FMA..."
>>
>> We don't do that in glibc. We use osxsave to detect the use of xgetbv, and
>> after that we check for AVX and FMA orthogonally. It is conceivable that
>> you could have the AVX bit clear and the FMA bit in an undefined state.
>>
>> I have never seen a machine with the AVX bit clear and the FMA bit set, but
>> we should follow the intel specifications and adjust our check as the following
>> patch works.
>
> One can't write a program with FMA nor AVX2 without using AVX
> instructions. AVX2/FMA aren't usable if AVX isn't. We should do
>
> if (CPU_FEATURES_CPU_P (cpu_features, AVX))
>   {
>     Set AVX available
>     if (CPU_FEATURES_CPU_P (cpu_features, AVX2))
>       Set AVX2 available
>     if (CPU_FEATURES_CPU_P (cpu_features, FMA))
>       Set FMA available
>   }
>

Agreed. I double checked the manual, here is a new patch.

Testing on x86_64 and i686.

OK to commit if the results are clean?

v2
- Move FMA and AVX2 check up into AVX check since they depend upon it.

2016-10-14  Carlos O'Donell

        [BZ #20689]
        * sysdeps/x86/cpu-features.c: Only enable FMA and AVX2 if AVX is
        usable.

diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
index 11b9af2..e228a76 100644
--- a/sysdeps/x86/cpu-features.c
+++ b/sysdeps/x86/cpu-features.c
@@ -60,12 +60,20 @@ get_common_indeces (struct cpu_features *cpu_features,
         {
           /* Determine if AVX is usable.  */
           if (CPU_FEATURES_CPU_P (cpu_features, AVX))
-            cpu_features->feature[index_arch_AVX_Usable]
-              |= bit_arch_AVX_Usable;
-          /* Determine if AVX2 is usable.  */
-          if (CPU_FEATURES_CPU_P (cpu_features, AVX2))
-            cpu_features->feature[index_arch_AVX2_Usable]
-              |= bit_arch_AVX2_Usable;
+            {
+              cpu_features->feature[index_arch_AVX_Usable]
+                |= bit_arch_AVX_Usable;
+              /* The following features depend on AVX being usable.  */
+              /* Determine if AVX2 is usable.  */
+              if (CPU_FEATURES_CPU_P (cpu_features, AVX2))
+                cpu_features->feature[index_arch_AVX2_Usable]
+                  |= bit_arch_AVX2_Usable;
+              /* Determine if FMA is usable.  */
+              if (CPU_FEATURES_CPU_P (cpu_features, FMA))
+                cpu_features->feature[index_arch_FMA_Usable]
+                  |= bit_arch_FMA_Usable;
+            }
+
           /* Check if OPMASK state, upper 256-bit of ZMM0-ZMM15 and
              ZMM16-ZMM31 state are enabled.  */
           if ((xcrlow & (bit_Opmask_state | bit_ZMM0_15_state
@@ -83,10 +91,6 @@ get_common_indeces (struct cpu_features *cpu_features,
                     |= bit_arch_AVX512DQ_Usable;
                 }
             }
-          /* Determine if FMA is usable.  */
-          if (CPU_FEATURES_CPU_P (cpu_features, FMA))
-            cpu_features->feature[index_arch_FMA_Usable]
-              |= bit_arch_FMA_Usable;
         }
     }
 }
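
For readers following the thread, the dependency agreed on in this patch (AVX2 and FMA are only marked usable once AVX itself is usable) can be sketched as a small standalone C function. This is an illustrative sketch only, not the glibc code: struct cpuid_bits, struct usable_bits, compute_usable and the XCR0_* macros are hypothetical stand-ins for the CPU_FEATURES_CPU_P checks, the *_Usable bits and the xcrlow test in sysdeps/x86/cpu-features.c. The sketch assumes the caller has already read the CPUID feature bits and the low word of XCR0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the raw CPUID feature bits; these are NOT
   the glibc macros or data structures.  */
struct cpuid_bits
{
  bool osxsave, avx, avx2, fma;
};

/* Hypothetical stand-in for the "usable" bits glibc records.  */
struct usable_bits
{
  bool avx, avx2, fma;
};

/* XCR0 bit 1 is SSE (XMM) state, bit 2 is AVX (YMM) state.  */
#define XCR0_SSE_STATE (1u << 1)
#define XCR0_YMM_STATE (1u << 2)

/* Mark AVX2 and FMA usable only inside the AVX check, so a CPU that
   reports FMA or AVX2 with the AVX bit clear gets none of the three.  */
static struct usable_bits
compute_usable (struct cpuid_bits cpu, uint32_t xcr0_low)
{
  struct usable_bits u = { false, false, false };
  const uint32_t need = XCR0_SSE_STATE | XCR0_YMM_STATE;

  /* Without OSXSAVE the OS has not enabled XGETBV, so XCR0 cannot be
     trusted and nothing in the AVX family is usable.  */
  if (!cpu.osxsave)
    return u;

  /* The OS must have enabled both XMM and YMM state.  */
  if ((xcr0_low & need) != need)
    return u;

  if (cpu.avx)
    {
      u.avx = true;
      /* The following features depend on AVX being usable.  */
      if (cpu.avx2)
        u.avx2 = true;
      if (cpu.fma)
        u.fma = true;
    }

  /* Invariant the nesting enforces: AVX2 or FMA usable implies AVX usable.  */
  assert (u.avx || (!u.avx2 && !u.fma));
  return u;
}
```

With the nesting in place, a hypothetical CPU reporting avx = false, fma = true yields fma == false from compute_usable, which is the case the original report was worried about. Checking the three bits independently would have marked FMA usable there.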