[v3] explicit_bzero yet again

Message ID 564DE0BE.5070607@panix.com
State Superseded
Headers

Commit Message

Zack Weinberg Nov. 19, 2015, 2:46 p.m. UTC
  I've updated the explicit_bero patches for the change to how .abilist
files work and the reorganization of x86.  I've also split the
controversial string2.h optimization into its own patch so it can be
discussed separately from the merits of the interface.

zw
  

Comments

Rich Felker Nov. 19, 2015, 3:55 p.m. UTC | #1
On Thu, Nov 19, 2015 at 09:46:22AM -0500, Zack Weinberg wrote:
> [PATCH 1/3] New library function, explicit_bzero
> 
> The new function explicit_bzero is functionally identical to bzero,
> but the compiler will not delete a call to it, even if the memory
> region it's applied to is dead after the call.  You should use this
> function, for instance, to erase cryptographic keys or similar when
> you are done with them.  Taken from OpenBSD.

Perhaps glibc explicitly does not support LTO, but with LTO this is
trivially removable by the compiler. IMO from the beginning there
should be asm constraints to make it impossible to remove the code
even if it's inlined. Then it will be safe/future-proof (to the extent
that is possible; of course all the other issues with the completeness
of clearing still apply).

Rich
  
Zack Weinberg Dec. 7, 2015, 2:05 p.m. UTC | #2
On 11/19/2015 09:46 AM, Zack Weinberg wrote:
> I've updated the explicit_bero patches for the change to how .abilist
> files work and the reorganization of x86.  I've also split the
> controversial string2.h optimization into its own patch so it can be
> discussed separately from the merits of the interface.

Ping.  The patches at
<https://sourceware.org/ml/libc-alpha/2015-11/msg00467.html> are
awaiting review.

zw
  
Zack Weinberg Dec. 7, 2015, 2:13 p.m. UTC | #3
On 11/19/2015 10:55 AM, Rich Felker wrote:
> On Thu, Nov 19, 2015 at 09:46:22AM -0500, Zack Weinberg wrote:
>> [PATCH 1/3] New library function, explicit_bzero
>>
>> The new function explicit_bzero is functionally identical to bzero,
>> but the compiler will not delete a call to it, even if the memory
>> region it's applied to is dead after the call.  You should use this
>> function, for instance, to erase cryptographic keys or similar when
>> you are done with them.  Taken from OpenBSD.
> 
> Perhaps glibc explicitly does not support LTO, but with LTO this is
> trivially removable by the compiler. IMO from the beginning there
> should be asm constraints to make it impossible to remove the code
> even if it's inlined.

Presently there *is* no such asm construct (all possibilities either
don't work sufficiently universally, someone on this list has already
objected to them, or both).  Therefore I do not think this is feasible.

However, presently no compiler will LTO-optimize across shared library
boundaries, and even if such a compiler existed I would argue that the
optimization you contemplate is invalid for functions whose semantics
are unknown to the compiler, because the library could be swapped out at
runtime.

zw
  
Szabolcs Nagy Dec. 7, 2015, 3:18 p.m. UTC | #4
On 07/12/15 14:13, Zack Weinberg wrote:
> On 11/19/2015 10:55 AM, Rich Felker wrote:
>> On Thu, Nov 19, 2015 at 09:46:22AM -0500, Zack Weinberg wrote:
>>> [PATCH 1/3] New library function, explicit_bzero
>>>
>>> The new function explicit_bzero is functionally identical to bzero,
>>> but the compiler will not delete a call to it, even if the memory
>>> region it's applied to is dead after the call.  You should use this
>>> function, for instance, to erase cryptographic keys or similar when
>>> you are done with them.  Taken from OpenBSD.
>>
>> Perhaps glibc explicitly does not support LTO, but with LTO this is
>> trivially removable by the compiler. IMO from the beginning there
>> should be asm constraints to make it impossible to remove the code
>> even if it's inlined.
>
> Presently there *is* no such asm construct (all possibilities either
> don't work sufficiently universally, someone on this list has already
> objected to them, or both).  Therefore I do not think this is feasible.
>
> However, presently no compiler will LTO-optimize across shared library
> boundaries, and even if such a compiler existed I would argue that the
> optimization you contemplate is invalid for functions whose semantics
> are unknown to the compiler, because the library could be swapped out at
> runtime.
>

obviously the concern is about static linking.
  
Zack Weinberg Dec. 7, 2015, 3:58 p.m. UTC | #5
On Mon, Dec 7, 2015 at 10:18 AM, Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
> On 07/12/15 14:13, Zack Weinberg wrote:
>> On 11/19/2015 10:55 AM, Rich Felker wrote:
>>>
>>> Perhaps glibc explicitly does not support LTO, but with LTO this is
>>> trivially removable by the compiler. IMO from the beginning there
>>> should be asm constraints to make it impossible to remove the code
>>> even if it's inlined.
>>
>> Presently there *is* no such asm construct [...] However, presently no
>> compiler will LTO-optimize across shared library boundaries [...]
>
> obviously the concern is about static linking.

That didn't occur to me at all.

glibc explicitly doesn't support being statically linked, but I prefer
to keep it working where possible -- unfortunately, as I said, there
is no existing asm construct or compiler intrinsic that fits the bill.
I'd _like_ there to be something like
__builtin_use_memory(untyped-pointer, size) but I don't have time to
implement it myself -- this project is already way, way over the time
budget I originally allocated it -- and in any case it would only
become available in bleeding-edge compilers.

zw
  
Zack Weinberg Dec. 21, 2015, 8:32 p.m. UTC | #6
On Mon, Dec 7, 2015 at 9:05 AM, Zack Weinberg <zackw@panix.com> wrote:
> On 11/19/2015 09:46 AM, Zack Weinberg wrote:
>> I've updated the explicit_bero patches for the change to how .abilist
>> files work and the reorganization of x86.  I've also split the
>> controversial string2.h optimization into its own patch so it can be
>> discussed separately from the merits of the interface.
>
> Ping.  The patches at
> <https://sourceware.org/ml/libc-alpha/2015-11/msg00467.html> are
> awaiting review.

Ping.

zw
  
Mike Frysinger Dec. 29, 2015, 5:08 p.m. UTC | #7
On 07 Dec 2015 10:58, Zack Weinberg wrote:
> glibc explicitly doesn't support being statically linked

we keep it working, and we shouldn't be landing changes that explicitly break it
-mike
  
Zack Weinberg Dec. 29, 2015, 5:22 p.m. UTC | #8
On Tue, Dec 29, 2015 at 9:08 AM, Mike Frysinger <vapier@gentoo.org> wrote:
> On 07 Dec 2015 10:58, Zack Weinberg wrote:
>> glibc explicitly doesn't support being statically linked
>
> we keep it working, and we shouldn't be landing changes that explicitly break it

Are you rejecting the patch in the absence of compiler support?

zw
  
Rich Felker Dec. 29, 2015, 5:45 p.m. UTC | #9
On Tue, Dec 29, 2015 at 09:22:49AM -0800, Zack Weinberg wrote:
> On Tue, Dec 29, 2015 at 9:08 AM, Mike Frysinger <vapier@gentoo.org> wrote:
> > On 07 Dec 2015 10:58, Zack Weinberg wrote:
> >> glibc explicitly doesn't support being statically linked
> >
> > we keep it working, and we shouldn't be landing changes that explicitly break it
> 
> Are you rejecting the patch in the absence of compiler support?

I'm not calling for a rejection of the patch, but for an end to the
claim that static linking is "not supported by glibc". It is, and it
should work. It's probably unlikely that the code will be optimized
out in the near future (though I haven't checked this), but I think it
would make sense to add an extra line or two to make it semantically
correct and safe against LTO inlining. This does not require any heavy
"compiler support", just a compiler barrier in the form of an empty
asm statement (with proper constraints) or similar.

Rich
  
Zack Weinberg Jan. 11, 2016, 3:11 p.m. UTC | #10
On 12/29/2015 12:45 PM, Rich Felker wrote:
> On Tue, Dec 29, 2015 at 09:22:49AM -0800, Zack Weinberg wrote:
>> On Tue, Dec 29, 2015 at 9:08 AM, Mike Frysinger <vapier@gentoo.org> wrote:
>>> On 07 Dec 2015 10:58, Zack Weinberg wrote:
>>>> glibc explicitly doesn't support being statically linked
>>>
>>> we keep it working, and we shouldn't be landing changes that explicitly break it
>>
>> Are you rejecting the patch in the absence of compiler support?
> 
> I'm not calling for a rejection of the patch, but for an end to the
> claim that static linking is "not supported by glibc". It is, and it
> should work. It's probably unlikely that the code will be optimized
> out in the near future (though I haven't checked this), but I think it
> would make sense to add an extra line or two to make it semantically
> correct and safe against LTO inlining. This does not require any heavy
> "compiler support", just a compiler barrier in the form of an empty
> asm statement (with proper constraints) or similar.

I assume you mean you want the definition of explicit_bzero to be
something like this:

    void explicit_bzero (void *block, size_t len)
    {
        memset (block, 0, len);
        asm volatile ("" : : : "memory");
    }

(this is a bigger barrier than strictly necessary, but, as beaten to
death elsewhere, there is no more precise construct that works in all
contexts).

What I am not clear on is whether you think it suffices to put this in
the *out-of-line* definition of explicit_bzero in libc.{so,a} (which
would mean that it only affects LTO reaching into libc.a, at least with
the current generation of compilers) or whether the headers need an
always_inline definition with the barrier (which would affect all uses).

zw
  
Zack Weinberg Jan. 11, 2016, 3:15 p.m. UTC | #11
On 12/21/2015 03:32 PM, Zack Weinberg wrote:
> On Mon, Dec 7, 2015 at 9:05 AM, Zack Weinberg <zackw@panix.com> wrote:
>> On 11/19/2015 09:46 AM, Zack Weinberg wrote:
>>> I've updated the explicit_bero patches for the change to how .abilist
>>> files work and the reorganization of x86.  I've also split the
>>> controversial string2.h optimization into its own patch so it can be
>>> discussed separately from the merits of the interface.
>>
>> Ping.  The patches at
>> <https://sourceware.org/ml/libc-alpha/2015-11/msg00467.html> are
>> awaiting review.
> 
> Ping.

Ping.

My hacking time is very limited right now.  With the freeze looming, I
would appreciate a definitive yes-or-no answer to whether this will be
accepted for the next release (possibly with some concrete list of
changes) so I know whether I need to find time in the near future.

zw
  
Carlos O'Donell Jan. 11, 2016, 3:34 p.m. UTC | #12
On 01/11/2016 10:15 AM, Zack Weinberg wrote:
> On 12/21/2015 03:32 PM, Zack Weinberg wrote:
>> On Mon, Dec 7, 2015 at 9:05 AM, Zack Weinberg <zackw@panix.com> wrote:
>>> On 11/19/2015 09:46 AM, Zack Weinberg wrote:
>>>> I've updated the explicit_bero patches for the change to how .abilist
>>>> files work and the reorganization of x86.  I've also split the
>>>> controversial string2.h optimization into its own patch so it can be
>>>> discussed separately from the merits of the interface.
>>>
>>> Ping.  The patches at
>>> <https://sourceware.org/ml/libc-alpha/2015-11/msg00467.html> are
>>> awaiting review.
>>
>> Ping.
> 
> Ping.
> 
> My hacking time is very limited right now.  With the freeze looming, I
> would appreciate a definitive yes-or-no answer to whether this will be
> accepted for the next release (possibly with some concrete list of
> changes) so I know whether I need to find time in the near future.

I'd nack this for 2.23, and review again in 2.24.

The clear points from this discussion are:

(a) glibc does support static linking.

    If you see anywhere that we are saying we don't support static
    linking, please point it out and I will correct this.

    We will support static linking fully, but with caveats that
    allow the user to accept the problems that come with having
    two namespaces in their application (the static one and the
    dynamic one).

(b) LTO problems.

    You leave an open question to Rich about what kind of asm
    barrier would be best to use today (given that no better
    barrier exists). The fact that neither you nor Rich have
    decided yet what should be used means we need more discussion.

    If there isn't already a gcc bug filed for creating this barrier
    we should file one now.

All of this means to me that this should be moved to 2.24.

Cheers,
Carlos.
  
Alexander Cherepanov Jan. 11, 2016, 3:34 p.m. UTC | #13
On 2015-12-07 17:13, Zack Weinberg wrote:
> On 11/19/2015 10:55 AM, Rich Felker wrote:
>> On Thu, Nov 19, 2015 at 09:46:22AM -0500, Zack Weinberg wrote:
>>> [PATCH 1/3] New library function, explicit_bzero
>>>
>>> The new function explicit_bzero is functionally identical to bzero,
>>> but the compiler will not delete a call to it, even if the memory
>>> region it's applied to is dead after the call.  You should use this
>>> function, for instance, to erase cryptographic keys or similar when
>>> you are done with them.  Taken from OpenBSD.
>>
>> Perhaps glibc explicitly does not support LTO, but with LTO this is
>> trivially removable by the compiler. IMO from the beginning there
>> should be asm constraints to make it impossible to remove the code
>> even if it's inlined.
>
> Presently there *is* no such asm construct (all possibilities either
> don't work sufficiently universally, someone on this list has already
> objected to them, or both).  Therefore I do not think this is feasible.
>
> However, presently no compiler will LTO-optimize across shared library
> boundaries, and even if such a compiler existed I would argue that the
> optimization you contemplate is invalid for functions whose semantics
> are unknown to the compiler, because the library could be swapped out at
> runtime.

I see that you added some uses of explicit_bzero into glibc. Hence 
global optimization of glibc itself is a concern?
  
Zack Weinberg Jan. 11, 2016, 6:05 p.m. UTC | #14
On Mon, Jan 11, 2016 at 10:34 AM, Alexander Cherepanov
<ch3root@openwall.com> wrote:
> On 2015-12-07 17:13, Zack Weinberg wrote:
>> However, presently no compiler will LTO-optimize across shared library
>> boundaries, and even if such a compiler existed I would argue that the
>> optimization you contemplate is invalid for functions whose semantics
>> are unknown to the compiler, because the library could be swapped out at
>> runtime.
>
> I see that you added some uses of explicit_bzero into glibc. Hence global
> optimization of glibc itself is a concern?

IIRC all those uses are in libcrypt rather than libc, but this is
still an excellent point which tips my personal opinion from "well,
ok, if it makes you happy" to "yes, we definitely need a barrier in
the implementation".

zw
  
Zack Weinberg Jan. 11, 2016, 6:08 p.m. UTC | #15
On Mon, Jan 11, 2016 at 10:34 AM, Carlos O'Donell <carlos@redhat.com> wrote:
> On 01/11/2016 10:15 AM, Zack Weinberg wrote:
>> My hacking time is very limited right now.  With the freeze looming, I
>> would appreciate a definitive yes-or-no answer to whether this will be
>> accepted for the next release (possibly with some concrete list of
>> changes) so I know whether I need to find time in the near future.
>
> I'd nack this for 2.23, and review again in 2.24.

In that case, I'm going to drop this for now and pick up again when I
have more time.

>     If there isn't already a gcc bug filed for creating this barrier
>     we should file one now.

I tried to start a conversation on gcc@ (and clang-dev) about that a
couple months ago but no one seemed interested.  I don't have time to
push such a conversation (in any venue) right now.

zw
  
Carlos O'Donell Jan. 11, 2016, 9:07 p.m. UTC | #16
On 01/11/2016 01:08 PM, Zack Weinberg wrote:
> On Mon, Jan 11, 2016 at 10:34 AM, Carlos O'Donell <carlos@redhat.com> wrote:
>> On 01/11/2016 10:15 AM, Zack Weinberg wrote:
>>> My hacking time is very limited right now.  With the freeze looming, I
>>> would appreciate a definitive yes-or-no answer to whether this will be
>>> accepted for the next release (possibly with some concrete list of
>>> changes) so I know whether I need to find time in the near future.
>>
>> I'd nack this for 2.23, and review again in 2.24.
> 
> In that case, I'm going to drop this for now and pick up again when I
> have more time.
> 
>>     If there isn't already a gcc bug filed for creating this barrier
>>     we should file one now.
> 
> I tried to start a conversation on gcc@ (and clang-dev) about that a
> couple months ago but no one seemed interested.  I don't have time to
> push such a conversation (in any venue) right now.

File the bug anyway? At least we can use the bug to centralize
links to ml discussions? For slow moving projects like this it
is useful to have this kind of open bug to aggregate discussions.

Cheers,
Carlos.
  
Florian Weimer Aug. 3, 2016, 12:36 p.m. UTC | #17
On 11/19/2015 03:46 PM, Zack Weinberg wrote:
> I've updated the explicit_bero patches for the change to how .abilist
> files work and the reorganization of x86.  I've also split the
> controversial string2.h optimization into its own patch so it can be
> discussed separately from the merits of the interface.

Zack, can you propose a stripped-down version of the patch with no 
unrelated test changes?

I need the explicit_bzero compiler barrier for writing some tests.

I think the API is as good as it gets.  The implementation may need some 
changes in the future (e.g., clearing all callee-saved registers), but 
that depends on how things work out on the GCC side and does not seem 
required for a first implementation.

Florian
  
Zack Weinberg Aug. 3, 2016, 2:06 p.m. UTC | #18
On Wed, Aug 3, 2016 at 8:36 AM, Florian Weimer <fweimer@redhat.com> wrote:
> On 11/19/2015 03:46 PM, Zack Weinberg wrote:
>> I've updated the explicit_bero patches for the change to how .abilist
>> files work and the reorganization of x86.  I've also split the
>> controversial string2.h optimization into its own patch so it can be
>> discussed separately from the merits of the interface.
>
> Zack, can you propose a stripped-down version of the patch with no unrelated
> test changes?

I may not have time to do this before the 16th.

zw
  

Patch

[PATCH 3/3] Use explicit_bzero where appropriate

I *believe* these are the only places where memset was being used
to clear buffers containing sensitive data.  The compiler probably
couldn't optimize *all* of them out but it seems best to change them all.

The legacy DES implementation wasn't bothering to clear its buffers,
so I added that, mostly for consistency's sake.

	* crypt/crypt-entry.c (__crypt_r): Clear key-dependent intermediate
	data before returning, using explicit_bzero.
	* crypt/md5-crypt.c (__md5_crypt_r): Likewise.
	* crypt/sha256-crypt.c (__sha256_crypt_r): Likewise.
	* crypt/sha512-crypt.c (__sha512_crypt_r): Likewise.
---
 crypt/crypt-entry.c  | 11 +++++++++++
 crypt/md5-crypt.c    |  8 ++++----
 crypt/sha256-crypt.c | 14 +++++++-------
 crypt/sha512-crypt.c | 14 +++++++-------
 4 files changed, 29 insertions(+), 18 deletions(-)

diff --git a/crypt/crypt-entry.c b/crypt/crypt-entry.c
index f7b3949..1ea127e 100644
--- a/crypt/crypt-entry.c
+++ b/crypt/crypt-entry.c
@@ -141,6 +141,17 @@  __crypt_r (const char *key, const char *salt,
    * And convert back to 6 bit ASCII
    */
   _ufc_output_conversion_r (res[0], res[1], salt, data);
+
+#ifdef _LIBC
+  /*
+   * Erase key-dependent intermediate data.  Data dependent only on
+   * the salt is not considered sensitive.
+   */
+  explicit_bzero (ktab, sizeof (ktab));
+  explicit_bzero (data->keysched, sizeof (data->keysched));
+  explicit_bzero (res, sizeof (res));
+#endif
+
   return data->crypt_3_buf;
 }
 weak_alias (__crypt_r, crypt_r)
diff --git a/crypt/md5-crypt.c b/crypt/md5-crypt.c
index dad5942..8063e87 100644
--- a/crypt/md5-crypt.c
+++ b/crypt/md5-crypt.c
@@ -288,13 +288,13 @@  __md5_crypt_r (const char *key, const char *salt, char *buffer, int buflen)
 #ifndef USE_NSS
   __md5_init_ctx (&ctx);
   __md5_finish_ctx (&ctx, alt_result);
-  memset (&ctx, '\0', sizeof (ctx));
-  memset (&alt_ctx, '\0', sizeof (alt_ctx));
+  explicit_bzero (&ctx, sizeof (ctx));
+  explicit_bzero (&alt_ctx, sizeof (alt_ctx));
 #endif
   if (copied_key != NULL)
-    memset (copied_key, '\0', key_len);
+    explicit_bzero (copied_key, key_len);
   if (copied_salt != NULL)
-    memset (copied_salt, '\0', salt_len);
+    explicit_bzero (copied_salt, salt_len);
 
   free (free_key);
   return buffer;
diff --git a/crypt/sha256-crypt.c b/crypt/sha256-crypt.c
index 96102e9..55c40dc 100644
--- a/crypt/sha256-crypt.c
+++ b/crypt/sha256-crypt.c
@@ -371,16 +371,16 @@  __sha256_crypt_r (const char *key, const char *salt, char *buffer, int buflen)
 #ifndef USE_NSS
   __sha256_init_ctx (&ctx);
   __sha256_finish_ctx (&ctx, alt_result);
-  memset (&ctx, '\0', sizeof (ctx));
-  memset (&alt_ctx, '\0', sizeof (alt_ctx));
+  explicit_bzero (&ctx, sizeof (ctx));
+  explicit_bzero (&alt_ctx, sizeof (alt_ctx));
 #endif
-  memset (temp_result, '\0', sizeof (temp_result));
-  memset (p_bytes, '\0', key_len);
-  memset (s_bytes, '\0', salt_len);
+  explicit_bzero (temp_result, sizeof (temp_result));
+  explicit_bzero (p_bytes, key_len);
+  explicit_bzero (s_bytes, salt_len);
   if (copied_key != NULL)
-    memset (copied_key, '\0', key_len);
+    explicit_bzero (copied_key, key_len);
   if (copied_salt != NULL)
-    memset (copied_salt, '\0', salt_len);
+    explicit_bzero (copied_salt, salt_len);
 
   free (free_key);
   free (free_pbytes);
diff --git a/crypt/sha512-crypt.c b/crypt/sha512-crypt.c
index 9257492..d3baafb 100644
--- a/crypt/sha512-crypt.c
+++ b/crypt/sha512-crypt.c
@@ -393,16 +393,16 @@  __sha512_crypt_r (const char *key, const char *salt, char *buffer, int buflen)
 #ifndef USE_NSS
   __sha512_init_ctx (&ctx);
   __sha512_finish_ctx (&ctx, alt_result);
-  memset (&ctx, '\0', sizeof (ctx));
-  memset (&alt_ctx, '\0', sizeof (alt_ctx));
+  explicit_bzero (&ctx, sizeof (ctx));
+  explicit_bzero (&alt_ctx, sizeof (alt_ctx));
 #endif
-  memset (temp_result, '\0', sizeof (temp_result));
-  memset (p_bytes, '\0', key_len);
-  memset (s_bytes, '\0', salt_len);
+  explicit_bzero (temp_result, sizeof (temp_result));
+  explicit_bzero (p_bytes, key_len);
+  explicit_bzero (s_bytes, salt_len);
   if (copied_key != NULL)
-    memset (copied_key, '\0', key_len);
+    explicit_bzero (copied_key, key_len);
   if (copied_salt != NULL)
-    memset (copied_salt, '\0', salt_len);
+    explicit_bzero (copied_salt, salt_len);
 
   free (free_key);
   free (free_pbytes);
-- 
2.6.2