Message ID | 20201004130938.64575-1-toiwoton@gmail.com |
---|---|
Headers |
From: Topi Miettinen <toiwoton@gmail.com> To: libc-alpha@sourceware.org Cc: Topi Miettinen <toiwoton@gmail.com> Subject: [RFC PATCH 0/3] Improved ASLR Date: Sun, 4 Oct 2020 16:09:35 +0300 Message-Id: <20201004130938.64575-1-toiwoton@gmail.com> |
Series | Improved ASLR | |
Message
Topi Miettinen
Oct. 4, 2020, 1:09 p.m. UTC
The problem with using sbrk() for allocations is that the location of the memory is relatively predictable, since it is always located next to the data segment. This series makes malloc() and the TCB use mmap() instead.

Topi Miettinen (3):
  csu: randomize location of TCB
  malloc: always use mmap() to improve ASLR
  dl-sysdep: disable remaining calls to sbrk()

 csu/libc-tls.c                          | 20 ++++++++++++++------
 elf/dl-sysdep.c                         |  2 ++
 malloc/arena.c                          |  5 ++++-
 malloc/malloc.c                         | 16 +++++++++++++---
 malloc/morecore.c                       |  2 ++
 sysdeps/unix/sysv/linux/dl-sysdep.c     |  2 ++
 sysdeps/unix/sysv/linux/mmap64.c        | 19 +++++++++++++++++++
 sysdeps/unix/sysv/linux/mmap_internal.h |  3 +++
 8 files changed, 59 insertions(+), 10 deletions(-)
Comments
On 4.10.2020 16.09, Topi Miettinen wrote:
> The problem with using sbrk() for allocations is that the location of the
> memory is relatively predictable, since it is always located next to the
> data segment. This series makes malloc() and the TCB use mmap() instead.

No comments at all? I see several implementation options here:

1. Always use mmap() instead of sbrk(), delete any uses of sbrk()

I have a hard time thinking why sbrk() would ever be the preferred choice over mmap(), especially considering security. There may be some bytes wasted, so embedded systems could want to save those, and MMU-less systems can't map pages anywhere, but those probably won't use glibc anyway.

2. Conditionally use mmap() instead of sbrk()

Something like `#define USE_SBRK`, enabled by `configure` or a header file. Sub-options:

2.1. Default to sbrk(), use mmap() only for Linux

This is of course safer if some obscure system needs sbrk(). It would be even safer to limit mmap() to Linux/x86_64 (which is all I care about).

2.2. Default to mmap(), but don't enable sbrk() anywhere

This is pretty much like #1, but if breakage is noticed on some obscure system, it's easy to `#define USE_SBRK` somewhere.

I've been using a patched glibc for a month without seeing problems. I enabled audit logging for the brk() system call and installed a global seccomp filter (in the initrd) which returns ENOSYS, to catch any uses. So far I've only noticed that cpp (used by X11 startup in addition to compiling) calls sbrk() to check memory usage. Perhaps it should use the official malloc statistics interface instead, since malloc() may use mmap() for other reasons, in which case sbrk() won't return true data.

It's easy to check that sbrk() has not been used with the command `grep '\[heap\]' /proc/*/maps` (as root), which should print nothing, since there are no heaps (as in "extended data segment") anymore.
-Topi

> Topi Miettinen (3):
>   csu: randomize location of TCB
>   malloc: always use mmap() to improve ASLR
>   dl-sysdep: disable remaining calls to sbrk()
>
>  csu/libc-tls.c                          | 20 ++++++++++++++------
>  elf/dl-sysdep.c                         |  2 ++
>  malloc/arena.c                          |  5 ++++-
>  malloc/malloc.c                         | 16 +++++++++++++---
>  malloc/morecore.c                       |  2 ++
>  sysdeps/unix/sysv/linux/dl-sysdep.c     |  2 ++
>  sysdeps/unix/sysv/linux/mmap64.c        | 19 +++++++++++++++++++
>  sysdeps/unix/sysv/linux/mmap_internal.h |  3 +++
>  8 files changed, 59 insertions(+), 10 deletions(-)
The 11/23/2020 18:06, Topi Miettinen via Libc-alpha wrote:
> No comments at all? I see several implementation options here:
>
> 1. Always use mmap() instead of sbrk(), delete any uses of sbrk()
>
> I have a hard time thinking why sbrk() would ever be the preferred choice
> over mmap(), especially considering security. There may be some bytes
> wasted, so

i'm not against using mmap instead of brk in malloc, but mmap has more overhead, so such a change should be measured.

> 2. Conditionally use mmap() instead of sbrk()
>
> Something like `#define USE_SBRK`, enabled by `configure` or a header file.

i think a configure-time option is not a good idea, but e.g. it can be a runtime tunable.

> I've been using a patched glibc for a month without seeing problems. I
> enabled audit logging for the brk() system call and installed a global
> seccomp filter (in initrd) which returns ENOSYS to catch any uses. So far
> I've only noticed that cpp (used by X11 startup in addition to compiling)
> calls sbrk() to check memory usage. Perhaps it should use the official malloc
> statistics interface instead, since malloc() may use mmap() for other
> reasons and then sbrk() won't return true data.

sbrk should continue to work even if glibc itself does not use it internally; that's public api/abi.
* Topi Miettinen via Libc-alpha:

> The problem with using sbrk() for allocations is that the location of the
> memory is relatively predictable, since it is always located next to the
> data segment. This series makes malloc() and the TCB use mmap() instead.

Doesn't switching to mmap trade one (relatively) fixed offset for another? I think anonymous mmap is not randomized independently from the file mappings used for loading DSOs.

And the series only changes the TCB allocation for the main thread. Fixing thread TCB/stack collocation is a completely different matter (and probably a more significant issue than the lack of ASLR).

Thanks,
Florian
On 23.11.2020 18.41, Szabolcs Nagy wrote:
> The 11/23/2020 18:06, Topi Miettinen via Libc-alpha wrote:
>> No comments at all? I see several implementation options here:
>>
>> 1. Always use mmap() instead of sbrk(), delete any uses of sbrk()
>>
>> I have a hard time thinking why sbrk() would ever be the preferred choice
>> over mmap(), especially considering security. There may be some bytes
>> wasted, so
>
> i'm not against using mmap instead of brk in malloc
> but it has more overhead so such change
> should be measured.

This test shows a 48% increase when using mmap() vs. sbrk():

$ cat malloc-vs-sbrk.c
#include <sys/mman.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define ROUNDS 1000000
#define SIZES 4
#define SIZE_FACTOR 4

int main(int argc, char **argv) {
	if (argc == 2) {
		for (int i = 0; i < ROUNDS; i++) {
			for (int j = 0; j < SIZES; j++) {
				size_t s = 4096 * (1 << (j * SIZE_FACTOR));
				void *ptr = mmap(NULL, s, PROT_READ | PROT_WRITE,
						 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
				if (ptr == MAP_FAILED) {
					fprintf(stderr, "mmap() failed, size %zu iter %d\n", s, i);
					return 1;
				}
				munmap(ptr, s);
			}
		}
	} else {
		for (int i = 0; i < ROUNDS; i++) {
			for (int j = 0; j < SIZES; j++) {
				size_t s = 4096 * (1 << (j * SIZE_FACTOR));
				void *ptr = sbrk(s);
				if (ptr == (void *) -1) {
					fprintf(stderr, "sbrk() failed, size %zu iter %d\n", s, i);
					return 1;
				}
				sbrk(-s);
			}
		}
	}
	return 0;
}

$ time ./malloc-vs-sbrk

real	0m1.923s
user	0m0.160s
sys	0m1.762s
$ time ./malloc-vs-sbrk 1

real	0m2.847s
user	0m0.176s
sys	0m2.669s

>> 2. Conditionally use mmap() instead of sbrk()
>>
>> Something like `#define USE_SBRK`, enabled by `configure` or a header file.
>
> i think configure time option is not a good idea,
> but e.g. it can be a runtime tunable.

The runtime option needs to be available very early in the dynamic loader, before errno and malloc() are available. Would getenv() work?

>> I've been using a patched glibc for a month without seeing problems. I
>> enabled audit logging for the brk() system call and installed a global
>> seccomp filter (in initrd) which returns ENOSYS to catch any uses. So far
>> I've only noticed that cpp (used by X11 startup in addition to compiling)
>> calls sbrk() to check memory usage. Perhaps it should use the official malloc
>> statistics interface instead, since malloc() may use mmap() for other
>> reasons and then sbrk() won't return true data.
>
> sbrk should continue to work even if glibc itself
> does not use it internally, that's public api/abi.

Yes, the patches don't remove the API/ABI.

-Topi
On 23.11.2020 18.44, Florian Weimer wrote:
> * Topi Miettinen via Libc-alpha:
>
>> The problem with using sbrk() for allocations is that the location of the
>> memory is relatively predictable, since it is always located next to the
>> data segment. This series makes malloc() and the TCB use mmap() instead.
>
> Doesn't switching to mmap trade one (relatively) fixed offset for
> another? I think anonymous mmap is not randomized independently from
> the file mappings used for loading DSOs.

The mappings are indeed rather predictable relative to each other, even with /proc/sys/kernel/randomize_va_space=2. The base address is somewhat randomized. I've sent a patch to fully randomize the mappings:

https://patchwork.kernel.org/project/linux-mm/patch/20201026160518.9212-1-toiwoton@gmail.com/

Glibc could do similar randomization by itself, by calling mmap() with an address hint generated from random numbers obtained with getrandom(), but I think hardening the kernel is the better choice.

> And the series only changes the TCB allocation for the main thread.
> Fixing thread TCB/stack collocation is a completely different matter
> (and probably the more significant issue than lack of ASLR).

I thought I covered all uses of sbrk(); perhaps I missed that one. Does the thread TCB/stack allocation use sbrk()?

-Topi
* Topi Miettinen via Libc-alpha:

> $ time ./malloc-vs-sbrk
>
> real	0m1.923s
> user	0m0.160s
> sys	0m1.762s
> $ time ./malloc-vs-sbrk 1
>
> real	0m2.847s
> user	0m0.176s
> sys	0m2.669s

Does the difference go away if you change the mmap granularity to 128 KiB? I think this happens under the covers (on the kernel side) with sbrk.

>>> 2. Conditionally use mmap() instead of sbrk()
>>>
>>> Something like `#define USE_SBRK`, enabled by `configure` or a header file.
>> i think configure time option is not a good idea,
>> but e.g. it can be a runtime tunable.
>
> The runtime option needs to be available very early in the dynamic
> loader, before errno and malloc() are available. Would getenv() work?

getenv wouldn't work, but there is a parser for the GLIBC_TUNABLES environment variable.

Thanks,
Florian
On 23.11.2020 23.45, Florian Weimer wrote:
> * Topi Miettinen via Libc-alpha:
>
>> $ time ./malloc-vs-sbrk
>>
>> real	0m1.923s
>> user	0m0.160s
>> sys	0m1.762s
>> $ time ./malloc-vs-sbrk 1
>>
>> real	0m2.847s
>> user	0m0.176s
>> sys	0m2.669s
>
> Does the difference go away if you change the mmap granularity to
> 128 KiB? I think this happens under the covers (on the kernel side)
> with sbrk.

Does not seem so, still a 56% increase:

$ time ./malloc-vs-sbrk

real	0m2.911s
user	0m0.232s
sys	0m2.677s
$ time ./malloc-vs-sbrk 1

real	0m4.545s
user	0m0.196s
sys	0m4.347s

$ cat malloc-vs-sbrk.c
#include <sys/mman.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define ROUNDS 1000000
#define SIZES 4
#define SIZE_FACTOR 4
#define SIZE_BASE (128 * 1024)

int main(int argc, char **argv) {
	if (argc == 2) {
		for (int i = 0; i < ROUNDS; i++) {
			for (int j = 0; j < SIZES; j++) {
				size_t s = SIZE_BASE * (1 << (j * SIZE_FACTOR));
				void *ptr = mmap(NULL, s, PROT_READ | PROT_WRITE,
						 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
				if (ptr == MAP_FAILED) {
					fprintf(stderr, "mmap() failed, size %zu iter %d\n", s, i);
					return 1;
				}
				munmap(ptr, s);
			}
		}
	} else {
		for (int i = 0; i < ROUNDS; i++) {
			for (int j = 0; j < SIZES; j++) {
				size_t s = SIZE_BASE * (1 << (j * SIZE_FACTOR));
				void *ptr = sbrk(s);
				if (ptr == (void *) -1) {
					fprintf(stderr, "sbrk() failed, size %zu iter %d\n", s, i);
					return 1;
				}
				sbrk(-s);
			}
		}
	}
	return 0;
}

>>>> 2. Conditionally use mmap() instead of sbrk()
>>>>
>>>> Something like `#define USE_SBRK`, enabled by `configure` or a header file.
>>> i think configure time option is not a good idea,
>>> but e.g. it can be a runtime tunable.
>>
>> The runtime option needs to be available very early in the dynamic
>> loader, before errno and malloc() are available. Would getenv() work?
>
> getenv wouldn't work, but there is a parser for the GLIBC_TUNABLES
> environment variable.

OK, I'll try to use that in the next version.

-Topi
The 11/23/2020 20:16, Topi Miettinen via Libc-alpha wrote:
> On 23.11.2020 18.44, Florian Weimer wrote:
>> And the series only changes the TCB allocation for the main thread.
>> Fixing thread TCB/stack collocation is a completely different matter
>> (and probably the more significant issue than lack of ASLR).
>
> I thought I covered all uses of sbrk(), perhaps I missed that one. Does the
> thread TCB/stack allocation use sbrk()?

the point is that thread stack and tls are allocated as one block (with mmap, but that does not matter). you want at least a guard page in between, so stack corruption does not clobber tcb/tls data. but this has costs (a kernel-side vma) and means significant work in glibc (separate tls and stack size accounting).
* Topi Miettinen:

> On 23.11.2020 23.45, Florian Weimer wrote:
>> Does the difference go away if you change the mmap granularity to
>> 128 KiB? I think this happens under the covers (on the kernel side)
>> with sbrk.
>
> Does not seem so, 56% increase:

But the test does not seem very realistic, because the pages are never faulted in. Sorry, I didn't check that before.

Thanks,
Florian
On 24.11.2020 13.24, Florian Weimer wrote:
> * Topi Miettinen:
>
>> Does not seem so, 56% increase:
>
> But the test does not seem very realistic because the pages are never
> faulted in. Sorry, I didn't check that before.

Right, this changes the equation dramatically:

# time ./malloc-vs-sbrk

real	0m19.498s
user	0m1.192s
sys	0m18.302s
# time ./malloc-vs-sbrk 1

real	0m19.428s
user	0m1.276s
sys	0m18.146s

FYI, the effect of full ASLR of mmap() by the kernel also seems small:

# echo 3 >/proc/sys/kernel/randomize_va_space
# time ./malloc-vs-sbrk

real	0m19.489s
user	0m1.263s
sys	0m18.211s
# time ./malloc-vs-sbrk 1

real	0m19.532s
user	0m1.148s
sys	0m18.366s

# cat malloc-vs-sbrk.c
#include <sys/mman.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define ROUNDS 1000
#define SIZES 4
#define SIZE_FACTOR 3
#define SIZE_BASE (128 * 1024)

int main(int argc, char **argv) {
	if (argc == 2) {
		for (int i = 0; i < ROUNDS; i++) {
			for (int j = 0; j < SIZES; j++) {
				size_t s = SIZE_BASE * (1 << (j * SIZE_FACTOR));
				void *ptr = mmap(NULL, s, PROT_READ | PROT_WRITE,
						 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
				if (ptr == MAP_FAILED) {
					fprintf(stderr, "mmap() failed, size %zu iter %d\n", s, i);
					return 1;
				}
				memset(ptr, 0, s);
				munmap(ptr, s);
			}
		}
	} else {
		for (int i = 0; i < ROUNDS; i++) {
			for (int j = 0; j < SIZES; j++) {
				size_t s = SIZE_BASE * (1 << (j * SIZE_FACTOR));
				void *ptr = sbrk(s);
				if (ptr == (void *) -1) {
					fprintf(stderr, "sbrk() failed, size %zu iter %d\n", s, i);
					return 1;
				}
				memset(ptr, 0, s);
				sbrk(-s);
			}
		}
	}
	return 0;
}

-Topi
On 24/11/2020 08:59, Topi Miettinen via Libc-alpha wrote:
> On 24.11.2020 13.24, Florian Weimer wrote:
>> But the test does not seem very realistic because the pages are never
>> faulted in. Sorry, I didn't check that before.
>
> Right, this changes the equation dramatically:
>
> # time ./malloc-vs-sbrk
>
> real	0m19.498s
> user	0m1.192s
> sys	0m18.302s
> # time ./malloc-vs-sbrk 1
>
> real	0m19.428s
> user	0m1.276s
> sys	0m18.146s
>
> FYI, the effect of full ASLR of mmap() by the kernel also seems small:
>
> # echo 3 >/proc/sys/kernel/randomize_va_space
> # time ./malloc-vs-sbrk
>
> real	0m19.489s
> user	0m1.263s
> sys	0m18.211s
> # time ./malloc-vs-sbrk 1
>
> real	0m19.532s
> user	0m1.148s
> sys	0m18.366s

I saw similar results showing little performance difference on other architectures as well (aarch64, s390x, sparc64, and armhf), so I don't think performance should be an impediment to this change.
* Topi Miettinen:

> FYI, the effect of full ASLR of mmap() by the kernel also seems small:
>
> # echo 3 >/proc/sys/kernel/randomize_va_space
> # time ./malloc-vs-sbrk
>
> real	0m19.489s
> user	0m1.263s
> sys	0m18.211s
> # time ./malloc-vs-sbrk 1
>
> real	0m19.532s
> user	0m1.148s
> sys	0m18.366s

The value 3 doesn't do anything in a mainline kernel.

I think you will see rather catastrophic effects if you have a workload that triggers TLB misses. I expect that the cost of walking page tables increases dramatically with full ASLR.

Thanks,
Florian
On 30.11.2020 12.28, Florian Weimer wrote:
> * Topi Miettinen:
>
>> FYI, the effect of full ASLR of mmap() by the kernel also seems small:
>>
>> # echo 3 >/proc/sys/kernel/randomize_va_space
>
> The value 3 doesn't do anything in a mainline kernel.

No, you need this patch to randomize mmap(), mremap(..., MREMAP_MAYMOVE), stack and vdso:

https://lkml.org/lkml/2020/11/29/214

> I think you will see rather catastrophic effects if you have a workload
> that triggers TLB misses. I expect that the cost of walking page tables
> increases dramatically with full ASLR.

Each random mapping may need new page tables, up to 4 pages. Thus, for a workload with lots of small (page-size) mappings, memory consumption will indeed increase a lot with sysctl.kernel.randomize_va_space=3. But as a real-world example, the main process of Firefox has roughly 1500 lines in /proc/$PID/maps on my system. Some are contiguous with other mappings, but for simplicity, pretending that they are all independent and each requires a full set of page tables, that would mean 1500 * 4 * 4 kB = 24 MB of page tables. The RSS of the process is 340 MB, so the effect is not that dramatic by itself, but page-table pages may also compete for cache slots.

So randomize_va_space=3 clearly isn't interesting for performance reasons but for hardening. I'd also not use it on a 32-bit system, because the virtual memory space could be fragmented too much. I don't think that could happen on a 64-bit system; the physical memory should run out first.

The effect of using mmap() instead of sbrk() in libc is independent of the kernel patch: one malloc arena and a few other areas will use mmap() instead of sbrk(), so only a few mappings are affected.
With randomize_va_space=2, the mappings made with mmap() will be allocated close to the other mappings, and then a lot of page tables will be shared, so probably no additional memory is needed. With a patched kernel and randomize_va_space=3, each mapping will be placed randomly. This could take maybe tens of kB more memory in page tables.

-Topi