Message ID | 20210225194702.6113-1-pvorel@suse.cz |
---|---|
State | Rejected |
Headers |
From: Petr Vorel <pvorel@suse.cz>
To: libc-alpha@sourceware.org
Cc: Florian Weimer <fweimer@redhat.com>, Fabian Vogt <fvogt@suse.com>, Andreas Schwab <schwab@suse.de>, Kir Kolyshkin <kolyshkin@gmail.com>, Aleksa Sarai <cyphar@cyphar.com>, Ladislav Slezak <lslezak@suse.com>
Subject: [RFC PATCH] Linux: Workaround seccomp() issue with faccessat2()
Date: Thu, 25 Feb 2021 20:47:02 +0100
Message-Id: <20210225194702.6113-1-pvorel@suse.cz>
X-Mailer: git-send-email 2.30.1 |
Series |
[RFC] Linux: Workaround seccomp() issue with faccessat2()
|
|
Commit Message
Petr Vorel
Feb. 25, 2021, 7:47 p.m. UTC
Commit 3d3ab573a5 ("Linux: Use faccessat2 to implement faccessat (bug 18683)")
started to use faccessat2(), which breaks docker/podman/... containers
whose guest runs glibc 2.33 on a host with an older kernel and a container
runtime built with an older libseccomp.
See also: https://bugzilla.opensuse.org/show_bug.cgi?id=1182451#c17
Signed-off-by: Petr Vorel <pvorel@suse.cz>
---
Hi,
I admit that this is a very ugly workaround, and I wouldn't be surprised if
you just don't care about seccomp() incompatibilities. But it'd be nice
to have a unified approach to this incompatibility, as it hits any distro
with glibc 2.33 (currently openSUSE Tumbleweed, Arch Linux, Fedora
rawhide). After some time (when old LTS distros reach EOL) this crap could be
removed.
More info:
https://github.com/opencontainers/runc/pull/2750
https://github.com/seccomp/libseccomp/issues/314
Kind regards,
Petr
sysdeps/unix/sysv/linux/faccessat.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
Comments
Hi,

On Thu, Feb 25, 2021 at 08:47:02PM +0100, Petr Vorel wrote:
> 3d3ab573a5 ("Linux: Use faccessat2 to implement faccessat (bug 18683)")
> started to use faccessat2() which breaks docker/podman/... containers
> with guest running glibc 2.33 running on host with older kernel and are
> built with older libseccomp.
>
> See also: https://bugzilla.opensuse.org/show_bug.cgi?id=1182451#c17
>
> Signed-off-by: Petr Vorel <pvorel@suse.cz>
> ---
> Hi,
>
> I admit that this is a very ugly workaround and wouldn't be surprised if
> you just don't care about seccomp() incompatibilities. But it'd be nice
> to have unified approach for this incompatibility, as it hits any distro
> with glibc 2.33 (currently openSUSE Tumbleweed, Arch Linux, Fedora
> rawhide). And after some time (when old LTS distros EOL) this crap could be removed.
>
> More info:
> https://github.com/opencontainers/runc/pull/2750
> https://github.com/seccomp/libseccomp/issues/314
>
> Kind regards,
> Petr

Petr, you must have missed the whole discussion on this subject [1][2],
the consensus was that problematic container runtimes need to be fixed
to make their seccomp filters return ENOSYS for unknown syscalls.

[1] https://sourceware.org/pipermail/libc-alpha/2020-November/119955.html
[2] https://lore.kernel.org/linux-api/87lfer2c0b.fsf@oldenburg2.str.redhat.com/T/#u

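A minimal sketch of why the errno value matters, assuming the fallback shape visible in the patch under discussion (retry with the legacy call only on ENOSYS). The names `new_call_fn`, `stub_enosys`, `stub_eperm`, and `access_with_fallback` are hypothetical; the stubs stand in for faccessat2() so that no seccomp filter has to be installed, and flag emulation is left out. This is not the glibc implementation.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical hook standing in for the faccessat2 system call.  */
typedef int (*new_call_fn) (int fd, const char *path, int mode, int flag);

/* Simulates a kernel, or a fixed filter, that reports the call as unknown.  */
static int
stub_enosys (int fd, const char *path, int mode, int flag)
{
  (void) fd; (void) path; (void) mode; (void) flag;
  errno = ENOSYS;
  return -1;
}

/* Simulates a filter that rejects unknown calls with EPERM.  */
static int
stub_eperm (int fd, const char *path, int mode, int flag)
{
  (void) fd; (void) path; (void) mode; (void) flag;
  errno = EPERM;
  return -1;
}

/* Try the new call first; use the legacy three-argument faccessat
   system call only when the new one reports ENOSYS.  */
static int
access_with_fallback (new_call_fn try_new, int fd, const char *path, int mode)
{
  assert (path != NULL);
  int ret = try_new (fd, path, mode, 0);
  if (ret == 0 || errno != ENOSYS)
    return ret;   /* Success, or an error taken at face value.  */
  return (int) syscall (SYS_faccessat, fd, path, mode);
}
```

With `stub_enosys` the caller reaches the legacy system call and the check can still succeed; with `stub_eperm` the EPERM is indistinguishable from a genuine permission error, so it is returned to the caller unchanged.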
Hi Dmitry,

> Petr, you must have missed the whole discussion on this subject [1][2],
> the consensus was that problematic container runtimes need to be fixed
> to make their seccomp filters return ENOSYS for unknown syscalls.

Thanks for the info and sorry for the spam then.

Kind regards,
Petr

> [1] https://sourceware.org/pipermail/libc-alpha/2020-November/119955.html
> [2] https://lore.kernel.org/linux-api/87lfer2c0b.fsf@oldenburg2.str.redhat.com/T/#u

On 26 Feb 2021 05:11, Petr Vorel wrote:
> Hi Dmitry,
>
> > Petr, you must have missed the whole discussion on this subject [1][2],
> > the consensus was that problematic container runtimes need to be fixed
> > to make their seccomp filters return ENOSYS for unknown syscalls.
>
> Thanks for info and sorry for spam then.

no need to apologize. can't expect everyone to read everything all the time.
-mike

On 2021-02-26, Dmitry V. Levin <ldv@altlinux.org> wrote:
> On Thu, Feb 25, 2021 at 08:47:02PM +0100, Petr Vorel wrote:
> > 3d3ab573a5 ("Linux: Use faccessat2 to implement faccessat (bug 18683)")
> > started to use faccessat2() which breaks docker/podman/... containers
> > with guest running glibc 2.33 running on host with older kernel and are
> > built with older libseccomp.
> >
> > See also: https://bugzilla.opensuse.org/show_bug.cgi?id=1182451#c17
> >
> > Signed-off-by: Petr Vorel <pvorel@suse.cz>
> > ---
> > Hi,
> >
> > I admit that this is a very ugly workaround and wouldn't be surprised if
> > you just don't care about seccomp() incompatibilities. But it'd be nice
> > to have unified approach for this incompatibility, as it hits any distro
> > with glibc 2.33 (currently openSUSE Tumbleweed, Arch Linux, Fedora
> > rawhide). And after some time (when old LTS distros EOL) this crap could be removed.
> >
> > More info:
> > https://github.com/opencontainers/runc/pull/2750
> > https://github.com/seccomp/libseccomp/issues/314
> >
> > Kind regards,
> > Petr
>
> Petr, you must have missed the whole discussion on this subject [1][2],
> the consensus was that problematic container runtimes need to be fixed
> to make their seccomp filters return ENOSYS for unknown syscalls.

It should also be noted that we fixed this in runc a month ago[1], which
means that it's up to distributions and cloud vendors to update their
runc packages to the latest version or backport the patch.

Docker's packaging hasn't been updated to use the latest runc yet
(that'll happen in the next patch release), but distributions can ship
newer runc versions -- that's what we do in openSUSE.

[1]: https://github.com/opencontainers/runc/pull/2750

* Aleksa Sarai:

> It should also be noted that we fixed this in runc a month ago[1], which
> means that it's up to distributions and cloud vendors to update their
> runc packages to the latest version or backport the patch.
>
> Docker's packaging hasn't been updated to use the latest runc yet
> (that'll happen in the next patch release), but distributions can ship
> newer runc versions -- that's what we do in openSUSE.
>
> [1]: https://github.com/opencontainers/runc/pull/2750

There are some indications that not all container runtimes will pick up
the runc kludge (thanks for developing that by the way). So it's likely
that the general issue will be with us for a while longer. Maybe the
competitive pressure from other working container runtimes will
encourage others to re-evaluate their approach, I don't know.

We still don't plan to throw in downstream-only glibc patches to paper
over this (given that it's been rejected by kernel and glibc developers
alike, I really think it's the wrong way to go). So far management
isn't breathing down our necks.

Thanks,
Florian

Hi all,

> There are some indications that not all container runtimes will pick up
> the runc kludge (thanks for developing that by the way). So it's likely
> that the general issue will be with us for a while longer. Maybe the
> competitive pressure from other working container runtimes will
> encourage others to re-evaluate their approach, I don't know.

Hopefully.

> We still don't plan to throw in downstream-only glibc patches to paper
> over this (given that it's been rejected by kernel and glibc developers
> alike, I really think it's the wrong way to go). So far management
> isn't breathing down our necks.

As a workaround exists (for openSUSE, using podman with the newest runc
v1.0.0-rc93), I understand the reluctance to accept a workaround. It just
reminds me of the occasional musl approach of being correct no matter what
problems it brings to users.

Kind regards,
Petr

> Thanks,
> Florian

diff --git a/sysdeps/unix/sysv/linux/faccessat.c b/sysdeps/unix/sysv/linux/faccessat.c
index 13160d3249..f01c59b6e7 100644
--- a/sysdeps/unix/sysv/linux/faccessat.c
+++ b/sysdeps/unix/sysv/linux/faccessat.c
@@ -30,9 +30,22 @@ __faccessat (int fd, const char *file, int mode, int flag)
 #if __ASSUME_FACCESSAT2
   return ret;
 #else
-  if (ret == 0 || errno != ENOSYS)
+  if (ret == 0 || (errno != ENOSYS && errno != EPERM))
     return ret;
 
+  /*
+   * Check seccomp() issue with faccessat2(). Additional EPERM means seccomp()
+   * in use, ENOSYS or EBADF real EPERM.
+   */
+  if (errno == EPERM) {
+    int backup = errno;
+    INLINE_SYSCALL_CALL (faccessat2, -2, ".", 0, 0);
+    int err = errno;
+    errno = backup;
+    if (err != EPERM)
+      return ret;
+  }
+
   if (flag & ~(AT_SYMLINK_NOFOLLOW | AT_EACCESS))
     return INLINE_SYSCALL_ERROR_RETURN_VALUE (EINVAL);
 
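
The decision the rejected patch encodes can be restated as a pure function, which makes the three cases easier to see. This is a sketch only: `should_use_legacy_path` is a hypothetical name, and `probe_errno` stands for the errno left by the patch's second faccessat2 call with the invalid descriptor -2, which would be expected to fail with something other than EPERM (such as EBADF) when the kernel actually executes the call.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical restatement of the patch's control flow.
   ret, call_errno: result and errno of the real faccessat2 call.
   probe_errno: errno of the probe call; consulted only when the
   real call failed with EPERM.
   Returns true when execution would continue to the legacy path.  */
static bool
should_use_legacy_path (int ret, int call_errno, int probe_errno)
{
  assert (ret == 0 || ret == -1);
  if (ret == 0)
    return false;                 /* The new call succeeded.  */
  if (call_errno == ENOSYS)
    return true;                  /* Kernel lacks faccessat2.  */
  if (call_errno == EPERM)
    /* EPERM again on a call that cannot legitimately yield it is
       taken to mean a filter is rejecting the syscall itself.  */
    return probe_errno == EPERM;
  return false;                   /* Any other error is genuine.  */
}
```

The cost of this design is an extra system call on every EPERM result, which is part of why the thread treats it as a workaround rather than a fix.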