From patchwork Sat Jan  8 00:42:37 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tom Honermann <tom@honermann.net>
X-Patchwork-Id: 49747
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>
X-Original-To: patchwork@sourceware.org
Delivered-To: patchwork@sourceware.org
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 1505A3858036
	for <patchwork@sourceware.org>; Sat,  8 Jan 2022 00:45:49 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 1505A3858036
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1641602749;
	bh=eN50TzGIXHlypUlLKF8DBPksWXP6ITbmbzqWVPjIruw=;
	h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post:
	 List-Help:List-Subscribe:From:Reply-To:From;
	b=G1X1RnRqcllhYPdnqKW0BRmTDHA3BcBhkAkPAHoGO0Lk95z/UQHJ9FiTkch370Hm/
	 wIoqya/H8kJ5lr4FtEF7z2aqfzrSRLSfQWyxwmZvSxYYs6V5H3Zv7fEgbxjEE6mVUC
	 T+Xn30QndbKev/ThE6/5bwfI2FPrBa8NIjW+VW8I=
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from smtp100.ord1c.emailsrvr.com (smtp100.ord1c.emailsrvr.com
 [108.166.43.100])
 by sourceware.org (Postfix) with ESMTPS id 00362385780D
 for <gcc-patches@gcc.gnu.org>; Sat,  8 Jan 2022 00:42:38 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 00362385780D
X-Auth-ID: tom@honermann.net
Received: by smtp5.relay.ord1c.emailsrvr.com (Authenticated sender:
 tom-AT-honermann.net) with ESMTPSA id 59524401A4
 for <gcc-patches@gcc.gnu.org>; Fri,  7 Jan 2022 19:42:38 -0500 (EST)
To: gcc-patches <gcc-patches@gcc.gnu.org>
Subject: [PATCH 1/2]: C N2653 char8_t: Language support
Message-ID: <5799be74-78c5-7b75-a5c8-5b27c33ea7fd@honermann.net>
Date: Fri, 7 Jan 2022 19:42:37 -0500
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
 Thunderbird/78.13.0
MIME-Version: 1.0
Content-Language: en-US
X-Classification-ID: cc6b0f87-2f0a-4e37-ab72-ecbb0bc9e604-1-1
X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE,
 RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS,
 TXREP autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-Patchwork-Original-From: Tom Honermann via Gcc-patches
 <gcc-patches@gcc.gnu.org>
From: Tom Honermann <tom@honermann.net>
Reply-To: Tom Honermann <tom@honermann.net>
Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org
Sender: "Gcc-patches"
 <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>

This patch implements the core language and compiler dependent library 
changes proposed in WG14 N2653 [1] for C2x. The changes include:
- Change of type for UTF-8 string literals from array of char to array
   of char8_t (unsigned char) when targeting C2x.
- A new atomic_char8_t typedef.
- A new ATOMIC_CHAR8_T_LOCK_FREE macro defined in terms of the existing
   __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macro.

Tested on Linux x86_64.

gcc/ChangeLog:

2022-01-07  Tom Honermann  <tom@honermann.net>

	* ginclude/stdatomic.h (atomic_char8_t,
	ATOMIC_CHAR8_T_LOCK_FREE): New typedef and macro.

gcc/c/ChangeLog:

2022-01-07  Tom Honermann  <tom@honermann.net>

	* c-parser.c (c_parser_string_literal): Use char8_t as the type
	of CPP_UTF8STRING when char8_t support is enabled.
	* c-typeck.c (digest_init): Allow initialization of an array
	of character type by a string literal with type array of
	char8_t.

gcc/c-family/ChangeLog:

2022-01-07  Tom Honermann  <tom@honermann.net>

	* c-lex.c (lex_string, lex_charconst): Use char8_t as the type
	of CPP_UTF8CHAR and CPP_UTF8STRING when char8_t support is
	enabled.
	* c-opts.c (c_common_post_options): Set flag_char8_t if
	targeting C2x.

Tom.

[1]: WG14 N2653
      "char8_t: A type for UTF-8 characters and strings (Revision 1)"
      http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm

commit c041cce5d262908349be3f1f2e361c824db15845
Author: Tom Honermann <tom@honermann.net>
Date:   Sat Jan 1 18:10:41 2022 -0500

    N2653 char8_t for C: Language support
    
    This patch implements the core language and compiler dependent library
    changes proposed in WG14 N2653 for C2X.  The changes include:
    - Change of type for UTF-8 string literals from array of const char to
      array of const char8_t (unsigned char).
    - A new atomic_char8_t typedef.
    - A new ATOMIC_CHAR8_T_LOCK_FREE macro defined in terms of the existing
      __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macro.

diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index 2651331e683..0b3debbb9bd 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -1352,7 +1352,14 @@ lex_string (const cpp_token *tok, tree *valp, bool objc_string, bool translate)
 	default:
 	case CPP_STRING:
 	case CPP_UTF8STRING:
-	  value = build_string (1, "");
+	  if (type == CPP_UTF8STRING && flag_char8_t)
+	    {
+	      value = build_string (TYPE_PRECISION (char8_type_node)
+				    / TYPE_PRECISION (char_type_node),
+				    "");  /* char8_t is 8 bits */
+	    }
+	  else
+	    value = build_string (1, "");
 	  break;
 	case CPP_STRING16:
 	  value = build_string (TYPE_PRECISION (char16_type_node)
@@ -1425,10 +1432,10 @@ lex_charconst (const cpp_token *token)
     type = char16_type_node;
   else if (token->type == CPP_UTF8CHAR)
     {
-      if (!c_dialect_cxx ())
-	type = unsigned_char_type_node;
-      else if (flag_char8_t)
+      if (flag_char8_t)
         type = char8_type_node;
+      else if (!c_dialect_cxx ())
+	type = unsigned_char_type_node;
       else
         type = char_type_node;
     }
diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 4c20e44f5b5..bd96e1319ad 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -1060,9 +1060,9 @@ c_common_post_options (const char **pfilename)
   if (flag_sized_deallocation == -1)
     flag_sized_deallocation = (cxx_dialect >= cxx14);
 
-  /* char8_t support is new in C++20.  */
+  /* char8_t support is implicitly enabled in C++20 and C2x.  */
   if (flag_char8_t == -1)
-    flag_char8_t = (cxx_dialect >= cxx20);
+    flag_char8_t = (cxx_dialect >= cxx20) || flag_isoc2x;
 
   if (flag_extern_tls_init)
     {
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index b09ad307acd..4239633e295 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -7439,7 +7439,14 @@ c_parser_string_literal (c_parser *parser, bool translate, bool wide_ok)
 	default:
 	case CPP_STRING:
 	case CPP_UTF8STRING:
-	  value = build_string (1, "");
+	  if (type == CPP_UTF8STRING && flag_char8_t)
+	    {
+	      value = build_string (TYPE_PRECISION (char8_type_node)
+				    / TYPE_PRECISION (char_type_node),
+				    "");  /* char8_t is 8 bits */
+	    }
+	  else
+	    value = build_string (1, "");
 	  break;
 	case CPP_STRING16:
 	  value = build_string (TYPE_PRECISION (char16_type_node)
@@ -7464,9 +7471,14 @@ c_parser_string_literal (c_parser *parser, bool translate, bool wide_ok)
     {
     default:
     case CPP_STRING:
-    case CPP_UTF8STRING:
       TREE_TYPE (value) = char_array_type_node;
       break;
+    case CPP_UTF8STRING:
+      if (flag_char8_t)
+	TREE_TYPE (value) = char8_array_type_node;
+      else
+	TREE_TYPE (value) = char_array_type_node;
+      break;
     case CPP_STRING16:
       TREE_TYPE (value) = char16_array_type_node;
       break;
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 78a6c68aaa6..b4eeea545a9 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -8028,7 +8028,7 @@ digest_init (location_t init_loc, tree type, tree init, tree origtype,
 
 	  if (char_array)
 	    {
-	      if (typ2 != char_type_node)
+	      if (typ2 != char_type_node && typ2 != char8_type_node)
 		incompat_string_cst = true;
 	    }
 	  else if (!comptypes (typ1, typ2))
diff --git a/gcc/ginclude/stdatomic.h b/gcc/ginclude/stdatomic.h
index 23c07be2a48..b36703b2dc2 100644
--- a/gcc/ginclude/stdatomic.h
+++ b/gcc/ginclude/stdatomic.h
@@ -49,6 +49,10 @@ typedef _Atomic long atomic_long;
 typedef _Atomic unsigned long atomic_ulong;
 typedef _Atomic long long atomic_llong;
 typedef _Atomic unsigned long long atomic_ullong;
+#if (defined(__CHAR8_TYPE__) \
+     && (defined(_GNU_SOURCE) || defined(_ISOC2X_SOURCE)))
+typedef _Atomic __CHAR8_TYPE__ atomic_char8_t;
+#endif
 typedef _Atomic __CHAR16_TYPE__ atomic_char16_t;
 typedef _Atomic __CHAR32_TYPE__ atomic_char32_t;
 typedef _Atomic __WCHAR_TYPE__ atomic_wchar_t;
@@ -97,6 +101,10 @@ extern void atomic_signal_fence (memory_order);
 
 #define ATOMIC_BOOL_LOCK_FREE		__GCC_ATOMIC_BOOL_LOCK_FREE
 #define ATOMIC_CHAR_LOCK_FREE		__GCC_ATOMIC_CHAR_LOCK_FREE
+#if (defined(__GCC_ATOMIC_CHAR8_T_LOCK_FREE) \
+     && (defined(_GNU_SOURCE) || defined(_ISOC2X_SOURCE)))
+#define ATOMIC_CHAR8_T_LOCK_FREE	__GCC_ATOMIC_CHAR8_T_LOCK_FREE
+#endif
 #define ATOMIC_CHAR16_T_LOCK_FREE	__GCC_ATOMIC_CHAR16_T_LOCK_FREE
 #define ATOMIC_CHAR32_T_LOCK_FREE	__GCC_ATOMIC_CHAR32_T_LOCK_FREE
 #define ATOMIC_WCHAR_T_LOCK_FREE	__GCC_ATOMIC_WCHAR_T_LOCK_FREE

From patchwork Sat Jan  8 00:42:42 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tom Honermann <tom@honermann.net>
X-Patchwork-Id: 49748
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>
X-Original-To: patchwork@sourceware.org
Delivered-To: patchwork@sourceware.org
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id EB6773857C78
	for <patchwork@sourceware.org>; Sat,  8 Jan 2022 00:46:45 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org EB6773857C78
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1641602806;
	bh=kUG0BXJvF9t8Kp3slcCMCV5YPts1QkGTkd8q7czp/7s=;
	h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post:
	 List-Help:List-Subscribe:From:Reply-To:From;
	b=XnDGY4lWPqc+rgXuEW8MVAgWpPc29Ga89RqmvuSaXFJqFHSU2wFZXrIeUCTi8tS0w
	 3aJvv89eg4KVz4QDvPmnTgmyHmMx4sydbxTNgy2d5L3FhIERo0jTJO0Y+HEjOMimiL
	 UwJlCh2Tv31DiFiIuuJzVLtjdTLw924mHsszkH3Q=
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from smtp122.ord1c.emailsrvr.com (smtp122.ord1c.emailsrvr.com
 [108.166.43.122])
 by sourceware.org (Postfix) with ESMTPS id 455A93857811
 for <gcc-patches@gcc.gnu.org>; Sat,  8 Jan 2022 00:42:43 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 455A93857811
X-Auth-ID: tom@honermann.net
Received: by smtp16.relay.ord1c.emailsrvr.com (Authenticated sender:
 tom-AT-honermann.net) with ESMTPSA id AB184C0145
 for <gcc-patches@gcc.gnu.org>; Fri,  7 Jan 2022 19:42:42 -0500 (EST)
To: gcc-patches <gcc-patches@gcc.gnu.org>
Subject: =?utf-8?b?W1BBVENIIDIvMl06IEMgTjI2NTMgY2hhcjhfdDogTmV3IHRlc3Rz4oCL?=
Message-ID: <58306add-5803-5708-cb48-20d6a8a6d1d6@honermann.net>
Date: Fri, 7 Jan 2022 19:42:42 -0500
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
 Thunderbird/78.13.0
MIME-Version: 1.0
Content-Language: en-US
X-Classification-ID: 6be60d84-476a-44a3-b744-5305295d9803-1-1
X-Spam-Status: No, score=-12.2 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE,
 SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-Patchwork-Original-From: Tom Honermann via Gcc-patches
 <gcc-patches@gcc.gnu.org>
From: Tom Honermann <tom@honermann.net>
Reply-To: Tom Honermann <tom@honermann.net>
Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org
Sender: "Gcc-patches"
 <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>

This patch provides new tests for the core language and compiler 
dependent library changes proposed in WG14 N2653 [1] for C2x.

Tested on Linux x86_64.

gcc/testsuite/ChangeLog:

2021-05-31  Tom Honermann  <tom@honermann.net>

	* gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c: New test.
	* gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c: New test.
	* gcc.dg/c2x-predefined-macros.c: New test.
	* gcc.dg/c2x-utf8str-type.c: New test.
	* gcc.dg/c2x-utf8str.c: New test.
	* gcc.dg/gnu2x-predefined-macros.c: New test.
	* gcc.dg/gnu2x-utf8str-type.c: New test.
	* gcc.dg/gnu2x-utf8str.c: New test.

Tom.

[1]: WG14 N2653
      "char8_t: A type for UTF-8 characters and strings (Revision 1)"
      http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm

commit f4eee2bf403b62714d1ccb4542b8c85dc552a411
Author: Tom Honermann <tom@honermann.net>
Date:   Sun Jan 2 00:26:17 2022 -0500

    N2653 char8_t for C: New tests
    
    This change provides new tests for the core language and compiler
    dependent library changes proposed in WG14 N2653 for C.

diff --git a/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
new file mode 100644
index 00000000000..37ea4c8926c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
@@ -0,0 +1,42 @@
+/* Test atomic_is_lock_free for char8_t.  */
+/* { dg-do run } */
+/* { dg-options "-std=c2x -D_ISOC2X_SOURCE -pedantic-errors" } */
+
+#include <stdatomic.h>
+#include <stdint.h>
+
+extern void abort (void);
+
+_Atomic __CHAR8_TYPE__ ac8a;
+atomic_char8_t ac8t;
+
+#define CHECK_TYPE(MACRO, V1, V2)		\
+  do						\
+    {						\
+      int r1 = MACRO;				\
+      int r2 = atomic_is_lock_free (&V1);	\
+      int r3 = atomic_is_lock_free (&V2);	\
+      if (r1 != 0 && r1 != 1 && r1 != 2)	\
+	abort ();				\
+      if (r2 != 0 && r2 != 1)			\
+	abort ();				\
+      if (r3 != 0 && r3 != 1)			\
+	abort ();				\
+      if (r1 == 2 && r2 != 1)			\
+	abort ();				\
+      if (r1 == 2 && r3 != 1)			\
+	abort ();				\
+      if (r1 == 0 && r2 != 0)			\
+	abort ();				\
+      if (r1 == 0 && r3 != 0)			\
+	abort ();				\
+    }						\
+  while (0)
+
+int
+main ()
+{
+  CHECK_TYPE (ATOMIC_CHAR8_T_LOCK_FREE, ac8a, ac8t);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
new file mode 100644
index 00000000000..a017b134817
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
@@ -0,0 +1,5 @@
+/* Test atomic_is_lock_free for char8_t with -std=gnu2x.  */
+/* { dg-do run } */
+/* { dg-options "-std=gnu2x -D_GNU_SOURCE -pedantic-errors" } */
+
+#include "c2x-stdatomic-lockfree-char8_t.c"
diff --git a/gcc/testsuite/gcc.dg/c2x-predefined-macros.c b/gcc/testsuite/gcc.dg/c2x-predefined-macros.c
new file mode 100644
index 00000000000..c88e51b54c5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c2x-predefined-macros.c
@@ -0,0 +1,11 @@
+/* Test C2x predefined macros.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c2x" } */
+
+#if !defined(__CHAR8_TYPE__)
+# error __CHAR8_TYPE__ is not defined!
+#endif
+
+#if !defined(__GCC_ATOMIC_CHAR8_T_LOCK_FREE)
+# error __GCC_ATOMIC_CHAR8_T_LOCK_FREE is not defined!
+#endif
diff --git a/gcc/testsuite/gcc.dg/c2x-utf8str-type.c b/gcc/testsuite/gcc.dg/c2x-utf8str-type.c
new file mode 100644
index 00000000000..76559c0b19b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c2x-utf8str-type.c
@@ -0,0 +1,6 @@
+/* Test C2x UTF-8 string literal type.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c2x" } */
+
+_Static_assert (_Generic (u8"text", char*: 1, unsigned char*: 2) == 2, "UTF-8 string literals have an unexpected type");
+_Static_assert (_Generic (u8"x"[0], char:  1, unsigned char:  2) == 2, "UTF-8 string literal elements have an unexpected type");
diff --git a/gcc/testsuite/gcc.dg/c2x-utf8str.c b/gcc/testsuite/gcc.dg/c2x-utf8str.c
new file mode 100644
index 00000000000..712482c6569
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c2x-utf8str.c
@@ -0,0 +1,34 @@
+/* Test initialization by UTF-8 string literal in C2x.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target wchar } */
+/* { dg-options "-std=c2x" } */
+
+typedef __CHAR8_TYPE__	char8_t;
+typedef __CHAR16_TYPE__	char16_t;
+typedef __CHAR32_TYPE__ char32_t;
+typedef __WCHAR_TYPE__	wchar_t;
+
+/* Test that char, signed char, unsigned char, and char8_t arrays can be
+   initialized by a UTF-8 string literal.  */
+const char cbuf1[] = u8"text";
+const char cbuf2[] = { u8"text" };
+const signed char scbuf1[] = u8"text";
+const signed char scbuf2[] = { u8"text" };
+const unsigned char ucbuf1[] = u8"text";
+const unsigned char ucbuf2[] = { u8"text" };
+const char8_t c8buf1[] = u8"text";
+const char8_t c8buf2[] = { u8"text" };
+
+/* Test that a diagnostic is issued for attempted initialization of
+   other character types by a UTF-8 string literal.  */
+const char16_t c16buf1[] = u8"text";		/* { dg-error "from a string literal with type array of .unsigned char." } */
+const char16_t c16buf2[] = { u8"text" };	/* { dg-error "from a string literal with type array of .unsigned char." } */
+const char32_t c32buf1[] = u8"text";		/* { dg-error "from a string literal with type array of .unsigned char." } */
+const char32_t c32buf2[] = { u8"text" };	/* { dg-error "from a string literal with type array of .unsigned char." } */
+const wchar_t wbuf1[] = u8"text";		/* { dg-error "from a string literal with type array of .unsigned char." } */
+const wchar_t wbuf2[] = { u8"text" };		/* { dg-error "from a string literal with type array of .unsigned char." } */
+
+/* Test that char8_t arrays can be initialized by an ordinary string
+   literal.  */
+const char8_t c8buf3[] = "text";
+const char8_t c8buf4[] = { "text" };
diff --git a/gcc/testsuite/gcc.dg/gnu2x-predefined-macros.c b/gcc/testsuite/gcc.dg/gnu2x-predefined-macros.c
new file mode 100644
index 00000000000..1c52ac9193c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/gnu2x-predefined-macros.c
@@ -0,0 +1,5 @@
+/* Test C2x predefined macros with -std=gnu2x.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c2x" } */
+
+#include "c2x-predefined-macros.c"
diff --git a/gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c b/gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c
new file mode 100644
index 00000000000..53b4f411658
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c
@@ -0,0 +1,5 @@
+/* Test C2x UTF-8 string literal type with -std=gnu2x.  */
+/* { dg-do compile } */
+/* { dg-options "-std=gnu2x" } */
+
+#include "c2x-utf8str-type.c"
diff --git a/gcc/testsuite/gcc.dg/gnu2x-utf8str.c b/gcc/testsuite/gcc.dg/gnu2x-utf8str.c
new file mode 100644
index 00000000000..a679d988fe4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/gnu2x-utf8str.c
@@ -0,0 +1,39 @@
+/* Test initialization by UTF-8 string literal in C2x with -std=gnu2x.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target wchar } */
+/* { dg-options "-std=gnu2x" } */
+
+/* Test initialization by UTF-8 string literal in C2x.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target wchar } */
+/* { dg-options "-std=c2x" } */
+
+typedef __CHAR8_TYPE__	char8_t;
+typedef __CHAR16_TYPE__	char16_t;
+typedef __CHAR32_TYPE__ char32_t;
+typedef __WCHAR_TYPE__	wchar_t;
+
+/* Test that char, signed char, unsigned char, and char8_t arrays can be
+   initialized by a UTF-8 string literal.  */
+const char cbuf1[] = u8"text";
+const char cbuf2[] = { u8"text" };
+const signed char scbuf1[] = u8"text";
+const signed char scbuf2[] = { u8"text" };
+const unsigned char ucbuf1[] = u8"text";
+const unsigned char ucbuf2[] = { u8"text" };
+const char8_t c8buf1[] = u8"text";
+const char8_t c8buf2[] = { u8"text" };
+
+/* Test that a diagnostic is issued for attempted initialization of
+   other character types by a UTF-8 string literal.  */
+const char16_t c16buf1[] = u8"text";		/* { dg-error "from a string literal with type array of .unsigned char." } */
+const char16_t c16buf2[] = { u8"text" };	/* { dg-error "from a string literal with type array of .unsigned char." } */
+const char32_t c32buf1[] = u8"text";		/* { dg-error "from a string literal with type array of .unsigned char." } */
+const char32_t c32buf2[] = { u8"text" };	/* { dg-error "from a string literal with type array of .unsigned char." } */
+const wchar_t wbuf1[] = u8"text";		/* { dg-error "from a string literal with type array of .unsigned char." } */
+const wchar_t wbuf2[] = { u8"text" };		/* { dg-error "from a string literal with type array of .unsigned char." } */
+
+/* Test that char8_t arrays can be initialized by an ordinary string
+   literal.  */
+const char8_t c8buf3[] = "text";
+const char8_t c8buf4[] = { "text" };