Objective-C: fix protocol list count type (pertinent to non-LP64)

Message ID 3FA161CB-9D8F-4576-8607-292CE21B2BF3@me.com
State New
Headers
Series Objective-C: fix protocol list count type (pertinent to non-LP64) |

Commit Message

Matt Jacobson Sept. 27, 2021, 3:45 a.m. UTC
  Fix protocol list layout for non-LP64.  clang and objc4 both give the `count` 
field as `long`, not `intptr_t`.  Those are the same on LP64, but not 
everywhere.  For non-LP64, this fixes binary compatibility with clang-built 
classes.

This was more complicated than I anticipated, because the relevant frontend 
code in fact had no AST type for `protocol_list_t`, instead emitting protocol 
lists as `protocol_t[]`, with the zeroth element actually being the integer 
count.  That made it nontrivial to change the count to `long`.  With this 
change, there is now a true `protocol_list_t` type in the AST.

Tested multiple ways.  On x86_64/Darwin, I confirmed with a test program that 
protocol conformances by classes, categories, and protocols works.  On AVR, I 
manually inspected the generated assembly to confirm that protocol lists gain 
an extra two bytes of `count`, matching clang.

Thank you for your time.

<https://github.com/mhjacobson/gcc/commit/5ebc95dc726f0745ebdf003093f1b8d7720ce32f>


gcc/objc/ChangeLog:

2021-09-26  Matt Jacobson  <mhjacobson@me.com>

	* objc-next-runtime-abi-02.c (enum objc_v2_tree_index): Add new global tree.
	(static void next_runtime_02_initialize): Initialize protocol list type tree.
	(struct class_ro_t): Fix type misspelling.
	(build_v2_class_templates): Correct type in field declaration.
	(build_v2_protocol_templates): Create actual protocol list type tree.
	(build_v2_category_template): Correct type in field declaration.
	(generate_v2_protocol_list): Emit protocol list count as `long`.
	(generate_v2_protocols): Use correct type.
	(build_v2_category_initializer): Use correct type.
	(build_v2_class_ro_t_initializer): Use correct type.
  

Comments

Matt Jacobson Oct. 20, 2021, 3:51 a.m. UTC | #1
> On Sep 26, 2021, at 11:45 PM, Matt Jacobson <mhjacobson@me.com> wrote:
> 
> Fix protocol list layout for non-LP64.  clang and objc4 both give the `count` 
> field as `long`, not `intptr_t`.  Those are the same on LP64, but not 
> everywhere.  For non-LP64, this fixes binary compatibility with clang-built 
> classes.
> 
> This was more complicated than I anticipated, because the relevant frontend 
> code in fact had no AST type for `protocol_list_t`, instead emitting protocol 
> lists as `protocol_t[]`, with the zeroth element actually being the integer 
> count.  That made it nontrivial to change the count to `long`.  With this 
> change, there is now a true `protocol_list_t` type in the AST.
> 
> Tested multiple ways.  On x86_64/Darwin, I confirmed with a test program that 
> protocol conformances by classes, categories, and protocols works.  On AVR, I 
> manually inspected the generated assembly to confirm that protocol lists gain 
> an extra two bytes of `count`, matching clang.
> 
> Thank you for your time.
> 
> <https://github.com/mhjacobson/gcc/commit/5ebc95dc726f0745ebdf003093f1b8d7720ce32f>

Friendly ping.  Please let me know if there’s anything I can clarify.

Original mail:
<https://gcc.gnu.org/pipermail/gcc-patches/2021-September/580280.html>

Thanks.
  
Iain Sandoe Oct. 23, 2021, 8:46 a.m. UTC | #2
Hi Matt,

sorry for slow response,
unavoidable external factors have been keeping me away from the computer (for both $dayjob and and volunteer stuff).

> On 20 Oct 2021, at 04:51, Matt Jacobson via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> 
> 
>> On Sep 26, 2021, at 11:45 PM, Matt Jacobson <mhjacobson@me.com> wrote:
>> 
>> Fix protocol list layout for non-LP64.  clang and objc4 both give the `count` 
>> field as `long`, not `intptr_t`.  Those are the same on LP64, but not 
>> everywhere.  For non-LP64, this fixes binary compatibility with clang-built 
>> classes.
>> 
>> This was more complicated than I anticipated, because the relevant frontend 
>> code in fact had no AST type for `protocol_list_t`, instead emitting protocol 
>> lists as `protocol_t[]`, with the zeroth element actually being the integer 
>> count.  That made it nontrivial to change the count to `long`.  With this 
>> change, there is now a true `protocol_list_t` type in the AST.
>> 
>> Tested multiple ways.  On x86_64/Darwin, I confirmed with a test program that 
>> protocol conformances by classes, categories, and protocols works.  On AVR, I 
>> manually inspected the generated assembly to confirm that protocol lists gain 
>> an extra two bytes of `count`, matching clang.
>> 
>> Thank you for your time.
>> 
>> <https://github.com/mhjacobson/gcc/commit/5ebc95dc726f0745ebdf003093f1b8d7720ce32f>
> 
> Friendly ping.  Please let me know if there’s anything I can clarify.

The patch is in my queue (it will not get lost), the rationale seems reasonable,
Iain
  
Iain Sandoe Oct. 25, 2021, 9:43 a.m. UTC | #3
Hi Matt,

> On 23 Oct 2021, at 09:46, Iain Sandoe via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> 
>> On 20 Oct 2021, at 04:51, Matt Jacobson via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>> 
>> 
>>> On Sep 26, 2021, at 11:45 PM, Matt Jacobson <mhjacobson@me.com> wrote:
>>> 
>>> Fix protocol list layout for non-LP64.  clang and objc4 both give the `count` 
>>> field as `long`, not `intptr_t`.  Those are the same on LP64, but not 
>>> everywhere.  For non-LP64, this fixes binary compatibility with clang-built 
>>> classes.
>>> 
>>> This was more complicated than I anticipated, because the relevant frontend 
>>> code in fact had no AST type for `protocol_list_t`, instead emitting protocol 
>>> lists as `protocol_t[]`, with the zeroth element actually being the integer 
>>> count.  That made it nontrivial to change the count to `long`.  With this 
>>> change, there is now a true `protocol_list_t` type in the AST.
>>> 
>>> Tested multiple ways.  On x86_64/Darwin, I confirmed with a test program that 
>>> protocol conformances by classes, categories, and protocols works.

see ** below …

>>>  On AVR, I 
>>> manually inspected the generated assembly to confirm that protocol lists gain 
>>> an extra two bytes of `count`, matching clang.

Did you test objective-c++ on Darwin?

I see a lot of fails of the form:
Excess errors:
<built-in>: error: initialization of a flexible array member [-Wpedantic]


>>> Thank you for your time.
>>> 
>>> <https://github.com/mhjacobson/gcc/commit/5ebc95dc726f0745ebdf003093f1b8d7720ce32f>
>> 
>> Friendly ping.  Please let me know if there’s anything I can clarify.
> 
> The patch is in my queue (it will not get lost), the rationale seems reasonable,

For a patch that changes code-gen we should have a test that it produces what’s
expected (in general, a ‘torture' test would be preferrable so that we can be sure the
output is as expected for different optimisation levels). 

** If you have a test program that could be the basis for this - but it needs to be
integrated into the testsuite with scans for the required output.

Have to give some thought to how to solve the obj-c++ issue.
Iain
  
Matt Jacobson Nov. 7, 2021, 10:50 p.m. UTC | #4
> On Oct 25, 2021, at 5:43 AM, Iain Sandoe <iain@sandoe.co.uk> wrote:
> 
> Did you test objective-c++ on Darwin?
> 
> I see a lot of fails of the form:
> Excess errors:
> <built-in>: error: initialization of a flexible array member [-Wpedantic]

Looked into this.  It’s happening because obj-c++.dg/dg.exp has:

    set DEFAULT_OBJCXXFLAGS " -ansi -pedantic-errors -Wno-long-long"

Specifically, the `-pedantic-errors` argument prohibits initialization of a 
flexible array member.  Notably, this flag does *not* appear in objc/dg.exp.

Admittedly I didn’t know that initialization of a FAM was prohibited by the 
standard.  It’s allowed by GCC, though, as documented here:

<https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html>

Is it OK to use a GCC extension this way in the Objective-C frontend?

> For a patch that changes code-gen we should have a test that it produces what’s
> expected (in general, a ‘torture' test would be preferrable so that we can be sure the
> output is as expected for different optimisation levels). 

The output is different only for targets where 
sizeof (long) != sizeof (void *).  Do we have the ability to run “cross” 
torture tests?  Could such a test verify the emitted assembly (like LLVM’s 
FileCheck tests do)?  Or would it need to execute something?

Thanks for your help!

Matt
  
Iain Sandoe Nov. 8, 2021, 1:41 p.m. UTC | #5
> On 7 Nov 2021, at 22:50, Matt Jacobson <mhjacobson@me.com> wrote:
> 
> 
>> On Oct 25, 2021, at 5:43 AM, Iain Sandoe <iain@sandoe.co.uk> wrote:
>> 
>> Did you test objective-c++ on Darwin?
>> 
>> I see a lot of fails of the form:
>> Excess errors:
>> <built-in>: error: initialization of a flexible array member [-Wpedantic]
> 
> Looked into this.  It’s happening because obj-c++.dg/dg.exp has:
> 
>    set DEFAULT_OBJCXXFLAGS " -ansi -pedantic-errors -Wno-long-long"
> 
> Specifically, the `-pedantic-errors` argument prohibits initialization of a 
> flexible array member.  Notably, this flag does *not* appear in objc/dg.exp.

I think -pedantic-errors might be applied at a higher level - if it is not, then perhaps that is an omission to rectify.
> 
> Admittedly I didn’t know that initialization of a FAM was prohibited by the 
> standard.  It’s allowed by GCC, though, as documented here:
> 
> <https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html>
> 
> Is it OK to use a GCC extension this way in the Objective-C frontend?

Well, I had two thoughts so far;

1/ allow the extension and suppress the warning on the relevant trees.

2/ maybe create a new type “ on the fly “ for each protocol list with a correct length [initialising this would not be an error] (since the type is only used once it can be local to the use).

— but I haven’t had time to implement either ….

>> For a patch that changes code-gen we should have a test that it produces what’s
>> expected (in general, a ‘torture' test would be preferrable so that we can be sure the
>> output is as expected for different optimisation levels). 
> 
> The output is different only for targets where 
> sizeof (long) != sizeof (void *).  
> Do we have the ability to run “cross” 
> torture tests?

For most platforms, the answer is pretty much “no” - GCC is built for a single target (unless you have a mutlilib with the necessary difference - which seems unlikely in this case).

For Darwin platforms, we can (in most cases) choose a different -mmacosx-version-min= from the current host) - but that doesn’t allow us really any more and the 32b multilibs have sizeof (long) == sizeof (ptr) so no help there.

>  Could such a test verify the emitted assembly (like LLVM’s 
> FileCheck tests do)?  

We have a similar set of possibilities to LLVM/clang (the tree check could be the right one here)

* we can verify at some appproriate IR stage [-fdump-tree-xxxxx] (check that the right trees are emitted)

* we can verify the expected assembler is emitted (scan-asm in dg-tests)

> Or would it need to execute something?

An execute test can be a good way for checking that code-gen is working properly (providing, of course, that a wrong codegen would in some way make the execute fail - the question is can one construct such a test for this)

sorry this is not an answer - just a list of possible ways forward.
Iain
  

Patch

diff --git a/gcc/objc/objc-next-runtime-abi-02.c b/gcc/objc/objc-next-runtime-abi-02.c
index c3af369ff0d..aadf1741676 100644
--- a/gcc/objc/objc-next-runtime-abi-02.c
+++ b/gcc/objc/objc-next-runtime-abi-02.c
@@ -92,6 +92,7 @@  enum objc_v2_tree_index
   OCTI_V2_CAT_TEMPL,
   OCTI_V2_CLS_RO_TEMPL,
   OCTI_V2_PROTO_TEMPL,
+  OCTI_V2_PROTO_LIST_TEMPL,
   OCTI_V2_IVAR_TEMPL,
   OCTI_V2_IVAR_LIST_TEMPL,
   OCTI_V2_MESSAGE_REF_TEMPL,
@@ -130,6 +131,8 @@  enum objc_v2_tree_index
 				objc_v2_global_trees[OCTI_V2_CAT_TEMPL]
 #define objc_v2_protocol_template \
 				objc_v2_global_trees[OCTI_V2_PROTO_TEMPL]
+#define objc_v2_protocol_list_template \
+				objc_v2_global_trees[OCTI_V2_PROTO_LIST_TEMPL]
 
 /* struct message_ref_t */
 #define objc_v2_message_ref_template \
@@ -196,7 +199,7 @@  static void build_v2_message_ref_templates (void);
 static void build_v2_class_templates (void);
 static void build_v2_super_template (void);
 static void build_v2_category_template (void);
-static void build_v2_protocol_template (void);
+static void build_v2_protocol_templates (void);
 
 static tree next_runtime_abi_02_super_superclassfield_id (void);
 
@@ -394,9 +397,9 @@  static void next_runtime_02_initialize (void)
 		build_pointer_type (xref_tag (RECORD_TYPE,
 				    get_identifier ("_prop_list_t")));
 
+  build_v2_protocol_templates ();
   build_v2_class_templates ();
   build_v2_super_template ();
-  build_v2_protocol_template ();
   build_v2_category_template ();
 
   bool fixup_p = flag_next_runtime < USE_FIXUP_BEFORE;
@@ -636,7 +639,7 @@  struct class_ro_t
     const uint8_t * const ivarLayout;
     const char *const name;
     const struct method_list_t * const baseMethods;
-    const struct objc_protocol_list *const baseProtocols;
+    const struct protocol_list_t *const baseProtocols;
     const struct ivar_list_t *const ivars;
     const uint8_t * const weakIvarLayout;
     const struct _prop_list_t * const properties;
@@ -685,11 +688,9 @@  build_v2_class_templates (void)
   /* const struct method_list_t * const baseMethods; */
   add_field_decl (objc_method_list_ptr, "baseMethods", &chain);
 
-  /* const struct objc_protocol_list *const baseProtocols; */
-  add_field_decl (build_pointer_type
-			(xref_tag (RECORD_TYPE,
-				   get_identifier (UTAG_V2_PROTOCOL_LIST))),
-				  "baseProtocols", &chain);
+  /* const struct protocol_list_t *const baseProtocols; */
+  add_field_decl (build_pointer_type (objc_v2_protocol_list_template),
+		  "baseProtocols", &chain);
 
   /* const struct ivar_list_t *const ivars; */
   add_field_decl (objc_v2_ivar_list_ptr, "ivars", &chain);
@@ -763,25 +764,33 @@  build_v2_super_template (void)
      const char ** extended_method_types;
      const char * demangled_name;
      const struct _prop_list_t * class_properties;
-   }
+  };
+
+  struct protocol_list_t
+  {
+    long count;
+    struct protocol_t protocols[];
+  };
 */
 static void
-build_v2_protocol_template (void)
+build_v2_protocol_templates (void)
 {
-  tree decls, *chain = NULL;
+  tree decls, protolist_pointer_type, protocol_pointer_type, *chain;
 
   objc_v2_protocol_template =
 	objc_start_struct (get_identifier (UTAG_V2_PROTOCOL));
 
   /* Class isa; */
+  chain = NULL;
   decls = add_field_decl (objc_object_type, "isa", &chain);
 
   /* char *protocol_name; */
   add_field_decl (string_type_node, "protocol_name", &chain);
 
   /* const struct protocol_list_t * const protocol_list; */
-  add_field_decl (build_pointer_type (objc_v2_protocol_template),
-		  "protocol_list", &chain);
+  protolist_pointer_type = build_pointer_type (xref_tag (RECORD_TYPE,
+				get_identifier (UTAG_V2_PROTOCOL_LIST)));
+  add_field_decl (protolist_pointer_type, "protocol_list", &chain);
 
   /* const struct method_list_t * const instance_methods; */
   add_field_decl (objc_method_proto_list_ptr,  "instance_methods", &chain);
@@ -815,6 +824,24 @@  build_v2_protocol_template (void)
   add_field_decl (objc_prop_list_ptr, "class_properties", &chain);
 
   objc_finish_struct (objc_v2_protocol_template, decls);
+
+  /* --- */
+
+  objc_v2_protocol_list_template =
+		objc_start_struct (get_identifier (UTAG_V2_PROTOCOL_LIST));
+  gcc_assert (TREE_TYPE (protolist_pointer_type)
+		== objc_v2_protocol_list_template);
+
+  /* long count; */
+  chain = NULL;
+  decls = add_field_decl (long_integer_type_node, "count", &chain);
+
+  /* struct protocol_t *protocols[]; */
+  protocol_pointer_type = build_pointer_type (objc_v2_protocol_template);
+  add_field_decl (build_array_type (protocol_pointer_type, 0), "protocols",
+		  &chain);
+
+  objc_finish_struct (objc_v2_protocol_list_template, decls);
 }
 
 /* Build type for a category:
@@ -850,7 +877,7 @@  build_v2_category_template (void)
   add_field_decl (objc_method_list_ptr, "class_methods", &chain);
 
   /* struct protocol_list_t *protocol_list; */
-  add_field_decl (build_pointer_type (objc_v2_protocol_template),
+  add_field_decl (build_pointer_type (objc_v2_protocol_list_template),
                   "protocol_list", &chain );
 
   /* struct _prop_list_t * properties; */
@@ -2323,9 +2350,9 @@  build_v2_protocol_list_address_table (void)
 static tree
 generate_v2_protocol_list (tree i_or_p, tree klass_ctxt)
 {
-  tree refs_decl, lproto, e, plist, ptempl_p_t;
+  tree refs_decl, lproto, plist, protocol_pointer_type, array_ctor, ctor;
   int size = 0;
-  vec<constructor_elt, va_gc> *initlist = NULL;
+  vec<constructor_elt, va_gc> *initlist = NULL, *subinitlist = NULL;
   char buf[BUFSIZE];
 
   if (TREE_CODE (i_or_p) == CLASS_INTERFACE_TYPE
@@ -2336,18 +2363,7 @@  generate_v2_protocol_list (tree i_or_p, tree klass_ctxt)
   else
     gcc_unreachable ();
 
-  /* Compute size.  */
-  for (lproto = plist; lproto; lproto = TREE_CHAIN (lproto))
-    if (TREE_CODE (TREE_VALUE (lproto)) == PROTOCOL_INTERFACE_TYPE
-	&& PROTOCOL_FORWARD_DECL (TREE_VALUE (lproto)))
-      size++;
-
-  /* Build initializer.  */
-
-  ptempl_p_t = build_pointer_type (objc_v2_protocol_template);
-  e = build_int_cst (ptempl_p_t, size);
-  CONSTRUCTOR_APPEND_ELT (initlist, NULL_TREE, e);
-
+  /* Compute size, and build array initializer.  */
   for (lproto = plist; lproto; lproto = TREE_CHAIN (lproto))
     {
       tree pval = TREE_VALUE (lproto);
@@ -2356,14 +2372,24 @@  generate_v2_protocol_list (tree i_or_p, tree klass_ctxt)
 	  && PROTOCOL_FORWARD_DECL (pval))
 	{
 	  tree fwref = PROTOCOL_FORWARD_DECL (pval);
-	  location_t loc = DECL_SOURCE_LOCATION (fwref) ;
-	  e = build_unary_op (loc, ADDR_EXPR, fwref, 0);
-	  CONSTRUCTOR_APPEND_ELT (initlist, NULL_TREE, e);
+	  location_t loc = DECL_SOURCE_LOCATION (fwref);
+	  tree e = build_unary_op (loc, ADDR_EXPR, fwref, 0);
+	  CONSTRUCTOR_APPEND_ELT (subinitlist,
+				  build_int_cst (NULL_TREE, size), e);
+
+	  size++;
 	}
     }
 
-  /* static struct protocol_list_t *list[size]; */
+  /* Build struct initializer.  */
+  CONSTRUCTOR_APPEND_ELT (initlist, NULL_TREE,
+			  build_int_cst (long_integer_type_node, size));
+  protocol_pointer_type = build_pointer_type (objc_v2_protocol_template);
+  array_ctor = objc_build_constructor (
+		build_array_type (protocol_pointer_type, 0), subinitlist);
+  CONSTRUCTOR_APPEND_ELT (initlist, NULL_TREE, array_ctor);
 
+  /* Build declaration name.  */
   switch (TREE_CODE (i_or_p))
     {
     case PROTOCOL_INTERFACE_TYPE:
@@ -2383,13 +2409,13 @@  generate_v2_protocol_list (tree i_or_p, tree klass_ctxt)
 	gcc_unreachable ();
     }
 
-  refs_decl = start_var_decl (build_sized_array_type (ptempl_p_t, size+1),
-			      buf);
+  /* Build declaration.  */
+  refs_decl = start_var_decl (objc_v2_protocol_list_template, buf);
   /* ObjC2 puts all these in the base section.  */
   OBJCMETA (refs_decl, objc_meta, meta_base);
   DECL_PRESERVE_P (refs_decl) = 1;
-  finish_var_decl (refs_decl,
-		   objc_build_constructor (TREE_TYPE (refs_decl),initlist));
+  ctor = objc_build_constructor (objc_v2_protocol_list_template, initlist);
+  finish_var_decl (refs_decl, ctor);
   return refs_decl;
 }
 
@@ -2794,8 +2820,9 @@  generate_v2_protocols (void)
       protocol_name_expr = add_objc_string (PROTOCOL_NAME (p), class_names);
 
       if (refs_decl)
-	refs_expr = convert (build_pointer_type (objc_v2_protocol_template),
-			     build_unary_op (loc, ADDR_EXPR, refs_decl, 0));
+	refs_expr = convert (
+		build_pointer_type (objc_v2_protocol_list_template),
+		build_unary_op (loc, ADDR_EXPR, refs_decl, 0));
       else
 	refs_expr = build_int_cst (NULL_TREE, 0);
 
@@ -2885,8 +2912,7 @@  build_v2_category_initializer (tree type, tree cat_name, tree class_name,
     expr = convert (ltyp, null_pointer_node);
   CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, expr);
 
-  /* protocol_list = */
-  ltyp = build_pointer_type (objc_v2_protocol_template);
+  ltyp = build_pointer_type (objc_v2_protocol_list_template);
   if (protocol_list)
     expr = convert (ltyp, build_unary_op (loc, ADDR_EXPR, protocol_list, 0));
   else
@@ -3238,8 +3264,7 @@  build_v2_class_ro_t_initializer (tree type, tree name,
   CONSTRUCTOR_APPEND_ELT (initlist, NULL_TREE, expr);
 
   /* baseProtocols */
-  ltyp = build_pointer_type (xref_tag (RECORD_TYPE,
-			               get_identifier (UTAG_V2_PROTOCOL_LIST)));
+  ltyp = build_pointer_type (objc_v2_protocol_list_template);
   if (baseProtocols)
     expr = convert (ltyp, build_unary_op (loc, ADDR_EXPR, baseProtocols, 0));
   else