Patchwork elf: Fix LD_AUDIT for modules with invalid version (BZ#24122)

login
register
mail settings
Submitter Adhemerval Zanella Netto
Date Jan. 24, 2019, 3:16 p.m.
Message ID <47113e5a-587c-e490-7be7-3c846d2a1a0b@linaro.org>
Download mbox | patch
Permalink /patch/31204/
State New
Headers show

Comments

Adhemerval Zanella Netto - Jan. 24, 2019, 3:16 p.m.
On 24/01/2019 12:32, Carlos O'Donell wrote:
> On 1/24/19 9:16 AM, Florian Weimer wrote:
>> * Adhemerval Zanella:
>>
>>> On 24/01/2019 11:20, Florian Weimer wrote:
>>>> * Adhemerval Zanella:
>>>>
>>>>> On 24/01/2019 10:33, Florian Weimer wrote:
>>>>>> * Siddhesh Poyarekar:
>>>>>>
>>>>>>> On 24/01/19 2:14 AM, Carlos O'Donell wrote:
>>>>>>>>> Siddhesh,
>>>>>>>>>
>>>>>>>>> It is acceptable for 2.29?
>>>>>>>>  
>>>>>>>> It's OK with me. Siddhesh gets to make the call.
>>>>>>>> This is only a bug fix.
>>>>>>>>
>>>>>>>> Reviewed-by: Carlos O'Donell <carlos@redhat.com>
>>>>>>>
>>>>>>> OK
>>>>>>
>>>>>> This broke the build with at least GCC 8 and the GCC 9 pre-release on
>>>>>> x86-64 and ppc64le:
>>>>>>
>>>>>> rtld.c: In function ‘dl_main’:
>>>>>> rtld.c:1556:8: error: ‘lav’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
>>>>>>         _dl_debug_printf (
>>>>>>         ^~~~~~~~~~~~~~~~~~
>>>>>>  "  auditor disabled since expected version %d is greater than "
>>>>>>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>  "supported version %d.\n",
>>>>>>  ~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>             lav, LAV_CURRENT);
>>>>>>             ~~~~~~~~~~~~~~~~~
>>>>>>
>>>>>> Looks like the control flow is too complicated.  I think it's a false
>>>>>> positive.
>>>>>>
>>>>>> How should we fix this?  Inhibit the warning (making the sources even
>>>>>> more convoluted than they are)?  Revert the patch?  Factor out the audit
>>>>>> module loading into a separate function and simplify the control flow?
>>>>>
>>>>> It does seem a false positive, since lav will always be set if laversion
>>>>> is not NULL and the code explicitly checks for it. Let me try if
>>>>> refactoring this can help gcc handle it correctly.
>>>>
>>>> I'm testing this patch.  It changes the error messages slightly.
>>>>
>>>> It also calls _dl_close if we cannot find the laversion symbol,
>>>> something the previous code did not do.
>>>
>>> The current code does calls _dl_close in this case, it does not if
>>> _dl_catch_error returns an err_str (in this case the object is not
>>> loaded).
>>
>> I meant this part:
>>
>>               /* Check whether the interface version matches.  */
>>               (void) _dl_catch_error (&objname, &err_str, &malloced,
>>                                       lookup_doit, &largs);
>>
>>               unsigned int (*laversion) (unsigned int);
>>               unsigned int lav;
>>               if (err_str != NULL)
>>                 goto not_loaded;
>>
>> The not_loaded label is in the true branch of the surrounding if, and
>> the unloading on error only happens at the end of the else branch.
>>
>>>> +/* Called the audit DSO cannot be used, if it does not have the
>>>> +   appropriate interfaces or it expects something more recent.  */
>>>> +static void
>>>> +unload_audit_modile (struct link_map *map, int original_tls_idx)
>>>
>>> s/modile/module
>>
>> Fixed.
>>
>>>> +  (void) _dl_catch_error (&objname, &err_str, &malloced,
>>>> +			  lookup_doit, &largs);
>>>
>>> Do we need this (void) cast?
>>
>> No, I can remove it, it came from the original code.  I removed it.
>>
>>>
>>>> +  if (__glibc_likely (err_str != NULL))
>>>> +    {
>>>> +      unload_audit_modile (dlmargs.map, original_tls_idx);
>>>> +      report_audit_module_load_error (name, err_str, malloced);
>>>> +      return;
>>>> +    }
>>>> +
>>>> +  unsigned int (*laversion) (unsigned int) = largs.result;
>>>> +  if (laversion == NULL)
>>>> +    {
>>>> +      _dl_error_printf ("\
>>>> +ERROR: ld.so: audit interface '%s' has NULL laversion symbol; ignored.\n",
>>>> +			name);
>>>
>>> Maybe "ERROR: ld.so: audit interface does not contain a laversion symbol; ignored.\n" ?
>>
>> No, the missing symbol case is covered earlier, where lookup_doit will
>> fail.  This is really about a NULL symbol, a different error case.
>> Maybe this is so bizarre that we should drop this check?  (You need an
>> SHN_ABS symbol or an IFUNC resolver that returns NULL to get this.)
>>
>>>> +      unload_audit_modile (dlmargs.map, original_tls_idx);
>>>> +      return;
>>>> +    }
>>>> +
>>>> +  unsigned lav = laversion (LAV_CURRENT);
>>>> +  if (lav == 0)
>>>> +    {
>>>> +      /* Only print an error message if debugging because this can
>>>> +	 happen deliberately.  */
>>>> +      if (GLRO(dl_debug_mask) & DL_DEBUG_FILES)
>>>> +	_dl_debug_printf ("\
>>>> +audit interface '%s' requested to be ignored (returned version of 0).\n",
>>>> +			  name);
>>>
>>> I used the '\nfile=%s...' to follow the messages already printed for DL_DEBUG_FILES.
>>> Maybe 'file=%s requested to be ignored (returned version of 0).\n".
>>
>> This doesn't say that it's ignored as an audit module.
>>
>>>> +      unload_audit_modile (dlmargs.map, original_tls_idx);
>>>> +      return;
>>>> +    }
>>>> +
>>>> +  if (lav > LAV_CURRENT)
>>>> +    {
>>>> +      _dl_debug_printf ("\
>>>> +ERROR: audit interface '%s' requires version %d (maximum supported version %d); ignored.\n",
>>>> +			name, lav, LAV_CURRENT);
>>>
>>> My understanding is Carlos asked to this scenario not be reported as
>>> an error.
>>
>> I think Carlos meant the lav == 0 case, where I agree that warning
>> should be suppressed by default.  The lav > LAV_CURRENT suggests a
>> broken audit module because it did not properly check the required
>> version against LAV_CURRENT.  (If it does not want ld.so to log an
>> error, the audit module can always return 0 instead.)
>>
>> New patch below.
>>
>> Thanks,
>> Florian
>>
>> elf: Refactor audit module loading out of dl_main
>>
>> This avoids a compiler warning about an uninitialized lav variable
>> (seen with GCC 8 and 9 on x86-64).
>>
>> 2019-01-24  Florian Weimer  <fweimer@redhat.com>
>>
>> 	* elf/rtld.c (unload_audit_module): New function.
>> 	(report_audit_module_load_error): Likewise.
>> 	(load_audit_module): Likewise.  Extracted from dl_main.  Call
>> 	_dl_close if the laversion symbol cannot be found.  Use early
>> 	returns for error handling.  Remove spurious comment about static
>> 	TLS initialization.  Remove (void) casts of function results.
>> 	(load_audit_modules): New function.  Extracted from dl_main.
>> 	(dl_main): Call load_audit_modules.
> 
> This work looks great.
> 
> However, this has gotten too big.
> 
> Please revert the fix, and put this back in when 2.30 opens.

I would like to avoid the revert, I think based on new requirements
regarding laversion handling that we can use simpler patch and work
towards the refactor on 2.30:

---
---

what do you think? 


> 
> I have been convinced of the following:
> 
> * No error or warning when lav == 0.
> 
> * A warning when lav > LAV_CURRENT, because such modules can see
>   LAV_CURRENT as input and should disable themselves by returning
>   0, rather than returning a really high lav. This allows users
>   to catch this mistake (Florian convinced me of this on IRC).
> 
> Lastly, we should not look for a la_version == NULL, which really
> means you had a (null) symbol (hard to create), but should instead
> just assert(la_version != NULL) in this case I think.
Carlos O'Donell - Jan. 25, 2019, 1:07 a.m.
On 1/24/19 10:16 AM, Adhemerval Zanella wrote:
> what do you think? 

We're not a rush for this.

I suggest:

* Revert the existing changes which break the build with gcc9.

* Post a clean v3 for Florian and I to review.

Siddhesh needs to make the call for freezing everything.

The Red Hat team is basically waiting for the full freeze to
kick off final testing and results.
Siddhesh Poyarekar - Jan. 25, 2019, 1:38 a.m.
On 25/01/19 6:37 AM, Carlos O'Donell wrote:
> On 1/24/19 10:16 AM, Adhemerval Zanella wrote:
>> what do you think? 
> 
> We're not a rush for this.
> 
> I suggest:
> 
> * Revert the existing changes which break the build with gcc9.
> 
> * Post a clean v3 for Florian and I to review.
> 
> Siddhesh needs to make the call for freezing everything.
> 
> The Red Hat team is basically waiting for the full freeze to
> kick off final testing and results.

Agreed, I would like to do the freeze at least by Sunday night so that I
have the last four days to run tests and cut the release branch.  You
can always backport the fix to the 2.29 branch when it's ready.

Siddhesh
Adhemerval Zanella Netto - Jan. 25, 2019, 10:28 a.m.
On 24/01/2019 23:38, Siddhesh Poyarekar wrote:
> On 25/01/19 6:37 AM, Carlos O'Donell wrote:
>> On 1/24/19 10:16 AM, Adhemerval Zanella wrote:
>>> what do you think? 
>>
>> We're not a rush for this.
>>
>> I suggest:
>>
>> * Revert the existing changes which break the build with gcc9.
>>
>> * Post a clean v3 for Florian and I to review.
>>
>> Siddhesh needs to make the call for freezing everything.
>>
>> The Red Hat team is basically waiting for the full freeze to
>> kick off final testing and results.
> 
> Agreed, I would like to do the freeze at least by Sunday night so that I
> have the last four days to run tests and cut the release branch.  You
> can always backport the fix to the 2.29 branch when it's ready.
> 
> Siddhesh
> 

Patch reverted.

Patch

diff --git a/elf/rtld.c b/elf/rtld.c
index 9e0f752..305b443 100644
--- a/elf/rtld.c
+++ b/elf/rtld.c
@@ -1434,7 +1434,6 @@  ERROR: '%s': cannot process note segment.\n", _dl_argv[0]);
 				  &dlmargs);
 	  if (__glibc_unlikely (err_str != NULL))
 	    {
-	    not_loaded:
 	      _dl_error_printf ("\
 ERROR: ld.so: object '%s' cannot be loaded as audit interface: %s; ignored.\n",
 				name, err_str);
@@ -1446,18 +1445,20 @@  ERROR: ld.so: object '%s' cannot be loaded as audit interface: %s; ignored.\n",
 	      struct lookup_args largs;
 	      largs.name = "la_version";
 	      largs.map = dlmargs.map;
+	      Lmid_t ns;
 
 	      /* Check whether the interface version matches.  */
 	      (void) _dl_catch_error (&objname, &err_str, &malloced,
 				      lookup_doit, &largs);
+	      if (err_str != NULL)
+		goto fail_lookup;
 
 	      unsigned int (*laversion) (unsigned int);
-	      unsigned int lav;
-	      if (err_str != NULL)
-		goto not_loaded;
+	      laversion = largs.result;
+	      assert (laversion != NULL);
 
-	      if ((laversion = largs.result) != NULL
-		  && (lav = laversion (LAV_CURRENT)) > 0
+	      unsigned int lav = laversion (LAV_CURRENT);
+	      if ((lav = laversion (LAV_CURRENT)) > 0
 		  && lav <= LAV_CURRENT)
 		{
 		  /* Allocate structure for the callback function pointers.
@@ -1530,8 +1531,17 @@  ERROR: ld.so: object '%s' cannot be loaded as audit interface: %s; ignored.\n",
 		  /* We cannot use the DSO, it does not have the
 		     appropriate interfaces or it expects something
 		     more recent.  */
+		  if (lav > LAV_CURRENT)
+		    _dl_error_printf ("\
+ERROR: audit interface '%s' requires version %d (maximum supported version %d); ignored.\n",
+				      name, lav, LAV_CURRENT);
+		  else if (GLRO(dl_debug_mask) & DL_DEBUG_FILES)
+		    _dl_debug_printf ("\
+\nfile=%s cannot be loaded as audit interface; ignored.\n", name);
+
+		fail_lookup:
 #ifndef NDEBUG
-		  Lmid_t ns = dlmargs.map->l_ns;
+		  ns = dlmargs.map->l_ns;
 #endif
 		  _dl_close (dlmargs.map);
 
@@ -1540,25 +1550,6 @@  ERROR: ld.so: object '%s' cannot be loaded as audit interface: %s; ignored.\n",
 		  assert (GL(dl_ns)[ns]._ns_nloaded == 0);
 
 		  GL(dl_tls_max_dtv_idx) = tls_idx;
-		  if (GLRO(dl_debug_mask) & DL_DEBUG_FILES)
-		    {
-		      _dl_debug_printf ("\
-\nfile=%s cannot be loaded as audit interface; ignored.\n", name);
-		      if (laversion == NULL)
-			_dl_debug_printf (
-"  la_version function not found.\n");
-		      else
-			{
-			  if (lav == 0)
-			    _dl_debug_printf (
-"  auditor requested to be ignored (returned version of 0).\n");
-			  else
-			    _dl_debug_printf (
-"  auditor disabled since expected version %d is greater than "
-"supported version %d.\n",
-					      lav, LAV_CURRENT);
-			}
-		    }
 		}
 	    }
 	}