elf: Fix LD_AUDIT for modules with invalid version (BZ#24122)

Message ID 47113e5a-587c-e490-7be7-3c846d2a1a0b@linaro.org
State Dropped
Headers

Commit Message

Adhemerval Zanella Jan. 24, 2019, 3:16 p.m. UTC
  On 24/01/2019 12:32, Carlos O'Donell wrote:
> On 1/24/19 9:16 AM, Florian Weimer wrote:
>> * Adhemerval Zanella:
>>
>>> On 24/01/2019 11:20, Florian Weimer wrote:
>>>> * Adhemerval Zanella:
>>>>
>>>>> On 24/01/2019 10:33, Florian Weimer wrote:
>>>>>> * Siddhesh Poyarekar:
>>>>>>
>>>>>>> On 24/01/19 2:14 AM, Carlos O'Donell wrote:
>>>>>>>>> Siddhesh,
>>>>>>>>>
>>>>>>>>> It is acceptable for 2.29?
>>>>>>>>  
>>>>>>>> It's OK with me. Siddhesh gets to make the call.
>>>>>>>> This is only a bug fix.
>>>>>>>>
>>>>>>>> Reviewed-by: Carlos O'Donell <carlos@redhat.com>
>>>>>>>
>>>>>>> OK
>>>>>>
>>>>>> This broke the build with at least GCC 8 and the GCC 9 pre-release on
>>>>>> x86-64 and ppc64le:
>>>>>>
>>>>>> rtld.c: In function ‘dl_main’:
>>>>>> rtld.c:1556:8: error: ‘lav’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
>>>>>>         _dl_debug_printf (
>>>>>>         ^~~~~~~~~~~~~~~~~~
>>>>>>  "  auditor disabled since expected version %d is greater than "
>>>>>>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>  "supported version %d.\n",
>>>>>>  ~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>             lav, LAV_CURRENT);
>>>>>>             ~~~~~~~~~~~~~~~~~
>>>>>>
>>>>>> Looks like the control flow is too complicated.  I think it's a false
>>>>>> positive.
>>>>>>
>>>>>> How should we fix this?  Inhibit the warning (making the sources even
>>>>>> more convoluted than they are)?  Revert the patch?  Factor out the audit
>>>>>> module loading into a separate function and simplify the control flow?
>>>>>
>>>>> It does seem a false positive, since lav will always be set if laversion
>>>>> is not NULL and the code explicitly checks for it. Let me try if
>>>>> refactoring this can help gcc handle it correctly.
>>>>
>>>> I'm testing this patch.  It changes the error messages slightly.
>>>>
>>>> It also calls _dl_close if we cannot find the laversion symbol,
>>>> something the previous code did not do.
>>>
>>> The current code does calls _dl_close in this case, it does not if
>>> _dl_catch_error returns an err_str (in this case the object is not
>>> loaded).
>>
>> I meant this part:
>>
>>               /* Check whether the interface version matches.  */
>>               (void) _dl_catch_error (&objname, &err_str, &malloced,
>>                                       lookup_doit, &largs);
>>
>>               unsigned int (*laversion) (unsigned int);
>>               unsigned int lav;
>>               if (err_str != NULL)
>>                 goto not_loaded;
>>
>> The not_loaded label is in the true branch of the surrounding if, and
>> the unloading on error only happens at the end of the else branch.
>>
>>>> +/* Called the audit DSO cannot be used, if it does not have the
>>>> +   appropriate interfaces or it expects something more recent.  */
>>>> +static void
>>>> +unload_audit_modile (struct link_map *map, int original_tls_idx)
>>>
>>> s/modile/module
>>
>> Fixed.
>>
>>>> +  (void) _dl_catch_error (&objname, &err_str, &malloced,
>>>> +			  lookup_doit, &largs);
>>>
>>> Do we need this (void) cast?
>>
>> No, I can remove it, it came from the original code.  I removed it.
>>
>>>
>>>> +  if (__glibc_likely (err_str != NULL))
>>>> +    {
>>>> +      unload_audit_modile (dlmargs.map, original_tls_idx);
>>>> +      report_audit_module_load_error (name, err_str, malloced);
>>>> +      return;
>>>> +    }
>>>> +
>>>> +  unsigned int (*laversion) (unsigned int) = largs.result;
>>>> +  if (laversion == NULL)
>>>> +    {
>>>> +      _dl_error_printf ("\
>>>> +ERROR: ld.so: audit interface '%s' has NULL laversion symbol; ignored.\n",
>>>> +			name);
>>>
>>> Maybe "ERROR: ld.so: audit interface does not contain a laversion symbol; ignored.\n" ?
>>
>> No, the missing symbol case is covered earlier, where lookup_doit will
>> fail.  This is really about a NULL symbol, a different error case.
>> Maybe this is so bizarre that we should drop this check?  (You need an
>> SHN_ABS symbol or an IFUNC resolver that returns NULL to get this.)
>>
>>>> +      unload_audit_modile (dlmargs.map, original_tls_idx);
>>>> +      return;
>>>> +    }
>>>> +
>>>> +  unsigned lav = laversion (LAV_CURRENT);
>>>> +  if (lav == 0)
>>>> +    {
>>>> +      /* Only print an error message if debugging because this can
>>>> +	 happen deliberately.  */
>>>> +      if (GLRO(dl_debug_mask) & DL_DEBUG_FILES)
>>>> +	_dl_debug_printf ("\
>>>> +audit interface '%s' requested to be ignored (returned version of 0).\n",
>>>> +			  name);
>>>
>>> I used the '\nfile=%s...' to follow the messages already printed for DL_DEBUG_FILES.
>>> Maybe 'file=%s requested to be ignored (returned version of 0).\n".
>>
>> This doesn't say that it's ignored as an audit module.
>>
>>>> +      unload_audit_modile (dlmargs.map, original_tls_idx);
>>>> +      return;
>>>> +    }
>>>> +
>>>> +  if (lav > LAV_CURRENT)
>>>> +    {
>>>> +      _dl_debug_printf ("\
>>>> +ERROR: audit interface '%s' requires version %d (maximum supported version %d); ignored.\n",
>>>> +			name, lav, LAV_CURRENT);
>>>
>>> My understanding is Carlos asked to this scenario not be reported as
>>> an error.
>>
>> I think Carlos meant the lav == 0 case, where I agree that warning
>> should be suppressed by default.  The lav > LAV_CURRENT suggests a
>> broken audit module because it did not properly check the required
>> version against LAV_CURRENT.  (If it does not want ld.so to log an
>> error, the audit module can always return 0 instead.)
>>
>> New patch below.
>>
>> Thanks,
>> Florian
>>
>> elf: Refactor audit module loading out of dl_main
>>
>> This avoids a compiler warning about an uninitialized lav variable
>> (seen with GCC 8 and 9 on x86-64).
>>
>> 2019-01-24  Florian Weimer  <fweimer@redhat.com>
>>
>> 	* elf/rtld.c (unload_audit_module): New function.
>> 	(report_audit_module_load_error): Likewise.
>> 	(load_audit_module): Likewise.  Extracted from dl_main.  Call
>> 	_dl_close if the laversion symbol cannot be found.  Use early
>> 	returns for error handling.  Remove spurious comment about static
>> 	TLS initialization.  Remove (void) casts of function results.
>> 	(load_audit_modules): New function.  Extracted from dl_main.
>> 	(dl_main): Call load_audit_modules.
> 
> This work looks great.
> 
> However, this has gotten too big.
> 
> Please revert the fix, and put this back in when 2.30 opens.

I would like to avoid the revert, I think based on new requirements
regarding laversion handling that we can use simpler patch and work
towards the refactor on 2.30:

---
---

what do you think? 


> 
> I have been convinced of the following:
> 
> * No error or warning when lav == 0.
> 
> * A warning when lav > LAV_CURRENT, because such modules can see
>   LAV_CURRENT as input and should disable themselves by returning
>   0, rather than returning a really high lav. This allows users
>   to catch this mistake (Florian convinced me of this on IRC).
> 
> Lastly, we should not look for a la_version == NULL, which really
> means you had a (null) symbol (hard to create), but should instead
> just assert(la_version != NULL) in this case I think.
  

Comments

Carlos O'Donell Jan. 25, 2019, 1:07 a.m. UTC | #1
On 1/24/19 10:16 AM, Adhemerval Zanella wrote:
> what do you think? 

We're not a rush for this.

I suggest:

* Revert the existing changes which break the build with gcc9.

* Post a clean v3 for Florian and I to review.

Siddhesh needs to make the call for freezing everything.

The Red Hat team is basically waiting for the full freeze to
kick off final testing and results.
  
Siddhesh Poyarekar Jan. 25, 2019, 1:38 a.m. UTC | #2
On 25/01/19 6:37 AM, Carlos O'Donell wrote:
> On 1/24/19 10:16 AM, Adhemerval Zanella wrote:
>> what do you think? 
> 
> We're not a rush for this.
> 
> I suggest:
> 
> * Revert the existing changes which break the build with gcc9.
> 
> * Post a clean v3 for Florian and I to review.
> 
> Siddhesh needs to make the call for freezing everything.
> 
> The Red Hat team is basically waiting for the full freeze to
> kick off final testing and results.

Agreed, I would like to do the freeze at least by Sunday night so that I
have the last four days to run tests and cut the release branch.  You
can always backport the fix to the 2.29 branch when it's ready.

Siddhesh
  
Adhemerval Zanella Jan. 25, 2019, 10:28 a.m. UTC | #3
On 24/01/2019 23:38, Siddhesh Poyarekar wrote:
> On 25/01/19 6:37 AM, Carlos O'Donell wrote:
>> On 1/24/19 10:16 AM, Adhemerval Zanella wrote:
>>> what do you think? 
>>
>> We're not a rush for this.
>>
>> I suggest:
>>
>> * Revert the existing changes which break the build with gcc9.
>>
>> * Post a clean v3 for Florian and I to review.
>>
>> Siddhesh needs to make the call for freezing everything.
>>
>> The Red Hat team is basically waiting for the full freeze to
>> kick off final testing and results.
> 
> Agreed, I would like to do the freeze at least by Sunday night so that I
> have the last four days to run tests and cut the release branch.  You
> can always backport the fix to the 2.29 branch when it's ready.
> 
> Siddhesh
> 

Patch reverted.
  

Patch

diff --git a/elf/rtld.c b/elf/rtld.c
index 9e0f752..305b443 100644
--- a/elf/rtld.c
+++ b/elf/rtld.c
@@ -1434,7 +1434,6 @@  ERROR: '%s': cannot process note segment.\n", _dl_argv[0]);
 				  &dlmargs);
 	  if (__glibc_unlikely (err_str != NULL))
 	    {
-	    not_loaded:
 	      _dl_error_printf ("\
 ERROR: ld.so: object '%s' cannot be loaded as audit interface: %s; ignored.\n",
 				name, err_str);
@@ -1446,18 +1445,20 @@  ERROR: ld.so: object '%s' cannot be loaded as audit interface: %s; ignored.\n",
 	      struct lookup_args largs;
 	      largs.name = "la_version";
 	      largs.map = dlmargs.map;
+	      Lmid_t ns;
 
 	      /* Check whether the interface version matches.  */
 	      (void) _dl_catch_error (&objname, &err_str, &malloced,
 				      lookup_doit, &largs);
+	      if (err_str != NULL)
+		goto fail_lookup;
 
 	      unsigned int (*laversion) (unsigned int);
-	      unsigned int lav;
-	      if (err_str != NULL)
-		goto not_loaded;
+	      laversion = largs.result;
+	      assert (laversion != NULL);
 
-	      if ((laversion = largs.result) != NULL
-		  && (lav = laversion (LAV_CURRENT)) > 0
+	      unsigned int lav = laversion (LAV_CURRENT);
+	      if ((lav = laversion (LAV_CURRENT)) > 0
 		  && lav <= LAV_CURRENT)
 		{
 		  /* Allocate structure for the callback function pointers.
@@ -1530,8 +1531,17 @@  ERROR: ld.so: object '%s' cannot be loaded as audit interface: %s; ignored.\n",
 		  /* We cannot use the DSO, it does not have the
 		     appropriate interfaces or it expects something
 		     more recent.  */
+		  if (lav > LAV_CURRENT)
+		    _dl_error_printf ("\
+ERROR: audit interface '%s' requires version %d (maximum supported version %d); ignored.\n",
+				      name, lav, LAV_CURRENT);
+		  else if (GLRO(dl_debug_mask) & DL_DEBUG_FILES)
+		    _dl_debug_printf ("\
+\nfile=%s cannot be loaded as audit interface; ignored.\n", name);
+
+		fail_lookup:
 #ifndef NDEBUG
-		  Lmid_t ns = dlmargs.map->l_ns;
+		  ns = dlmargs.map->l_ns;
 #endif
 		  _dl_close (dlmargs.map);
 
@@ -1540,25 +1550,6 @@  ERROR: ld.so: object '%s' cannot be loaded as audit interface: %s; ignored.\n",
 		  assert (GL(dl_ns)[ns]._ns_nloaded == 0);
 
 		  GL(dl_tls_max_dtv_idx) = tls_idx;
-		  if (GLRO(dl_debug_mask) & DL_DEBUG_FILES)
-		    {
-		      _dl_debug_printf ("\
-\nfile=%s cannot be loaded as audit interface; ignored.\n", name);
-		      if (laversion == NULL)
-			_dl_debug_printf (
-"  la_version function not found.\n");
-		      else
-			{
-			  if (lav == 0)
-			    _dl_debug_printf (
-"  auditor requested to be ignored (returned version of 0).\n");
-			  else
-			    _dl_debug_printf (
-"  auditor disabled since expected version %d is greater than "
-"supported version %d.\n",
-					      lav, LAV_CURRENT);
-			}
-		    }
 		}
 	    }
 	}