From patchwork Wed Apr 22 19:11:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Giuliano Procida X-Patchwork-Id: 39120 From: gprocida@google.com (Giuliano Procida) Date: Wed, 22 Apr 2020 20:11:16 +0100 Subject: [PATCH 1/3] test-diff-suppr: Improve regexes in tests. In-Reply-To: <87a733d42i.fsf@seketeli.org> References: <20200420164153.5213-1-gprocida@google.com> <87a733d42i.fsf@seketeli.org> Message-ID: Hi there. Thanks for the review. On Wed, 22 Apr 2020 at 14:48, Dodji Seketeli wrote: > > Hello Giuliano, > > Giuliano Procida a ?crit: > > > The are some (mostly inconsequential) issues with regexes: > > > > - Leading and trailing .* are redundant and can be removed. > > They are redundant I agree, but that redundancy can be useful, I think. > The reason why I have put them in the test is that in the past, I have > tried different regular expression engines. Some of them would > implement the quantifier in a greedy manner, rather than in a lazy > manner. Whereas, what I find more useful is the lazy variant. So that > writing ".*blah" would do what you would expect (rather than having to > write .*?blah) If we only care about match existence (which is the case for libabigail), greediness shouldn't matter, the engine should backtrack. Do you mean possessive matching (I've just found this on Wikipedia, it's not something I thought regex engines had)? That would be painful to work with. If you want more features (like being able to match foofoo for any foo), you'll want PCRE. If you are thinking of embedding libabigail in a server, you'll want RE2. > So putting that in those tests explicitly will hopefully help us catch > potential issues if we are to switch to a different engine tomorrow. So > I'd prefer keeping that redundancy in those tests. Fair enough. > [...] > > > - File name matches are full path > > Are they? Correction, some file name matches seem to be full path. This may be a bug. I've found: Path in XML corpus_is_suppressed_by_soname_or_filename suppression_matches_soname_or_filename matches_binary_name (@param binary_name the full path to the binary) ELF path suppression_can_match matches_binary_name (@param binary_name the full path to the binary) I've not traced through the code, only looked at it. This change breaks make check: > It seems to me file_suppression::suppress_file() acts on the base name > of the file, rather than on its full path. > > I might be missing something here, please tell me. > > > so should start with (^|/) if trying to match a base name, assuming a > > Unix-like filesystem. > > I'd need the above to be clarified to say for sure. > > > Given these are just tests, it's not that important, but they > > still serve as examples. > > - In cases where the ^ anchor was used, full paths would usually > > fail to match. In such cases, the regex was being ignored for > > other reasons (see later patch) or is expected not to match > > anyway. > > - In many cases, the $ anchor could be considered to be missing. > > - The .ini parser unescapes string values, so escaping regex > > metacharacters requires a double backslash. Single backslashes > > are pointless. > > - The dot metacharacter is used unescaped in a few places where a > > literal was likely intended, so should be escaped. > > - The characters [ and ] don't need to be (.ini) escaped. > > They don't *need* to (in a string) I agree. But it they are escaped, > that should work. It's a way to test the init parser as well. So I'd > keep the redundancy there as well. I'd go as far to say "\." should be "\\.". In fact, changing all the tests to this exposes a bug in the init printer (it doesn't re-escape strings). For the rest, I'm much less opinionated. :-) > [...] > > Cheers, Regards, Giuliano. > -- > Dodji > > -- > To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com. > diff --git a/tests/data/test-diff-suppr/test24-soname-suppr-2.txt b/tests/data/test-diff-suppr/test24-soname-suppr-2.txt index d11e632c..8ac78817 100644 --- a/tests/data/test-diff-suppr/test24-soname-suppr-2.txt +++ b/tests/data/test-diff-suppr/test24-soname-suppr-2.txt @@ -1,4 +1,4 @@ [suppress_type] - file_name_regexp = .*/libtest24-soname-v0.so$ + file_name_regexp = ^libtest24-soname-v0.so$ name = S accessed_through = reference-or-pointer diff --git a/tests/data/test-diff-suppr/test24-soname-suppr-3.txt b/tests/data/test-diff-suppr/test24-soname-suppr-3.txt index 00d3428a..e94a6477 100644 --- a/tests/data/test-diff-suppr/test24-soname-suppr-3.txt +++ b/tests/data/test-diff-suppr/test24-soname-suppr-3.txt @@ -1,4 +1,4 @@ [suppress_type] - file_name_regexp = .*/libtest24-soname-v1.so$ + file_name_regexp = ^libtest24-soname-v1.so$ name = S accessed_through = reference-or-pointer