From: Sebastian Huber
To: gcc-patches@gcc.gnu.org
Subject: [gcov v2 14/14] gcov: Add section for freestanding environments
Date: Mon, 25 Apr 2022 09:09:29 +0200
Message-Id: <20220425070929.7466-15-sebastian.huber@embedded-brains.de>
In-Reply-To: <20220425070929.7466-1-sebastian.huber@embedded-brains.de>
References: <20220425070929.7466-1-sebastian.huber@embedded-brains.de>
X-Patchwork-Submitter: Sebastian Huber
X-Patchwork-Id: 53162

gcc/

	* doc/gcov.texi (Profiling and Test Coverage in Freestanding
	Environments): New section.
---
 gcc/doc/gcov.texi | 375 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 375 insertions(+)

diff --git a/gcc/doc/gcov.texi b/gcc/doc/gcov.texi
index fc39da0f02d..751a11314f3 100644
--- a/gcc/doc/gcov.texi
+++ b/gcc/doc/gcov.texi
@@ -41,6 +41,8 @@ test code coverage in your programs.
 * Gcov and Optimization::       Using gcov with GCC optimization.
 * Gcov Data Files::             The files used by gcov.
 * Cross-profiling::             Data file relocation.
+* Freestanding Environments::   How to use profiling and test
+                                coverage in freestanding environments.
 @end menu
 
 @node Gcov Intro
@@ -971,3 +973,376 @@ setting will name the data file @file{/target/run/build/foo.gcda}.
 You must move the data files to the expected directory tree in order to
 use them for profile directed optimizations (@option{-fprofile-use}), or to
 use the @command{gcov} tool.
+
+@node Freestanding Environments
+@section Profiling and Test Coverage in Freestanding Environments
+
+If your application runs in a hosted environment such as GNU/Linux, then
+this section is likely not relevant to you.  This section is intended for
+application developers targeting freestanding environments (for example
+embedded systems) with limited resources, in particular systems or test
+cases which do not support constructors/destructors or the C library file
+I/O.  In this section, the @dfn{target system} runs your application
+instrumented for profiling or test coverage.  You develop and analyze your
+application on the @dfn{host system}.  We first give an overview of how
+profiling and test coverage can be obtained in this scenario, followed by a
+tutorial which can be exercised on the host system.  Finally, some system
+initialization caveats are listed.
+
+@subsection Overview
+
+For an application instrumented for profiling or test coverage, the compiler
+generates some global data structures which are updated by instrumentation code
+while the application runs.  These data structures are called the @dfn{gcov
+information}.  Normally, when the application exits, the gcov information is
+stored to @file{.gcda} files.  There is one file per translation unit
+instrumented for profiling or test coverage.  The function
+@code{__gcov_exit()}, which stores the gcov information to a file, is called by
+a global destructor function for each translation unit instrumented for
+profiling or test coverage.  It runs at process exit.
+In a global constructor function, the @code{__gcov_init()} function is called
+to register the gcov information of a translation unit in a global list.  In
+some situations, this procedure does not work.  Firstly, if you want to profile
+the global constructor or exit processing of an operating system, the
+compiler-generated functions may conflict with the test objectives.  Secondly,
+you may want to test early parts of the system initialization or abnormal
+program behaviour which do not allow a global constructor or exit processing.
+Thirdly, you need a filesystem to store the files.
+
+The @option{-fprofile-info-section} GCC option enables you to use profiling and
+test coverage in freestanding environments.  This option disables the use of
+global constructors and destructors for the gcov information.  Instead, a
+pointer to the gcov information is stored in a special linker input section for
+each translation unit which is compiled with this option.  By default, the
+section name is @code{.gcov_info}.  The gcov information is statically
+initialized.  The pointers to the gcov information from all translation units
+of an executable can be collected by the linker in a contiguous memory block.
+For the GNU linker, the linker script output section definition below can be
+used to achieve this:
+
+@smallexample
+  .gcov_info      :
+  @{
+    PROVIDE (__gcov_info_start = .);
+    KEEP (*(.gcov_info))
+    PROVIDE (__gcov_info_end = .);
+  @}
+@end smallexample
+
+The linker will provide two global symbols, @code{__gcov_info_start} and
+@code{__gcov_info_end}, which define the start and end of the array of pointers
+to gcov information blocks, respectively.  The @code{KEEP ()} directive is
+required to prevent garbage collection of the pointers, since they are not
+directly referenced by anything in the executable.  The section may be placed
+in a read-only memory area.
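As a minimal sketch of how the two linker-provided symbols delimit the pointer array, the helper below counts the entries in a half-open range. The name `count_gcov_info` is hypothetical and not part of libgcov; on a target, the two arguments would be `__gcov_info_start` and `__gcov_info_end` from the linker script above.

```c
#include <assert.h>
#include <stddef.h>

/* Opaque in this sketch; the real definition is internal to libgcov.  */
struct gcov_info;

/* Hypothetical helper (not a libgcov function): return the number of gcov
   information pointers in the half-open range [start, end).  */
static size_t
count_gcov_info (const struct gcov_info *const *start,
                 const struct gcov_info *const *end)
{
  /* The linker script places the end symbol after the start symbol.  */
  assert (start <= end);
  return (size_t) (end - start);
}
```

With the tutorial program further below, which links two instrumented object files, such a helper would be expected to return 2, one pointer per translation unit.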
+
+In order to transfer the profiling and test coverage data from the target to
+the host system, the application has to provide a function to produce a
+reliable in-order byte stream from the target to the host.  The byte stream may
+be compressed and encoded using error detection and correction codes to meet
+application-specific requirements.  The @file{libgcov} target library provided
+by GCC offers two functions, @code{__gcov_info_to_gcda()} and
+@code{__gcov_filename_to_gcfn()}, to generate a byte stream from a gcov
+information block.  The functions are declared in @code{#include <gcov.h>}.  The
+byte stream can be deserialized by the @command{merge-stream} subcommand of the
+@command{gcov-tool} to create or update @file{.gcda} files in the host
+filesystem for the instrumented application.
+
+@subsection Tutorial
+
+This tutorial should be exercised on the host system.  We will build a program
+instrumented for test coverage.  The program runs an application and dumps the
+gcov information to @file{stderr} encoded as a printable character stream.  The
+application simply decodes such character streams from @file{stdin} and writes
+the decoded character stream to @file{stdout} (warning: this is binary data).
+The decoded character stream is consumed by the @command{merge-stream}
+subcommand of the @command{gcov-tool} to create or update the @file{.gcda}
+files.
+
+To get started, create an empty directory.  Change into the new directory.
+Create a header file @file{app.h} with the following content:
+
+@smallexample
+static const unsigned char a = 'a';
+
+static inline unsigned char *
+encode (unsigned char c, unsigned char buf[2])
+@{
+  buf[0] = c % 16 + a;
+  buf[1] = (c / 16) % 16 + a;
+  return buf;
+@}
+
+extern void application (void);
+@end smallexample
+
+Create a source file @file{app.c} with the following content:
+
+@smallexample
+#include "app.h"
+
+#include <stdio.h>
+
+/* The application reads a character stream encoded by encode() from stdin,
+   decodes it, and writes the decoded characters to stdout.  Characters other
+   than the 16 characters 'a' to 'p' are ignored.  */
+
+static int can_decode (unsigned char c)
+@{
+  return (unsigned char)(c - a) < 16;
+@}
+
+void
+application (void)
+@{
+  int first = 1;
+  int i;
+  unsigned char c;
+
+  while ((i = fgetc (stdin)) != EOF)
+    @{
+      unsigned char x = (unsigned char)i;
+
+      if (can_decode (x))
+        @{
+          if (first)
+            c = x - a;
+          else
+            fputc (c + 16 * (x - a), stdout);
+          first = !first;
+        @}
+      else
+        first = 1;
+    @}
+@}
+@end smallexample
+
+Create a source file @file{main.c} with the following content:
+
+@smallexample
+#include "app.h"
+
+#include <gcov.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+/* The start and end symbols are provided by the linker script.  We use the
+   array notation to avoid issues with a potential small-data area.  */
+
+extern const struct gcov_info *const __gcov_info_start[];
+extern const struct gcov_info *const __gcov_info_end[];
+
+/* This function shall produce a reliable in-order byte stream to transfer the
+   gcov information from the target to the host system.  */
+
+static void
+dump (const void *d, unsigned n, void *arg)
+@{
+  (void)arg;
+  const unsigned char *c = d;
+  unsigned char buf[2];
+
+  for (unsigned i = 0; i < n; ++i)
+    fwrite (encode (c[i], buf), sizeof (buf), 1, stderr);
+@}
+
+/* The filename is serialized to a gcfn data stream by the
+   __gcov_filename_to_gcfn() function.
+   The gcfn data is used by the "merge-stream" subcommand of the "gcov-tool"
+   to figure out the filename associated with the gcov information.  */
+
+static void
+filename (const char *f, void *arg)
+@{
+  __gcov_filename_to_gcfn (f, dump, arg);
+@}
+
+/* The __gcov_info_to_gcda() function may have to allocate memory under
+   certain conditions.  Simply try out whether it is needed for your
+   application.  */
+
+static void *
+allocate (unsigned length, void *arg)
+@{
+  (void)arg;
+  return malloc (length);
+@}
+
+/* Dump the gcov information of all translation units.  */
+
+static void
+dump_gcov_info (void)
+@{
+  const struct gcov_info *const *info = __gcov_info_start;
+  const struct gcov_info *const *end = __gcov_info_end;
+
+  /* Obfuscate variable to prevent compiler optimizations.  */
+  __asm__ ("" : "+r" (info));
+
+  while (info != end)
+    @{
+      void *arg = NULL;
+      __gcov_info_to_gcda (*info, filename, dump, allocate, arg);
+      fputc ('\n', stderr);
+      ++info;
+    @}
+@}
+
+/* The main() function just runs the application and then dumps the gcov
+   information to stderr.  */
+
+int
+main (void)
+@{
+  application ();
+  dump_gcov_info ();
+  return 0;
+@}
+@end smallexample
+
+If we compile @file{app.c} with test coverage and no extra profiling options,
+then a global constructor (@code{_sub_I_00100_0} here, it may have a different
+name in your environment) and destructor (@code{_sub_D_00100_1}) are used to
+register and dump the gcov information.
+We also see undefined references to @code{__gcov_init} and
+@code{__gcov_exit}:
+
+@smallexample
+$ gcc -ftest-coverage -fprofile-arcs -c app.c
+$ nm app.o
+0000000000000000 r a
+0000000000000030 T application
+0000000000000000 t can_decode
+                 U fgetc
+                 U fputc
+0000000000000000 b __gcov0.application
+0000000000000038 b __gcov0.can_decode
+0000000000000000 d __gcov_.application
+00000000000000c0 d __gcov_.can_decode
+                 U __gcov_exit
+                 U __gcov_init
+                 U __gcov_merge_add
+                 U stdin
+                 U stdout
+0000000000000161 t _sub_D_00100_1
+0000000000000151 t _sub_I_00100_0
+@end smallexample
+
+Compile @file{app.c} and @file{main.c} with test coverage and
+@option{-fprofile-info-section}.  Now, a read-only pointer-sized object is
+present in the @code{.gcov_info} section and there are no undefined references
+to @code{__gcov_init} and @code{__gcov_exit}:
+
+@smallexample
+$ gcc -ftest-coverage -fprofile-arcs -fprofile-info-section -c main.c
+$ gcc -ftest-coverage -fprofile-arcs -fprofile-info-section -c app.c
+$ objdump -h app.o
+
+app.o:     file format elf64-x86-64
+
+Sections:
+Idx Name            Size      VMA               LMA               File off  Algn
+  0 .text           00000151  0000000000000000  0000000000000000  00000040  2**0
+                    CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
+  1 .data           00000100  0000000000000000  0000000000000000  000001a0  2**5
+                    CONTENTS, ALLOC, LOAD, RELOC, DATA
+  2 .bss            00000040  0000000000000000  0000000000000000  000002a0  2**5
+                    ALLOC
+  3 .rodata         0000003c  0000000000000000  0000000000000000  000002a0  2**3
+                    CONTENTS, ALLOC, LOAD, READONLY, DATA
+  4 .gcov_info      00000008  0000000000000000  0000000000000000  000002e0  2**3
+                    CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
+  5 .comment        0000004e  0000000000000000  0000000000000000  000002e8  2**0
+                    CONTENTS, READONLY
+  6 .note.GNU-stack 00000000  0000000000000000  0000000000000000  00000336  2**0
+                    CONTENTS, READONLY
+  7 .eh_frame       00000058  0000000000000000  0000000000000000  00000338  2**3
+                    CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
+@end smallexample
+
+We have to customize the program link
+process so that all the @code{.gcov_info} linker input sections are placed in
+a contiguous memory block with a begin and end symbol.  Firstly, get the
+default linker script using the following commands (we assume a GNU linker):
+
+@smallexample
+$ ld --verbose | sed '1,/^===/d' | sed '/^===/d' > linkcmds
+@end smallexample
+
+Secondly, open the file @file{linkcmds} with a text editor and place the linker
+output section definition from the overview after the @code{.rodata} section
+definition.  Link the program executable using the customized linker script:
+
+@smallexample
+$ gcc main.o app.o -T linkcmds -lgcov -Wl,-Map,app.map
+@end smallexample
+
+In the linker map file @file{app.map}, we see that the linker placed the
+read-only pointer-sized objects of our object files @file{main.o} and
+@file{app.o} into a contiguous memory block and provided the symbols
+@code{__gcov_info_start} and @code{__gcov_info_end}:
+
+@smallexample
+$ grep -C 1 "\.gcov_info" app.map
+
+.gcov_info      0x0000000000403ac0       0x10
+                0x0000000000403ac0                PROVIDE (__gcov_info_start = .)
+ *(.gcov_info)
+ .gcov_info     0x0000000000403ac0        0x8 main.o
+ .gcov_info     0x0000000000403ac8        0x8 app.o
+                0x0000000000403ad0                PROVIDE (__gcov_info_end = .)
+@end smallexample
+
+Make sure no @file{.gcda} files are present.  Run the program with nothing to
+decode and dump @file{stderr} to the file @file{gcda-0.txt} (first run).  Run
+the program to decode @file{gcda-0.txt} and send it to the @command{gcov-tool}
+using the @command{merge-stream} subcommand to create the @file{.gcda} files
+(second run).  Run @command{gcov} to produce a report for @file{app.c}.
+We see that the first run with nothing to decode resulted in a partially
+covered application:
+
+@smallexample
+$ rm -f app.gcda main.gcda
+$ echo "" | ./a.out 2>gcda-0.txt
+$ ./a.out <gcda-0.txt 2>gcda-1.txt | gcov-tool merge-stream
+$ gcov -bc app.c
+File 'app.c'
+Lines executed:69.23% of 13
+Branches executed:66.67% of 6
+Taken at least once:50.00% of 6
+Calls executed:66.67% of 3
+Creating 'app.c.gcov'
+
+Lines executed:69.23% of 13
+@end smallexample
+
+Run the program to decode @file{gcda-1.txt} and send it to the
+@command{gcov-tool} using the @command{merge-stream} subcommand to update the
+@file{.gcda} files.  Run @command{gcov} to produce a report for @file{app.c}.
+Since the second run decoded the gcov information of the first run, we now have
+a fully covered application:
+
+@smallexample
+$ ./a.out <gcda-1.txt 2>gcda-2.txt | gcov-tool merge-stream
+$ gcov -bc app.c
+File 'app.c'
+Lines executed:100.00% of 13
+Branches executed:100.00% of 6
+Taken at least once:100.00% of 6
+Calls executed:100.00% of 3
+Creating 'app.c.gcov'
+
+Lines executed:100.00% of 13
+@end smallexample
+
+@subsection System Initialization Caveats
+
+The gcov information of a translation unit consists of several global data
+structures.  For example, the instrumented code may update program flow graph
+edge counters in a zero-initialized data structure.  It is safe to run
+instrumented code before the zero-initialized data is cleared to zero;
+however, the coverage information obtained before then is unusable.  Dumping
+the gcov information using @code{__gcov_info_to_gcda()} before the
+zero-initialized data is cleared to zero or the initialized data is loaded is
+undefined behaviour.  Clearing the zero-initialized data to zero through a
+function instrumented for profiling or test coverage is undefined behaviour,
+since it may, for example, produce inconsistent program flow graph edge
+counters.
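One way to respect the last caveat is to keep the routine that clears the zero-initialized data free of instrumentation. The following is a minimal sketch, not taken from the patch: the function name `clear_zero_initialized_data` is hypothetical, and it assumes that GCC's `no_profile_instrument_function` attribute covers the instrumentation mode in use. Compiling the startup file without the profiling options is an alternative.

```c
#include <stddef.h>

/* Hypothetical startup helper (name chosen for illustration): clear the
   memory range [begin, begin + size) to zero.  The attribute asks GCC not
   to add profiling instrumentation to this function, so that it does not
   update edge counters while it is clearing them.  */
__attribute__ ((no_profile_instrument_function))
void
clear_zero_initialized_data (unsigned char *begin, size_t size)
{
  /* The volatile access keeps the compiler from replacing the loop with a
     call to a library routine such as memset().  */
  volatile unsigned char *p = begin;

  for (size_t i = 0; i < size; ++i)
    p[i] = 0;
}
```

On a real target, the range would typically come from symbols defined in the linker script for the zero-initialized data; those symbol names are system-specific and deliberately left out of this sketch.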