[RFC,2/2,precommit] Add codespell

Message ID 20241129161707.25292-3-tdevries@suse.de
State New
Series Add codespell hook

Checks

Context                                         Check    Description
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64  warning  Skipped upon request
linaro-tcwg-bot/tcwg_gdb_build--master-arm      warning  Skipped upon request

Commit Message

Tom de Vries Nov. 29, 2024, 4:17 p.m. UTC
Add a pre-commit codespell hook.

We use a custom one (gdb/contrib/codespell.sh) rather than the regular one
because:
- it allows us to downgrade detected spelling mistakes from errors that abort
  a commit to warnings that can be inspected after the commit has finished, and
- it allows us to check only the staged part of a file rather than the
  entire file, greatly reducing noise.

Both of these items are intended to avoid disrupting developer workflow as
much as possible, while giving useful information.
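
The two behaviours listed above can be sketched in a few lines of shell.  This
is only an illustration of the idea, not the contents of
gdb/contrib/codespell.sh; the helper names added_lines and always_pass are
hypothetical, and the usage line assumes that codespell accepts "-" to read
from standard input.

```shell
#!/bin/sh
# Illustrative sketch only; not the contents of gdb/contrib/codespell.sh.

# Hypothetical helper: read a unified diff on stdin and print only the
# text of the lines it adds (file headers and the leading "+" dropped).
added_lines () {
    grep '^+' | grep -v '^+++ ' | sed 's/^+//'
}

# Hypothetical helper: run a checker and show its output, but always
# return success, so that findings are warnings rather than errors.
always_pass () {
    "$@" || true
}

# Possible use (assumes git and codespell are installed, and that
# codespell accepts "-" to read from standard input):
#   git diff --staged -U0 | added_lines | always_pass codespell -
```

A known limitation of this sketch is that an added line whose own text starts
with "++ " looks like a file header in the diff and is dropped.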

To demonstrate usage, consider gdb/ada-lang.c, which is not codespell clean:
...
$ ./gdb/contrib/codespell.sh gdb/ada-lang.c
gdb/ada-lang.c:847: Olt ==> Old
gdb/ada-lang.c:858: Onot ==> Note, Not
gdb/ada-lang.c:1533: alpha-numeric ==> alphanumeric
gdb/ada-lang.c:3047: arithmetics ==> arithmetic
gdb/ada-lang.c:5440: re-use ==> reuse
gdb/ada-lang.c:9353: swith ==> switch
gdb/ada-lang.c:9771: separatly ==> separately
gdb/ada-lang.c:11749: suport ==> support
gdb/ada-lang.c:11756: suport ==> support
gdb/ada-lang.c:13655: statics ==> statistics
...

Now let's introduce a typo:
...
$ echo "This is the wrong adres" >> gdb/ada-lang.c
...
and commit it:
...
$ git commit -a -m typo
black................................................(no files to check)Skipped
flake8...............................................(no files to check)Skipped
isort................................................(no files to check)Skipped
gdb/scripts/codespell.sh --always-pass --staged..........................Passed
- hook id: codespell
- duration: 0.29s

> This is the wrong adres
gdb/ada-lang.c: adres ==> address

[precommit/codespell-2 47baee9867b] typo
 1 file changed, 1 insertion(+)
...

As we can see:
- the commit succeeded,
- the introduced typo was noticed, and
- only the introduced typo was noticed.

The current implementation of gdb/contrib/codespell.sh --staged fails to print
the line number, so I've added -C0 to make the context clear, which gets us
the first line here:
...
> This is the wrong adres
gdb/ada-lang.c: adres ==> address
...
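
One possible way to recover the missing line numbers would be to read them
from the hunk headers of the staged diff.  The sketch below is an assumption
about how that could be done, not what the script does; the function name
added_lines_numbered is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read a unified diff on stdin and print each added
# line prefixed with its line number in the new file, as "LINENO:text".
added_lines_numbered () {
    awk '
        # A new file section starts; ignore lines until its first hunk.
        /^diff / { in_hunk = 0; next }
        # Hunk header "@@ -old[,count] +new[,count] @@": take the new start.
        /^@@ / {
            split($3, parts, ",")
            lineno = substr(parts[1], 2) + 0
            in_hunk = 1
            next
        }
        !in_hunk { next }       # skip the "---" and "+++" file headers
        /^\+/ { print lineno ":" substr($0, 2); lineno++; next }
        /^-/  { next }          # removed lines do not advance the new file
        /^\\/ { next }          # "\ No newline at end of file" marker
        { lineno++ }            # context line
    '
}

# Possible use (assumes git is installed):
#   git diff --staged | added_lines_numbered
```

The numbered lines could then be matched against the words codespell reports,
so that the warning points at a file and line rather than only at a file.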
---
 .pre-commit-config.yaml | 9 +++++++++
 1 file changed, 9 insertions(+)
  

Patch

diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index 070631c0f16..b1b4c2677b3 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -59,3 +59,12 @@  repos:
     - id: isort
       types_or: [file]
       files: 'gdb/.*\.py(\.in)?$'
+  - repo: local
+    hooks:
+    - id: codespell
+      name: gdb/scripts/codespell.sh --always-pass --staged
+      language: script
+      entry: ./gdb/contrib/codespell.sh
+      args: [ "--always-pass", "--staged", "--", "-C0", "--" ]
+      files: '^(gdb|gdbsupport|gdbserver)/'
+      verbose: true