[3/3] Add spell check pre-commit hook
Checks
Context |
Check |
Description |
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_gdb_build--master-arm |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_gdb_check--master-arm |
success
|
Test passed
|
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 |
success
|
Test passed
|
Commit Message
Add a call to gdb/contrib/spellcheck.sh --check in .pre-commit-config.yaml:
...
$ time git commit -a -m typo
black................................................(no files to check)Skipped
flake8...............................................(no files to check)Skipped
isort................................................(no files to check)Skipped
spell check..............................................................Failed
- hook id: spell-check
- exit code: 1
real 0m0,493s
user 0m0,258s
sys 0m0,318s
...
Tested on aarch64-linux.
---
.pre-commit-config.yaml | 7 +++++++
1 file changed, 7 insertions(+)
Comments
On 10/8/24 09:44, Tom de Vries wrote:
> Add a call to gdb/contrib/spellcheck.sh --check in .pre-commit-config.yaml:
> ...
> $ time git commit -a -m typo
> black................................................(no files to check)Skipped
> flake8...............................................(no files to check)Skipped
> isort................................................(no files to check)Skipped
> spell check..............................................................Failed
> - hook id: spell-check
> - exit code: 1
>
> real 0m0,493s
> user 0m0,258s
> sys 0m0,318s
> ...
>
I've committed the other patches from this series, but this one remains
in review, so ... ping.
Thanks,
- Tom
> Tested on aarch64-linux.
> ---
> .pre-commit-config.yaml | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
> index 87726aeb758..4a3fdd79f3b 100644
> --- a/.pre-commit-config.yaml
> +++ b/.pre-commit-config.yaml
> @@ -22,3 +22,10 @@ repos:
> - id: isort
> types_or: [file]
> files: 'gdb/.*\.py(\.in)?$'
> + - repo: local
> + hooks:
> + - id: spell-check
> + name: spell check
> + language: script
> + entry: ./gdb/contrib/spellcheck.sh --check
> + files: ^(gdb|gdbsupport|gdbserver)/
>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:
Tom> I've committed the other patches from this series, but this one
Tom> remains in review, so ... ping.
Ah, now I see the others already went in.
Totally fine of course, I just should have read the whole thread first.
I am in favor of this change. I think it's a good improvement.
However I wonder -- how long does it take to run the script?
I ask because it seems like it will be run on essentially every commit.
Also, I can't recall how pre-commit works -- is the script run on the
entire tree or just the files being modified?
Assuming everything is ok though I think this should go in.
Approved-By: Tom Tromey <tom@tromey.com>
Tom
On 11/7/24 15:57, Tom Tromey wrote:
>>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:
>
> Tom> I've committed the other patches from this series, but this one
> Tom> remains in review, so ... ping.
>
> Ah, now I see the others already went in.
> Totally fine of course, I just should have read the whole thread first.
>
> I am in favor of this change. I think it's a good improvement.
> However I wonder -- how long does it take to run the script?
It depends.
Let's take the largest file in gdb* in terms of lines, that's
gdb/doc/gdb.texinfo with ~51k lines:
...
$ time ./gdb/contrib/spellcheck.sh --check gdb/doc/gdb.texinfo
real 0m4.613s
user 0m4.405s
sys 0m0.215s
...
Now, let's take a small file with just 39 lines:
...
$ time ./gdb/contrib/spellcheck.sh --check gdb/gdb.c
real 0m0.445s
user 0m0.314s
sys 0m0.137s
...
This is with .git/wikipedia-common-misspellings.txt already downloaded,
and .git/spell-check.pat1.$md5sum already generated.
> I ask because it seems like it will be run on essentially every commit.
Agreed, speed matters for this.
> Also, I can't recall how pre-commit works -- is the script run on the
> entire tree or just the files being modified?
Just on the files being modified.
[ Running it on the entire tree is slow:
$ time ./gdb/contrib/spellcheck.sh --check gdb*
real 1m9.906s
user 1m7.710s
sys 0m1.899s ]
I think it may be possible to speed up the check to only check the
modified lines (see the line counter example below).
This works for checks that don't require context, or require a fixed
amount of lines of context.
Currently, spellcheck.sh works on line-at-a-time basis, so no context used.
The question is, should it be sensitive to context?
The current implementation does not differentiate between comments and
code, so when running it on directory sim we get:
...
diff --git a/sim/arm/armsupp.c b/sim/arm/armsupp.c
index 1a5eeaff1d6..1b92a3abc3e 100644
--- a/sim/arm/armsupp.c
+++ b/sim/arm/armsupp.c
@@ -390,9 +390,9 @@ ARMul_NthReg (ARMword instr, unsigned number)
{
unsigned bit, upto;
- for (bit = 0, upto = 0; upto <= number; bit ++)
+ for (bit = 0, up to = 0; up to <= number; bit ++)
if (BIT (bit))
- upto ++;
+ up to ++;
return (bit - 1);
}
...
> Assuming everything is ok though I think this should go in.
> Approved-By: Tom Tromey <tom@tromey.com>
Thanks for the review, and I'm glad you're supporting the idea.
I think the potential 4.5 seconds delay is too long, so I'll hold off on
committing this.
I'll work on a --pre-commit switch that only checks modified lines, and
resubmit together with this patch.
Thanks,
- Tom
Patch:
...
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index 87726aeb758..0b94dcc4744 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -22,3 +22,10 @@ repos:
- id: isort
types_or: [file]
files: 'gdb/.*\.py(\.in)?$'
+ - repo: local
+ hooks:
+ - id: line-counter
+ name: line counter
+ language: script
+ entry: ./gdb/contrib/line-counter.sh
+ files: ^(gdb|gdbsupport|gdbserver)/
diff --git a/gdb/contrib/line-counter.sh b/gdb/contrib/line-counter.sh
new file mode 100755
index 00000000000..257ec3793bb
--- /dev/null
+++ b/gdb/contrib/line-counter.sh
@@ -0,0 +1,5 @@
+#!/bin/sh
+
+git diff --staged > DIFF
+
+git diff --staged | egrep -v "^\+\+\+" | egrep "^\+" | wc -l >> DIFF
...
Example output:
...
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 53f41e67444..e08ee7fb675 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -51432,6 +51432,8 @@ Richard M. Stallman and Roland H. Pesch, July 1991.
@printindex cp
+BLA
+
@node Command and Variable Index
@unnumbered Command, Variable, and Function Index
2
...
On 11/12/24 11:15, Tom de Vries wrote:
> On 11/7/24 15:57, Tom Tromey wrote:
>>>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:
>>
>> Tom> I've committed the other patches from this series, but this one
>> Tom> remains in review, so ... ping.
>>
>> Ah, now I see the others already went in.
>> Totally fine of course, I just should have read the whole thread first.
>>
>> I am in favor of this change. I think it's a good improvement.
>> However I wonder -- how long does it take to run the script?
>
> It depends.
>
> Let's take the largest file in gdb* in terms of lines, that's gdb/doc/
> gdb.texinfo with ~51k lines:
> ...
> $ time ./gdb/contrib/spellcheck.sh --check gdb/doc/gdb.texinfo
>
> real 0m4.613s
> user 0m4.405s
> sys 0m0.215s
> ...
>
> Now, let's take a small file with just 39 lines:
> ...
> $ time ./gdb/contrib/spellcheck.sh --check gdb/gdb.c
>
> real 0m0.445s
> user 0m0.314s
> sys 0m0.137s
> ...
>
> This is with .git/wikipedia-common-misspellings.txt already downloaded,
> and .git/spell-check.pat1.$md5sum already generated.
>
>> I ask because it seems like it will be run on essentially every commit.
>
> Agreed, speed matters for this.
>
>> Also, I can't recall how pre-commit works -- is the script run on the
>> entire tree or just the files being modified?
>
> Just on the files being modified.
>
> [ Running it on the entire tree is slow:
>
> $ time ./gdb/contrib/spellcheck.sh --check gdb*
>
> real 1m9.906s
> user 1m7.710s
> sys 0m1.899s ]
>
> I think it may be possible to speed up the check to only check the
> modified lines (see the line counter example below).
>
> This works for checks that don't require context, or require a fixed
> amount of lines of context.
>
> Currently, spellcheck.sh works on line-at-a-time basis, so no context used.
>
> The question is, should it be sensitive to context?
>
> The current implementation does not differentiate between comments and
> code, so when running it on directory sim we get:
> ...
> diff --git a/sim/arm/armsupp.c b/sim/arm/armsupp.c
> index 1a5eeaff1d6..1b92a3abc3e 100644
> --- a/sim/arm/armsupp.c
> +++ b/sim/arm/armsupp.c
> @@ -390,9 +390,9 @@ ARMul_NthReg (ARMword instr, unsigned number)
> {
> unsigned bit, upto;
>
> - for (bit = 0, upto = 0; upto <= number; bit ++)
> + for (bit = 0, up to = 0; up to <= number; bit ++)
> if (BIT (bit))
> - upto ++;
> + up to ++;
>
> return (bit - 1);
> }
> ...
>
>> Assuming everything is ok though I think this should go in.
>> Approved-By: Tom Tromey <tom@tromey.com>
>
> Thanks for the review, and I'm glad you're supporting the idea.
>
> I think the potential 4.5 seconds delay is too long, so I'll hold off on
> committing this.
>
> I'll work on a --pre-commit switch that only checks modified lines, and
> resubmit together with this patch.
>
Submitted here (
https://sourceware.org/pipermail/gdb-patches/2024-November/213249.html ).
Thanks,
- Tom
> Patch:
> ...
> diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
> index 87726aeb758..0b94dcc4744 100644
> --- a/.pre-commit-config.yaml
> +++ b/.pre-commit-config.yaml
> @@ -22,3 +22,10 @@ repos:
> - id: isort
> types_or: [file]
> files: 'gdb/.*\.py(\.in)?$'
> + - repo: local
> + hooks:
> + - id: line-counter
> + name: line counter
> + language: script
> + entry: ./gdb/contrib/line-counter.sh
> + files: ^(gdb|gdbsupport|gdbserver)/
> diff --git a/gdb/contrib/line-counter.sh b/gdb/contrib/line-counter.sh
> new file mode 100755
> index 00000000000..257ec3793bb
> --- /dev/null
> +++ b/gdb/contrib/line-counter.sh
> @@ -0,0 +1,5 @@
> +#!/bin/sh
> +
> +git diff --staged > DIFF
> +
> +git diff --staged | egrep -v "^\+\+\+" | egrep "^\+" | wc -l >> DIFF
> ...
>
> Example output:
> ...
> diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
> index 53f41e67444..e08ee7fb675 100644
> --- a/gdb/doc/gdb.texinfo
> +++ b/gdb/doc/gdb.texinfo
> @@ -51432,6 +51432,8 @@ Richard M. Stallman and Roland H. Pesch, July 1991.
>
> @printindex cp
>
> +BLA
> +
> @node Command and Variable Index
> @unnumbered Command, Variable, and Function Index
>
> 2
> ...
@@ -22,3 +22,10 @@ repos:
- id: isort
types_or: [file]
files: 'gdb/.*\.py(\.in)?$'
+ - repo: local
+ hooks:
+ - id: spell-check
+ name: spell check
+ language: script
+ entry: ./gdb/contrib/spellcheck.sh --check
+ files: ^(gdb|gdbsupport|gdbserver)/