Everyone wonders why I care about spelling. Sometimes the rewards are 🍺:
This typo is epic, you just made my day! 🤣
I never would've imagined that a MR with hundreds of typo fixes would have any real impact, but you made it happen. It turns out the EDNS keepalive module didn't work because of this typo (and no one noticed...) And you fixed it, thank you!
Oh wow! This is a lesser used portion of the Chocolatey application, but I would have still expected for this to have been caught before now. This change makes sense to me.
Everyone makes typos. This includes people writing documentation and comments,
but it also includes programmers naming variables, functions, APIs, classes,
and filenames.
Often, programmers will use InitialCapitalization, camelCase, ALL_CAPS, or IDLCase when naming their things. These conventions make
it much harder for naive spelling tools to recognize misspellings, and because
the resulting false-positive rate is really high, people tend not to enable spell checking
at all.
This repository's tools are capable of tolerating all of those variations.
Specifically, w understands
enough about how programmers name things that it can split the above conventions
into word-like things for checking against a dictionary.
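To make the splitting idea concrete, here is a minimal sketch in Python. It is not the logic the tool actually uses; `split_identifier`, `unknown_words`, and the regular expression are hypothetical names and choices made for illustration only.

```python
import re

# Hypothetical tokenizer, not the real implementation.
# First alternative: a run of capitals that is not followed by a lowercase
# letter (ALL_CAPS words, or an acronym prefix such as the IDL in IDLCase).
# Second alternative: an optional capital followed by lowercase letters.
TOKEN = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+")


def split_identifier(name: str) -> list[str]:
    """Split an identifier into lowercase word-like pieces."""
    return [match.group(0).lower() for match in TOKEN.finditer(name)]


def unknown_words(name: str, dictionary: set[str]) -> list[str]:
    """Return the pieces of an identifier that are not in the dictionary."""
    return [word for word in split_identifier(name) if word not in dictionary]


for sample in ("InitialCapitalization", "camelCase", "ALL_CAPS", "IDLCase"):
    print(sample, split_identifier(sample))
```

With a dictionary containing only "receive" and "message", `unknown_words("recieveMessage", ...)` reports just the misspelled piece, which is what keeps the false-positive rate manageable.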
If there's an existing pull request, the action will skip checking the push to save computing resources.
Permissions are reduced during the check phase for security reasons.
It's possible to only check changed files.
The action generates outputs, which enable one to consume them in ways I couldn't otherwise imagine.
There's a separate comment phase as commenting requires additional permissions. (This is the default way that the workflow consumes the action's outputs and then sends them back to the action.)
It's possible to ask the workflow to update its own metadata.
When it updates the metadata, it collapses its original comment and your note to update the metadata.
I recently added a note as part of this last phase linking users to a file they can edit in order to trigger a new validation pass.
Journey
I've been working on this tool for a while.
You can use it just for your documentation (as PowerDNS does), or you can use it for your entire project.
In the second half of this year, I've significantly improved its performance, including allowing concurrency at the process level and via matrix runs. It wasn't bad for small to medium repositories, but it wasn't good enough for giant ones; it's now doing reasonably well there.
I've also been adding additional heuristics, such as recognizing when a file really isn't worth checking -- often it will identify translation files and binaries as things to skip. I ran across a couple of projects with source code files that were >10 MB, so there's logic to skip such files by default (you can tune the threshold).
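As a rough illustration of this kind of skip heuristic, here is a sketch that assumes two simple rules: a size threshold and a NUL-byte check for binary content. The function names, the 10 MiB default, and the NUL-byte rule are assumptions for the example, not the checks the action actually performs.

```python
import os

# Hypothetical default, chosen to mirror the >10 MB files mentioned above.
LARGEST_FILE = 10 * 1024 * 1024


def looks_binary(sample: bytes) -> bool:
    """Treat any NUL byte in the sampled prefix as a sign of binary content."""
    return b"\0" in sample


def should_skip(size_bytes: int, sample: bytes,
                largest_file: int = LARGEST_FILE) -> bool:
    """Skip files that are over the size threshold or that look binary."""
    return size_bytes > largest_file or looks_binary(sample)


def should_skip_path(path: str, largest_file: int = LARGEST_FILE) -> bool:
    """Apply the same rules to a file on disk, reading only a small prefix."""
    size = os.path.getsize(path)
    if size > largest_file:
        return True  # avoid reading huge files at all
    with open(path, "rb") as handle:
        return looks_binary(handle.read(8192))
```

Passing a smaller `largest_file` corresponds to tuning the threshold for a repository where even moderately large files are not worth checking.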
There are now heuristics to identify supplemental dictionaries one could use to reduce the size of the in-repository metadata. (The heuristics run if there are unrecognized terms, but you can turn them off if you don't want them.)
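One way to picture such a heuristic is to rank candidate dictionaries by how many unrecognized terms each would cover. The sketch below assumes exactly that and nothing more; `rank_dictionaries` and the sample dictionaries are hypothetical and do not reflect how the action scores its suggestions.

```python
def rank_dictionaries(unrecognized, candidates):
    """Return (name, covered_count) pairs, best first.

    `candidates` maps a dictionary name to its list of words.
    Dictionaries that cover none of the terms are omitted.
    """
    terms = {term.lower() for term in unrecognized}
    scores = []
    for name, words in candidates.items():
        covered = len(terms & {word.lower() for word in words})
        if covered:
            scores.append((name, covered))
    # Highest coverage first; ties are broken alphabetically by name.
    return sorted(scores, key=lambda item: (-item[1], item[0]))


# Hypothetical data for illustration only.
suggestions = rank_dictionaries(
    ["kubectl", "nginx", "pytest", "Dockerfile"],
    {
        "python": ["pytest", "numpy"],
        "devops": ["kubectl", "nginx", "dockerfile"],
        "css": ["flexbox"],
    },
)
print(suggestions)
```

Every term covered by an adopted supplemental dictionary is one fewer entry that has to live in the repository's own metadata.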
I'm slowly working on adjusting the action so that it could check other languages (hopefully next year).
I regularly feed projects to the template and am growing a list of pattern templates to make it easy for projects to trim out noise.
For medium-sized projects, the tool will regularly find bugs, whether it's a broken public API or a test that isn't testing what it thought it was testing. These are all normal occurrences. Consistent spelling helps projects avoid such pitfalls.
Hooks
Auto-detecting dictionary words
Because the workflow is customizable, I'm playing with instance-specific customizations such as this one for ohmyzsh/ohmyzsh. As ohmyzsh is built around aliases for zsh, this additional commit would automatically recognize aliases before the spell checker runs and add them to the dictionary. This means that if the alias is used in the documentation (and hopefully it is!), it'll be automatically accepted as a word, and when someone misspells an alias, that misspelling will stick out.
    - name: find aliases
      run: |
        for a in $(git ls-files|grep '\.zsh$'); do
          echo "-- $a"
          if [ -s "$a" ]; then
            perl -ne 'next unless s/^alias ([A-Za-z]{3,})=.*/$1/;print' "$a" | tee -a .github/actions/spelling/allow.txt
          fi
        done;
This logic:
Looks for .zsh script files.
Reports the name of the file from which it's going to be getting terms.
Ensures there's a file with content (there's currently a quirk in act where a file tracked by git that matches .gitignore will not be copied into the act environment).
Looks for lines that start with alias and whose alias name is at least 3 letters long (letters only, immediately followed by =).
Reports each alias name and adds it to the dictionary.
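To show which lines the Perl filter in the step above accepts, here is an equivalent match in Python run over a few made-up alias lines. The sample lines and the `alias_names` helper are hypothetical; only the pattern is taken from the workflow step.

```python
import re

# Same pattern as the Perl one-liner: the line must start with "alias ",
# followed by at least three ASCII letters, followed directly by "=".
ALIAS = re.compile(r"^alias ([A-Za-z]{3,})=")


def alias_names(lines):
    """Return the alias names that the pattern would add to allow.txt."""
    names = []
    for line in lines:
        match = ALIAS.match(line)
        if match:
            names.append(match.group(1))
    return names


# Made-up sample lines for illustration only.
sample_lines = [
    "alias gst='git status'",        # accepted: three letters
    "alias ga='git add'",            # rejected: only two letters
    "alias -g G='| grep'",           # rejected: name does not start with letters
    "  alias gco='git checkout'",    # rejected: line does not start with alias
    "alias gcmsg='git commit -m'",   # accepted
]
print(alias_names(sample_lines))
```

Short aliases are left out, presumably because two-letter entries would make the dictionary accept too many accidental fragments.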
I know at least one project that runs check-spelling using nektos/act in GitLab. Because of the support for outputs, it would be possible for the act workflow to take the outputs and wire them to an equivalent commenting GitLab mechanism.
Recent deployments
dev.to
This blog is hosted by dev.to which is a deployment of:
Welcome to the Forem codebase, the platform that powers
dev.to. We are so excited to have you. With your help, we can
build out Forem’s usability, scalability, and stability to better serve our
communities.
What is Forem?
Forem is open source software for building communities. Communities for your
peers, customers, fanbases, families, friends, and any other time and space
where people need to come together to be part of a collective.
See our announcement post
for a high-level overview of what Forem is.
dev.to (or just DEV) is hosted by Forem. It is a community of
software developers who write articles, take part in discussions, and build
their professional profiles. We value supportive and constructive dialogue in
the pursuit of great code and career growth for all members. The ecosystem spans
from beginner to advanced developers, and all are welcome to find their place…
It doesn't take much work to convert the output into a list of words to correct (I use Google Sheets to generate corrections):
Thanks for all the work in this PR @jsoref! Since this is quite a big PR, could you maybe break it down into smaller ones? I'd really like to get some things that are actual bugs (like this one) out faster.