A couple weeks ago, at the start of Hacktoberfest, I decided to recruit Hacktoberfest contributors to help internationalize the user-statistician GitHub Action. This post is an update on the progress of this effort and the lessons I learned from the experience.
By the way, there are still a couple open issues for additional language translations if you are looking for something to contribute to. And if the Action doesn't already support your preferred language and there isn't an open issue for it, feel free to submit one.
What is the user-statistician?
The user-statistician GitHub Action generates a detailed GitHub stats SVG entirely within GitHub Actions, GitHub's framework for workflow automation. The SVG is a visual summary of your activity on GitHub, such as your contributions and the distribution of programming languages within your public repositories, suitable for display on your GitHub Profile README. It can also be customized in various ways, such as by color theme.
You can find more about it in my earlier Dev.to post recruiting Hacktoberfest contributors, including a few examples of the SVGs of GitHub profile stats that it produces:
And all of the details are available in the GitHub repository:
Generate a GitHub stats SVG for your GitHub Profile README in GitHub Actions
Progress Internationalizing for Hacktoberfest
Before Hacktoberfest: I designed the Action from the start so that it would be easy to add support for languages other than English. I initially opened a few issues for translations to a few random languages, labeled them "good first issue" and "help wanted", and got a couple random contributions that way. But prior to Hacktoberfest, user-statistician was limited to three languages: English, German, and Italian.
Two Weeks into Hacktoberfest: A couple days before Hacktoberfest began, I added the hacktoberfest topic to the repository, labeled all of the existing language translation issues with "Hacktoberfest", and posted about it here on Dev.to. In the first two weeks of Hacktoberfest, I've merged PRs from 12 different contributors, adding support for 12 more languages. The user-statistician GitHub Action now supports 15 languages: Bahasa Indonesia, Bengali, English, French, German, Hindi, Italian, Japanese, Korean, Lithuanian, Polish, Portuguese, Russian, Spanish, and Turkish.
Lessons Learned from the Experience
99% of the code that I write is for my research on evolutionary computation and other computational intelligence topics. It includes a handful of libraries that implement the relevant algorithms, a few repositories that provide example usage of those libraries, code to reproduce experiments from published articles, etc. None of it has a user interface. So most, if not all, of that code doesn't really motivate an internationalization effort, unless I wanted to provide translations of API docs.
Therefore, this was my first attempt at providing international support for software I've developed. I did plan for it from the start. The user-statistician action is implemented in Python, and right at the beginning I isolated into a single source code file all of the strings related to labeling the different stats, the section headings used to organize the stats into a couple categories, and the template strings for the heading at the top. Specifically, I used a couple Python dictionaries, with the locale code for the language as a key. This worked well, and the two languages (Italian and German) that were contributed prior to Hacktoberfest were easily merged.
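To make that design concrete, here is a minimal sketch of locale-keyed dictionaries kept together in one file. It is not the Action's actual source: the names TITLE_TEMPLATES, LABELS, title_for, and label_for are hypothetical, the translated strings are only illustrative, and the fallback to English is my assumption for the sketch.

```python
# Hypothetical sketch of the "all strings in one file" design.
# Names and strings are illustrative, not copied from user-statistician.

DEFAULT_LOCALE = "en"

# Template strings for the heading at the top, keyed by locale code.
TITLE_TEMPLATES = {
    "en": "{0}'s GitHub Activity",
    "de": "{0}s GitHub-Aktivität",
    "it": "Attività GitHub di {0}",
}

# Labels for individual stats, keyed first by locale code, then by stat.
LABELS = {
    "en": {"followers": "Followers", "repositories": "Repositories"},
    "de": {"followers": "Follower", "repositories": "Repositories"},
    "it": {"followers": "Follower", "repositories": "Repository"},
}


def title_for(locale, name):
    """Format the heading for a user, falling back to English (assumed behavior)."""
    template = TITLE_TEMPLATES.get(locale, TITLE_TEMPLATES[DEFAULT_LOCALE])
    return template.format(name)


def label_for(locale, key):
    """Look up a stat label, falling back to English (assumed behavior)."""
    labels = LABELS.get(locale, LABELS[DEFAULT_LOCALE])
    return labels.get(key, LABELS[DEFAULT_LOCALE][key])
```

With this layout, adding a language means adding one new entry to each dictionary, so every translation PR ends up touching the same file.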
However, what I did not anticipate in that design was the potential for multiple PRs (several in one case) to be submitted over a short time frame (e.g., within the same day). That is exactly what I experienced with the early Hacktoberfest contributions. Because every translation edited the same dictionaries in the same file, the result was many conflicts requiring resolution prior to merge. Most of these were easy to resolve, but somewhat tedious. If I were to design another tool in the future with international support, I would likely isolate each locale into its own individual source file. JSON may have been the way to go for this particular GitHub Action since JSON is simple to parse from Python, perhaps using a file naming convention with the locale code as the filename (e.g., en.json for the English version). Even the rush of new contributors at the beginning of Hacktoberfest would have been free of conflicting PRs, since each translation would have been isolated from the others. Perhaps this is obvious to those of you with prior experience supporting multiple languages within an application or tool.
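As a rough sketch of the one-file-per-locale alternative, the following loads strings from JSON files named by locale code. It is not code from the Action: the functions load_locale and supported_locales, the directory argument, and the fallback to English are all assumptions made for illustration.

```python
import json
from pathlib import Path


def load_locale(locale_dir, locale, default="en"):
    """Load the strings for a locale from <locale_dir>/<locale>.json.

    Falls back to the default locale's file if no file exists for the
    requested locale (an assumed policy, chosen for this sketch).
    """
    directory = Path(locale_dir)
    path = directory / f"{locale}.json"
    if not path.is_file():
        path = directory / f"{default}.json"
    with path.open(encoding="utf-8") as f:
        return json.load(f)


def supported_locales(locale_dir):
    """List locale codes, derived from the JSON filenames in the directory."""
    return sorted(p.stem for p in Path(locale_dir).glob("*.json"))
```

Under this layout, a new translation is a single new file such as fr.json, so two contributors adding different languages on the same day would never edit the same lines.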
I could potentially take the time to refactor in this very way. But at this point for this application, there would be little if any benefit to do so. Most of the languages with the largest numbers of native speakers are now accounted for, so I'm not likely to see several PRs coming in at about the same time with new language translations leading to the quantity of conflicts I experienced at the start of Hacktoberfest. So although I do believe in refactoring to improve maintainability, in this case I would not likely gain back the time needed to do so.