feat: Add 1,000 Irish word list (@aindriu80) #7535
Continuous integration check(s) failed. Please review the failing check's logs and make the necessary changes.
The failing job encountered this error: `ERR_PNPM_FETCH_403 GET https://registry.npmjs.org/pnpm: Forbidden - 403`
Solution: To fix this, update your workflow (`.github/workflows/monkey-ci.yml`) where pnpm is set up and dependencies are installed (steps such as Install dependencies, Setup pnpm, etc.) to use an npm registry token. Add a step before any pnpm install command:
Then, add an `NPM_TOKEN` secret (with your npm registry token) to your repository’s GitHub Actions secrets. This will allow pnpm to authenticate and fetch the necessary packages, resolving the 403 error.
The latest commit changed 27 words. The word list was validated with two independent Irish-language spell checkers: `hunspell` with the `ga_IE` dictionary and `aspell` with the `--lang=ga` flag. An initial pass flagged 26 to 27 words, which were reviewed and changed. Most were proper nouns (place names, months, and personal names such as Éire, Nollaig, Londain) that required capitalisation; a small number were malformed words that arose from stripping hyphens from the source corpus. Additional words were cross-referenced against a secondary source. After the 27 replacements, both spell checkers return zero errors against the final list.
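The validation described above could be reproduced with something like the following sketch. It assumes `hunspell` and `aspell` are installed locally along with their Irish dictionaries and that the shell locale is UTF-8; the helper names `parse_flagged` and `flagged_words` are hypothetical and not part of this PR.

```python
import subprocess
from typing import Iterable, List

# Checker invocations matching the comment above: hunspell with the ga_IE
# dictionary and aspell with --lang=ga. Both read words from stdin and
# print the ones they do not recognise, one per line.
HUNSPELL_CMD = ["hunspell", "-d", "ga_IE", "-l"]
ASPELL_CMD = ["aspell", "--lang=ga", "list"]


def parse_flagged(output: str) -> List[str]:
    """Return the distinct non-empty lines of a checker's output, in order."""
    seen = set()
    flagged = []
    for line in output.splitlines():
        word = line.strip()
        if word and word not in seen:
            seen.add(word)
            flagged.append(word)
    return flagged


def flagged_words(words: Iterable[str], cmd: List[str]) -> List[str]:
    """Pipe one word per line into a spell checker and collect what it rejects.

    Assumption: the checker named in `cmd` is installed and accepts UTF-8 input.
    """
    result = subprocess.run(
        cmd,
        input="\n".join(words) + "\n",
        capture_output=True,
        text=True,
        encoding="utf-8",
        check=True,
    )
    return parse_flagged(result.stdout)
```

Calling `flagged_words(words, HUNSPELL_CMD)` and `flagged_words(words, ASPELL_CMD)` on the final list would be expected to return empty lists if the result reported above holds on your machine; dictionary versions may differ, so treat any remaining hits as items to review rather than as definite errors.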
### Description - Adds Irish 1k word list (corpus-based)

This PR adds a frequency-based **Irish 1k word list** to Monkeytype.

### Source

The list is primarily based on Kevin Scannell’s @kscanne corpus-derived “Top 1000” Irish wordforms (Fleiscíniú project), which were extracted from a large corpus of Irish texts.

The original list included typographic hyphenation for syllable breaking. These hyphens were removed to ensure compatibility with Monkeytype and to reflect standard written forms.

A very small number of invalid or incompatible entries were replaced with common alternatives to maintain a clean 1000-word set.

### Rationale

* Based on real-world corpus frequency
* Wordform-based (not lemma-only), better suited for typing practice
* Function-word heavy, which improves natural rhythm and realistic typing flow
* Cleaned, deduplicated, and validated locally

### Changes

* Added `irish_1k.json` (1000 non-duplicate common Irish words)
* Updated `irish.json` (added valid BCP-47 code)
* Updated `languages.ts` (registered `irish_1k`)
* Updated `constants/languages.ts` (dropdown integration)

### Testing & changes to existing files

Tested locally in a browser after running `pnpm run dev`. It feels natural and well-balanced for B1/B2 level typing practice. Checked for duplicates. Added bcp47 to existing irish.json.

### Checks

- [ ] Adding quotes?
  - [ ] Make sure to include translations for the quotes in the description (or another comment) so we can verify their content.
- [X] Adding a language?
  - Make sure to follow the [languages documentation](https://github.com/monkeytypegame/monkeytype/blob/master/docs/LANGUAGES.md)
  - [X] Add language to `packages/schemas/src/languages.ts`
  - [X] Add language to exactly one group in `frontend/src/ts/constants/languages.ts`
  - [X] Add language json file to `frontend/static/languages`
- [ ] Adding a theme?
  - Make sure to follow the [themes documentation](https://github.com/monkeytypegame/monkeytype/blob/master/docs/THEMES.md)
  - [ ] Add theme to `packages/schemas/src/themes.ts`
  - [ ] Add theme to `frontend/src/ts/constants/themes.ts`
  - [ ] (optional) Add theme css file to `frontend/static/themes`
  - [ ] Add some screenshots of the theme, especially with different test settings (colorful, flip colors) to your pull request
- [ ] Adding a layout?
  - [ ] Make sure to follow the [layouts documentation](https://github.com/monkeytypegame/monkeytype/blob/master/docs/LAYOUTS.md)
  - [ ] Add layout to `packages/schemas/src/layouts.ts`
  - [ ] Add layout json file to `frontend/static/layouts`
- [ ] Adding a font?
  - Make sure to follow the [fonts documentation](https://github.com/monkeytypegame/monkeytype/blob/master/docs/FONTS.md)
  - [ ] Add font file to `frontend/static/webfonts`
  - [ ] Add font to `packages/schemas/src/fonts.ts`
  - [ ] Add font to `frontend/src/ts/constants/fonts.ts`
- [X] Check if any open issues are related to this PR; if so, be sure to tag them below.
- [X] Make sure the PR title follows the Conventional Commits standard. (https://www.conventionalcommits.org for more info)
- [X] Make sure to include your GitHub username prefixed with @ inside parentheses at the end of the PR title.

Closes #
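
The hyphen clean-up and duplicate check mentioned in this description could be sketched roughly as follows. This is an illustration, not the script used for the PR: the helper names are hypothetical, the `words` key is an assumption about the language file layout, and `ag-us` is a made-up hyphenation used only as sample input.

```python
from collections import Counter
from typing import Dict, List


def strip_hyphens(wordform: str) -> str:
    """Remove every hyphen from a wordform (the blanket clean-up step)."""
    return wordform.replace("-", "")


def find_duplicates(words: List[str]) -> List[str]:
    """Return each word that occurs more than once, in first-seen order."""
    counts = Counter(words)
    return [word for word, n in counts.items() if n > 1]


def check_word_list(language: Dict, expected_count: int = 1000) -> List[str]:
    """Collect human-readable problems found in a parsed language file.

    Assumption: the file is a JSON object whose word list lives under "words".
    """
    problems: List[str] = []
    words = language.get("words", [])
    if len(words) != expected_count:
        problems.append(f"expected {expected_count} words, found {len(words)}")
    for word in find_duplicates(words):
        problems.append(f"duplicate word: {word}")
    return problems
```

To try it on the real file, load `frontend/static/languages/irish_1k.json` with `json.load` and pass the resulting dict to `check_word_list`; an empty list means the count and duplicate checks passed. Note that blanket hyphen removal also drops hyphens that belong to standard Irish spelling (for example the prefixed `t-` in `t-uisce`), so hyphenated source entries are probably worth reviewing by hand.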