⚙️ Monorepo scripts strategies & naming conventions

A monorepo is a popular approach in which several libraries or other fairly independent projects are colocated in one repository. One benefit is simpler control over changes that must be synchronized across all involved packages. The downside is that each package becomes less independent.
The vast majority of everyday monorepo management tasks boil down to running package scripts in a certain order and under certain conditions, both locally and on CI. This article explains one of many possible approaches, one that worked fairly well in some of RingCentral’s monorepo projects. The size of these projects varies from fewer than a hundred files and a handful of devs to thousands of files and 50+ developers.
In this article, I will use Lerna as the tool that helps manage such monorepos. It lets you run scripts and execute commands in packages, and it takes care of some publishing activities. You can read more about Lerna concepts here: https://github.com/lerna/lerna#concepts.
A short remark about package versions: they can be completely independent, or they can be synchronized to a certain degree. For example, a package that has changes gets a version bump, whereas unchanged packages do not. For simplicity, we will assume fully synchronized version bumps of all published packages (even unchanged ones), so we will use an exact version in lerna.json and the --force-publish=* flag in publish commands.
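To make the "fully synchronized" assumption concrete, here is a minimal sketch of a check that every package carries the same version as lerna.json. The helper name findVersionMismatches, the package names and the version numbers are hypothetical; in a real repo the objects would be read from lerna.json and from each packages/*/package.json.

```javascript
// Hypothetical helper (not part of Lerna): lists the packages whose version
// differs from the single version stored in lerna.json.
function findVersionMismatches(lernaConfig, packages) {
  return packages
    .filter((pkg) => pkg.version !== lernaConfig.version)
    .map((pkg) => pkg.name);
}

// Sample data standing in for lerna.json and packages/*/package.json.
const lernaConfig = { version: '1.2.3' };
const packages = [
  { name: '@ringcentral/lib1', version: '1.2.3' },
  { name: '@ringcentral/lib2', version: '1.2.2' },
];

console.log(findVersionMismatches(lernaConfig, packages)); // [ '@ringcentral/lib2' ]
```

A check like this could run before publishing to catch a package that drifted from the shared version.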
Main ideas
Scripts should eliminate the possibility of error in critical tasks like test coverage collection and publishing
Default scripts (like install, test and publish) should be no-brainers: do all the things automatically with the least astonishment
Full control mode when needed: ability to run scripts granularly or in a quicker fashion
Overall, any interaction with a repository, both locally and on CI, can be presented as a scenario that is split into granular phases (or stages in GitLab terminology). Scenarios may skip phases or substitute some phases with others. Scenarios usually have certain goals and expectations; here are the most common ones:
A developer checks out the repo and runs npm install. The goal is a fully functional, ready-to-use dev setup. The expectation is that right after this command the developer can run other commands with no additional preparation
A developer runs the npm start script. The goal is to run start in all packages. The expectation is that it just works, with no need to pre-build anything
CI determines that a tag has been pushed, checks out the repo, runs the linter, runs build scripts, runs tests, collects coverage, and, if all is good, publishes to NPM. The goal is the delivery of high-quality code. The expectation is that a failure in any phase stops the process, so there is zero possibility of publishing incomplete or broken code
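The "stop on the first broken phase" expectation from the last scenario can be sketched as a tiny sequential runner. This is only an illustration, not how GitLab or Travis implement stages: runPhases, releasePhases and the injectable exec parameter are hypothetical names, while the npm script names are the ones from the root package.json in this article.

```javascript
// Hypothetical sketch: run phases in order and stop at the first failure,
// so "publish" can never run after a broken lint, build or test phase.
const { execSync } = require('child_process');

function runPhases(phases, exec = (cmd) => execSync(cmd, { stdio: 'inherit' })) {
  const completed = [];
  for (const { name, command } of phases) {
    try {
      exec(command); // execSync throws when the command exits with non-zero code
      completed.push(name);
    } catch (error) {
      return { ok: false, failed: name, completed };
    }
  }
  return { ok: true, failed: null, completed };
}

// Phases of the "tag pushed" scenario, cheapest check first.
const releasePhases = [
  { name: 'lint', command: 'npm run lint:all' },
  { name: 'bootstrap', command: 'npm run bootstrap:quick' },
  { name: 'build', command: 'npm run build:quick' },
  { name: 'test', command: 'npm test' },
  { name: 'publish', command: 'npm run publish:release -- --yes' },
];
```

Calling runPhases(releasePhases) would execute the real scripts; passing a fake exec function makes the stop-on-failure behavior easy to verify without running anything.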
Let’s dive deeper into those phases.
Install
In this phase, CI or the developer sets up the environment to run other scripts.
CI should run npm install --ignore-scripts so that we control when and how bootstrapping happens in the CI environment. For example, the lint task does not require packages to be bootstrapped, but the test task does (more about this in the Packages bootstrapping section)
Developers should use regular npm install, which must run a full bootstrap
Lint
This is the cheapest check that makes sure the codebase is in good condition.
CI should run lint right after install as the first cheap check
CI must stop all further expensive checks if the cheap lint check failed
Packages bootstrapping
In this phase, dependencies of the underlying packages are installed. It happens after install on a dev machine or after lint in the CI environment, depending on the setup.
In general, the postinstall script should run lerna bootstrap --hoist --no-ci so that devs get a fully working repo right after running npm install. The --no-ci flag is needed due to a bug: https://github.com/lerna/lerna/issues/1324
CI may not need demos to be built or tested, and therefore bootstrapped, unless you use demos for e2e tests. In that case, bootstrap only the packages that are actually used by scoping the bootstrap script: npm run bootstrap -- --scope=@ringcentral/* (we usually have libraries in this scope). This optimization saves CI time.
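To illustrate which packages a scope such as @ringcentral/* selects, here is a simplified matcher. It is a stand-in written for this article, not Lerna's code: Lerna accepts full glob patterns, while this hypothetical matchesScope helper only understands the * wildcard. The package names are made up.

```javascript
// Simplified stand-in for --scope matching: only "*" is supported here,
// and it matches any run of characters except "/".
function matchesScope(packageName, pattern) {
  const source = pattern
    .split('*')
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')) // escape regex chars
    .join('[^/]*');
  return new RegExp(`^${source}$`).test(packageName);
}

// Hypothetical package names: two scoped libraries and an unscoped demo.
const allPackages = ['@ringcentral/lib1', '@ringcentral/lib2', 'demo'];
const scoped = allPackages.filter((name) => matchesScope(name, '@ringcentral/*'));

console.log(scoped); // [ '@ringcentral/lib1', '@ringcentral/lib2' ]
```

With this selection the demo package is skipped, which is where the CI time saving comes from.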
Build
This phase produces an artifact that will later be published. It may also be a prerequisite for tests.
The build script on the package level should run the clean script first so that no artifacts from previous builds remain
The build:quick script can be used to build all libs that will be used in demos
Test
This phase does the majority of code quality verification and makes sure things work as expected, not just that they are written as expected (which is what the lint phase checks).
Sometimes code has to be built before running tests. If there are multiple libs in the monorepo that depend on each other, it is safer to run tests once all dependencies are built than to rely on magic that runs tests against sources only.
CI runs tests by calling the test script, which runs the full set of tests, including e2e tests and coverage collection
Locally, devs may run test:quick, which skips coverage collection and/or e2e tests
CI should also run the test:coverage script to upload coverage reports after the tests
Optionally, there could also be a test:watch script that runs quick tests in watch mode
Commit and push
This phase occurs on dev machines when developers commit and push code. Some minor quick checks should happen here.
Obviously, we don’t want CI to run potentially invalid code, so we can use the test:quick and lint:staged (or lint:quick) scripts in pre-commit hooks
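As an illustration of the lint:staged part, a lint-staged configuration could look roughly like the sketch below. The file name lint-staged.config.js is one of the locations lint-staged reads; the glob and the command are assumptions based on the TypeScript glob used by lint:all in this article, not the exact config from the projects described here.

```javascript
// lint-staged.config.js (assumed file for this sketch)
// lint-staged appends the staged file paths to each command, so only the
// files that are about to be committed get linted.
const lintStagedConfig = {
  // Assumption: sources are TypeScript (.ts and .tsx), as in the lint:all glob.
  '*.{ts,tsx}': ['eslint --fix'],
};

module.exports = lintStagedConfig;
```

The pre-commit hook (managed by Husky in this setup) would then run npm run lint:staged and npm run test:quick. How the hook is wired, and whether auto-fixed files have to be re-staged explicitly, depends on the Husky and lint-staged versions in use.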
Publish
This phase delivers artifacts to a package management system that distributes them to end users. Code must be built and tested before publishing, which is CI's responsibility.
Publish scripts actually do the publishing (usually on CI)
Canary means a pre-release (something you can run nightly)
Release means a regular versioned release
CI may add the --yes flag to publish:* scripts. It is better to add it in the CI config rather than in the scripts themselves, in order to prevent unwanted local publishes
Alternatively, on publish, CI can skip the regular flow (build, test, coverage) and only run publish in packages, which should itself run build, test and coverage. This makes sure that a publish run locally from a dev machine still goes through all phases. However, we recommend always publishing from CI only, as that is the only way to ensure all proper checks are performed.
Start
This is a developer-specific phase for day-to-day activities like website or library/demo/storybook development. In this phase the codebase is constantly watched for changes and rebuilt once changes occur.
Assume the following setup:
packages/
  - demo (depends on lib1 and lib2)
  - lib1
  - lib2 (depends on lib1)
Demos that use libraries often need those libs to be pre-built in order to run start without errors about missing files. Therefore the root’s start script must first run the build script scoped to libs and then run the start:quick script (see examples below). Unfortunately, this brings overhead until https://github.com/webpack/webpack/issues/4991 is fixed (previously TS was prone to this too: bug 12996, see the update below)
start:quick simply runs start in all packages which starts Webpack and Babel watchers
UPDATE: TypeScript 3.4 introduced the new incremental option: it produces a build and a cache, so no matter how often you restart watchers, the build will be ready much faster. Unfortunately, it is not yet supported by TS Loader for Webpack. It still does not eliminate the need to pre-build libraries.
Clean
This is a developer-specific phase that runs before builds or when switching branches.
Root’s clean should first run clean scripts in packages (e.g. clean artifacts), then remove node_modules in packages, then remove node_modules in the root. This brings the repository to ground zero
Cleaning is useful when devs switch branches: the command npm run clean && npm install makes sure they have a working setup after the switch
Main package scripts
{
  "postinstall": "npm run bootstrap",
  "bootstrap": "lerna bootstrap --no-ci --hoist",
  "bootstrap:quick": "npm run bootstrap -- --scope=@ringcentral/*",
  "clean": "npm run clean:artifacts && npm run clean:packages && npm run clean:root",
  "clean:artifacts": "lerna run clean --parallel",
  "clean:packages": "lerna clean --yes",
  "clean:root": "rimraf node_modules",
  "start": "npm run build:quick && npm run start:quick",
  "start:quick": "dotenv lerna run start -- --parallel",
  "build": "lerna run build --concurrency=1 --stream",
  "build:quick": "npm run build -- --scope=@ringcentral/*",
  "test": "lerna run test --concurrency=1 --stream",
  "test:quick": "lerna run test:quick --concurrency=1 --stream",
  "test:coverage": "lerna run test:coverage --parallel",
  "test:watch": "lerna run test:watch --parallel",
  "publish:release": "lerna publish --force-publish=* --no-push --no-git-tag-version",
  "lint": "eslint --cache --cache-location node_modules/.cache/eslint --fix",
  "lint:all": "npm run lint 'packages/*/src/**/*.ts*'",
  "lint:staged": "lint-staged"
}
We prefer to run tests one by one in topological order (maintained by Lerna) to see the nice structured output. Technically --concurrency=1 --stream can be replaced with --parallel which disregards topology and runs everything in parallel.
You can publish all packages together using the --force-publish=* flag for simplicity.
The test:quick and lint:staged scripts should be part of the pre-commit hook.
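The topological order mentioned above can be illustrated with the demo/lib1/lib2 setup from the Start phase. The sketch below is a plain depth-first ordering written for this article; it is similar in spirit to what Lerna does but is not Lerna's implementation, and the topologicalOrder name is hypothetical.

```javascript
// Dependency-first ordering: a package is emitted only after everything it
// depends on has been emitted. Throws if the dependencies form a cycle.
function topologicalOrder(dependencies) {
  const order = [];
  const visiting = new Set();
  const done = new Set();

  function visit(name) {
    if (done.has(name)) return;
    if (visiting.has(name)) throw new Error(`Dependency cycle at ${name}`);
    visiting.add(name);
    for (const dep of dependencies[name] || []) visit(dep);
    visiting.delete(name);
    done.add(name);
    order.push(name);
  }

  Object.keys(dependencies).forEach(visit);
  return order;
}

// The example setup: demo depends on lib1 and lib2, lib2 depends on lib1.
const dependencies = {
  demo: ['lib1', 'lib2'],
  lib1: [],
  lib2: ['lib1'],
};

console.log(topologicalOrder(dependencies)); // [ 'lib1', 'lib2', 'demo' ]
```

Running build or test in this order guarantees that lib1 is ready before lib2 needs it, which is exactly what --parallel gives up.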
Package scripts
{
  "mocha": "mocha --opts mocha.opts",
  "karma": "karma start --singleRun",
  "nyc": "nyc mocha --opts mocha.opts",
  "build": "npm run clean && npm run build:tsc && npm run build:webpack",
  "build:tsc": "tsc",
  "build:webpack": "webpack --progress",
  "start": "npm-run-all -p start:tsc start:webpack",
  "start:tsc": "npm run build:tsc -- --watch --preserveWatchOutput",
  "start:webpack": "npm run build:webpack -- --watch"
}
GitLab CI config
image: node:lts

stages:
  - validation
  - bootstrap
  - test
  - publish

variables:
  BRANCH: $(echo ${CI_COMMIT_REF_NAME} | sed -e 's/\(.*\)/\L\1/' | sed -r 's/[^a-z0-9-]/-/g')

before_script:
  - npm config set //registry.npmjs.org/:_authToken=${NPM_TOKEN}
  # or
  - npm config set repository $PRIVATE_NPM
  - npm config set _auth $PRIVATE_NPM_AUTH
  - npm config set email $PRIVATE_NPM_EMAIL
  - npm install --progress=false --ignore-scripts

cache:
  key: "$CI_COMMIT_REF_NAME"
  paths:
    - node_modules/
    - packages/*/node_modules

job_lint:
  stage: validation
  script: DEBUG=eslint:cli-engine npm run lint:all

job_bootstrap:
  stage: bootstrap
  script:
    - npm run bootstrap:quick
    - npm run build:quick
  artifacts:
    paths:
      - "*/es"
      - "*/lib"
    expire_in: 1 day

job_test:
  stage: test
  script:
    - npm test
    - npm run test:coverage

job_canary:
  stage: publish
  script: npm run publish:release -- --canary --preid=$BRANCH.$CI_PIPELINE_IID --dist-tag=$BRANCH --yes
  only:
    - master
    - feature/*
    - release/*

job_publish:
  stage: publish
  script: npm run publish:release -- $CI_COMMIT_TAG --yes
  only:
    - tags
This setup assumes you have NPM_TOKEN (and others) in GitLab ENV variables. Also note the --yes flags for publish scripts.
Due to https://github.com/lerna/lerna/issues/2171 we have to explicitly set the pipeline ID in the canary preid.
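For readers less familiar with sed, the canary preid built in the config can be expressed in JavaScript. The two sed calls lowercase the branch name and replace every character outside a-z, 0-9 and "-" with "-"; the pipeline ID is then appended after a dot. The function names, the branch name and the ID below are hypothetical and only mirror the shell pipeline.

```javascript
// Mirrors: sed -e 's/\(.*\)/\L\1/' | sed -r 's/[^a-z0-9-]/-/g'
function sanitizeBranch(branch) {
  return branch.toLowerCase().replace(/[^a-z0-9-]/g, '-');
}

// Mirrors: --preid=$BRANCH.$CI_PIPELINE_IID
function canaryPreid(branch, pipelineId) {
  return `${sanitizeBranch(branch)}.${pipelineId}`;
}

console.log(canaryPreid('feature/My_Feature', 42)); // feature-my-feature.42
```

The sanitized branch name is also what the config passes as --dist-tag, so a canary from each branch is published under its own tag.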
Travis CI config
language: node_js
node_js:
  - stable
cache:
  directories:
    - $HOME/.npm
    - node_modules
    - packages/*/node_modules
before_install:
  - BRANCH=$(echo ${TRAVIS_BRANCH} | sed -e 's/\(.*\)/\L\1/' | sed -r 's/[^a-z0-9-]/-/g')
  - npm config set //registry.npmjs.org/:_authToken=${NPM_TOKEN}
before_script:
  - DEBUG=eslint:cli-engine npm run lint:all
  - npm run build:quick
deploy:
  # Regular release: only for tag builds, where $TRAVIS_TAG is set
  - provider: script
    script: npm run publish:release -- $TRAVIS_TAG --yes
    skip_cleanup: true
    on:
      tags: true
      repo: xxx
  # Canary pre-release: non-tag builds of master
  - provider: script
    script: npm run publish:release -- --canary --preid=$BRANCH.$TRAVIS_JOB_NUMBER --dist-tag=$BRANCH --yes
    skip_cleanup: true
    on:
      branch: master
      tags: false
      repo: xxx
  # Upload built bundles to GitHub Releases for tag builds
  - provider: releases
    api_key: $GITHUB_TOKEN
    skip_cleanup: true
    file:
      - packages/xxx/dist/xxx.js
      - packages/yyy/dist/yyy.js
    on:
      tags: true
      repo: xxx
after_success:
  - npm run test:coverage
This setup also assumes you have NPM_TOKEN and GITHUB_TOKEN in Travis ENV variables. Also, note the --yes flags for publish scripts. By default Travis runs npm install, which triggers the postinstall script and therefore a full bootstrap, so there is some room for optimization like we did in the GitLab example.
Due to https://github.com/lerna/lerna/issues/2171 we have to explicitly set the Travis job number in the canary preid.
Packages used
Since we mention a lot of packages, it makes sense to explain what they do.
Lerna: manages JavaScript projects with multiple packages
Dotenv-cli: loads .env files that store environment variables
ESLint: a fully pluggable tool for identifying and reporting on patterns in JavaScript
Mocha: simple & flexible JavaScript test framework for Node.js & the browser
NYC: command line interface for Istanbul, a JS code coverage tool that computes statement, line, function and branch coverage
Karma: test runner for JavaScript, which is able to run Mocha and others
Coveralls: code coverage tracking system
TSC: TypeScript compiler
Webpack: module bundler, bundles JavaScript files for usage in a browser
Husky: runs NPM scripts from git hooks
Lint-staged: runs linters on git staged files
Jest: JavaScript testing framework with a focus on simplicity, can be used as a replacement for Karma+Mocha+Istanbul
This approach with minor variations (mostly alternate phases) was successfully implemented and tested in different projects and proven to be consistent and complete. I hope it will help you to build, test and interact with your projects more efficiently.
This whole thing definitely requires some library to make things easier :)