Taming node_modules
It’s no secret that node_modules, while a wonderful solution to local package management, is also a swarming nightmare of files that puts a lot of pressure on disk I/O and disk space.
This can be very problematic for smaller devices running Node, which is increasingly common in IoT.
There are a couple of neat solutions that bundle all of your dependencies into a single JS file, and Yarn’s PnP proposal explores a related idea. Unfortunately, this doesn’t necessarily work for IoT devices with low computing power, as parsing such a large JS file would lock everything up for a few seconds, or perhaps longer. The cost of parsing is well covered in this fantastic article: The Cost of JavaScript in 2018
Instead, I tried understanding *what* exactly can be found in your node_modules. The results may shock you…
- Benchmarking suites
- Wikis and documentation (with image/video assets)
- Test suites
- Internal tooling
- Raw sources (alongside transpiled modules)
- Code samples
- Guy Fieri?
Right now, as you read this, these files are sitting inches away from your code, on your production environments.
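NPM allows you to dry-run a package publish (for example with `npm pack --dry-run`) to see which files are going to be included, but package maintainers often skip this step. NPM currently does not warn you about the contents of your package, and `npm install --prod` doesn’t help with this problem: it only skips devDependencies, so every file shipped by the remaining packages still gets installed.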
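For maintainers who want to automate that check, here is a minimal sketch in Node. It assumes an npm version whose `pack` command supports both `--dry-run` and `--json`, and that no lifecycle script prints to stdout. The `SUSPECT` pattern and both helper names are hypothetical, chosen for illustration only.

```javascript
const { execFileSync } = require("child_process");

// Assumed offender folder names, for illustration only.
const SUSPECT = /(^|\/)(tests?|__tests__|docs?|examples?|benchmarks?)\//i;

// Pure helper: which of the given paths sit under a suspect folder?
function flagSuspects(paths) {
  return paths.filter((p) => SUSPECT.test(p));
}

// Asks npm what it would publish, without writing a tarball.
// Assumes `npm pack --dry-run --json` prints an array whose first
// element carries a `files` list of { path, size, mode } entries.
function listPackedFiles(packageDir) {
  const out = execFileSync("npm", ["pack", "--dry-run", "--json"], {
    cwd: packageDir,
    encoding: "utf8",
    shell: process.platform === "win32", // npm is a .cmd shim on Windows
  });
  return JSON.parse(out)[0].files.map((file) => file.path);
}
```

Running `flagSuspects(listPackedFiles("."))` from a package's root before publishing would list anything that probably doesn't belong in the tarball.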
So, what’s the solution?
I wrote a small set of instructions for my Dockerfile that find and delete most of the known offenders. It could be made into an npm package itself (ironic, if you ask me).
Here it is: https://gist.github.com/fed135/af42be63989e734acc025090fd120186
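The gist holds the actual instructions. As a rough illustration of the same find-and-delete idea in plain Node, a sketch could look like the following. This is not the gist's contents: the folder names, the file pattern, and the `pruneNodeModules` helper are all assumptions made for the example, and `fs.rmSync` needs Node 14.14 or newer.

```javascript
const fs = require("fs");
const path = require("path");

// Assumed offender names, for illustration; extend or trim to fit your dependencies.
const JUNK_DIRS = new Set([
  "test", "tests", "__tests__", "doc", "docs",
  "example", "examples", "benchmark", "benchmarks",
]);
const JUNK_FILES = /\.(md|markdown)$/i;

// Returns the paths that were (or, in dry-run mode, would be) removed.
function pruneNodeModules(root, { dryRun = true } = {}) {
  const removed = [];

  function remove(target) {
    removed.push(target);
    if (!dryRun) fs.rmSync(target, { recursive: true, force: true });
  }

  // Direct children of node_modules (or of an @scope folder) are packages,
  // so a package that happens to be called "docs" is never treated as junk.
  // Symlinked entries are skipped rather than followed.
  function visitPackages(dir) {
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      if (!entry.isDirectory()) continue;
      const full = path.join(dir, entry.name);
      if (entry.name.startsWith("@")) visitPackages(full);
      else visitContents(full);
    }
  }

  // Inside a package: drop junk folders and files, recurse into everything else.
  function visitContents(dir) {
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      const full = path.join(dir, entry.name);
      if (entry.isDirectory()) {
        if (entry.name === "node_modules") visitPackages(full);
        else if (JUNK_DIRS.has(entry.name.toLowerCase())) remove(full);
        else visitContents(full);
      } else if (entry.isFile() && JUNK_FILES.test(entry.name)) {
        remove(full);
      }
    }
  }

  visitPackages(root);
  return removed;
}
```

Calling `pruneNodeModules("node_modules")` only reports what would go; passing `{ dryRun: false }` actually deletes. A few packages load files from folders like these at runtime, so it's worth previewing the list and running your own tests against the pruned tree.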
Let me know if I missed any.