Sunday, May 3, 2020

Advantages of the Node package manager Yarn over npm



npm and Yarn are the most popular JavaScript package managers. npm is the older of the two: it was developed by Isaac Z. Schlueter and introduced in January 2010, soon after the release of Node.js. To date it remains the most widely used package manager for JavaScript. Yarn was introduced in late 2016 by Facebook engineers with some major security and performance improvements over npm, and it was well received in the Node community.

As npm grew tremendously in popularity over the years, its community faced major security issues caused by malicious code injected into some popular packages. The malicious code was meant to copy the npm credentials from the machine running the affected package and upload them to the attacker. This technique is called module hijacking. Another frustrating issue was the mismatch of sub-dependencies within a package. Say a package used in the current application is no longer maintained, and its sub-dependencies introduce breaking changes in a new release that the package is never updated to match. This can cause hard-to-diagnose failures when the application runs.

To resolve these issues, Yarn was introduced in late 2016 by Facebook developers. Engineers from Exponent, Google and Tilde also helped test and validate the Yarn client outside Facebook, on different JavaScript frameworks and for additional use cases, before it was released publicly. Here are a few advantages of Yarn over npm:
  • Lock file:
    Yarn introduced a lock file called yarn.lock that keeps the versions of dependencies locked. It also records an identifier for every sub-dependency, with its version locked, ensuring that every installation of the application gets the same versions of dependencies and sub-dependencies. Since version 5.0, however, npm provides a similar file, package-lock.json.
  • Selective dependency resolution:
    This lets the user define a specific version or a version range for a dependency or sub-dependency inside the "resolutions" key of package.json, to control which versions of packages are used.
  • Caching:
    Yarn stores every package it installs in a global cache in the user's directory on the file system, so that all subsequent installations can be served from the local cache, which reduces installation time.
  • Parallel downloads and automatic retries:
    Yarn uses parallel workers to download packages in order to maximize resource utilization, which helps reduce build time. Network requests are also retried upon failure, so that a temporary network issue does not fail the build.
  • Multiple registries:
    Yarn can install packages from both the npm registry and Bower, which can help keep packages available if one of the registries goes down.
  • Autoclean:
    The yarn autoclean command frees up space by removing unnecessary files (e.g. *.md, *.yaml) from dependencies. The .yarnclean file can also be configured to remove the test and example files of dependencies in order to reduce the size of the node_modules folder.
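
As a rough illustration of the selective dependency resolution described above, a package.json using the "resolutions" key might look like the fragment below. The package names (example-app, legacy-widget, tiny-helper, other-helper) and all version numbers are hypothetical placeholders, not taken from any real project.

```json
{
  "name": "example-app",
  "version": "1.0.0",
  "dependencies": {
    "legacy-widget": "2.3.0"
  },
  "resolutions": {
    "legacy-widget/tiny-helper": "1.4.2",
    "**/other-helper": "^3.0.0"
  }
}
```

In this sketch, the first entry is meant to pin tiny-helper where it appears directly under legacy-widget, while the "**/" pattern in the second entry is meant to apply to other-helper wherever it occurs in the dependency tree. The exact pattern rules depend on the Yarn version in use, so they are worth checking against its documentation.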


Conclusion
Over the years npm has resolved many of its security issues, and to date it remains the most widely used Node package manager. Above all, npm ships with the Node.js installation as the default package manager.

But considering the features Yarn offers over npm, as discussed above, it is arguably the safer and more convenient option, especially for production use. In 2020 Yarn released version 2 with various bug fixes and further performance improvements. The Yarn roadmap includes many new additions, among them a shift from a Node-specific command-line package manager to a platform and API for multiple languages, which makes it even more promising for future use.