Package Managers: Understanding npm, npx and yarn

azhariel

Renan Braga

Posted on May 25, 2023

In JavaScript/TypeScript development, it is unthinkable to build professional applications without a package manager. So much so that, when you install the Node.js JavaScript runtime on your machine, there is an option (checked by default) to also install its default package manager: npm.

With npm, we can install packages that often form the basis of complex applications, and we also give the project a system for managing these packages and their dependencies.

In this article, I will explain the features and history of the most popular managers and how they differ. For clarity, we will use the English terms, which are the ones most used in day-to-day development.

Why is it necessary to have a package manager?

As developers, it is part of our responsibility to make good use of time and resources in our projects. In a popular analogy, imagine that our job was to assemble cars: it would not be efficient for us to handle everything from oil extraction to the final production of the tires, since we can take ready-made tires and reuse them in every car we produce.

However, it is indeed necessary to keep track of which components were used and their versions, to ensure that everyone working on the car uses the same components. With development projects, the situation is similar: it would be inefficient for every project to develop from scratch an application that exposes a server to the web, considering that the same activity has already been performed thousands of times.

That's where the need for a package manager comes in: in addition to pointing us to a repository of resources, it ensures that everyone involved in the project works from the same dependency tree. A package manager also makes it much easier to update these dependencies: rather than manually updating a package and its entire list of dependencies (and the dependencies of those dependencies in turn), all of this is handled by a simple command such as:

npm update packagename
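
To get a feel for what "dependencies of those dependencies" means in practice, here is a minimal TypeScript sketch that walks a dependency graph and lists everything a project pulls in. The graph, the package names and the collectTransitive function are invented for illustration; this is not how npm is implemented internally.

```typescript
// Hypothetical dependency graph: each package lists its direct dependencies.
// All package names here are invented for the example.
type DependencyGraph = Record<string, string[]>;

const graph: DependencyGraph = {
  "my-app": ["web-server"],
  "web-server": ["router", "parser"],
  router: ["parser"],
  parser: [],
};

// Returns every package reachable from `root`, direct or indirect, sorted by name.
function collectTransitive(deps: DependencyGraph, root: string): string[] {
  const found: string[] = [];
  const pending: string[] = [root];
  while (pending.length > 0) {
    const current = pending.pop() as string;
    for (const dep of deps[current] ?? []) {
      // Skip the root itself and anything already recorded.
      if (dep !== root && found.indexOf(dep) === -1) {
        found.push(dep);
        pending.push(dep);
      }
    }
  }
  return found.sort();
}

// "my-app" declares one dependency but ends up needing three packages.
console.log(collectTransitive(graph, "my-app"));
```

Even in this tiny graph, a single declared dependency brings two more along with it, which is the bookkeeping a package manager takes off our hands.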

Despite having a name that describes its function, npm is not the only package manager for Node. As we will see later, companies have also developed their own solutions, as is the case with yarn.

The three parts of npm

npm consists of three distinct parts: the website (https://www.npmjs.com/), the registry and the Command Line Interface (CLI). The website is used to discover packages, configure profiles, and manage other aspects of the npm experience. The CLI runs from a terminal and is how most developers interact with npm. Finally, the registry is a large public database of JavaScript software and the meta-information that surrounds it.

npm website

The npm website is where you can register an account for publishing packages. It also lets you search the packages in the registry and handle other management tasks, such as belonging to an organization and managing private packages.

npm registry

The public npm registry is the database that contains information about published packages, each consisting of code and metadata. Open source developers and enterprise developers use the npm registry to contribute packages to the entire community or members of their organizations and download packages to use in their own projects.

Because it's a public registry, other package managers (like yarn, which we'll explain later) use the same registry as npm.

npm CLI

Using the npm CLI, you can install packages from the registry by name and, optionally, by the desired version. Packages can be installed globally (to be used at any time, regardless of scope) or locally (to be linked and used specifically in one project). A full CLI usage guide would require an article of its own, but the most common commands are those for installing and removing packages, respectively:

npm install packageName
npm uninstall packageName

We can use the -g flag to install a package globally. This is normally done for packages that support development itself and that provide scripts executable with the npx command.

When running the install command in the scope of a project, the package files and their dependencies are downloaded to a folder called node_modules and organized through two files in the project root: package.json and package-lock.json. As always, the best source to learn more about CLI commands is the official documentation: https://docs.npmjs.com/.

Package files

Packages installed by npm are organized through two files in JSON format: package.json and package-lock.json. Next, we will explain how the two work together.

package.json

When you install packages with the npm install packageName command or start a project with npm init, a file called package.json is created or modified. This file holds the basic information about the project, its available scripts and its network of dependencies, both production and development.

If you install a package in a project that has not been initialized with npm init, the only key in the file will be dependencies, listing the installed package. For example, after installing Express:



{
  "dependencies": {
    "express": "^4.18.2"
  }
}



A package.json file naturally grows along with the complexity of the project. To see the list of properties available in the file, I recommend reading the documentation, especially the part about the scripts property, which can define various commands such as running tests or starting a development server.

package-lock.json

The package-lock.json file is automatically created and updated by any npm operation that modifies the contents of the node_modules folder or the package.json file. It describes the exact file tree that was generated, so that subsequent installations can generate identical trees, regardless of intermediate dependency updates.
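
One way to see why the lockfile matters: the "^4.18.2" recorded for Express in the package.json example is a semver caret range, not a single version, so more than one published version can satisfy it. The TypeScript sketch below is a simplified illustration of caret matching, assuming plain MAJOR.MINOR.PATCH versions only (no pre-release tags or wildcards). The function names are my own and this is not npm's actual implementation.

```typescript
// Simplified caret-range check. Assumption: versions are plain
// "MAJOR.MINOR.PATCH" strings without pre-release tags or wildcards.
interface Version {
  major: number;
  minor: number;
  patch: number;
}

function parseVersion(text: string): Version {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(text);
  if (match === null) {
    throw new Error(`Unsupported version format: ${text}`);
  }
  return { major: Number(match[1]), minor: Number(match[2]), patch: Number(match[3]) };
}

// Negative if a < b, zero if equal, positive if a > b.
function compareVersions(a: Version, b: Version): number {
  return a.major - b.major || a.minor - b.minor || a.patch - b.patch;
}

// A caret range allows changes that keep the left-most non-zero number fixed.
function caretUpperBound(base: Version): Version {
  if (base.major > 0) return { major: base.major + 1, minor: 0, patch: 0 };
  if (base.minor > 0) return { major: 0, minor: base.minor + 1, patch: 0 };
  return { major: 0, minor: 0, patch: base.patch + 1 };
}

function satisfiesCaret(version: string, range: string): boolean {
  if (range.charAt(0) !== "^") {
    throw new Error(`Only caret ranges are handled in this sketch: ${range}`);
  }
  const base = parseVersion(range.slice(1));
  const candidate = parseVersion(version);
  return (
    compareVersions(candidate, base) >= 0 &&
    compareVersions(candidate, caretUpperBound(base)) < 0
  );
}

console.log(satisfiesCaret("4.19.0", "^4.18.2")); // true
console.log(satisfiesCaret("5.0.0", "^4.18.2")); // false
```

Because any version from 4.18.2 up to (but not including) 5.0.0 satisfies the range, two installations made at different times could pick different versions. Recording the exact version in package-lock.json removes that ambiguity.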

Unlike package.json, this file should not be edited manually, since its whole purpose is to let npm reproduce identical installations regardless of the context in which it is run. Later we will see an alternative to npm called yarn. It produces a similar file called yarn.lock, which has the same function as package-lock.json but describes the installations performed by yarn.

npx command

Above, we talked about how we can use the npx command to run scripts from packages installed with npm. Usage is simple (in square brackets are optional parameters):

npx packageName[@version] [args...]

This command allows you to run an arbitrary command from an npm package, either locally installed or remotely fetched: if a requested package is not present in the local project dependencies or installed globally on the machine, it is installed in a folder in the npm cache. One of the most used package scripts, especially among frontend developers, is create-react-app:

npx create-react-app my-app-name

Running this script creates the entire basic structure of a React project, considerably speeding up project setup. Given the nature of the command, it is possible either to ship an executable as part of a package that offers other tools or to create packages whose only function is to be executed. For inspiration, I recommend typing the command

npx azhariel

in the terminal for a simple business card example globally available to npm users.

Yarn

For all of npm's merits, some developers saw room for improvement in the package manager. One of them was Sebastian McKenzie, who started the implementation of the yarn manager (the word literally means a spun thread, such as wool, but it is also an acronym for “Yet Another Resource Negotiator”) while working at Facebook. Despite this initial connection with the tech giant, the manager is open source and is not maintained directly by any company.

To use yarn, we can install it globally using npm itself:

npm install --global yarn

Currently, yarn uses the same registry maintained by npm, just reaching it through a different CNAME domain. The main differences from the standard manager are therefore in architecture, strategy and performance.

Commands

There are some differences between the commands of the two managers. As always, I suggest consulting the official documentation for both, but the following table gives a summary:

| Command | npm | yarn |
| --- | --- | --- |
| Install dependencies | npm install | yarn |
| Install package | npm install [package] | yarn add [package] |
| Install dev package | npm install --save-dev [package] | yarn add --dev [package] |
| Uninstall package | npm uninstall [package] | yarn remove [package] |
| Uninstall dev package | npm uninstall --save-dev [package] | yarn remove [package] |
| Update | npm update | yarn upgrade |
| Update package | npm update [package] | yarn upgrade [package] |
| Global install package | npm install --global [package] | yarn global add [package] |
| Global uninstall package | npm uninstall --global [package] | yarn global remove [package] |

Source: https://www.digitalocean.com/community/tutorials/nodejs-npm-yarn-cheatsheet

Installation of Packages

While npm installs the designated packages directly into the node_modules folder, yarn, by default, performs the installation from a cache folder that is used globally on the machine. Although it also maintains a cache folder, npm fetches files from the registry for the project. This caching strategy allows yarn to install dependencies offline as long as they are already available in the local cache.

Still on package installation, yarn downloads and installs packages in parallel, reducing the total time needed to install several dependencies. In contrast, npm has historically processed packages sequentially, one at a time.

Plug'n'Play Strategy (PnP)

Since 2018, yarn has offered an optimized strategy to help Node find the location of installed packages. Normally, Node looks for installed modules by walking up the directory tree: when a module is required, the runtime looks for it in the node_modules folder of the directory the request came from. If it is not found there, Node looks in the node_modules folder one level above, and so on, until it finds the location where the dependency is installed (or fails to find it and raises a dependency error).
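
The search sequence described above can be sketched in a few lines of TypeScript. This is a simplified model that assumes POSIX-style absolute paths and takes the list of installed folders as a plain array instead of reading the disk; the function names and the sample paths are hypothetical.

```typescript
// Lists the node_modules folders Node would try, nearest first, for a
// require() issued from a file located in `fromDir`.
// Assumption: `fromDir` is an absolute POSIX-style path.
function nodeModulesCandidates(fromDir: string): string[] {
  const segments = fromDir.split("/").filter((part) => part.length > 0);
  const candidates: string[] = [];
  for (let depth = segments.length; depth >= 0; depth--) {
    // Node does not append node_modules to a folder already named node_modules.
    if (depth > 0 && segments[depth - 1] === "node_modules") {
      continue;
    }
    const parent = segments.slice(0, depth).join("/");
    candidates.push(parent === "" ? "/node_modules" : `/${parent}/node_modules`);
  }
  return candidates;
}

// Returns the first candidate folder that contains the package, or null.
// `installedDirs` stands in for a real file system check.
function findPackage(name: string, fromDir: string, installedDirs: string[]): string | null {
  for (const dir of nodeModulesCandidates(fromDir)) {
    const candidate = `${dir}/${name}`;
    if (installedDirs.indexOf(candidate) !== -1) {
      return candidate;
    }
  }
  return null;
}

// Hypothetical project layout with Express installed at the project root.
const installed = ["/app/node_modules/express"];
console.log(nodeModulesCandidates("/app/src/lib"));
console.log(findPackage("express", "/app/src/lib", installed)); // "/app/node_modules/express"
```

Each failed candidate costs a file system lookup in the real runtime, which is the overhead the strategy below tries to avoid.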

As an alternative to this search sequence, yarn generates a single .pnp.cjs file instead of the node_modules folder. The .pnp.cjs file contains several maps: one linking package names and versions to their location on disk, and another linking package names and versions to their list of dependencies. With these lookup tables, yarn can instantly tell Node where to find any package it needs access to, as long as it's part of the dependency tree.
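
As a rough illustration of the two lookup tables, the TypeScript sketch below models them as plain objects keyed by "name@version". The PnpTables shape, the package names and the paths are all hypothetical; the real .pnp.cjs file uses its own, more detailed format.

```typescript
// Hypothetical, simplified stand-in for the maps described above.
interface PnpTables {
  // "name@version" -> location on disk
  locations: Record<string, string>;
  // "name@version" -> { dependency name -> resolved version }
  dependencies: Record<string, Record<string, string>>;
}

// Invented sample data: an app depending on one library, which depends on another.
const tables: PnpTables = {
  locations: {
    "my-app@1.0.0": "/projects/my-app",
    "lib-a@2.1.0": "/cache/lib-a-2.1.0",
    "lib-b@0.4.0": "/cache/lib-b-0.4.0",
  },
  dependencies: {
    "my-app@1.0.0": { "lib-a": "2.1.0" },
    "lib-a@2.1.0": { "lib-b": "0.4.0" },
    "lib-b@0.4.0": {},
  },
};

// Two table lookups replace the folder-by-folder search: first find which
// version the requester depends on, then find where that version lives.
function resolveDependency(pnp: PnpTables, requester: string, dependencyName: string): string {
  const declared = pnp.dependencies[requester];
  if (declared === undefined) {
    throw new Error(`Unknown package: ${requester}`);
  }
  const version = declared[dependencyName];
  if (version === undefined) {
    throw new Error(`${requester} does not list ${dependencyName} as a dependency`);
  }
  const location = pnp.locations[`${dependencyName}@${version}`];
  if (location === undefined) {
    throw new Error(`No location recorded for ${dependencyName}@${version}`);
  }
  return location;
}

console.log(resolveDependency(tables, "my-app@1.0.0", "lib-a")); // "/cache/lib-a-2.1.0"
```

Note that in this sketch a request for a package that the requester does not list fails immediately, even if that package exists elsewhere in the tables.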

PnP is used by default since version 2 of yarn (codename berry). More information about it can be found in the official documentation.

Security

Both managers have features that help secure the installation and execution of packages, but they use different strategies. When installing a package, yarn verifies the integrity of the data with a checksum: the checksum is computed locally and compared with the expected value, and the data is rejected if the two differ. npm, in turn, uses the SHA-512 algorithm to verify the integrity of packages, comparing the result with the hash recorded in the package-lock.json file.

In addition, npm has the npm audit command, which reports possible vulnerabilities in the installed dependency versions, and the npm audit fix command, which tries to resolve the reported vulnerabilities by upgrading to more recent versions.

Speed

As there are several factors that can influence the final speed of a package manager operation, it is necessary to analyze different scenarios. With that in mind, another package manager called pnpm performs daily benchmarks between npm, yarn and pnpm itself. The results can be checked on the pnpm website. As an example, these are the results obtained on the date of publication of this article:

[Figure: pnpm speed benchmark]

In most scenarios, yarn appears to have an advantage over npm operations, especially when using the PnP strategy in conjunction with cache and lockfile, generating an almost instantaneous installation of packages.

Conclusion

Despite occasional differences, at the end of the day, more important than the choice of package manager is the choice of the packages themselves.

When we choose to use a package, we also take on all the dependencies of that package, which can cause an unnecessary ripple effect on the progress of the project.

That said, building modern, professional applications almost always requires installing packages, so a basic understanding of how to manage them is a necessary skill for any developer.
