Nicholas Cloud
Posted on January 31, 2021
(This article originally published at nicholascloud.com.)
I've been a paying Evernote user for years, and before that a "free" Evernote user for even longer. Evernote has some seriously powerful features, among them the Evernote Web Clipper extension (available on all major browsers), excellent PDF markup, and flawless sync across devices. It is solid software, backed by a solid service.
But I've left Evernote, likely for good.
In the wake of efforts by "Big Tech" companies to censor, deplatform, or control the data that belongs to customers, I've been cutting ties with as many Big Tech companies as I'm able. And though I have never experienced any negative service from Evernote, it does not offer end-to-end encryption for its services (meaning notes stored on Evernote servers are accessible by Evernote employees), and that has become a deal-breaker for me. I value my privacy, and my personal notes are where I can explore ideas and record my thoughts. I don't want to use any service that doesn't respect and protect that.
But what to do with GIGABYTES of notes, clipped articles, recipes, photos, and annotated PDFs?
When I ditched GMail for ProtonMail, it took me months to dig through all my archived mail. I deleted, exported, or printed each piece, then had to update the sender's settings with my new email address (or unsubscribe, if it was a newsletter). I transferred all of my contacts to ProtonMail, then reviewed them all to put the exported information into ProtonMail's custom contact fields. It was a long, tedious, tiresome process, but I did it.
I expected my transition away from Evernote to be just as challenging. I came up with a list of goals for my new notebook scheme:
- My notes should be plain text (well, technically Markdown) files that link to any relevant external assets, such as images, PDF files, etc.
- Markdown files and assets should be organized in a uniform way.
- Markdown files should have a uniform naming convention (all lower-case words, separated by hyphens).
- I should be able to easily search for note content.
- My notes should be available on multiple devices.
- I should be able to clip content from the web and easily add it as a Markdown file to my notebook.
Step 1: Get my notes out of Evernote
There are two Evernote applications you can use on your device: the slick, newer version, and the old, legacy version. The new version looks nice, but it dropped a significant feature I thought would be available to me, and that is the ability to export all notes at once in HTML format. Even the legacy version seems to lack this feature (maybe just on OSX?), though in the legacy version you can still export individual notebooks as HTML collections. In the newer version you can only export notebooks as Evernote's own ENEX file format (a kind of XML archive of notebook content). This seemed like it was going to be a show-stopper, since I had no clue how I would convert ENEX files into Markdown files, but a friend pointed me to the excellent utility evernote2md which does exactly that. Since I had 132 individual notebooks, it took me a while to export them all, then to convert them to Markdown with this utility, but once done, I had all of my saved notes in Markdown format (along with their attachments).
The total size of my exported Markdown notes and attachments is around 2GB.
Step 2: Fixing file names
I noticed several things pretty quickly after my initial export:
- Most Markdown file names were derived from the Evernote note title, which means they were typically in Title Case with underscores for space separators.
- MANY Markdown files, for one reason or another, had leading or trailing underscores.
- Some Markdown files -- mostly ones clipped from Reddit threads -- had strange naming conventions, e.g., _r_<subreddit-name>_<super-long-post-title>. This is because Evernote automatically uses the webpage title tag as the title of a note imported with its web clipper.
- MANY Markdown files were named "untitled-XX.md" (where XX is some number). Did these notes not have titles?
- MANY Markdown files had the word undefined randomly peppered through the file name, e.g., A_Historyundefined_and_Timeline_of_the_World.md. (I later realized this only occurred immediately preceding the word and, which leads me to speculate that an ampersand was used in the original title and that evernote2md has a bug that does not translate it correctly.)
So I had my work cut out for me. The first thing I decided to tackle was normalizing the file name case and space separator concerns. I prefer all lower-case file names, with hyphens for space separators. I hacked together a simple node.js script to traverse all of my exported notebooks and make this change.
// npm install globby@11 (globby 12 and later are ESM-only and cannot be loaded with require)
const globby = require("globby");
const path = require("path");
const renameSync = require("fs").renameSync;

// One glob per top-level notebook, matching every Markdown file beneath it.
const dirs = [
  // top-level notebook names omitted for REASONS
].map(dir => path.join(__dirname, dir, "**/*.md"));

(async () => {
  const filePaths = await globby(dirs);

  // Work out the new name for each file before touching the disk.
  const fixedPaths = filePaths.map(filePath => {
    const pathDir = path.dirname(filePath);
    const pathName = path.basename(filePath, ".md");
    // Drop anything that is not a word character or hyphen, then lower-case.
    let newPathName = (pathName.replace(/[^\w\d-]/g, "") + ".md").toLowerCase();
    // Underscores become hyphens.
    newPathName = newPathName.replace(/_/g, "-");
    const newPath = path.join(pathDir, newPathName);
    return {
      oldPath: filePath,
      newPath,
    };
  });

  fixedPaths.forEach(fixedPath => {
    console.info(`fixing path ${fixedPath.oldPath}...`);
    try {
      renameSync(fixedPath.oldPath, fixedPath.newPath);
    } catch (e) {
      // Stop at the first failure.
      console.error(e);
      process.exit(1);
    }
  });

  console.info("all done.");
  process.exit(0);
})();
This script worked well and addressed the first naming problem -- all file names are lowercase, and instead of underscores, hyphens delimit words -- but there were still problems to address.
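One thing a bulk rename like this can trip over is two different titles that normalize to the same name, in which case the second rename can quietly replace the first file. The sketch below is not part of the script above; normalizeName and findCollisions are hypothetical helper names and the sample paths are made up. It mirrors the script's naming rules and reports clashes before anything is renamed.

```javascript
const path = require("path");

// Same rules as the script above: strip anything that is not a word
// character or hyphen, lower-case the result, then turn underscores into hyphens.
function normalizeName(filePath) {
  const dir = path.dirname(filePath);
  const base = path.basename(filePath, ".md");
  const fixed = (base.replace(/[^\w-]/g, "") + ".md")
    .toLowerCase()
    .replace(/_/g, "-");
  return path.join(dir, fixed);
}

// Group the original paths by their normalized target and keep only the
// targets that more than one file would be renamed to.
function findCollisions(filePaths) {
  const byTarget = new Map();
  for (const filePath of filePaths) {
    const target = normalizeName(filePath);
    const sources = byTarget.get(target) || [];
    sources.push(filePath);
    byTarget.set(target, sources);
  }
  return [...byTarget].filter(([, sources]) => sources.length > 1);
}

// Hypothetical sample paths, for illustration only.
const collisions = findCollisions([
  "notes/My_Note.md",
  "notes/my-note.md",
  "notes/Other_Note.md",
]);
console.log(collisions);
```

Running a check like this over the globbed paths first, and renaming only when the result is empty, would keep the rename pass from destroying a note.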
My initial gut instinct was to begin modifying my script to handle the remaining naming problems, but that just made me tired, so I turned to the INTERNET to figure out if there was a better way to do this.
Sweet Baby Jesus there is.
There is a wonderful utility called perl-rename that uses Perl's regular expression engine to bulk rename files in-place. It's very similar to how Vim performs find/replace, and it helped me solve two of my other problems in quick order.
**Getting rid of undefined**
To get rid of the pesky word undefined in my note file names, I used the find command to traverse my entire notebook structure, find all the Markdown files that contained that word, then pass along those file paths to the perl-rename utility which renamed the file without its troublesome intruder.
cd $NOTEBOOK
find . -iname "*undefined*.md" -exec perl-rename --verbose --dry-run -- 's/undefined//g' '{}' \;
The actual heavy lifting is done in the substitution string: s/undefined//g, which reads like this: <substitute>/<the word undefined>/<with nothing>/<anywhere in the file name (globally)>.
(Note that the --dry-run flag will show you what would happen if the perl-rename command succeeded; to actually make the changes permanent the flag must be removed from the command.)
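For readers without perl-rename installed, the same dry-run idea can be sketched in Node.js. This is an illustration under assumptions, not a replacement for the utility: planRenames is a hypothetical helper, the sample paths are made up, and the substitution is applied only to the file name, never to the directory part of the path.

```javascript
const path = require("path");

// Build a list of { from, to } pairs without touching the disk,
// much like --dry-run. Only the last path segment is rewritten.
function planRenames(filePaths, pattern, replacement) {
  return filePaths
    .map((from) => {
      const renamed = path.basename(from).replace(pattern, replacement);
      return { from, to: path.join(path.dirname(from), renamed) };
    })
    // Drop entries whose name would not change.
    .filter(({ from, to }) => path.normalize(from) !== to);
}

// Hypothetical sample paths, for illustration only.
const plan = planRenames(
  ["history/a-historyundefined-and-timeline.md", "history/plain-note.md"],
  /undefined/g,
  ""
);
plan.forEach(({ from, to }) => console.log(`${from} -> ${to}`));
```

Once the printed plan looks right, each pair could be passed to fs.renameSync to make the change permanent.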
So far so good -- no more undefined in file names. What about leading and trailing dashes (the former underscores)? Easy peasy.
cd $NOTEBOOK
find . -iname "*.md" -exec perl-rename --verbose --dry-run -- 's{/-([^/]*)$}{/$1}' '{}' \;
find . -iname "*.md" -exec perl-rename --verbose --dry-run -- 's/-\.md$/.md/' '{}' \;
Again, the magic is in the substitution. Because find hands perl-rename the whole path (e.g., ./notebook/-note.md), each pattern is anchored to the end of the path rather than to its beginning.
- In the first command, the substitution reads: <substitute>/<a slash, a dash, and the rest of the file name>/<with the slash and the rest of the file name>. (The [^/]* part matches the file name itself, since a file name cannot contain a slash, and $1 puts it back.)
- In the second command, the substitution reads: <substitute>/<a dash right before the .md extension>/<with just the extension>. (The dollar sign $ symbol represents the end of a series of characters.)
Now for those pesky Reddit notes. Since I'd eliminated leading dashes in file names, clipped notes from Reddit would now have a file name like r-<subreddit>-<note-title>. I still wanted to know these notes were from Reddit, so I decided the following substitution was best.
cd $NOTEBOOK
find . -iname "r-*.md" -exec perl-rename --verbose --dry-run -- 's{/r-([^/]*)$}{/reddit-$1}' '{}' \;
The substitution reads (as you probably know by now): <substitute>/<an r- at the beginning of the file name>/<with reddit->. (The leading slash and the [^/]*$ tail pin the match to the file name, because find passes perl-rename the whole path.)
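The three clean-up passes can also be expressed as an ordered list of rules applied to the bare note name (directory and extension stripped), so that ^ and $ really do mean the start and end of the name. A minimal sketch; cleanName and the rules table are hypothetical names, and the sample path is made up.

```javascript
const path = require("path");

// Each rule is [pattern, replacement]. Order matters: the Reddit rule
// expects the leading dash to be gone already.
const rules = [
  [/^-+/, ""],        // dashes at the beginning of the name
  [/-+$/, ""],        // dashes at the end of the name
  [/^r-/, "reddit-"], // mark notes clipped from Reddit
];

function cleanName(filePath) {
  const ext = path.extname(filePath);
  const base = path.basename(filePath, ext);
  const cleaned = rules.reduce(
    (name, [pattern, replacement]) => name.replace(pattern, replacement),
    base
  );
  return path.join(path.dirname(filePath), cleaned + ext);
}

console.log(cleanName("misc/-r-linux-some-long-post-title-.md"));
```

Keeping the rules in one table makes it easy to add another pass later without writing a new find command for it.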
But Nick, what about all those untitled-XX.md notes?
I'm glad you asked. There's nothing to do with those notes but manually examine them and rename them according to their content. Which would absolutely be a pain in the ass if not for the terminal file manager ranger.
Step 3: Renaming untitled notes
Ever since I watched Luke Smith demonstrate the ranger file manager I've had a major boner for it, and wanted a real chance to kick its tires. The challenge of renaming all these untitled files gave me the opportunity.
Briefly, ranger is a terminal file manager that emulates some of Vim's modal editor behavior. For example, to move through directories you use the home-row keys h, j, k, and l. To run commands you press the colon key, then enter the command name. It's both sexy and dangerous, and since I'm a kinky guy it was love at first sight.
Since I had never used ranger for any serious file system work before, this was a great way to get used to its navigation controls and command capabilities. I quickly figured out that the home row was my navigation center, but ALSO that I wanted to move through pages of files at a time rather than just hitting j and k repeatedly. Turns out if you hold shift and hit those same keys ranger will move you half-page at a time. Excellent. I traversed each notebook and used ranger's find command -- hitting / followed by a file name string -- to quickly jump to the first instance of a file named "untitled...". Ranger has a great file preview pane that immediately let me inspect the contents of each file, from which I could easily determine what the real file name should be. Renaming each file was easy enough -- I typed the command :rename <new-file-name> and that did the trick. If I perchance needed to edit the file, I simply hit the l key to enter the file itself, which opened my default text editor (set by the EDITOR and VISUAL environment variables) for immediate access. Quitting the editor returned me immediately to ranger. Hitting the n key repeated my search. And so it went, until I had renamed all untitled-XX.md files in each notebook directory.
Occasionally I realized that a note I was viewing in ranger really needed to be in another directory (notebook). So I initiated an external shell command by typing ! (alternatively I could have typed :shell) and then typed my typical shell command: mv <file-name> <other-directory>/.
All without leaving ranger.
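The renaming itself is a judgment call, but a first guess at each name can be generated from the note's content and then accepted or corrected by hand. The following is a sketch under assumptions, not something from the workflow above: suggestName is a hypothetical helper that turns the first Markdown heading (or, failing that, the first non-empty line) into the lower-case, hyphenated convention.

```javascript
// Propose a file name from the first heading, or failing that the first
// non-empty line, of a note. Returns null when the note has no usable text.
function suggestName(markdown) {
  const lines = markdown
    .split("\n")
    .map((line) => line.trim())
    .filter(Boolean);
  const firstLine = lines.find((line) => /^#+\s/.test(line)) || lines[0];
  if (!firstLine) return null;
  const slug = firstLine
    .replace(/^#+\s*/, "")       // drop Markdown heading markers
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse everything else into hyphens
    .replace(/^-+|-+$/g, "");    // trim hyphens at either end
  return slug ? `${slug}.md` : null;
}

// Made-up note content, for illustration only.
console.log(suggestName("\n# Tax Checklist (2020)\n\nGather W-2 forms."));
```

A suggestion like this is only a starting point; previewing the note is still the way to decide whether the name fits.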
Step 4: Prune unused assets
By far the bulk of the disk space in each notebook is allotted to assets attached to notes -- be they images, or PDFs, or audio files. Markdown files, being plain text, require little space to store -- but assets, being binary, are pigs.
When I exported my notes to markdown, evernote2md created two directories in each notebook for assets: file and image. This was uniform across notebooks, which worked to my advantage. After I exported my notes I started rummaging through each notebook directory, purging notes that were either no longer important, or too badly mangled by the export process to be of any value. But how to remove their assets as well? I hacked together another node.js script to help me find assets that were no longer referenced by any notes in a given notebook.
#!/usr/bin/env node
const path = require("path");
const execSync = require("child_process").execSync;

const args = process.argv.slice(2);
const assetDir = path.resolve(args[0] || "."); // e.g., file, or image -- assume this script is executed in an asset directory
const mdDir = path.resolve(args[1] || "..");   // the notebook directory that holds the notes

// Every asset in the asset directory, one name per line.
const lsResults = execSync(`ls "${ assetDir }"`).toString().split("\n").filter(n => !!n);
const noResults = [];

lsResults.forEach(a => {
  // List the notes that mention this asset by name.
  const cmd = `grep -c -H -l "${ a }" "${ mdDir }"/*.md`;
  let grepResults = [];
  try {
    grepResults = execSync( cmd ).toString().split("\n").filter(n => !!n);
  } catch ( e ) {
    // grep exits non-zero when nothing matches, which execSync reports as an error; treat it as "no results"
  }
  console.info(">>", a);
  console.info(grepResults);
  if ( grepResults.length === 0 ) { // does not appear in any file
    noResults.push( a );
  }
});

console.info( noResults );
if ( noResults.length > 0 ) {
  console.info(`to remove - rm ${ noResults.join(" ") }`);
}
This script uses the grep command to determine if an asset filename appears in the text content of any note; if it does not, it is included in output at the end that builds up a long rm command string that can be copied and then run to eliminate unused assets for a given notebook directory.
The grep command flags are important here:
- -c means to generate a count of the matching lines in a file for the given search string (in this case, the asset file name)
- -H means to print the file name in which the match occurred
- -l means to restrict output to matching file names only (instead of matching lines within a file)
This combination of flags produces one line of output for each note in which the asset name is found, so the script knows how many notes reference the asset. If it isn't referenced at all, it's safe to delete. And so it goes.
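The same check can be written without shelling out to ls and grep, which sidesteps two quirks: grep treats the asset name as a regular expression (so a dot matches any character), and file names containing spaces or quotes are awkward to pass through a shell. A minimal sketch, assuming asset names appear in the notes exactly as they are spelled on disk; findUnusedAssets and unusedAssetsIn are hypothetical names, and if the links in your notes are URL-encoded the names would need decoding first.

```javascript
const fs = require("fs");
const path = require("path");

// Return the asset names that no note mentions. The match is a literal
// substring test, so characters such as "." carry no special meaning.
function findUnusedAssets(assetNames, noteContents) {
  return assetNames.filter(
    (asset) => !noteContents.some((content) => content.includes(asset))
  );
}

// Read one notebook from disk: every file in assetDir, every .md file in mdDir.
function unusedAssetsIn(assetDir, mdDir) {
  const assets = fs.readdirSync(assetDir);
  const notes = fs
    .readdirSync(mdDir)
    .filter((name) => name.endsWith(".md"))
    .map((name) => fs.readFileSync(path.join(mdDir, name), "utf8"));
  return findUnusedAssets(assets, notes);
}

// In-memory example with made-up names.
console.log(
  findUnusedAssets(
    ["adam-smith.png", "old-scan.pdf"],
    ["![Picture of Adam Smith](image/adam-smith.png)"]
  )
);
```

Calling unusedAssetsIn("image", ".") from inside a notebook directory would return the list of candidates to delete, which can still be reviewed by eye before anything is removed.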
This process is still a work in progress. As I review each notebook, I'm pruning its assets, and keeping track of those I've completed.
Step 5: Add front-matter to notebooks
Several Evernote alternatives (e.g., Boostnote) and many static website generators use YAML metadata markup in Markdown files to render them appropriately. This front-matter appears at the top of the file, and follows a schema similar to the following:
---
link:
title:
description:
keywords:
author:
date:
publisher:
stats:
tags:
---
My exported Evernote notes do not have this front-matter, but as will be demonstrated later, it is critically important for targeted note searches.
So since you're wondering, yes, I did hack together another script to inject this front-matter into every existing Markdown note within a notebook directory.
#!/usr/bin/env node
const path = require("path");
const execSync = require("child_process").execSync;
const { writeFileSync, readFileSync } = require("fs");

const args = process.argv.slice(2);
const mdDir = path.resolve(args[0] || "."); // assume this command is in a notebook directory

// Template for the front-matter block; <title> is filled in per note.
const frontMatter = `
---
link:
title: <title>
description:
keywords:
author:
date:
publisher:
stats:
tags:
---
`.trim();

// Upper-case the first character only (sentence case).
const capitalize = (s) => {
  return s.charAt(0).toUpperCase() + s.slice(1);
};

const lsResults = execSync(`ls "${ mdDir }"/*.md`).toString().split("\n").filter(n => !!n);
console.info(lsResults);

lsResults.forEach(m => {
  const fileContent = readFileSync( m ).toString();
  // Leave a note alone if it already starts with a front-matter block.
  if ( fileContent.startsWith("---\n") ) {
    console.info(`${ m } has front matter, skipping...`);
    return;
  }
  const fileName = path.basename( m );
  // Derive the title from the file name: hyphens become spaces, extension dropped.
  const formattedFrontMatter = frontMatter.replace(
    '<title>',
    capitalize(fileName.replace(/-/g, " ").replace(".md", ""))
  );
  const newContent = `${ formattedFrontMatter }\n${ fileContent }`;
  writeFileSync( m, newContent );
});
Since most notes already had filenames derived from their Evernote titles, I took advantage of that fact and turned those filenames into the note's front-matter title -- sans hyphens, and with sentence case. It's rough, I know, but better than nothing. The rest of the information I will have to add manually. The most important fields to me are title, author, and tags. (Are tags different than keywords? I don't know.) On these fields -- and the note's file name -- I will most frequently perform targeted searches.
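Once the block is in place, reading the fields back out is straightforward. The following is a minimal sketch, not a YAML parser: parseFrontMatter is a hypothetical helper that assumes the flat, one-line-per-field layout shown above, and the sample note is made up.

```javascript
// Parse the simple "key: value" front-matter block into an object.
// Nested values, lists, and multi-line strings are out of scope.
// Returns null when the note has no front-matter.
function parseFrontMatter(text) {
  const match = /^---\n([\s\S]*?)\n---(?:\n|$)/.exec(text);
  if (!match) return null;
  const fields = {};
  for (const line of match[1].split("\n")) {
    const separator = line.indexOf(":");
    if (separator === -1) continue;
    const key = line.slice(0, separator).trim();
    fields[key] = line.slice(separator + 1).trim();
  }
  return fields;
}

// Made-up note, for illustration only.
const note = [
  "---",
  "title: The history of economics",
  "author:",
  "tags: economics, history",
  "---",
  "The Author Adam Smith wrote the seminal work, The Wealth of Nations.",
].join("\n");

const fields = parseFrontMatter(note);
console.log(fields.title, "|", fields.tags.split(",").map((tag) => tag.trim()));
```

Fields that have not been filled in yet come back as empty strings, which makes it easy to list the notes that still need an author or tags.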
Step 6: Searching for notes
Searching for files by name is easy. If I want to search for a file with the word taxes in it, I simply use the find command:
cd $NOTEBOOK
find . -iname "*taxes*"
This will give me a list of file paths in which the word taxes appears. I try to name my notes intelligently so this kind of search can be productive. But sometimes I want to be more specific.
In that case I can rely on the tags front-matter that I've added to each note. For example, I have a recipe for a mixed drink that my brother recommends. I've tagged this mixed drink with alcohol, and can quickly find it using the ack command (you could use grep as well, but I prefer ack):
$ ack '^tags.*alcohol.*'
alcohol/super-complex-highly-rewarding-concotion-to-drink.md
10:tags: alcohol, don-julio, grand-marnier, kombucha, cocktail
This simple command reveals that the file I'm looking for is alcohol/super-complex-highly-rewarding-concotion-to-drink.md.
In fact, I could use ack to search for any front-matter field, simply by using the correct search expression. (The observant reader will notice that the expression resembles those I used when renaming files with perl-rename. The syntax is very similar.) In this case, the search expression reads: files that contain a line that begins with 'tags' (^tags) followed by any other characters (.*) but ALSO that has the word 'alcohol' in it, followed by any other characters (.*).
If I want to cast a wider net, I can also use ack to search for any term that occurs in any of my notes with the simple command: ack <search-term>.
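One subtlety of the pattern ^tags.*alcohol.* is that it matches any tags line containing those letters, so a note tagged non-alcoholic would show up as well. If exact tag matches matter, a small filter can split the tags line first. A minimal sketch, assuming tags are a comma-separated list on a single line as in the ack output above; hasTag is a hypothetical helper and both sample notes are made up.

```javascript
// True when the note's "tags:" line lists exactly the given tag.
// Assumes a single-line, comma-separated tags field.
function hasTag(noteText, tag) {
  const match = /^tags:(.*)$/m.exec(noteText);
  if (!match) return false;
  const tags = match[1].split(",").map((t) => t.trim().toLowerCase());
  return tags.includes(tag.toLowerCase());
}

// Made-up notes, for illustration only.
const cocktail = "---\ntitle: A drink\ntags: alcohol, don-julio, kombucha, cocktail\n---\n";
const mocktail = "---\ntitle: Another drink\ntags: non-alcoholic, kombucha\n---\n";

console.log(hasTag(cocktail, "alcohol")); // true
console.log(hasTag(mocktail, "alcohol")); // false
```

For quick searches at the terminal the ack one-liner is still the faster tool; a filter like this is mainly useful inside a script that post-processes the matches.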
Step 7: Creating new notes
Now that my notes are exported, cleaned, organized, and "front-mattered", how do I add new notes to my notebooks?
Adding a new Markdown file is as simple as using your favorite text editor to save a file with the .md extension. Because the evernote2md export favored file and image directories for external assets, I use those same conventions for my own notes. If I had a notebook directory called economics, for example, and I had a note called the-history-of-economics.md, I might reference assets like this:
The Author Adam Smith wrote the seminal work, The Wealth of Nations.
<!-- this is an image of Adam Smith -->
![Picture of Adam Smith](image/adam-smith.png)
<!-- this is a link to the das-kapital.pdf file -->
Later, Karl Marx challenged Adam Smith's ideas in his work, [Das Kapital](file/das-kapital.pdf).
Now, the vast majority of my notes are articles clipped from the Internet with the Evernote Web Clipper. As I've stated, this is one of the strongest features of Evernote, and the one I'll probably miss the most.
However, I've since discovered the clean-mark npm package, which will do the exact same thing by a) exporting a web page as well-formatted Markdown, and b) adding front-matter by default. This is now my go-to method of snipping articles from the Internet. The only caveat is that assets referenced by an article are not downloaded; they are instead referenced by their individual Internet URLs. If images or external files are an integral part of an article, it will be up to me to download them manually and adjust the links accordingly.
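To see at a glance which images a clipped article still pulls from the Internet, the note can be scanned for remote image references. This is a sketch under the assumption that the clipped file uses ordinary inline Markdown image syntax; findRemoteImages is a hypothetical helper and the URLs are placeholders.

```javascript
// Collect the http(s) URLs of images embedded in a Markdown note, i.e. the
// targets of ![alt](url) references. Inline syntax only; reference-style
// images and raw <img> tags are not handled.
function findRemoteImages(markdown) {
  const imagePattern = /!\[[^\]]*\]\((https?:\/\/[^)\s]+)/g;
  return [...markdown.matchAll(imagePattern)].map((match) => match[1]);
}

// Made-up note content, for illustration only.
const clipped = [
  "![Chart](https://example.com/img/chart.png)",
  "![Local](image/adam-smith.png)",
  "[A plain link](https://example.com/article)",
].join("\n");

console.log(findRemoteImages(clipped));
```

Each URL in the result is a candidate for downloading into the notebook's image directory, after which the link in the note can be pointed at the local copy.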
Step 8: Accessing notes from multiple devices
So far I've entertained two methods for accessing my notes on multiple devices.
I use the MegaSync cloud storage service to back up and synchronize files across devices. It is, by far, the best cloud storage service I've used. Mega has clients that work on Windows, OSX, Linux, and Android -- which is awesome since I have devices that run each of those. (It also supports iOS, but I have an Android phone so I don't care :D ). Synchronizing notes and files is flawless -- the only downside is that the Android client does not render Markdown files, or show their plain text content, which obviously makes it a non-ideal mobile client solution. This is my only real gripe about Mega's mobile offering. It's 2021: everything should render Markdown.
As an alternative, I've also contemplated using GitHub to manage all of my notes. I am very familiar with Git, and letting it manage versions of my files and track individual commits is very appealing to me. Synchronizing across devices is trivial, and GitHub's web interface will render Markdown files (including embedded images) in any web browser -- mobile or not. My only hesitation is that GitHub (unlike Mega) does not offer end-to-end encryption (my original issue with Evernote), so it does not give me the measure of privacy I desire.
This is the last big issue I need to solve before I have a complete Evernote replacement that meets all of my needs.
Conclusion
Leaving Evernote has been an adventure, but I've learned a lot along the way -- mostly that the tools I need to achieve my values are already within my reach, and they demand nothing but the time to learn. It's amazing how much of our personal lives we heft into the "cloud", to Big Tech services that don't actually give a crap about our privacy, and will use our own data against us when we don't bring our thoughts in line with whatever pre-established narrative to which they beat their drums. If you don't control your data, you roll the dice on ever more tenuous odds.
Reclaiming my data -- and making it my own again -- has been one of the most humanizing experiences I've had in a long time. I hope this inspires others to embark on a similar quest, for freedom -- for knowledge -- for autonomy.