Pulling your dev.to posts down locally

Alright kids! Strap in! This is kind of a meta post since I'm writing it here on dev.to, but I'm about to show you how I pulled all my writings on dev.to down locally into a new Bridgetown site I made! (Which may feature a blog...who knows...)

First step, create a file to run your script. I'll be using Ruby here, but feel free to use whatever language you fancy.

Next, let's grab all the articles I've written. Feel free to change username to match your dev.to username.

#!/usr/bin/env ruby

require "json"
require "net/http"
require "time"

username = "konnorrogers"

json = JSON.parse(
  Net::HTTP.get(URI("https://dev.to/api/articles?username=#{username}"))
)

Done right?!

Not quite! We still need to loop through all of our articles and gather the content we need. To help with exploring the dev.to API, you can write the returned JSON to a file like this:

filename = "./articles.json"
File.write(filename,
  JSON.pretty_generate(
    JSON.parse(
      Net::HTTP.get(URI("https://dev.to/api/articles?username=#{username}"))
    )
  ).to_s
)

I already did this part and know what I need for my Bridgetown site, but feel free to use the snippet above for exploring the API. We use pretty_generate on the JSON so it's easier to read.
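One thing worth checking while you explore: as far as the Forem API docs go, the articles endpoint is paginated through page and per_page query parameters and returns 30 articles per page by default, so a single call may not return everything. Here is a minimal sketch of fetching every page. The fetch_all_articles helper, its pause and fetch arguments, and the choice of 100 per page are assumptions for illustration, not part of the original script.

```ruby
require "json"
require "net/http"

# Hypothetical helper (not in the original script): collects every page of a
# user's articles. `fetch` is injectable so the loop can be tried offline.
def fetch_all_articles(username, per_page: 100, pause: 1, fetch: ->(uri) { Net::HTTP.get(uri) })
  articles = []
  page = 1

  loop do
    # Assumes the endpoint accepts `page` and `per_page` query parameters.
    query = URI.encode_www_form(username: username, page: page, per_page: per_page)
    batch = JSON.parse(fetch.call(URI("https://dev.to/api/articles?#{query}")))
    articles.concat(batch)

    # A short (or empty) page means there is nothing left to fetch.
    break if batch.size < per_page

    page += 1
    sleep pause # stay polite between API calls
  end

  articles
end
```

With something like this in place, json = fetch_all_articles(username) could stand in for the single Net::HTTP.get call.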

Anyways, looking through the data returned, we don't get back body_markdown, which holds the raw markdown for our posts. What we need to do is loop through all of our articles and make a second API call for each one to grab its body_markdown property.

Here we go:

json.each do |obj|
  title = obj["title"]

  # turn anything that's not a number or letter into a hyphen, then squash repeated hyphens into 1
  # Example: 
  #   "How can I pull my data from dev.to?"
  #=> "how-can-i-pull-my-data-from-dev-to"
  file_title = title.downcase.gsub(/[^0-9a-z]/i, "-").split(/-+/).join("-")

  # Produces a string like this: 2023-06-20 17:24:40 -0400
  date = Time.parse(obj["published_at"]).to_s

  # Pulls only yyyy-mm-dd
  file_date = date.split(" ").first

  # produces a path like this:
  # "src/_posts/2023-06-20-pulling-your-dev-to-posts-down-locally.md"
  file_path = "src/_posts/#{file_date}-#{file_title}.md"

  # don't waste an API call!
  next if File.exist?(file_path)

  description = obj["description"]

  # Comma separated string
  categories = obj["tags"]

  article_path = obj["path"]
  article_url = URI("https://dev.to/api/articles#{article_path}")

  # One second seems to be the secret sauce to get around rate limiting.
  sleep 1

  # We can't get the info we need from the initial API call so we need to go to the article_url
  # to get the raw markdown.
  article_json = JSON.parse(Net::HTTP.get(article_url))
  body_markdown = article_json["body_markdown"]

  # Build the YAML front matter, then append the raw markdown.
  content = "---\n"
  content << "title: \"#{title}\"\n"
  content << "categories: #{categories}\n"
  content << "date: #{date}\n"
  content << "description: |\n  #{description}\n"
  content << "---\n\n"
  content << body_markdown

  File.write(file_path, content, mode: "w")
end
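Building the front matter by hand works until a title contains a double quote or a description runs over several lines. A minimal sketch of one alternative is to let Ruby's standard yaml library do the quoting. The build_post helper below is hypothetical and not what the script above does; it only assumes the same five values the loop already collects.

```ruby
require "yaml"

# Hypothetical helper: builds the front matter plus body for one post.
def build_post(title:, categories:, date:, description:, body_markdown:)
  front_matter = {
    "title" => title,
    "categories" => categories,
    "date" => date,
    "description" => description
  }

  # Hash#to_yaml starts with "---\n" and handles quoting and escaping,
  # so we only need to add the closing delimiter ourselves.
  "#{front_matter.to_yaml}---\n\n#{body_markdown}"
end
```

Inside the loop, the result could be passed straight to File.write(file_path, ...) in place of the hand-built content string.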

Now let's run our script and watch the magic happen. This may take a while because dev.to seems to rate limit at about 1 API call per second, so if you have, say, 60 posts, it'll take roughly 1 minute to gather all your files.

ruby my-script.rb
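If the one-second pause ever turns out not to be enough, one option is to check the response status and retry. The sketch below assumes the API answers a rate-limited request with HTTP 429 (Too Many Requests); the get_with_retry helper, its attempt count, and its growing pause are illustrative choices, not something the script above does.

```ruby
require "json"
require "net/http"

# Hypothetical helper: GET a URI and retry with a growing pause whenever the
# response status is 429. `get` is injectable so it can be tried offline.
def get_with_retry(uri, attempts: 5, pause: 1, get: ->(u) { Net::HTTP.get_response(u) })
  attempts.times do |attempt|
    response = get.call(uri)

    # Anything other than 429 is handed back as-is, including other errors.
    return response.body unless response.code == "429"

    sleep pause * (attempt + 1)
  end

  raise "Still rate limited after #{attempts} attempts: #{uri}"
end
```

In the loop, JSON.parse(get_with_retry(article_url)) could then replace JSON.parse(Net::HTTP.get(article_url)).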

And here's what it pulled down for me!

src/_posts/2023-06-13-inserting-a-string-on-the-first-line-of-every-file-with-vim.md
src/_posts/2023-06-07-maintain-scroll-position-in-turbo-without-data-turbo-permanent.md
src/_posts/2023-05-30-button-to-vs-link-to-and-the-pitfalls-of-data-turbo-method.md
src/_posts/2023-05-22-rails-frontend-bundling-which-one-should-i-choose.md
src/_posts/2023-05-22-revisiting-box-sizing-best-practices.md
src/_posts/2023-04-08-how-to-keep-a-persistent-class-on-a-litelement.md
src/_posts/2022-11-22-jest-vitest-and-webcomponents.md
src/_posts/2022-10-20-actiontext-all-the-ways-to-render-an-actiontext-attachment.md
src/_posts/2022-10-10-actiontext-safe-listing-attributes-and-tags.md
src/_posts/2022-10-04-actiontext-modify-the-rendering-of-activestorage-attachments.md
src/_posts/2022-07-20-why-we-still-bundle-with-http-2-in-2022.md
src/_posts/2022-04-08-testing-scopes-with-rails.md
src/_posts/2022-04-07-adding-additional-actions-to-trix.md
src/_posts/2022-03-13-converting-a-callback-to-a-promise.md
src/_posts/2022-03-10-escaping-the-traditional-rails-form.md
src/_posts/2022-02-21-adding-text-alignment-to-trix.md
src/_posts/2022-01-29-modifying-the-default-toolbar-in-trix.md
src/_posts/2022-01-29-exploring-trix.md
src/_posts/2021-11-30-cross-browser-vertical-slider-using-input-type-range.md
src/_posts/2021-11-01-rebuilding-activestorage-first-impressions.md
src/_posts/2021-10-27-why-jest-is-not-for-me.md
src/_posts/2021-10-06-frontend-bundler-braindump.md
src/_posts/2021-07-08-writing-code-block-highlighting-to-a-css-file-with-rouge.md
src/_posts/2021-07-06-creating-reusable-flashes-in-rails-using-shoelace.md
src/_posts/2021-07-03-pulling-down-somebody-s-fork-with-git.md
src/_posts/2021-07-02-fixing-fatal-error-ineffective-mark-compacts-near-heap-limit-allocation-failed-javascript-heap-out-of-memory-in-webpacker.md
src/_posts/2021-07-02-migrating-hls-videos-to-mp4-format-in-rails.md
src/_posts/2021-06-25-querying-activestorage-attachments.md
src/_posts/2021-05-25-case-switch-statement-in-ruby.md
src/_posts/2021-05-15-arel-notes.md

Best of luck, and hopefully this gives you some motivation to dust off your self-hosted blog! I personally have been writing on dev.to because my old blog site is a 4-year-old Gatsby site that I have exactly 0 hope of ever getting running again. So here's to new beginnings! 🥂

Konnor Rogers
