Aditya Nagla
Posted on April 22, 2020
Hello there, welcome to part 2 of my series on building a custom docs project. Since I'm a DRY kind of person, I won't repeat myself here: see part 1 of this series for the introduction and problem statement of this project.
In this tutorial, we'll learn how to convert HTML code to Markdown text using the reverse_markdown gem. We'll also learn how to define a custom HTML tag (<bold></bold>) and parse it with this gem.
Note: This part is not something that we'll implement in our docs project. I wrote it to show that you can do something like this in your future projects.
Getting Started
For this tutorial, you'll need Ruby installed on your machine. You can use this GoRails guide to set up Ruby.
First, create a new folder and cd into it:
$ mkdir htmltomd
$ cd htmltomd
Install the reverse_markdown gem:
- First, create a file named Gemfile:
$ touch Gemfile
- Then add the reverse_markdown gem to this file:
# Gemfile
source 'https://rubygems.org'
git_source(:github) { |repo| "https://github.com/#{repo}.git" }
# Your ruby version
## It's a good practice to mention it here!
ruby '2.6.5'
gem 'reverse_markdown', '~> 2.0'
Install the dependencies:
$ bundle install
Now create four files:
- config.rb: We'll write our Ruby code logic in this file.
- bold_converter.rb: We'll use a custom tag, <bold></bold>, in our example. reverse_markdown doesn't know about this tag, so we'll add the conversion logic here.
- index.html: We'll write our HTML code in this file.
- hello.md: The Markdown output of the converted HTML will be written to this file.
$ touch config.rb
$ touch bold_converter.rb
$ touch index.html
$ touch hello.md
Codebase:
Let's start writing the code:
First, open up your favorite editor and write some sample HTML code in the index.html file:
<h1>Hello world</h1>
<p>Hey there</p>
<hr />
<blockquote>Let there be the end</blockquote>
<table>
  <tr>
    <td>Hello 1</td>
    <td>Hello 2</td>
  </tr>
  <tr>
    <td>Body 1</td>
    <td>Body 2</td>
  </tr>
  <tr>
    <td>Body 3</td>
    <td>Body 4</td>
  </tr>
</table>
<bold>This is bold</bold>
Now, open the config.rb file and write our logic:
# First, we need to import `reverse_markdown` library
require 'reverse_markdown'
# Then import `bold_converter.rb` (we'll write it in a later step)
require './bold_converter'
# Now let us open our `index.html` and `hello.md` files:
## `index.html` is opened in read-only mode ('r')
## `hello.md` is opened in read-write mode ('w+'), which truncates the file
### This means that every time we run the script,
### it clears the `hello.md` file and writes the new output.
index_html_file = File.open('./index.html', 'r')
hello_md_text_file = File.open('./hello.md', 'w+')
# Read the contents of `index.html` file
html_code = index_html_file.read
# Register our custom `<bold></bold>` tag (the converter is written in a later step)
ReverseMarkdown::Converters.register :bold, ReverseMarkdown::Converters::Bold.new
# Config for reverse_markdown
ReverseMarkdown.config do |config|
  ## If there is an unknown tag, raise an error
  config.unknown_tags = :raise
  config.github_flavored = true
end
# Now, convert the HTML code to MD
md_text = ReverseMarkdown.convert(html_code)
# Now, inject the MD text in `hello.md` file
hello_md_text_file.puts md_text
# Now, close the files
index_html_file.close
hello_md_text_file.close
- This was simple; everything is explained in the code comments above. To summarize:
  - We first import the reverse_markdown gem.
  - Then we import our bold_converter.rb file (see the next step).
  - Then we open our index.html and hello.md files.
  - We then register our new Bold converter, configure reverse_markdown, and convert our HTML code to Markdown.
  - We write the converted Markdown text to the hello.md file.
  - Finally, we close both files as a good practice.
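- If you run the script now, you'll see an error because the Bold converter class that config.rb registers for the <bold></bold> tag is not yet defined. Even without the register call, the conversion would fail, because we've set config.unknown_tags = :raise to raise an error whenever reverse_markdown meets a tag it has no converter for. For details on the other options, see the docs. We'll define the converter now!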
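As a side note on the file handling in config.rb: opening and closing the files by hand works, but Ruby can also close them for you. The sketch below is only an illustration, not part of the tutorial's script. The helper name write_markdown is hypothetical, and the demo writes to a temporary directory so it doesn't touch your project files.

```ruby
require 'tmpdir'

# Hypothetical helper (not part of the original script): writes text to path
# using the block form of File.open, which closes the file when the block
# ends, even if an exception is raised inside it.
def write_markdown(path, text)
  File.open(path, 'w') { |file| file.puts text }
end

# Demo in a throwaway directory so no real files are touched.
Dir.mktmpdir do |dir|
  path = File.join(dir, 'hello.md')
  write_markdown(path, '# Hello world')
  puts File.read(path) # File.read opens, reads, and closes the file for us
end
```

In config.rb, the same idea would mean reading the input with File.read('./index.html') and writing the result with File.write('./hello.md', md_text), which removes the need for the two explicit close calls.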
Now, open the bold_converter.rb file and write our custom tag logic:
# bold_converter.rb
module ReverseMarkdown
  module Converters
    class Bold < Base
      # This is the main convert logic
      def convert(node, state = {})
        content = treat_children(node, state.merge(already_bolded_out: true))
        if content.strip.empty? || state[:already_bolded_out]
          content
        else
          " **#{content}**"
        end
      end
    end
  end
end
- The treat_children method converts all child nodes. We pass it already_bolded_out: true so that any <bold></bold> tag nested inside this one knows it is already within bold text and returns its content unchanged, instead of adding a second pair of **.
- If the content is empty, we don't wrap anything either.
- Otherwise, we wrap the content in **.
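To see what the already_bolded_out flag buys us, here is a toy model of the same idea that runs without the gem. It is only a sketch: Node and render are hypothetical names invented for this illustration and are not reverse_markdown APIs, and the leading space from our real converter is left out to keep the output easy to read.

```ruby
# Toy model only: Node and render are hypothetical, not reverse_markdown APIs.
Node = Struct.new(:tag, :children)

# Renders a small node tree to Markdown-like text,
# wrapping only the outermost <bold> in **.
def render(node, state = {})
  return node if node.is_a?(String) # plain text leaf

  # Children of a bold node are told they are already inside bold text.
  inner_state = node.tag == :bold ? state.merge(already_bolded_out: true) : state
  content = node.children.map { |child| render(child, inner_state) }.join

  if node.tag != :bold || content.strip.empty? || state[:already_bolded_out]
    content # nothing to wrap, or an enclosing bold tag wraps it for us
  else
    "**#{content}**"
  end
end

nested = Node.new(:bold, ['outer ', Node.new(:bold, ['inner'])])
puts render(nested) # prints **outer inner**
```

Without the flag, the inner tag would add its own ** and the result would contain doubled markers in the middle of the text.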
Running the script:
- Run the script by using this command:
$ ruby config.rb
Now, in the hello.md file, you'll see the Markdown text converted from the HTML.
Summary and conclusion
Let's take a quick look at what we learned today:
- First of all, I introduced myself, so don't forget to connect with me on Twitter or elsewhere.
- Since I'm a DRY person, I referred to my part 1 article where you can see the introduction and problem statement.
- Then we saw what the prerequisites are, and we also set up and understood our file structure for this tutorial.
- We then wrote our Ruby code logic to solve our problem, i.e., to convert HTML code to Markdown text and write it into a hello.md file.
- Finally, we took a quick look at this very summary...
That's it, and let there be the end. 🙏