Creating an HTML Tag Function - Part 1
Gabriel José
Posted on November 14, 2022
Have you ever used tagged template literals in JavaScript? Libraries like styled-components and Lit have them among their features. The functions used in these tagged template literals are called tag functions. In this tutorial I'm going to focus on implementing a tag function similar to the html tag function from Lit. PS: I'll be using TypeScript in this tutorial.
Create the function
A quick note on creating the tag function: I'm not going to dwell on how tag functions are created, and will focus more on what the function does. There are many ways to write it. At first I'll make it return plain strings, because that is simpler and it shows why this approach is not the best idea. After that, I'll show how to change it to return a single element or a Document Fragment with many elements.
Here we have a simple tag function that doesn't do anything special: it just returns the same string, concatenated in the correct order.
// Will be using `any` for the value type because it literally can be anything
function html(staticText: TemplateStringsArray, ...values: any[]) {
  const fullText = staticText.reduce((acc, text, index) => {
    // staticText always has one more item than values, so default to ''
    return acc + text + (values[index] ?? '')
  }, '')
  return fullText
}
With only this, you can already use it as a tagged template literal to create HTML text.
const paragraph = (content: string) => html`<p>${content}</p>`
html`<div>${paragraph('Hi')}</div>` // <div><p>Hi</p></div>
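To make the two parameters more concrete, here is a minimal sketch of what a tag function actually receives. The `inspect` function is a hypothetical helper used only for illustration; it is not part of the implementation in this post.

```typescript
// Hypothetical helper (an assumption for illustration only):
// returns the raw arguments that a tag function receives.
function inspect(staticText: TemplateStringsArray, ...values: unknown[]) {
  return { staticText: staticText.slice(), values }
}

const content = 'Hi'
const parts = inspect`<p>${content}</p>`
// parts.staticText holds the fixed pieces: ['<p>', '</p>']
// parts.values holds the interpolated values: ['Hi']
```

The static pieces always outnumber the values by one, which is why the reduce in the html function falls back to an empty string when `values[index]` is missing.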
This is nothing special; you can already do it with normal concatenation. But normal concatenation has a problem that the html tag function can solve: XSS. If you don't know exactly how it works, I'll give you an example.
Preventing XSS
Let's say we have a to-do list. You have an input that receives the user's value, and you set that value as the innerHTML of an <li> element without any sanitization. Using it this way makes your code vulnerable to an XSS attack: the user can write HTML in your input, and it gets parsed when it is set into the DOM.
<form>
  <input name="field" />
  <button>Submit</button>
</form>
<ul></ul>

const form = document.querySelector('form') as HTMLFormElement
const ul = document.querySelector('ul') as HTMLUListElement
form.addEventListener('submit', event => {
  event.preventDefault()
  const li = document.createElement('li')
  // Unsafe on purpose: the raw user input is parsed as HTML
  li.innerHTML = form.field.value
  ul.append(li)
  form.reset()
})
If you insert <img src="" onerror="alert('hi')" /> in the input, you'll see the code in the onerror attribute being executed. This can be called a DOM-based XSS.
Yes, in my example it can be solved with a simple change: instead of using innerHTML to set the element's content, use textContent. But in our case we will use the html tag function to set the element's inner HTML, so we need to prevent any HTML inserted by the user from being parsed. To do so, let's process the string of each interpolated value in the html tag function and replace the < and > characters with the corresponding HTML entities, &lt; and &gt;. This is enough to stop the inserted text from being parsed as HTML tags.
function html(staticText: TemplateStringsArray, ...values: any[]) {
  const fullText = staticText.reduce((acc, text, index) => {
    // Escape angle brackets so interpolated values cannot open or close tags
    const stringValue = String(values[index] ?? '')
      .replace(/</g, '&lt;')
      .replace(/>/g, '&gt;')
    return acc + text + stringValue
  }, '')
  return fullText
}
const form = document.querySelector('form') as HTMLFormElement
const ul = document.querySelector('ul') as HTMLUListElement
form.addEventListener('submit', event => {
  event.preventDefault()
  ul.innerHTML += html`
    <li>${form.field.value}</li>
  `
  form.reset()
})
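Replacing only < and > covers values placed between tags. If a value can also end up inside a quoted attribute, a quote character could still close the attribute early. As a hedged extension, not something the rest of this post relies on, a stricter escape could also cover &, " and '. The `escapeHtml` name below is hypothetical.

```typescript
// Hypothetical helper (an assumption, not the article's implementation):
// escapes the characters that are significant in HTML text and in
// quoted attribute values. The ampersand goes first so the entities
// produced by the later replacements are not escaped a second time.
function escapeHtml(value: unknown): string {
  return String(value ?? '')
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;')
}

const escaped = escapeHtml(`<img src="" onerror="alert('hi')" />`)
// escaped -> &lt;img src=&quot;&quot; onerror=&quot;alert(&#39;hi&#39;)&quot; /&gt;
```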
Great! But with this we create another problem. Let's use the previous example to show it.
const paragraph = (content: string) => html`<p>${content}</p>`
html`<div>${paragraph('Hi')}</div>` // <div><p>Hi</p></div>
If you open it in the browser, you'll see that it literally shows the text <p>Hi</p>, which is not the wanted behavior. So now we need to make sure that any content returned by the html tag function is not escaped again and is placed as valid HTML.
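The problem can be reproduced without a browser. The sketch below repeats the escaping version of the tag function and shows that the inner template's own markup gets escaped by the outer call.

```typescript
function html(staticText: TemplateStringsArray, ...values: unknown[]): string {
  return staticText.reduce((acc, text, index) => {
    const stringValue = String(values[index] ?? '')
      .replace(/</g, '&lt;')
      .replace(/>/g, '&gt;')
    return acc + text + stringValue
  }, '')
}

const paragraph = (content: string) => html`<p>${content}</p>`

// The inner call returns '<p>Hi</p>' as a plain string, so the outer
// call cannot tell it apart from user input and escapes it.
const nested = html`<div>${paragraph('Hi')}</div>`
// nested -> '<div>&lt;p&gt;Hi&lt;/p&gt;</div>'
```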
Again, this approach is not the only one possible. The first step to solve this problem is to have some way to identify this string as a special one. To do so, let's create a specific class for our string and return an instance of it.
// Marker class: lets us recognize strings produced by the html tag function
class HTMLString extends String {
  constructor(data: string) {
    super(data)
  }
}
function html(staticText: TemplateStringsArray, ...values: any[]) {
  const fullText = staticText.reduce((acc, text, index) => {
    const stringValue = String(values[index] ?? '')
      .replace(/</g, '&lt;')
      .replace(/>/g, '&gt;')
    return acc + text + stringValue
  }, '')
  return new HTMLString(fullText)
}
By doing so, you will notice that some parts of the existing code have errors, probably about the fact that we're returning an instance of HTMLString and not the string itself. But it's only a type-checking error; the code works correctly. But how? An instance of String is a bit different from the string itself, but JavaScript can get its primitive value via some special methods, which I won't talk about here, and it uses that value. In our case, we create a specific class to use as a reference, so we can know that the returned value comes from the html tag function. OK, but what about the type? Let's cast the instance of HTMLString to string; with that, TypeScript will continue to treat it as a string.
return new HTMLString(fullText) as string
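For the curious, the special methods in question are valueOf and toString, which JavaScript calls when it needs a primitive from a String object. The sketch below shows the three behaviors this approach relies on. It assumes a compile target of ES2015 or later, since subclassing built-ins like String is not reliable when classes are downleveled to ES5; `isMarked` is a hypothetical helper name.

```typescript
class HTMLString extends String {}

const wrapped = new HTMLString('<p>Hi</p>')

// 1. A String instance is an object, not a primitive string.
const kind = typeof wrapped // 'object'

// 2. Concatenation coerces it back to its primitive value.
const concatenated = '<div>' + wrapped + '</div>' // '<div><p>Hi</p></div>'

// 3. instanceof tells our marked strings apart from plain ones.
//    (isMarked is a hypothetical helper for this illustration.)
const isMarked = (value: unknown) => value instanceof HTMLString
```

One consequence worth keeping in mind: because the value is really an object, a strict comparison such as `wrapped === '<p>Hi</p>'` is false at runtime even after the cast to string. Convert with `String(wrapped)` first when you need to compare.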
Now let's introduce the code that differentiates the string values.
function getCorrectStringValue(value: any) {
  // Values produced by the html tag function are already safe: keep them as is
  if (value instanceof HTMLString) {
    return value
  }
  return String(value ?? '')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
}
function html(staticText: TemplateStringsArray, ...values: any[]) {
  const fullText = staticText.reduce((acc, text, index) => {
    const stringValue = getCorrectStringValue(values[index])
    return acc + text + stringValue
  }, '')
  return new HTMLString(fullText) as string
}
Yes, it's as simple as that. And with it, everything continues to work as expected.
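As a quick check, here is a self-contained sketch that puts the pieces together and exercises both cases: markup coming from a nested html call is kept, while markup coming from a plain string is escaped. It is a condensed restatement of the code above, using `unknown` instead of `any`, not a different implementation.

```typescript
class HTMLString extends String {}

function getCorrectStringValue(value: unknown): string | HTMLString {
  if (value instanceof HTMLString) {
    return value
  }
  return String(value ?? '')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
}

function html(staticText: TemplateStringsArray, ...values: unknown[]) {
  const fullText = staticText.reduce((acc, text, index) => {
    return acc + text + getCorrectStringValue(values[index])
  }, '')
  return new HTMLString(fullText) as string
}

const paragraph = (content: string) => html`<p>${content}</p>`

// Nested template: the inner <p> is preserved.
const trusted = String(html`<div>${paragraph('Hi')}</div>`)
// trusted -> '<div><p>Hi</p></div>'

// User-like input: the <b> tags are escaped, the inner <p> is not.
const untrusted = String(html`<div>${paragraph('<b>x</b>')}</div>`)
// untrusted -> '<div><p>&lt;b&gt;x&lt;/b&gt;</p></div>'
```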
Conditional values
Sometimes we want to use a short condition to show something on screen, and with our current implementation it may not work as expected. Let's create the following scenario: when the user types checked: before the item name, the item is added as checked. This would never be a feature in a real-world scenario, but it works well for our example.
First, create this CSS class.
.checked {
  text-decoration: line-through;
}
const form = document.querySelector('form') as HTMLFormElement
const ul = document.querySelector('ul') as HTMLUListElement
form.addEventListener('submit', event => {
  event.preventDefault()
  const { value } = form.field as HTMLInputElement
  const isChecked = value.startsWith('checked:')
  ul.innerHTML += html`
    <li class="${isChecked && 'checked'}">
      ${value.replace('checked:', '')}
    </li>
  `
  form.reset()
})
When someone types something like “checked:go to market”, it works fine: the li gets the checked class and the text is shown with a line through it. But when the text is not typed this way, the li gets the class false, which doesn't break anything here but can lead to unexpected behavior in other cases. To fix it, we could switch to a ternary, like this.
isChecked ? 'checked' : ''
But this is not great when you have to do it many times. So we can handle this situation in our getCorrectStringValue function.
function getCorrectStringValue(value: any) {
  if (value instanceof HTMLString) {
    return value
  }
  // Render nothing for undefined, null and false, but keep 0 and ''
  const isUnwantedValue = value === undefined
    || value === null
    || value === false
  if (isUnwantedValue) {
    return ''
  }
  return String(value)
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
}
I added the isUnwantedValue constant, which checks whether the value is one of the values we don't want to render. We need to check for them specifically, rather than testing for any falsy value, because we don't want 0 (zero) to be converted to an empty string. By the way, dropping false can be a bit inconvenient for some people too, so it's up to you whether you want it rendered.
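The behavior can be checked without the DOM. The sketch below restates the functions in a self-contained form and adds a hypothetical `item` helper that builds the same li markup as the submit handler, on a single line to keep the output easy to read.

```typescript
class HTMLString extends String {}

function getCorrectStringValue(value: unknown): string | HTMLString {
  if (value instanceof HTMLString) {
    return value
  }
  if (value === undefined || value === null || value === false) {
    return ''
  }
  return String(value)
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
}

function html(staticText: TemplateStringsArray, ...values: unknown[]) {
  const fullText = staticText.reduce((acc, text, index) => {
    return acc + text + getCorrectStringValue(values[index])
  }, '')
  return new HTMLString(fullText) as string
}

// Hypothetical helper mirroring the markup built in the submit handler.
const item = (value: string) => {
  const isChecked = value.startsWith('checked:')
  return html`<li class="${isChecked && 'checked'}">${value.replace('checked:', '')}</li>`
}

const checkedItem = String(item('checked:go to market'))
// checkedItem -> '<li class="checked">go to market</li>'

const plainItem = String(item('buy milk'))
// plainItem -> '<li class="">buy milk</li>'

// Zero is a wanted value and is still rendered.
const zero = String(html`<span>${0}</span>`)
// zero -> '<span>0</span>'
```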
These are the basic features of the html tag function. There are other features, like attaching events or setting an element's property directly instead of its HTML attribute. These can be done with strings, but it takes a lot of work and has the disadvantage that every change to an element must be made as it enters the DOM. With small elements that's not a problem, but it becomes one when you need to look through other strings recursively and keep all your data until the moment it is appended to a real element. To summarize, we will change our tag function to work with Document Fragments, which will lead to new problems to solve, but its benefits are worth the cost.
I hope you liked the content. If you have any questions, leave a comment below, and see you in the next post.