Don't duplicate your data - Learnings from code reviews
Johannes Kettmann
Posted on September 17, 2020
Handling data properly can be difficult. We have to fetch it from APIs. We have to aggregate it with data from other sources. And we have to transform it efficiently to use in our UIs.
In the past months, I conducted many code reviews for junior developers during this new course. I was surprised to see one mistake over and over again. A mistake that can lead to nasty bugs that are very hard to debug.
This post is about data duplication and its antidote: a single source of truth.
Before I explain what that means, let's have a look at a code example.
Duplicated data
The following component renders a list of blog posts that it receives from its parent.
A user can select a filter to show only posts that were created on a particular day. The component filters and renders the provided posts accordingly.
const PostList = ({ posts }) => {
  const [selectedDay, setSelectedDay] = useState(null);
  const [filteredPosts, setFilteredPosts] = useState(posts);

  const onChangeDay = (day) => {
    setSelectedDay(day);
    const postsForDay = posts.filter(
      (post) => isSameDay(post.createdAt, day)
    );
    setFilteredPosts(postsForDay);
  };

  return (
    <Wrapper>
      <Filter
        selectedDay={selectedDay}
        onChangeDay={onChangeDay}
      />
      {
        filteredPosts.map((post) => (
          <Post key={post.id} {...post} />
        ))
      }
    </Wrapper>
  );
};
To implement the filtering, the selected day is stored in a state variable. Next to the selected day, we find another state variable that holds the filtered posts.
This filteredPosts array is then rendered below. It is updated whenever the selected day changes inside the onChangeDay callback.
Maybe you realized the problem with this approach: the filteredPosts state is just a subset of the posts prop. We duplicate part of the posts array and thus store the data in two different places.
Ok, true.
But what's the problem here?
We have to keep the duplicates in sync with the originals.
Imagine the following situation: The parent component allows the user to edit a post. The user decides to change the title of a post from "Data duplication rocks!" to "Data duplication sucks!".
What would happen now?
- The parent component re-renders with the updated posts array.
- The PostList component re-renders with the updated posts prop.
So far so good. But remember what the component looks like:
const PostList = ({ posts }) => {
  const [selectedDay, setSelectedDay] = useState(null);
  const [filteredPosts, setFilteredPosts] = useState(posts);

  const onChangeDay = (day) => { ... };

  return (
    <Wrapper>
      <Filter ... />
      {
        filteredPosts.map((post) => (
          <Post key={post.id} {...post} />
        ))
      }
    </Wrapper>
  );
};
The PostList component actually displays the data from the filteredPosts array. And this is a subset of the old version of the posts prop. useState only uses its argument as the initial value on the first render, so filteredPosts doesn't change when the posts prop does.
That means that the UI would still display the old post with its outdated title "Data duplication rocks!".
The problem is that we only update one version of the post. Our filteredPosts array is out of sync.
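The same effect can be reproduced outside of React. The following is a minimal sketch in plain JavaScript, not the component itself: the local isSameDay function is a stand-in for the helper used above (here it compares UTC calendar days), and the sample posts are made up for illustration.

```javascript
// Stand-in for the article's isSameDay helper: compares UTC calendar days.
const isSameDay = (a, b) =>
  new Date(a).toISOString().slice(0, 10) === new Date(b).toISOString().slice(0, 10);

// The original data, owned by the "parent" (sample data, assumed for this sketch).
let posts = [
  { id: 1, title: "Data duplication rocks!", createdAt: "2020-09-17T08:00:00Z" },
  { id: 2, title: "Another post", createdAt: "2020-09-16T08:00:00Z" },
];

// The duplicate: computed once when the user picks a day, then stored.
const selectedDay = "2020-09-17T00:00:00Z";
const filteredPosts = posts.filter((post) => isSameDay(post.createdAt, selectedDay));

// The parent edits the title immutably: a new array with a new post object.
posts = posts.map((post) =>
  post.id === 1 ? { ...post, title: "Data duplication sucks!" } : post
);

console.log(posts[0].title);         // "Data duplication sucks!"
console.log(filteredPosts[0].title); // "Data duplication rocks!" (stale copy)
```

Nothing tells the stored copy that the original changed, so the two drift apart until someone remembers to recompute it.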
A single source of truth
What would a better version of our component look like?
We wouldn't copy the data into another state variable. We would try to use only one source: the posts prop. A single source of truth.
function PostList({ posts }) {
  const [selectedDay, setSelectedDay] = useState(null);

  // Derived on every render. While no day is selected we show all posts,
  // which matches the initial state of the first version.
  const filteredPosts = selectedDay
    ? posts.filter((post) => isSameDay(post.createdAt, selectedDay))
    : posts;

  return (
    <Wrapper>
      <Filter
        selectedDay={selectedDay}
        onChangeDay={setSelectedDay}
      />
      {
        filteredPosts.map((post) => (
          <Post key={post.id} {...post} />
        ))
      }
    </Wrapper>
  );
}
See how we were able to get rid of the filteredPosts state and replace it with a normal variable?
This version is simpler and less likely to introduce a bug.
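To see why deriving the data avoids the stale-title bug, here is a small sketch in plain JavaScript. The getVisiblePosts function is a hypothetical helper, not part of the component; the local isSameDay is a stand-in that compares UTC calendar days; and returning all posts while no day is selected is an assumption of this sketch.

```javascript
// Stand-in for the article's isSameDay helper: compares UTC calendar days.
const isSameDay = (a, b) =>
  new Date(a).toISOString().slice(0, 10) === new Date(b).toISOString().slice(0, 10);

// Hypothetical helper: derive the visible posts from the source on every call.
// Returning all posts when no day is selected is an assumption of this sketch.
function getVisiblePosts(posts, selectedDay) {
  if (selectedDay === null) return posts;
  return posts.filter((post) => isSameDay(post.createdAt, selectedDay));
}

// Sample data, made up for illustration.
let posts = [
  { id: 1, title: "Data duplication rocks!", createdAt: "2020-09-17T08:00:00Z" },
  { id: 2, title: "Another post", createdAt: "2020-09-16T08:00:00Z" },
];
const selectedDay = "2020-09-17T00:00:00Z";

const before = getVisiblePosts(posts, selectedDay);

// The parent edits the title immutably, as in the scenario above.
posts = posts.map((post) =>
  post.id === 1 ? { ...post, title: "Data duplication sucks!" } : post
);

// Deriving again from the single source picks up the change automatically.
const after = getVisiblePosts(posts, selectedDay);

console.log(before[0].title); // "Data duplication rocks!"
console.log(after[0].title);  // "Data duplication sucks!"
```

Because nothing is stored, there is nothing to keep in sync: every call (or render) starts from the current posts.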
In case you're worried about performance implications, you might be right. The filtering now runs on every render, so if the posts array is very long or the filtering is complicated, the app might become slow.
But in that case, we could simply make use of the useMemo hook.
const filteredPosts = useMemo(
  () => selectedDay
    ? posts.filter((post) => isSameDay(post.createdAt, selectedDay))
    : posts,
  [posts, selectedDay]
);
The useMemo hook returns a memoized value. The provided function is only run when one of the dependencies changes.
This means that the filtering in the above example is only run when the posts array or the selected day changes. If the component is re-rendered but both stay the same, useMemo simply returns the memoized value and doesn't need to execute the expensive filtering logic again.
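The idea behind this kind of memoization can be sketched in plain JavaScript. The createMemo function below is a hypothetical one-slot cache written for illustration, not React's implementation; it only mirrors the documented behavior that dependencies are compared with Object.is.

```javascript
// Hypothetical one-slot memoizer (NOT React's implementation): it recomputes
// only when a dependency differs from the previous call, compared with Object.is.
function createMemo() {
  let lastDeps = null;
  let lastValue;
  return (compute, deps) => {
    const changed =
      lastDeps === null ||
      deps.length !== lastDeps.length ||
      deps.some((dep, i) => !Object.is(dep, lastDeps[i]));
    if (changed) {
      lastValue = compute();
      lastDeps = deps;
    }
    return lastValue;
  };
}

// Sample data and a counter to observe how often the "expensive" work runs.
let runs = 0;
const memo = createMemo();
const posts = [{ id: 1 }, { id: 2 }];
const expensiveFilter = () => {
  runs += 1;
  return posts.filter((post) => post.id > 1);
};

const first = memo(expensiveFilter, [posts, "2020-09-17"]);
const second = memo(expensiveFilter, [posts, "2020-09-17"]); // same deps: cached value
const third = memo(expensiveFilter, [posts, "2020-09-18"]);  // day changed: recomputed

console.log(runs);             // 2
console.log(first === second); // true
```

Note that the comparison is by identity, so memoization only pays off if the parent passes the same posts array between renders and creates a new one only when the data actually changes.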
Exercise time
Here is another example that could benefit from some simplification.
function Books() {
  const [data, setData] = useState(null);
  const [books, setBooks] = useState([]);

  useEffect(() => {
    fetchData().then((data) => setData(data));
  }, []);

  useEffect(() => {
    if (!data) {
      return;
    }

    const mappedBooks = mapBooks(data);
    setBooks(mappedBooks);
  }, [data]);

  return (
    <div>
      {
        books.map((book) => (
          <div key={book.id}>{book.title}</div>
        ))
      }
    </div>
  );
}
I leave it to you as an exercise to find the problem and refactor this component to use a single source of truth.
Get the refactored code and an explanation of the component above
Follow the link above and drop your email to get my solution to the exercise. I'll also walk you through the original code and explain in detail what it does.