Moving Mongo Out of the Container | MongoDB Atlas Hackathon 2022 on DEV
Alex Antsiferov
Posted on December 7, 2022
What I built
I have a pet project that I started some time ago, while studying programming. It's a Telegram bot for people learning German, called Dasbot.
I'm pretty proud of its daily audience of a few hundred users who have collectively answered more than 300k quiz questions, but I must confess: until now its database has been residing in a Docker container. Like, not even on a mounted volume.
This hackathon motivated me to fix this gruesome mistake.
Also, now that I know about change streams, I can display some real-time stats on the bot's web page, yay!
Category Submission:
No idea! Just wanted to share some life lessons :)
App Link
You're welcome to use the bot and answer its questions! (Especially if you're struggling with German like I am.) If it annoys you, just ban it.
Screenshots
Description
The German language is difficult! Its grammatical genders are especially terrible: they defy any logic, so you just have to memorize them.
Dasbot actually helps you do this, with a simple spaced repetition algorithm.
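As a rough illustration of what "simple spaced repetition" can mean, here is a minimal Leitner-style sketch in Python. It is not Dasbot's actual algorithm: the interval table, the `review` function, and the score range are all assumptions made up for this example.

```python
from datetime import datetime, timedelta

# Hypothetical review intervals, indexed by score: a higher score means
# the word is known better and comes back later.
INTERVALS = [
    timedelta(minutes=10),
    timedelta(days=1),
    timedelta(days=3),
    timedelta(days=7),
    timedelta(days=30),
]


def review(score: int, correct: bool, now: datetime) -> tuple[int, datetime]:
    """Return the new score and the time of the next review."""
    if correct:
        # Move one step up, but never past the last interval.
        new_score = min(score + 1, len(INTERVALS) - 1)
    else:
        # A wrong answer sends the word back to the start.
        new_score = 0
    return new_score, now + INTERVALS[new_score]


now = datetime(2022, 12, 7, 12, 0)
score, due = review(0, True, now)        # correct answer: score goes up
score, due = review(score, False, due)   # wrong answer: score resets to 0
```

The idea is simply that words you keep getting right show up less and less often, while a mistake brings the word back quickly.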
It's written in Python, because I was studying Python at that time.
And it uses MongoDB for the database, because I didn't need much structure in my documents.
(There should be a photo of my desk here, covered with all the bureaucratic papers they send you twice a day here in Germany.)
In the database I keep everyone's scores needed for the repetition system. I also collect stats (user, word, answer, time) -- there could be some useful insights in there.
Link to Source Code
https://github.com/wetterkrank/dasbot -- main app
https://github.com/wetterkrank/dasbot-docs-live -- web app with the new /stats page
Permissive License
Background
So, I used Docker.
It's a great tool! And I guess it's ok for a study project to spawn a database in a container. But when you do it in "production", you start collecting some gotchas. Here's a couple of mine.
mongo:
  ports:
    - "0.0.0.0:27017:27017"
-- this was a part of my docker-compose.yml.
After the launch, everything worked fine for a few days, and then I found my database empty!
I checked the Mongo logs and found some dropDatabase calls coming from unknown IPs. Hacked! But how!? I knew my ufw rules by heart! What I didn't know is that Docker manages its own iptables rules and will not be trammelled by a mere firewall.
So when you expose the port using 0.0.0.0, you share it with a world full of people with port scanners.
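To make the gotcha concrete, here is a small Python sketch that flags Compose short-syntax port mappings published on all interfaces. The helper name `is_publicly_published` is made up for this example, and it only handles IPv4 host addresses. Note that a mapping with no host IP at all, such as 27017:27017, is also published on all interfaces by default.

```python
def is_publicly_published(mapping: str) -> bool:
    """Return True if a Compose short-syntax port mapping is published
    on all network interfaces (IPv4 only, for simplicity)."""
    spec = mapping.split("/")[0]  # drop an optional "/tcp" or "/udp" suffix
    parts = spec.split(":")
    if len(parts) == 3:
        # HOST_IP:HOST_PORT:CONTAINER_PORT -- public only if bound to 0.0.0.0
        return parts[0] == "0.0.0.0"
    if len(parts) == 2:
        # HOST_PORT:CONTAINER_PORT -- no host IP means all interfaces
        return True
    raise ValueError(f"unsupported port mapping: {mapping!r}")


for m in ["0.0.0.0:27017:27017", "27017:27017", "127.0.0.1:27017:27017"]:
    print(m, "->", "public" if is_publicly_published(m) else "local only")
```

Binding to 127.0.0.1 keeps the port reachable from the host only. If only other containers in the same Compose project need the database, publishing the port may not be necessary at all, since services on the same Compose network can reach each other by service name.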
Fast forward to this November. I just updated a config setting and decided to restart the containers manually.
Then I pinged the bot and was slightly surprised that it didn't recognise me. So I looked at the db collections... interesting... 0 documents...
After scrolling up the shell history, I noticed that I had typed docker-compose down instead of docker-compose stop. The down command removes the containers, and with no mounted volume the database went with them. There goes my data! Luckily, I had a backup.
How I built it
As for moving to Atlas: this part was simple!
I would have loved to use the live migration service, but I decided to start with an M0 cluster, so I didn't have the opportunity and just used mongorestore instead:
DB_CONTAINER="dasbot_db"
RESTORE_URI="mongodb+srv://$DB_USERNAME:$DB_PASSWORD@mydb.smth.mongodb.net/"
echo "Piping mongodump to mongorestore with Atlas as destination..."
docker exec $DB_CONTAINER mongodump --db=dasbot --archive | mongorestore --archive --drop --uri="$RESTORE_URI"
One notable hiccup was the speed of mongorestore -- a pitiful 50 MB of data took several minutes to load! However, increasing the number of workers (numInsertionWorkersPerCollection) helped.
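For reference, here is a hedged Python sketch of the same dump-and-restore pipe with the worker count raised. The `build_commands` and `run_pipe` helpers and the value of 4 workers are assumptions for illustration; the post does not say which value was actually used.

```python
import subprocess


def build_commands(container: str, db: str, uri: str, workers: int = 4):
    """Build argv lists for 'mongodump | mongorestore' (nothing is run here)."""
    dump = ["docker", "exec", container, "mongodump", f"--db={db}", "--archive"]
    restore = [
        "mongorestore",
        "--archive",
        "--drop",
        f"--uri={uri}",
        # mongorestore uses 1 insertion worker per collection by default;
        # the value passed here is an arbitrary example.
        f"--numInsertionWorkersPerCollection={workers}",
    ]
    return dump, restore


def run_pipe(dump: list[str], restore: list[str]) -> int:
    """Stream the dump archive straight into mongorestore.

    Returns a non-zero exit code if either process failed.
    """
    producer = subprocess.Popen(dump, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(restore, stdin=producer.stdout)
    producer.stdout.close()  # so mongodump gets SIGPIPE if mongorestore exits
    consumer_rc = consumer.wait()
    producer_rc = producer.wait()
    return consumer_rc or producer_rc


# Example (not executed here; the URI is a placeholder):
# dump, restore = build_commands("dasbot_db", "dasbot", "mongodb+srv://...")
# run_pipe(dump, restore)
```

Passing the arguments as lists avoids shell quoting issues with the connection string.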
For the change streams (real-time stats) exercise I had to refresh my knowledge of aggregation pipelines and write some JS code. I already mentioned the stats collection above; it can be used to build all kinds of reports.
So I've added a couple of triggers which are responsible for aggregating this data and publishing the updates to a separate database, and an Atlas app that lets users access this database anonymously.
// Scheduled to run twice per day
// Updates correct / incorrect counters in answers_total
exports = function() {
  const mongodb = context.services.get("DasbotData");
  const collection = mongodb.db("dasbot").collection("stats");
  const pipeline = [
    {
      $group: {
        _id: { $cond: [ { $eq: ["$correct", true] }, 'correct', 'incorrect' ] },
        count: { "$sum": 1 }
      }
    },
    {
      $out: { db: "dasbot-meta", coll: "answers_total" }
    }
  ];
  collection.aggregate(pipeline);
};
// This runs on every `stats` insert and updates the aggregated results
exports = function(changeEvent) {
  const db = context.services.get("DasbotData").db("dasbot-meta");
  const answers_total = db.collection("answers_total");
  const fullDocument = changeEvent.fullDocument;
  // The counter documents look like { _id: "correct" | "incorrect", count: <number> }
  const key = fullDocument.correct ? "correct" : "incorrect";
  // Upsert so the counter is created if it doesn't exist yet
  const options = { "upsert": true };
  answers_total.updateOne( { "_id": key }, { "$inc": { "count": 1 } }, options);
};
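To see how the two triggers fit together, here is a pure-Python sketch that mimics them on in-memory data. `recount` plays the role of the scheduled $group/$out job and `on_insert` the per-insert $inc upsert; both names and the sample documents are invented for this illustration, and nothing here talks to MongoDB.

```python
from collections import Counter


def recount(stats: list[dict]) -> dict[str, int]:
    """Full recount, like the scheduled $group + $out job."""
    totals = Counter(
        "correct" if doc.get("correct") is True else "incorrect"
        for doc in stats
    )
    return dict(totals)


def on_insert(totals: dict[str, int], doc: dict) -> None:
    """Incremental update, like updateOne with $inc and upsert."""
    key = "correct" if doc.get("correct") else "incorrect"
    totals[key] = totals.get(key, 0) + 1  # a missing counter starts at 0


stats = [{"correct": True}, {"correct": False}, {"correct": True}]
totals = recount(stats)

# A new answer arrives: bump the matching counter instead of recounting.
new_doc = {"correct": False}
stats.append(new_doc)
on_insert(totals, new_doc)

# The incrementally maintained totals match a fresh recount.
assert totals == recount(stats)
```

One plausible benefit of keeping both: the incremental update makes the page feel live, while the periodic recount would overwrite the counters with exact values if an insert event were ever missed.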
To display the data, I made a simple React app that uses the Realm Web SDK. Now, when someone answers the bot's question, you can immediately see it.
Additional Resources/Info
This tutorial was quite handy!