Databricks introduction
Rajnish
Posted on November 11, 2024
Databricks
Databricks is a unified, open analytics platform for building, deploying, sharing, and maintaining data, analytics, and AI solutions at scale.
Clusters
A cluster is a collection of virtual machine (VM) instances, with computational workloads distributed across its worker nodes.
There are two types of clusters:
All-Purpose Clusters | Job Clusters |
---|---|
Analyze data collaboratively using interactive notebooks | Run automated jobs |
Created manually from the workspace or through the API | Created by the Databricks job scheduler when a job runs |
Configuration information is retained for up to 70 all-purpose clusters terminated in the last 30 days | Configuration information is retained for up to 30 of the most recently terminated job clusters |
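
To make the difference in how the two cluster types are created more concrete, here is a minimal sketch that builds the JSON request bodies only and sends nothing over the network. It assumes the Clusters REST API (`POST /api/2.0/clusters/create`) for an all-purpose cluster and the Jobs REST API (`POST /api/2.1/jobs/create`) for a job whose cluster the scheduler creates at run time. The function names, cluster name, notebook path, and the `<runtime-version>` and `<node-type-id>` placeholders are illustrative assumptions and are not from the original post. Valid values depend on your workspace and cloud provider.

```python
import json

# Placeholders (assumptions): real values depend on the workspace and cloud.
SPARK_VERSION = "<runtime-version>"
NODE_TYPE_ID = "<node-type-id>"


def all_purpose_cluster_payload(name, num_workers):
    """Request body for creating an all-purpose cluster via the Clusters API."""
    return {
        "cluster_name": name,
        "spark_version": SPARK_VERSION,
        "node_type_id": NODE_TYPE_ID,
        "num_workers": num_workers,
        # Shut the cluster down after 60 idle minutes to limit cost.
        "autotermination_minutes": 60,
    }


def job_with_job_cluster_payload(job_name, notebook_path, num_workers):
    """Request body for creating a job via the Jobs API.

    The cluster is described under "new_cluster"; the job scheduler
    creates it when the job runs, rather than the user creating it up front.
    """
    return {
        "name": job_name,
        "tasks": [
            {
                "task_key": "main",
                "notebook_task": {"notebook_path": notebook_path},
                "new_cluster": {
                    "spark_version": SPARK_VERSION,
                    "node_type_id": NODE_TYPE_ID,
                    "num_workers": num_workers,
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical names and paths, for illustration only.
    print(json.dumps(all_purpose_cluster_payload("shared-notebooks", 2), indent=2))
    print(json.dumps(job_with_job_cluster_payload("nightly-etl", "/Shared/etl", 4), indent=2))
```

The all-purpose payload names a long-lived cluster that users attach notebooks to, while the job payload only describes a cluster inside the task definition, which matches the table: the user creates the first kind, the scheduler creates the second.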