I passed the AWS Data Analytics Specialty Exam! (DAS-C01)
Kyle Escosia
Posted on July 10, 2021
While it's still fresh in my memory, I will share tips on how to pass the Data Analytics Specialty Exam! This exam tests your ability to design, build, secure, and maintain analytics solutions on AWS that are efficient, cost-effective, and secure.
This exam is definitely challenging and very detailed. I was lucky enough to be involved in various data lake projects using some of these AWS Services, so I had hands-on experience. But hey, anything can be learned, right?
Pre-requisites
I would definitely recommend taking the AWS Solutions Architect Associate Exam before this one. It gives you an overview of the principles of the AWS Cloud and makes it much easier to visualize how AWS Services work together.
Content Outline
Below are the AWS Services covered:
The exam will test you on different domains:
Check out the full Exam Guide here
Tips in General (study materials, practice exams, etc.)
Read the questions carefully. The exam runs 180 minutes, which is plenty of time to properly analyze your answers.
Look for keywords in the question, choose the most appropriate service for each keyword, and try to build and visualize the solution in your mind or on the whiteboard.
Abuse the Review Button! If you are unsure of your answer, don't spend too much time on it. Flag it and proceed with the next one.
Pre-exam, make sure you get a good sleep. Tbh, I was anxious going into the exam, so I drank a lot of coffee and ate chocolates lol. But as soon as I answered the first few questions, I gained confidence and kept going! I guess it worked out pretty well??
Again, this shouldn't be your first exam. AWS recommends having at least an associate-level certification first.
Practice Exam
I personally recommend Sir Jon Bonso's practice exams, which are available on Udemy and on Tutorials Dojo. In my opinion, these are the CLOSEST to the actual exam. I even got an exact scenario from the practice exam while I was taking the real one. So try your best to score high on these! Their explanations are also super helpful!
Additionally, their website hosts the AWS Service Cheat Sheets, so be sure to read through those!
Study Materials
Stephane Maarek and Frank Kane's Udemy course is very insightful. Be generous with your time on these study materials, as they will greatly help you in the exam. The course also includes a practice exam.
I'd also like to recommend CloudAcademy as a learning portal. They offer full courses for your certifications, including hands-on labs (using their service account)! So if you want to experience using a service without using your own account, definitely check them out!
re:Invent videos are also very helpful. These are the hidden gems from the re:Invent sessions:
Deep dive and best practices for Amazon Redshift - this helped me have a detailed understanding of what Redshift is
Building Serverless Analytics Pipelines with AWS Glue - a deep dive on Glue components
Serverless data preparation with AWS Glue - a more recent one, this discusses the updates that were rolled out in AWS Glue
High Performance Data Streaming with Amazon Kinesis: Best Practices
Big Data Analytics Architectural Patterns & Best Practices - this is a great discussion on how AWS Services integrate with each other
More AWS Videos can be found here at AWS Stash
Tips by Service
In this section, I will give tips on what to study for each service. Below is an outline.
Collect
- Amazon Kinesis
- AWS Database Migration Service (DMS)
- Amazon Simple Queue Service (SQS)
- AWS Snowball
- AWS IoT
- Amazon Managed Streaming for Apache Kafka (MSK)
- AWS Direct Connect
Storage
Processing
- AWS Lambda
- AWS Glue
- Amazon EMR
- AWS Lake Formation
- AWS Step Functions
- AWS Data Pipeline
- Other AWS Services
Analysis
- Amazon Kinesis Data Analytics
- Amazon Elasticsearch Service
- Amazon Athena
- Amazon Redshift
- Amazon SageMaker
Visualization
Security
Collection
For the most part, questions require you to know how to move data from one source to another, using the right tools in the right situation. Knowing the advantages and disadvantages of each one will help you answer those questions.
Amazon Kinesis
Amazon Kinesis has 4 capabilities, namely: Video Streams, Data Streams, Firehose, and Data Analytics.
I didn't get a question on Video Streams, so you only need to remember that it is used for streaming video data for analytics.
Most of the collection part revolves around Kinesis Data Streams and Kinesis Firehose. You need to KNOW how to differentiate the two, I can't stress this enough. Please study this one, as a lot of the answers involve either Data Streams or Firehose and you need to be able to tell which one fits.
There are also troubleshooting and scenario-based questions, like how you would solve a ProvisionedThroughputExceeded error, when you should merge or split shards, what encryption options are available, and how the Kinesis service integrates with other services.
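To make the throttling and shard questions more concrete, here is a rough sketch of the arithmetic behind them. It assumes the provisioned-mode write limits of 1,000 records per second and 1 MB per second per shard (taken here as 1,000,000 bytes, which is the conservative reading). The function names and the backoff defaults are my own, not part of any AWS SDK.

```python
import math

# Assumed per-shard write limits for Kinesis Data Streams (provisioned mode).
MAX_RECORDS_PER_SHARD_PER_SEC = 1000
MAX_BYTES_PER_SHARD_PER_SEC = 1_000_000  # 1 MB/s, conservative reading


def shards_needed(records_per_sec, avg_record_bytes):
    """Estimate the minimum shard count that satisfies both write limits."""
    by_records = math.ceil(records_per_sec / MAX_RECORDS_PER_SHARD_PER_SEC)
    by_bytes = math.ceil(
        records_per_sec * avg_record_bytes / MAX_BYTES_PER_SHARD_PER_SEC
    )
    return max(1, by_records, by_bytes)


def backoff_delays(retries, base=0.1, cap=5.0):
    """Exponential backoff schedule (no jitter) for retrying throttled writes."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]
```

If a stream has fewer shards than `shards_needed` suggests, splitting shards (or spreading a hot partition key) is the usual direction to look. If it has far more, merging shards saves cost. Retrying with backoff only helps with short bursts, not sustained overload.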
AWS Database Migration Service (DMS)
This came up in a few questions, so make sure you know when to use DMS vs other tools.
Amazon Simple Queue Service (SQS)
Just know the difference between Kinesis and SQS, and which one you should use for each problem.
AWS Snowball
The test will add this as an option. Although including it as part of the solution is often feasible, you should look at what the exam is actually asking for. If the solution requires you to migrate data fast, perhaps this is not the most appropriate one.
AWS IoT
I didn't get a lot of IoT questions, but it's nice to have an overview of this one: IoT topics, rules, and so on. Just browse through it.
Amazon Managed Streaming for Apache Kafka (MSK)
MSK is another option for streaming data similar to Kinesis, so knowing when to use MSK vs Kinesis will be crucial.
AWS Direct Connect
Part of a data warehouse migration is connecting your on-premises data center to your Amazon VPC network. Knowing when to use a Site-to-Site VPN and when to use Direct Connect can help you with this.
Storage
Amazon S3
You need to know your Amazon S3 concepts: Storage Classes, Replication, Lifecycle, and so on! Amazon S3 Glacier also comes up for archiving purposes.
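As a quick illustration of how storage classes and lifecycle rules fit together, below is a minimal sketch of a lifecycle configuration that moves objects to cheaper tiers as they age. The prefix, the day thresholds, and the helper name are hypothetical. The dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts as `LifecycleConfiguration`.

```python
def archive_lifecycle_config(prefix="raw/", ia_after_days=30, glacier_after_days=90):
    """Build a lifecycle configuration that tiers objects under `prefix`
    down to Standard-IA and then to Glacier as they age."""
    return {
        "Rules": [
            {
                # Hypothetical rule name derived from the prefix.
                "ID": "tier-down-" + prefix.strip("/"),
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }
```

With real credentials you would pass the result to an S3 client, for example `s3.put_bucket_lifecycle_configuration(Bucket="my-bucket", LifecycleConfiguration=archive_lifecycle_config())`, where the bucket name is a placeholder.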
Amazon DynamoDB
Understanding what DynamoDB is and what its features are is also essential. The exam will try to trick you into choosing other services instead of this one, so knowing the advantages and disadvantages of DynamoDB will help you filter those tricky questions.
Amazon ElastiCache
I didn't get an ElastiCache question, but knowing what it is also helps.
Processing
AWS Lambda
Lambda covers a lot in the exam, as it integrates with almost all of the AWS Services, so knowing when to use Lambda and when not to is definitely something you should study.
AWS Glue
AWS Glue also shows up in a lot of the options. I've had no trouble with Glue, as I've been using it since Version 1. Generally, if the question looks for a cost-effective solution with no operational overhead, definitely look for a Glue answer.
Glue features also show up: bookmarks, DynamicFrame functions, job metrics, and so on. Troubleshooting a Glue job comes up too, such as what you should do when Glue throws an error.
Amazon EMR
Study and understand EMR and all of its applications! Period.
AWS Lake Formation
If the question is about managing access to your data lake, Lake Formation is the answer rather than managing it through IAM alone.
AWS Step Functions
Used for orchestration, an overview will do.
AWS Data Pipeline
I didn't get a Data Pipeline question, but it shows up as one of the answer options, so knowing it will help.
Other AWS Services
S3 Select, S3DistCp, Hadoop Tools (Ganglia, Mahout, Ranger, HCatalog, etc.): just a basic understanding of what they do will suffice.
Analysis
Amazon Kinesis Data Analytics
KDA allows you to query streaming data using SQL. Knowing the window functions will help, as will knowing when you should use KDA vs Lambda.
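If windowing is new to you, the idea is easier to see outside of SQL first. The sketch below is plain Python, not KDA code: it groups events into fixed, non-overlapping (tumbling) windows and counts them per key, which is roughly what a tumbling-window GROUP BY does on a stream. The event shape and function name are assumptions for illustration.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per (window_start, key).

    `events` is an iterable of (epoch_seconds, key) tuples. Each event
    falls into exactly one window, because tumbling windows never overlap.
    """
    counts = defaultdict(int)
    for ts, key in events:
        # Round the timestamp down to the start of its window.
        window_start = (ts // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)
```

A sliding window differs in that one event can contribute to several overlapping windows, which is why the two give different results on the same stream.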
Amazon Elasticsearch Service
Generally, for log analysis, look for an ES solution along with Kibana.
Amazon Athena
Most cost-effective solutions involve Athena. Watch out for answers that involve Redshift Spectrum, though, as the exam will try to trick you into picking Athena even when Spectrum is the better fit.
Amazon Redshift
Node types, resizing options, dist styles, Redshift Spectrum, cluster administration, encryption options. Please study and remember these.
Amazon SageMaker
I didn't get a lot of SageMaker questions but an overview will help.
Visualization
Amazon QuickSight
Row-level security, Standard and Enterprise editions, authentication options (MS AD, SAML), Kibana vs QuickSight solution.
Other Visualization Tools
There are 1-2 questions that offer D3.js, HighCharts, or a custom chart as a solution, so knowing when to choose those over QuickSight is nice.
Security
Security covers a lot in the exam, and it can be very tricky if you didn't study for it.
AWS Security Token Service (STS)
Some questions require you to access resources in another AWS account, so knowing IAM in general, and how STS and authentication work in AWS, is nice to have.
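For the cross-account scenarios, it helps to have seen the basic flow once: you call STS to assume a role in the other account and get back temporary credentials. The sketch below wraps that call. The helper name and default session name are mine, and `sts_client` is assumed to be a boto3 STS client (`boto3.client("sts")`) whose `assume_role` takes `RoleArn` and `RoleSessionName`.

```python
def assume_role_credentials(sts_client, role_arn, session_name="analytics-session"):
    """Assume a role (possibly in another account) and return the temporary
    credentials as keyword arguments suitable for creating a new boto3 client."""
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
    )
    creds = response["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }
```

The returned dictionary can then be unpacked into another client, for example `boto3.client("s3", **assume_role_credentials(sts, role_arn))`. The role in the target account must trust the calling account for this to succeed.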
AWS Key Management Service (KMS)
KMS shows up in almost all of the security questions in the exam, so please make sure you are prepared for this. I warned you lol.
AWS CloudHSM (Hardware Security Module)
If the exam asks you about managing your own encryption keys on dedicated hardware, look for an HSM solution.
Final Thoughts
Studying everything at once can be overwhelming, so take your time understanding each service first and how they work together.
Whitepapers and Webinars are very helpful, especially Migration videos, as they give you an overall design on how things are done in AWS.
Consider also having a habit of watching 1 video or reading a whitepaper at a time to avoid information overload (small wins!).
To others who have passed and taken the same exam, feel free to share your thoughts. I would gladly add it to this post to help others pass! Let's learn from each other!
Good Luck!