Dynamic Allocation Issues On Spark 2.4.8 (Possible Issue with External Shuffle Service?)
Johny
Posted on October 14, 2024
Hey Team,
I am having an issue with dynamic allocation on Spark 2.4.8. I have set up a cluster using your Clemlab distribution (https://www.clemlab.com/). Spark jobs now run fine; the issue only appears when I try to use the dynamicAllocation options. I suspect the problem is related to the External Shuffle Service, although from what I can see it should be set up properly.
From the ResourceManager logs we can see that the container goes from ACQUIRED to RELEASED, which is weird. It never reaches the RUNNING state.
I am out of ideas on how to make dynamic allocation work, so I am turning to you in the hope that you have some insight into the matter.
There are no issues if I do not use dynamic allocation, and Spark jobs work just fine, but I really want to make dynamic allocation work.
Thank you for the assistance, and apologies for the long message; I just wanted to supply all the details I could.
Here are the settings I have in Ambari related to it:
Yarn:
Checking the directories, I can find the necessary jar on all NodeManager hosts in the right directory:
/usr/odp/1.2.2.0-138/spark2/yarn/spark-2.4.8.1.2.2.0-138-yarn-shuffle.jar
/usr/odp/current/spark2-client/yarn/spark-2.4.8.1.2.2.0-138-yarn-shuffle.jar (I believe this is a symbolic link to the jar above)
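As a rough illustration of what has to line up for dynamic allocation on YARN with Spark 2.4, here is a minimal sketch that cross-checks the relevant properties. The property names are the standard Spark and YARN ones; the helper `check_dynamic_allocation`, the default service name `spark_shuffle` (some distributions register it under a different name such as `spark2_shuffle`), and any dictionaries passed in are assumptions for illustration, not values taken from this cluster.

```python
# Class that YARN NodeManagers load for Spark's external shuffle service.
SHUFFLE_CLASS = "org.apache.spark.network.yarn.YarnShuffleService"


def check_dynamic_allocation(spark_conf, yarn_conf, service="spark_shuffle"):
    """Return a list of problems; an empty list means the basics look consistent.

    spark_conf / yarn_conf are plain dicts of property name -> string value.
    `service` is the aux-service name and is an assumption; adjust it to
    whatever name your distribution registers.
    """
    problems = []

    # Spark side: both switches must be on for dynamic allocation in 2.4.
    for key in ("spark.dynamicAllocation.enabled", "spark.shuffle.service.enabled"):
        if spark_conf.get(key, "false").strip().lower() != "true":
            problems.append(f"{key} is not set to true")

    # YARN side: the shuffle service must be listed as an auxiliary service...
    aux_raw = yarn_conf.get("yarn.nodemanager.aux-services", "")
    aux_services = [name.strip() for name in aux_raw.split(",") if name.strip()]
    if service not in aux_services:
        problems.append(f"{service} is missing from yarn.nodemanager.aux-services")

    # ...and mapped to the Spark shuffle service class.
    class_key = f"yarn.nodemanager.aux-services.{service}.class"
    if yarn_conf.get(class_key) != SHUFFLE_CLASS:
        problems.append(f"{class_key} is not {SHUFFLE_CLASS}")

    return problems
```

Calling `check_dynamic_allocation(spark_conf, yarn_conf, service="spark2_shuffle")` with the values shown in Ambari returns one message per mismatch. It does not verify that the jar is actually on the NodeManager classpath or that the NodeManagers were restarted after the change.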
Spark2:
In the Spark log I can see this message repeating continuously:
24/10/13 16:38:16 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
24/10/13 16:38:31 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
24/10/13 16:38:46 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
24/10/13 16:39:01 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
24/10/13 16:39:16 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
24/10/13 16:39:31 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
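To quantify how often the warning repeats in a longer driver log, a small sketch like the following could be used. It assumes every line starts with a timestamp in the `yy/MM/dd HH:mm:ss` form shown above; the function name `summarize_warnings` is hypothetical.

```python
from datetime import datetime

MARKER = "Initial job has not accepted any resources"
STAMP_FORMAT = "%y/%m/%d %H:%M:%S"  # matches e.g. "24/10/13 16:38:16"


def summarize_warnings(lines):
    """Count the scheduler warnings and measure the gaps between them."""
    stamps = []
    for line in lines:
        if MARKER not in line:
            continue
        # Assumption: the timestamp is the first 17 characters of the line.
        stamps.append(datetime.strptime(line.lstrip()[:17], STAMP_FORMAT))

    # Seconds between each warning and the next one.
    gaps = [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]
    return {
        "count": len(stamps),
        "first": stamps[0] if stamps else None,
        "last": stamps[-1] if stamps else None,
        "gaps_seconds": gaps,
    }
```

Run on the six lines above, this reports a count of 6 with a gap of 15 seconds between consecutive warnings, meaning the scheduler keeps retrying at a fixed interval without ever getting an executor.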