Setting executor cores, executor count, and executor memory for Spark on YARN

Static allocation sized to the cluster

Reference link: Stack Overflow
The following case covers the three settings in the title: the number of cores per executor, the number of executors, and the executor memory. Other parameters, such as driver memory, are not covered in the original text.

Case: 6 nodes, 16 cores + 64 GB RAM per node

Since each executor is a JVM instance, we can assign multiple executors to each node.

  1. Reserve resources for system operation
    The original article says: to keep the operating system and the Hadoop daemons running, each node must reserve 1 core + 1 GB of memory. The resources available per node are therefore 15 cores + 63 GB RAM.
    If you do not want the system to run under high load, you can reserve more. In my own experience, I reserve 4 cores + 4 GB of memory on each node.

  2. Determine the number of cores per executor – the “magic number”
    The number of executor cores = the number of tasks the executor can run concurrently. The commonly cited rule of thumb is that more cores per executor is not always better: giving a single executor more than 5 cores tends to degrade performance, so the number of cores per executor is generally set to 5 or fewer. In the rest of this case we set it to 5.

  3. Set the number of executors
    The number of executors on each node = 15 / 5 = 3, so the total number of executors is 3 * 6 = 18. Since YARN's ApplicationMaster takes up the resources of one executor, we set the number of executors to 18 - 1 = 17.

  4. Set the memory size allocated by the executor
    In the previous step, we assigned 3 executors to each node, and the RAM available to each node was 63 GB, so the memory for each executor is 63 / 3 = 21 GB.
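    However, when Spark requests memory from YARN, YARN allocates additional overhead memory for each executor, where overhead = max(384 MB, 0.07 * spark.executor.memory). In our case overhead = 0.07 * 21 = 1.47 GB > 384 MB, so the memory we request is 21 - 1.47 = 19.53 GB, rounded down to 19 GB. Newer Spark releases document an overhead factor of 0.10 rather than 0.07, so check the documentation for your version.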

  5. The final allocation plan is: 5 cores per executor, 17 executors, and 19 GB of memory per executor. The spark-submit command is written as follows (note the unit suffix on the memory value; filename stands for your application file):

spark-submit --master yarn \
 --executor-cores 5 --num-executors 17 \
 --executor-memory 19g filename
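
The arithmetic in steps 1 to 5 can be sketched as a small Python helper. This is only an illustration of the calculation above, not part of Spark: the function name plan_executors and its parameter names are hypothetical, and the 0.07 overhead factor and the 1 core + 1 GB reservation are the values used in this article, so adjust them for your Spark version and cluster.

```python
import math


def plan_executors(nodes, cores_per_node, mem_per_node_gb,
                   reserved_cores=1, reserved_mem_gb=1,
                   cores_per_executor=5, overhead_factor=0.07,
                   min_overhead_gb=384 / 1024):
    """Hypothetical helper that reproduces the sizing steps of this article."""
    # Step 1: reserve resources for the OS and Hadoop daemons on each node.
    usable_cores = cores_per_node - reserved_cores
    usable_mem_gb = mem_per_node_gb - reserved_mem_gb

    # Steps 2 and 3: executors per node, then the cluster total minus one
    # executor's worth of resources for the YARN ApplicationMaster.
    executors_per_node = usable_cores // cores_per_executor
    num_executors = executors_per_node * nodes - 1

    # Step 4: split node memory across executors, then subtract the
    # overhead (as the article does, computed on the per-executor share).
    mem_per_executor_gb = usable_mem_gb / executors_per_node
    overhead_gb = max(min_overhead_gb, overhead_factor * mem_per_executor_gb)
    executor_memory_gb = math.floor(mem_per_executor_gb - overhead_gb)

    return {
        "executor_cores": cores_per_executor,
        "num_executors": num_executors,
        "executor_memory_gb": executor_memory_gb,
    }


# The case from this article: 6 nodes, 16 cores + 64 GB RAM per node.
plan = plan_executors(nodes=6, cores_per_node=16, mem_per_node_gb=64)
print(plan)
# {'executor_cores': 5, 'num_executors': 17, 'executor_memory_gb': 19}
```

Passing reserved_cores=4 and reserved_mem_gb=4 applies the more conservative reservation mentioned in step 1; with the same cluster this yields 2 executors per node instead of 3.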

Spark dynamic allocation (spark.dynamicAllocation)

Reference link: Spark official documentation
Configuring Spark for dynamic resource allocation differs slightly depending on the mode Spark runs in (standalone, YARN, or Mesos). This article only covers the configuration for YARN mode. For the full configuration, refer to the official documentation linked above.

  1. Locate spark-<version>-yarn-shuffle.jar and copy it into the $SPARK_HOME/yarn directory on each node.
  2. In the yarn-site.xml file on each node, add spark_shuffle to the yarn.nodemanager.aux-services property, and add the property yarn.nodemanager.aux-services.spark_shuffle.class with the value org.apache.spark.network.yarn.YarnShuffleService, as follows:
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>spark_shuffle,mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
        <value>org.apache.spark.network.yarn.YarnShuffleService</value>
    </property>
  3. Restart all NodeManagers in the cluster.
  4. In the $SPARK_HOME/conf directory, find the configuration file named spark-defaults.conf (create it if it does not exist) and add the following two settings:
spark.shuffle.service.enabled           true
spark.dynamicAllocation.enabled         true
  5. Restart Spark. If Spark restarts successfully, dynamic resource allocation is configured.
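
As a quick sanity check before restarting, the two settings from step 4 can be verified with a few lines of Python. This is a minimal sketch, not a Spark tool: the function names are hypothetical, and the parser assumes the whitespace-separated "key value" layout shown in step 4, skipping blank lines and # comments.

```python
REQUIRED_KEYS = (
    "spark.shuffle.service.enabled",
    "spark.dynamicAllocation.enabled",
)


def parse_spark_defaults(text):
    """Parse 'key<whitespace>value' lines into a dict (hypothetical helper)."""
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            conf[parts[0]] = parts[1].strip()
    return conf


def missing_dynamic_allocation_keys(conf):
    """Return the required keys that are absent or not set to true."""
    return [k for k in REQUIRED_KEYS if conf.get(k, "").lower() != "true"]


# Sample content matching step 4; in practice you would read the text of
# $SPARK_HOME/conf/spark-defaults.conf instead.
sample = """
spark.shuffle.service.enabled           true
spark.dynamicAllocation.enabled         true
"""
print(missing_dynamic_allocation_keys(parse_spark_defaults(sample)))
# []
```

An empty list means both settings are present and enabled; any key printed is one that still needs to be added or set to true.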
