YARN Virtual Memory issue in Hadoop

We often run into YARN container memory issues like the error message below. This can happen when running various Hadoop applications such as Hive, Shell, Pig, Sqoop, or Spark, either from the command line (CLI) or from an Oozie workflow.

Current usage: 135.2 MB of 2 GB physical memory used; 6.4 GB of 4.2 GB virtual memory used. Killing container.
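The 4.2 GB figure in the error is not a separate setting: YARN derives the virtual-memory ceiling by multiplying the container's physical limit by yarn.nodemanager.vmem-pmem-ratio (default 2.1), so a 2 GB container gets 2 × 2.1 = 4.2 GB of virtual memory. A minimal sketch of that check (the function and variable names are for illustration only, not from the YARN source):

```python
# Sketch of how YARN arrives at the virtual-memory limit in the error above.
# 4.2 GB = 2 GB physical limit * yarn.nodemanager.vmem-pmem-ratio (default 2.1).

def vmem_limit_gb(pmem_limit_gb, vmem_pmem_ratio=2.1):
    """Virtual-memory ceiling YARN enforces for a container."""
    return pmem_limit_gb * vmem_pmem_ratio

limit = vmem_limit_gb(2)   # 2 GB container -> 4.2 GB virtual limit
used = 6.4                 # virtual memory actually used, from the error message
print(used > limit)        # True -> the NodeManager kills the container
```

Because 6.4 GB exceeds the 4.2 GB ceiling, the NodeManager kills the container, which is why the suggested fixes either raise the ratio or disable the check entirely.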

I read many blogs, StackOverflow posts, and the Hadoop/YARN documentation, and they suggest setting one or more of the following parameters.

In mapred-site.xml:


<property>
  <name>mapreduce.map.memory.mb</name>
  <value>4096</value>
</property>

<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>8192</value>
</property>

<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx3072m</value>
</property>

<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx6144m</value>
</property>
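Note that the -Xmx heap in java.opts must be smaller than the corresponding memory.mb container size, because the container also has to hold JVM overhead (stacks, metaspace, off-heap buffers); the values above leave 25% headroom. A quick sanity check (a sketch, not from the original post; the 20% minimum headroom is an assumed rule of thumb):

```python
# Sanity check (illustrative): the -Xmx heap must fit inside the YARN
# container with headroom for JVM overhead. The values in the snippet
# above use heap = 0.75 * container, i.e. 25% headroom.

def heap_fits(container_mb, xmx_mb, min_headroom=0.20):
    """True if the heap leaves at least `min_headroom` of the container free."""
    return xmx_mb <= container_mb * (1 - min_headroom)

print(heap_fits(4096, 3072))   # map task:    3072 <= 4096 * 0.8 -> True
print(heap_fits(8192, 6144))   # reduce task: 6144 <= 8192 * 0.8 -> True
```

If -Xmx is set equal to (or above) the container size, the container will exceed its physical limit and be killed no matter how the virtual-memory check is configured.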

In yarn-site.xml:


<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>2.1</value>
</property>

<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>

I was running my applications on AWS EMR (Elastic MapReduce, AWS's Hadoop distribution) from an Oozie workflow, and none of the settings above helped. I had set those parameters only on the master node and restarted the YARN processes. But when an application, especially a shell script that can run on any slave node, actually runs on a slave node, the YARN settings on the master node don't apply. I had to set those parameters on every slave node (NodeManager) in the cluster.

And that can be done using a configuration like the one below. It has to be supplied while launching the EMR cluster, either entered directly in the EMR console or loaded from a JSON file, and it sets the parameters in yarn-site.xml on all slave nodes.

[
 {
   "Classification": "yarn-site",
   "Properties": {
     "yarn.nodemanager.vmem-pmem-ratio": "10",
     "yarn.nodemanager.vmem-check-enabled": "false"
   }
 }
]
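The snippet above can also be generated and validated programmatically before launch; a small sketch (the filename is arbitrary, and the property values simply mirror the configuration above):

```python
import json

# Write the yarn-site classification to a JSON file that can be supplied
# when launching the EMR cluster. Values mirror the configuration above.
configurations = [
    {
        "Classification": "yarn-site",
        "Properties": {
            "yarn.nodemanager.vmem-pmem-ratio": "10",
            "yarn.nodemanager.vmem-check-enabled": "false",
        },
    }
]

with open("yarn-vmem-config.json", "w") as f:
    json.dump(configurations, f, indent=2)
```

With the AWS CLI, the resulting file can then be passed at cluster launch, e.g. `aws emr create-cluster ... --configurations file://yarn-vmem-config.json`.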