HOW MUCH MEMORY DOES A NAMENODE NEED?
1. Why worry about NameNode memory allocation?
A namenode can eat up memory, since a reference to every block of every file is maintained in memory.
2. How much should be allocated?
It’s difficult to give a precise formula because memory usage depends on the number of blocks per file, the filename length, and the number of directories in the filesystem; plus, it can change from one Hadoop release to another.
- The general case
The default of 1,000 MB of namenode memory is normally enough for a few million files, but as a rule of thumb for sizing purposes, you can conservatively allow 1,000 MB per million blocks of storage.
For example, a 200-node cluster with 24 TB of disk space per node, a block size of 128 MB, and a replication factor of 3 has room for about 12 million blocks (or more): 200 × 24,000,000 MB ⁄ (128 MB × 3) ≈ 12.5 million. So in this case, setting the namenode memory to 12,000 MB would be a good starting point.
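The sizing arithmetic is easy to script. The sketch below simply restates the example above with illustrative variable names; it assumes a shell with 64-bit integer arithmetic, which any modern bash provides:

    # Cluster parameters from the example above
    nodes=200
    disk_per_node_mb=24000000    # 24 TB per node, in MB
    block_size_mb=128
    replication=3

    # Raw capacity divided by the space one replicated block occupies
    blocks=$(( nodes * disk_per_node_mb / (block_size_mb * replication) ))
    echo "block capacity: $blocks"                  # 12500000, i.e. ~12.5 million

    # Rule of thumb: 1,000 MB of namenode heap per million blocks
    heap_mb=$(( blocks / 1000000 * 1000 ))
    echo "suggested namenode heap: ${heap_mb} MB"   # 12000 MB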
3. How do you allocate the NameNode's memory?
You can increase the namenode’s memory without changing the memory allocated to other Hadoop daemons by setting HADOOP_NAMENODE_OPTS in hadoop-env.sh to include a JVM option for setting the memory size.
In short: editing HADOOP_NAMENODE_OPTS in hadoop-env.sh changes the NameNode's memory, and only the NameNode's; the other Hadoop daemons keep their existing allocations.
HADOOP_NAMENODE_OPTS allows you to pass extra options to the namenode’s JVM. So, for example, if you were using a Sun JVM, -Xmx2000m would specify that 2,000 MB of memory should be allocated to the namenode.
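As a concrete sketch, the corresponding hadoop-env.sh entry would look like this (the file's exact location varies by release and installation layout):

    # hadoop-env.sh: give the namenode's JVM a 2,000 MB heap
    export HADOOP_NAMENODE_OPTS="-Xmx2000m"

Any other JVM options the namenode needs can be appended to the same variable.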
- A step to remember
If you change the namenode’s memory allocation, don’t forget to do the same for the secondary namenode (using the HADOOP_SECONDARYNAMENODE_OPTS variable), since its memory requirements are comparable to the primary namenode’s.
That is, after changing the primary NameNode's allocation, make the matching change for the secondary namenode as well.
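Putting the two together, here is a sketch of hadoop-env.sh for the 200-node example above (the 12,000 MB figure is the starting point derived earlier, not a universal recommendation):

    # Size the primary and secondary namenodes alike,
    # since the secondary's memory requirements are comparable:
    export HADOOP_NAMENODE_OPTS="-Xmx12000m"
    export HADOOP_SECONDARYNAMENODE_OPTS="-Xmx12000m"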