I have some questions about Hadoop HDFS (version 2.7.3). I have 2 namenodes (1 active, 1 standby) and 3 datanodes. The replication factor is 3.
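For reference, the effective replication factor can be read back from the client configuration (a sketch; `dfs.replication` is the standard key, but per-file settings can override it):

```shell
# Read the default replication factor from the cluster configuration.
hdfs getconf -confKey dfs.replication
# A specific file or directory may carry its own replication; check with:
# hdfs dfs -stat %r /dir1
```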
$ hdfs dfs -df -h /
Filesystem Size Used Available Use%
hdfs://hadoop-cluster 131.0 T 51.3 T 79.5 T 39%
According to the -df command, 51.3 T of disk is used.
$ hdfs dfs -du -h /
912.8 G /dir1
2.9 T /dir2
But according to the -du command, only about 3.8 T is used (912.8 G + 2.9 T).
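To quantify the gap: assuming -du reports logical (pre-replication) size while -df reports raw bytes across all datanode disks, the raw footprint should be roughly the logical size times the replication factor, which is still nowhere near 51.3 T:

```shell
# Sanity-check arithmetic (assumption: -du is logical size, -df is raw usage).
# Logical size from `-du -h /`: 912.8 G for /dir1 plus 2.9 T for /dir2.
awk 'BEGIN {
  logical = 912.8 / 1024 + 2.9;            # total logical size in TB
  printf "expected raw usage ~ %.1f T\n", logical * 3   # replication = 3
}'
# Prints roughly 11.4 T, far below the 51.3 T that -df reports as used.
```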
I also found that one datanode has reached 100% usage:
Live datanodes (3):
datanode1:
Configured Capacity: 48003784114176 (43.66 TB)
DFS Used: 2614091989729 (2.38 TB)
Non DFS Used: 95457946911 (88.90 GB)
DFS Remaining: 45294174318384 (41.19 TB)
DFS Used%: 5.45%
DFS Remaining%: 94.36%
datanode2:
Configured Capacity: 48003784114176 (43.66 TB)
DFS Used: 48003784114176 (43.66 TB)
Non DFS Used: 0
DFS Remaining: 0
DFS Used%: 100%
DFS Remaining%: 0%
datanode3:
Configured Capacity: 48003784114176 (43.66 TB)
DFS Used: 2615226250042 (2.38 TB)
Non DFS Used: 87496531142 (81.49 GB)
DFS Remaining: 45301001735984 (41.20 TB)
DFS Used%: 5.45%
DFS Remaining%: 94.37%
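A per-datanode summary like the one above can be pulled from the dfsadmin report (a sketch; the `-live` flag to restrict output to live datanodes is available in 2.7.x):

```shell
# Print only the identity and usage lines for each live datanode.
hdfs dfsadmin -report -live | grep -E 'Hostname:|DFS Used%|DFS Remaining'
```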
My problem is that when I run the HDFS balancer, it decides to move data, but no blocks are ever actually moved:
19/11/06 11:27:51 INFO balancer.Balancer: Decided to move 10 GB bytes from datanode2:DISK to datanode3:DISK
19/11/06 11:27:51 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => belowAvgUtilized
19/11/06 11:27:51 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: underUtilized => aboveAvgUtilized
19/11/06 11:27:51 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => underUtilized
19/11/06 11:27:51 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => belowAvgUtilized
19/11/06 11:27:51 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: underUtilized => aboveAvgUtilized
19/11/06 11:27:51 INFO balancer.Balancer: Will move 10 GB in this iteration
19/11/06 11:27:51 INFO balancer.Dispatcher: Limiting threads per target to the specified max.
19/11/06 11:27:51 INFO balancer.Dispatcher: Allocating 5 threads per target.
No block has been moved for 5 iterations. Exiting...
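For completeness, this is how I invoke the balancer (a sketch; the `-threshold` flag and the runtime bandwidth setting are standard in 2.7.x):

```shell
# Move blocks until every datanode is within 10% of the cluster average usage.
hdfs balancer -threshold 10
# If transfers seem throttled, the per-datanode balancer bandwidth cap can be
# raised at runtime (bytes per second; 100 MB/s here):
hdfs dfsadmin -setBalancerBandwidth 104857600
```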
My questions are:

- Even though datanode2 is full, the node's state is still reported as "In Service" / "Live" / "Normal". And of course, in this situation I cannot write new data to HDFS.
- The gap between the -df result and the -du result is far too large. Why?