HDFS
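Cluster information commands
-
View a summary of the file system
hdfs dfsadmin -report
This command prints a detailed report on the HDFS cluster, including total capacity, used capacity, remaining capacity, and the status of each DataNode.
Example output: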
Configured Capacity: 1010676727808 (941.27 GB)
Present Capacity: 242967904412 (226.28 GB)
DFS Remaining: 242965824508 (226.28 GB)
DFS Used: 2079904 (1.98 MB)
DFS Used%: 0.00%
Replicated Blocks:
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Low redundancy blocks with highest priority to recover: 0
Pending deletion blocks: 0
Erasure Coded Block Groups:
Low redundancy block groups: 0
Block groups with corrupt internal blocks: 0
Missing block groups: 0
Low redundancy blocks with highest priority to recover: 0
Pending deletion blocks: 0
-------------------------------------------------
Live datanodes (4):
Name: 10.10.1.20:9866 (hadoop-master1)
Hostname: hadoop-master1
Decommission Status : Normal
Configured Capacity: 252669181952 (235.32 GB)
DFS Used: 536384 (523.81 KB)
Non DFS Used: 178615046336 (166.35 GB)
DFS Remaining: 60607238484 (56.44 GB)
DFS Used%: 0.00%
DFS Remaining%: 23.99%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 8
Last contact: Fri Sep 20 09:42:57 GMT 2024
Last Block Report: Fri Sep 20 07:31:26 GMT 2024
Num of Blocks: 39
Name: 10.10.1.23:9866 (hadoop-worker1.zookeeper-cluster)
Hostname: hadoop-worker1
Decommission Status : Normal
Configured Capacity: 252669181952 (235.32 GB)
DFS Used: 630496 (615.72 KB)
Non DFS Used: 178614952224 (166.35 GB)
DFS Remaining: 60741456127 (56.57 GB)
DFS Used%: 0.00%
DFS Remaining%: 24.04%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 6
Last contact: Fri Sep 20 09:42:57 GMT 2024
Last Block Report: Fri Sep 20 04:01:10 GMT 2024
Num of Blocks: 42
Name: 10.10.1.24:9866 (hadoop-worker2.zookeeper-cluster)
Hostname: hadoop-worker2
Decommission Status : Normal
Configured Capacity: 252669181952 (235.32 GB)
DFS Used: 605920 (591.72 KB)
Non DFS Used: 178614976800 (166.35 GB)
DFS Remaining: 60875673770 (56.69 GB)
DFS Used%: 0.00%
DFS Remaining%: 24.09%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 4
Last contact: Fri Sep 20 09:42:57 GMT 2024
Last Block Report: Fri Sep 20 04:01:20 GMT 2024
Num of Blocks: 39
Name: 10.10.1.25:9866 (hadoop-worker3.zookeeper-cluster)
Hostname: hadoop-worker3
Decommission Status : Normal
Configured Capacity: 252669181952 (235.32 GB)
DFS Used: 307104 (299.91 KB)
Non DFS Used: 178615275616 (166.35 GB)
DFS Remaining: 60741456127 (56.57 GB)
DFS Used%: 0.00%
DFS Remaining%: 24.04%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 6
Last contact: Fri Sep 20 09:42:57 GMT 2024
Last Block Report: Fri Sep 20 04:19:03 GMT 2024
Num of Blocks: 21
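The report is long, so it can help to reduce it to one line per DataNode. The following is a minimal sketch, assuming the report arrives on stdin with one field per line as `hdfs dfsadmin -report` normally prints it. The function name `low_space_nodes` and the default threshold of 20 are illustrative and not part of HDFS.

```shell
# Reads `hdfs dfsadmin -report` text on stdin and prints one line per DataNode:
#   <hostname> <DFS Remaining%>
# Nodes whose remaining percentage is below the threshold get a LOW marker.
# low_space_nodes and the default threshold (20) are illustrative names/values.
low_space_nodes() {
  local threshold="${1:-20}"
  awk -v threshold="$threshold" '
    /^[ \t]*Hostname:/ { host = $2 }
    /^[ \t]*DFS Remaining%:/ {
      pct = $3
      sub(/%/, "", pct)
      if (host != "") {
        printf "%s %s%s\n", host, pct, ((pct + 0 < threshold + 0) ? " LOW" : "")
        host = ""
      }
    }
  '
}
```

Typical use would be `hdfs dfsadmin -report | low_space_nodes 25`.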
-
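Check the NameNode state
hdfs haadmin -getServiceState nn1
In an HA (high availability) configuration, this command reports whether the specified NameNode instance (for example nn1 or nn2) is in the active or standby state.
Example output: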
(base) root@hadoop-master1:/opt/hadoop/bin# ./hdfs haadmin -getServiceState nn1
active
(base) root@hadoop-master1:/opt/hadoop/bin# ./hdfs haadmin -getServiceState nn2
standby
(base) root@hadoop-master1:/opt/hadoop/bin# ./hdfs haadmin -getServiceState nn3
standby
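Rather than querying each ID by hand, the check can be looped. This is a sketch only: `active_namenode` is a hypothetical helper, and the IDs you pass in are assumed to match the ones configured under dfs.ha.namenodes.<nameservice> in your hdfs-site.xml.

```shell
# Prints the ID of the first NameNode that reports "active" and returns 0.
# Returns 1 if none of the given IDs is active.
# active_namenode is a hypothetical helper name, not an HDFS command.
active_namenode() {
  local id state
  for id in "$@"; do
    # 2>/dev/null hides connection errors from NameNodes that are down.
    state=$(hdfs haadmin -getServiceState "$id" 2>/dev/null)
    if [ "$state" = "active" ]; then
      echo "$id"
      return 0
    fi
  done
  return 1
}
```

For the cluster shown above, `active_namenode nn1 nn2 nn3` would be the matching call.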
-
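List the NameNode hosts
hdfs getconf -namenodes
This command lists the hostname or IP address of every NameNode in the current configuration. It does not print the Web UI address, which is configured separately through dfs.namenode.http-address.
Example output: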
(base) root@hadoop-master1:/opt/hadoop/bin# ./hdfs getconf -namenodes
hadoop-master1 hadoop-master2 hadoop-master3
-
List files in HDFS
hdfs dfs -ls /
Lists the files and directories under the HDFS root directory.
-
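View the replication factor of a specific file
hdfs dfs -stat %r /sparklog/app-20240920065523-0000.zstd
Prints the replication factor (%r) of the given file. Pass a file path rather than /, because directories have no replication factor and report 0.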
-
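View detailed information about a specific file
hdfs fsck /sparklog/app-20240920065523-0000.zstd -files -blocks -locations
This command checks the file and prints its blocks together with the DataNodes that hold each replica.
Example output: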
(base) root@hadoop-master1:/opt/hadoop/bin# ./hdfs fsck /sparklog/app-20240920065523-0000.zstd -files -blocks -locations
Connecting to namenode via http://hadoop-master1:9870/fsck?ugi=root&files=1&blocks=1&locations=1&path=%2Fsparklog%2Fapp-20240920065523-0000.zstd
FSCK started by root (auth:SIMPLE) from /10.10.1.20 for path /sparklog/app-20240920065523-0000.zstd at Fri Sep 20 09:55:30 GMT 2024
/sparklog/app-20240920065523-0000.zstd 27510 bytes, replicated: replication=3, 1 block(s): OK
0. BP-1758250195-10.10.1.20-1726216569504:blk_1073741902_1091 len=27510 Live_repl=3 [DatanodeInfoWithStorage[10.10.1.24:9866,DS-8b6323fb-b50a-4cc5-a19d-2d58cb25677e,DISK], DatanodeInfoWithStorage[10.10.1.23:9866,DS-2abc995d-b4ba-4280-bdce-e875f43071cb,DISK], DatanodeInfoWithStorage[10.10.1.20:9866,DS-6bbad34a-80b2-4032-8884-b3645b9af203,DISK]]
Status: HEALTHY
Number of data-nodes: 4
Number of racks: 1
Total dirs: 0
Total symlinks: 0
Replicated Blocks:
Total size: 27510 B
Total files: 1
Total blocks (validated): 1 (avg. block size 27510 B)
Minimally replicated blocks: 1 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 3.0
Missing blocks: 0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Blocks queued for replication: 0
Erasure Coded Block Groups:
Total size: 0 B
Total files: 0
Total block groups (validated): 0
Minimally erasure-coded block groups: 0
Over-erasure-coded block groups: 0
Under-erasure-coded block groups: 0
Unsatisfactory placement block groups: 0
Average block group size: 0.0
Missing block groups: 0
Corrupt block groups: 0
Missing internal blocks: 0
Blocks queued for replication: 0
FSCK ended at Fri Sep 20 09:55:30 GMT 2024 in 1 milliseconds
The filesystem under path '/sparklog/app-20240920065523-0000.zstd' is HEALTHY
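Two pieces of the fsck output are usually what you want: where the replicas live and whether the path is healthy. The sketch below pulls both out of text on stdin. `replica_hosts` and `fsck_is_healthy` are hypothetical helper names, and `grep -o` is assumed to be available (it is in GNU and BSD grep).

```shell
# Reads `hdfs fsck <path> -files -blocks -locations` output on stdin and prints
# the address of every DataNode that holds a replica, one per line, deduplicated.
# replica_hosts is a hypothetical helper name.
replica_hosts() {
  grep -o 'DatanodeInfoWithStorage\[[^,]*' | sed 's/.*\[//' | sort -u
}

# Succeeds (exit 0) if the fsck output on stdin ends with the "is HEALTHY" verdict.
# fsck_is_healthy is a hypothetical helper name.
fsck_is_healthy() {
  grep -q 'is HEALTHY'
}
```

Each function takes the fsck output on its standard input, so save the output once to a file and feed it to both.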
-
Check DataNode status
hdfs dfsadmin -report
Shows the status and storage capacity of each DataNode.
Configured Capacity: 1010676727808 (941.27 GB)
Present Capacity: 242810003424 (226.13 GB)
DFS Remaining: 242807851684 (226.13 GB)
DFS Used: 2151740 (2.05 MB)
DFS Used%: 0.00%
Replicated Blocks:
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Low redundancy blocks with highest priority to recover: 0
Pending deletion blocks: 0
Erasure Coded Block Groups:
Low redundancy block groups: 0
Block groups with corrupt internal blocks: 0
Missing block groups: 0
Low redundancy blocks with highest priority to recover: 0
Pending deletion blocks: 0
-------------------------------------------------
Live datanodes (4):
Name: 10.10.1.20:9866 (hadoop-master1)
Hostname: hadoop-master1
Decommission Status : Normal
Configured Capacity: 252669181952 (235.32 GB)
DFS Used: 579252 (565.68 KB)
Non DFS Used: 178654497100 (166.38 GB)
DFS Remaining: 60567745420 (56.41 GB)
DFS Used%: 0.00%
DFS Remaining%: 23.97%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 8
Last contact: Sat Sep 21 04:29:28 GMT 2024
Last Block Report: Sat Sep 21 03:12:19 GMT 2024
Num of Blocks: 47
Name: 10.10.1.23:9866 (hadoop-worker1.zookeeper-cluster)
Hostname: hadoop-worker1
Decommission Status : Normal
Configured Capacity: 252669181952 (235.32 GB)
DFS Used: 565344 (552.09 KB)
Non DFS Used: 178654511008 (166.38 GB)
DFS Remaining: 60701963063 (56.53 GB)
DFS Used%: 0.00%
DFS Remaining%: 24.02%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 6
Last contact: Sat Sep 21 04:29:28 GMT 2024
Last Block Report: Sat Sep 21 03:12:13 GMT 2024
Num of Blocks: 46
Name: 10.10.1.24:9866 (hadoop-worker2.zookeeper-cluster)
Hostname: hadoop-worker2
Decommission Status : Normal
Configured Capacity: 252669181952 (235.32 GB)
DFS Used: 648980 (633.77 KB)
Non DFS Used: 178654427372 (166.38 GB)
DFS Remaining: 60701963063 (56.53 GB)
DFS Used%: 0.00%
DFS Remaining%: 24.02%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 6
Last contact: Sat Sep 21 04:29:28 GMT 2024
Last Block Report: Sat Sep 21 03:12:13 GMT 2024
Num of Blocks: 46
Name: 10.10.1.25:9866 (hadoop-worker3.zookeeper-cluster)
Hostname: hadoop-worker3
Decommission Status : Normal
Configured Capacity: 252669181952 (235.32 GB)
DFS Used: 358164 (349.77 KB)
Non DFS Used: 178654718188 (166.39 GB)
DFS Remaining: 60836180138 (56.66 GB)
DFS Used%: 0.00%
DFS Remaining%: 24.08%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 4
Last contact: Sat Sep 21 04:29:28 GMT 2024
Last Block Report: Sat Sep 21 03:12:13 GMT 2024
Num of Blocks: 29
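- Run the HDFS balancer
hdfs balancer
This command starts the balancer, which moves blocks between DataNodes until data is evenly distributed across the cluster. It does not merely report status. With the threshold of 10 shown in the log ("threshold = 10.0"), it exits right away when the cluster is already balanced, as in the run below.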
2024-09-21 04:31:10,635 INFO impl.MetricsConfig: Loaded properties from hadoop-metrics2.properties
2024-09-21 04:31:10,712 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
2024-09-21 04:31:10,712 INFO impl.MetricsSystemImpl: Balancer metrics system started
2024-09-21 04:31:10,731 INFO balancer.Balancer: namenodes = [hdfs://mycluster]
2024-09-21 04:31:10,731 INFO balancer.Balancer: parameters = Balancer.BalancerParameters [BalancingPolicy.Node, threshold = 10.0, max idle iteration = 5, #excluded nodes = 0, #included nodes = 0, #source nodes = 0, #blockpools = 0, run during upgrade = false, sort top nodes = false, hot block time interval = 0]
2024-09-21 04:31:10,731 INFO balancer.Balancer: included nodes = []
2024-09-21 04:31:10,731 INFO balancer.Balancer: excluded nodes = []
2024-09-21 04:31:10,731 INFO balancer.Balancer: source nodes = []
Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved NameNode
2024-09-21 04:31:10,734 INFO balancer.NameNodeConnector: getBlocks calls for hdfs://mycluster will be rate-limited to 20 per second
2024-09-21 04:31:11,919 INFO balancer.Balancer: dfs.namenode.get-blocks.max-qps = 20 (default=20)
2024-09-21 04:31:11,919 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 5400000 (default=5400000)
2024-09-21 04:31:11,919 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 (default=1000)
2024-09-21 04:31:11,919 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 (default=200)
2024-09-21 04:31:11,919 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 2147483648 (default=2147483648)
2024-09-21 04:31:11,919 INFO balancer.Balancer: dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)
2024-09-21 04:31:11,920 INFO balancer.Balancer: dfs.datanode.balance.max.concurrent.moves = 100 (default=100)
2024-09-21 04:31:11,920 INFO balancer.Balancer: dfs.datanode.balance.bandwidthPerSec = 104857600 (default=104857600)
2024-09-21 04:31:11,927 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 10737418240 (default=10737418240)
2024-09-21 04:31:11,927 INFO balancer.Balancer: dfs.blocksize = 134217728 (default=134217728)
2024-09-21 04:31:12,021 INFO net.NetworkTopology: Adding a new node: /default-rack/10.10.1.20:9866
2024-09-21 04:31:12,022 INFO net.NetworkTopology: Adding a new node: /default-rack/10.10.1.23:9866
2024-09-21 04:31:12,022 INFO net.NetworkTopology: Adding a new node: /default-rack/10.10.1.24:9866
2024-09-21 04:31:12,022 INFO net.NetworkTopology: Adding a new node: /default-rack/10.10.1.25:9866
2024-09-21 04:31:12,025 INFO balancer.Balancer: 0 over-utilized: []
2024-09-21 04:31:12,025 INFO balancer.Balancer: 0 underutilized: []
Sep 21, 2024 4:31:12 AM 0 0 B 0 B 0 B 0 hdfs://mycluster
The cluster is balanced. Exiting...
Sep 21, 2024 4:31:12 AM Balancing took 1.803 seconds
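The "threshold = 10.0" in the log is a tolerance in percentage points: a DataNode counts as over- or under-utilized when its utilization differs from the cluster-wide utilization by more than that amount. The sketch below applies the same idea to `hdfs dfsadmin -report` text on stdin. It approximates the concept and is not the balancer's own logic; `unbalanced_nodes` is a hypothetical helper name.

```shell
# Reads `hdfs dfsadmin -report` on stdin and lists DataNodes whose utilization
# (DFS Used / Configured Capacity) differs from the cluster-wide utilization by
# more than the threshold, in percentage points. Prints nothing when all nodes
# are within the threshold. unbalanced_nodes is a hypothetical helper name.
unbalanced_nodes() {
  local threshold="${1:-10}"
  awk -v threshold="$threshold" '
    /^[ \t]*Hostname:/ { host = $2; n++; name[n] = host }
    host != "" && /^[ \t]*Configured Capacity:/ { cap[n] = $3 + 0; total_cap += $3 }
    host != "" && /^[ \t]*DFS Used:/ { used[n] = $3 + 0; total_used += $3 }
    END {
      if (total_cap == 0) exit
      avg = 100 * total_used / total_cap
      for (i = 1; i <= n; i++) {
        util = (cap[i] > 0) ? 100 * used[i] / cap[i] : 0
        diff = util - avg
        if (diff > threshold + 0 || -diff > threshold + 0)
          printf "%s %.2f%% (cluster %.2f%%)\n", name[i], util, avg
      }
    }
  '
}
```

Piping the report shown earlier through `unbalanced_nodes 10` should print nothing, which agrees with the balancer reporting 0 over-utilized and 0 underutilized nodes.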
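- View HDFS configuration
hdfs getconf -confKey dfs.replication
This command prints the value of the specified HDFS configuration key. For example, dfs.replication is the default number of replicas.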
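When several settings need checking at once, the same command can be wrapped in a loop. This is a minimal sketch; `show_conf` is a hypothetical helper name. The second key in the example call, dfs.blocksize, is the block size setting that also appears in the balancer log above.

```shell
# Prints "key = value" for each configuration key passed as an argument.
# A key that cannot be resolved prints an empty value.
# show_conf is a hypothetical helper name, not an HDFS command.
show_conf() {
  local key
  for key in "$@"; do
    printf '%s = %s\n' "$key" "$(hdfs getconf -confKey "$key" 2>/dev/null)"
  done
}
```

Example call: `show_conf dfs.replication dfs.blocksize`.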
- View the current Safe Mode status of HDFS
hdfs dfsadmin -safemode get
Checks whether HDFS is in Safe Mode. In an HA cluster, one line is printed per NameNode.
Safe mode is OFF in hadoop-master1/10.10.1.20:8020
Safe mode is OFF in hadoop-master2/10.10.1.21:8020
Safe mode is OFF in hadoop-master3/10.10.1.22:8020
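In an HA cluster the output has one line per NameNode, so a script usually wants a single yes/no answer. The following sketch reads the command output on stdin. `all_safemode_off` is a hypothetical helper name, and the line format is assumed to match the output shown above.

```shell
# Reads `hdfs dfsadmin -safemode get` output on stdin.
# Succeeds (exit 0) only if at least one NameNode reported OFF and none reported ON.
# all_safemode_off is a hypothetical helper name.
all_safemode_off() {
  awk '
    /^Safe mode is ON/  { on++ }
    /^Safe mode is OFF/ { off++ }
    END { exit ((off > 0 && on == 0) ? 0 : 1) }
  '
}
```

Example: `hdfs dfsadmin -safemode get | all_safemode_off && echo "cluster is writable"`.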
-
Manually enter Safe Mode in Hadoop
hdfs dfsadmin -safemode enter
-
Leave Safe Mode
hdfs dfsadmin -safemode leave
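The enter and leave commands are normally used as a pair around a maintenance step, and forgetting the second one leaves the cluster read-only. The sketch below is one way to keep them paired. `with_safemode` is a hypothetical helper name, and the maintenance command is whatever you pass in.

```shell
# Enters Safe Mode, runs the given command, then leaves Safe Mode even if the
# command fails. Returns the exit status of the wrapped command.
# with_safemode is a hypothetical helper name, not an HDFS command.
with_safemode() {
  hdfs dfsadmin -safemode enter || return 1
  local rc=0
  "$@" || rc=$?
  hdfs dfsadmin -safemode leave
  return "$rc"
}
```

For example, `with_safemode hdfs dfsadmin -saveNamespace` would save the namespace, an operation that requires Safe Mode, and then return the cluster to normal operation.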