How to Fix HDFS Append Errors

There were two main errors, and they kept alternating all evening:
The first:

    2022-10-25 21:37:11,901 WARN hdfs.DataStreamer: DataStreamer Exception
    java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[192.168.88.151:9866,DS-4b3969ed-e679-4990-8d2e-374f24c1955d,DISK]], original=[DatanodeInfoWithStorage[192.168.88.151:9866,DS-4b3969ed-e679-4990-8d2e-374f24c1955d,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
        at org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1304)
        at org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1372)
        at org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1598)
        at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1499)
        at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1481)
        at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:719)

(The appendToFile command then echoes the same "Failed to replace a bad datanode" message.)

This one I could mostly make sense of after some reading. According to posts online, it means the write cannot proceed. Suppose your environment has 3 datanodes and the replication factor is set to 3: a write then goes to all 3 machines in the pipeline. The default replace-datanode-on-failure.policy is DEFAULT, which means that when the system has 3 or more datanodes, the client will try to find another datanode to copy to when one fails. With only 3 machines total, there is no spare to swap in, so as soon as one datanode has a problem the write can never succeed.
In my case, I had only started one machine in the cluster when I ran the operation, so instead of the full cluster's worth of datanodes there was just one, hence the error. Once I started Hadoop on all three machines, the append worked.
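Since the root cause here was too few live datanodes, a quick sanity check before appending is to count them. Below is a minimal sketch: it assumes the `hdfs` client is on your PATH and that `hdfs dfsadmin -report` prints a `Live datanodes (N):` line, as recent Hadoop releases do; the `live_count` helper is my own name, not a Hadoop command.

```shell
# live_count: read an `hdfs dfsadmin -report` dump on stdin and print
# the number N from the "Live datanodes (N):" line.
live_count() {
  sed -n 's/^Live datanodes (\([0-9][0-9]*\)):.*/\1/p'
}

# Intended usage against a running cluster (assumes hdfs on PATH):
#   n=$(hdfs dfsadmin -report | live_count)
#   [ "$n" -ge 3 ] || echo "only $n datanode(s) up; appends may fail"
```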
The common fix online is to edit hdfs-site.xml (some posts mistakenly call it hdfs-core.xml) as follows:
    <property>
        <name>dfs.support.append</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
        <value>NEVER</value>
    </property>
    <property>
        <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
        <value>true</value>
    </property>

The explanation of node replacement I found online, worth keeping in mind for 2- and 3-node clusters: with dfs.client.block.write.replace-datanode-on-failure.policy at its DEFAULT, a write with 3 or more replicas will try to swap in a replacement datanode before continuing, while a write with only 2 replicas skips the replacement and writes directly. On a 3-datanode cluster, a single unresponsive node therefore breaks every write, so it is reasonable to turn the replacement off (NEVER).
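To avoid typos when applying the fix, a small check that a given hdfs-site.xml actually contains all three properties can help. This is a sketch of my own (`check_append_props` is not a Hadoop tool), and it only greps for the `<name>` tags rather than parsing the XML properly:

```shell
# check_append_props FILE: succeed only if all three properties from the
# fix above appear in FILE; print the first one that is missing.
check_append_props() {
  f="$1"
  for p in dfs.support.append \
           dfs.client.block.write.replace-datanode-on-failure.policy \
           dfs.client.block.write.replace-datanode-on-failure.enable; do
    grep -q "<name>$p</name>" "$f" || { echo "missing: $p"; return 1; }
  done
}

# Intended usage (the path is an assumption; adjust to your install):
#   check_append_props "$HADOOP_HOME/etc/hadoop/hdfs-site.xml"
```

Remember that the file must be changed on every node, and HDFS restarted (e.g. with stop-dfs.sh and start-dfs.sh), for the new policy to take effect.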
The second error:

    appendToFile: Failed to APPEND_FILE /2.txt for DFSClient_NONMAPREDUCE_505101511_1 on 192.168.88.151 because this file lease is currently owned by DFSClient_NONMAPREDUCE_-474039103_1 on 192.168.88.151
    appendToFile: Failed to APPEND_FILE /2.txt for DFSClient_NONMAPREDUCE_814684116_1 on 192.168.88.151 because lease recovery is in progress. Try again later.

These two messages just kept alternating. Since everything after "because" in both is about the lease, I suspect they are really one problem, namely the node-response issue described above. The first message mentions owned by DFSClient, and the IP shown is my first machine, so in my case it was probably again because I had only started one machine. If all of your machines are up and you still see this, then it is the node-response problem, which brings us back to the config change above, so go edit the file as shown.
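Since the second message literally says "Try again later", these lease errors are often transient (the old lease expires or recovery finishes). The steps above can be handled with a minimal retry wrapper; this is only a sketch, and `append_with_retry`, the retry count, and the 1-second delay are my own choices, not anything Hadoop prescribes:

```shell
# append_with_retry CMD ARGS...: run the command, retrying up to 5 times
# with a short pause between attempts, for transient failures such as
# "lease recovery is in progress. Try again later."
append_with_retry() {
  tries=0
  until "$@"; do
    tries=$((tries + 1))
    if [ "$tries" -ge 5 ]; then
      echo "giving up after $tries attempts" >&2
      return 1
    fi
    sleep 1
  done
}

# Intended usage (assumes the hdfs client is on PATH):
#   append_with_retry hdfs dfs -appendToFile local.txt /2.txt
```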
