I have a large 250 GB file to upload from my on-premises HDFS to Azure block blob storage using
the distcp command. I am using the command below:
hadoop distcp \
-Dmapreduce.map.log.level="DEBUG" \
-Dfs.azure.account.key.storageAccount.blob.core.windows.net=<account_key> \
-Dmapreduce.task.timeout=0 -overwrite -i -numListstatusThreads=1 \
/user/test/250gb_file \
wasbs://contianer@storageAccount.blob.core.windows.net/testDir/
First, I am not able to upload a file larger than 195 GB. How can I upload a file larger than 195 GB using the distcp command?
Second, because the file is so large, how can I upload it in parallel? Uploading it sequentially takes
a long time. How can I enable multipart upload in distcp when writing to an Azure block blob?
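
For context on the 195 GB figure: it matches what you get if a block blob is capped at 50,000 blocks and each block is written at 4 MiB. The 50,000-block cap is a documented Azure limit; the 4 MiB block size is my assumption about what the WASB driver is using here, not something I have confirmed. A quick sketch of the arithmetic (the variable names are just for illustration):

```python
import math

# Documented Azure limit: a block blob can hold at most 50,000 blocks.
MAX_BLOCKS = 50_000

# ASSUMPTION: the driver uploads 4 MiB blocks. Not confirmed for this setup.
BLOCK_SIZE_MIB = 4

# Largest blob that fits under those two numbers, in GiB.
max_blob_gib = MAX_BLOCKS * BLOCK_SIZE_MIB / 1024
print(f"max blob size at {BLOCK_SIZE_MIB} MiB blocks: {max_blob_gib} GiB")

# Smallest whole-MiB block size that would fit a 250 GiB file in 50,000 blocks.
FILE_SIZE_GIB = 250
min_block_mib = math.ceil(FILE_SIZE_GIB * 1024 / MAX_BLOCKS)
print(f"block size needed for {FILE_SIZE_GIB} GiB: at least {min_block_mib} MiB")
```

If that assumption holds, the 250 GB file would need blocks larger than 4 MiB to fit in a single blob, so the question becomes whether distcp or the WASB driver lets me raise the block size.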