Uploading large files to an Azure File Share using a shell script and standard Linux commands
I recently looked at how I could perform an upload of a file to an Azure File Share with a limited set of tools.
Realistically, there are several ways we could achieve this. For example, if we were able to install additional tools, we could use azcopy. In my scenario, I only have the following available to me, and I'm limited to bash/shell scripting:
The basic idea is to call the Azure Files REST API to perform the upload. Two operations matter for our scenario: Create File and Put Range.
When calling the Put Range API, take care when specifying the range value: the range is given as inclusive start and end byte offsets, and a single update can cover at most 4 MiB.
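To make the range arithmetic concrete, here is a minimal sketch of a helper that builds the x-ms-range value for one part. The name build_range is hypothetical and not part of the script in this post; it assumes inclusive byte offsets and a 4 MiB cap per update.

```shell
# Hypothetical helper: build the x-ms-range value for one Put Range call.
# Assumes inclusive byte offsets and at most 4 MiB (4194304 bytes) per update.
MAX_RANGE_BYTES=$((4 * 1024 * 1024))

build_range() {
    local offset="$1" size="$2"
    if [ "$size" -le 0 ] || [ "$size" -gt "$MAX_RANGE_BYTES" ]; then
        echo "part size must be between 1 and $MAX_RANGE_BYTES bytes" >&2
        return 1
    fi
    # The end offset is inclusive, hence the -1.
    echo "bytes=$offset-$((offset + size - 1))"
}

build_range 0 4194304      # a full 4 MiB part starting at offset 0
build_range 4194304 100    # a short final part
```

The -1 is the part that is easy to get wrong: a part of N bytes starting at offset S ends at S + N - 1, not S + N.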
The high-level process to upload a file is as follows:
- Create File Object
- Split file into 4MB parts
- Upload each part individually
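Before touching the network, it can help to see how a file of a given size breaks down. The following is a rough dry-run sketch; plan_upload is a hypothetical name, and it assumes fixed-size byte splitting, so the part sizes can differ slightly from what a line-aware split produces.

```shell
# Dry run: list the parts for a file of a given size without uploading.
# plan_upload is a hypothetical helper; it assumes fixed 4 MiB byte chunks.
CHUNK=$((4 * 1024 * 1024))

plan_upload() {
    local filesize="$1" offset=0 size
    # Ceiling division gives the number of Put Range calls needed.
    echo "parts: $(( (filesize + CHUNK - 1) / CHUNK ))"
    while [ "$offset" -lt "$filesize" ]; do
        size=$((filesize - offset))
        [ "$size" -gt "$CHUNK" ] && size=$CHUNK
        echo "offset=$offset size=$size"
        offset=$((offset + size))
    done
}

plan_upload 10000000
```

For a 10,000,000 byte file this lists three parts: two full 4 MiB parts and a final part of 1,611,392 bytes.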
Here is the example script which achieves the above.
#!/bin/bash
if [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ] || [ -z "$4" ]
then
echo "Arguments required: $0 [STORAGEACCOUNT] [SASTOKEN] [FILENAME] [FILESHARE]";
exit 1
else
STORAGEACCOUNT="$1"
SASTOKEN="$2"
FILENAME="$3"
FILESHARE="$4"
FILESIZE=$(stat -c%s "$FILENAME")
FILEMD5=$(openssl dgst -md5 -binary "$FILENAME" | openssl enc -base64)
FILEDATE=$(date -u)
RESTAPIVERSION=2018-11-09
echo "==========================="
echo "FileName: $FILENAME"
echo "FileSize: $FILESIZE"
echo "FileMd5: $FILEMD5"
echo "FileDate: $FILEDATE"
echo "==========================="
# Create the file object
curl -X PUT -H "x-ms-content-md5: $FILEMD5" -H "Content-Length: 0" -H "x-ms-date: $FILEDATE" -H "x-ms-version: $RESTAPIVERSION" -H "x-ms-content-length: $FILESIZE" -H "x-ms-type: file" "https://$STORAGEACCOUNT.file.core.windows.net/$FILESHARE/$FILENAME?$SASTOKEN"
# We need to break the file into separate parts if FileSize > 4MB
# (-C keeps whole lines together where it can, so a part may be slightly under 4 MiB)
split -C 4m --numeric-suffixes --suffix-length=10 "$FILENAME" part
# Upload each part of the file by performing multiple Put Range operations
FILEPOINTER=0
for PARTNAME in part*;
do
PARTSIZE=$(stat -c%s "$PARTNAME")
PARTMD5=$(openssl dgst -md5 -binary "$PARTNAME" | openssl enc -base64)
PARTDATE=$(date -u)
FILERANGE="bytes=$FILEPOINTER-$(($FILEPOINTER + ($PARTSIZE-1)))"
echo "--------------------------"
echo "PartName: $PARTNAME"
echo "PartSize: $PARTSIZE"
echo "PartMd5: $PARTMD5"
echo "PartDate: $PARTDATE"
echo "FileRange: $FILERANGE"
echo "Current Filepointer: $FILEPOINTER"
curl -T "./$PARTNAME" -H "Content-MD5: $PARTMD5" -H "x-ms-write: update" -H "x-ms-date: $PARTDATE" -H "x-ms-version: $RESTAPIVERSION" -H "x-ms-range: $FILERANGE" -H "Content-Type: application/octet-stream" "https://$STORAGEACCOUNT.file.core.windows.net/$FILESHARE/$FILENAME?comp=range&$SASTOKEN"
FILEPOINTER=$(($FILEPOINTER + $PARTSIZE))
echo "Next Filepointer: $FILEPOINTER"
echo "--------------------------"
done;
fi
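One thing the script leaves out is cleanup: the part files stay in the current directory, and any unrelated file whose name starts with "part" would be picked up by the loop. A possible refinement, sketched here under the assumption that mktemp -d is available, is to split into a private temporary directory. The name split_to_tmp is hypothetical.

```shell
# Sketch: split a file into a private temporary directory.
# split_to_tmp is a hypothetical helper; it prints the directory it created.
# Assumes GNU split and mktemp are available.
split_to_tmp() {
    local file="$1" dir
    dir=$(mktemp -d) || return 1
    split -C 4m --numeric-suffixes --suffix-length=10 "$file" "$dir/part" || return 1
    echo "$dir"
}
```

In the script, the result could be captured with PARTDIR=$(split_to_tmp "$FILENAME"), the loop could iterate over "$PARTDIR"/part*, and a trap such as trap 'rm -rf "$PARTDIR"' EXIT would remove the parts when the script finishes.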
The script uses a Shared Access Signature (SAS) to authenticate against the Azure File Share. There are multiple ways to generate a SAS; the easiest is via the Azure Portal.
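Depending on where the token is copied from, it may start with a "?" character. Since the script already places a "?" before $SASTOKEN in the URL, a leading one in the token would end up doubled. A small sketch of a guard follows; normalize_sas is a hypothetical name and the token shown is a placeholder, not a real one.

```shell
# Hypothetical helper: drop a single leading "?" from a SAS token so that
# "...?$SASTOKEN" in the URL does not end up with "??".
normalize_sas() {
    local token="$1"
    printf '%s\n' "${token#\?}"
}

normalize_sas '?sv=2019-12-12&sig=EXAMPLE'   # placeholder token
```

In the script this could be applied once, right after the arguments are read, for example SASTOKEN=$(normalize_sas "$2").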
Once you have generated your SAS, you can call the script as follows:
> ./upload.sh [STORAGEACCOUNTNAME] [SAS_TOKEN] test.txt test
===========================
FileName: test.txt
FileSize: 268435392
FileMd5: GrtXGlunzIv6NR1GR5xhAQ==
FileDate: Thu Jul 30 09:04:48 UTC 2020
===========================
--------------------------
PartName: part0000000000
PartSize: 4194300
PartMd5: Bt0oR+0/GxqNvosgCKOTbA==
PartDate: Thu Jul 30 09:04:49 UTC 2020
FileRange: bytes=0-4194299
Current Filepointer: 0
Next Filepointer: 4194300
--------------------------
--------------------------
PartName: part0000000001
PartSize: 4194304
PartMd5: aurqSCs0sRF8zLzBc7JYwA==
PartDate: Thu Jul 30 09:04:57 UTC 2020
FileRange: bytes=4194300-8388603
Current Filepointer: 4194300
Next Filepointer: 8388604
--------------------------
...
If the script runs successfully, the file should appear in your Azure File Share.
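The script does not inspect the responses from curl, so a failed request can go unnoticed. As a rough sketch, assuming both Create File and Put Range answer with HTTP 201 (Created) on success, a helper like the hypothetical check_status below could stop the upload at the first failure.

```shell
# Hypothetical helper: treat anything other than HTTP 201 as a failure.
# Assumes Create File and Put Range both return 201 (Created) on success.
check_status() {
    local step="$1" code="$2"
    if [ "$code" != "201" ]; then
        echo "$step failed with HTTP status $code" >&2
        return 1
    fi
    echo "$step succeeded"
}

# In the script, the status could be captured from curl like this:
#   CODE=$(curl -s -o /dev/null -w '%{http_code}' -X PUT ... "$URL")
#   check_status "Create File" "$CODE" || exit 1
check_status "Put Range" 201
```

The -w '%{http_code}' option makes curl print only the status code, while -o /dev/null discards the response body.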
Hope this helps!