Storing Azure blob content with Content-Encoding: gzip

Abstract: This blog post shows how to reduce the size of blob files uploaded to Azure Storage, and how to improve download performance by applying gzip compression and setting the blob's Content-Encoding property to gzip at upload time.

Hi friends, this blog post is based on a real-life problem scenario where I was storing JSON objects in a blob container. I was uploading JSON strings to the blob, and it all worked well until one day the size of the JSON object grew, and the web application consuming that JSON object started facing performance issues while downloading the response from Azure Storage. To reduce the size of the JSON object, we implemented GZip compression on it, which shrank the JSON considerably and solved the performance issue.
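To see how much GZip can shave off, here is a small self-contained sketch (separate from the blob code below) that compresses a made-up JSON payload and prints the size before and after compression:

using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;

class GzipSizeDemo
{
    static void Main()
    {
        // Illustrative, repetitive sample payload; real-world JSON compresses similarly well.
        string json = "[" + string.Join(",", Enumerable.Repeat(
            "{\"id\":1,\"name\":\"sample\",\"active\":true}", 1000)) + "]";
        byte[] raw = Encoding.UTF8.GetBytes(json);

        byte[] compressed;
        using (var output = new MemoryStream())
        {
            // Dispose the GZipStream before reading the buffer,
            // otherwise the gzip footer is not written.
            using (var gzip = new GZipStream(output, CompressionLevel.Optimal))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            compressed = output.ToArray();
        }

        Console.WriteLine($"Original:   {raw.Length:N0} bytes");
        Console.WriteLine($"Compressed: {compressed.Length:N0} bytes");
    }
}

Repetitive JSON like this compresses to a small fraction of its original size.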

Note: Install the WindowsAzure.Storage NuGet package (not Microsoft.Azure.Management.Storage, which is the management-plane SDK) to connect to Azure Storage; the code below also needs Newtonsoft.Json for the JSON serialization.

Assume you have a BlobStore class that communicates with the blob container, as shown below, with the compression code in place:

// Requires the WindowsAzure.Storage and Newtonsoft.Json NuGet packages.
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Auth;
using Microsoft.WindowsAzure.Storage.Blob;
using Newtonsoft.Json;

public class BlobStore : StorageBase, IBlobStore
    {
        private CloudStorageAccount _account;
        private CloudBlobClient _client;
        private CloudBlobContainer _container;
        // _store, _key and _share (used below) are assumed to be members
        // supplied by StorageBase.
        public BlobStore(IStorage storageConf) : base(storageConf)
        {
        }
        public void UploadJsonFileToBlob(object obj, Dictionary<string, string> metadata, string blobAddressUri)
        {
            this.connect(); // _client is not initialized until connect() runs
            _client.DefaultRequestOptions.ServerTimeout = TimeSpan.FromHours(1);
            this.CreateContainerIfNotExists();
            var blockBlob = _container.GetBlockBlobReference(blobAddressUri);
            UploadToContainer(blockBlob, obj, metadata);
        }
        private void connect()
        {
            var cred = new StorageCredentials(_store, _key);
            _account = new CloudStorageAccount(cred, useHttps: true);
            _client = _account.CreateCloudBlobClient();
        }
        private void CreateContainerIfNotExists()
        {
            _container = _client.GetContainerReference(_share);
            _container.CreateIfNotExists();
        }
        private void UploadToContainer(CloudBlockBlob blockBlob, object obj, Dictionary<string, string> metadata)
        {
            SetBlobProperties(blockBlob, metadata);
            using (var ms = new MemoryStream())
            {
                LoadStreamWithJson(ms, obj);
                var jsonStringByteArray = ms.ToArray();
                using (MemoryStream comp = new MemoryStream())
                {
                    // Dispose the GZipStream before reading the buffer so the
                    // gzip footer is flushed to the underlying stream.
                    using (GZipStream gzip = new GZipStream(comp, CompressionLevel.Optimal))
                    {
                        gzip.Write(jsonStringByteArray, 0, jsonStringByteArray.Length);
                    }
                    var bytes = comp.ToArray();
                    // Content-Encoding: gzip tells HTTP clients (e.g. browsers)
                    // to decompress the payload transparently on download.
                    blockBlob.Properties.ContentEncoding = "gzip";
                    blockBlob.UploadFromByteArray(bytes, 0, bytes.Length);
                }
            }
        }
        private void SetBlobProperties(CloudBlockBlob blobReference, Dictionary<string, string> metadata, string contentType = "application/json")
        {
            blobReference.Properties.ContentType = contentType;
            foreach (var meta in metadata)
            {
                blobReference.Metadata.Add(meta.Key, meta.Value);
            }
        }

        private void LoadStreamWithJson(Stream ms, object obj)
        {
            var json = JsonConvert.SerializeObject(obj);
            // The writer is deliberately not disposed: disposing it would
            // close the underlying stream, which the caller still needs.
            var writer = new StreamWriter(ms);
            writer.Write(json);
            writer.Flush();
            ms.Position = 0;
        }

        // Must return Task<string>, not Task, since it yields the JSON content.
        public async Task<string> DownloadGzipCompressed(string filename)
        {
            string response = string.Empty;
            this.connect();
            this.CreateContainerIfNotExists();
            var blobReference = _container.GetBlockBlobReference(filename);
            if (await blobReference.ExistsAsync())
            {
                var data = await GetByteArrayOfFileAsync(blobReference);
                response = Encoding.UTF8.GetString(data, 0, data.Length);
            }
            return response;
        }

        private async Task<byte[]> GetByteArrayOfFileAsync(CloudBlockBlob cloudFile)
        {
            byte[] data;
            using (MemoryStream ms = new MemoryStream())
            {
                await cloudFile.DownloadToStreamAsync(ms);
                data = ms.ToArray(); // ToArray copies the buffer regardless of Position
            }

            // Decompress: the stored bytes are a gzip stream wrapping the JSON.
            using (MemoryStream comp = new MemoryStream(data))
            using (MemoryStream decomp = new MemoryStream())
            {
                using (GZipStream gzip = new GZipStream(comp, CompressionMode.Decompress))
                {
                    gzip.CopyTo(decomp);
                }
                data = decomp.ToArray();
            }
            return data;
        }
    }
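Here is a minimal usage sketch of the class above, written inside an async method and reusing the usings from the listing. The StorageConfig type and the values passed to it are hypothetical stand-ins for your own IStorage implementation:

// StorageConfig is a hypothetical IStorage implementation; substitute your own.
var store = new BlobStore(new StorageConfig(
    account: "mystorageaccount",
    key: "<account-key>",
    container: "json-data"));

var payload = new { Id = 1, Name = "sample" };
var metadata = new Dictionary<string, string> { { "source", "demo" } };

// Serializes, gzips and uploads with Content-Encoding: gzip.
store.UploadJsonFileToBlob(payload, metadata, "data/sample.json");

// Downloads and decompresses back to the original JSON string.
string json = await store.DownloadGzipCompressed("data/sample.json");
Console.WriteLine(json);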

This compresses the blob content to a great extent, which helps both when we download the blob programmatically and when we access it directly via its URL: because the blob's Content-Encoding property is set to gzip, HTTP clients and browsers can decompress the response transparently.
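For example, a plain HttpClient can consume the blob URL directly once automatic decompression is enabled; here is a sketch against an illustrative URL (your blob would need to be public or SAS-signed):

using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

class BlobUrlClient
{
    static async Task Main()
    {
        var handler = new HttpClientHandler
        {
            // Decompress responses served with Content-Encoding: gzip.
            AutomaticDecompression = DecompressionMethods.GZip
        };

        using (var http = new HttpClient(handler))
        {
            // Illustrative URL; substitute your own blob address.
            string url = "https://mystorageaccount.blob.core.windows.net/json-data/data/sample.json";
            string json = await http.GetStringAsync(url);
            Console.WriteLine(json);
        }
    }
}

This is the same transparent decompression a browser performs when it requests the blob URL and sees the gzip Content-Encoding header on the response.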

Happy Learning.
