Understanding Tiering internals

Anirudha | Tue, 02/02/2021 - 13:06

How do you decide the tiering policy?

There are multiple reasons you might want to use the Tiering feature, e.g., you are running out of storage resources, or you have edge deployments where you want to store all the data in a central cluster while keeping only important data locally.

While it's easy to configure tiering, there are a few aspects you should consider when deciding when to tier data. Accessing data from a remote site involves additional network latency compared to local access, so you don't want to tier data too aggressively, nor wait until the last moment. Finding a good balance between keeping data local and moving it to the remote site is always worthwhile.

We will take a look at a few examples.

  • How often do you access the data once it is stored on Objects?
    • E.g., if you have daily backups being pushed to Objects, the last 30 days of backups are accessed frequently, and access becomes infrequent after 30 days. In this case you can configure a 30-day tiering policy: while data access is frequent, reads are served locally, and after 30 days you can still access the data, but Objects will read it from the remote S3 endpoint.
  • Take a look at your rate of data ingest.
    • E.g., if you have deployed a 3N (3-node) Objects cluster on a 4N AOS cluster on a 120TB platform, this configuration gives you roughly 240TB of logical storage (compression savings can drastically reduce space usage here).
    • If you ingest 10TB of data per day, you will reach 90% of your storage usage after about 21 days (see the quick calculation after this list).
    • In this case you may want to configure the tiering policy to tier data after 15-18 days, so you keep enough breathing room for moving the data to S3 and also keep some buffer on the local cluster for new I/Os.
  • Edge deployments on smaller clusters.
    • In typical ROBO deployments, you may deploy just a 1N/2N Objects cluster on a smaller AOS cluster. Given that it's a ROBO deployment, you may use regular smaller nodes, while your central Objects/AOS deployment could be much bigger in size.
    • Given that it's just a 1N Objects deployment, you may run out of storage resources sooner than you think, but you may still want to keep the data around for a longer time. Once cluster storage usage reaches 90%, new writes are stopped, which impacts your entire workload.
    • To overcome this situation, you can configure a tiering policy on the ROBO deployment pointing to the centrally deployed Objects cluster. Even though your ROBO deployment is much smaller in size, it can still store a large amount of data with the help of tiering.
    • You can keep only important data on the local cluster and move the rest to the central Objects cluster, keeping all the data on-prem without overshooting storage resources on the local cluster.
  • Network speed.
    • A lot depends on your network speed. Objects uses the same network for serving user I/O and for the tiering workload, so a slower network will slow down tiering drastically and impact the entire cluster.
    • Make sure you consider network speed while configuring the tiering policy. Higher network speed is always recommended.
    • If you also have replication configured on the cluster, then network speed becomes even more important.
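
To make the ingest-rate example above concrete, here is a quick back-of-the-envelope calculation in Python. The capacity, ingest rate, and 90% threshold are the numbers from the example; everything else is simple arithmetic.

"""
# Back-of-the-envelope sizing for the ingest-rate example above.
logical_capacity_tb = 240  # roughly usable logical storage from the example
ingest_tb_per_day = 10     # daily ingest rate
threshold = 0.90           # usage level at which new writes are at risk

days_to_threshold = int(logical_capacity_tb * threshold / ingest_tb_per_day)
print(f"90% full after ~{days_to_threshold} days")  # ~21 days

# Leave a few days of headroom for the tiering transfer itself and for
# new I/O, which is why the policy should kick in earlier (15-18 days).
"""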

Let's understand what regular I/O looks like:

I am using the AWS Python boto3 SDK for all further experiments:

  • Get the lifecycle policy configured on the bucket:

"""
In [37]: import boto3

In [38]: session = boto3.session.Session()

# Create the S3 client against the Objects endpoint. The endpoint URL and
# credentials below are placeholders; substitute your own values.
In [39]: s3client = session.client(service_name='s3',
    ...:                           endpoint_url='https://<objects-endpoint>',
    ...:                           aws_access_key_id='<access-key>',
    ...:                           aws_secret_access_key='<secret-key>')

In [40]: bucket = '<bucket-name>'

In [44]: s3client.get_bucket_lifecycle_configuration(Bucket=bucket)
Out[44]:
{'ResponseMetadata': {'HTTPHeaders': {'content-length': '283',
   'content-type': 'text/plain; charset=utf-8',
   'date': 'Sun, 31 Jan 2021 07:04:50 GMT'},
  'HTTPStatusCode': 200,
  'RetryAttempts': 0},
 u'Rules': [{u'Filter': {u'And': {u'Prefix': 'demo-object',
     u'Tags': [{u'Key': 'demo-key', u'Value': 'demo-value'}]}},
   u'ID': 'Tiering',
   u'Status': 'Enabled',
   u'Transitions': [{u'Days': 30, u'Endpoint': 'Amazon S3'}]}]}
"""

You should see the exact same policy configured in the UI.
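
If you prefer setting the policy via the API instead of the UI, the standard S3 counterpart of the GET above is put_bucket_lifecycle_configuration. Below is a minimal sketch that simply echoes back the Rules structure returned by the GET; note that the 'Endpoint' field in Transitions is Nutanix-specific, and whether the PUT accepts this exact shape is an assumption here, so treat the UI as the authoritative way to configure tiering.

"""
# A minimal sketch (assumption: the PUT accepts the same Rules shape that
# get_bucket_lifecycle_configuration returned above; the 'Endpoint' field
# is Nutanix-specific, unlike the 'StorageClass' used by AWS S3).
s3client.put_bucket_lifecycle_configuration(
    Bucket=bucket,
    LifecycleConfiguration={
        'Rules': [{
            'ID': 'Tiering',
            'Status': 'Enabled',
            'Filter': {'And': {'Prefix': 'demo-object',
                               'Tags': [{'Key': 'demo-key',
                                         'Value': 'demo-value'}]}},
            'Transitions': [{'Days': 30, 'Endpoint': 'Amazon S3'}],
        }]
    })
"""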

  • Put-object and head-object responses:

"""
In [45]: s3client.put_object(Bucket=bucket, Key="demo-key")
Out[45]:
{u'ETag': '"d41d8cd98f00b204e9800998ecf8427e"',
 'ResponseMetadata': {'HTTPHeaders': {'accept-ranges': 'bytes',
   'content-length': '0',
   'date': 'Sun, 31 Jan 2021 07:07:23 GMT',
   'etag': '"d41d8cd98f00b204e9800998ecf8427e"',
   'server': 'NutanixS3',
   'x-amz-request-id': '165F3F58FC672E17'},
  'HTTPStatusCode': 200,
  'HostId': '',
  'RequestId': '165F3F58FC672E17',
  'RetryAttempts': 0}}

In [46]: s3client.head_object(Bucket=bucket, Key="demo-key")
Out[46]:
{u'AcceptRanges': 'bytes',
 u'ContentLength': 0,
 u'ContentType': 'binary/octet-stream',
 u'ETag': '"d41d8cd98f00b204e9800998ecf8427e"',
 u'LastModified': datetime.datetime(2021, 1, 31, 7, 7, 23, tzinfo=tzutc()),
 u'Metadata': {},
 'ResponseMetadata': {'HTTPHeaders': {'accept-ranges': 'bytes',
   'content-length': '0',
   'content-type': 'binary/octet-stream',
   'date': 'Sun, 31 Jan 2021 07:07:55 GMT',
   'etag': '"d41d8cd98f00b204e9800998ecf8427e"',
   'last-modified': 'Sun, 31 Jan 2021 07:07:23 GMT',
   'md5sum': '',
   'server': 'NutanixS3',
   'x-amz-request-id': '165F3F607CD68283'},
  'HTTPStatusCode': 200,
  'HostId': '',
  'RequestId': '165F3F607CD68283',
  'RetryAttempts': 0}}
"""

As you can see, the object's last-modified time is 'Sun, 31 Jan 2021 07:07:23 GMT'. The 30-day clock starts at 01 Feb 2021 00:00:00 GMT (the first midnight after the write), so the object becomes eligible for tiering 30 days later, at 03 Mar 2021 00:00:00 GMT. The object is then tiered in the next ATLAS full/partial scan after it becomes eligible.
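
Here is a small sketch of that date arithmetic, assuming (based on the timestamps above) that the tiering clock starts at the first UTC midnight after the object's last-modified time:

"""
from datetime import datetime, timedelta, timezone

# last-modified time from the head-object response above
last_modified = datetime(2021, 1, 31, 7, 7, 23, tzinfo=timezone.utc)

# Assumption drawn from the example: the 30-day clock starts at the
# first UTC midnight after the write.
clock_start = (last_modified + timedelta(days=1)).replace(
    hour=0, minute=0, second=0, microsecond=0)

eligible_at = clock_start + timedelta(days=30)  # Days=30 from the policy
print(clock_start)  # 2021-02-01 00:00:00+00:00
print(eligible_at)  # 2021-03-03 00:00:00+00:00
"""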

  • Post tiering, when the data has been moved to the remote endpoint, this is how the head-object response looks:

"""
In [33]: s3client.head_object(Bucket="old-demo-bucket", Key="old-key")
Out[33]:
{u'AcceptRanges': 'bytes',
 u'ContentLength': 124928,
 u'ContentType': 'binary/octet-stream',
 u'ETag': '"0640b7f4a84f982f6bd5bc551e87b32f"',
 u'LastModified': datetime.datetime(2020, 11, 21, 4, 48, 37, tzinfo=tzutc()),
 u'Metadata': {},
 'ResponseMetadata': {'HTTPHeaders': {'accept-ranges': 'bytes',
   'content-length': '124928',
   'content-type': 'binary/octet-stream',
   'date': 'Sun, 31 Jan 2021 06:49:43 GMT',
   'etag': '"0640b7f4a84f982f6bd5bc551e87b32f"',
   'last-modified': 'Sat, 21 Nov 2020 04:48:37 GMT',
   'md5sum': '',
   'server': 'NutanixS3',
   'x-amz-request-id': '165F3E623F0E4E29',
   'x-amz-tagging-count': '1',
   'x-ntnx-location': 'kS3'},
  'HTTPStatusCode': 200,
  'HostId': '',
  'RequestId': '165F3E623F0E4E29',
  'RetryAttempts': 0}}
"""

(The above output is from one of my old buckets.)

What happens internally:

  • You may notice that head-object returns an additional header in the above response: "x-ntnx-location" with value "kS3". This indicates the data has been tiered to the remote S3 endpoint (see the sketch after this list).
  • You do not have to make any changes to your S3 API calls; your application does not need to know whether the object is served locally or from a remote site. The application continues to access the Objects endpoint for all S3 requests, and Objects internally streams the data from the local or tiered site.
  • Once data is tiered to a remote site, you may find that the object names on the remote site are different from the ones uploaded to the local cluster.
    • E.g. - a8b07324-9c86-46fb-49f2-7bb0b2602753/a8b07324-9c86-46fb-49f2-7bb0b2602753/24141918135/29/24141918272
  • This is because data is tiered to remote clusters in an internal data-structure form that only Objects can understand.
  • You may have a few thousand or a few million small objects that are transferred as just a couple of objects on the remote cluster, while large objects are broken into smaller objects to ensure reliable data transfer.
  • Only the data is tiered; all object metadata is still kept locally, which makes the end-user application experience seamless. All the magic is done within your Objects cluster.
  • Storing metadata on Objects but data on remote clusters makes it impossible to understand the data in its original form if someone tries to read it directly from the remote cluster.
  • Keeping metadata local also helps serve all other S3 API calls (except data reads) faster, such as head-object or list-objects.
  • You can use all other Objects features on this bucket; they are well integrated with tiering.
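
As a quick illustration of the x-ntnx-location header, here is a minimal sketch that classifies every object in a bucket as local or tiered. It assumes the s3client and bucket set up in the earlier examples; is_tiered is just a hypothetical helper name, not part of any SDK.

"""
def is_tiered(s3client, bucket, key):
    # True if the object's data has been tiered to the remote S3 endpoint,
    # based on the x-ntnx-location response header ('kS3' = remote).
    resp = s3client.head_object(Bucket=bucket, Key=key)
    headers = resp['ResponseMetadata']['HTTPHeaders']
    return headers.get('x-ntnx-location') == 'kS3'

# Walk the bucket and report each object's location. head-object and
# list-objects are metadata calls, served from local metadata, so this
# does not touch the tiered data itself.
paginator = s3client.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket=bucket):
    for obj in page.get('Contents', []):
        where = 'tiered' if is_tiered(s3client, bucket, obj['Key']) else 'local'
        print(obj['Key'], where)
"""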

Give this feature a try and let us know your feedback.