How to Obtain Higher Performance from Amazon EBS “gp2” Volumes for SAP HANA

Rahul Deo
Jun 12, 2021 · 7 min read

First things first: this blog assumes that you know what the SAP HANA database is, how it works, what AWS EBS storage is and how it works with AWS EC2 instances, and what disk IOPS are.

With the assumptions out of the way, the intent of this blog is to demonstrate how striping disks in a software-based RAID 0 configuration can dramatically improve disk I/O performance and overcome the per-volume IOPS limit of the gp2 EBS volume type.

Preface

SAP HANA has stringent I/O requirements for the storage used for persistent data and logs. Good read/write performance is critical for scenarios such as writing transactions to the redo log, persisting delta merges to storage after a commit, loading row store tables from persistent storage into main memory during a database restart, and on-demand loading of column store tables into main memory. (For more details on SAP HANA storage requirements, refer to the document "SAP HANA Storage Requirements".)

So, what is the right storage type for SAP HANA?

Storage choices in AWS

For hosting SAP HANA workloads on AWS, there are two SAP-certified storage options: General Purpose SSD (gp2) and Provisioned IOPS SSD (io1/io2). (Ref: AWS Storage Configuration for SAP HANA)

For most scenarios, AWS generally recommends starting with the gp2 volume type for SAP HANA persistent data and moving to io1/io2 only if the required level of I/O performance or availability cannot be achieved with General Purpose SSD. (Ref: SAP Note 1656250 - SAP on AWS: Support Prerequisites)

For production scenarios, SAP recommends using EBS-optimized EC2 instances in conjunction with the above storage types. SAP has certified various EC2 instance types and sizes for hosting the SAP HANA database; the list is available in the CERTIFIED AND SUPPORTED SAP HANA® HARDWARE DIRECTORY.

A single General Purpose SSD (gp2) volume ranges from 1 GiB to 16 TiB in size and delivers baseline I/O performance of 3 IOPS per GiB, with a minimum of 100 IOPS.

But there is an IOPS cap on a gp2 volume of 16,000 IOPS, the maximum achievable with a single volume. That means for any gp2 volume larger than about 5,334 GiB (roughly 5.2 TiB), adding capacity brings no further IOPS improvement. This cap falls short of the performance requirements of SAP HANA for persistent data. So how can we overcome this limitation?
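As a quick sanity check, the gp2 baseline formula can be sketched in a few lines of shell (the function name `gp2_iops` is just for illustration, not an AWS tool):

```shell
# Baseline gp2 IOPS: 3 IOPS per GiB, with a 100 IOPS floor and a 16,000 IOPS cap.
gp2_iops() {
  local size_gib=$1
  local iops=$(( size_gib * 3 ))
  (( iops < 100 ))   && iops=100
  (( iops > 16000 )) && iops=16000
  echo "$iops"
}

gp2_iops 10     # floor applies: 100
gp2_iops 1024   # 1 TiB volume: 3072
gp2_iops 6000   # above ~5334 GiB the 16,000 IOPS cap applies
```

Past the cap, the only way to get more IOPS out of gp2 is to spread the I/O across multiple volumes.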

The answer is striping multiple volumes to overcome the per-volume IOPS limit. Multiple EBS volumes can be striped into a single filesystem using software RAID, such as Logical Volume Manager (LVM) in a RAID-0 configuration.

Because each EBS volume is already protected against physical drive failure through replication in AWS's underlying infrastructure, any RAID level higher than RAID-0 is unnecessary and not recommended for SAP HANA, as higher RAID levels would negatively impact I/O performance. This is as recommended by SAP. (Ref: SAP Note 1656250 - SAP on AWS: Support Prerequisites)

For SAP HANA, read and write performance is critical for /hana/data, /hana/log and /hana/backup, which hold the data, transaction redo logs and backups respectively. Good I/O ensures faster loading of tables into main memory, faster restarts, quicker writes of transactions to the redo logs and good backup write performance. The overall setup of an SAP HANA instance with striped volumes looks like this:

Reference SAP HANA Instance

For /hana/log and /hana/backup, the initial logical volume can be created on a single EBS volume and extended later with additional volumes to meet space and performance requirements.

Storage Configuration for SAP HANA

How to configure gp2 volumes in RAID-0 configuration with Logical Volume Manager based striping? AWS has provided detailed documentation with configuration steps at https://docs.aws.amazon.com/sap/latest/sap-hana/operating-system-and-storage-configuration.html.

The example below should help illustrate how striping gp2 volumes can drastically improve disk I/O performance.

For this demonstration, I created a t2.micro instance with the following storage configuration (all volumes of type gp2):

· 10 GB for root volume

· 5 GB for additional volume 1

· 5 GB for additional volume 2

Once the instance is provisioned, below is the disk layout:

For demonstration purposes we will keep the root volume unstriped, while striping both 5 GB volumes into a single logical volume.

To start, create physical volumes from the attached disks xvdb and xvdc.

Execute the command:

# pvcreate /dev/xvdb /dev/xvdc

Then create a volume group from these two 5 GB physical volumes. For demo purposes, I am using the naming convention from AWS's SAP HANA storage configuration documentation.

Execute the command:

# vgcreate vghanadata /dev/xvdb /dev/xvdc

After the volume group is created, we will create a logical volume for SAP HANA data spanning the combined 10 GB of the xvdb and xvdc volumes.

Execute the command:

# lvcreate -n lvhanadata -i 2 -I 256 -L 10G vghanadata

Here, "-i 2" specifies the number of stripes (equal to the number of volumes) and "-I 256" specifies the stripe size in KiB.
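To make the stripe geometry concrete, here is a small back-of-the-envelope calculation in shell (illustrative arithmetic only, not an LVM command):

```shell
# With -i 2 (two stripes) and -I 256 (256 KiB stripe size), a 1 MiB
# sequential write is cut into 256 KiB chunks laid out round-robin:
write_kib=1024
stripe_kib=256
stripes=2
chunks=$(( write_kib / stripe_kib ))   # 4 chunks of 256 KiB
per_disk=$(( chunks / stripes ))       # 2 chunks land on each disk
echo "${chunks} chunks, ${per_disk} per disk"
```

Because the chunks on the two disks are written in parallel, each disk sees only half of the I/O, which is where the performance gain comes from.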

But executing the above command ends in an error.

Now why is that?

This happens because, when physical volumes are added to a volume group, their disk space is divided into extents, 4 MB each by default. An extent is the minimum amount by which a logical volume can be grown or shrunk.

So, ideally, the number of extents for our 2 * 5 GB volumes should be:

(10 GB * 1024 MB / 4 MB) = 2560, but the maximum available to us is only 2558.

That's because roughly 4 MB of each physical volume is set aside for LVM metadata and is not allocatable. With two physical volumes in the volume group, about 8 MB, the equivalent of two extents, is unusable.
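The extent arithmetic above can be checked in shell (assuming the default 4 MiB extent size and roughly one extent of metadata reserved per physical volume):

```shell
# 2 x 5 GiB physical volumes, 4 MiB extents:
pv_count=2
pv_gib=5
extent_mib=4
raw_extents=$(( pv_count * pv_gib * 1024 / extent_mib ))  # 2560 if nothing were reserved
usable_extents=$(( raw_extents - pv_count ))              # ~4 MiB metadata area per PV
echo "raw=${raw_extents} usable=${usable_extents}"
```

On a real system, the actual free extent count can be read from the "Free PE" field of the vgdisplay output.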

So I chose a size of 9.9 GB for the logical volume (LV), leaving a comfortable buffer within the available extents, and lvcreate adjusted the LV size slightly to align with the stripe boundary. (Alternatively, passing "-l 100%FREE" instead of "-L" tells lvcreate to consume all remaining extents.)

Next, create an XFS file system on the newly created logical volume for HANA data.

Execute the command:

# mkfs.xfs -f /dev/mapper/vghanadata-lvhanadata

Create two directories /hana and /hana/data at root level.

Execute the command:

# mkdir /hana /hana/data

Use the echo command to append an entry to the /etc/fstab file with the following mount options, so that the file system is mounted automatically on restart.

Execute the command:

# echo "/dev/mapper/vghanadata-lvhanadata /hana/data xfs nobarrier,noatime,nodiratime,logbsize=256k 0 0" >> /etc/fstab

Mount the newly created file system on the /hana/data directory.

Execute the command:

# mount /dev/mapper/vghanadata-lvhanadata /hana/data

The updated disk layout looks like this:

Volumes xvdb and xvdc are part of the same volume group and are mounted as a single logical volume at /hana/data.

The Test!!!

Now for the finale: we run an I/O test to check whether striping two volumes gives us a performance gain or not, using the fio utility as the benchmark. I ran the benchmark twice, first on the root volume, which is an unstriped single 10 GB volume, and then on the lvhanadata logical volume, with the following command:

# fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75

This command creates a 4 GB file and performs 4 KB reads and writes in a 75%/25% read/write mix, with 64 operations in flight at a time.

Result on the root volume:

Read IOPS: 2302, Write IOPS: 769

Read Throughput: 9.21 MB/s, Write Throughput: 3.08 MB/s

Result on the lvhanadata logical volume:

Read IOPS: 4617, Write IOPS: 1543

Read Throughput: 18.9 MB/s, Write Throughput: 6.17 MB/s
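Plugging the two result sets into a quick shell calculation makes the gain explicit (integer math, scaled by 100 to avoid floating point):

```shell
# fio results from the two runs above:
single_read=2302;  single_write=769
striped_read=4617; striped_write=1543

single_total=$(( single_read + single_write ))        # 3071 IOPS in total
striped_total=$(( striped_read + striped_write ))     # 6160 IOPS in total
ratio_x100=$(( striped_total * 100 / single_total ))  # ~200, i.e. roughly 2x
echo "single=${single_total} striped=${striped_total} ratio_x100=${ratio_x100}"
```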

We can clearly see that striping the gp2 volumes results in almost double the IOPS and throughput compared to a single unstriped volume.

This I/O gain helps build highly performant SAP HANA database instances, where IOPS and throughput can in theory be scaled up to the maximum values supported by the instance to which the volumes are attached.

I hope this write-up was informative. Thanks for reading!
