This article describes the limitations and potential risks when Thin Provisioned LUNs are used as a Shared Storage SR in the current release of Citrix XenServer.
Most major storage vendors (including NetApp, Dell, and HP) offer customers the option of configuring LUNs as Thin Provisioned, which provides more efficient use of storage space as well as cost savings. However, Thin Provisioned LUNs achieve the desired efficiency only when Space Reclamation features are also in use.
Some operating systems and storage vendors currently provide this functionality together, so that Thin Provisioned LUNs work with Space Reclamation. However, the current release of XenServer does not.
Limitation and Potential Risk
1. Does Citrix support Thin Provisioned LUN + Space Reclamation in its implementation of XenServer block based storage types (iSCSI or HBA)?
a. "No, we don't support it in any current XenServer release."
b. The components required to support space reclamation in thin provisioned LUNs are missing in both the current XenServer kernel and the LVM version used by XenServer.
2. What would be the expected behavior on the current XenServer release if a customer uses a Thin Provisioned LUN as a XenServer SR?
a. The LUN would "fatten up" over time.
b. That is, the LUN starts thin, but as the user creates and deletes VDIs, the space freed by deletions is not released. XenServer may reuse that space, but it is never returned to the storage array as "free space" that could be used by a different LUN.
c. Storage controllers would still see the blocks as “used” even when XenServer has deleted VDIs, so space reclamation would not be performed.
3. What would be the risk of using a Thin Provisioned LUN as a XenServer SR in the current release?
a. The user/administrator must never allow the storage array to run out of physical space to back the thin-provisioned LUN.
b. The tapdisk processes in XenServer check whether they have sufficient space based on the size of the Logical Volume. If the array runs out of space, the tapdisks will start returning "out of space" errors to the guests. The guests will think they still have plenty of space and may not handle this error well. In any case, most guests will crash as their root file systems or swap space become unusable.
c. Some operations (e.g. coalescing of VHD files) check for (and reserve) sufficient space in the LUN before starting. If the LUN is thin-provisioned and the array runs out of physical space, the coalesce process will receive "out of space" errors. Because the operation had reserved the space in advance, this error path is not well handled.
d. The storage controller itself would take the LUN offline when all the space is used.
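The behavior described in questions 2 and 3 can be illustrated with a toy model. The Python sketch below is not XenServer or storage array code: the class names (PhysicalPool, ThinLun, Sr), the block counts, and the reclaim flag are hypothetical and chosen only for illustration. It assumes a pool of 10 physical blocks shared by two thin LUNs of 10 logical blocks each, and contrasts a host that never tells the array about deleted VDIs (reclaim=False, the situation this article describes) with one that does.

```python
class ArrayOutOfSpace(Exception):
    """Raised when the physical pool cannot back another block."""


class PhysicalPool:
    """Physical capacity shared by all thin LUNs on the (hypothetical) array."""

    def __init__(self, physical_blocks):
        self.free = physical_blocks

    def allocate(self):
        if self.free == 0:
            raise ArrayOutOfSpace("no physical blocks left in the pool")
        self.free -= 1

    def release(self):
        self.free += 1


class ThinLun:
    """Array-side view: a logical block gets backing on its first write."""

    def __init__(self, pool, logical_blocks):
        self.pool = pool
        self.logical_blocks = logical_blocks
        self.backed = set()  # logical blocks that consume physical space

    def write(self, block):
        if not 0 <= block < self.logical_blocks:
            raise IndexError(block)
        if block not in self.backed:
            self.pool.allocate()  # may raise ArrayOutOfSpace
            self.backed.add(block)

    def unmap(self, block):
        # Space reclamation: the host tells the array the block is unused.
        if block in self.backed:
            self.backed.discard(block)
            self.pool.release()


class Sr:
    """Host-side view: tracks which logical blocks each VDI occupies."""

    def __init__(self, lun, reclaim=False):
        self.lun = lun
        self.reclaim = reclaim  # False models a host without space reclamation
        self.vdis = {}

    def used_blocks(self):
        return sum(len(blocks) for blocks in self.vdis.values())

    def create_vdi(self, name, size):
        in_use = {b for blocks in self.vdis.values() for b in blocks}
        free = [b for b in range(self.lun.logical_blocks) if b not in in_use]
        if len(free) < size:
            raise ValueError("SR is full from the host's point of view")
        blocks = free[:size]
        for b in blocks:
            self.lun.write(b)
        self.vdis[name] = blocks

    def delete_vdi(self, name):
        blocks = self.vdis.pop(name)
        if self.reclaim:
            for b in blocks:
                self.lun.unmap(b)
        # Without reclamation the array is never told; blocks stay backed.


def demo(reclaim):
    pool = PhysicalPool(physical_blocks=10)
    sr = Sr(ThinLun(pool, logical_blocks=10), reclaim=reclaim)
    other = ThinLun(pool, logical_blocks=10)  # second LUN on the same pool
    sr.create_vdi("a", 4)
    sr.create_vdi("b", 4)
    sr.delete_vdi("a")
    try:
        for b in range(4):
            other.write(b)
        outcome = "ok"
    except ArrayOutOfSpace:
        outcome = "out of space"
    return sr.used_blocks(), len(sr.lun.backed), outcome


if __name__ == "__main__":
    print(demo(reclaim=False))
    print(demo(reclaim=True))
```

Under these assumptions, demo(reclaim=False) returns (4, 8, "out of space"): the host believes only 4 blocks are in use, the array still backs 8, and the second LUN fails on its third write. With reclaim=True the result is (4, 4, "ok"), because the deleted blocks go back to the pool.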
1. Space Reclamation:
2. How host operating systems can automatically reclaim space and keep LUNs online https://library.netapp.com/ecmdocs/ECMP1196995/html/GUID-93D78975-6911-4EF5-BA4E-80E64B922D09.html