Sumash Singh, Director – Data Centre Solutions, South Asia & Korea, Dell EMC
Data&StorageAsean: What is the difference between deduplication and other Data Reduction technologies such as compression?
Sumash Singh: Data deduplication looks for redundancy of sequences of bytes across very large comparison windows. If the sequence of bytes is unique, it's stored on disk; if it is a duplicate, a reference is created to it and it isn't stored again.
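As a rough illustration of the store-once, reference-thereafter behaviour described above, the sketch below keeps unique chunks in an in-memory dictionary keyed by a SHA-256 fingerprint. The `DedupStore` class, its method names, and the fixed chunk size are assumptions made for this example; it is not a description of any Dell EMC product, and production systems often use variable-size chunking and on-disk indexes.

```python
import hashlib


class DedupStore:
    """Hypothetical in-memory chunk store, for illustration only."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # fingerprint -> chunk bytes, each stored once

    def write(self, data):
        """Store data and return its recipe: an ordered list of fingerprints."""
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            fp = hashlib.sha256(chunk).hexdigest()
            if fp not in self.chunks:
                # Unique sequence of bytes: keep it.
                self.chunks[fp] = chunk
            # Duplicate or not, the recipe only holds a reference.
            recipe.append(fp)
        return recipe

    def read(self, recipe):
        """Rebuild the original bytes from a recipe."""
        return b"".join(self.chunks[fp] for fp in recipe)

    def stored_bytes(self):
        """Bytes physically held, excluding the recipes themselves."""
        return sum(len(c) for c in self.chunks.values())


store = DedupStore(chunk_size=4)
recipe = store.write(b"AAAABBBBAAAA")  # three chunks, two of them identical
print(len(recipe), store.stored_bytes())  # prints "3 8"
```

The recipe is what lets a duplicate be "stored" at the cost of a reference: the third chunk adds an entry to the recipe but no new bytes to the store.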
Compression, on the other hand, reduces the physical amount of storage required to save a dataset. Think of compression as taking an item and manipulating it so that it takes up less space than it originally did. The key difference is that most compression systems work with a file, a set of files, or possibly a tape in a given instance, while deduplication systems work over an entire storage environment for an extended period of time.
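The difference in scope can be sketched with a small comparison: compressing each backup on its own versus deduplicating chunks across all of them. The function names and the 4 KiB chunk size are assumptions for this example. The sample data is deliberately random, so compression within a single copy cannot help and only the redundancy between copies is left to exploit; real backup data would usually benefit from both techniques.

```python
import hashlib
import os
import zlib


def compressed_size(backups):
    """Total size when each backup is compressed independently."""
    return sum(len(zlib.compress(b)) for b in backups)


def deduplicated_size(backups, chunk_size=4096):
    """Total size when identical chunks are kept once across all backups."""
    seen = set()
    total = 0
    for backup in backups:
        for i in range(0, len(backup), chunk_size):
            chunk = backup[i:i + chunk_size]
            fp = hashlib.sha256(chunk).digest()
            if fp not in seen:
                seen.add(fp)
                total += len(chunk)
    return total


# Seven nightly backups of the same unchanged 64 KiB dataset.
nightly = os.urandom(64 * 1024)
backups = [nightly] * 7

print("compressed per backup:", compressed_size(backups))
print("deduplicated across backups:", deduplicated_size(backups))
```

Per-backup compression pays for all seven copies, while the deduplicated total stays at the size of one copy, because the comparison window spans every backup rather than a single file.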
Data&StorageAsean: Why do the use cases we see for deduplication seem to be limited to backup appliances and all flash arrays?
Sumash Singh: Disks are the primary backup target in most data centres. As primary storage becomes more affordable over time, enterprises typically store many versions of the same information so that new workers can reuse previously done work. Operations such as backup store a lot of duplicate data, consuming unnecessary storage space on the disk, electricity to power and cool the disk drives, and bandwidth for replication. This creates a chain of cost and resource inefficiencies within the organisation.
Deduplication is a perfect fit for lowering storage costs, as fewer disks are needed. It also improves disaster recovery, since there is far less data to transfer. As a result, enterprises of all sizes rely on deduplication for fast, reliable, and cost-effective backup and recovery.
Data&StorageAsean: Are there different approaches to deduplication and if so what are the benefits and downsides of each?
Sumash Singh: Deduplication is ideal for highly redundant operations like backup, which requires repeatedly copying and storing the same dataset for recovery purposes. Eliminating redundant data significantly shrinks storage requirements, lowers costs, and improves bandwidth efficiency. However, deduplication requires a commitment to using it across an entire storage environment. It also requires special hardware and/or software to perform the deduplication and store the results.
Deduplication also addresses a common challenge for remote offices where there are no on-site skills to manage backups. Using a deduplication-capable disk array for backup reduces the dependency on tapes, and also the need to manage them. Add to that the ability to replicate deduplicated data across the WAN and you have a backup solution with low management overhead. The downside, however, is that not all data is suitable for deduplication; images, video, audio, and other types of already-compressed data will gain little from it.
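The bandwidth saving from replicating deduplicated data across the WAN can be sketched as follows: the remote site keeps a chunk index, and only chunks it does not already hold are sent. The `replicate` function, the dictionary standing in for the remote site, and the chunk size are all assumptions for this example. The sketch ignores the cost of exchanging fingerprints and does not represent any vendor's replication protocol.

```python
import hashlib


def split(data, chunk_size=4096):
    """Cut data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def replicate(data, remote_chunks, chunk_size=4096):
    """Send only chunks the remote site lacks; return payload bytes sent."""
    sent = 0
    for chunk in split(data, chunk_size):
        fp = hashlib.sha256(chunk).hexdigest()
        if fp not in remote_chunks:
            remote_chunks[fp] = chunk  # stands in for a transfer over the WAN
            sent += len(chunk)
    return sent


remote = {}  # hypothetical chunk index at the disaster recovery site
monday = b"A" * 4096 + b"B" * 4096 + b"C" * 4096
tuesday = b"A" * 4096 + b"B" * 4096 + b"D" * 4096  # one chunk changed

print(replicate(monday, remote))   # prints 12288: the first backup sends everything
print(replicate(tuesday, remote))  # prints 4096: only the changed chunk is sent
```

After the first full transfer, each later backup costs roughly as much bandwidth as the data that actually changed, which is why a remote office with a slow link can still replicate daily.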
Data&StorageAsean: Is deduplication technology relevant as companies virtualise and cloud enable?
Sumash Singh: Yes, deduplication is important even in a virtual or cloud environment. As the use of software-defined storage grows, so does the need for technology to make it more secure, efficient and reliable. For example, our Dell EMC Data Domain Virtual Edition includes features such as data deduplication, replication, data integrity, and encryption. This enables organisations to reduce the cost and complexity of their data protection infrastructure, while delivering on business Service Level Agreements (SLAs).
Data&StorageAsean: Are there any unique features you would like to share about your own deduplication offerings?
Sumash Singh: We recently announced the expansion of our midrange storage portfolio with two new SC All-Flash data storage arrays, along with key software updates to Dell EMC Unity designed to boost efficiency and cost savings for mixed block and file workloads.
The Dell EMC SC5020F and Dell EMC SC7020F arrays offer intelligent data deduplication and compression, RAID tiering, and pervasive thin provisioning to help make cost savings automatic, while the new Unity v4.3 OS provides several key updates, including deduplication to help lower costs, along with new technologies to facilitate non-disruptive system upgrades and enable file synchronisation.
By combining deduplication and compression, customers will now be able to advance their IT modernisation efforts with ultra-efficient storage for a wide range of workloads and applications.