Thursday, March 27, 2014

Why do we need SAN?

Why do we need SANs any more when many virtualised app and storage controller servers and virtualised storage can co-exist in a single set of racks?
Storage area networks (SANs) came into being so that many separate physical servers could each access a central storage facility at block level. They each saw their own LUN (logical unit number) chunk of storage as being directly connected to them even though it was actually accessed through a channel, the Fibre Channel fabric tying the SAN together.
Nowadays our servers are vastly more powerful through having multiple processors, typically x86 ones, fitted in multiple sockets, with each processor having many cores, and each core capable of running several threads. The processor engines are managed and allocated to applications by a hypervisor such as VMware's ESX. Each core can run an application inside an O/S wrapper such as Windows or Linux, and the application can be multi-threaded.
Blade systems such as those from HP can cram more than a hundred cores into a collection of 1U rack shelves. These servers need to communicate with client systems, with storage facilities, and with the wider world via networking.
The local client access is a given: Ethernet rules. The storage access has been provided by a Fibre Channel SAN, with smaller enterprises using a cheaper, less complex, and less scalable iSCSI SAN, or perhaps a filer or two. Filers have been remarkably successful in providing VMware storage; witness NetApp's wonderful business results over the past few quarters.
We now have storage in the mid-range that swings both ways; unified storage providing file and block access. This looks as if it makes the virtualised storage choice more difficult, but in fact a reunification initiative could come about by using the storage array controller engines to run applications.

SNAPSHOT

Snapshot is a common industry term denoting the ability to record the state of a storage device at any given moment and preserve that snapshot as a guide for restoring the storage device in the event that it fails. A snapshot primarily creates a point-in-time copy of the data. Typically, a snapshot copy is created instantly and made available for use by other applications, such as data protection, data analysis and reporting, and data replication applications. The original copy of the data continues to be available to the applications without interruption, while the snapshot copy is used to perform other functions on the data.
Snapshots provide an excellent means of data protection. The trend towards using snapshot technology comes from the benefits that snapshots deliver in addressing many of the issues that businesses face: better application availability, faster recovery, easier backup management of large volumes of data, reduced exposure to data loss, virtual elimination of backup windows, and lower total cost of ownership (TCO).
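
As a rough illustration of how a point-in-time copy can be taken instantly while the original stays online, here is a minimal copy-on-write sketch in Python. The `Volume` class and its method names are hypothetical, invented for this example; real arrays do this at the block layer with far more bookkeeping.

```python
class Volume:
    """Toy block device with copy-on-write snapshots (illustrative only)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)  # live data seen by the application
        self.snapshots = []         # one dict per snapshot: block index -> preserved value

    def snapshot(self):
        # Taking a snapshot copies no data; it only starts tracking changes.
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index, value):
        # Before overwriting, preserve the old value in every snapshot
        # that has not already saved this block.
        for saved in self.snapshots:
            saved.setdefault(index, self.blocks[index])
        self.blocks[index] = value

    def read_snapshot(self, snap_id, index):
        # Unchanged blocks are read from the live volume,
        # changed blocks from the preserved copy.
        saved = self.snapshots[snap_id]
        return saved.get(index, self.blocks[index])
```

The application keeps writing to `blocks` without interruption, while a backup or reporting job reads a stable image through `read_snapshot`.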

[Figure: copy-on-write (COW) snapshot view. Source: IBM]

EMC SAN INTERVIEW QUESTIONS

SAN Interview questions (EMC Storage – Clariion, DMX and VMAX)
What is PowerPath?
PowerPath CLI to manage disks
List PowerPath policies
What is a vault drive?
What is the PSM LUN?
Basics of storage
Define RAID. Which level do you feel is a good choice?
Storage Array used in DAS
Explain iSCSI login, fabric login
Advantage of migration from DAS to SAN
What is a MetaLUN?
Explain Clariion architecture
Explain DMX architecture
Explain Enginuity operation layers
What is hard and soft zoning?
Explain WWN
What is zoning and how to create?
What is VSAN and how to create?
Hardware models of Clariion
What is FCID?
Explain Navisphere / Symmetrix Management Console / ECC
Initialization of a Clariion array
Explain rule 17 in DMX
Why and how symmask, symld and symdg are used in DMX?
Symdev
Explain about symcfg
What is SYMAPI?
Configuration change in DMX
What is VCMDB?
Can Windows, Linux, and Solaris share the same FA in DMX?
What is SnapView?
What is MirrorView?
What is SAN Copy?
Explain TimeFinder and SRDF
Difference between iSCSI and NAS
What is IQN?
Explain SAN, NAS, and CAS, with the devices used in each model
Difference between iFCP and FCIP
What is fabric?
What is RAID? Explain RAID3, RAID5 and RAID1/0
What is Hot Spare Disk?
What are the bays in DMX-3?
Version and Model
Briefly describe the Symmetrix CLI commands
Create Storage group and add device into storage group in DMX.
Create TimeFinder Clone using CLI
Composite Device group
Create SRDF
What is iSCSI?
What is Disk Controller?
How is data stored in the case of striping and in the case of concatenation?
What is the minimum number of disks required for RAID 5 and RAID 6?
Difference between TimeFinder and clone?
What is SRDF R1 & R2?
What is the version of Symmetrix DMX4?
In 4-24, what does 24 mean?
Importance of RAID6?
How many disk failures does RAID 5 tolerate?
Importance of masking?
Different RAID levels?
What is a quorum disk and what is its importance?
How to manually restore failed paths in Clariion?
Flash drives in DMX4?
What is LCC (Link Control Card)?
Storage provisioning in DMX?
Steps for zoning using CLI?
Describe SMCLI commands you have used
LUN, base LUN, and MetaLUN?
Difference between HP EVA 5000 and 8000?
What is CMI (Clariion Message Interface)?
What are the I/O operations in Clariion?
Use of SPs?
What is VCMDB?
What is a hyper?
What is a device in DMX?
What is SAN Kit?
Channel directors and disk directors?
What is global memory?
Difference between Emulex and Qlogic?
What is storage array in Clariion?
What are FLOGI and PLOGI? How does authentication happen?

Storage Area Network

Understanding The Basics Of SAN Technology

In today’s competitive business environment, virtually every industry and technology area is under intense pressure to deliver more performance at a lower cost, and the storage world is no exception. For the SAN market in particular, much of the focus is not only on reducing the overall cost of the systems but also on making better use of disk space and reducing the power and energy requirements of SAN systems.

Better storage space utilization is being pursued on a number of fronts, beginning with an emphasis on reducing (or eliminating) duplicate data in a storage infrastructure, a process known as deduplication. Deduplication eliminates duplicate files or data blocks across storage volumes. Typically, deduplication approaches are classed as either “inline” or “post-process”. An inline implementation identifies and eliminates duplicate data as it moves between host and target. A post-process implementation identifies and eliminates duplicate data after it has been initially written to disk. Deduplication approaches offer data reductions in the range of 50-95%, so significant cost savings can be achieved over time. Until recently, deduplication was offered solely by disk backup and virtual tape library (VTL) tools, but vendors are beginning to add deduplication feature sets to their network-attached storage (NAS) systems as well.

Sophisticated storage management and reporting tools are also being actively used to monitor and report on utilization of the storage infrastructure and to highlight opportunities for improvement.
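
To make the block-level deduplication idea concrete, here is a minimal Python sketch that stores each unique block once and keeps an ordered list of fingerprints to rebuild the original data. The function names and the choice of SHA-256 as the fingerprint are assumptions made for this example, not a description of any vendor's implementation.

```python
import hashlib


def deduplicate(blocks):
    """Return (store, recipe): unique blocks keyed by digest, plus rebuild order."""
    store = {}   # digest -> block contents, each unique block kept once
    recipe = []  # digests in original order, enough to rebuild the data
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # keep only the first copy of a block
        recipe.append(digest)
    return store, recipe


def restore(store, recipe):
    """Rebuild the original block sequence from the deduplicated store."""
    return [store[digest] for digest in recipe]


def savings(blocks, store):
    """Fraction of blocks that no longer need to be stored."""
    return 1 - len(store) / len(blocks)
```

Run over blocks already on disk, this behaves like a post-process pass; calling the same fingerprint check on each block before it is written would make it inline.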



Another area of emphasis in the quest to reduce costs and improve the overall return on investment for SANs is the energy footprint of the storage infrastructure and the associated floor space required by these systems. Total energy consumption is being reduced through core technology improvements in the power consumption of storage systems. In addition, larger disk drives have improved storage density, which leads to better use of floor space. Because more floor space indirectly equates to higher cooling and power costs, reducing the floor space required by the storage infrastructure can have a direct impact on hosting expenses. Drive manufacturers are also aggressively implementing spin-down technologies that stop drives from spinning when they are not in use while still allowing them to spin up rapidly, offering excellent performance at a lower total power consumption. The ultimate “spin-down” drives are solid state drives (SSDs), which don’t spin at all. SSDs currently compete very effectively on performance and power consumption but are still more expensive than their mechanical counterparts.

An additional factor driving both the adoption of SANs and the improved cost-efficiency of these systems is the uptake of SANs in the small-to-medium-sized business (SMB) environment. Shared storage environments such as SANs, combined with virtualized servers and the ability to share storage capacity across multiple hosts, allow smaller businesses to scale up infrastructure more effectively as required, while retaining the benefits that SANs offer to much larger companies.

Another trend related to SANs is the increasing popularity and adoption of iSCSI and related multiprotocol storage technologies. iSCSI is becoming a popular option for businesses that realize their current needs would use only a fraction of the available Fibre Channel capacity, making iSCSI a more practical and more cost-effective solution. The rapid acquisition of many of the smaller iSCSI specialty companies by the larger storage infrastructure companies means that the power of iSCSI is rapidly being combined with the more sophisticated management tools offered by the larger companies. Multiprotocol storage options go beyond iSCSI to include additional protocols such as NFS, CIFS, and Fibre Channel. A multiprotocol storage solution consolidates SAN and NAS arrays into a smaller number of multiprotocol arrays, which helps simplify the management and configuration of these disparate environments.

RAID

RAID stands for Redundant Array of Inexpensive (Independent) Disks.
In most situations you will be using one of the following four RAID levels.
  • RAID 0
  • RAID 1
  • RAID 5
  • RAID 10 (also known as RAID 1+0)
This article explains the main differences between these RAID levels, along with easy-to-understand diagrams.

In all the diagrams mentioned below:
  • A, B, C, D, E and F – represent blocks
  • p1, p2, and p3 – represent parity

RAID LEVEL 0


Following are the key points to remember for RAID level 0.
  • Minimum 2 disks.
  • Excellent performance ( as blocks are striped ).
  • No redundancy ( no mirror, no parity ).
  • Don’t use this for any critical system.
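
As a small sketch of what "blocks are striped" means, the Python function below maps a logical block number to a disk and a position on that disk using simple round-robin placement. The function name is made up for this example, and real controllers stripe in larger chunks than a single block.

```python
def raid0_locate(logical_block, num_disks):
    """Return (disk index, block offset on that disk) for a striped block."""
    if num_disks < 2:
        raise ValueError("RAID 0 needs at least 2 disks")
    # Consecutive blocks go to consecutive disks, wrapping around.
    return logical_block % num_disks, logical_block // num_disks
```

With two disks, blocks A, B, C, D land on disk 0, disk 1, disk 0, disk 1, which is why sequential reads and writes can keep both disks busy at once, and why losing either disk loses half of every file.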

RAID LEVEL 1

Following are the key points to remember for RAID level 1.
  • Minimum 2 disks.
  • Good performance ( no striping, no parity ).
  • Excellent redundancy ( as blocks are mirrored ).

RAID LEVEL 5


Following are the key points to remember for RAID level 5.
  • Minimum 3 disks.
  • Good performance ( as blocks are striped ).
  • Good redundancy ( distributed parity ).
  • Most cost-effective option providing both performance and redundancy. Use this for databases that are heavily read-oriented. Write operations will be slow, because parity has to be updated on every write.
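
The parity used by RAID 5 is a bytewise XOR of the data blocks in a stripe, which is what lets the array rebuild any single missing block. The Python sketch below shows that idea for one stripe; the function names are illustrative, and it leaves out how parity is rotated across disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks; used for parity and for rebuild."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks):
    """Recover the one missing block of a stripe from all the others.

    surviving_blocks must contain every remaining data block plus the parity.
    """
    return xor_blocks(surviving_blocks)
```

For a stripe holding A, B, C and parity p1 = A xor B xor C, losing the disk with B still allows B to be recomputed as A xor C xor p1. A second failure in the same stripe cannot be recovered.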

RAID LEVEL 10

Following are the key points to remember for RAID level 10.
  • Minimum 4 disks.
  • This is also called a “stripe of mirrors”.
  • Excellent redundancy ( as blocks are mirrored ).
  • Excellent performance ( as blocks are striped ).
  • If you can afford the cost, this is the BEST option for any mission-critical application (especially databases).
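
To compare the cost side of the four levels, the Python sketch below computes usable capacity from the number of disks and the size of each disk. The function name is hypothetical, and it assumes equal-sized disks and two-way mirrors for RAID 10.

```python
MIN_DISKS = {0: 2, 1: 2, 5: 3, 10: 4}  # minimum disks per level, as listed above


def usable_capacity(level, num_disks, disk_size):
    """Usable capacity in the same unit as disk_size (equal-sized disks assumed)."""
    if level not in MIN_DISKS:
        raise ValueError("unsupported RAID level")
    if num_disks < MIN_DISKS[level]:
        raise ValueError("not enough disks for this RAID level")
    if level == 0:
        return num_disks * disk_size        # striping only, nothing reserved
    if level == 1:
        return disk_size                    # every disk holds the same copy
    if level == 5:
        return (num_disks - 1) * disk_size  # one disk's worth goes to parity
    if num_disks % 2:
        raise ValueError("RAID 10 needs an even number of disks")
    return (num_disks // 2) * disk_size     # half the disks are mirror copies
```

With four 1000 GB disks this gives 4000 GB for RAID 0, 3000 GB for RAID 5 and 2000 GB for RAID 10, which is the price paid for RAID 10's mirroring.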