I/O (part 2)

In part 2 of I/O we will look at how to spot pain points in your overall storage design. The concepts apply to any storage technology once you understand them. The areas of concern cover every link between the application running in the operating system and the spinning platters inside the disk drive. Here I will speak specifically to iSCSI, as it is becoming increasingly common in storage networks.

Let's begin right at the server where the application or files are presented from. There are some things to tune here, but nothing that will make a significant difference. If using a physical server, ensure the NIC(s) connecting to the storage are 1Gb server-class network cards. Most popular ones these days support some type of TCP offloading, and their drivers tend to be of better quality on the supported OSs. If this machine is virtual, the VM itself will not be performing the iSCSI translation; VMware will be handling this piece. If you find yourself needing to use an iSCSI initiator from within a VM, use a dedicated vmxnet 3 virtual NIC if supported. One way to check whether I/O is the issue is to look in PerfMon or iostat (depending on the OS) for disk queue depth, queue length, or hold (wait) time. These measurements can indicate whether the OS is holding SCSI requests that are waiting to be processed. One potential solution, depending on the root cause, is to enable MPIO, as this can help with performance issues and also provides iSCSI redundancy.
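As a rough illustration of what "queue" and "wait time" mean on a Linux server, here is a minimal sketch that derives both from two samples of a /proc/diskstats line, which holds the same counters iostat reads. The sample values, the DiskSample type, and the function names are my own for illustration, and the field positions assume the standard Linux diskstats layout.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class DiskSample:
    ios: int          # reads + writes completed so far
    io_ms: int        # total ms spent on completed reads + writes
    weighted_ms: int  # weighted ms spent doing I/O (grows with queue length)


def parse_diskstats_line(line: str) -> Tuple[str, DiskSample]:
    """Parse one /proc/diskstats line (assumed standard Linux layout)."""
    f = line.split()
    name = f[2]
    reads, read_ms = int(f[3]), int(f[6])
    writes, write_ms = int(f[7]), int(f[10])
    weighted_ms = int(f[13])
    return name, DiskSample(reads + writes, read_ms + write_ms, weighted_ms)


def queue_and_latency(before: DiskSample, after: DiskSample,
                      interval_ms: float) -> Tuple[float, float]:
    """Return (average queue size, average ms per I/O) over the interval."""
    d_ios = after.ios - before.ios
    avg_queue = (after.weighted_ms - before.weighted_ms) / interval_ms
    avg_latency_ms = (after.io_ms - before.io_ms) / d_ios if d_ios else 0.0
    return avg_queue, avg_latency_ms


# Made-up samples taken one second apart, for illustration only.
_, first = parse_diskstats_line("8 0 sda 1000 0 8000 2000 500 0 4000 1000 0 1500 3000")
_, second = parse_diskstats_line("8 0 sda 1100 0 8800 2600 600 0 4800 2400 3 2000 8000")
avg_queue, avg_latency_ms = queue_and_latency(first, second, interval_ms=1000)
```

On a real server you would read the two lines from /proc/diskstats a known interval apart; in practice running iostat with extended statistics gives you the same numbers without any code.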

Virtual Host
The next link in the chain is normally VMware, XenServer or some other virtualization technology. In a physical environment this can obviously be skipped. In a virtual host environment some of the same rules apply, but keep in mind that you now have many servers using the same iSCSI connections. In a local storage environment you had a direct path between the controller and the disk drive over a 68-pin or SAS cable, and that path was typically capable of more than 1Gb/sec. Now you have many servers sharing perhaps a single 1Gb connection to their respective disks, plus the latencies introduced by the other components. Performance here can be evaluated in a similar way, by checking disk latency and queue. Make sure latency is less than 50ms and queue is less than 50. For some applications, like a SQL database, vendors set a much stricter latency limit of between 2ms and 10ms. Technologies such as MPIO, better network cards, updated drivers and fully patched hosts can help provide the desired performance. Providing dedicated iSCSI interfaces should also be one of the first things considered in a properly designed host.
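The thresholds above are easy to turn into a quick check. This is only a sketch: the function name and the way the numbers get collected are assumptions on my part, and you would feed it whatever your host's performance tools report.

```python
from typing import List


def check_host_io(latency_ms: float, queue: float,
                  latency_limit_ms: float = 50.0,
                  queue_limit: float = 50.0) -> List[str]:
    """Compare measured disk latency and queue against the limits.

    The defaults are the general limits from this post (50ms, queue of 50).
    Pass a stricter latency_limit_ms (for example 10) for a database host.
    """
    warnings = []
    if latency_ms >= latency_limit_ms:
        warnings.append(
            f"latency {latency_ms}ms is not below {latency_limit_ms}ms")
    if queue >= queue_limit:
        warnings.append(f"queue {queue} is not below {queue_limit}")
    return warnings


# A general-purpose host that is within the limits.
general = check_host_io(latency_ms=12, queue=4)

# The same numbers judged against a stricter 10ms database limit.
database = check_host_io(latency_ms=12, queue=4, latency_limit_ms=10)
```

The point of the second call is that a host can look healthy by the general rule and still be too slow for a latency-sensitive application.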

Moving on, the switch infrastructure can also play a significant role in overall performance and is often overlooked. The basic rule is to use a good quality switch with plenty of port buffering. This ensures packets flow through without being blocked because the buffers have filled. A buffering problem can show up as the VM and the host reporting high latency while the SAN shows low overall utilization and no signs of stress. The switch itself may not show high CPU or any other stress, as it may not have a lot of traffic on all ports or its configuration may not include CPU-intensive tasks. To ensure the switch is not asked to perform other functions or pass non-iSCSI traffic, dedicated switches are recommended. In some designs or budgets this may not be possible, so at least make sure the switch you are using is a good quality one. Some examples include the HP 2910al or the Cisco 3750. There are obviously many full Gb switches on the market, even in the sub-$200 range, and they may be fine for lab/test situations. I would caution against using them in a production network, as they may not have enough buffering to maintain a non-blocking state.
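The symptom pattern described above (slow VM, slow host, relaxed SAN) can be written down as a simple triage rule. Treat this as a sketch only: the function name is hypothetical, the 50ms latency limit is the general figure used for hosts in this post, and the 50% SAN utilization cut-off is purely an assumption for illustration. A True result is a hint to look at the switches, not proof.

```python
def suspect_switch(vm_latency_ms: float, host_latency_ms: float,
                   san_utilization_pct: float,
                   latency_limit_ms: float = 50.0,
                   san_busy_pct: float = 50.0) -> bool:
    """Return True when the numbers match the switch-buffering pattern.

    Pattern: both the VM and the host report high latency while the SAN
    reports low utilization. san_busy_pct is an assumed cut-off.
    """
    hosts_slow = (vm_latency_ms >= latency_limit_ms
                  and host_latency_ms >= latency_limit_ms)
    san_relaxed = san_utilization_pct < san_busy_pct
    return hosts_slow and san_relaxed


# Slow VM and host, SAN barely working: worth checking the switch.
check_switch_first = suspect_switch(vm_latency_ms=80, host_latency_ms=75,
                                    san_utilization_pct=20)
```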

Storage is the one area that is not as clear. Because of the amount and diversity of technology these vendors use, you must understand the architecture and hardware involved. Most vendors will have some method to measure CPU, memory utilization (often local cache), disk queue depth and latency. Virtualized systems (like most systems) will generally perform better when RAID 10 or RAID 50 sets are chosen over RAID 5. SAS, SCSI or FC 10Krpm or 15Krpm disks will obviously perform better than SATA or SAS 7.2Krpm disks. Increasing the number of spindles (disks) can also prove beneficial; however, as SAN vendors use different technology, this may or may not help as much as it used to. One consideration here is whether the disk controller can handle many disks in a large RAID set. Recently Intel and others have shown that processors are becoming so fast that software-based RAID can outperform hardware-based RAID sets. Also, as you design your disk system, do not count parity disks (or the equivalent of a disk) in your write I/O calculations, because writing the parity stripe actually increases write time. Read times will improve; however, keep in mind that, especially in virtualized environments, the platters are simply housing blocks that hold more blocks of data. Each time the virtual OS writes a file it changes a block (VMware example) in the .vmdk file, which changes a block on the VMFS partition, which in turn changes a block on whatever filesystem the SAN uses to store data. And in the world of virtualization, that filesystem can itself be virtualized rather than sitting directly on platters. ;-)
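To make the parity rule concrete, here is a minimal sketch of a write I/O estimate that leaves the parity disks out. The function name is my own, and the 150 IOPS per disk figure is an assumed number for illustration, not a measurement. Use your vendor's figures for your actual drives.

```python
def estimate_write_iops(total_disks: int, parity_disks: int,
                        iops_per_disk: float) -> float:
    """Estimate write IOPS from the data disks only.

    parity_disks is the number of disks (or disk equivalents) holding
    parity, for example 1 for a single RAID 5 set.
    """
    if parity_disks < 0 or parity_disks >= total_disks:
        raise ValueError("parity_disks must be between 0 and total_disks - 1")
    data_disks = total_disks - parity_disks
    return data_disks * iops_per_disk


# 8-disk RAID 5 set: count 7 disks for writes, not 8.
raid5_estimate = estimate_write_iops(total_disks=8, parity_disks=1,
                                     iops_per_disk=150)

# 12-disk RAID 50 built from two 6-disk RAID 5 groups: 2 parity equivalents.
raid50_estimate = estimate_write_iops(total_disks=12, parity_disks=2,
                                      iops_per_disk=150)
```

This is deliberately the simple version of the calculation. It only removes the parity disks from the count and does not model the extra time the parity write itself adds.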

