Re: Answer: Re: [FW1] Rainwall-E vs StoneBeat FullCluster
Hi,

Recently members of this list requested more information on the differences between Rainfinity's Rainwall product and Stonesoft's StoneBeat FullCluster. The following document provides an overview of our technology, especially with respect to areas where we feel the FullCluster product is stronger. This document is also available on request as an 87 KB Acrobat PDF with much cleaner formatting.

Stonesoft welcomes comments, review and your own independent comparisons between the two products. We invite independent third parties to perform their own benchmark testing and evaluations, as several companies have already chosen to do. Feedback on this document and a continued discussion of the product are encouraged, and we also invite questions from potential customers at any time. :-)

Evaluation copies, licenses, product documentation, white papers and more are available on our Web site at http://www.stonebeat.com/.

The response:

(c) 2000 Stonesoft, Inc. All rights reserved.

Introduction

In today's world, with the rapid growth of e-commerce, e-mail communication, and Web services, the firewall has become a critical component in day-to-day business activities for companies large and small. Any outage - planned or unplanned - of the firewall will directly impact the organization's productivity, regardless of the duration. Organizations are increasingly recognizing the need for high availability and reliability in their firewalls and network services.

To supplement and enhance firewall software, such as Check Point Software Technologies' FireWall-1, a number of companies offer solutions to provide high availability through software or hardware components. Two products stand out as the most viable competitors in the software high availability solution market - the industry-leading StoneBeat family, and the Rainfinity Rainwall product line.
This paper presents a competitive analysis comparing Stonesoft's StoneBeat FullCluster clustering software with Rainfinity's Rainwall-E, to assist network and firewall administrators in their evaluation of such solutions. Additionally, it addresses recent statements and claims by Rainfinity regarding the StoneBeat clustering technology, correcting misconceptions and inaccuracies.

Overview

StoneBeat FullCluster

The StoneBeat FullCluster product was released in 1999 as an extension of the original StoneBeat high availability product. FullCluster enables the clustering of multiple firewall machines into a single network identity. Since the firewall is a cluster of machines, administrators can perform maintenance on the firewalls during business hours, and the failure of one or more nodes in the cluster due to hardware or software problems will be completely transparent to the users. In addition, the cluster performs dynamic load balancing, which increases the performance of the overall firewall. FullCluster also includes a fully customizable test subsystem, which allows the firewall administrator to set up appropriate responses to just about any problem with the firewall nodes.

Rainfinity Rainwall

Rainfinity offers two versions of its Rainwall product. The basic Rainwall-S product is a software solution that provides hot standby or load sharing configurations for two firewalls. It is most directly comparable in features to Stonesoft's StoneBeat product, Nokia appliances, and Check Point Software Technologies' own HA solution. The second version, Rainwall-E, is more advanced, providing enhanced scalability, load balancing, improved performance, and increased availability and reliability. Rainwall uses RAIN, a clustering technology developed at Caltech for NASA. Rainwall-E is the version most nearly comparable in features to Stonesoft's FullCluster clustering product.
Common Misconceptions

IP and MAC Addressing

The configuration of IP and MAC addresses on the firewall nodes is perhaps one of the areas where Rainwall and FullCluster differ the most. Each product takes an entirely different approach to this aspect of configuration.

In a posting to the Check Point FireWall-1 mailing list on 25 September 2000, Mark Decker from Rainfinity stated, "StoneBeat achieves clustering at Layer 2, by cheating the rules of Ethernet to allow more than one machine to have the same MAC address. Rainwall achieves clustering at Layer 3, by creating a pool of Virtual IP addresses (VIPs) that float dynamically among nodes in the cluster."

This statement suggests that Stonesoft has done something wrong by allowing more than one machine to have the same MAC address, and may seem to be a legitimate claim to those unfamiliar with the product. So let's examine in more detail how FullCluster allows more than one interface to have the same MAC address.

Since 1969, standards for the Internet have been established by a peer-review method of proposals known commonly as RFCs (Requests For Comments). The basic protocols of the Internet - TCP/IP - have been established through these documents. The Internet Engineering Task Force has working groups assigned to work out the details of proposed standards - how protocols will communicate and interact with different devices and with other protocols, how packets will be written, how data will be formatted, and more. Each RFC is assigned a number by which it can be referenced.

With FullCluster, a particular pair of RFCs comes into play to help explain the single MAC address concept. The first, RFC 1112, defines a standard developed in the late 1980s called IP multicast. The second, RFC 2236, defines a complementary protocol, IGMP version 2. Multicast was developed to address issues of limited data capacity at the time.
One potential use was to enable a more efficient method of streaming audio and possibly even video over the Internet. By using multicast, a single data stream can be sent to multiple destination machines, or recipients - far more efficient than sending multiple data streams of the same audio data. Efficiency is also gained by limiting the traffic to just that set of machines that wish to receive the data - in other words, finding the happy medium between the limits of unicast (a single machine only) and the waste of broadcast (everyone, regardless of interest).

Multicast essentially enables the transmission of an IP datagram to a group of zero or more machines identified by a single destination IP address. Hosts can indicate their membership in a multicast group through IGMP (Internet Group Management Protocol), which enables hosts to join or leave a group as needed.

StoneBeat FullCluster uses Ethernet multicast as its means of configuring a single MAC address on more than one physical interface. Because multicast sends the same packet to all interfaces at once, and only to the nodes in the cluster, it enables the most efficient use of that network's data capacity. If communications are required to individual firewall nodes, an additional, dedicated unicast IP and MAC address can be assigned to each interface as well. Retaining the single cluster IP address ensures that other devices view the firewall cluster as a single identity on the network, eliminating the need to configure other devices with respect to the individual firewall nodes.

Rainfinity uses multiple virtual IP addresses (VIPs) to perform the same type of function. Each node in the cluster is assigned a dedicated IP address for each interface. Additional, virtual IP addresses are assigned to each interface as well. The virtual IP addresses are floated between the nodes in the cluster, moving if a node fails or is overloaded.
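As background to the Ethernet multicast addressing mentioned above, the following is a minimal sketch of the mapping RFC 1112 defines from an IP multicast group address to an Ethernet multicast MAC address: the low-order 23 bits of the group address are placed into the block beginning 01:00:5E:00:00:00. The function name `multicast_mac` is hypothetical, and whether FullCluster derives its shared cluster MAC address in this way is not stated in this document; the sketch only illustrates that a group MAC address is a standard Ethernet construct.

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group to its Ethernet MAC per RFC 1112.

    Hypothetical helper for illustration only.
    """
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    # Keep the low-order 23 bits of the group address ...
    low23 = int(addr) & 0x7FFFFF
    # ... and place them into the 01:00:5E:00:00:00 block.
    mac = 0x01005E000000 | low23
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


print(multicast_mac("224.0.0.1"))    # 01:00:5e:00:00:01
print(multicast_mac("239.255.0.1"))  # 01:00:5e:7f:00:01
```

Because only 23 of the 28 variable bits survive, 32 different group addresses share each MAC address; for example, 224.0.0.1 and 224.128.0.1 map to the same one.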
Although Rainfinity eliminates the use of advanced IP concepts such as multicast, this methodology presents problems when designing a complex network topology.

The first problem with a multiple virtual IP approach is that the cluster is no longer transparent to the rest of the network. Routers and devices on the networks shared by the Rainwall cluster must be configured to use these virtual IPs as their default gateways. Many devices do not support multiple default gateways, or do not correctly choose between equal-cost default routes, so they need to be configured to use one of the virtual IPs. If the cluster is expanded and new virtual IPs are added, these other devices must then be reconfigured to use the new VIPs.

The second problem with a multiple virtual IP approach is the consumption of a large number of IP addresses by the cluster itself. The subnet size will place a practical limitation on the number of virtual IPs that can be used and, in turn, on the number of nodes in the cluster. For example, many ISPs will subnet external, Internet address space to their customers with a mask that allows only 32 addresses. This allows the ISP to serve many customers from the very limited amount of IP addresses left for assignment. If you wished to set up a 16-node Rainfinity cluster, with one dedicated address and at least one VIP per node, you would consume all 32 addresses and completely exhaust your address space - leaving no room for the ISP's router, addresses for NAT (Network Address Translation), or other devices on that segment. Since FullCluster supports a single VIP configuration, a 16-node cluster would need only 17 addresses (16 dedicated plus one cluster address), still leaving 15 addresses free for other purposes.

When Rainfinity needs to move a VIP address to another node in the cluster, it faces another issue. Each interface has an ARP table associated with it. ARP (Address Resolution Protocol) is a means of mapping the higher level IP addressing to the particular network topology in use, such as Ethernet.
When machines need to locate an IP on their local network, they send out an ARP request, asking for the corresponding physical (MAC) address on the segment. These entries mapping the MAC address to the IP are then stored in the ARP table, or ARP cache, so that the query doesn't need to be made again. When the VIP is moved by Rainfinity, the systems using that IP need to either a) wait until the ARP cache timeout is reached, the cache is flushed, and a new query can be made, or b) be notified that the ARP entry is stale and needs to be updated.

To accomplish a rapid transfer to the new node, Rainfinity uses a technique known as gratuitous ARP to advertise the new IP-MAC address pair. With load balancing, the gratuitous ARP requests add network traffic on the operative networks and add overhead on each client (each must update its own ARP cache accordingly).

The use of gratuitous ARP poses additional problems. First, not all devices support gratuitous ARP. Some network devices are even configured to ignore it explicitly for security reasons. Since a device supporting gratuitous ARP will update an entry without question, it becomes possible to spoof the device into sending the data to an illegitimate host instead of the intended receiver. The Web site http://packetstorm.securify.com/ can provide additional information about gratuitous ARP issues. Browse to the site and use "arp cache poisoning" as your search expression.

Layer 3 switches often will need their ARP cache disabled to support such a configuration, assuming the hardware manufacturer provides such an option. IP route cache on Cisco devices must also be disabled, which severely degrades the performance of those devices on the network. And under high traffic conditions, the ARP packet may not be heard, or may be lost, in which case each device on that subnet must wait until its ARP cache expires to learn that an IP has moved.
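The stale-cache behaviour described above can be illustrated with a toy model. This is a sketch under stated assumptions, not a description of how Rainwall or any particular device behaves: the `Host` class, the 300-second cache timeout, the device names, and the addresses (taken from the 192.0.2.0/24 documentation range) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Toy neighbour with an ARP cache (hypothetical model)."""
    name: str
    accepts_gratuitous: bool
    timeout: int = 300                          # assumed cache lifetime, seconds
    cache: dict = field(default_factory=dict)   # ip -> (mac, time learned)

    def learn(self, ip, mac, now):
        self.cache[ip] = (mac, now)

    def on_gratuitous_arp(self, ip, mac, now):
        # Devices that ignore gratuitous ARP keep their old entry.
        if self.accepts_gratuitous:
            self.learn(ip, mac, now)

    def resolve(self, ip, now, owners):
        entry = self.cache.get(ip)
        if entry is None or now - entry[1] >= self.timeout:
            # Entry missing or expired: a fresh ARP query reaches the current owner.
            self.learn(ip, owners[ip], now)
        return self.cache[ip][0]


VIP = "192.0.2.10"
MAC_A = "02:00:00:00:00:0a"   # node A (assumed)
MAC_B = "02:00:00:00:00:0b"   # node B (assumed)

owners = {VIP: MAC_A}
router = Host("router", accepts_gratuitous=True)
switch = Host("l3-switch", accepts_gratuitous=False)
for h in (router, switch):
    h.resolve(VIP, 0, owners)              # t=0: both learn node A

owners[VIP] = MAC_B                        # t=10: the VIP moves to node B
for h in (router, switch):
    h.on_gratuitous_arp(VIP, MAC_B, 10)

after_move = {h.name: h.resolve(VIP, 11, owners) for h in (router, switch)}
after_timeout = {h.name: h.resolve(VIP, 300, owners) for h in (router, switch)}
print(after_move)      # the switch still points at node A
print(after_timeout)   # both now point at node B
```

In this model the device that ignores gratuitous ARP keeps sending traffic for the VIP to the old node until its cache entry expires, which is the window the text above is concerned with.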
With FullCluster, the failure of a node in the cluster is transparent to any other device on that network, and requires no special update information to be sent out on the wire.

Additional Hardware Requirements

Rainfinity also argues that StoneBeat FullCluster requires additional hardware. Tricia Walker, a channel sales manager for Rainfinity, noted in an e-mail on June 28, 2000, "Rainwall communicates in-band. Because our overhead is so low, we do not require a dedicated heartbeat line. For HA, StoneBeat needs two dedicated heartbeat lines and two hubs to process. We do not. Because of our "flexible pipe" architecture, as long as the nodes in the cluster are on the same subnet they do not have to be physically located next to each other, or on the same floor, or in the same building, or in the same state for that matter."

There is some truth to this statement, as FullCluster does require a dedicated heartbeat network. However, it is not because of FullCluster's overhead; nor does the heartbeat network impose the physical distance limitation between nodes.

Both products require some form of communication between the nodes in the cluster. Both use Check Point FireWall-1's state synchronization to assist in seamlessly transferring connections between the firewall nodes in the event of a failure. FullCluster uses state synchronization simply to enable the seamless transition of connections for load balancing, whereas Rainfinity absolutely relies on it in order to properly handle the asymmetric routing their product often introduces.

FullCluster requires a dedicated network for inter-node communication for several reasons. First, the use of a dedicated heartbeat network ensures that the communication of the nodes in the cluster will not be affected by outside factors, such as peak network traffic, denial-of-service (DoS) attacks, SYN flood attacks, or other network issues.
The heartbeat is a critical component for any cluster to perform at an optimal level, and to do so correctly. Using a dedicated heartbeat network ensures that this critical communication takes place.

Second, the heartbeat network can be made highly available, with a redundant network link and second hub. The heartbeat network already adds a level of security for the cluster communication, since the dedicated heartbeat LAN can be secured through an effective firewall policy. But an additional, redundant heartbeat network adds other benefits as well. With two dedicated LANs, one can also use the link to improve the security and performance of other firewall communications. Traffic, both log data and policy changes, can be communicated between the management server for the firewalls and the individual nodes on this secure channel. State synchronization tables, which can be quite large with multiple nodes and many connections, can be moved more efficiently and securely on one of the heartbeat LANs as well, without having the synchronization delayed (creating out-of-sync state tables) or impacting other network traffic.

The heartbeat network and its backup can be implemented at little to no additional cost. The dedicated heartbeat LAN can use hubs, and many sites already have surplus hubs on hand after upgrading to switches. For those who don't, the heartbeat can be implemented in a switch on an independent VLAN, or hubs can be purchased for $40.00 to $125.00 USD. Depending on the type of machine and interface cards used, the site will most probably already have the additional interfaces required, and gains the peace of mind of a more efficient and secure network design.

The distance limitation is a factor for any HA solution. For any node to handle an established connection in the event of a node failure, each node must have an interface in the same broadcast domain for each network.
No node of any HA solution can take over for another node that is not connected to the same networks. With modern networking hardware, it is possible to link two remote sites with switches connected via a dedicated link to each other. Check Point presents an example of such a configuration for creating multiple-entry point (MEP) VPNs in their "VPN-1 for the Security Professional" textbook. As long as the administrator designs the network to meet the topology's distance requirements (2 km for fiber optic), and is willing to make the investment in the proper switching components, either product can meet failover requirements for a remote location.

Since the overhead of the software products is related to performance issues, FullCluster's overhead will be discussed as part of the "Performance and Scalability" section, below.

Load Balancing

The achievement of load balancing is another area where the two products differ. FullCluster performs dynamic load balancing of individual connections. As each new connection is established through the cluster IP address, FullCluster's load balancing filter examines the source and destination IP and port. A determination has already been made as to which node in the cluster will handle that connection, based on the current load of the online nodes and their relative capacities. If a node reports an overload condition, connections are moved to other nodes in the cluster. If a node fails, all of the connections being handled by it are redistributed to other online nodes, based on their current load and total relative capacity. Because FullCluster handles the load balancing itself, and because it bases the load balancing functions on individual connections, the end result is a very fine-grained, dynamic load balancing system.

Rainwall uses the transfer of the virtual IP addresses to move load from one node to another. If a node fails, or becomes overloaded, the VIPs assigned to that node are transferred to other online nodes in the Rainwall cluster.
Although it sounds like a similar, dynamic process on the surface, the solution is not as robust. For example, consider a network of 100 hosts on a Class C subnet that uses a four-node Rainwall cluster as its default gateway to the Internet. The network administrator assigns 8 VIPs to the cluster, two per machine. The 100 hosts are split evenly, one eighth (12 or 13 hosts) to each VIP. If node B goes offline, the two VIPs assigned to it are moved to (presumably) two other nodes. That means the traffic of about 12 additional machines is added to each of those two nodes. But the third surviving node in the cluster can never assist in taking its share of the load - there aren't enough VIPs to assign to it, and Rainwall has no other means of adjusting the traffic. The answer is often to assign more VIPs to the cluster at the beginning. But additional VIPs bring their own problems in terms of configuration and added complexity, as outlined in "IP and MAC Addressing", above.

Additional Device Configuration

Mr. Decker also made the claim that, "We add only 4 rules to the firewall and 2 commands to the router. Rainwall requires no installation of NICs or tweaking of MAC addresses, and licensing is not bound to IP addresses. Anyone who has installed StoneBeat before will know why these are issues."

However, the use of multiple VIP addresses suggests a bit more work on the part of the network or firewall administrator to achieve load balancing. Each router or device on the network requires a default gateway address. With Rainfinity, either each machine is configured to use one of the VIPs, requiring reconfiguration whenever a node is added, or routers must be used with routing tables for each VIP. The more VIPs used, the better the load balancing function, but more configuration is required as well. If routers are used, multiple routers with a redundant configuration are required to maintain the high availability of the network.
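The trade-off between the number of VIPs and the granularity of load balancing can be sketched with the four-node, eight-VIP, 100-host example above. This is a toy model, not Rainwall's actual behaviour: the placement policy (each orphaned VIP goes to the surviving node that currently holds the fewest VIPs), the function names, and the 13/12 split of hosts per VIP are assumptions made for illustration.

```python
def node_load(vip_owner, hosts_per_vip):
    """Total number of hosts served by each node."""
    load = {}
    for vip, node in vip_owner.items():
        load[node] = load.get(node, 0) + hosts_per_vip[vip]
    return load


def fail_node(vip_owner, failed):
    """Reassign the failed node's VIPs, one whole VIP at a time.

    Assumed policy: each orphaned VIP goes to the surviving node
    that currently holds the fewest VIPs (ties broken by name).
    """
    survivors = sorted({n for n in vip_owner.values() if n != failed})
    new_owner = dict(vip_owner)
    for vip in sorted(v for v, n in vip_owner.items() if n == failed):
        counts = {n: 0 for n in survivors}
        for n in new_owner.values():
            if n in counts:
                counts[n] += 1
        new_owner[vip] = min(survivors, key=lambda n: (counts[n], n))
    return new_owner


# Four nodes, two VIPs each; 100 hosts split as evenly as whole hosts allow.
vip_owner = {f"vip{i}": node for i, node in zip(range(1, 9), "AABBCCDD")}
hosts_per_vip = {f"vip{i}": 13 if i <= 4 else 12 for i in range(1, 9)}

before = node_load(vip_owner, hosts_per_vip)
after = node_load(fail_node(vip_owner, "B"), hosts_per_vip)
print(before)  # {'A': 26, 'B': 26, 'C': 24, 'D': 24}
print(after)   # {'A': 39, 'C': 37, 'D': 24}
```

Because load can only move in whole-VIP units, node D's share is unchanged after the failure in this model; doubling the number of VIPs would allow a more even split, at the cost of the extra configuration discussed above.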
FullCluster uses one additional rule on the firewalls, enabling communication between the nodes on the heartbeat network. For most cases, this is the only change required to support FullCluster. Some switches and routers may require additional commands to properly support the Ethernet multicast. In many cases, however, IGMP can be enabled on the switch, allowing FullCluster to use IGMP to manage the multicast group memberships.

Manageability

The ability to install, configure and manage a product is one of the main concerns of the end user. FullCluster now provides three separate interfaces for configuring the product after the initial installation is complete. In addition to the original command line utility, which prompted for configuration information, FullCluster provides a Web-based configuration utility. For the Windows NT version, an NT native GUI is also included. Each of the new user interfaces can be used to configure all of the nodes of a cluster from one location, eliminating the need to perform configuration tasks on each node individually.

Since FullCluster handles the load balancing on the firewall nodes, it also easily supports Network Address Translation (NAT), VPNs, and other advanced technologies. Editing configuration files is even rarer with the 2.0 version, as FullCluster now takes advantage of OMI and other technologies available in FireWall-1 to inform the load balancing filter of NAT rules, for example.

Rainfinity requires you to edit several text files by hand on each node in the cluster. Jack Coates, a Rainfinity software engineer, noted that, "The best way to implement NAT is by using a separate (not managed by Rainwall) IP, then adding a set of routes on your router which point that IP to the VIPs." It would seem that Rainfinity recommends additional configuration and the use of an additional network interface to perform network address translation.
Performance and Scalability

One of the most difficult areas to measure and compare between two products is performance. Terms such as "bandwidth" and "throughput" are often used, but rarely understood or clearly defined. The comparison of statistical figures for products, especially ones that implement their core technologies differently, is hard to perform in an unbiased, scientific fashion. Often laboratory tests are not a clear indicator of how the product will behave in production environments, as operating system configuration, network hardware, firewall rules, the use of advanced firewall features such as VPN or NAT, and the size and amount of different types of packets will all affect performance results. It is also clear that any testing performed by Stonesoft or Rainfinity on their respective products can be suspect from the customer's point of view, since each has a vested interest and bias towards their respective solution.

Several points can be made about some of the performance issues raised by Rainfinity, however. Stonesoft has measured the difference in the lab between a system with FullCluster and without FullCluster, and has determined that FullCluster adds 1-4% total overhead on a machine.

Although Rainfinity typically uses the size of the pipe involving the network interfaces of the firewall itself, those interfaces are rarely the bottleneck in a network. The size of the pipe coming into a building, often a T-1 line (1.5 Mbps), is one of the main constraints. Additionally, FireWall-1 ensures that the maximum bandwidth for an interface (200 Mbps for FastEthernet, full-duplex) will only be 93% of its capacity, since the firewall must process the packets. Add VPN encryption, user authentication, and network address translation, and the speed through the firewall decreases as well. Typically the purpose of clustering and load balancing the nodes is to relieve the overhead on the firewall itself.
To address the issue of performance further, Stonesoft welcomes and suggests benchmarking by independent, third party companies which do not have a vested interest in the outcome. We also invite customers to download FullCluster and obtain an evaluation license from http://www.stonebeat.com/ to perform their own evaluation of the technology and performance in their own environments.

Although Rainfinity claims that Rainwall is now "infinitely scalable", this is obviously not practical. There are a finite number of IP addresses, and out of that set only a fraction are assigned and available to any particular subnet. Most ISPs will provide up to 32 addresses from the allocated space for customers to work with. As mentioned before, this limits the customer to a 16-node cluster with one VIP per node, assuming they have no other devices on that network segment.

FullCluster can scale transparently up to 16 nodes per cluster, while using a total of only 1 or 17 IP addresses (it is possible to use one cluster IP with no dedicated IPs for the interfaces). As new nodes are added to the cluster there is no need to configure other devices on the network to enable the new node's participation in load balancing.

OPSEC Certification

Rainfinity has mentioned OPSEC certification as another factor in their favor. On the FireWall-1 mailing list, Mr. Decker pointed out, "Rainwall-E is OPSEC-certified for Load-Balancing including VPN-1; StoneBeat FullCluster isn't."

At the time Mr. Decker made his statement, it was true that FullCluster had not yet achieved OPSEC certification. But just three days later FullCluster for Windows NT was certified. Stonesoft's original StoneBeat product has been re-certified as well. Stonesoft has also been an OPSEC Alliance Partner, and Authorized Training Center, for many years now.
Additional Benefits of StoneBeat

Clustering for Internet Services

With the StoneBeat product family, companies can enable load balancing and clustering technology not only for their firewall, but also for their Web servers and almost every other Internet service they provide. This application-independent solution provides the same familiar, robust, and reliable technology for high availability in many areas critical to e-commerce, MSPs, and ISPs.

In addition to FullCluster for Check Point FireWall-1, FullCluster is available for Gauntlet and Raptor. For companies looking for high availability in all of their network services, StoneBeat products also include WebCluster (supporting Microsoft IIS, iPlanet Enterprise Server, and Apache), DNS Cluster (supporting BIND and Microsoft DNS Server), SecurityCluster (supporting Trend Micro's VirusWall and Finjan's SurfinGate, with more to come as testing continues), and CacheCluster (supporting most proxy servers).

Single Management Interface

For all of the StoneBeat product family, the clusters can be managed with a single, Java-based GUI (graphical user interface) client. This intuitive, easy-to-use GUI presents the critical information on each node in the cluster in an easy-to-understand format. Multiple clusters can be monitored in the same GUI, making it easy for managed service providers (MSPs) to monitor multiple customer sites. The StoneBeat GUI, unlike the Rainfinity interface, allows for different access levels (monitor-only or the ability to control). The clusters can be fully controlled or monitored from a command line interface as well.

Rainfinity also offers a Java-based GUI. However, there is no support for access controls or for multiple clusters or sites. The display of virtual IP addresses, and their purpose and status, is not very clear.
Fully Customizable Test Subsystem

Although both products offer the ability to perform various health checks on the system, StoneBeat FullCluster is the only product that offers an extensive set of internal tests, going beyond network interfaces and the firewall software, to include components of the operating system, such as file systems and CPU utilization. In addition to the robust set of internal tests, FullCluster can be configured to use any executable program or script to perform custom tests. Tests can also be grouped with logical operators, allowing a firewall administrator to create more complicated test conditions. The cluster can send alerts or transition the node to an offline state in the event of a test failure.

Strong, Public Company

Stonesoft Corporation was founded in 1990, offering unique software solutions to complex business problems. Since 1996 the company has offered high availability solutions for FireWall-1, and expanded that offering with FullCluster and the StoneBeat product family in 1999. In addition to providing high availability solutions, the company's Networks division offers consulting, support and sales of products from strategic relationships with WebTrends, RSA Security, Trend Micro, and others.

This successful Finnish company is publicly traded on the Helsinki Stock Exchange, and has seen annual sales growth of 60-70% over the last four years. The company is headquartered in Helsinki, Finland, with offices in the US, Brazil, Singapore, Japan, Spain, Germany, Sweden, England and France. The US headquarters is in Atlanta, Georgia.

Stonesoft's customers include Fortune 500 companies, financial institutions, ISPs, MSPs, telecommunications firms, government agencies, and others. Customers include: NASA, Bank of America, and Merrill-Lynch. StoneBeat products have been sold to over 4,000 customers worldwide, making it the industry leader for high availability clustering solutions.
[This document is based on comparative information between FullCluster 2.0 and Rainwall-E 1.3, which is the latest version available from Rainfinity's Web site for evaluation.]

For more information about multicast, see: Hall, Eric. Internet Core Protocols, chapter 4. Sebastopol, CA: O'Reilly and Associates, 2000.