Bring us your use cases, and let us know if you would like to participate in an upcoming beta!
Recently I joined the Likewise Software team as vice president of Business Development. One of the key reasons I joined the company was the increasing role I see Likewise playing in platform interoperability. Take for instance one project–initially introduced a little over a year ago and more recently updated: Likewise-CIFS client-side and server-side SMB/CIFS support, which provides Microsoft Windows clients access to folders and files on Linux, Unix, and Mac computers.
We are seeing a broad array of use cases for the Likewise-CIFS server. These include a Likewise-CIFS server virtual appliance, Likewise-CIFS as a NAS “head” on top of existing SAN devices, Likewise-CIFS as a front-end for cloud-based services such as Amazon S3, as well as supporting vendors (including HP and Data Domain) looking to license our CIFS server capabilities as OEMs for a variety of other use cases.
Here are more details on a couple of use cases we’ve seen to date:
Do you have other potential use cases to consider beyond what we’ve detailed here? We’d love to hear from you.
Professionally speaking, the past sixteen months have been, for me, some of the most exciting at Likewise that I can remember. Today was the culmination of the combined effort of our entire engineering team as we announced that Likewise-CIFS is the integrated SMB/CIFS solution for Windows client support on some Hewlett-Packard StorageWorks products.
What makes this announcement special to me personally is not the fact that a major vendor has chosen Likewise; we’ve already established ourselves in the AD authentication bridge space with companies such as Isilon, DataDomain/EMC, VMware, and Citrix. What makes the HP announcement particularly meaningful to me is my personal involvement in the HP partnership. Over the past several months I’ve gone through all the daily status meetings, bug triages, late night debugging sessions, and the general things that go with enterprise scale software development. Having worked at HP prior to coming to Likewise in 2005, I’m glad to see company endeavors succeed, particularly now that I have new friends amongst HP engineers.
Milestones like this announcement are good times to review where we’ve been and where we plan to move towards in the future. In January 2009, Likewise began an initiative that would become Likewise-CIFS, the SMB/CIFS file server component of the Likewise Open project. Even though I’ve worked on another SMB server in the past, Likewise-CIFS was really a brand new start. The server’s multi-threaded architecture and modular components allowed us to parallelize much of the initial work, which was extremely important because the entire file server was being written from scratch.
To fully appreciate the difference between where we started in January 2009 and where we are now in April 2010, it’s best to examine the heart of Likewise Open: the code itself. A quick glance at the repository from git://git.likewiseopen.org/likewise-open shows around 6800 commits. That means that over the 337 working days (discounting weekends but including holidays) in the last 16 months the project has averaged about 20 commits per day. Of course, commits in and of themselves do not necessarily equate to improvement. How many of the 6.8K commits added new code and new value? Looking strictly at new components, we can conservatively say that more than 360,000 lines of new, handwritten C code have been added. That doesn’t even include the 130,000 lines of C# code in the Likewise Management Console that was made available under the LGPL last year.
If you aren’t a programmer, these numbers probably contain little meaning. In that case, let’s talk about features. In the past year, the following is a list of some of the new things that have been added to Likewise-CIFS:
All this describes the road we have traveled thus far. What about our future plans? Will the next twelve months be as exciting as the past sixteen? I believe so, and here’s why. There are several of what I call “point” features still remaining for the file server. Things like Distributed File System (DFS) support and consolidation roots, Access Based Enumeration (ABE), Shadow Copies, and Alternate Data Streams (ADS) are isolated, individual features with a high degree of end user visibility.
But these are really just enhancements. What broad initiatives do we have in play for the coming year that would match the scale of writing a new SMB file server from scratch? We have several ideas already in discussion which I hope to be able to share in the coming months. Suffice it to say that our path forward is to continue to build upon the Likewise Open platform base that we’ve put into play. The way forward is up: to build upon the foundation already laid.
I’ve spent some time recently looking at the Windows Presentation Foundation (WPF). WPF is part of Vista, part of .NET 3.0 and part of Silverlight.
At some level, I’m disappointed with WPF. After hearing so much about it (but ignoring it) during the last few years, I expected it to be a radical new way of writing graphical user interfaces. Instead, it seems like a slightly different way of developing Winforms applications.
With .NET 2.0, you use Visual Studio to design your Windows “forms”. Visual Studio automatically generates code for you that creates all of the visual elements (windows, buttons, list boxes, etc.) at run time. When your form’s constructor is called, it calls InitializeComponent() and the generated code does the rest. The Visual Studio forms editor also lets you easily attach code to different events raised by the visual components.
With WPF and .NET 3.0, you use the Expression Blend tool to design your user interface. As with the Visual Studio forms editor, Expression Blend lets you easily attach code to handle UI events. The output of Expression Blend, however, is not code; it’s XML (specifically, XAML). When your code’s constructor is called, InitializeComponent() is again called but, this time, the function works by loading the XML and interpreting it (creating forms, buttons, list boxes, etc.) rather than by executing generated code.
At this level, the only difference between WPF and .NET 2.0 Winforms, and the only advantage of WPF, is the use of XML rather than generated code. Mind you, this can be a significant advantage. By managing the UI specification as data separate from code, WPF facilitates the use of skilled graphical designers to develop user interfaces. Designers can use Expression Blend to fine-tune the UI without worrying about unintended changes to program code.
After looking at WPF further, however, I realized that it is more significant than it appears at first blush. The WPF designers have completely reimplemented the basic Windows UI elements (and more) in a much more cohesive, sensible fashion. The net result (no pun intended) is very cool.
For 20 years now Windows programmers have been suffering the limitations of the original Windows 1.0 design from 1985. Windows 1.0 defined a basic set of UI controls: window, menu, list box, static control, text control, push button, radio button and group box (I think that’s all of them!). These controls were implemented by Windows itself and could be composited by programmers in their own applications. Additionally, programmers could subclass these controls to alter their behavior or to implement their own user-defined controls.
Subsequent versions of Windows introduced new controls. Somewhere along the line, combo boxes, context menus, rich text controls, progress bars and other controls were added. The concept of a small set of built-in controls with narrowly prescribed behavior persisted however. You could do some things like image-based pushbuttons or scrolling lists of images by taking advantage of owner draw features but the amount of customization available with the built-in controls was minimal.
.NET 1.1 and 2.0 added new controls, too, including DataGrid and DataGridView that had no built-in counterparts. These controls, however, resembled the built-in ones in how they could be used and customized.
With WPF, the original Windows UI elements are totally subsumed by the new WPF UI model. It is possible to use WPF to write what looks like a traditional Windows application, but it is also possible to write applications with much more sophisticated user interfaces.
WPF has a very clean notion of containment and transformation. Let me explain what I mean by these. Consider a traditional Windows 1.0 List control. It contains a list of strings and can present these strings in a vertical list, providing scrollbars if they are needed to view all the list contents. In WPF, the ListBox control is a container that will provide a scrolling list of whatever it contains. What can it contain? Anything! Well, any WPF UI element. If you put static text boxes in a WPF list, it’s a lot like a Windows 1.0 list. But if you want, you can put editable text boxes or tree views in a WPF ListBox and it will do the right thing with them. There are several container controls in WPF and all of them support this functionality.
Similarly, WPF provides a consistent mechanism for visual transformation. In graphics (and, don’t forget, WPF has full support for 2D and 3D graphics) “transformation” refers to mathematical manipulations to modify the appearance of what is being displayed. There are translation, scaling and rotation transformations that can move, size and rotate graphical data. WPF supports these transformations, too. If you surround a text box with a 90 degree rotation transformation, the text box will appear (and function) vertically instead of horizontally. Transformations can apply to entire graphical elements (for example, our previous ListBox) or to contained elements (we could have one tree view rotated within our list of tree views).
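To make the arithmetic behind translation, scaling and rotation concrete, here is a minimal sketch in C. It is not WPF code: the Point and Transform types and the function names are made up for illustration, and only the special case of a 90 degree rotation is shown so that no trigonometry is needed.

```c
/* A 2D point and a 2D affine transformation:
 *   x' = a*x + b*y + tx
 *   y' = c*x + d*y + ty
 * These names are illustrative only; they are not WPF types. */
typedef struct { double x, y; } Point;
typedef struct { double a, b, c, d, tx, ty; } Transform;

/* Translation: move by (dx, dy). */
Transform translation(double dx, double dy)
{
    Transform t = { 1.0, 0.0, 0.0, 1.0, dx, dy };
    return t;
}

/* Scaling: stretch by sx horizontally and sy vertically. */
Transform scaling(double sx, double sy)
{
    Transform t = { sx, 0.0, 0.0, sy, 0.0, 0.0 };
    return t;
}

/* Rotation by 90 degrees about the origin: (x, y) -> (-y, x).
 * cos(90 deg) = 0 and sin(90 deg) = 1, so the matrix entries are exact. */
Transform quarter_turn(void)
{
    Transform t = { 0.0, -1.0, 1.0, 0.0, 0.0, 0.0 };
    return t;
}

/* Apply a transformation to a single point. */
Point apply(Transform t, Point p)
{
    Point q = { t.a * p.x + t.b * p.y + t.tx,
                t.c * p.x + t.d * p.y + t.ty };
    return q;
}
```

Applying quarter_turn() to the point (1, 0) yields (0, 1). With the y axis pointing up that is a counter-clockwise turn; on a screen where y grows downward, the same arithmetic looks clockwise. A UI framework applies this kind of mapping to every point of an element, which is why a rotated text box can still be drawn and hit-tested correctly.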
Beyond the generalized concepts of containment and transformation, WPF also adds support for animation including keyframe animation. With keyframe animation, Expression Blend lets you specify the visual characteristics of a UI at two (or more) points in time and the WPF run-time code will take care of gradually transforming the UI for the intervening points. You can, for example, place an image at one (x,y) coordinate to start and at another (x,y) coordinate 10 seconds later. The WPF run-time code will then gradually move the image from the initial to its final location over the course of 10 seconds. Key frame animation can be applied to scaling and rotation transformations as well as to other visual effects (transparency, for example).
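The idea of filling in the intervening points can be sketched in a few lines of C. This assumes plain linear interpolation between two keyframes; an animation run-time may well offer other interpolation styles, and the Keyframe type and interpolate() function below are hypothetical names, not part of WPF.

```c
/* A keyframe: where the image should be at a given time (in seconds).
 * Illustrative type, not a WPF class. */
typedef struct { double time; double x, y; } Keyframe;

/* Compute the position at time t by linear interpolation between two
 * keyframes. Times before the first keyframe or after the second one
 * are clamped to the corresponding end position. */
Keyframe interpolate(Keyframe from, Keyframe to, double t)
{
    double span = to.time - from.time;
    double u;                       /* progress, from 0.0 to 1.0 */
    Keyframe out;

    if (span <= 0.0 || t >= to.time)
        u = 1.0;
    else if (t <= from.time)
        u = 0.0;
    else
        u = (t - from.time) / span;

    out.time = t;
    out.x = from.x + u * (to.x - from.x);
    out.y = from.y + u * (to.y - from.y);
    return out;
}
```

With a start keyframe of (0, 0) at time 0 and an end keyframe of (100, 50) at time 10, asking for time 5 gives (50, 25), halfway along the path. The run-time simply evaluates something like this on every frame.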
So far, I’ve mostly read about WPF. I want to write some non-trivial software to put it through its paces. From the design perspective, I really like it. I also like the relationship between stand-alone WPF applications and Silverlight (browser-based) applications. I’ll post again on the topic when I have more to say.
I had the opportunity to spend a few hours at OSCON yesterday in Portland, Oregon. OSCON is the Open Source Convention held by O’Reilly. I was pleasantly surprised by the size of the conference, the number of exhibitors and the presence of several large companies. Open source software has definitely become mainstream and accepted by industry.
At Likewise, we consider ourselves an open source company. Likewise Open has been very successful and has opened many doors for us (no pun intended). It’s helped us tremendously, even when we end up selling our Enterprise version instead. Nevertheless, I have some observations about open source, not all of them positive.
There are several definite advantages to using open source. It enables you to build a solution without having to reengineer every component. We make use of both MIT Kerberos and OpenLDAP in our products. If we had needed to rewrite these components, it would have taken us much longer to get to market. We’ve also made use of Samba components. Samba has been around a long time, has had “a lot of eyes” on it and has figured out the subtleties of talking to Microsoft systems. Again, using open source saved us a lot of time.
There are some disadvantages to open source, too. It can be difficult to get the “owners” of an open source project to do what you think is the right thing. Although open source is “open”, certain projects are led by designated groups of people. Different projects have different guidelines around software submission and how they go about accepting external contributions. Very often, your contributions have to be vetted before they’re accepted in the main code. If your code is not accepted, your only option is to distribute your own modified version of the open source project (your branch). Branching is not a good thing.
Sometimes, code changes are rejected due to style considerations or differences in design approaches. These are objections that can be dealt with relatively easily. More difficult are rejections due to “dogma”. Some open source projects, for example, are irrationally opposed to anything that they perceive as helping Microsoft. Even though our intent is to make non-Windows systems work better, they still oppose our goal of making these systems work better with Microsoft Active Directory. This, of course, doesn’t apply to the Samba project (who had the goal before we did) but applies to other open source projects/companies/teams with which we’ve had to deal.
There is little we can do in these cases other than to develop our own alternatives.
Another issue which we’ve encountered with some open source software is a certain lack of industrial rigor. I’ve worked a lot with both commercial software developers (I spent 11 years at Microsoft) and with academic programmers (4 years at Microsoft Research). Sometimes, open source software resembles the latter more than the former.
What do I mean by “academic” programmers? Say that you’re in school, you take a programming course and you’re asked to write a program that converts degrees from Celsius to Fahrenheit. You write something like:
void main(int c, char **argv)
{
int degrees = atoi(argv[1]);
printf("%d Celsius is %g Farhenheit\n", degrees, (degrees * 9.0)/5.0 + 32);
}
Your professor would probably give you a passing grade for this. It works. In industry, however, your boss would likely complain about several things:
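As one illustration of what a more defensive version could look like, here is a sketch. The specific checks (argument count, input validation, meaningful exit codes, error messages on stderr) are assumptions about the kind of thing a reviewer would flag, and run(), parse_celsius() and celsius_to_fahrenheit() are names made up for the example.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse text as a number. Returns 0 on success, -1 if the text is
 * missing, not entirely numeric, or out of range for a double. */
int parse_celsius(const char *text, double *out)
{
    char *end;
    double value;

    if (text == NULL || out == NULL || *text == '\0')
        return -1;

    errno = 0;
    value = strtod(text, &end);
    if (errno == ERANGE || end == text || *end != '\0')
        return -1;

    *out = value;
    return 0;
}

/* The actual conversion, kept separate so it can be tested on its own. */
double celsius_to_fahrenheit(double celsius)
{
    return celsius * 9.0 / 5.0 + 32.0;
}

/* Body of the program. A real main() would simply be:
 *     int main(int argc, char **argv) { return run(argc, argv); }  */
int run(int argc, char **argv)
{
    double celsius;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <degrees-celsius>\n",
                argc > 0 ? argv[0] : "c2f");
        return EXIT_FAILURE;
    }
    if (parse_celsius(argv[1], &celsius) != 0) {
        fprintf(stderr, "error: '%s' is not a valid number\n", argv[1]);
        return EXIT_FAILURE;
    }

    printf("%g Celsius is %g Fahrenheit\n",
           celsius, celsius_to_fahrenheit(celsius));
    return EXIT_SUCCESS;
}
```

The conversion itself is unchanged. What differs is everything around it: the program no longer reads argv[1] without checking that it exists, it rejects input such as "abc" instead of silently treating it as zero, and it reports success or failure to whatever invoked it.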
Open source software is not always industrial quality code. We have found many cases of memory corruption and leakage even in mature open source projects. We have also found and fixed many, many, bugs.
Note that the title of this post does not suggest that proprietary software is immune from similar flaws. Many proprietary software companies (including my ex-employers) are guilty of releasing software that is not ready for prime time. “Good Software” can be either open source or proprietary. Similarly, “Bad Software” does not care about its licensing model.
What I will suggest, however, is that companies that have to support their products, keep customers happy and, ultimately, make money are much more motivated to develop Good Software than organizations which develop software but don’t actually have to deal with the consequences of poor code. There is no stronger motivator to write Good Software than an irate customer.
We are heavy users of virtualization at Likewise Software. Since we develop software for over 100 different platforms (multiple flavors of UNIX, Linux and Mac OS X), we have to be able to boot up a Red Hat 2.1 machine one minute and an OpenSolaris machine the next. Developers and testers both need access to a wide variety of machines on a regular basis. Without virtualization, it would either be very expensive (we’d need hundreds of machines) or very slow (we’d have to re-image machines all the time) to do our work.
We also use virtualization outside of development/test. Over time, we’ve tended to collect an assortment of servers running project management tools, bug databases, internal wikis, HR, financial and other applications. A few months ago, our IT folk examined all the servers in our inventory and migrated many of them to virtual machines.
Unquestionably, virtualization can bring about good things — reduced administrative costs, increased flexibility, reduced energy use, etc. Virtualization doesn’t always make sense, however.
Occasionally, I have a conversation with someone who’s basically saying something like “Virtualization is terrible! I moved my database server and my risk management grid onto VMs and now they run at half the speed they used to!”. Yes, I do want to whack them upside the head when they say this.
Obviously, if you have a CPU-intensive, heavily threaded, application running on a physical server it’s going to slow down if you put it on a virtualized server along with other CPU-intensive applications. If you wouldn’t run these two apps on the same physical server, certainly, don’t run them on two VMs on a single physical server. VM hypervisors can run multiple virtualized machines effectively and with little degradation in performance, but only to the extent that the virtualized systems are amenable to this. If the VMs are running applications that are not heavily threaded and do not heavily tax their CPU and I/O systems then the VM hypervisor can exploit multiple cores and spare CPU cycles to provide acceptable performance.
There are some “textbook” examples of applications/systems that are ideal for virtualization. Web farms, for example, can deploy web sites in their own VMs and give you complete control of a virtualized server. You can muck with system configuration to your heart’s content without worrying about other web sites that might be deployed on the same physical server. Web farms can also quickly duplicate VMs allowing them to provide additional load-balanced capacity on an on-demand basis.
Beyond the textbook examples, here are some others to consider.
Infrequently run applications are great candidates for virtualization. Consider financial apps that might only be run at quarter- or year-end. Rather than dedicating a machine to these applications that sits idle 95% of the time, these applications can be deployed on virtual systems that are suspended until needed. This approach is ideal for sensitive applications such as financial and reporting systems. It is best to not run these applications on shared hardware. If there are other applications on the same computer this increases the likelihood of intentional or unintentional access to secure data. With virtualization, physical systems don’t have to be “wasted” on infrequently used sensitive applications. Note, too, that by suspending sensitive VMs while they’re not in use you’re reducing the attack surface for hackers.
Another great use of virtualization is for old, legacy, systems. If you’re running old versions of Windows NT or SUSE Linux or Solaris x86 and don’t want to update them (why fix something that’s not broken?) why not move these systems to VMs? In all likelihood, these systems are running on flaky outdated (perhaps unsupported) hardware. It’s possible that they’ll run faster on VMs than on old metal.
Demo systems are ideal candidates for virtualization. The systems receive a lot of “wear and tear”: they’re frequently polluted with sample data and often left in weird states. Moving these to VMs allows you to use VM snapshots to quickly restore them to a recognizable state.
Finally, one of my favorite uses for VMs is as security honeypots. Create a VM (especially a Windows VM) and give it a suggestive name, perhaps, payroll or HR. Create some directories and files in it, again, with suggestive names. Now, turn on all the auditing features available in the OS. Protect this system as you would any other secure server in your network (but don’t use the same administrative passwords!). If possible, isolate this VM from your other systems. Put it on its own subnet and disallow routing to other systems, for example. If you have an intrusion detection system, make sure it monitors this VM. There should be no access to this computer (other than by you, to assure its health). If your IDS or audit logs signal that someone is trying to access the system, you know you’re under attack.
Virtualization has been around for 30+ years. I used VM/370 in college in 1977. It offers many benefits that, thanks to VMware, Xen and others, are now available to any computer user. At the end of the day, however, virtualization is simply multitasking with really, really good application isolation. Rather than multitasking applications that call a single operating system instance, hypervisors multitask entire operating system instances. The rest of the gory details (how they virtualize hardware, where drivers live, etc.) are just that: details.
A couple of weeks ago, I wrote about usability testing. Mostly, I talked about testing methodology and the value that it brings. We’ve now finished 8 sessions and I thought it would be good to revisit the subject.
Once again, I’m amazed by how much value usability testing can provide. We’ve been testing our Likewise Open evaluation and download process with the goal of increasing the number of people who successfully install the software. This process begins with a user arriving at our web site and ends with that user performing a successful “join” operation to connect his/her non-Windows computer to Microsoft Active Directory. When we first decided to test the process, I thought we’d not learn much. What could be simpler than clicking on a download link and running an installer program? For the 1,498,753rd time in my life, I was wrong.
The first thing that we learned is that our home page is not our Home page.
Although we’ve tried several web analytics packages, we’ve lately become enamored with Google Analytics. The free version is relatively capable and sufficient for our current needs. A few weeks of analysis with the tool told us that more customers are coming to our Likewise Open community page than to our corporate home page. Looking at the analytics report, it was obvious why: our partners are driving traffic to our site and they’re linking to the Likewise Open page instead of the corporate home page.
This makes sense. When our Linux partners want to reference Likewise, they want to get their customers as close to their final destination as possible. They don’t want to link to a high-level page with a lot of sales-oriented material. By linking to our Likewise Open community page, they are taking users to a page that’s very relevant and only a couple clicks away from a download.
When we realized that our Likewise Open page was our effective home page, we realized we needed to improve it. We knew that it would be a bad idea to make it too sales-oriented but we also knew that it had several shortcomings. This was borne out in usability testing and quickly corrected.
The second thing we learned is that clicking on a link is non-trivial.
Our download page has a big table with many different rows for different operating systems (Linux, Solaris, etc.), different CPUs (i386, SPARC, Itanium, etc.), different CPU modes (32/64 bit) and different packaging forms (RPM, DEB, etc.). The user has to find the right row and click on a download link. Simple no? No.
If you don’t set the MIME types properly for the files behind your web page links, Firefox can make a mess of things. We had many complaints from users who would get a screenful of binary stuff instead of a downloaded file.
The next thing we learned is that it’s possible to be too smart.
Linux and UNIX folk are used to painful install processes. They’ll download packages and then have to use some type of package manager (rpm, dpkg, etc.) to install them. We decided to make life easier for customers by giving them a nice, executable installer program. In the case of Linux, Mac and other operating systems that are likely to have a GUI present, we make use of a Bitrock-based setup program. Download the software, run it. Simple no? No.
When Firefox downloads a program to Linux, it doesn’t retain its executable file mode. Before you can run it, you have to chmod +x it. If users didn’t read our 100 page Installation and Administration Guide they might not realize this. In fact, as usability testing pointed out, they might try to do other weird stuff.
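For the curious, what chmod does under the hood is small. The sketch below uses the POSIX stat() and chmod() calls to add the owner's execute bit, roughly what "chmod u+x" does from the shell; make_executable() is a made-up helper name, not something we ship.

```c
#include <sys/stat.h>

/* Add execute permission for the file's owner, keeping all other
 * permission bits as they are. Returns 0 on success and -1 on failure
 * (for example, if the file does not exist). */
int make_executable(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;

    return chmod(path, st.st_mode | S_IXUSR);
}
```

The point is that the downloaded bytes are fine; only the file mode is missing, and one call restores it.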
Our setup programs are typically called something like LikewiseOpen-4.1.0.2921-linux-i386-rpm-installer. Long, yes, but it tells you everything you need to know: product name, version, operating system, architecture and packaging format. Note that we include “rpm” (or “deb” or other) in the name. Some Linux folk would fail to realize that the installer was an executable, would see the “rpm” in the name and think, “maybe I’ve got to install this thing with the rpm program.” Wrong.
The last thing that we learned is that nobody reads anything. No documentation, for sure. They don’t spend much time reading screen output, either.
After users install the software, they need to run our domain-join utility. We tell them this at the very end of the installer program. Alas, as usability testing showed us, many users decide to just ignore that information and hit Enter repeatedly without reading anything. Right after they dismiss the last dialog they realize that they just missed something important.
We’ve made numerous changes to the Likewise Open pages as a result of our testing. Most of them have been simple to accomplish: more prominent links; a short, task-specific document; a short video; corrected MIME types. We’ll do a new round of usability testing now to verify the results of our changes. I’m confident that we’ll see improvement but I’m much less confident now that we won’t find a different set of problems to address. The main lesson of usability testing is that your software UI is never as good as you think it is.
After re-reading my last post, I realize that some of you might have no clue what I’m talking about when I mention network attached storage (NAS). To use an oxymoron, this post is a follow-up primer.
The idea with a NAS is to centralize storage across multiple machines in a network. Instead of having to maintain numerous independent disk drives on the individual machines in a network, you place all key files in a central location and worry only about managing the NAS. This concept is frequently used with server computers but can also be used with workstations. Microsoft Active Directory, for example, supports the concept of a roaming profile that allows your personal files to be stored in one consistent place regardless of what computer you log in to. UNIX and kin can do something similar with automounts.
There are actually two main mechanisms for implementing centralized storage.
The storage area network (SAN) approach differs from that used by NAS. A SAN storage appliance provides low-level storage “blocks” to the computers connected to it. The SAN device has no concept of a “file”, only of an assortment of storage blocks assigned to a particular computer. SANs are frequently accessed over a separate, high-speed Fibre Channel network but can also be accessed over Ethernet using iSCSI and other protocols.
A NAS device, on the other hand, provides file-level operations. The device implements the SMB/CIFS protocol and/or the NFS protocol in order to provide file-oriented services to Windows or UNIXy computers (respectively).
If you have used a traditional Netware or Windows-based file server you have used a NAS device. There are much cooler devices now, however. Isilon, for example, makes very clever clustered storage NAS devices that allow multiple NAS nodes to replicate data in a fashion that provides redundancy and high-availability at much lower cost than SANs and many other NAS devices.
The Linksys NAS 200 device that I talked about in the last post is a dirt-cheap home NAS device. It is not particularly fast nor does it offer much sophisticated functionality. Its security model, for example, is very crude. I run a Windows domain controller at home but the NAS 200 does not integrate with AD-based security. To avoid authentication hassles, I simply allow the guest (any user) to have read/write access to all the shared folders. Fine for home (where things are protected with a perimeter firewall and with secure wireless access points) but not fine for a more public network.
I installed the Linksys appliance in order to provide a backup destination for the 6 computers that we have strewn throughout the house. Using the appliance means that I don’t have to dedicate a general-purpose computer to this task. Additionally, Linksys has figured out how to set up RAID and how to automatically perform various recovery operations all using a simple Web interface. It would have been much more complicated for me to figure this out myself.
The one last piece of the backup puzzle that I’d like to implement would be to add some form of offsite storage. Ideally, the NAS 200 would, itself, back up files to some Web-based storage provider. Since it doesn’t, I might have to implement this myself with some type of periodic job that detects new files on the NAS and copies them to a service during off hours.
In between eating hot dogs and blowing up fireworks this weekend, I worked on a couple of home IT projects that I’d been planning for a while. My goals were straightforward. First, I wanted to implement a more robust backup solution. Second, I wanted to get rid of a Fedora Core 5 server and replace it with something newer. The two projects were related since the FC5 server was being used solely as a Samba file server to host my backup drives. Here’s what I ended up doing to accomplish both tasks.
I didn’t like the dedicated Fedora server for two reasons. First, I was stuck using a computer for a very narrow purpose. I have only two computers in my “server room” (my den) and my other one is my AD domain controller. I’ve been installing lots of Windows application software on the AD machine because I can’t run it on Linux. Installing random software on a DC is not a good idea. The second reason I wanted to get rid of the Fedora machine is because I wanted to run a more current distro. I worried about replacing FC5 with Ubuntu, however, because FC5 uses a funky logical volume manager. If the change failed, I might have to scramble to recover my data.
To get rid of the FC5 file server, I spent $150 on a Linksys NAS 200 device and $200 on two 500GB SATA hard disks. The Linksys device is, essentially, a cheap Linux box with an Ethernet port and two SATA drive bays. It can be configured to use the drives separately or in a RAID 0 (striping) or RAID 1 (mirroring) array. I chose the latter configuration, giving me 500GB of storage but with the security of knowing that I can lose a drive and still have my data.
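The capacity tradeoff between the two array types is simple arithmetic, sketched below in C. It assumes equal-sized disks and covers only RAID 0 and RAID 1; the type and function names are invented for the example.

```c
/* The two array types discussed here. Illustrative names only. */
typedef enum { RAID0_STRIPE, RAID1_MIRROR } RaidLevel;

/* Usable capacity, in GB, of an array built from `count` disks of
 * `disk_gb` each. Striping adds the disks together; mirroring stores
 * the same data on every disk, so only one disk's worth is usable. */
unsigned usable_gb(RaidLevel level, unsigned count, unsigned disk_gb)
{
    if (count == 0)
        return 0;
    return level == RAID0_STRIPE ? count * disk_gb : disk_gb;
}

/* How many drives can fail before data is lost. A stripe set loses
 * everything with the first failure; a mirror survives as long as
 * one copy remains. */
unsigned tolerated_failures(RaidLevel level, unsigned count)
{
    if (count == 0)
        return 0;
    return level == RAID0_STRIPE ? 0 : count - 1;
}
```

For two 500GB disks, RAID 0 gives 1000GB with no tolerance for a failed drive, while RAID 1 gives 500GB and survives the loss of one drive.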
The Linksys NAS 200 was pretty easy to install. I made one mistake, which was to start using it (copying over 100GB to it) before realizing that it was running very slowly. A look at the Linksys web site showed that there was a firmware update that allowed the use of a non-journaled file system. Without journaling, the Linksys device is much faster but will have to perform a “scandisk” (fsck) if it detects any disk errors. Installing the firmware upgrade and switching to the non-journaled file system required reformatting the disks and re-copying the 100GB.
With the NAS in place, I was able to go to my FC5 computer and copy over all the old backups. I then changed the key computers in the house to use the NAS instead of the FC5 machine for backups. Along the way, I also stopped using Windows backup software and started using NTI Shadow (dumb, cheap) instead.
Now that my FC5 computer was out of a job, I could repurpose it. I increased its RAM to 2GB, deleted its Linux file system partitions and installed Windows XP on it. Deleting the partitions was necessary as, with them, Windows XP would get confused during installation.
The first thing I did after installing XP (well, the second, after waiting for SP2 and a million other updates to install) was to install VMware Workstation. VMware Server is free, but the Workstation version allows for multiple “snapshots” which I find very useful.
With VMware installed, the first VM I created was an Ubuntu 8.04 Linux VM.
What’s the point of replacing a Linux machine with a Windows machine running Linux in a VM? Two things: first, I can run Windows software in the host operating system. Actually, I will probably create a Windows XP VM and run the Windows software in the VM instead of on the host OS. Second, if I get tired of one Linux distribution, I can always create another VM with a different one.
With VMware, I can keep my host OS in pristine condition. I won’t install any application software there. If any problem occurs in a guest VM, I can always use the VMware snapshot features to “undo” it. Worst case, I can blow away a VM and recreate it. What about data? Here’s the key: don’t keep your important data on virtual disks. Use virtual machines, but keep your data on real drives or on a NAS device. Windows file shares or the Linux mount.cifs command can help with this (if you keep all your data on Windows file servers; if you want, you can use NFS and store data on UNIX file servers instead). Use virtual disks only to store operating system files.
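As a rough illustration of the CIFS option, here is a small Python sketch that assembles the mount command a Linux guest might run to attach a share from a NAS or Windows file server. The server name, share name, mount point and credentials file path are all hypothetical placeholders, and the script only builds and prints the command rather than running it (mounting normally requires root):

```python
import shlex


def build_cifs_mount_command(server, share, mount_point,
                             credentials_file, read_only=False):
    """Build (but do not run) a 'mount -t cifs' command line.

    credentials_file is a file holding the username and password, so
    the password does not appear on the command line itself.
    """
    options = [f"credentials={credentials_file}"]
    if read_only:
        options.append("ro")
    return [
        "mount", "-t", "cifs",
        f"//{server}/{share}",   # UNC-style path to the share
        mount_point,
        "-o", ",".join(options),
    ]


if __name__ == "__main__":
    # All of these values are made-up examples.
    cmd = build_cifs_mount_command(
        "nas200", "backups", "/mnt/backups", "/etc/cifs-credentials")
    print(shlex.join(cmd))
```

The point is that the guest VM holds nothing but the operating system; the data lives on the share and survives if the VM is thrown away.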
This is exactly the architecture used in large virtualized Enterprise IT departments. Application data is kept on attached storage accessed by one or more virtual machines. Deploying additional virtual server instances is easy because the data is centrally located. The same concept can be used at home, on a smaller scale.
Everything is up and running now. I’m happy running Ubuntu instead of FC5 and I’m happy knowing my data backups are mirrored. I’ll be tracking the performance of the NAS device over the next few weeks and months. Consumer NAS devices are a tricky tradeoff of simplicity vs. functionality and performance. Someday, I want to experiment with removing a drive from the array and validating that the RAID rebuild occurs properly. For now, I’m just hoping the NAS is doing the right thing.
This post might more accurately refer to Totalitarianism instead of Socialism but what it lacks in precision, it makes up for with alliteration!
July 4th! Independence Day. In Seattle, we also refer to it as “the day before summer begins.” It always rains here on July 4th. It’s a tradition.
Beyond its meteorological implications, July 4th commemorates the signing of the Declaration of Independence. 232 years ago, representatives from the original 13 colonies formalized their desire to secede from the British Empire. What beef did they have with King George? Why all the fuss?
True, we know they were upset about “taxation without representation” and about the Stamp Act and import tariffs. Did you know, however, that Britain had conceded on all these points? Did you know that the 13 colonies had the highest standard of living in the world at the time and a very low tax rate?
Beyond any concrete economic or political issues, the colonies wanted to secede from Britain because they resented being told what to do by a remote sovereign that treated them as second-class citizens. If you read the Declaration of Independence and, even more so, the Constitution, you will easily detect a fundamental distrust of centralized Federal government. The forefathers went out of their way to delegate the fewest possible powers to the Federal government – the rest were reserved for the states. Arguably, the 2nd Amendment is all about the rights of the States to maintain militias so that they could fight the Federal government. Remember, the militias were formed to fight the British. The last thing the colonies wanted was to be powerless against another powerful central government that might turn out to be just as objectionable as the first.
This fundamental question of State vs. Federal rights is one that remains with us today. Battles over abortion, Medicaid payments, “No Child Left Behind” and other issues focus on what are State responsibilities and what are Federal responsibilities. States’ rights proponents argue that local government is more efficient and more responsive to local needs than a centralized Federal government.
Centralized vs. distributed control is also a frequent topic of discussion in Socialist vs. Capitalist systems. In Socialist systems, the State owns all the capital and decides how to allocate it. In Capitalist systems, individuals own capital and they decide how to invest it. Again, the question is one of tight, centralized, control vs. a loose distributed system.
This same tension exists in large software architecture. To what degree should software be tightly controlled by some central program vs. loosely controlled and autonomous? Occasionally, I will review a design for a complex system and I will complain to the architect that the design is “too Communist.” What I mean by this is that it relies too heavily on central planning and control.
As with governments and economies, there is a strong argument that software architecture should avoid excessive dependence on centralized control. Centralized control can lead to very brittle software that breaks when an alternative design would simply bend. Centralization frequently translates to designs with single points of failure. A server goes offline or a process fails and the whole system collapses.
As one who dislikes both centralized designs and socialist governments, I like to see software architectures with:
It is my premise that such systems are no harder to develop than centralized ones but they do require more creative architects to design them. Because we tend to design systems in a hierarchical fashion, it’s easiest to develop control systems that also flow from top to bottom. Developing loosely coupled systems takes a different mind-set. Once you’ve identified the components in your design, you need to treat each as an independent entity and you need to think about how they get the information they need and what they do with it when they’ve finished processing it. You also need to consider what each entity should do when it detects an error condition. If you design your components to be independent and robust, your design might be better able to heal itself if a component has an intermittent failure.
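To make this a little more concrete, here is a minimal Python sketch of one way to build such components. It is an illustration under my own assumptions, not a prescription: the Component class, its inbox, and the retry count are all invented for the example. Each component owns its input queue, decides for itself what to do on an error (retry a few times, then set the item aside), and knows nothing about its neighbors beyond where to send results.

```python
from collections import deque


class Component:
    """An independent processing stage with its own inbox and error policy."""

    def __init__(self, name, handler, downstream=None, max_attempts=3):
        self.name = name
        self.handler = handler          # function applied to each item
        self.downstream = downstream    # next component, if any
        self.max_attempts = max_attempts
        self.inbox = deque()
        self.failed = []                # items set aside after repeated errors

    def send(self, item):
        self.inbox.append(item)

    def drain(self):
        """Process everything in the inbox; never let one bad item stop the rest."""
        while self.inbox:
            item = self.inbox.popleft()
            for attempt in range(1, self.max_attempts + 1):
                try:
                    result = self.handler(item)
                except Exception as exc:
                    if attempt == self.max_attempts:
                        self.failed.append((item, repr(exc)))
                    continue            # retry, or give up on this item only
                if self.downstream is not None:
                    self.downstream.send(result)
                break


def make_flaky_doubler():
    """Return a handler that fails on its first call, then works."""
    calls = {"n": 0}

    def handler(x):
        calls["n"] += 1
        if calls["n"] == 1:
            raise RuntimeError("intermittent failure")
        return x * 2

    return handler


if __name__ == "__main__":
    results = []
    sink = Component("sink", results.append)
    doubler = Component("doubler", make_flaky_doubler(), downstream=sink)
    for value in (1, 2, 3):
        doubler.send(value)
    doubler.drain()
    sink.drain()
    print(results, doubler.failed)
```

No central controller supervises the retry: the component that saw the failure handles it, and an item that never succeeds is parked in its failed list instead of taking the whole pipeline down.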
It is true that, in the physical world, there are economies of scale but it is also true that past a certain point, inefficiency and bureaucracy increase with organizational size. In the software world, there are few benefits to centralization and size and many clear deficiencies. As the computer and IT industry moves increasingly towards outsourcing and SaaS, we need to keep in mind to what degree delegating responsibilities to service providers is a move towards centralization and all of its flaws.
We’ve been doing a lot of interviewing lately, looking for developers, QA folk and deployment engineers. We’ve looked at hundreds (if not thousands) of resumes, performed numerous phone and live interviews but made only a handful of offers. It’s been difficult to find people with the skills that we’re looking for.
Likewise Software is in the identity management business. We make software that allows non-Windows systems to authenticate against Microsoft Active Directory and to employ AD-based group policy. As such, our needs are probably more sophisticated than those of most companies.
First, we need someone with good Windows networking/AD/DNS skills. Our biggest challenge at customer sites is ensuring that their directories are properly configured. Our employees (especially our deployment engineers) need to be familiar with Active Directory and its architecture. They need to be able to run Likewise and Microsoft tools to verify that AD is configured and working properly. They need to be comfortable using tools like ADSIEDIT to look at objects in AD and they need to know what LDAP is. Experience with DNS and UNIX BIND is also valuable. Customers who choose to use BIND have to properly configure it to forward to AD/DNS or they have to manually set up a series of service records. Familiarity with NSLOOKUP and other tools is valuable.
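For a sense of what those service records look like, here is a small Python sketch that lists some of the SRV record names Active Directory domain controllers commonly register, along with nslookup commands to check them. The list is a commonly seen subset, not a complete or authoritative inventory of what any particular deployment needs, and example.com is a placeholder domain:

```python
def ad_srv_record_names(domain):
    """Return a commonly registered subset of AD-related SRV record names."""
    domain = domain.strip(".").lower()
    return [
        f"_ldap._tcp.{domain}",
        f"_ldap._tcp.dc._msdcs.{domain}",   # locates domain controllers
        f"_kerberos._tcp.{domain}",
        f"_kerberos._udp.{domain}",
        f"_kpasswd._tcp.{domain}",
        f"_kpasswd._udp.{domain}",
    ]


def nslookup_commands(domain):
    """Return nslookup command lines that query each SRV record."""
    return [f"nslookup -type=SRV {name}"
            for name in ad_srv_record_names(domain)]


if __name__ == "__main__":
    # example.com is a placeholder; substitute the AD domain being checked.
    for command in nslookup_commands("example.com"):
        print(command)
```

If the lookups fail on a BIND server that is not forwarding to AD/DNS, these are the kinds of records that would have to be created by hand.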
Because random things can always go wrong when using a network, familiarity with network analyzers such as Ethereal is also valuable.
Second, we need someone with good Windows administrative skills. They have to know how users are created in AD, how access controls are applied to resources and how Group Policy is used to help manage systems. They have to have some sense of how organizational units are used in AD and how GP objects can be inherited to accomplish company and departmental security and management goals.
Third, we need experience with UNIX and Linux administration. We support numerous versions of each so having familiarity with different shells and editors is a plus. You can’t rely on bash or vi being available on every system. Different versions of UNIX and Linux also have their own vagaries regarding where they store certain files and how they start/stop daemons. Having rudimentary knowledge of different places to look/techniques to use is important. You might be working on HP-UX one minute and on Ubuntu the next. Some knowledge of how local accounts are stored in /etc/passwd and /etc/shadow is a must.
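As a quick refresher on the /etc/passwd format, each line holds seven colon-separated fields: login name, password placeholder, UID, GID, comment (GECOS), home directory and shell. On systems that use shadow passwords, the password field is just an "x" and the hash lives in /etc/shadow. Here is a minimal Python sketch of a parser; the PasswdEntry type and the sample account are invented for illustration:

```python
from typing import NamedTuple


class PasswdEntry(NamedTuple):
    name: str
    password: str   # usually "x" when the hash is kept in /etc/shadow
    uid: int
    gid: int
    gecos: str
    home: str
    shell: str


def parse_passwd_line(line):
    """Parse one /etc/passwd line into a PasswdEntry."""
    fields = line.rstrip("\n").split(":")
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}")
    name, password, uid, gid, gecos, home, shell = fields
    return PasswdEntry(name, password, int(uid), int(gid), gecos, home, shell)


if __name__ == "__main__":
    # A made-up account, not taken from any real system.
    entry = parse_passwd_line(
        "alice:x:1000:1000:Alice Example:/home/alice:/bin/sh\n")
    print(entry.name, entry.uid, entry.shell)
```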
Fourth, we need someone with rudimentary knowledge of UNIX/Linux architecture. Knowledge of PAM and NSSWITCH is valuable. Understanding how name resolution works and how networks and firewalls are configured is valuable, too.
Fifth, some cursory programming skills are useful. We frequently need to write or modify shell scripts to help with deployments or testing/monitoring tools. Our account migration tools can generate scripts and being able to modify those is also valuable.
Sixth, some Mac knowledge comes in handy as does some experience with Linux Gnome desktops.
Seventh, general knowledge of Kerberos, Kerberos-based SSO and Kerberized applications is useful.
Finally, some experience with third party identity management systems is useful since we often need to interface with IBM ITIM or Sun Identity Manager or Microsoft ILM.
If you know anyone who meets all of these qualifications, let me know. I’m pretty sure that I’ve hired all four of them and that my competitors have the other four.
Of course, we don’t expect candidates to have all of these skills. We’re lucky if they have half of them. My observation, however, is that our needs, if you factor out a couple of domain-specific things (e.g. Kerberos and LDAP), are not far from what any modern IT department needs if they’re running a heterogeneous data center. The amount of information that you need to know to effectively manage both Windows and non-Windows computers is huge. It’s not surprising that many departments choose to segregate these duties and assign them to different teams. As an unfortunate consequence, however, there is often little interaction and, sometimes, open hostility between these teams. Introducing interoperability solutions is complicated by the inherent distrust between the two camps. IT departments would do well to encourage education and personnel movement between the teams as a way to cross-pollinate ideas.