An Intranet Case Study

(Adapted from "MSWeb: An Enterprise Intranet" by Louis Rosenfeld and Peter Morville (2002) on

We've had the privilege of getting up close to a large number of corporate intranets. And the best approach we've seen so far is that taken by Microsoft's intranet portal (MSWeb) team. Like Microsoft itself, MSWeb is insanely huge and distributed. Let's use some numbers to paint a picture of the situation. MSWeb contains:

Microsoft estimates that a typical employee spends 2.31 hours per day engaging with information, and 50 percent of that time is used looking for that information. Here are just a few examples of how this chaotic environment hurts Microsoft employees.

Where to begin? This is your typical case of "silo hell." With as many as 8,000 possibilities available, employees have a hard time determining where they should begin looking for the information they need.

Inconsistent navigation systems. Navigation systems are quite inconsistent because they employ many different labeling schemes. Therefore, users are confused each time they encounter a new one. Not only does this inhibit navigation, it also muddles the user's sense of place.

Concepts and labels. Because different labels are used for the same concepts, users miss out on important information when they don't search or browse for all the possible labels for those concepts. (For example, a search for "Windows 2000" should also cover "Microsoft Windows 2000," "Win 2000," "Win2000," "Win2k," "Win 2k," and "w2k.") Conversely, a term doesn't always mean what you think it does. (For example, ASP can mean "active server pages," "application service providers," or "actual selling price.")
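The variant-label problem can be made concrete with a small sketch. The Python below is illustrative only and is not MSWeb's implementation: the table names, the function names, and the idea of expanding a query before searching are assumptions, and the only data used are the terms quoted in the paragraph above.

```python
# Hypothetical controlled vocabulary: each preferred term lists its known variants.
VARIANTS = {
    "Windows 2000": [
        "Microsoft Windows 2000", "Win 2000", "Win2000", "Win2k", "Win 2k", "w2k",
    ],
}

# Hypothetical table of ambiguous terms and the meanings they may carry.
AMBIGUOUS = {
    "ASP": ["active server pages", "application service providers", "actual selling price"],
}


def build_index(variants):
    """Map every label (preferred or variant), case-folded, to its preferred term."""
    index = {}
    for preferred, alternates in variants.items():
        for label in [preferred, *alternates]:
            index[label.casefold()] = preferred
    return index


INDEX = build_index(VARIANTS)


def expand_query(query):
    """Return every label worth searching for; unknown queries pass through unchanged."""
    preferred = INDEX.get(query.strip().casefold())
    if preferred is None:
        return [query]
    return [preferred, *VARIANTS[preferred]]


def meanings(term):
    """Return the possible meanings of an ambiguous term, or an empty list."""
    return AMBIGUOUS.get(term.strip().upper(), [])
```

Calling expand_query("w2k") returns the preferred term followed by all six variants, so a search tool could match any of them. Calling meanings("asp") returns the three senses, which a search interface could offer as a disambiguation prompt.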

Ignorance is not bliss. Often, users are happy when they get any relevant information. But in a knowledge-intensive environment like Microsoft's, users are much more demanding - their jobs depend on finding the best information possible. In this case, employees often get frustrated because they don't know when to stop searching. Is the content simply not there? Or is a server down somewhere? Or maybe they didn't enter a good search query?

It's not hard to see how 1.155 hours of a typical employee's day (half of the 2.31 hours spent engaging with information) might get burned up. In short, Microsoft employees face an expansive and confusing information environment that's about as intimidating as the Web itself.

The flip side of this problem is how these challenges affect the people who are responsible for creating Microsoft's content or aggregating that content into portals. When a site is brought into the MSWeb fold, it comes with its own information architecture. Its organization systems, labeling systems, and other tricky information architecture components must be integrated into the broader MSWeb architecture or be replaced altogether. For example, as many as 50 different variants of product vocabularies had been created in the Microsoft intranet environment. Fixing such problems is a messy and complicated challenge for any information architect.

The MSWeb team must determine ways to normalize and simplify the environment to make content management easier and more efficient.

In the late 1990s, people at Microsoft began talking about an odd and often misunderstood term - taxonomies. When taxonomies become a common part of everyday conversation, it's a sure sign that an organization is ready for a deeper look into information architecture.

The MSWeb team knew that the time had come for a more ambitious approach to improving MSWeb. The team - populated by an impressive mix of information scientists, designers, technologists, and politically savvy managers - began to consider what users meant when they called for better (or any) taxonomies. Microsoft's employees thought of taxonomies as constructs that would help them search, browse, and manage intranet content more effectively.

In response, the MSWeb team developed a more generalized operating definition of taxonomies that would be more in line with how other employees were using the term. This flexibility - the willingness to speak the language of clients, rather than rigidly clinging to a "correct" but ultimately unpopular meaning - was key. It set the tone for successful communications between the MSWeb team and its clients throughout the organization.

Three flavors of taxonomies

The team defined a taxonomy as any set of terms that shared some organizing principle, and it identified three kinds. Descriptive vocabularies were controlled vocabularies that described a specific domain (e.g., geography, or products and technologies) and included variant terms for the same concept. Metadata schema were collections of labeled attributes for a document, not unlike a catalog record. Category labels were sets of terms to be used for the options of navigation systems. These three areas formed the foundation of the MSWeb approach. Better searching, browsing, and managing of information would be achieved by designing taxonomies that could be shared throughout the enterprise.

1. Descriptive vocabularies for indexing

Developing terms to manually index important pieces of content seemed a smart proposition for the MSWeb team. It would complement automated indexing by the search engine, which was then the primary means of making the site's content available. But creating and applying descriptive vocabularies is an expensive proposition, especially within an information environment as large as Microsoft's, and there are many different ways to index content. So half the battle was in selecting which vocabularies would deliver the most value to the organization as a whole.

Search Log Analysis - Search log analysis helped the MSWeb team gauge user content needs in their own words and determine appropriate vocabulary terms. Studying the search log's most common queries also helped the team get a good overview of which content areas were generally most valuable to users.

Availability - The team looked for decent controlled vocabularies that had already been developed in-house or that were available commercially.

Politics - The team was careful to talk with content stakeholders about what they felt they needed to make their content more accessible.
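The search log analysis described above comes down to counting what users actually typed. The Python below is a minimal sketch under stated assumptions: the log is taken to be one raw query per line, the function name top_queries and the sample queries are invented for illustration, and nothing here reflects the tools the MSWeb team actually used.

```python
from collections import Counter


def top_queries(log_lines, n=3):
    """Count normalized queries and return the n most common as (query, count) pairs."""
    counts = Counter()
    for line in log_lines:
        # Normalize case and whitespace so "Win2k" and "win2k " count as one query.
        query = " ".join(line.casefold().split())
        if query:
            counts[query] += 1
    return counts.most_common(n)


# Hypothetical sample log, one query per line.
sample_log = ["Win2k", "benefits", "win2k ", "ASP", "Benefits", "win2k"]
print(top_queries(sample_log, 2))  # [('win2k', 3), ('benefits', 2)]
```

The most frequent queries suggest both candidate vocabulary terms, in the users' own words, and the content areas that deserve manual indexing first.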

After taking all of these considerations into account, Microsoft narrowed its vocabulary development to the following vocabularies:

2. Metadata schema

Developed hand-in-hand with controlled vocabularies, metadata schema describe which metadata to use to describe or catalog a content resource. While Microsoft's descriptive vocabularies were driven by content and context, metadata schema were informed more by issues of users and content.

The MSWeb team developed a single schema that has value for both MSWeb and other intranet sites. Borrowing from the Dublin Core Metadata Element Set, MSWeb's schema was intended to be sufficiently 'stripped down' so that content owners would use it to describe resources, resulting in more records and therefore a more useful collection of content. The schema's simplicity was balanced with the goal of providing enough descriptive information to augment searching and browsing by users.

The schema's core fields are:

The schema has been commonly extended with these optional fields:

MSWeb began to use the metadata schema to create resource records in 1999. These fuel the immensely useful "Best Bets" search results and hold huge potential for improving areas such as content management.
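To illustrate how resource records might fuel a "Best Bets" feature, here is a hedged Python sketch. The field names are borrowed from Dublin Core purely for illustration and are not MSWeb's actual schema fields. The exact-match rule, the function name best_bets, and the example URLs are likewise assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ResourceRecord:
    """A catalog-style record describing one content resource (hypothetical fields)."""
    title: str
    identifier: str                                    # for example, the resource's URL
    description: str = ""
    subject: List[str] = field(default_factory=list)  # terms from a descriptive vocabulary


def best_bets(query, records):
    """Return records with a subject term equal to the query, ignoring case."""
    wanted = query.strip().casefold()
    return [r for r in records if any(wanted == s.casefold() for s in r.subject)]


# Hypothetical records; the URLs are placeholders.
records = [
    ResourceRecord("Windows 2000 Deployment Guide", "http://example.invalid/w2k-deploy",
                   subject=["Windows 2000", "Deployment"]),
    ResourceRecord("Travel Policy", "http://example.invalid/travel",
                   subject=["Travel", "Expenses"]),
]

for record in best_bets("windows 2000", records):
    print(record.title, record.identifier)
```

Because the match is made against manually assigned subject terms rather than full text, a small set of well-described records can surface at the top of the search results for common queries.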

3. Category labels

The third type of taxonomy - labels for the categories in site-wide navigation systems - was geared toward providing users of Microsoft intranet sites with navigational context. Category labels help users know where they are and where they can go. The MSWeb team employed a user-centered process for designing navigation systems, relying upon such useful standbys as card sorting and contextual inquiry. On its web page, category labels are on the left-hand side of the screen. Descriptions of nodes, displayed on the right-hand side, help catalogers choose the appropriate category label.

MSWeb's "three taxonomies" approach is steeped in traditional library science, which isn't surprising considering the backgrounds of many of those on the MSWeb team. But it's important to note how willing the team was to abandon the traditional library science concepts that didn't make sense in the intranet environment. For example, the team did not try to create "traditional" thesauri for its metadata schema and category label taxonomies. Other standards familiar to the library and information science (LIS) community, such as Dublin Core, weren't initially adopted for MSWeb's metadata schema because they were not appropriate at the time.

The MSWeb team has been driven by a philosophy built on flexibility of mind. Although many team members have library science backgrounds, they have left their disciplinary baggage at the door in order to achieve buy-in and support from colleagues with different backgrounds and perspectives. The team was also successful because of its flexible makeup: it included not just LIS people but also technologists, technical communicators, designers, and strategists. In addition to lending the team more credibility with outsiders, the team's interdisciplinary nature meant that many ideas were explained, translated, and fought over before they were ever exposed to outsiders. Interdisciplinary perspectives lead, as always, to a better and more marketable set of services.

Discussion Questions

  1. Information Ecology: What are the major information challenges the organization is facing?
  2. Information Use Behaviors: Who are the users of the Intranet? What are their information needs? What are the desired information use outcomes?
  3. Value-added Intranet: In what ways does the Intranet address the information challenges and information use behaviors noted above?