
If search is a platform technology, best bets is a must-have
addition to that platform. Try naming a report Global Brief,
putting it in an index with a million other documents, and finding
it. Global and brief are two very common words, and this is a
daily report that clients look for by 7am each morning. There may
be millions of occurrences of global and of brief, and thousands
of global brief. Is it reasonable to expect any search engine to
find this needle in the haystack?
Best Bets is a way to feature the most commonly sought documents
based on the search string entered; they are presented at the top
of the search results page. This is not a weighting scheme.
Weighting pollutes the search engine and makes it impossible to
consistently maintain the proper weights for key documents.
Remember, the index is in constant flux. Instead, best bets runs off
keywords. At its most basic, a lookup table is maintained with
keywords that automatically push documents to the user. For example,
any mention of the word global in the search string would
bring back the latest edition of the Global Brief. Note that these
best bets should always be set off from the rest of the search
results to show that they are hand-selected and not part of the
regular results.
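As a rough illustration of the basic lookup table described above, the sketch below maps a keyword to a featured document. The table contents and the function name best_bets_for are hypothetical, chosen only to mirror the Global Brief example.

```python
# Hypothetical hand-maintained table: keyword -> document to feature.
BEST_BETS = {
    "global": "Global Brief (latest edition)",
}


def best_bets_for(query, table=BEST_BETS):
    """Return featured documents whose keyword appears in the query."""
    featured = []
    for word in query.lower().split():
        doc = table.get(word)
        # Feature each document once, in the order its keyword appears.
        if doc is not None and doc not in featured:
            featured.append(doc)
    return featured
```

The returned list would be shown in its own panel above the regular results, not merged into them.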
The EDM Logic Approach:
While we are big fans of Best Bets, we do not believe in the
hand-maintained lookup table. It does not adapt to the changing
needs of users, and humans cannot keep up with huge indexes and
long lists of keyword matches. We also do not believe in matching
from the “keyword” side. Instead, look at the top 200 or so
documents accessed by users and work backwards to develop keyword
matches. Search follows the 80/20 rule: roughly 80% of searches are
looking for the most common documents in your index. Taking the
Top 200 or even 500 and assigning keywords to those documents is by
far the better way to implement Best Bets. The keywords are not
picked by humans. The process is fully automated: the system
analyzes all the traffic to each page and determines the most
frequent keyword or keywords used to reach it. The Top 200
documents are constantly in flux, so every week the system rebuilds
the tables to match the changing needs of users, with no
intervention by IT.
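The document-side approach can be sketched roughly as follows. This is a minimal illustration, not the production system: it assumes a click log of (query, clicked document) pairs, and the function name build_best_bets and its parameters are hypothetical.

```python
from collections import Counter, defaultdict


def build_best_bets(click_log, top_docs=200, keywords_per_doc=1):
    """Build a keyword -> documents table from (query, clicked_doc) pairs."""
    # Count how often each document was reached from a search.
    doc_hits = Counter(doc for _, doc in click_log)

    # For each document, count the query words that led users to it.
    terms_by_doc = defaultdict(Counter)
    for query, doc in click_log:
        terms_by_doc[doc].update(query.lower().split())

    # Keep only the most accessed documents and their most frequent keywords.
    table = defaultdict(list)
    for doc, _ in doc_hits.most_common(top_docs):
        for term, _ in terms_by_doc[doc].most_common(keywords_per_doc):
            table[term].append(doc)
    return dict(table)
```

Running this on a fresh log each week would rebuild the table without manual keyword selection. A real deployment would likely also drop very common words before counting.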

Some may ask what this has to do with search, but again the goal is
findability. Being able to navigate the documents is a key component
of knowledge discovery and should be a part of all search platforms.
The three pane navigator concept plays off the Apple iPod’s
extraordinary success and ease of use. Users can navigate through
large numbers of documents in three clicks or fewer. And this is
not limited to company-defined navigation schemes. Using web
analytics, navigation choices may include most downloaded or most
frequently read documents. Users can also add their own tags and
have personalized navigation to “dogear” and recall documents
they have already visited.
But this innovation does not stop at navigation. Search plays a
role on both the front end and the back end of the three pane
navigator. First, at any step in the navigation process, users can
enter a search term and search only a specific area of the index.
Second, once navigating and reviewing documents, users can search
from the document listing pages.
The EDM Logic Approach:
For search consultants, it may seem counterintuitive that navigation
would play a critical role in the platform, but any help a user can
give to the search engine drives far better results in the end.
Just by selecting an area of the organization, the portion of the
index searched may be reduced by 80% or more. This drives much
better precision. Some of our clients make it mandatory that users
choose an area before any search string can be entered, but this is
the exception and not the rule. The real advantage of the three pane
navigator comes after the user’s first search. Down the left side of
the search results are the same navigation areas, but this time with
counts (in parentheses) that show how many documents match the
search criteria in each segment of the index. With one additional
click, users improve search results dramatically.
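The per-area counts and the one-click narrowing can be sketched as below. This is an illustrative sketch only: documents are assumed to be dictionaries with hypothetical "area" and "text" keys, and matching is a plain substring test standing in for a real search engine.

```python
from collections import Counter


def search_with_facets(documents, term, area=None):
    """Return (matching documents, match counts per navigation area)."""
    term = term.lower()
    matches = [d for d in documents if term in d["text"].lower()]

    # Counts cover every area so the navigator can show them in parentheses.
    counts = Counter(d["area"] for d in matches)

    # One additional click on an area narrows the result list to that segment.
    if area is not None:
        matches = [d for d in matches if d["area"] == area]
    return matches, counts
```

Calling it first without an area gives the counts for the left-hand navigation; calling it again with the chosen area gives the narrowed list.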

While the three pane navigator is a great start to refine search,
there are additional options that drive findability, especially
after the initial search is executed. Some may feel that entering an
additional query or clicking an additional link will turn off users.
Actually, quite the opposite is true. Users want an iterative
experience with technology. They are prepared to take the journey as
long as they are on the right track. What they are not prepared to
do is start over again and again with a new search string. And they
are not prepared to navigate multiple screens without some type of
input. This is why most users do not go beyond the second results
screen in Google.
So, how do we have this conversation with users of search? The
simplest option is to offer search within results: the ability to
add a word or series of words to the existing query. Another is to
suggest additional keywords the user may add to the query, but we
find that computers have difficulty judging which related terms to
suggest.
The EDM Logic Approach:
The best way to refine search is to allow the user to select several
documents “of interest” on the search results page and then ask the
system to find more of the same. Let’s say the user searches on the
term monopoly. This may bring a mix of results, from board
games to a form of business practice, even on the first page of
results. By selecting several that look interesting about the
business of monopolies and clicking refine search, the user is
presented with a full list of very relevant documents. This usually
gets a “wow” from the user because the documents are all right on
point. The reason is that extracts from the selected documents are
used to drive the new search. A one word search string, monopoly,
may become 200-300 words that describe a form of business practice
in great detail. Note that clients must employ a pattern-matching
engine to get this to work, because the engine has to accept very
long queries; Google, for example, limits queries to just 32 words.
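A minimal sketch of the "find more of the same" idea follows. It is not the pattern-matching engine itself: it swaps in simple word counts and cosine similarity as a stand-in, and the function names and the dictionary of candidate documents are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn text into a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def refine(selected_texts, candidates):
    """Rank candidates (id -> text) against the combined selected extracts."""
    # The selected extracts together become one long query.
    query = vectorize(" ".join(selected_texts))
    scored = [(cosine(query, vectorize(text)), doc_id)
              for doc_id, text in candidates.items()]
    return [doc_id for score, doc_id in sorted(scored, reverse=True) if score > 0]
```

The point of the sketch is that the extra words from the selected documents, not the original one-word query, separate the business documents from the board game ones.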

If communal taxonomy is an advanced concept for the search platform,
kSense takes it a step further. kSense stands for Knowledge
Sense and is a direct play off AdSense from Google. But with kSense,
content authors market their work without the exchange of money. For
example, let’s say an analyst puts out a morning note on the Big
Three, but it really has some interesting information on Ford. The
analyst can log in to an administrative console and sign up for the
keyword “Ford,” so that anytime a user query contains the term Ford,
the morning note is featured on the right-hand side of the
search results page. As with AdSense, the analyst includes a title,
a brief two-line description, and a link to the morning note. Content
authors can sign up for multiple keywords, and when there is overlap,
the document that is clicked on most gets the top spot on the search
results page. Of course, the age of the document has to be
factored in to allow new documents to compete. In some cases,
clients want the order to be date descending, showing the very latest
at the top of the right-hand column.
The EDM Logic Approach:
Content authors “advertising” their work within the enterprise
search engine may give companies some pause, but authors really
respond to the ability to influence the delivery of information. They
are passionate about their work product and like to see it
well represented on the site. This is also a form of best bets,
where frustrated authors can have some control over the search
results, even if it’s only in the right-hand column. More important
is the immediate feedback they get from the kSense program. By
monitoring clicks, they can fine-tune the marketing message or get a
better understanding of which documents get visited and which do
not.
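One way to combine click counts with document age when listings overlap is sketched below. The half-life decay formula is an assumption made for illustration, not the ranking kSense is stated to use, and the Listing fields and rank_listings name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    title: str       # headline shown in the right-hand column
    keyword: str     # keyword the author signed up for
    clicks: int      # clicks recorded so far
    age_days: float  # days since the document was published


def rank_listings(listings, query, half_life_days=7.0):
    """Order the listings whose keyword appears in the query."""
    words = set(query.lower().split())
    matching = [item for item in listings if item.keyword.lower() in words]

    def score(item):
        # Assumed scoring: clicks decay by half every half_life_days.
        # The +1 lets a brand-new listing with no clicks still compete.
        return (item.clicks + 1) * 0.5 ** (item.age_days / half_life_days)

    return sorted(matching, key=score, reverse=True)
```

For clients who prefer strict date ordering, the score function could simply be replaced with a sort on age_days ascending.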

This is a very advanced concept in search and involves some
extensive programming to implement. Taxonomies are often created by
a steering committee or some group that tries to represent the
interests of users and employees navigating large data sets. We do
this all the time, and the secret is that it does not have to be
perfect. By employing machine learning after the taxonomy is
deployed, usage can be analyzed and changes made within 24 hours of
an issue being identified. The entire index gets updated with the
fine-tuned taxonomy. But what if there were a way to eliminate the
steering committee and the need to analyze usage data?
The EDM Logic Approach:
Sites can use path tracking to analyze not only what users click on,
but also the order in which they navigate web pages within the site.
Think of it as RFID for documents. One day, as you grocery shop and
put things in the basket, a computer screen on the cart will make
suggestions for other items and even recipes. It will even keep a
total cost of everything in the basket, all fully automated. Why
should this be any different for documents? By following the paths
users take through web sites, natural clusters begin to emerge.
Let’s say documents dealing with oil and gas seem to be popular with
users searching for information on logistics. A taxonomy could be
formed that combines articles on oil with articles on logistics to
form a “neighborhood” of interest. Even if a company does not want
its enterprise taxonomy to be fully automated, this technology could
be very useful in suggesting related articles based on previous user
activity. The best part is that communal taxonomies are constantly
tuning themselves. So, if an event happens somewhere in the world,
within days information surrounding the event starts to cluster and
becomes a part of the navigation.
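The way clusters might emerge from user paths can be sketched as follows. This is a simplified illustration under stated assumptions: each session is a list of topic or document labels, order within the path is ignored, and the names co_visit_counts and neighborhoods plus the min_count threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_visit_counts(sessions):
    """Count how many sessions visited each pair of items."""
    pairs = Counter()
    for path in sessions:
        for a, b in combinations(sorted(set(path)), 2):
            pairs[(a, b)] += 1
    return pairs


def neighborhoods(sessions, min_count=2):
    """Group items linked by at least min_count shared sessions."""
    parent = {}

    def find(x):
        # Union-find lookup with path compression.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the groups of any two items that are visited together often enough.
    for (a, b), n in co_visit_counts(sessions).items():
        if n >= min_count:
            parent[find(a)] = find(b)

    groups = {}
    for item in list(parent):
        groups.setdefault(find(item), set()).add(item)
    return [g for g in groups.values() if len(g) > 1]
```

Rerunning this on recent sessions would let new neighborhoods appear as usage shifts, for instance around a news event. The same pair counts could also drive related-article suggestions without changing the taxonomy.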