Drupal AI, Project 2: Virtual SME


This project began as the notion of using an interactive story as a template for an AI system.   As conceived, "the Choose Your Own Adventure structure will serve as a base for a more elaborate Artificial Intelligence (AI) system: a conversation can be modeled as a story with decision branches."

Project 2, the Virtual Subject Matter Expert:

The second iteration involves an interactive navigation through a knowledge-base by a sort of digital Subject Matter Expert (SME).  This project concerns the design of an instructional unit that is targeted to high-school level computing science students.

Basic groundwork for the project involves describing a visual/spatial model for showing how to build a Drupal system.  "First you must build a foundation: install and configure the underlying LAMP stack.  Then you must install Drupal.  Think about the kind of site you want to build, and map out its requirements.  Which do you want to design first, the look or the information architecture?  Sometimes choosing a theme first will help inspire you, but more often than not you want to jump right in creating content types for your newspaper, brochure, magazine, or 'digital clubhouse' for your school or work or church group." ... The central metaphor may be the exploration of a space.

This creates good metaphors for navigating the site, as well as producing more compellingly readable copy.  More dynamic text will better engage readers.  Exploring the visual landscape (or 'information architecture') of the subject matter to be explained will reveal internal structural dynamics which may be exploited for more effective instruction.  Certain metaphors are native to certain domains of knowledge, and should inform the structure for those information trees.

Notes for the project:

Let's begin to map out the "conversation" for the site.  Although perhaps trite, the metaphor of the website as a house is familiar and therefore expedient.  We begin with a discussion of what sort of house we want: how many floors do we need?  What sorts of rooms will we have, and how many people will be walking through them?  Once you've got an idea of the basic structure you're after, you can choose a number of different ways to go.

My first attempt might seem a little cheesy, but I'm going to push the Choose Your Own Adventure metaphor a bit hard.

"You approach a scary looking house on a hill.  It is the website you are going to build.  You will build the foundation, and then you will either climb into the basement and fix the plumbing, walk into the foyer and design the layout of rooms, or stay outside and begin to paint and landscape."

FOUNDATION: installing Apache, MySQL, PHP and Drupal, more or less in that order.  Not optional, although the first three can be done in pretty much any order.

BASEMENT: landscape of modules and installation profiles.  Apache Solr and Aegir and Pantheon, hosting and multihoming.  Drush, features, devel, and backup_migrate.

GROUND FLOOR: defining site structure, choosing sections, filetypes and basic taxonomies.

EXTERIORS: styling and theming.  CSS, themer, ds and themekey.

OUTER GATE: firewalling Drupal, permissions, mollom, system reports and updates.  This is the first and last thing you think about.  Other sections should link to the appropriate security considerations throughout.

Both the narrative and the site-structure become problematic when people begin to branch out laterally.  I think the trick is going to be keeping the navigation within each basic category (Foundation, Basement, etc.) essentially linear.  A consideration of the previous decision-tree reveals a typical pattern:

Within each of the individual "nodes on the network" are arbitrarily long or short linear progressions in which readers simply flip from one page to the next.  This is much like how conversations can oscillate between more expository/literal and more creative/imaginative modes.  Perhaps the expository decision-tree might look more like the following:

(of course with many more lines from each category to the tops of neighboring lines, links to "uncle/aunt" nodes...)
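
The pattern described above, a handful of sections that each hold a linear run of pages, with branching only between sections, might be sketched roughly as follows.  This is only an illustration in Python, not the Drupal implementation: the section names follow the house metaphor and the three choices after the foundation come from the opening passage, but the page titles, the remaining links, and the next_steps helper are all assumptions made for the example.

```python
# Section names follow the house metaphor; page titles and most links
# are placeholders invented for this sketch.
SECTIONS = {
    "foundation": {
        "pages": ["Install Apache", "Install MySQL", "Install PHP", "Install Drupal"],
        "links": ["basement", "ground floor", "exteriors"],
    },
    "basement": {"pages": ["Modules", "Hosting"], "links": ["ground floor", "outer gate"]},
    "ground floor": {"pages": ["Site structure", "Taxonomies"], "links": ["exteriors", "outer gate"]},
    "exteriors": {"pages": ["Styling", "Theming"], "links": ["outer gate"]},
    "outer gate": {"pages": ["Permissions", "Updates"], "links": []},
}


def next_steps(section, page_index):
    """Return the (section, page_index) moves open to the reader.

    Inside a section the only move is to flip to the next page;
    branching choices appear once the last page has been read.
    """
    pages = SECTIONS[section]["pages"]
    if page_index < len(pages) - 1:
        return [(section, page_index + 1)]
    return [(target, 0) for target in SECTIONS[section]["links"]]
```

Calling next_steps("foundation", 0) offers only the next page of the foundation, while next_steps("foundation", 3) offers the first page of the basement, ground floor, or exteriors.  Adding the "uncle/aunt" links would mean attaching extra links to individual pages rather than only to the end of a section.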

The problem with becoming overly linear is that it begins to feel more like narrative than navigation, being led about rather than exploring.  This is one of the fundamental explorations with this little site: where are the gaps, and where is the room for improvement?  Anything which can make the exposition seem more organic and more fluid is to be considered a move in the right direction.  Along the way, I will explore whether the use of tags, rules and contexts can enable a more dynamic means of presenting the site content.

Conclusions from the project:

This project has been filled with many fruitful failures, and many object lessons in the difficulties inherent in any AI project.  In one major sense, this particular project is an abject failure, because it demonstrates no real artificial intelligence, it's not going to pass the Turing test, etc.  In another very real sense, it has revealed for me a better direction.

My realization is that conscious understanding of a communicating entity's meaning is determined through the application of a successive series of approximating contexts.  In other words, when someone comes at us with a series of semiotic signals, we make a cascading series of guesses about possible context.  Based on various criteria, we make differing assumptions (more or less conscious) about that person's emotional state, what subject matter they might be concerned with at the moment, etc.  If Fred from accounting strides up to us and asks "Good stuff, eh?" on Monday morning as he's walking out of a board meeting, we may safely assume he's talking about the budget, whereas if we run into Fred stumbling out of the poolside bar at Planet Hollywood Las Vegas wearing a loud Hawaiian shirt, then "Good stuff, eh?" is far more likely to apply to some parasol drinks with particularly high alcohol content.  Computers are notoriously inept at determining human social context.

Good thing, then, that computers have great facility at determining bureaucratic context.  A loan officer's computer and a university admissions officer's computer can "make great sense of" a question such as "How do I apply?" because the context is predetermined, which greatly narrows down the subject matter about which the system must appear expert.  Although it is very tempting to build a "Frankenstein AI" system such as Eliza as a Virtual Subject Matter Expert (with or without visual avatar to complete the simulacrum), I am going to abandon that notion for the third or even fourth iteration of this Drupal AI project.
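
To make the point about predetermined context concrete, here is a minimal sketch in Python of how the same question can resolve to different answers once the context is fixed, and how a system might work through a cascade of candidate contexts in order.  Everything in it is hypothetical: the ANSWERS table, the context names, the canned replies, and the answer function are invented for illustration and do not describe any existing system.

```python
# Hypothetical knowledge base: answers are keyed first by context,
# then by a normalized form of the question.
ANSWERS = {
    "loan office": {"how do i apply": "Fill out the loan application form."},
    "admissions office": {"how do i apply": "Submit the admissions application."},
}


def normalize(question):
    """Lower-case the question and trim surrounding spaces and punctuation."""
    return question.lower().strip(" ?!.")


def answer(question, contexts):
    """Try each candidate context in order, most likely first.

    Returns the context that matched together with its reply, or
    (None, "I don't know.") when no context can make sense of it.
    """
    key = normalize(question)
    for context in contexts:
        reply = ANSWERS.get(context, {}).get(key)
        if reply is not None:
            return context, reply
    return None, "I don't know."
```

With the context fixed to "loan office", the question "How do I apply?" has exactly one sensible reading.  Fred's "Good stuff, eh?" falls through every context here, which is the social case the table cannot cover.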

What I am finding is that a great deal of the hierarchy of a given subject matter is not generalizable into abstract form/equation/process/algorithm, but is in fact highly idiosyncratic to the information tree of that particular knowledge base.  In other words, we should not expect the data structures of dentistry to match up with those of veterinary medicine, let alone for there to be a generalized form of knowledge structures for medicine, law, sports, technology, etc.  With that in mind, my next project will be a further refinement of the "Virtual SME" notion, this time with the data set of cognitive science, rather than Drupal HOW-TOs.  The next steps will be to allow multiple means of site navigation: a basic linear narrative, but with branching, searches, index, etc.  As I complete data entry for the existing Drupal SME project, I am paying particular attention to keywords with an eye towards applying multiple vocabularies (and free-form tagging) in ways that approach humans' seemingly effortless determinations of "best-fit" contextualizations.
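
One very rough way to picture "best-fit" matching across multiple vocabularies is to count how many tags a page shares with a query, vocabulary by vocabulary.  The Python sketch below assumes a made-up set of pages, two invented vocabularies ("section" and "topic"), and a simple overlap count as the score; a real site would have its own vocabularies and would likely weight them differently.

```python
# Hypothetical pages, each tagged under more than one vocabulary.
PAGES = {
    "Installing Drupal": {"section": {"foundation"}, "topic": {"install", "php", "mysql"}},
    "Choosing a theme": {"section": {"exteriors"}, "topic": {"theme", "css"}},
    "Setting permissions": {"section": {"outer gate"}, "topic": {"security", "permissions"}},
}


def score(page_tags, query_tags):
    """Count the query tags that match a page, summed over all vocabularies."""
    return sum(
        len(tags & query_tags.get(vocab, set()))
        for vocab, tags in page_tags.items()
    )


def best_fit(query_tags):
    """Return page titles ranked by tag overlap, dropping pages with no match."""
    ranked = sorted(PAGES, key=lambda title: score(PAGES[title], query_tags), reverse=True)
    return [title for title in ranked if score(PAGES[title], query_tags) > 0]
```

A query tagged with the topics "security" and "permissions" and the section "exteriors" ranks "Setting permissions" first and "Choosing a theme" second, since the first page shares two tags with the query and the second shares one.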