Lifestyle Listings Engine…stuff. PART 2
Yesterday I left off with a promise to explain what an expert system is, and give some insight into how those play into Onboard Informatics' approach to aggregating listings (and other) data. Ok…so as cliffhangers go it’s no Stewie Kills Lois, but I’m in this for the art not the ratings.
The first time the phrase “expert system” found its way into the dictionary (Merriam-Webster) was in 1977: computer software that attempts to mimic the reasoning of a human specialist. While there are many other concepts and an almost unconscionable amount of geekspeak jargon that go along with this concept, this definition holds up. Computers are stupid, standard software only mildly less so. In a perfect world I would have a human being working in constant attendance on an individual feed. Actually, make that a team of people all dedicated to one feed since the burnout rate would be staggering if you didn’t round-robin them.
So, acknowledging that human wisdom and expertise are the most reliable way to make good decisions, how do we support a solution which scales? By attempting to codify that “wisdom”, i.e. domain-specific knowledge applied within a rules and/or model-based construct, into software. In any given domain there’s what’s generally known, and then there’s what true experts know–the real skinny.
Want the secret sauce? Distill the heuristic knowledge of a body of human experts into a knowledgebase (just think of “heuristics” as meaning “rules of thumb”). Now develop a robust system able to infer/derive new knowledge for decision-making from that knowledgebase, enabling appropriate feedback loops so that the system continuously improves over time. This amounts to a (tragically) long array of IF-THEN assessments which are dynamic in nature, growing ever longer as the system “learns”. We’re hardly the first to this particular party in a general way, but I don’t think it’s been done as significantly in RE before.
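For the code-minded, here is a minimal Python sketch of what "a long array of IF-THEN assessments" can look like. It is an illustration only, not the actual Lifestyle Listings Engine: the names `Rule`, `infer`, and `add_rule`, and the sample rules about missing prices, are all made up for this example. It uses simple forward chaining, where rules keep firing until nothing new can be derived.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Facts = Dict[str, object]


@dataclass
class Rule:
    """One hypothetical rule of thumb: IF condition(facts) THEN add action(facts)."""
    name: str
    condition: Callable[[Facts], bool]
    action: Callable[[Facts], Facts]


def infer(facts: Facts, rules: List[Rule]) -> Facts:
    """Forward-chain: fire rules repeatedly until no rule adds a new fact."""
    known = dict(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.condition(known):
                for key, value in rule.action(known).items():
                    if key not in known or known[key] != value:
                        known[key] = value
                        changed = True
    return known


def add_rule(rules: List[Rule], rule: Rule) -> None:
    """The 'learning' step in this sketch: the rule list simply grows."""
    rules.append(rule)


# Illustrative rules only; real heuristics would come from domain experts.
rules: List[Rule] = [
    Rule("missing_price",
         lambda f: f.get("price") is None,
         lambda f: {"suspect": True}),
    Rule("suspect_goes_to_human",
         lambda f: f.get("suspect") is True,
         lambda f: {"needs_human_review": True}),
]

flagged = infer({"price": None, "photos": 3}, rules)
clean = infer({"price": 250000, "photos": 3}, rules)
```

Note how the second rule fires only because the first one derived a new fact. That chaining is the "infer/derive new knowledge" part; a real system would also need conflict resolution and rule ordering, which this sketch skips.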

Some brief descriptions of what’s going on in this flowchart:
1) Controller: manages the occasionally complex interaction between some of the subsystems, with the heaviest load being the interaction between KnowledgeBase and Inference Engine. Those components make up the bulk of “intelligent” operation in the system and require significant management to maintain correct order of operation, etc. This piece also works as the “scratchpad”, allowing for fast ad hoc iterative assessment and conclusion.
2) KnowledgeBase: this is where the heuristics, IF-THEN rules, fuzzy logic, rules of thumb, expertise, wisdom, whatever you want to call it are stored. Don’t be too impressed, this is just a database, and a pretty simple one at that.
3) Inference Engine: this component is responsible for actually implementing the IF-THEN rules, and can do so both sequentially and in parallel based on the need. Ok, you can be impressed with this one–it’s tough.
4) Knowledge Feedback Loop Engine: responsible for determining what information is fed back into the KnowledgeBase in order to refine the system’s capabilities, most often (at this point) in partnership with human guides.
5) Human Experts: aka, wetware. These are folks whose brains are being drained (as we speak) into the KnowledgeBase–a relatively painless process. Occasionally they don’t even know they are participating. Obviously these people will be the first casualties of the war when the system inevitably decides it can serve real estate better than us. Heroes, all.
6) Report Generation: the primary way by which the Human Experts can see into the system’s activities and provide feedback, correction, etc.–e.g. the system reports that image population in a particular feed’s records has fallen below norm, but not critically. Only humans are capable of reaching the intuitive and experiential conclusion that the tolerance may need to be updated due to conditions beyond the system’s ken.
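The image-population example in item 6 can be sketched in a few lines of Python. Again, this is a hedged illustration and not our real implementation: `Tolerance`, `image_population`, `assess`, the feed name `feed_a`, and every threshold value are assumptions invented for the example. The idea is that the system only reports; a human decides whether the tolerance stored in the KnowledgeBase should change.

```python
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class Tolerance:
    """Hypothetical KnowledgeBase entry for one feed."""
    norm: float           # expected share of records that carry images
    warn_drop: float      # drop below norm that earns a line in the report
    critical_drop: float  # drop below norm that counts as critical


def image_population(records: List[dict]) -> float:
    """Share of records with at least one image (0.0 for an empty feed)."""
    if not records:
        return 0.0
    return sum(1 for r in records if r.get("images")) / len(records)


def assess(rate: float, tol: Tolerance) -> str:
    """Classify the observed rate against the stored tolerance."""
    drop = tol.norm - rate
    if drop >= tol.critical_drop:
        return "critical"
    if drop >= tol.warn_drop:
        return "below norm"
    return "ok"


# Stand-in for the KnowledgeBase: one tolerance per feed (values are made up).
knowledge_base: Dict[str, Tolerance] = {
    "feed_a": Tolerance(norm=0.9, warn_drop=0.1, critical_drop=0.3),
}

# Ten records, seven of which have images.
records = [{"images": ["front.jpg"]}] * 7 + [{"images": []}] * 3
rate = image_population(records)
before = assess(rate, knowledge_base["feed_a"])

# A human expert reads the report and decides the norm itself has shifted,
# so the correction is fed back into the KnowledgeBase.
knowledge_base["feed_a"] = replace(knowledge_base["feed_a"], norm=0.7)
after = assess(rate, knowledge_base["feed_a"])
```

In this sketch `before` is "below norm" and `after` is "ok": the data did not change, the stored expectation did, and that judgment call came from a person rather than from the code.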
“Listings” aggregator - means not the end
I think that might be a good place to conclude, by highlighting that humans are an intrinsic part of this system. We fold humans in all along the way because 1) we just don’t trust the computers that much–they’re dumb, and potentially evil (what…you never saw Terminator?); and 2) the system is in place, but as with any knowledge-based system it’s only as good as its knowledgebase is complete. I’d warrant ours is better than anyone else’s, but it’s too new to be complete and needs to be spot-checked by an actual for-real expert who repairs mistakes and reports them back into the system, thereby improving the knowledgebase and system. The upshot is our system may make mistakes from time to time, but they’ll be caught and repaired quickly, and shouldn’t happen twice.
That our approach uses an expert system puts it way ahead of the curve in the RE space, but I don’t want to overstate it. We’re still at the nascent stages of this. It should also be remembered that our goal isn’t to be an IDX, VOW, or even generic “listings” aggregator–that’s the means, not the end. The result we want to achieve is good enough information about the available properties in a locale that we can formulate effective search pathways, and help our customers create compelling user experiences.
Ultimately, our vision calls for the incorporation of a neural network to allow for some predictive analysis, advanced natural language processing, and some other pieces which are evolutionary by their very nature. “Better than anything I’ve ever seen in real estate” is a phrase I heard more than once at Inman this past week. That’s not a bad place to start, but we have a long way to go to bring this to where we think it can and should be.
So day one was basically a statement of the problem, day two a brief description of the “superstructure” used to solve that problem (trust me, that was brief). Tomorrow, I’ll post about some of the specific problems which occur in the aggregation of listings data, and try to give some insight into how our system handles those recurring issues.
- Liam
Tags: data, IDX, Lifestyle Listings Engine, listings feeds, Onboard Informatics, VOW.

on Jan 14th, 2009 at 12:38 am
thanks for explaining all of this stuff.