It’s easy to become overwhelmed by terms and definitions when looking to understand what “knowledge” is – in the computable sense. For example, what is the difference between knowledge, data, and information?
In this article – the first in an occasional series within the blog – we’ll take an intuitive and practical approach to understanding what knowledge is. This will establish a baseline for exploring – in future articles – how knowledge is captured, retrieved, and applied to solve real-world challenges. These challenges may range from helping a customer use a product feature to aiding a researcher in finding the most suitable molecule to trigger a specific immunity in a human.
Data, information, knowledge, content …
Consider the statement “this stone weighs one kilogram.” Compare it with the statement “Penicillin treats bacterial infections but cannot be given to patients who are allergic to penicillin.”
Intuitively, the first statement is closer to “data” and the second closer to “knowledge”. This is partly because the first is a simple, atomic assertion, while the second is built from multiple assertions (“facts”) interconnected by specific relationships (“medications treat conditions”, “patients can have conditions and allergies”) and a rule (“don’t treat patients with medications they are allergic to”). That rule may seem obvious to a human, but it must be explicitly represented for a computer to make use of it.
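To make the point about explicit representation concrete, here is a minimal Python sketch of the penicillin example. It is only an illustration under stated assumptions: the triple layout, the relation names (`treats`, `has_condition`, `allergic_to`), the patients, and the function `can_prescribe` are all invented for this article and do not come from any particular knowledge system.

```python
# Atomic facts stored as (subject, relation, object) triples.
# Every name here is hypothetical and chosen only for illustration.
facts = {
    ("penicillin", "treats", "bacterial infection"),
    ("patient_a", "has_condition", "bacterial infection"),
    ("patient_a", "allergic_to", "penicillin"),
    ("patient_b", "has_condition", "bacterial infection"),
}


def can_prescribe(patient, medication, facts):
    """Explicit rule: a medication may be given only if it treats one of the
    patient's conditions and the patient is not allergic to it."""
    # The "obvious" rule has to be spelled out for the computer.
    if (patient, "allergic_to", medication) in facts:
        return False
    conditions = {
        obj for (subj, rel, obj) in facts
        if subj == patient and rel == "has_condition"
    }
    return any((medication, "treats", c) in facts for c in conditions)


print(can_prescribe("patient_a", "penicillin", facts))
print(can_prescribe("patient_b", "penicillin", facts))
```

Each triple on its own is close to “data”. The answer for a given patient only emerges once the facts are connected through their relationships and the rule is applied, which is the intuition behind calling the second statement “knowledge”.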
A classic way to appreciate the differences between data and knowledge is the “DIKW” paradigm, which is a hierarchy of increasingly complex representations of observed or reasoned phenomena, from data, to information, to knowledge, to wisdom (hence the acronym). More detail on DIKW and similar models can easily be found using your favorite search engine.
The focus of this article is on knowledge, and we assume that it is more important (at this introductory level) to understand intuitive differences than to argue fine distinctions about whether any given statement represents data, information, knowledge, content, or something else. Not all statements would be as easy to classify as the previous examples.
Representing knowledge
We refer to individual pieces of knowledge (facts, relationships, rules, etc.) as knowledge assets, which may be simple (i.e., atomic, self-sufficient, or stand-alone) or complex (i.e., built from associations between simpler knowledge assets). Knowledge assets are typically created and maintained in knowledge management systems (KMSs). A KMS can contain multiple knowledge bases (KBs), where each KB’s assets pertain to a specific topic area. What’s the difference between a KB and a database, or between a KMS and a content management system? Ideally, KBs and KMSs go beyond these other repository types in being able to reason with existing facts and to infer new facts from relationships and rules. Inference is a key capability of effective knowledge-based systems.
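As a rough illustration of what “inferring new facts from relationships and rules” can look like, the Python sketch below repeatedly applies two rules to a small set of facts until nothing new can be derived (a technique often called forward chaining). Everything in it is an assumption made for the example: the names `drug_x` and `penicillin_derivative`, the relations `is_a`, `allergic_to`, and `must_avoid`, and the two rules themselves. A real KMS would use a far richer representation.

```python
# Starting facts as (subject, relation, object) triples; all names are hypothetical.
facts = {
    ("drug_x", "is_a", "penicillin_derivative"),
    ("penicillin_derivative", "is_a", "penicillin"),
    ("patient_a", "allergic_to", "penicillin"),
}


def infer(facts):
    """Apply both rules repeatedly until a pass derives nothing new."""
    known = set(facts)
    while True:
        derived = set()
        for (a, r1, b) in known:
            if r1 != "is_a":
                continue
            for (c, r2, d) in known:
                # Rule 1: "is_a" is transitive (a is_a b, b is_a d => a is_a d).
                if r2 == "is_a" and c == b:
                    derived.add((a, "is_a", d))
                # Rule 2: an allergy to a class extends to members of that class.
                if r2 == "allergic_to" and d == b:
                    derived.add((c, "must_avoid", a))
        if derived <= known:
            return known
        known |= derived


inferred = infer(facts) - facts
for fact in sorted(inferred):
    print(fact)
```

None of the starting facts says that `patient_a` must avoid `drug_x`. That conclusion only appears after the first rule links `drug_x` to `penicillin` and the second rule connects the allergy to it, which is the kind of step a plain database lookup would not perform on its own.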
Having said that, knowledge assets can be represented in a database, and some content management systems do contain knowledge assets. Nor do all assets in a KB truly represent knowledge; some are supporting data and information. In sum, there is overlap among system types, but each is designed and optimized for particular purposes. If a database is sufficient for your needs, then a knowledge base is overkill. Conversely, trying to manage complex knowledge assets in a database is a significant challenge.
Today, the term “knowledge base” is often used to refer to very specific applications where consumers or customer service personnel can seek answers to questions about products or services. This is just one application of knowledge.
What's next?
In this article we have scratched the surface to introduce an intuitive and practical understanding of what knowledge is, in a computable context. Future articles will add both depth and breadth to this and related topics. For example, it goes without saying that knowledge is only as useful as your ability to find it, and the topic of search – from simple to sophisticated – will be addressed in the future.