FOCUS 1
Using machine learning (and AI) in ontology matching
Problem:
- The essence of the SemWeb is structured, machine-readable data sets. [Intro to SemWeb, Li Yau]
- Constructing formal structured data requires an ontology as a reference.
- One of the main obstacles facing the SemWeb today is the lack of robust and formal ontologies. [add reference]
- This is because multiple ontologies often exist for the same domain, which eventually introduces inconsistency and confusion (the inconsistency problem).
- Ontology matching is a solution used to address this inconsistency problem.
- Current methods rely on manual ontology matching, which is insufficient: it is tedious and error-prone. [1]
Solutions:
- Use ML and AI techniques for automated matching.
- Ontology matching using ML. [1]
- Probabilistic Similarity Logic (PSL) for ontology alignment. [2]
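As a rough illustration of what automated matching means in practice, here is a minimal sketch that pairs concept labels from two ontologies by string similarity. This is only a naive baseline and not the method of [1] or [2], which learn from instances and use probabilistic reasoning. The concept labels, the function names, and the 0.8 threshold are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def normalize(label: str) -> str:
    """Lowercase a label and treat underscores as spaces."""
    return label.lower().replace("_", " ").strip()


def label_similarity(a: str, b: str) -> float:
    """Return a similarity score in [0, 1] between two concept labels."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_concepts(source, target, threshold=0.8):
    """Pair each source concept with its best-scoring target concept.

    Pairs scoring below `threshold` (an assumed cut-off) are dropped.
    """
    matches = []
    for s in source:
        best = max(target, key=lambda t: label_similarity(s, t))
        score = label_similarity(s, best)
        if score >= threshold:
            matches.append((s, best, round(score, 2)))
    return matches


# Hypothetical concept labels from two ontologies of the same domain.
onto_a = ["Professor", "Course", "Grad_Student"]
onto_b = ["professor", "courses", "grad student", "Department"]

for pair in match_concepts(onto_a, onto_b):
    print(pair)
```

A learned matcher would typically use scores like this as one feature among several (instances, structure, constraints) rather than as the final decision.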
Return value, benefiting fields, and applications:
- More reliable ontologies for semantic web applications in general.
- e.g. IBM Watson: WatsonPaths project.
For more details, see:
[[research: engineering the problem]]
References:
[1] Ontology Matching: A Machine Learning Approach: http://homes.cs.washington.edu/~pedrod/papers/hois.pdf
[2] Probabilistic Similarity Logic: http://event.cwi.nl/uai2010/papers/UAI2010_0089.pdf
FOCUS 2
Ontology Learning
Automatic creation of ontologies, including extracting the corresponding domain’s terms and relationships from a corpus. [1]
Problem:
Techniques:
- Use NLP methods
- To find similar concepts, borrow the methods recommender systems use to find similar users.
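The recommender-system analogy can be sketched as follows: where user-based collaborative filtering compares users by their rating vectors, here each concept is represented by a vector of the words it co-occurs with, and concepts are compared by cosine similarity. This is a minimal sketch under assumptions: the tiny corpus, the whitespace tokenization, and all function names are invented for the example.

```python
import math
from collections import Counter


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def context_vectors(sentences, terms):
    """Count, per term, the other words that appear in the same sentence."""
    vectors = {t: Counter() for t in terms}
    for sentence in sentences:
        words = sentence.lower().split()
        for t in terms:
            if t in words:
                vectors[t].update(w for w in words if w != t)
    return vectors


def most_similar(term, vectors):
    """Return the other term whose context vector is closest to `term`."""
    scored = [
        (cosine(vectors[term], vec), other)
        for other, vec in vectors.items()
        if other != term
    ]
    return max(scored)[1]


# Hypothetical toy corpus; a real one would be far larger.
corpus = [
    "the car drives on the road",
    "the truck drives on the road",
    "the apple grows on the tree",
]
vectors = context_vectors(corpus, ["car", "truck", "apple"])
print(most_similar("car", vectors))
```

On a real corpus, stop words such as "the" and "on" would dominate the counts, so some weighting (for example TF-IDF) would likely be needed before the similarities mean much.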
References:
[1] http://en.wikipedia.org/wiki/Ontology_learning
FOCUS 3
Extracting Knowledge Base from Corpus
Problem:
Building an ontology is a manual, error-prone, and tedious process. Finding a tool
Idea: Automated, domain-specific ontology generation using NLP techniques.
Steps:
- Build a Python module or framework (name it Pyonto) to transform natural language sentences into ontology concepts. See [2].
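To make the Pyonto step more concrete, here is a minimal sketch of turning simple sentences into (subject, relation, object) triples. Pyonto does not exist yet, so everything here is hypothetical: the pattern list, the relation names `is_a` and `has_part`, and the function `extract_triples`. It uses plain regular expressions in place of a real NLP pipeline, so it only handles very regular sentences.

```python
import re

# Each pattern maps one simple sentence shape to a relation name.
# Both the shapes and the relation names are assumptions for this sketch.
PATTERNS = [
    (re.compile(r"^(?:(?:a|an|the)\s+)?(\w+) is (?:a|an)\s+(\w+)$"), "is_a"),
    (re.compile(r"^(?:(?:a|an|the)\s+)?(\w+) has (?:a|an)\s+(\w+)$"), "has_part"),
]


def extract_triples(text):
    """Split text into sentences and return the triples that match a pattern."""
    triples = []
    for sentence in re.split(r"[.!?]+", text):
        sentence = sentence.strip().lower()
        for pattern, relation in PATTERNS:
            m = pattern.match(sentence)
            if m:
                triples.append((m.group(1), relation, m.group(2)))
                break  # first matching pattern wins
    return triples


text = "A dog is an animal. A car has an engine. It rained."
for triple in extract_triples(text):
    print(triple)
```

Sentences that match no pattern are silently skipped. A real module would probably replace the regular expressions with part-of-speech tagging and chunking, as the frameworks in [2] do.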
Solution:
- Use facts (axioms in natural language) and the true KB from knowledge extraction engines (e.g. NELL, Freebase) and from structured semantic databases (e.g. DBpedia, Linked Data) to automatically generate domain-specific ontologies. See [8].
- Use NLTK to create KB statements from corpora, then transform the KB into an ontology.
- Unsupervised learning of relations between concepts, see [9].
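The "transform the KB into an ontology" step could look roughly like the sketch below, which turns KB triples into a small class hierarchy serialized as Turtle text. It is an illustration only: the input triples, the `ex:` namespace URI, and the function `kb_to_turtle` are hypothetical, and treating every `is_a` statement as a subclass relation is a simplifying assumption (some may really be instance-of statements).

```python
PREFIXES = [
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
]


def kb_to_turtle(triples, base="http://example.org/onto#"):
    """Serialize is_a triples as OWL classes with rdfs:subClassOf links.

    Assumes single-word concept names; other relations are ignored.
    """
    classes = set()
    subclass_of = []
    for subj, rel, obj in triples:
        if rel == "is_a":
            classes.update([subj, obj])
            subclass_of.append((subj, obj))

    lines = PREFIXES + [f"@prefix ex: <{base}> .", ""]
    # Declare every concept seen in an is_a statement as a class.
    for name in sorted(classes):
        lines.append(f"ex:{name.capitalize()} a owl:Class .")
    # Then link each child class to its parent.
    for child, parent in subclass_of:
        lines.append(
            f"ex:{child.capitalize()} rdfs:subClassOf ex:{parent.capitalize()} ."
        )
    return "\n".join(lines)


# Hypothetical KB statements, e.g. produced by an extraction step.
kb = [
    ("dog", "is_a", "animal"),
    ("cat", "is_a", "animal"),
    ("car", "has_part", "engine"),
]
print(kb_to_turtle(kb))
```

Writing the Turtle by hand keeps the sketch dependency-free; a real implementation would more likely build the graph with an RDF or OWL library, such as the one in [4].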
References:
[1] NLP for onto: Applying NLP for Building Domain Ontology: Fashion Collection
[2]
- Quepy framework: http://quepy.readthedocs.org/en/latest/index.html
- iepy framework: https://github.com/machinalis/iepy
[3]
- Machinalis at GitHub: https://github.com/machinalis
- Quepy talk: http://www.machinalis.com/blog/quepy-talk-at-pydata-silicon-valley-2014/
[4] Also check out python-owl Seth: http://seth-scripting.sourceforge.net/
[5] IRBook.docset from Stanford’s Intro to Info Retrieval
[6] paper: Survey on clustering methods for ontological knowledge
[7] Chapter 5: Ontology Learning Using WordNet Lexical Expansion and Text Mining. Read it online: http://goo.gl/Gi0JRz
[8] Generating Ontologies from Linked Data GOLD
[9] Unsupervised learning of semantic relations for molecular biology ontologies, pdf