
Notes: Description Logic in the Cloud (1) Background

"Description Logic in the Cloud" is a pretty silly way of putting it.

A better name would be Parallel Computing with Description Logic, which mainly covers two kinds of task: querying and reasoning.

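For RDFS, or for some subset of OWL-RL, there is a fair amount of work that uses MapReduce or other cluster-based computation. Most of it, however, is rule-based reasoning and does not guarantee completeness. Much of it supports only very limited inference, for example BBN's SHARD.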

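Modular ontology languages, such as Distributed Description Logics, E-Connections, and Package-based Description Logics, are built on a non-classical Local Model Semantics and can support distributed reasoning. However, the complexity of local semantics makes them unsuitable for today's engineering applications.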

So this series mainly explores complete reasoning algorithms under the ordinary global semantics, including some discussion of parallelizing the tableau algorithm.

Below is a short comparison of work on rule-based parallel reasoning (excerpted from a report of my own). These systems are mostly parallel triple stores rather than parallel reasoners.

———————————

Distributed RDF Reasoning

Most existing work on distributed RDF reasoning relies on parallelizing rule-based reasoning or on partitioning the data across a cluster.

WebPIE (Web-scale Parallel Inference Engine) by Urbani et al. [7, 6] performs rule-based forward reasoning based on the MapReduce programming model. It is implemented using the Hadoop framework. They have demonstrated inference over a set of 100 billion triples, and inference over 10 billion triples in 1.35 hours on 64 nodes. This system does not support querying.
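To make the idea of rule-based forward reasoning in a MapReduce style concrete, here is a minimal single-process sketch in plain Python. It is not WebPIE's code and does not use Hadoop; it only illustrates how one RDFS rule (rdfs9: if `s rdf:type C` and `C rdfs:subClassOf D`, then `s rdf:type D`) can be phrased as a map step, a group-by-key shuffle, and a reduce step. The function names and the `ex:` triples are made up for illustration.

```python
from collections import defaultdict

TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

def map_triple(triple):
    """Map step: emit (join key, tagged value) pairs for rule rdfs9."""
    s, p, o = triple
    if p == TYPE:
        yield o, ("instance", s)     # key on the class the instance belongs to
    elif p == SUBCLASS:
        yield s, ("superclass", o)   # key on the subclass

def reduce_class(cls, values):
    """Reduce step: join the instances of cls with the superclasses of cls."""
    instances = [v for tag, v in values if tag == "instance"]
    supers = [v for tag, v in values if tag == "superclass"]
    for inst in instances:
        for sup in supers:
            yield (inst, TYPE, sup)

def run_pass(triples):
    """One map/shuffle/reduce pass; returns only the newly derived triples."""
    groups = defaultdict(list)       # the "shuffle": group values by key
    for t in triples:
        for key, value in map_triple(t):
            groups[key].append(value)
    derived = set()
    for key, values in groups.items():
        derived.update(reduce_class(key, values))
    return derived - set(triples)

def closure(triples):
    """Repeat passes until no new triple is derived (a fixpoint)."""
    triples = set(triples)
    while True:
        new = run_pass(triples)
        if not new:
            return triples
        triples |= new

# Hypothetical example data.
data = {
    ("ex:alice", TYPE, "ex:Student"),
    ("ex:Student", SUBCLASS, "ex:Person"),
    ("ex:Person", SUBCLASS, "ex:Agent"),
}
closed = closure(data)
```

In this toy example the first pass derives that `ex:alice` is an `ex:Person`, and a second pass is needed to reach `ex:Agent`, which hints at why the number of passes matters in a real MapReduce setting.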

SAOR (by Hogan et al.) [1] computes the closure of an RDF graph using two passes over the data on a single machine. A fragment of the OWL Horst semantics is implemented to allow more efficient materialization and to prevent “ontology hijacking”.

In MaRVIN [10, 4], Kotoulas, Oren and others have presented a technique based on data-partitioning in a peer-to-peer network. A load-balanced auto-partitioning approach was used without upfront partitioning costs.

Williams, Weaver et al. [5] present straightforward parallel RDFS reasoning on a cluster. This approach replicates all schema triples to all processing nodes and distributes instance triples randomly. Each node calculates the closure of its partition using a conventional reasoner, and the results are merged. To ensure that there are no dependencies between partitions, triples extending the RDFS schema are ignored. This approach does not support complete RDFS reasoning.
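The replicate-schema, scatter-instances scheme described here can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the authors' implementation: `local_closure` is a stand-in for the "conventional reasoner" and covers only two RDFS rules (rdfs9 and rdfs11), only `rdfs:subClassOf` is treated as schema, the "nodes" are simulated by a loop, and all names and `ex:` triples are hypothetical.

```python
import random

TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SCHEMA_PREDICATES = {SUBCLASS}   # assumption: only subClassOf counts as schema

def local_closure(triples):
    """Stand-in for a conventional reasoner: apply rdfs9 and rdfs11 to fixpoint."""
    closed = set(triples)
    while True:
        sub = {(s, o) for s, p, o in closed if p == SUBCLASS}
        new = set()
        for c, d in sub:
            for s, p, o in closed:
                if p == TYPE and o == c:
                    new.add((s, TYPE, d))        # rdfs9: type inheritance
                elif p == SUBCLASS and o == c:
                    new.add((s, SUBCLASS, d))    # rdfs11: subClassOf transitivity
        if new <= closed:
            return closed
        closed |= new

def partitioned_closure(triples, n_nodes, seed=0):
    """Replicate schema triples to every partition, scatter instance triples randomly."""
    rng = random.Random(seed)
    schema = {t for t in triples if t[1] in SCHEMA_PREDICATES}
    instance = [t for t in triples if t[1] not in SCHEMA_PREDICATES]
    partitions = [set(schema) for _ in range(n_nodes)]
    for t in instance:
        partitions[rng.randrange(n_nodes)].add(t)
    merged = set()
    for part in partitions:      # on a cluster, each iteration would run on its own node
        merged |= local_closure(part)
    return merged

# Hypothetical example data.
data = {
    ("ex:alice", TYPE, "ex:Student"),
    ("ex:bob", TYPE, "ex:Person"),
    ("ex:Student", SUBCLASS, "ex:Person"),
    ("ex:Person", SUBCLASS, "ex:Agent"),
}
merged = partitioned_closure(data, n_nodes=3)
```

For these two rules every derived instance triple depends on a single instance triple plus schema triples, so the merged result matches the closure computed on one machine. Rules that join two instance triples would break that independence, which is one way to see why such a scheme gives up complete reasoning.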

Newman et al. [2] decompose and merge RDF molecules using MapReduce and Hadoop. They perform SPARQL queries on the data, but performance is reported only for a dataset of limited size (70,000 triples).

Husain et al. [8] report results for SPARQL querying using MapReduce for datasets up to 1.1 billion triples.

References

  1. A. Hogan, A. Harth, and A. Polleres. Scalable authoritative OWL reasoning for the web. International Journal on Semantic Web and Information Systems, 5(2), 2009.
  2. A. Newman, Y. Li, and J. Hunter. Scalable semantics: the silver lining of cloud computing. In Proceedings of the 4th IEEE International Conference on eScience, 2008.
  3. P. Adjiman, P. Chatalic, F. Goasdoué, M.-C. Rousset, and L. Simon. Distributed reasoning in a peer-to-peer setting: application to the semantic web. Journal of Artificial Intelligence Research, 25:269-314, 2006.
  4. E. Oren, S. Kotoulas, G. Anadiotis, R. Siebes, et al. Marvin: distributed reasoning over large-scale semantic web data. J. Web Sem., 7(4):305-316, 2009.
  5. G. Williams, J. Weaver, M. Atre, and J. A. Hendler. Scalable reduction of large datasets to interesting subsets. Web Semantics: Science, Services and Agents on the World Wide Web, 2010.
  6. J. Urbani, S. Kotoulas, E. Oren, and F. van Harmelen. Scalable distributed reasoning using MapReduce. In Proceedings of the 8th International Semantic Web Conference, 2009.
  7. J. Urbani, S. Kotoulas, J. Maassen, F. van Harmelen, and H. Bal. OWL reasoning with WebPIE: calculating the closure of 100 billion triples. In Proceedings of the 7th Extended Semantic Web Conference, 2010.
  8. M. F. Husain, P. Doshi, L. Khan, and B. Thuraisingham. Storage and retrieval of large RDF graph using Hadoop and MapReduce. In M. G. Jaatun, G. Zhao, and C. Rong (eds.), Cloud Computing, vol. 5931, chap. 72, pp. 680-686. Springer Berlin Heidelberg, Berlin, Heidelberg, 2009.
  9. R. Soma and V. Prasanna. Parallel inferencing for OWL knowledge bases. In International Conference on Parallel Processing, pp. 75-82, 2008.
  10. S. Kotoulas, E. Oren, and F. van Harmelen. Mind the data skew: distributed inferencing by speeddating in elastic regions. In Proceedings of the WWW, 2010.