MuSeNet

About

This project is devoted to building a large multilingual semantic network by applying novel semantic analysis techniques specifically targeted at the Wikipedia corpus. The driving hypothesis is that the structure of Wikipedia can be used effectively to create a highly structured graph of world knowledge, in which nodes correspond to the entities and concepts described in Wikipedia and edges capture ontological relations such as hypernymy and meronymy.

Special emphasis is given to exploiting the multilingual information available in Wikipedia to improve the performance of each semantic analysis tool. Significant research effort is therefore aimed at developing tools for word sense disambiguation, reference resolution, and the extraction of ontological relations. These tools use multilingual reinforcement, together with the consistent structure and focused content of Wikipedia, to solve these tasks accurately.

An additional research challenge is the effective integration of inherently noisy evidence from multiple Wikipedia articles, in order to increase the reliability of the overall knowledge encoded in the global Wikipedia graph. Computing probabilistic confidence values for every piece of structural information added to the network is an important step in this integration, and it is also meant to increase the network's utility for downstream applications. The proposed semantic network complements existing semantic resources and is expected to have a broad impact on the wide range of natural language processing applications that need large-scale world knowledge.

The project is a collaboration between the Language and Information Technologies group at the University of Michigan and the Natural Language Processing group at Ohio University. It is sponsored by the National Science Foundation under awards #1018613 and #1018590.

People

Razvan Bunescu (PI)

Rada Mihalcea (PI)

Mike Chen

Jincheng Chen

Bharath Dandala

Samer Hassan

Yunfeng Huang

Kevin Janowiecki

Hui Shen

Data

Publications
