Session I: Opening
8:30–8:45 Inauguration by Chairs
8:45–9:48 Invited Talk by Prof. Vittorio Loreto
9:48–10:00 Social (distributed) language modeling, clustering and dialectometry
David Ellis
10:00–10:30 Coffee Break
Session II: Special Theme
10:30–10:55 Network analysis reveals structure indicative of syntax in the corpus of undeciphered Indus civilization inscriptions
Sitabhra Sinha, Raj Kumar Pan, Nisha Yadav, Mayank Vahia and Iravatham Mahadevan
10:55–11:20 Bipartite spectral graph partitioning to co-cluster varieties and sound correspondences in dialectology
Martijn Wieling and John Nerbonne
11:20–12:10 Panel Discussion on "Bridging the gap between language dynamics and NLP: Can network theory help?"
Panelists:
  1. Alexander Mehler, Goethe Universität
  2. Fabio Massimo Zanzotto, University of Rome "Tor Vergata"
  3. Vittorio Loreto, University of Rome "La Sapienza"
  4. Greg Kondrak, University of Alberta
  5. Monojit Choudhury (as moderator), Microsoft Research, India
Session III: Semantics
13:50–14:15 Random Walks for Text Semantic Similarity
Daniel Ramage, Anna N. Rafferty and Christopher D. Manning
14:15–14:40 Classifying Japanese Polysemous Verbs based on Fuzzy C-means Clustering
Yoshimi Suzuki and Fumiyo Fukumoto
14:40–15:05 WikiWalk: Random walks on Wikipedia for Semantic Relatedness
Eric Yeh, Daniel Ramage, Christopher D. Manning, Eneko Agirre and Aitor Soroa
15:05–15:18 Measuring semantic relatedness with vector space models and random walks
Amaç Herdağdelen, Katrin Erk and Marco Baroni
15:18–15:30 Graph-based Event Coreference Resolution
Zheng Chen and Heng Ji
15:30–16:00 Coffee Break
Session IV: Classification and Clustering
16:00–16:25 Ranking and Semi-supervised Classification on Large Scale Graphs Using Map-Reduce
Delip Rao and David Yarowsky
16:25–16:50 Opinion Graphs for Polarity and Discourse Classification
Swapna Somasundaran, Galileo Namata, Lise Getoor and Janyce Wiebe
16:50–17:15 A Cohesion Graph Based Approach for Unsupervised Recognition of Literal and Nonliteral Use of Multiword Expressions
Linlin Li and Caroline Sporleder
17:15–17:40 Quantitative analysis of treebanks using frequent subtree mining methods
Scott Martens
17:40–18:00 Closing Remarks

 

Invited Talk: Collective Dynamics of Social Annotation
Prof. Vittorio Loreto

The enormous increase in the popularity and use of the WWW has led in recent years to important changes in the ways people communicate. An interesting example is provided by the now very popular social annotation systems, through which users annotate resources (such as web pages or digital photographs) with text keywords dubbed tags. Collaborative tagging has quickly gained ground because of its ability to recruit the activity of web users into effectively organizing and sharing vast amounts of information. Understanding the rich emerging structures resulting from the uncoordinated actions of users calls for an interdisciplinary effort. In particular, concepts borrowed from statistical physics, such as random walks, and the complex networks framework can effectively contribute to the mathematical modeling of social annotation systems. First, I will introduce a stochastic model of user behavior embodying two main aspects of collaborative tagging: (i) a frequency-bias mechanism related to the idea that users are exposed to each other's tagging activity; (ii) a notion of memory, or aging of resources, in the form of a heavy-tailed access to the past state of the system. Remarkably, this simple model is able to account quantitatively for the observed experimental features with surprisingly high accuracy. This points in the direction of a universal behavior of users who, despite the complexity of their own cognitive processes and the uncoordinated and selfish nature of their tagging activity, appear to follow simple activity patterns. Next, I will show how the process of social annotation can be seen as a collective but uncoordinated exploration of an underlying semantic space, pictured as a graph, through a series of random walks. This modeling framework reproduces several aspects, so far unexplained, of social annotation, among which are the peculiar growth of the size of the vocabulary used by the community and its complex network structure, which represents an externalization of semantic structures grounded in cognition and typically hard to access.
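
The abstract states the two ingredients of the stochastic model (frequency bias and heavy-tailed memory) but gives no equations, so the following is only an illustrative sketch under stated assumptions, not the speaker's actual model. It assumes a copy-or-invent process: with a fixed probability a new tag is invented, and otherwise a tag is copied from a past position in the stream, with the position chosen by a power-law-like kernel over its age. The function name `simulate_tag_stream`, the parameters `p_new` and `tau`, and the kernel form `1 / (lag + tau)` are all hypothetical choices made for illustration.

```python
import random


def simulate_tag_stream(n_steps, p_new=0.05, tau=10.0, seed=0):
    """Simulate a stream of tags with frequency bias and heavy-tailed memory.

    Assumed mechanism (not taken from the abstract):
      - with probability p_new, a brand-new tag is appended;
      - otherwise a past position is picked with weight 1 / (lag + tau),
        where lag = 1 is the most recent tag, and the tag found there is copied.
    Copying from past positions means frequent tags are copied more often
    (frequency bias), while the slowly decaying kernel favors recent tags
    without ever forgetting old ones (heavy-tailed memory).
    """
    rng = random.Random(seed)
    stream = [0]          # the stream starts with a single tag labeled 0
    next_tag = 1          # label that the next invented tag will receive
    vocab_growth = [1]    # number of distinct tags after each step

    for _ in range(1, n_steps):
        if rng.random() < p_new:
            # Invent a new tag.
            stream.append(next_tag)
            next_tag += 1
        else:
            # Copy a tag from the past, weighting each position by its age.
            t = len(stream)
            lags = range(1, t + 1)
            weights = [1.0 / (lag + tau) for lag in lags]
            lag = rng.choices(lags, weights=weights, k=1)[0]
            stream.append(stream[t - lag])
        vocab_growth.append(next_tag)

    return stream, vocab_growth


if __name__ == "__main__":
    stream, vocab_growth = simulate_tag_stream(2000)
    print("distinct tags:", vocab_growth[-1])
```

In this sketch the returned `vocab_growth` list tracks how many distinct tags have appeared so far, which is the kind of quantity one would compare against observed vocabulary growth. The per-step weight computation makes the loop quadratic in `n_steps`, which is acceptable for small illustrative runs only.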