SMILA Tutorial

Mastering Unstructured Information with SMILA - the Unified Information Access Architecture


The amount and diversity of information is growing exponentially, mainly in the area of unstructured data such as emails, text files, blogs and images. Poor data accessibility and user rights integration, together with the lack of semantic metadata, are constraining factors for building next-generation enterprise search and other document-centric applications. Missing standards result in proprietary solutions with high short- and long-term costs. Overcoming these problems is key to gaining agility in an organization.

SMILA is an extensible framework for processing unstructured information in the enterprise. Besides providing essential infrastructure components and services, SMILA also delivers ready-to-use add-on components, such as connectors to the most relevant data sources. Building on the framework enables developers to concentrate on creating higher-value solutions, such as semantic search and information extraction applications.

SMILA is an open source project under the umbrella of the Eclipse Foundation. It is also part of the German research programme THESEUS. Further information can be found at http://www.eclipse.org/smila and http://theseus-programm.de/en/index.php.

This half-day SMILA tutorial will introduce the concepts and approach behind the framework, show how to use it to build an application, and explain how to integrate new components into it. The following topics will be addressed:
  • SMILA in a nutshell
  • Installation, crawler and service configuration, and building a search application with SMILA and Lucene
  • Creating a simple native SMILA component
  • Using Web Services as SMILA components (Open Calais as exercise)
  • Presentation of some demo applications based on SMILA
Participants should have a basic understanding of Java and programming. For the practical exercises, a laptop running Windows or Linux is required. Participants will receive a CD-ROM containing the most recent SMILA release, Eclipse, Protégé and a Java SDK.

The tutorial is presented by Igor Novakovic, Attensity Europe GmbH, Germany.

It will be held on Wednesday from 9:00 am to 12:00 noon.

Applied Descriptive Pattern Mining

Pattern mining, and more specifically descriptive pattern mining, aims at extracting relevant "nuggets" of knowledge from a set of data.
In this context, the focus is on descriptive applications, i.e., exploratory data mining that extracts knowledge for humans,
rather than on mainly automatic approaches such as classification.
Prominent approaches to descriptive pattern mining include frequent pattern mining, supervised descriptive rule induction,
and subgroup discovery, as well as descriptive community mining.

The aim of this tutorial is to provide a general, comprehensive overview of the state of the art in descriptive pattern mining.
We consider different types of data such as structured (tabular) data, texts, and networks.

The main contributions of this tutorial are the following:

  • We provide a comprehensive overview of descriptive pattern set mining techniques.
  • We present an in-depth introduction to a set of mining algorithms for different data types and show how they relate to each other.
  • We discuss a range of applications providing showcases for the introduced algorithms.

The tutorial is presented by

  • Martin Atzmüller (University of Kassel)
  • Florian Lemmerich (University of Würzburg)

Date: Friday, 30 September 2011, afternoon