Key-Based Top-K Search in Multidimensional Databases
Author: K. Anuratha, S. Senthamaraikannan and R. Rajaguru
Volume 1 No.1 January-June 2012 pp 29-36
Previous studies on supporting free-form keyword queries over RDBMSs provide users with linked structures (e.g., a set of joined tuples) that are relevant to a given keyword query. Most of them focus on ranking individual tuples from one table, or joins of tuples from multiple tables, that contain a set of keywords. This project studies the problem of keyword-based top-k search in a data cube with text-rich dimension(s), a so-called text cube: given a keyword query, find the top-k most relevant cells in the text cube. The text cube is built on a multidimensional text database, where each row is associated with some text data (a document) and other structural dimensions (attributes). A cell in the text cube aggregates a set of documents with matching attribute values in a subset of dimensions. When users want to retrieve information from a text cube using keyword queries, relevant cells, rather than relevant documents, are preferred as the answers, because (i) relevant cells are easy for users to browse; and (ii) relevant cells provide users with insights into the relationship between the values of relational attributes and the text data. The proposed algorithm uses a relevance scoring formula to find the top-k relevant cells by exploring only a small portion of the whole text cube (when k is small) and enables early termination.
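To make the notions of cell and top-k cell search concrete, the following is a minimal brute-force sketch in Python. It is not the proposed algorithm: it enumerates every cell and so has no early termination. The toy rows, the function names (cells_of, doc_score, top_k_cells) and the scoring rule (a cell's relevance taken as the average keyword count of its documents) are all assumptions made for illustration; the abstract does not state the actual relevance scoring formula.

```python
from itertools import combinations
import heapq

# Hypothetical toy rows: (structural attributes, document text).
ROWS = [
    ({"brand": "Acme", "model": "X1"}, "battery life is long and battery charges fast"),
    ({"brand": "Acme", "model": "X2"}, "screen is bright"),
    ({"brand": "Zen", "model": "X1"}, "battery drains quickly"),
]

def cells_of(attrs):
    """Yield every cell a row belongs to: one per subset of its dimensions."""
    dims = sorted(attrs)
    for r in range(len(dims) + 1):
        for subset in combinations(dims, r):
            yield tuple((d, attrs[d]) for d in subset)

def doc_score(text, keywords):
    """Assumed per-document relevance: total number of keyword occurrences."""
    words = text.lower().split()
    return sum(words.count(k.lower()) for k in keywords)

def top_k_cells(rows, keywords, k):
    """Brute force: aggregate documents into cells, score each cell, keep the k best."""
    totals = {}  # cell -> [sum of document scores, number of documents]
    for attrs, text in rows:
        score = doc_score(text, keywords)
        for cell in cells_of(attrs):
            entry = totals.setdefault(cell, [0, 0])
            entry[0] += score
            entry[1] += 1
    # Assumed cell relevance: average document score within the cell.
    scored = [(total / count, cell) for cell, (total, count) in totals.items()]
    return heapq.nlargest(k, scored)

if __name__ == "__main__":
    for score, cell in top_k_cells(ROWS, ["battery"], 2):
        print(score, dict(cell))
```

With three rows and two dimensions the sketch already materializes eight cells, which illustrates why exploring only a small portion of the cube matters as the number of dimensions grows.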