Background 1

  • Data science has shown great potential as a new paradigm of science, but scientists are often unaware of how to access science data for their research.
  • There are not nearly enough data scientists available to help them, which keeps many scientists from benefiting from data science.

Background 2

  • Meanwhile, interest in conversational agents has been growing, driven largely by recent breakthroughs in AI.
  • A scientist's search intent is rarely expressed in a single query; it is usually expressed through a series of related queries.
  • Scientists often do not know their exact query in advance; they develop it over multiple related attempts.


  • To develop a search-oriented dialog system that helps scientists search science data


  1. gives an initial question
  2. modifies the question by adding a new condition ('human')
  3. changes the condition (from 'human' to 'mouse')
  4. adds a new condition as an alternative
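
The four turns above can be read as operations on a search state that persists across the dialog. The following is a minimal sketch under stated assumptions, not the project's design: the class `QueryState`, its method names, the target 'glycoproteins', and the alternative value 'rat' are all hypothetical and chosen only for illustration. Conditions are stored as a conjunction of groups, where each group holds alternative values.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class QueryState:
    """Hypothetical search state carried across dialog turns.

    `conditions` is a conjunction (AND) of groups; each group is a
    list of alternative values (OR).
    """

    target: str
    conditions: list[list[str]] = field(default_factory=list)

    def add_condition(self, value: str) -> None:
        """Turn 2: narrow the question with a new condition."""
        self.conditions.append([value])

    def change_condition(self, old: str, new: str) -> None:
        """Turn 3: swap one condition value for another."""
        for group in self.conditions:
            if old in group:
                group[group.index(old)] = new
                return
        raise ValueError(f"no condition {old!r} to change")

    def add_alternative(self, existing: str, alternative: str) -> None:
        """Turn 4: accept another value alongside an existing one."""
        for group in self.conditions:
            if existing in group:
                group.append(alternative)
                return
        raise ValueError(f"no condition {existing!r} to extend")

    def describe(self) -> str:
        """Render the state as a readable summary string."""
        if not self.conditions:
            return self.target
        parts = [
            "(" + " OR ".join(g) + ")" if len(g) > 1 else g[0]
            for g in self.conditions
        ]
        return f"{self.target} WHERE " + " AND ".join(parts)


# Replaying the four turns (target and 'rat' are illustrative assumptions).
state = QueryState("glycoproteins")          # 1. initial question
state.add_condition("human")                 # 2. add condition 'human'
state.change_condition("human", "mouse")     # 3. 'human' -> 'mouse'
state.add_alternative("mouse", "rat")        # 4. add an alternative
print(state.describe())
```

Keeping the state explicit means each utterance only has to be mapped to one small operation, instead of being re-parsed as a complete standalone query.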

Implementation Environment

Platform (open)

  • LODQA - an open-source project that provides a natural language query interface for searching SPARQL endpoints.

Subject domains (open)

  • Glycobiology



  • To develop a benchmark dataset


  • To develop a baseline system
  • To develop an evaluation framework


  • To develop a full system (with a focus on user intent detection and its translation into SPARQL)