2020 ICDM Knowledge Graph Contest: Specification

E-mail: icdmcontest@mininglamp.com

Event-centric Knowledge Graph Construction seeks to extract event-centric relational knowledge from online media [1]. The causal relationships of an arbitrary event are one type of such relational knowledge. The task of cause extraction for consumer events is to extract, from texts about a given brand or product, consumer events of predefined event types together with the causes of those events. Teams from both degree-granting institutions and industrial labs are invited to compete in this 2020 ICDM Knowledge Graph Contest by automatically extracting causal relationships of consumer events. This is a regular competition for candidates willing to compete for a prize. This year’s contest is brought to you by Mininglamp Academy of Sciences and the Institute of Automation, Chinese Academy of Sciences. Data sets and submission instructions will be released on biendata.com by 26 June 2020.

1. Competition description

1.1 Background and impact

Extracting the causes of consumer events is the center of attention in many business scenarios, such as content advertising and social listening. Take content advertising as an example. Today’s advertisers are not satisfied with the direct exposure of brands or products; they prefer to embed content through product features and subtly inspire consumers to associate their brands or products with arbitrary consumer events on their own initiative. To this end, explicitly extracting the causes of consumer events is an important technique for building systems that address advertisers’ needs.

1.2 Novelty

Related competitions for extracting events and their causes from free texts have taken place in NLP forums. In these competitions, the causes of extracted events are the events with a predefined schema, and these causes are often represented as a word or a structured tuple. Moreover, these competitions aim to extract all events of a predefined event type.

Compared with previous competitions, our competition is novel in two respects:
1) We aim to extract events with a specified subject (a brand or product).
2) The causes of the extracted events are represented as contiguous tokens (one span) or multiple non-contiguous groups of tokens (multiple spans).

1.3 Data

500 recent articles from Instagram have been selected by industry-solution experts to ensure formality of language, diversity, and depth of knowledge with respect to real-world applications. In this competition, we focus on five event types: consumer attention, consumer interest, consumer needs, consumer purchase, and consumer use. The 500 articles will be labeled and serve as the training set. There will be a separate online test set. An example of the data is shown below:

Input: Text+Brand/Product
Output: A span or multiple spans with the predefined event types.

Example 1:

Input:
Text: Did you see my stories from yesterday? Our @ikeausa opened up and I shared a shop with me.💗 This is always my top requested content.😉 (I also went to At Home Stores and Hob Lob and shared stories from there too. 😉) It felt like forever since I had been in an Ikea because they’ve been closed for months. There are SO many great new items! We left with 3 carts full shopping for the nonprofit we’re partnering with, our kids church remodel, and our own homes. IKEA prices are so good though, I’m always shocked how low the total is! My favorite finds were the wicker poufs, or Ikea calls them stools. The quality is amazing, and they are only $29.99! I also shared how to make a basket wall like mine on a budget. After you watch my stories, let me know if you have any questions. It was such a fun day with my sweet friend, exactly what my soul needed.

Brand/Product: IKEA ALSEDA wicker poufs
Output:
Span: (40,60)
Span String: @ikeausa opened up
Event Type: Consumer Purchase
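
The input/output format above could be held in memory roughly as follows. This is only a minimal sketch: the class and field names (Example, CauseAnnotation, spans, and so on) are hypothetical and not part of the official data format, and the text is a shortened excerpt of Example 1. Because the specification does not state the offset convention for spans, the sketch stores the offsets as given and does not slice the text with them.

```python
from dataclasses import dataclass
from typing import List, Tuple

# The five event types named in the specification.
EVENT_TYPES = (
    "Consumer Attention",
    "Consumer Interest",
    "Consumer Needs",
    "Consumer Purchase",
    "Consumer Use",
)


@dataclass(frozen=True)
class CauseAnnotation:
    """One cause: one or more spans plus the event type (hypothetical layout)."""
    spans: List[Tuple[int, int]]   # offsets stored exactly as given in the data
    span_strings: List[str]        # surface text of each span
    event_type: str

    def __post_init__(self) -> None:
        if self.event_type not in EVENT_TYPES:
            raise ValueError(f"unknown event type: {self.event_type!r}")
        if not self.spans:
            raise ValueError("a cause needs at least one span")
        if len(self.spans) != len(self.span_strings):
            raise ValueError("spans and span_strings must have equal length")


@dataclass(frozen=True)
class Example:
    """Model input (text, brand/product) with its labeled causes."""
    text: str
    brand_or_product: str
    causes: List[CauseAnnotation]


# Example 1, with the text shortened to its first two sentences.
example = Example(
    text=("Did you see my stories from yesterday? "
          "Our @ikeausa opened up and I shared a shop with me."),
    brand_or_product="IKEA ALSEDA wicker poufs",
    causes=[
        CauseAnnotation(
            spans=[(40, 60)],
            span_strings=["@ikeausa opened up"],
            event_type="Consumer Purchase",
        )
    ],
)
```

A cause made of several non-contiguous spans would simply carry more than one entry in spans and span_strings.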

1.4 Model submission and evaluation

Each competition team is invited to build a model that takes an article and a brand or product as input and outputs a span or multiple spans with the event type.

Team submissions will be judged by unified evaluation tools. We use precision, recall, and F1-measure as the evaluation criteria.

A predicted cause counts as correct only if both its span and its event type are correct.
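
The scoring rule can be illustrated with a short sketch. It is not the official evaluation tool: it assumes exact matching of span offsets, micro-averaging over all causes in the test set, and a hypothetical tuple layout (document id, span start, span end, event type) for each single-span cause. The document ids and the offsets for "doc2" are made-up illustration values.

```python
from typing import Iterable, Tuple

# Hypothetical layout of one cause: (document id, span start, span end, event type).
Cause = Tuple[str, int, int, str]


def precision_recall_f1(gold: Iterable[Cause],
                        predicted: Iterable[Cause]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 under exact matching.

    A prediction is correct only if an identical tuple (same document,
    same span, same event type) appears in the gold annotations.
    """
    gold_set, pred_set = set(gold), set(predicted)
    correct = len(gold_set & pred_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


gold = [
    ("doc1", 40, 60, "Consumer Purchase"),
    ("doc2", 5, 20, "Consumer Use"),
]
predicted = [
    ("doc1", 40, 60, "Consumer Purchase"),   # span and type both match: correct
    ("doc2", 5, 20, "Consumer Interest"),    # right span, wrong type: incorrect
    ("doc2", 30, 42, "Consumer Use"),        # span not in gold: incorrect
]

p, r, f = precision_recall_f1(gold, predicted)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

In this toy case one of three predictions is correct and one of two gold causes is found, so precision is 1/3, recall is 1/2, and F1 is 0.4.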

2. Rules

This contest is open to the public, with the following restrictions:

  • Organizers, contest committee members, and employees of Mininglamp are not eligible to participate in the contest.
  • Privately sharing code or data outside of teams is not permitted. Sharing code is allowed if it is made available on a public website.
  • Each participant can join only one team. The maximum number of members on each team is five.

3. Schedule and readiness

  • Contest requirement specification and sample data available: June 26th, 2020
  • Contest team registration and first stage begin. Training and validation sets are available: July 6th, 2020
  • Second stage begins. Testing sets are available: August 10th, 2020
  • Contest final submission deadline: 23:59, August 23rd, 2020
  • Contest finalist notifications: August 31st, 2020

4. Organizing Team

5. Prizes

The winning team will be awarded a prize of USD 10,000. There will also be second and third prizes.

References

[1] Marco Rospocher, et al. “Building event-centric knowledge graphs from news.” Journal of Web Semantics, Volumes 37–38, 2016, pp. 132–151.