Workshop on
Machine Learning for CyberSecurity
Co-located with ECMLPKDD 2019
September 20, 2019 - Würzburg, Germany
About the workshop

About MLCS 2019

Short description

The last decade has been a critical one regarding cybersecurity, with studies estimating the cost of cybercrime to be up to 0.8 percent of the global GDP. The capability to detect, analyse, and defend against threats in (near) real-time conditions is not possible without employing machine learning techniques and big data infrastructures. This gives rise to cyberthreat intelligence and analytic solutions, such as (informed) machine learning on big data and open-source intelligence, to perceive, reason, learn, and act against cyber adversary techniques and actions. Moreover, organisations’ security analysts have to manage and protect systems and deal with the privacy and security of all personal and institutional data under their control. The aim of this workshop is to provide researchers with a forum to exchange and discuss scientific contributions, open challenges and recent achievements in machine learning and their role in the development of secure systems.

Relevance to the Machine Learning Community

Cybersecurity is of the utmost importance for computing systems. The ethics guidelines for trustworthy artificial intelligence, authored by the European Commission’s High Level Expert Group on Artificial Intelligence in December 2018, have recently highlighted that machine learning-based artificial intelligence developments in various fields, including cybersecurity, are improving the quality of our lives every day.

Due to the scale and complexity of current systems, cybersecurity is a permanent and growing concern in industry and academia. On the one hand, the volume and diversity of functional and non-functional data, including open source information, along with increasingly dynamic operating environments, create additional obstacles to the security of systems and to the privacy and security of data. On the other hand, they create an information-rich environment that, leveraged by techniques at the intersection of modern machine learning, data science, and visualization, can help improve the security and privacy of systems and data.

This poses significant, industry-relevant challenges to the machine learning and cybersecurity communities. The main problems arise in dynamic operating environments and under unexpected operating conditions, which motivates the demand for production-ready systems able to improve and adaptively maintain the security of computing systems as well as the security and privacy of data.

Against this background, we plan to organize this workshop as a European forum for cybersecurity researchers and practitioners who wish to discuss recent developments in machine learning for cybersecurity, paying special attention to solutions rooted in adversarial learning, pattern mining, neural networks and deep learning, probabilistic inference, anomaly detection, stream learning and mining, and big data analytics.

Motivation

The last decade has been a critical one regarding cybersecurity, with studies estimating the cost of cybercrime to be up to 0.8 percent of the global GDP. Cyberthreats have increased dramatically, exposing sensitive personal and business information, disrupting critical operations and imposing high costs on the economy. The number, frequency, and sophistication of threats will only increase, and threats will become more targeted in nature. Furthermore, today’s computing systems operate at increasing scale and in dynamic environments, ingesting and generating more and more functional and non-functional data. The capability to detect, analyse, and defend against threats in (near) real-time conditions is not possible without employing machine learning techniques and big data infrastructure. This gives rise to cyberthreat intelligence and analytic solutions, such as (informed) machine learning on big data and open-source intelligence, to perceive, reason, learn, and act against cyber adversary techniques and actions. Moreover, organisations’ security analysts have to manage and protect these systems and deal with the privacy and security of all personal and institutional data under their control. This calls for tools and solutions combining the latest advances in areas such as data science, visualization, and machine learning.

We strongly believe that the significant advances in the state of the art in machine learning over the last years have not been fully exploited to harness the potential of available data for the benefit of systems-and-data security and privacy. In fact, while machine learning algorithms have already proven beneficial for the cybersecurity industry, they have also revealed a number of shortcomings. Traditional machine learning algorithms are often vulnerable to attacks, known as adversarial learning attacks, which can cause the algorithms to misbehave or reveal information about their inner workings. As machine learning-based capabilities become incorporated into cyber assets, the need to understand adversarial learning and address it becomes clear. On the other hand, when a significant amount of data is collected from or generated by different security monitoring solutions, big-data analytical techniques are necessary to mine, interpret, and extract knowledge from this data.

Goals

The workshop aims to provide researchers with a forum to exchange and discuss scientific contributions and open challenges, both theoretical and practical, related to the use of machine learning approaches in cybersecurity. We want to foster joint work and knowledge exchange between the cybersecurity community and researchers and practitioners from machine learning and its intersection with big data, data science, and visualization. The workshop aims to highlight the latest research trends in machine learning, privacy of data, big data, deep learning, incremental and stream learning, and adversarial learning. In particular, it aims to promote the application of these emerging techniques to cybersecurity and to measure the success of these less-traditional algorithms.

The workshop shall provide a forum for discussing novel trends and achievements in machine learning and their role in the development of secure systems, and for identifying new application areas as well as open and future research problems related to the application of machine learning in the cybersecurity field.

Call for papers

MLCS welcomes research papers reporting results from mature work, papers on recently published work, and more speculative papers describing new ideas or preliminary exploratory work. Papers reporting industry experiences and case studies are also encouraged. Note, however, that papers based on recently published work will not be considered for publication in the proceedings.

Topics

All topics related to the contribution of machine learning approaches to the security of organisations’ systems and data are welcome. These include, but are not limited to:

  • Machine learning for:
    • the security and dependability of networks, systems, and software
    • open-source threat intelligence and cybersecurity situational awareness
    • data security and privacy
    • cybersecurity forensic analysis
    • the development of smarter security control
    • the fight against (cyber)crime, e.g., biometrics, audio/image/video analytics
    • vulnerability analysis
    • the analysis of distributed ledgers
    • malware, anomaly, and intrusion detection

  • Adversarial machine learning and the robustness of AI models against malicious actions
  • Interpretability and explainability of machine learning models in cybersecurity
  • Privacy-preserving machine learning
  • Trusted machine learning
  • Data-centric security
  • Scalable / big data approaches for cybersecurity
  • Deep learning for automated recognition of novel threats
  • Graph representation learning in cybersecurity
  • Continuous and one-shot learning
  • Informed machine learning for cybersecurity
  • User and entity behavior modeling and analysis

Paper submission

Submissions are accepted in two formats:

  • Regular research papers of 12 to 16 pages, including references. To be published in the proceedings, research papers must be original, not published previously, and not submitted concurrently elsewhere.
  • Short research statements of at most 6 pages. Research statements aim at fostering discussion and collaboration. They may review research published previously or outline new emerging ideas.

All submissions should be made in PDF using the EasyChair platform and must adhere to the Springer LNCS style. Templates are available here.

All regular workshop papers (except papers reporting recently published work) will be published in the workshop proceedings. Research statements will be published online on the workshop program page.

Important dates

Regular and research statement papers / Competition

  • June 7 (extended: June 21): Submission deadline
  • July 19: Paper author notification
  • July 26: Camera-ready submission deadline
  • August 24: Deadline for participation in competition
  • August 31: Notification of competition results

Organizing Committee

Program Committee

Confirmed members

  • Aikaterini Mitrokotsa, Chalmers University of Technology, Sweden
  • Alysson Bessani, University of Lisbon - LASIGE, Portugal
  • Cagatay Turkay, City University London, United Kingdom
  • Gianluigi Folino, CNR-ICAR, Italy
  • Giorgio Giacinto, University of Cagliari, Italy
  • Ilir Gashi, CSR / City University London, United Kingdom
  • Leonardo Aniello, University of Southampton, United Kingdom
  • Luis Muñoz-González, Imperial College London, United Kingdom
  • Marc Dacier, Eurecom, France
  • Marco Vieira, University of Coimbra, Portugal
  • Miguel Correia, University of Lisbon, Portugal
  • Mihalis Nicolaou, The Cyprus Institute, Cyprus
  • Pavel Laskov, University of Liechtenstein, Liechtenstein
  • Rogério de Lemos, University of Kent, United Kingdom
  • Sara Madeira, University of Lisbon, Portugal
  • Tommaso Zoppi, University of Florence, Italy
  • Vasileios Mavroeidis, University of Oslo, Norway
  • V.S. Subrahmanian, Dartmouth College, USA

Competition

Multi-task learning in natural language processing for cybersecurity threat intelligence

This competition is organized within the scope of the European Union H2020 project DiSIEM - Diversity Enhancements for Security Information and Event Management. For complete information about the competition, please visit the competition web page hosted on the DiSIEM project home page.

Competition participants are encouraged to submit a regular paper describing their methodology and early results, which will be considered for publication in the proceedings. Final results may be updated on the deadline for participation in the competition.

The objective of the DiSIEM project is to improve the capabilities of Security Information and Event Management (SIEM) systems using diversity-related technology. One contribution of this project relies on the application of machine learning techniques to integrate diverse Open Source INTelligence (OSINT) in order to identify relationships, trends, and anomalies, and hence help react to new vulnerabilities affecting an IT infrastructure or even predict possible emerging threats.

In this context, we launch a competition on multi-task learning in natural language processing for cybersecurity threat intelligence. Over the course of the project, we collected all tweets from a carefully chosen set of accounts. These datasets were labelled to distinguish the tweets related to the security of IT infrastructures specified by three industry partners: a worldwide travel services provider, a global-company cybersecurity department, and a nation-wide power utility. The security-related tweets were then carefully annotated to locate specific named entities that help build an Indicator of Compromise (IoC).

The challenge for the participants is to develop one model that can simultaneously classify tweets as relevant or not to the cybersecurity of an infrastructure and locate specific named entities in those tweets. To that end, three training data sets are provided, one for each industry partner case study. Similarly, three validation data sets are prepared to evaluate the candidate solutions. The final ranking of the competition will be given by averaging evaluation metrics over the three case studies.
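
As a rough illustration of how a ranking obtained by averaging over the three case studies could be computed, the sketch below takes one score per case study for each team and sorts teams by the unweighted mean. The team names, case study keys, and score values are hypothetical, and the actual evaluation metrics and any weighting are defined on the competition web page, not here.

```python
from statistics import mean


def final_score(per_case_scores):
    """Unweighted mean of one evaluation metric over the case studies."""
    if not per_case_scores:
        raise ValueError("at least one case study score is required")
    return mean(per_case_scores.values())


def rank_teams(results):
    """Return team names sorted by descending averaged score."""
    return sorted(results, key=lambda team: final_score(results[team]), reverse=True)


# Hypothetical scores, for illustration only (not competition results).
results = {
    "team_a": {"travel": 0.80, "cybersecurity": 0.70, "utility": 0.60},
    "team_b": {"travel": 0.50, "cybersecurity": 0.75, "utility": 1.00},
}

ranking = rank_teams(results)
for position, team in enumerate(ranking, start=1):
    print(position, team, round(final_score(results[team]), 3))
```

An unweighted mean treats the three case studies as equally important regardless of data set size; this is an assumption of the sketch.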

Program

Keynote speaker

MLCS 2019 programme

Room 2.010

20/09/2019 10:30 - 18:00

10:30-10:40 Welcome to MLCS 2019!
Annalisa Appice, Università degli Studi di Bari
Session 1: Keynote talk
Session chair: Pedro Ferreira, Faculty of Sciences - University of Lisbon
10:40-11:30 Towards Trustworthy AI
Mario Fritz, CISPA – Helmholtz Center for Information Security
Session 2: Adversarial Defences
Session chair: Mario Fritz (tentative), CISPA – Helmholtz Center for Information Security
11:30-12:00 Defense-VAE: A Fast and Accurate Defense against Adversarial Attacks
Xiang Li and Shihao Ji
12:00-14:00 Lunch break
Session 3: Privacy and Network Security
Session chair: Michael Kamp, University of Bonn
14:00-14:30 Auto Semi-supervised Outlier Detection for Malicious Authentication Events
Georgios Kaiafas, Christian Hammerschmidt, Sofiane Lagraa, and Radu State
14:30-15:00 Analyzing and Storing Network Intrusion Detection Data using Bayesian Coresets: A Preliminary Study in Offline and Streaming Settings
Fabio Massimo Zennaro
15:00-15:30 Are Network Attacks Outliers? A Study of Space Representations and Unsupervised Algorithms
Félix Iglesias, Alexander Hartl, Tanja Zseby, and Arthur Zimek
15:30-15:45 Short paper: A unified view on differential privacy and robustness to adversarial examples
Rafael Pinot, Florian Yger, Cédric Gouy-Pailler, and Jamal Atif
16:00-16:20 Coffee break
Session 4: Panel discussion
Session chair: Donato Malerba, Università degli Studi di Bari
16:20-18:00 Future directions of research in machine learning for cybersecurity
Gavin Brown, School of Computer Science, The University of Manchester
Mario Fritz, CISPA – Helmholtz Center for Information Security

Venue

Please read about the venue on the ECML venues web page, where you will find a description of the venue and a map.

Contact Us

for any question regarding the workshop