MLCS 2019 - Workshop on Machine Learning for CyberSecurity

About MLCS 2019

Short description

The last decade has been a critical one regarding cybersecurity, with studies estimating the cost of cybercrime to be up to 0.8 percent of the global GDP. The capability to detect, analyse, and defend against threats in (near) real-time conditions is not possible without employing machine learning techniques and big data infrastructures. This gives rise to cyberthreat intelligence and analytic solutions, such as (informed) machine learning on big data and open-source intelligence, to perceive, reason, learn, and act against cyber adversary techniques and actions. Moreover, organisations’ security analysts have to manage and protect systems and deal with the privacy and security of all personal and institutional data under their control. The aim of this workshop is to provide researchers with a forum to exchange and discuss scientific contributions, open challenges and recent achievements in machine learning and their role in the development of secure systems.

Relevance to the Machine Learning Community

Cybersecurity is of the utmost importance for computing systems. The ethics guidelines for trustworthy artificial intelligence, authored by the European Commission’s High Level Expert Group on Artificial Intelligence in December 2018, have recently highlighted that machine learning-based artificial intelligence developments in various fields, including cybersecurity, are improving the quality of our lives every day.

Due to the scale and complexity of current systems, cybersecurity is a permanent and growing concern in industry and academia. On the one hand, the volume and diversity of functional and non-functional data, including open source information, along with increasingly dynamic operating environments, create additional obstacles to the security of systems and to the privacy and security of data. On the other hand, they create an information-rich environment that, leveraged by techniques at the intersection of modern machine learning, data science and visualization, will contribute to improving systems and data security and privacy.

This poses significant, industry-relevant challenges to the machine learning and cybersecurity communities, as the main problems arise in contexts of dynamic operating environments and unexpected operating conditions, motivating the demand for production-ready systems able to improve and adaptively maintain the security of computing systems as well as the security and privacy of data.

Based on recent history, we plan to organize this workshop as a European forum for cybersecurity researchers and practitioners who wish to discuss recent developments in machine learning for cybersecurity, paying special attention to solutions rooted in adversarial learning, pattern mining, neural networks and deep learning, probabilistic inference, anomaly detection, stream learning and mining, and big data analytics.

Motivation

The last decade has been a critical one regarding cybersecurity, with studies estimating the cost of cybercrime to be up to 0.8 percent of the global GDP. Cyberthreats have increased dramatically, exposing sensitive personal and business information, disrupting critical operations and imposing high costs on the economy. The number, frequency, and sophistication of threats will only increase, and threats will become more targeted in nature. Furthermore, today’s computing systems operate at increasing scale and in dynamic environments, ingesting and generating more and more functional and non-functional data. The capability to detect, analyse, and defend against threats in (near) real-time conditions is not possible without employing machine learning techniques and big data infrastructure. This gives rise to cyber threat intelligence and analytic solutions, such as (informed) machine learning on big data and open-source intelligence, to perceive, reason, learn, and act against cyber adversary techniques and actions. Moreover, organisations’ security analysts have to manage and protect these systems and deal with the privacy and security of all personal and institutional data under their control. This calls for tools and solutions combining the latest advances in areas such as data science, visualization, and machine learning.

We strongly believe that the significant advance of the state of the art in machine learning over the last years has not been fully exploited to harness the potential of available data for the benefit of systems-and-data security and privacy. In fact, while machine learning algorithms have already proven beneficial for the cybersecurity industry, they have also highlighted a number of shortcomings. Traditional machine learning algorithms are often vulnerable to attacks, known as adversarial learning attacks, which can cause the algorithms to misbehave or reveal information about their inner workings. As machine learning-based capabilities become incorporated into cyber assets, the need to understand adversarial learning and address it becomes clear. On the other hand, when a significant amount of data is collected from or generated by different security monitoring solutions, big-data analytical techniques are necessary to mine, interpret and extract knowledge from these data.

Goals

The workshop aims to provide researchers with a forum to exchange and discuss scientific contributions and open challenges, both theoretical and practical, related to the use of machine learning approaches in cybersecurity. We want to foster joint work and knowledge exchange between the cybersecurity community and researchers and practitioners from machine learning and its intersection with big data, data science, and visualization. The workshop aims to highlight the latest research trends in machine learning, privacy of data, big data, deep learning, incremental and stream learning, and adversarial learning. In particular, it aims to promote the application of these emerging techniques to cybersecurity and measure the success of these less-traditional algorithms.

The workshop shall provide a forum for discussing novel trends and achievements in machine learning and their role in the development of secure systems, and for identifying new application areas as well as open and future research problems related to the application of machine learning in the cybersecurity field.

Call for papers

MLCS welcomes research papers reporting results from mature work, papers on recently published work, and more speculative papers describing new ideas or preliminary exploratory work. Papers reporting industry experiences and case studies are also encouraged. Note, however, that papers based on recently published work will not be considered for publication in the proceedings.

Topics

All topics related to the contribution of machine learning approaches to the security of organisations’ systems and data are welcome. These include, but are not limited to:

Machine learning for:

the security and dependability of networks, systems, and software
open-source threat intelligence and cybersecurity situational awareness
data security and privacy
cybersecurity forensic analysis
the development of smarter security control
the fight against (cyber)crime, e.g., biometrics, audio/image/video analytics
vulnerability analysis
the analysis of distributed ledgers
malware, anomaly, and intrusion detection

Adversarial machine learning and the robustness of AI models against malicious actions
Interpretability and Explainability of machine learning models in cybersecurity
Privacy preserving machine learning
Trusted machine learning
Data-centric security
Scalable / big data approaches for cybersecurity
Deep learning for automated recognition of novel threats
Graph representation learning in cybersecurity
Continuous and one-shot learning
Informed machine learning for cybersecurity
User and entity behavior modeling and analysis

Paper submission

Submissions are accepted in two formats:

Regular research papers with 12 to 16 pages including references. To be published in the proceedings, research papers must be original, not published previously, and not submitted concurrently elsewhere.
Short research statements of at most 6 pages. Research statements aim at fostering discussion and collaboration. They may review research published previously or outline new emerging ideas.

All submissions should be made in PDF using the EasyChair platform and must adhere to the Springer LNCS style. Templates are available here.

All regular workshop papers (except papers reporting recently published work) will be published in the workshop proceedings. Research statements will be published online on the workshop program page.

Organizing Committee

Program Committee

Confirmed members

Aikaterini Mitrokotsa, Chalmers University of Technology, Sweden
Alysson Bessani, University of Lisbon - LASIGE, Portugal
Cagatay Turkay, City University London, United Kingdom
Gianluigi Folino, CNR-ICAR, Italy
Giorgio Giacinto, University of Cagliari, Italy
Ilir Gashi, CSR / City University London, United Kingdom
Leonardo Aniello, University of Southampton, United Kingdom
Luis Muñoz-González, Imperial College London, United Kingdom
Marc Dacier, Eurecom, France
Marco Vieira, University of Coimbra, Portugal
Miguel Correia, University of Lisbon, Portugal
Mihalis Nicolaou, The Cyprus Institute, Cyprus
Pavel Laskov, University of Liechtenstein, Liechtenstein
Rogério de Lemos, University of Kent, United Kingdom
Sara Madeira, University of Lisbon, Portugal
Tommaso Zoppi, University of Florence, Italy
Vasileios Mavroeidis, University of Oslo, Norway
V.S. Subrahmanian, Dartmouth College, USA

Competition

Multi-task learning in natural language processing for cybersecurity threat intelligence

This competition is organized within the scope of the European Union H2020 project DiSIEM - Diversity Enhancements for Security Information and Event Management. For complete information about the competition, please visit the competition web page hosted on the DiSIEM project home page.

Competition participants are encouraged to submit a regular paper describing their methodology and early results, which will be considered for publication in the proceedings. Final results may be updated by the deadline for participation in the competition.

The objective of the DiSIEM project is to improve the capabilities of Security Information and Event Management (SIEM) systems using diversity-related technology. One contribution of this project relies on the application of machine learning techniques to integrate diverse Open Source INTelligence (OSINT) in order to identify relationships, trends and anomalies, and hence help react to new vulnerabilities affecting an IT infrastructure or even predict possible emerging threats.

In this context, we launch a competition on multi-task learning in natural language processing for cybersecurity threat intelligence. Over the course of the project we collected all tweets from a carefully chosen set of accounts. These datasets were labelled to distinguish the tweets related to the security of IT infrastructures specified by three industry partners: a worldwide travel services provider, a global-company cybersecurity department, and a nation-wide power utility. The security-related tweets were then carefully annotated to locate specific named entities that help build an Indicator of Compromise (IoC).

The challenge for the participants is to develop one model that can simultaneously classify tweets as relevant or not for the cybersecurity of an infrastructure, and locate specific named entities in those tweets. For that, three training data sets are provided, one for each industry partner case study. Similarly, three validation data sets are prepared to evaluate the candidate solutions. The final ranking of the competition will be given by averaging evaluation metrics over the three case studies.
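The ranking rule described above can be illustrated with a minimal sketch. The source does not say which evaluation metrics are used or how they are combined, so everything concrete here is an assumption made for illustration: the metric names (`relevance_f1`, `ner_f1`), the case-study keys, the use of F1 for the relevance task, and the equal weighting of metrics within a case study.

```python
from statistics import mean


def binary_f1(y_true, y_pred):
    """F1 score for the positive class (1 = tweet is security-relevant).

    F1 is an assumed metric; the competition text does not name one.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def final_score(per_case_metrics):
    """Average the metrics within each case study, then across case studies.

    per_case_metrics maps a case-study name to a dict of metric values.
    Equal weighting at both levels is an assumption.
    """
    return mean(mean(metrics.values()) for metrics in per_case_metrics.values())


# Hypothetical validation results for the three industry-partner case studies.
example = {
    "travel_services": {"relevance_f1": 0.80, "ner_f1": 0.60},
    "cybersecurity_department": {"relevance_f1": 0.70, "ner_f1": 0.50},
    "power_utility": {"relevance_f1": 0.90, "ner_f1": 0.70},
}
print(round(final_score(example), 3))
```

Averaging per case study first keeps each industry partner's data set equally influential in the ranking, regardless of how many tweets it contains.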

Program

Keynote speaker

Mario Fritz

CISPA – Helmholtz Center for Information Security
Germany

MLCS 2019 programme

Room 2.010

20/09/2019 10:30 - 18:00

10:30-10:40	Welcome to MLCS 2019!	Annalisa Appice, Università degli Studi di Bari
	Session 1: Keynote talk	Session chair: Pedro Ferreira, Faculty of Sciences - University of Lisbon
10:40-11:30	Towards Trustworthy AI	Mario Fritz, CISPA – Helmholtz Center for Information Security
	Session 2: Adversarial Defences	Session chair: Mario Fritz (tentative), CISPA – Helmholtz Center for Information Security
11:30-12:00	Defense-VAE: A Fast and Accurate Defense against Adversarial Attacks	Xiang Li and Shihao Ji
12:00-14:00	Lunch break
	Session 3: Privacy and Network Security	Session chair: Michael Kamp, University of Bonn
14:00-14:30	Auto Semi-supervised Outlier Detection for Malicious Authentication Events	Georgios Kaiafas, Christian Hammerschmidt, Sofiane Lagraa, and Radu State
14:30-15:00	Analyzing and Storing Network Intrusion Detection Data using Bayesian Coresets: A Preliminary Study in Offline and Streaming Settings	Fabio Massimo Zennaro
15:00-15:30	Are Network Attacks Outliers? A Study of Space Representations and Unsupervised Algorithms	Félix Iglesias, Alexander Hartl, Tanja Zseby, and Arthur Zimek
15:30-15:45	Short paper: A unified view on differential privacy and robustness to adversarial examples	Rafael Pinot, Florian Yger, Cédric Gouy-Pailler, and Jamal Atif
16:00-16:20	Coffee break
	Session 4: Panel discussion	Session chair: Donato Malerba, Università degli Studi di Bari
16:20-18:00	Future directions of research in machine learning for cybersecurity	Gavin Brown, School of Computer Science, The University of Manchester; Mario Fritz, CISPA – Helmholtz Center for Information Security

Important dates

Regular and research statement papers / Competition

June 7

Extended: June 21

July 19

July 26

August 24

August 31

Organizing Committee

Annalisa Appice

Battista Biggio

Donato Malerba

Fabio Roli

Ibéria Medeiros

Michael Kamp

Pedro M. Ferreira