CN1: Network and Infrastructure Resource Management for High-Volume EO Data Processing

Summary

Processing Earth observation (EO) data in a heterogeneous landscape of computation resources, subject to a number of request-specific constraints, requires a sophisticated method of workflow task placement. Subproject CN1 aims to develop a data-centric and compute-centric networking architecture that consolidates network management with the management of the serverless platform. To manage the resource demands of complex EO workflows, a key research area is the development of a fitting infrastructure abstraction layer that enables fine-tuned task placement decisions in a multi-objective environment. Because it incorporates workflow-specific optimizations and integrates high-performance computing into the networked resources, the project is tightly connected to subprojects SE2 and HPC1.

Fig. 1: CN1 overview. Structured approach and embedding of CN1 into the Research Unit SOS.

Context & Motivation

Current EO-related data science is performed in isolated high-performance computing environments, a setup necessitated by the size of the datasets and the complexity of the computations. However, the results generated by experts, as well as the workflows they develop for their specific subject, are rarely shared even between researchers in related fields, resulting in unnecessary recomputation and decreased sustainability. Sharing both datasets and computing resources between researchers may improve research quality by enabling increased collaboration. It requires, however, that resources can be pooled and results shared without complication, and without the EO researcher needing deep technical knowledge of the underlying network mechanisms.

State-of-the-Art

While a large number of studies exist across the many intersecting topics that this subproject aims to tackle, none of them investigates the emergent behavior of a system that facilitates EO workflows in a serverless environment.

Goals & Objectives

  1. Development of a scalable control plane overlay
  2. Solution to the task placement problem, see also SE1
  3. Development of mechanisms to monitor network conditions and resources
  4. Development and evaluation of an adaptive data-plane
  5. Derivation and evaluation of performance improvement methods

Approach

Designing a highly functional control-plane overlay requires a solid basis of performance analysis. This analysis will leverage theoretical queuing models for an abstract evaluation and discrete-event simulation for the evaluation of the implemented control-plane mechanisms. It is extended as the project progresses to include resource monitoring and performance improvement mechanisms. The task placement problem has been studied in related literature, and formulating it as an Integer or Mixed-Integer Linear Program makes it possible to address the conflicting objectives that compete in our use case. For a detailed lifecycle analysis of compute nodes, a Markov or semi-Markov model may be used, facilitating both stationary and transient system analysis to understand time dynamics.
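To make the integer-programming view of task placement concrete, the following is a minimal sketch and not the project's actual formulation. It assumes a hypothetical toy instance in which each task has a resource demand, each node has a capacity, and two illustrative objectives (a data transfer cost and an energy cost) are combined by a weighted sum. All names, cost values, and the weighted-sum scalarization are assumptions made for illustration. Instead of calling an ILP solver, the sketch enumerates every assignment, which is only feasible for very small instances but solves the same binary program exactly.

```python
from itertools import product


def place_tasks(demand, capacity, transfer_cost, energy_cost, weight=0.5):
    """Exhaustively solve a tiny task placement problem (illustrative only).

    Decision: assign[t] is the node chosen for task t. This is equivalent to
    binary variables x[t][n] with exactly one 1 per task.
    Constraint: the summed demand placed on a node must not exceed its capacity.
    Objective: weight * transfer_cost + (1 - weight) * energy_cost, summed
    over all tasks (a weighted-sum scalarization of two objectives).
    Returns (assignment, cost), or (None, None) if no feasible placement exists.
    """
    n_tasks, n_nodes = len(demand), len(capacity)
    best_assign, best_cost = None, None
    for assign in product(range(n_nodes), repeat=n_tasks):
        # Capacity check: accumulate the load each node would carry.
        load = [0] * n_nodes
        for task, node in enumerate(assign):
            load[node] += demand[task]
        if any(load[node] > capacity[node] for node in range(n_nodes)):
            continue
        # Scalarized multi-objective cost of this assignment.
        cost = sum(
            weight * transfer_cost[task][node]
            + (1 - weight) * energy_cost[task][node]
            for task, node in enumerate(assign)
        )
        if best_cost is None or cost < best_cost:
            best_assign, best_cost = assign, cost
    return best_assign, best_cost


# Hypothetical instance: three tasks, two nodes.
DEMAND = [2, 2, 1]                        # resource units per task
CAPACITY = [3, 4]                         # resource units per node
TRANSFER_COST = [[1, 4], [2, 3], [5, 1]]  # [task][node], arbitrary units
ENERGY_COST = [[1, 1], [2, 2], [1, 4]]    # [task][node], arbitrary units

if __name__ == "__main__":
    print(place_tasks(DEMAND, CAPACITY, TRANSFER_COST, ENERGY_COST))
```

Changing `weight` shifts the trade-off between the two objectives. For realistic instance sizes the enumeration would be replaced by a dedicated ILP/MILP solver, since the number of assignments grows exponentially with the number of tasks.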

Novelties

This subproject focuses on the unique challenges that arise from the resource demand profile specific to the EO domain. This leads to an implementation of the data-centric and compute-centric paradigm that is fine-tuned to the specific requirements of the workflows at hand, enabling a sustainable and efficient compute environment.


Team