for High-Volume EO Data Processing
Processing Earth observation (EO) data across heterogeneous compute resources, subject to a number of request-specific constraints, requires a sophisticated method of workflow task placement. Subproject CN1 aims to develop a data-centric and compute-centric networking architecture that consolidates network management with the management of the serverless platform. A key research area is the development of a suitable infrastructure abstraction layer for managing the resource demands of complex EO workflows, which enables fine-tuned task placement decisions in a multi-objective environment. Because it incorporates workflow-specific optimizations and integrates high-performance computing into the networked resources, the project is tightly connected to subprojects SE2 and HPC1.
Fig. 1: Structured Approach and embedding of CN1 into the Research Unit SOS.
Current EO-related data science is performed in isolated high-performance computing environments, as necessitated by the size of the datasets and the complexity of the computations. However, the results generated by experts, and the workflows they develop for their specific subjects, are rarely shared even between researchers in related fields, which leads to unnecessary recomputation and reduced sustainability. Sharing both datasets and computing resources between researchers may improve research quality by enabling closer collaboration, but it requires that resources can be pooled easily and that results can be shared without the EO researcher needing deep technical knowledge of the underlying network mechanisms.
While a large number of studies exist on the many intersecting topics that this subproject aims to tackle, none of them investigates the emergent behavior of a system that facilitates EO workflows in a serverless environment.
Designing a highly functional control-plane overlay requires a solid basis of performance analysis. This analysis will leverage theoretical queuing models for an abstract evaluation and discrete-event simulation for the evaluation of the implemented control-plane mechanisms. The evaluation will be extended over the course of the project to include resource monitoring and performance improvement mechanisms. The task-placement problem has been studied in related literature, and formulating it as an Integer or Mixed-Integer Linear Program can address the conflicting objectives in our use case. For detailed lifecycle analysis of compute nodes, a Markov or semi-Markov model may be used, which supports both stationary and transient system analysis to understand time dynamics.
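To make the task-placement formulation more concrete, the following is a minimal sketch and not the project's actual method. It uses the binary assignment structure typical of an integer program: each task is placed on exactly one node, node capacities must not be exceeded, and the objective is a weighted sum of two competing costs. Instead of calling an ILP solver, it enumerates all assignments exhaustively, which is only practical for toy instances. All node names, task names, capacities, cost figures, and weights are hypothetical values chosen for illustration.

```python
from itertools import product

# Hypothetical nodes: CPU capacity (cores) and energy cost per core used.
NODES = {
    "hpc": {"capacity": 8, "energy": 3.0},
    "edge": {"capacity": 4, "energy": 1.0},
}

# Hypothetical workflow tasks: CPU demand (cores) and input data size (GB).
TASKS = {
    "ingest": {"cpu": 2, "data_gb": 10},
    "classify": {"cpu": 6, "data_gb": 4},
    "export": {"cpu": 2, "data_gb": 1},
}

# Hypothetical cost per GB of moving a task's input data to each node.
TRANSFER = {"hpc": 0.5, "edge": 0.1}


def placement_cost(assignment, weight_energy=1.0, weight_transfer=1.0):
    """Weighted sum of energy and data-transfer cost for a task -> node map."""
    energy = sum(TASKS[t]["cpu"] * NODES[n]["energy"] for t, n in assignment.items())
    transfer = sum(TASKS[t]["data_gb"] * TRANSFER[n] for t, n in assignment.items())
    return weight_energy * energy + weight_transfer * transfer


def is_feasible(assignment):
    """Capacity constraint: total CPU demand per node must fit its capacity."""
    load = {n: 0 for n in NODES}
    for task, node in assignment.items():
        load[node] += TASKS[task]["cpu"]
    return all(load[n] <= NODES[n]["capacity"] for n in NODES)


def best_placement(weight_energy=1.0, weight_transfer=1.0):
    """Enumerate every assignment and return the cheapest feasible one."""
    best, best_cost = None, float("inf")
    task_names = list(TASKS)
    # Each tuple in the product picks exactly one node per task.
    for choice in product(NODES, repeat=len(task_names)):
        assignment = dict(zip(task_names, choice))
        if not is_feasible(assignment):
            continue
        cost = placement_cost(assignment, weight_energy, weight_transfer)
        if cost < best_cost:
            best, best_cost = assignment, cost
    return best, best_cost


if __name__ == "__main__":
    placement, cost = best_placement()
    print(placement, cost)
```

With these illustrative numbers, the capacity constraint forces `classify` onto the `hpc` node, and the two smaller tasks end up on `edge`. Changing `weight_energy` and `weight_transfer` shifts the trade-off between the two objectives, which is the kind of conflict a multi-objective formulation has to resolve. For realistic problem sizes, the same constraints and objective would be handed to a dedicated MILP solver rather than enumerated.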
This subproject focuses on the unique challenges that arise from the resource demand profile specific to the EO domain. This leads to an implementation of the data-centric and compute-centric paradigm that is fine-tuned to the specific requirements of the workflows at hand, enabling a sustainable and efficient compute environment.
Team