Software

The StarLight community and its partners participate directly in many networking innovations, trials, experiments and tool integrations, using software the community develops, software its partners develop, and open-source software, to respond to the networking requirements of its science communities.

BigData Express (BDE) (in progress)

The BDE initiative, led by Fermi National Accelerator Laboratory, is designing and developing a schedulable, predictable, and high-performance data-transfer service, supported by QoS-guaranteed data transfer, DTN-as-a-Service, Network-as-a-Service, distributed resource brokering/matching, scheduling and measurements. The BDE WebPortal allows users to access BDE data transfer services. The BDE scheduler is a software stack that schedules and orchestrates resources at BDE sites to support data transfers. The BDE AmoebaNet stack is an SDN-enabled network service enabling application-aware networks. Other SDX capabilities being addressed include advanced switching capabilities based on innovative switching architectures and technologies, including highly programmable sliceable SDN switches, supporting new concepts of continually dynamic adjustments to stream variability.

Clonezilla for Open Network Install Environments (CzfO) (in progress)

CzfO is a new method for SDN programming that the StarLight community is investigating.

Data Transfer Node (DTN)-as-a-Service (DaaS) (in progress)

[Figure: DaaS software stack]

The DaaS software stack (illustrated here) provides users with an advanced workflow environment for high-speed network data transfer using DTNs. The DaaS software stack consists of multiple modules that implement core functions and are controlled by a workflow controller that enables users to orchestrate data transfers with predefined parameters.
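
The controller-and-modules arrangement described above can be pictured with a minimal sketch. This is not the DaaS code: the class name, the module functions, the host names and the parameter keys are all hypothetical, chosen only to show a controller running modules in order against one predefined parameter set.

```python
from typing import Callable, Dict, List, Tuple

Context = Dict[str, object]
Module = Callable[[Context], None]


class WorkflowController:
    """Runs registered modules in order against one shared parameter set."""

    def __init__(self, parameters: Context) -> None:
        self.parameters = dict(parameters)
        self.modules: List[Tuple[str, Module]] = []

    def register(self, name: str, module: Module) -> None:
        self.modules.append((name, module))

    def run(self) -> Context:
        context: Context = dict(self.parameters)
        completed: List[str] = []
        for name, module in self.modules:
            module(context)  # each module reads and updates the shared context
            completed.append(name)
        context["completed"] = completed
        return context


# Hypothetical modules: placeholders that only record what a real one would do.
def provision_path(context: Context) -> None:
    context["path"] = f"{context['source']} -> {context['destination']}"


def start_transfer(context: Context) -> None:
    context["status"] = (
        f"transferring {context['dataset']} with {context['streams']} streams"
    )


# Predefined parameters for one transfer (all values are made up).
controller = WorkflowController(
    {
        "source": "dtn-a.example.org",
        "destination": "dtn-b.example.org",
        "dataset": "run-042",
        "streams": 8,
    }
)
controller.register("provision_path", provision_path)
controller.register("start_transfer", start_transfer)
result = controller.run()
print(result["completed"])
```

Keeping the parameters in one shared context is only one possible design; it makes a transfer repeatable by rerunning the same workflow with the same inputs.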

DaaS, being developed by the StarLight community and its partners, is being integrated with existing SDX services. The DaaS–Jupyter/NVMe over Fabrics configuration received the SupercomputingAsia conference’s Data Mover Challenge 2020 Best Speed and Science Integration Award.

The StarLight community is also working with the Open Science Grid (OSG) to integrate DaaS with the OSG community as a production service across the U.S. and globally.

DaaS 2.0, now in development, will include services for containerization, integration of SDN control for E2E paths, OAM integration, and new SDX data transport services for data streaming with NVMe over Fabrics (NVMe-oF).

Note: DaaS was initially implemented by StarLight and its partners as a prototype service and shown in the SCinet Network Research Exhibition (NRE) at SC17, SC18 and SC19, demonstrating capabilities for dynamic provisioning, SD-WAN provisioning, dynamic implementation of WAN L2 paths, and support for large file transfers and data streams through DTNs. The DaaS framework was also presented at the Innovating the Network for Data-Intensive Science (INDIS) workshops at SC18 (presentation and paper) and SC19 (presentation and paper).

In an SC20 SCinet email dated June 26, 2020, Dr. Michelle Zhu, associate professor and associate chair in the Computer Science Department of Montclair State University and chair of the INDIS workshop since 2018, wrote: The synergy effects between the research presented at the INDIS workshop and the network demos showcased by NRE enable research opportunities within SCinet…One example is the SCinet DTN-as-a-Service Framework developed by Jim Chen, associate director of iCAIR at Northwestern University and his multi-institutional team from iCAIR, LBNL and ESnet. Initially implemented as a prototype through SCinet NRE demos at SC17 and SC18, the design and implementation of the framework was presented at SC19 through the research paper “SCinet DTN-as-a-Service Framework” by Yu et al. at the INDIS 2019 workshop, and showcased in the NRE demo “Toward SCinet DTN-as-a-Service”. As of this year, the SCinet DTN-as-a-Service framework is expected to be a standard SCinet service for the SC20 conference.

Data Plane Development Kit (DPDK) (in progress)

DPDK is an open-source software project that provides data plane libraries and polling-mode drivers for network interface controllers, moving packet processing out of the OS kernel and into processes running in user space. It is one of several network programming tools, based on network computing and kernel-bypass techniques, that the StarLight community is investigating.

Distributed Research Environment Orchestration (in progress)

Today, science workflow management and network orchestration are accomplished with completely separate software stacks. Since many computational scientists use Jupyter notebooks to manage their workflows, the StarLight SDX initiative has a project to make networks more application aware with Jupyter, integrating it with extensions to manage SDX network services and resources. Notebooks, along with libraries and data, can be placed in containers, such as Docker, for easy transportability. Supporting these services will require support for integrated compute and storage at exchange points, using virtualization and containerization techniques.

Extended Berkeley Packet Filter (eBPF) (in progress)

eBPF is an in-kernel virtual machine that runs sandboxed programs in kernel space, commonly for system tracing and packet processing, without requiring new kernel modules to be written. The eBPF map, its primary generic data structure, enables data to be communicated within the kernel or between the kernel and user space. It is one of several network programming tools, based on network computing and kernel-bypass techniques, that the StarLight community is investigating.
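
As an illustration of a map carrying data from the kernel to user space, here is a minimal sketch. It assumes the BCC toolkit's Python bindings, which the text above does not mention, and it needs root privileges on a Linux host with BCC installed. The map name `exec_counts`, the function names and the choice of the `execve` system call are arbitrary.

```python
import time

# eBPF program in restricted C; BCC compiles and loads it at run time.
BPF_PROGRAM = r"""
BPF_HASH(exec_counts, u32, u64);   // eBPF map: PID -> number of execve() calls

int count_exec(void *ctx) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    exec_counts.increment(pid);     // kernel side updates the map
    return 0;
}
"""


def count_execs(seconds=5):
    """Load the program, wait, then read the map back from user space."""
    # Imported lazily: requires the bcc package and root privileges.
    from bcc import BPF

    bpf = BPF(text=BPF_PROGRAM)
    bpf.attach_kprobe(
        event=bpf.get_syscall_fnname("execve"), fn_name="count_exec"
    )
    time.sleep(seconds)
    # User space reads the same map that the kernel program wrote to.
    return {pid.value: count.value for pid, count in bpf["exec_counts"].items()}
```

Calling `count_execs()` as root would return a dictionary of process IDs and call counts gathered during the wait. No kernel module is written or loaded at any point.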

Global Research Platform (GRP) Initiative (in progress)

The evolving international GRP effort focuses on the design, implementation, and operation strategies for next-generation distributed services and network infrastructure on a global scale, including interoperable Science DMZs, to facilitate data transfer and accessibility. Related initiatives include the Global Research Platform, the Asia Pacific Research Platform, the Australia National Research Platform, the Canadian Research Platform, the Korean Research Platform, the Netherlands Research Platform and the Poland Research Platform.

Kubespawner: Global Science DMZ Architecture

The StarLight SDX initiative partners with UCSD to prototype the use of specialized software stacks to create a multi-institution, hyperconverged Science DMZ using Kubernetes as the core orchestrating resource. Kubespawner, the JupyterHub Kubernetes Spawner, enables JupyterHub to spawn single-user notebook servers on a Kubernetes cluster. Kubernetes is an open-source system that automates the deployment, scaling, and management of containerized applications. With Kubernetes, a JupyterHub deployment can scale across multiple nodes, for example to support more than 50 simultaneous users, and can readily be scaled up to more nodes or down to fewer. Consequently, many JupyterHub deployments can be run with Kubernetes alone, eliminating the need for separate scripting and configuration tools such as Ansible, Puppet and Bash.
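
For readers unfamiliar with Kubespawner, the fragment below sketches how a `jupyterhub_config.py` might select it and set per-user resources. It is an assumption-laden illustration, not the configuration used in the StarLight/UCSD prototype: the namespace is hypothetical, and the image name and resource figures are placeholders. The `SimpleNamespace` fallback exists only so the fragment can also run outside JupyterHub.

```python
from types import SimpleNamespace

try:
    # JupyterHub injects get_config() when it loads jupyterhub_config.py.
    c = get_config()  # noqa: F821
except NameError:
    # Stand-in so the fragment also runs outside JupyterHub (illustration only).
    c = SimpleNamespace(JupyterHub=SimpleNamespace(), KubeSpawner=SimpleNamespace())

# Spawn each user's notebook server as a pod on the Kubernetes cluster.
c.JupyterHub.spawner_class = "kubespawner.KubeSpawner"

# Limit how many servers may be starting at once (placeholder value).
c.JupyterHub.concurrent_spawn_limit = 10

# Hypothetical namespace and an example public notebook image.
c.KubeSpawner.namespace = "science-dmz-notebooks"
c.KubeSpawner.image = "jupyter/scipy-notebook:latest"

# Per-user resource guarantees and limits (placeholder values).
c.KubeSpawner.cpu_guarantee = 0.5
c.KubeSpawner.cpu_limit = 2
c.KubeSpawner.mem_guarantee = "1G"
c.KubeSpawner.mem_limit = "4G"

# Allow extra time for image pulls on a freshly added node.
c.KubeSpawner.start_timeout = 300
```

Because each user server is an ordinary pod, adding or removing cluster nodes changes capacity without changing this file.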

Multi-science Open Network Environment (MultiONE) (in progress)

MultiONE is a project of the StarLight consortium and the LHC networking community to develop LHCONE capabilities for domain science to use isolated global virtual networks. One prototype project is DUNEONE (the Deep Underground Neutrino Experiment Open Network Environment), which led to considerations for the design of MultiONE. This is now an active project within the LHC networking community.

P4 Programming Language (in progress)

The StarLight community is investigating the integration of Jupyter high-level processes with an underlying set of network programming processes, including P4 (Programming Protocol-Independent Packet Processors). P4 is an emerging network programming language, i.e., a domain-specific language for network protocols. The StarLight SDX project is exploring use of P4 to support large-scale data-intensive science workflows on high-capacity, high-performance WANs and LANs by introducing into the network a high degree of differentiation not previously possible, including P4 telemetry techniques. StarLight and its partners developed an international P4 testbed for computer science research, and created P4MT, a multi-tenant capability for this testbed, which is being used for multiple research projects.

PetaTrans 100 Gbps Data Transfer Nodes (DTNs) Project (in progress)

DTNs support Science DMZ services. The StarLight community is experimenting with multiple configurations for 100 Gbps DTNs over 100 Gbps WANs, designed to optimize end-to-end (E2E) large-scale, sustained individual data streams. The StarLight community is designing the PetaTrans 100 Gbps DTN research project to improve large-scale WAN services for high-performance, long-duration, large-capacity, single data flows. The project received the SupercomputingAsia conference’s Data Mover Challenge 2019 Innovation Award.
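
To give a sense of scale for such long-duration flows, the short calculation below estimates how long a single stream needs to move one petabyte at 100 Gbps. The function name and the `efficiency` parameter are illustrative assumptions. The figures are back-of-the-envelope arithmetic, not measured PetaTrans results.

```python
def transfer_time_seconds(size_bytes, rate_bits_per_second, efficiency=1.0):
    """Ideal time to move size_bytes over a link at the given bit rate.

    efficiency is the assumed fraction of the line rate actually achieved
    (1.0 means the full line rate, with no protocol or storage overhead).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    return size_bytes * 8 / (rate_bits_per_second * efficiency)


PETABYTE = 10**15          # bytes (decimal petabyte)
LINE_RATE = 100 * 10**9    # 100 Gbps in bits per second

seconds = transfer_time_seconds(PETABYTE, LINE_RATE)
print(f"{seconds:.0f} s, about {seconds / 3600:.1f} hours")
```

At the full line rate this comes to 80,000 seconds, a little over 22 hours. At an assumed 80% efficiency it rises to roughly 100,000 seconds, which is why sustained single-flow throughput matters at this scale.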