June 17, 2024


“Thanks to the power of open source, the compute capability provided by the HPE Spaceborne Computer-2, and the scalability of Azure, we’re empowering developers to build for space at a speed that’s out of this world.” - Kevin Mack, Senior Software Engineer, Microsoft

This morning Microsoft News published a story about the use of Azure, enabled by HPE’s Spaceborne Computer-2 on the International Space Station (ISS). The project was designed to overcome the limited bandwidth between the ISS and Earth by validating the benefits of a computational workflow that spans edge and cloud. Under this workflow, high-volume raw data is examined and processed on the ISS using the HPE Spaceborne Computer-2’s edge computing platform, and a much smaller data set containing only the “interesting bits” is sent to Earth, where cloud resources are used to perform compute-intensive analysis to determine what those interesting bits really mean.

The Azure Space team performed the software development needed for the entire experiment in just three days.

A brief background

The International Space Station (ISS), a microgravity and space environment research laboratory, has just observed 20 years of continuous human presence. New technology is delivered to it regularly, as needed to keep up with the research being performed. Computers used on the ISS have typically been custom-built with specialized hardware and programming models to deliver the reliability needed in space. Unfortunately, the developer experience for targeting these spaceborne systems is complex, making programming slow and difficult compared to the commercial-off-the-shelf systems used by most developers today.

Installed in 2017, Spaceborne Computer-1, designed by HPE, validated that a modern, commercial-off-the-shelf computer could survive a launch into space, be installed by astronauts, and operate correctly on the ISS without “flipping bits” due to increased radiation in space. Basically, it was a year-long test to see if a computer used on Earth would function normally in space. Building on this success, HPE’s Spaceborne Computer-2, an edge computing platform with purposely designed features for harsh environments, was installed in April 2021 to deliver twice as much compute performance and, for the first time, artificial intelligence (AI) capabilities to advance space exploration and research by enabling the same programming models and developer experiences used on Earth.

In many ways, Spaceborne Computer-2, which comprises the HPE Edgeline EL4000 Converged Edge system and HPE ProLiant DL360 Gen10 server, is the ultimate edge computing platform, putting a game-changing amount of compute on the edge of space. However, the real limiting factor is the bandwidth between the ISS and Earth. Although Spaceborne Computer-2 supports the maximum available network speeds, NASA allocates it only two hours of communication bandwidth each week to transmit data to Earth, with a maximum download speed of 250 kilobytes per second.

In some cases, limited bandwidth can be worked around by HPE helping researchers compress data on Spaceborne Computer-2 before sending it down to Earth. In other cases, the data can be fully analyzed in space without needing to use the slow downlink at all. But what about research that requires more compute or bandwidth than Spaceborne Computer-2 can provide, or than can be allocated to a single experiment among many? To address such scenarios, HPE applied its vision for an “edge to cloud” experience, in which Spaceborne Computer-2 is used to perform preliminary analysis or filtering on large data sets, extract what’s interesting or unexpected, and then burst those results down to Earth and into the public cloud for full analysis.

The Azure Space experiment

The Azure Space team at Microsoft proposed an experiment that simulates how NASA might monitor astronaut health in the presence of increased radiation exposure, as exists outside of our protective atmosphere. Such exposure will only increase as astronauts venture beyond the ISS’s low-Earth orbit into and beyond the Van Allen Belts.

The experiment assumes access to a gene sequencer onboard the ISS, which is used to regularly monitor blood samples from astronauts. However, gene sequencing can generate an incredible amount of data, far too much for a 2 Mbps downlink, and the output needs to be compared against a large clinical database that’s constantly being updated.

To overcome these limitations, the experiment uses HPE Spaceborne Computer-2 to perform the initial process of comparing extracted gene sequences against reference DNA segments and capture only the differences, or mutations, which are then downloaded to the HPE ground station.

On Earth, the data is uploaded to Azure, where the Microsoft Genomics service does the computational “alignment” work: matching the short gene sequence reads in the downloaded data (which are about 70 base pairs in length) against the full three giga-base-pair human genome, as required to determine where in the human genome each mutation is located and the type of change (deletion, addition, replication, or swap). Aligned reads are then checked against the National Institutes of Health’s dbSNP database to determine what the health impacts of a given mutation might be. Watch the video below to see Azure in action.

Genomics testing on the ISS with HPE Spaceborne Computer-2 and Azure

Development process and computational workflow

The entire experiment was coded by 10 volunteers from the Azure Space team and its parent organization, the Azure Special Capabilities, Infrastructure, and Innovation Team. All major software components (both ISS-based and Azure-based) were written in Python and bash using Visual Studio Code, GitHub, and the Python libraries for Azure Functions and Azure Blob Storage. David Weinstein, Principal Software Engineering Manager at Azure Space, led the three-day development effort, which consisted of a one-day hackathon and two days of cleanup.

The following graphic shows the computational workflow. It starts on the ISS, on Spaceborne Computer-2, which runs Red Hat Linux 7.4.

Architecture diagram shows how the International Space Station connects to Azure

In space

  • A Linux container hosts a Python workload, which is packaged with data representing mutated DNA fragments and wild-type (meaning normal or non-mutated) human DNA segments. There are 80 lines of Python code, with a 30-line bash script to execute the experiment.
  • The Python workload generates a configurable number of DNA sequences (mimicking gene sequencer reads, about 70 nucleotides long) from the mutated DNA fragment.
  • The Python workload uses awk and grep to compare generated reads against the wild-type human genome segments.
  • If a perfect match cannot be found for a read, it’s assumed to be a potential mutation and is compressed into an output folder on the Spaceborne Computer-2 network-attached storage system.
  • After the Python workload completes, the compressed output folder is sent to the HPE ground station on Earth via rsync.
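
The in-space steps above can be sketched in a few lines of Python. This is a minimal illustration, not the Azure Space team's code: the function names (generate_reads, find_potential_mutations, compress_reads) and the toy sequences are hypothetical, and plain Python substring matching and gzip stand in for awk, grep, and the real compressed output folder.

```python
import gzip
import random

READ_LENGTH = 70  # approximate read length stated in the article


def generate_reads(fragment, count, read_length=READ_LENGTH, seed=0):
    """Sample `count` fixed-length reads from a DNA fragment (hypothetical helper)."""
    rng = random.Random(seed)
    last_start = len(fragment) - read_length
    if last_start < 0:
        raise ValueError("fragment is shorter than the read length")
    starts = (rng.randint(0, last_start) for _ in range(count))
    return [fragment[s:s + read_length] for s in starts]


def find_potential_mutations(reads, wild_type_segments):
    """Keep only reads with no perfect match in any wild-type segment."""
    return [
        read for read in reads
        if not any(read in segment for segment in wild_type_segments)
    ]


def compress_reads(reads):
    """Gzip the surviving reads, one per line, ready for downlink."""
    return gzip.compress("\n".join(reads).encode("ascii"))


if __name__ == "__main__":
    rng = random.Random(42)
    # Toy stand-in for a wild-type segment; real data would come from files.
    wild_type = "".join(rng.choice("ACGT") for _ in range(2000))
    # Introduce a single-base swap at position 1000 to simulate a mutation.
    swapped = "A" if wild_type[1000] != "A" else "C"
    mutated = wild_type[:1000] + swapped + wild_type[1001:]

    reads = generate_reads(mutated, count=200)
    suspects = find_potential_mutations(reads, [wild_type])
    payload = compress_reads(suspects)
    print(f"{len(suspects)} of {len(reads)} reads flagged, {len(payload)} bytes to send")
```

Only reads that overlap the simulated mutation fail the exact-match test, so the payload that has to cross the downlink is a small fraction of the reads generated.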

On Earth

  • The HPE ground station uploads the data it receives to Azure, writing it to Azure Blob Storage through azcopy.
  • An event-driven, serverless function written in Python and hosted in Azure Functions monitors Blob Storage, retrieving newly received data and sending it to the Microsoft Genomics service via its REST API.
  • The Microsoft Genomics service, hosted on Azure, invokes a gene sequencing pipeline to “align” each read and determine where, how well, and how unambiguously it matches the full reference human genome. (The Microsoft Genomics service is a cloud implementation of the open-source Burrows-Wheeler Aligner and Genome Analysis Toolkit, which Microsoft tuned for the cloud.)
  • Aligned reads are written back to Blob Storage in Variant Call Format (VCF), a standard for describing variations from a reference genome.
  • A second serverless function hosted in Azure Functions retrieves the VCF records, uses the determined location of each mutation to query the dbSNP database hosted by the National Institutes of Health (as needed to determine the clinical significance of the mutation), and writes that information to a JSON file in Blob Storage.
  • Power BI retrieves the data containing the clinical significance of the mutated genes from Blob Storage and displays it in an easily explorable format.

The Aligner and Analyzer functions total about 220 lines of code, with the Azure services and SDKs handling all of the low-level “plumbing” for the experiment. The functions are automatically triggered by Blob Storage uploads and are configured to point to the right storage accounts, so only a small amount of code is required to parse the raw data and query Microsoft Genomics and the dbSNP database at runtime.
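
As a rough illustration of the parsing step, the sketch below reads VCF text, pulls out the location and alleles of each variant, and serializes them to JSON. It relies only on the standard VCF layout (tab-separated CHROM, POS, ID, REF, ALT columns, with header lines starting with #). The function name parse_vcf, the sample record, and the JSON shape are hypothetical, and the blob-triggered function wiring and the dbSNP query are deliberately left out.

```python
import json


def parse_vcf(vcf_text):
    """Extract chromosome, position, and alleles from VCF data lines."""
    variants = []
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta-information, and the column header
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # not a complete data line
        chrom, pos, var_id, ref, alt = fields[:5]
        variants.append({
            "chrom": chrom,
            "pos": int(pos),  # VCF positions are 1-based
            "id": None if var_id == "." else var_id,
            "ref": ref,
            "alt": alt.split(","),  # multiple alternate alleles are comma-separated
        })
    return variants


if __name__ == "__main__":
    # Hypothetical record, made up for this example.
    sample = (
        "##fileformat=VCFv4.2\n"
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
        "chr1\t12345\t.\tA\tG\t50\tPASS\t.\n"
    )
    print(json.dumps(parse_vcf(sample), indent=2))
```

The chromosome and position recovered this way are the kind of key a lookup against a variant database would need.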

Develop and test

During development and test, developers didn’t have access to HPE Spaceborne Computer-2 or the HPE ground station, so they recreated those environments on Azure, relying on GitHub Codespaces to further increase their velocity. They packaged both the ISS and ground station environments into an Azure Resource Manager (ARM) template, which simulates the latency between the ISS and the ground station by deploying the Spaceborne Computer-2 environment to an Azure data center in Australia and the ground station environment to one in Virginia.

The results

On August 12, 2021, the 120MB payload containing the experiment developed by Azure Space was uploaded to the ISS and run on Spaceborne Computer-2. The experiment is configurable, so Azure Space was able to run “test”, “small”, and “medium” scenarios, in that order.

Table 1 shows the results of the experiment in terms of processing times and data volumes:


                          Test          Small         Medium
Raw data examined
Downloaded to Earth
Run time on ISS           20 seconds    2 minutes     1 hour
Download time from ISS    <1 second     2 seconds     17 seconds

The experiment’s successful completion, and the data collected through it, is proof of how an edge-to-cloud computing workflow can be used to support high-value use cases aboard the ISS that might otherwise be impossible due to compute and bandwidth constraints. Without preprocessing the simulated output of the gene sequencer on the ISS to filter out only the gene mutations, 150 times as much data would need to be downloaded to Earth. Thus, a 200GB raw full human genome read, which would require over two years to download given bandwidth and downlink window constraints, can be filtered to 1.5GB, which can be transmitted in just over an hour. Microsoft expects planned tests to further increase this ratio.
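
The downlink arithmetic can be checked with a short calculation. This is a back-of-the-envelope sketch that assumes decimal units (1 GB = 10^9 bytes, 1 kilobyte = 1,000 bytes), a sustained 250 kilobytes per second for the whole two-hour weekly window, and no protocol overhead. The function names are illustrative only.

```python
BYTES_PER_GB = 10**9            # assumption: decimal gigabytes
RATE_BYTES_PER_SEC = 250_000    # 250 kilobytes per second (decimal)
WINDOW_SEC_PER_WEEK = 2 * 3600  # two hours of downlink per week


def transfer_seconds(size_gb):
    """Seconds of active downlink needed to send `size_gb` gigabytes."""
    return size_gb * BYTES_PER_GB / RATE_BYTES_PER_SEC


def calendar_weeks(size_gb):
    """Weeks of elapsed time, given only one downlink window per week."""
    return transfer_seconds(size_gb) / WINDOW_SEC_PER_WEEK


if __name__ == "__main__":
    for label, size_gb in [("raw genome read", 200), ("filtered output", 1.5)]:
        hours = transfer_seconds(size_gb) / 3600
        weeks = calendar_weeks(size_gb)
        print(f"{label}: {hours:.1f} h of downlink, {weeks:.1f} weekly windows")
```

Under these assumptions the 200GB read needs roughly 222 hours of downlink time, or about 111 weekly windows (a little over two years), while the 1.5GB result fits inside a single two-hour window.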

Similarly, attempting to perform on the ISS all of the processing that’s being done on Azure would require uploading a copy of the full reference human genome and a copy of the full dbSNP database. To complicate matters, the dbSNP database is constantly being updated and peer-reviewed by scientists across the globe, meaning that regular synchronization would be required to maintain a useful copy in space.

Build cloud applications productively, anywhere

From a software development perspective, the developer velocity with which Azure Space delivered the experiment is as impressive as its results, with all components delivered over a three-day period using serverless Azure Functions written in Python and best-in-class developer tools such as Visual Studio Code and GitHub. To support the development of more experiments by others, Weinstein’s team at Azure Space plans to publish the Resource Manager templates containing the simulated ISS and ground station environments they used for development and test.

Making such capabilities available to others is just one early step for Azure Space, a new vertical within Microsoft that was publicly announced about a year ago. Its mission is twofold: to enable organizations that build, launch, and operate spacecraft and satellites, and to “democratize the benefits of space” by enabling more opportunities for all actors, large and small, in much the same way that support for open source on Azure has helped democratize cloud computing. One such example is Azure Orbital, a ground station as-a-service that provides communication and control for satellite operators, including the ability to easily process satellite data at cloud scale.


Today, HPE’s Spaceborne Computer-2 is performing its mission of enabling “edge-to-cloud” experiments and proofs of concept (POCs). Initially, these are fully isolated from the operational systems that astronauts and NASA depend on to operate the ISS itself.

As humankind continues to push farther into space, having real compute power at the edge will become more and more important. Use cases will increase, and developers who employ compute-intensive data science, AI, and machine learning tools will always be pushing the boundaries of the compute resources they have available to build the next generation of cloud-native applications. The edge-to-cloud compute model is one solution to such challenges, and the genomics experiment by Azure Space validates how that model can help “democratize space” for the benefit of all. As one of the developers on the Azure Space team put it, “Anyone could adapt what we built to their own needs because it’s all open source, based on tools and technologies that any high school programming class could download and use.”

Learn more about open-source development on Azure.

