Jan 13, 2023
10 min

📤 Offloading data with poor connectivity

A walk-through on the challenges and strategies of offloading robotics data under a poor connection.

Once data is collected to disk on a robot (read our post last month on how this works), it must be offloaded (i.e. moved from the robot) to be put to use.

Offloaded data can be used to calculate performance analytics, build machine learning datasets, simulate edge-case scenarios, and more.

Poor connectivity is the norm for robotics applications, though, which makes offloading a challenge. From remote farms to warehouses with weak Wi-Fi, a good connection is far from guaranteed.

The sheer volume of robotics data (often terabytes per day) is too large to simply upload in full over the connections available in remote areas.

We’ll dive deeper into the challenges of connectivity in varying environments and strategies to successfully offload data.

Twenty new Starlink satellites about to be deployed.

A brief look back at data collection

At Woeden, we like to think of data collection and offload as separate tasks with their own challenges.

Data collection is the process of identifying data worth recording and actually recording it. We went into depth last month on various data collection strategies (see here).

Data offload is the process of moving data off of a robot and into cloud storage or some other storage medium.

Numerous strategies exist for collecting data, including event-based and on-demand paradigms as well as enhancements like rolling buffers. These methods are essential to employ because modern networks lack the bandwidth to stream all this robotics data.

While an intelligent data collection strategy can save disk space, ultimately a powerful offload system must also be implemented to actually free up the disk space for future use.

The high-level offload mechanisms

It is common to see data offloaded from a fleet in one of two ways:

  1. Direct upload from the robot’s onboard storage
  2. Indirect upload via detachable storage devices

Both of these strategies have their own advantages and disadvantages, so let’s dive into the details!

Application-based offload challenges

One of the simplest ways of characterizing robotics applications is identifying whether the robots are primarily deployed indoors or outdoors.

Indoor robots may be used in applications like manufacturing, warehousing, and food preparation. Robots deployed indoors can have complex connectivity issues, including:

  • Facility lacks access to a high speed internet connection or enforces a bandwidth limit
  • The robot lacks a high bandwidth connection (e.g. poor Wi-Fi in a large indoor space)
  • The robot’s bandwidth use has a negative impact on other internet users

They also tend to have physical access barriers that make direct hardware offload challenging. Food preparation robots, for example, tend to be inaccessible once deployed because of sanitation requirements, which makes it difficult to simply fetch the drive with the recorded data.

Outdoor robots are used in applications such as agriculture, robo-taxiing, and defense. It can be a challenge to offload data from robots in these settings, because cellular connectivity can be limited or nonexistent and drives may need to be shipped hundreds to thousands of miles.

ABB robot deployed in a sterile food processing plant.

Direct offload from a robot

Let’s take a brief look at the various strategies you may employ to actually upload the data and what transmission mediums are available to you.

Upload strategies

Direct offload of data from a robot may take one of three forms.

Brute force upload of all data

This method is as simple as it sounds: no strategy for uploading portions of the data. Just upload it all!

This is usually only achievable for companies with lower data volumes and high budgets for data storage, and it usually involves a hardwired uplink with a high bandwidth internet connection. You might see this in an R&D lab setting or when an autonomous car returns to a garage.

This approach is easy and requires little engineering effort. Once you offload data from your robot, you can send it right back into the field. But it’s rare that you will collect so little data or a network will be so easily available.

Selective upload of specific data

It’s often undesirable to offload all collected data from robots due to large transmission and cloud storage costs.

An engineer may opt to selectively offload a subset of the collected data. The selection may be manual, automatic, or both.

Manual review usually involves previewing the data, for example as a GIF, and using visualization or monitoring tools.

Automated review can help for events that are known in advance to be important. If a serious error or system fault occurs – like an emergency stop button push or an autonomous car disengagement – this data could be uploaded immediately.
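Automated selection can be sketched in a few lines, assuming each recording carries a set of event tags. The `Recording` type, the tag names, and `CRITICAL_TAGS` are hypothetical and only illustrate the idea:

```python
from dataclasses import dataclass, field

# Hypothetical tag names; a real system would define its own event vocabulary.
CRITICAL_TAGS = {"emergency_stop", "disengagement"}


@dataclass
class Recording:
    path: str
    size_bytes: int
    tags: set = field(default_factory=set)


def select_for_upload(recordings, critical_tags=CRITICAL_TAGS):
    """Return recordings with at least one critical tag, smallest first."""
    flagged = [r for r in recordings if r.tags & critical_tags]
    # Uploading small recordings first gets more events out over a weak link.
    return sorted(flagged, key=lambda r: r.size_bytes)
```

Sorting by size is just one possible priority. Ordering by severity or by recording time may suit your application better.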

Trickle stream of important data

In settings with limited bandwidth, the data can be broken into chunks that are uploaded when the time is right. This is often the best way to offload from a robot that operates for long periods of time in areas with poor connectivity.

It is valuable to have a bandwidth-aware upload process since unrestricted offload could:

  • interfere with streaming critical live data from other parts of the system
  • upset your customer by using too much of their bandwidth
  • incur additional fees from your cellular or satellite provider

Event tags can be used to determine what data should be offloaded automatically.

This strategy increases data availability and ensures your team has access to all critical data as soon as possible.
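To make the idea concrete, here is a minimal sketch of a bandwidth-aware chunked upload. It is not how any particular product works. The `send` callback is a hypothetical stand-in for whatever actually transmits a chunk, and the rate cap is a simple average over the transfer:

```python
import time


def trickle_upload(path, send, max_bytes_per_s, chunk_size=64 * 1024,
                   start_offset=0, sleep=time.sleep, clock=time.monotonic):
    """Send `path` in chunks, pausing so the average rate stays under the cap.

    `send(offset, data)` is a hypothetical callback that transmits one chunk
    and raises OSError on failure. The returned offset can be persisted and
    passed back as `start_offset` to resume after a dropped connection.
    """
    offset = start_offset
    started = clock()
    sent = 0
    with open(path, "rb") as f:
        f.seek(offset)
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            try:
                send(offset, chunk)
            except OSError:
                return offset  # connection dropped; resume from here later
            offset += len(chunk)
            sent += len(chunk)
            # Sleep until elapsed time matches the budget for the bytes sent.
            delay = sent / max_bytes_per_s - (clock() - started)
            if delay > 0:
                sleep(delay)
    return offset
```

Because the function returns the offset it reached, an interrupted upload costs at most one chunk of repeated work when it resumes.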

Verizon cellular towers.

Transmission mediums

A number of different options are available for transmitting the data.

  • Cellular / LTE is a convenient option for both indoor and outdoor robots where coverage is available. Verizon’s M2M plans allow you to upload up to 10 GB on each robot per month for just $80/mo.
  • While satellite is available anywhere on Earth with a view of the sky, this option can get incredibly expensive fast. Not only do Iridium’s cards cost well over a thousand dollars per unit, but you will also be charged per 1,000 bytes. It is critical to be intelligent about what data is transmitted over satellite.
  • Wi-Fi may be an always-available option for indoor robots and an occasionally available option for outdoor robots. If you have access to this, consider yourself lucky, because the only cost to you will be a $30 Wi-Fi card for your robot’s computer.
  • Hardwired ethernet is usually only an option for fixed installations, such as robotic arms for manufacturing. The cost to install the ethernet line from a router to your actual robot may not be small though, and there’s a possibility the property owner won’t allow it.
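The cellular figures above can be turned into a rough budget. This sketch assumes a 30-day month and decimal gigabytes. Only the 10 GB cap and the $80 price come from the plan described above:

```python
def sustained_rate_bytes_per_s(cap_bytes, days=30):
    """Average upload rate that exactly uses up a monthly data cap."""
    return cap_bytes / (days * 24 * 60 * 60)


def cost_per_gb(monthly_price, cap_gb):
    """Price of each gigabyte when the whole cap is used."""
    return monthly_price / cap_gb


cap_gb = 10   # monthly cap from the plan described above
price = 80    # dollars per month, from the plan described above
rate = sustained_rate_bytes_per_s(cap_gb * 10**9)
print(f"{cost_per_gb(price, cap_gb):.2f} dollars per GB")
print(f"{rate / 1000:.2f} kB/s sustained")
```

That works out to $8 per GB and a sustained rate just under 4 kB/s, which is why a cap like this suits lightweight data rather than raw sensor streams.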

There are a number of other approaches that may make sense for your application, including combined approaches where lightweight data is uploaded wirelessly and heavyweight sensor data is uploaded via a trickle stream or by direct connection. We expect Starlink, especially the RV variety, to make accessing data from your robots much easier in the future!

Indirect offload from a detached drive

There are two major strategies for offloading data from a detached drive. Both of these strategies enable you to collect enormous amounts of data, so you will need to be selective about what data to keep after it’s uploaded to the cloud to regulate costs.

The workstation

You may be fortunate enough to have a parking or storage facility for robots when they have completed a session. This may look like a garage for self-driving cars or autonomous tractors.

Many mobile robots have a “mission” or “session” oriented deployment model, where the robot is in the field for a limited amount of time and then returns to “home base”. It’s common for a robot operator to manually remove storage devices from robots and connect them to a drive dock. Then the data can be automatically uploaded.

Shipping drives

The saying goes: “Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway.” Even the fastest hardwired internet connections can be slower than shipping hard drives.
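A quick back-of-the-envelope comparison shows why. The payload size, transit time, and link speed below are assumed figures chosen for illustration, and the sketch ignores the time needed to copy data onto and off the drives:

```python
def upload_hours(data_bytes, link_bits_per_s):
    """Hours to push `data_bytes` through a link at a sustained bit rate."""
    return data_bytes * 8 / link_bits_per_s / 3600


def shipping_bits_per_s(data_bytes, transit_hours):
    """Effective throughput of drives that spend `transit_hours` in transit."""
    return data_bytes * 8 / (transit_hours * 3600)


TB = 10**12
# Assumed figures: 100 TB of drives, 48 hours door to door,
# compared against a sustained 1 Gbit/s uplink.
payload = 100 * TB
print(f"upload: {upload_hours(payload, 10**9):.0f} h")
print(f"shipping: {shipping_bits_per_s(payload, 48) / 10**9:.1f} Gbit/s effective")
```

Under these assumptions the upload takes roughly 222 hours, while the shipped drives deliver an effective 4.6 Gbit/s.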

For robotics applications involving long deployments in extremely remote settings, like marine robotics or defense, the best approach may be to ship detached drives using a service like AWS Snowcone.

Alternatively, you may roll your own shipping infrastructure and collect data on a NAS, which is then shipped long distances, potentially even around the world, and offloaded en masse later.

This strategy allows you to upload high fidelity data without much of a software engineering effort. However, it tends to require significant physical and operational infrastructure that may be unrealistic for your business.

Begin offloading robotics data

We’ll walk you through a few options to begin offloading data from your robots, starting from easiest to most difficult.

Simply copy the data

Perhaps the simplest approach to copying data off your robot is to use tools like ssh, scp, rsync, etc. to copy the data directly. You can then inspect your data in your local machine’s filesystem in the normal way.

While this approach is simple to implement and easy to understand, it does require that a connection is maintained throughout the entire upload.
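As a minimal sketch of the rsync route, the helper below builds and runs an rsync command from Python. The function names, host, and paths are hypothetical. The flags are standard rsync options: `--partial` keeps partially transferred files so a retry does not have to start those files from scratch, which softens (but does not remove) the dependence on a stable connection.

```python
import subprocess


def build_rsync_command(source_dir, destination, bwlimit=None):
    """Build an rsync command line for copying recordings off a robot.

    -a          archive mode (recurse, preserve times and permissions)
    --partial   keep partially transferred files so a retry can reuse them
    --bwlimit   optional rate cap, in units of 1024 bytes per second
    """
    cmd = ["rsync", "-a", "--partial"]
    if bwlimit is not None:
        cmd.append(f"--bwlimit={bwlimit}")
    # A trailing slash on the source copies its contents, not the directory itself.
    cmd += [source_dir.rstrip("/") + "/", destination]
    return cmd


def copy_recordings(source_dir, destination, bwlimit=None):
    """Run rsync and return True when it exits successfully."""
    result = subprocess.run(build_rsync_command(source_dir, destination, bwlimit))
    return result.returncode == 0
```

For example, `copy_recordings("robot@192.0.2.10:/data/recordings", "./recordings/", bwlimit=500)` would pull recordings from a robot at a placeholder address, capped at about 500 KiB/s.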

It can also make it hard to share data between colleagues since the data is often stored on a single machine. This approach also lacks a unified web interface to browse and preview the data from your browser, making it much more difficult to effectively manage and share data.

Mount buckets as drives

Another solution is to mount a cloud storage bucket as a drive on your robot (or local computer if you have a detached disk), so that you can copy data directly to it.

This makes it extremely easy to move files and get data onto a number of machines, and you can share references to logfiles via object keys in storage buckets. However, bucket mounting can be unstable under flaky internet connections, and fairly slow to upload and download.

Similar to the above, you’ll lack access to a unified web interface to effectively manage and share data. Read on here for mounting buckets as drives.

Auto-upload on reconnect

The last option is to build a system that continually checks the status of the network connection and uploads any newly recorded data as soon as a connection to the internet is re-established.

Take care with interrupted transfers, though: data is often lost or corrupted when an established connection drops.

This can be a challenging system to maintain and similarly does not offer a unified web interface to browse data.
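A minimal sketch of such a system might look like the following. The `upload` callback, the JSON manifest, and the probe address are all assumptions made for illustration. In practice you would probe your own upload endpoint rather than a public DNS server, and keep the manifest outside the data directory.

```python
import json
import os
import socket


def is_online(host="8.8.8.8", port=53, timeout=3.0):
    """Best-effort connectivity probe: can we open a TCP connection?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def load_manifest(manifest_path):
    """Read the set of already-uploaded file names, or an empty set."""
    try:
        with open(manifest_path) as f:
            return set(json.load(f))
    except (FileNotFoundError, json.JSONDecodeError):
        return set()


def upload_pending(data_dir, manifest_path, upload, online=is_online):
    """Upload files not yet in the manifest; record each only after success."""
    done = load_manifest(manifest_path)
    for name in sorted(os.listdir(data_dir)):
        if name in done:
            continue
        if not online():
            break  # no connection; try again on the next pass
        try:
            upload(os.path.join(data_dir, name))
        except OSError:
            break  # connection dropped mid-upload; retry on the next pass
        done.add(name)
        # Write the manifest atomically so a crash cannot leave it half-written.
        tmp = manifest_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(sorted(done), f)
        os.replace(tmp, manifest_path)
    return done
```

Calling `upload_pending` on a timer (say, every 30 seconds) gives the reconnect behavior. Recording a file in the manifest only after its upload succeeds means an interrupted file is simply retried on the next pass.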

Use Woeden to offload remote data

An offload system like this is a lot to build in-house, especially if you wish to benefit from more advanced functionality such as a trickle stream.

If you follow our guide here, all you need to do is install our agent on your robot. It only takes five minutes.

Woeden offers the ability to preview and offload recordings directly from your robot and via a detached disk.

You can manually select recordings of interest and notify your robot to begin uploading them.

We upload a small GIF with each recording for you to preview, along with metadata such as the duration, messages collected, etc.

Our infrastructure is aware of various network constraints. Spotty connections are expected, so we keep track of data that’s in the process of being uploaded even if a connection is lost.

Paired with our data collection capabilities, you can quickly begin building the ultimate database for your robotics data.

Our dashboard displaying a simple breakdown of offloaded data.

Overview

It’s not easy to offload data from your fleet. The amount of data is enormous, and a plethora of network limitations exist that make it a challenge.

Our approach immediately provides you with a number of critical data offload features, such as trickle streams, selective offload, and visual previews.

Get started with us by following our guide here.

