Edge computing is defined by a wide diversity of use cases.
To illustrate, the 2025 Open Source Summit brought together two edge experts with experience in two vastly different types of edge systems to explain some rules they’ve learned for building an edge computing system.
One was from Chick-fil-A, a popular chicken sandwich chain, which runs a Kubernetes-based edge computing platform for its 3,000+ outlets across the U.S. The other was from the U.S. Air Force, whose duties include delivering connectivity and associated core services to remote parts of the world.
“The edge all depends on who you need to serve, how many people you need to serve, what you’re serving,” explained Michael Henry, chief of information technology for the Secretariat of the Air Force, studies and analysis.
Edge is all about bringing “compute to where the action is,” further explained Brian Chambers, chief architect for Chick-fil-A.
The two speakers are also members of Edge Monsters, a group of edge architects who meet regularly to discuss best practices around this unique form of computing.
Chick-fil-A’s tech stack, built largely on open source.
The Edge Is Defined by Constraints
Henry said, “The edge is something you need to live in, in order to deploy it.” Although there are commercial and open source solutions to help build a stack, each edge system is unique, defined by both its constraints and its requirements.
Chick-fil-A was driven to build an edge platform to support the heavy business some of its stores were seeing, which was leading to unnecessary capacity and logistics challenges.
At least the restaurants stayed put.
The Air Force’s edge systems, by contrast, have to be deployable anywhere in the world at a moment’s notice.
“We need to be able to pack it up, start it up and get our network connectivity and our core services up and running at a moment’s notice,” Henry said.
The two organizations’ operations are different in other ways as well. The Air Force has a lot of legacy equipment to support, whereas Chick-fil-A makes it a point to standardize the hardware across all its locations.
Nonetheless, there are commonalities across most edge systems.
The chief characteristic of an edge system is that it has pretty serious constraints of one type or another. Unlike the cloud, with its unlimited scalability (in theory), the edge is practically defined by constraints: physical constraints, network constraints and power constraints.
Edge locations may not have on-site tech support. Space is usually at a premium at most locations. So is power. And then there are financial constraints. One solution may work well for 1 or 10 locations, but does it get unduly expensive for 1,000 or 100,000 locations?
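The scaling question can be made concrete with a back-of-the-envelope calculation. The sketch below assumes cost grows linearly per site; the dollar figures and the `fleet_cost` helper are hypothetical placeholders for illustration, not numbers from the talk.

```python
def fleet_cost(locations: int, hardware_per_site: float,
               monthly_ops_per_site: float, months: int = 36) -> float:
    """Rough total cost of a fleet over `months`, assuming linear per-site cost."""
    return locations * (hardware_per_site + monthly_ops_per_site * months)


# Hypothetical figures: $1,500 of hardware and $40 per month of operations per site.
for sites in (1, 10, 1_000, 100_000):
    total = fleet_cost(sites, 1_500.0, 40.0)
    print(f"{sites:>7,} sites: ${total:>15,.0f}")
```

A per-site cost that looks trivial for a pilot is multiplied by every location in the fleet, which is why a design that is fine at 10 sites may not be at 100,000.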
Different systems also have differing requirements.
The Air Force, for instance, requires multiple levels of connectivity for its remote outposts. It also has some “deep security concerns,” in terms of preventing hostile actors from compromising the system, Henry said.
The requirement is to “deploy the edge such that we know what is running on this box. And there is absolutely nothing that’s unintentional being able to push patches to this thing,” he said, explaining that the server itself is probably an unmanned relay station “zip-tied” to a tree or power line somewhere.
Cluster configuration is also a huge deal for the military service. The armed service needs to push an update to an edge node and have that go live across the fleet.
All of which comes down to one thing, according to the duo: You own the entire stack.
6 Principles for Building Edge Systems
In their keynote talk, the engineering duo shared what they called “six principles for crushing it at the edge.”
1. Do the Differentiated Heavy Lifting
When building an edge system, “No one is going to do the work for you,” Henry said. There may be some “turnkey solutions” that can help along the way, but in the end, the system you design will be unique to your operating characteristics.
What will the infrastructure be? How will it “fan out” across all the end nodes? What are your end points? How are you booting? What is your content delivery network?
Also, keep in mind, you can’t bring every service from the cloud over to the edge: They won’t fit.
2. Build a ‘North Star’ Edge Controller
A considerable chunk of your time will be dedicated to sorting out the edge controller.
“There are just so many edge controllers out there that you can play with. But they all have their quirks. They all have their problems. Maybe it’s this Raspberry Pi 5, but you got the wrong board, or, like, the board itself doesn’t really work for you exactly right. Am I going to power it correctly?” Henry said.
You have to figure out how to “kick-start” the controller if it goes out, especially if the engineering team is 3,000 miles away. A microSD card can be used to reimage a system, which saves the cost and time of sending an engineer to the remote location; a local operator can do the reboot.
“Because if you have thousands of these things, getting a pre-burn-in contract so that you delay it gives you a supply chain problem,” Henry said.
3. Use Declarative Infrastructure
“If it’s not declarative, it doesn’t exist,” Chambers said. One rule of Chick-fil-A is that all deployments must follow the practice of Infrastructure as Code (IaC), where all software configuration is predefined so it can be rolled out automatically.
Chambers knew from the beginning that, given the number of Chick-fil-A outlets, IaC would be necessary. Once developers started to make changes to their applications, deploying that software had to be done in a uniform and documented manner, lest each system “drift” into its own unique configuration, making the fleet harder to manage as a whole.
Declarative infra allows for versioning: Managers can see which stores have which versions of the software. It allows for rolling back of changes if a fix or upgrade doesn’t quite work as planned.
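A minimal sketch of what that versioning buys you, assuming a simple inventory that maps each store to the version it runs. The store IDs, version strings and function names here are hypothetical, not Chick-fil-A’s actual tooling.

```python
from collections import defaultdict

# Hypothetical inventory: store ID -> deployed version of one application.
deployed = {
    "store-0001": "1.4.2",
    "store-0002": "1.4.2",
    "store-0003": "1.3.9",
}


def stores_by_version(inventory: dict) -> dict:
    """Group store IDs by the version they are running."""
    groups = defaultdict(list)
    for store, version in sorted(inventory.items()):
        groups[version].append(store)
    return dict(groups)


def drifted(inventory: dict, desired: str) -> list:
    """Return the stores whose version differs from the declared one."""
    return sorted(s for s, v in inventory.items() if v != desired)


def rollback(history: list) -> str:
    """Return the version declared just before the current one."""
    if len(history) < 2:
        raise ValueError("no earlier version to roll back to")
    return history[-2]


print(stores_by_version(deployed))
print(drifted(deployed, "1.4.2"))
print(rollback(["1.3.9", "1.4.2"]))
```

Because every change is a declared version rather than a manual tweak, both the fleet-wide view and the rollback target fall out of the recorded history.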
Chick-fil-A has used Git as its repository and version control system for the past several years. Chambers said, “Git made a lot of sense for us,” given its cleanly defined API. Since Chick-fil-A uses SUSE’s K3s Kubernetes distribution, its Kubernetes manifest files are stored in Git.
Every location gets its own repo.
“It’s fairly simple to build and maintain a solution that has a bunch of Git repos,” Chambers said. It “assigns something location-specific to an agent that lives at the edge that pulls that stuff down and applies changes, and then closes the loop back via an API.”
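The pull-apply-report loop Chambers describes can be sketched roughly as follows. This is an illustration under stated assumptions, not Chick-fil-A’s agent: plain dictionaries stand in for the contents of a location’s Git repo and for the cluster’s running state, and the `reconcile` function and `Report` type are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Report:
    """What the agent would send back through the API to close the loop."""
    location: str
    applied: list = field(default_factory=list)
    removed: list = field(default_factory=list)


def reconcile(location: str, desired: dict, actual: dict) -> Report:
    """Make `actual` match `desired` in place and describe what changed."""
    report = Report(location)
    # Apply anything that is new or whose declared spec has changed.
    for name, spec in sorted(desired.items()):
        if actual.get(name) != spec:
            actual[name] = spec
            report.applied.append(name)
    # Remove anything that is running but no longer declared.
    for name in sorted(set(actual) - set(desired)):
        del actual[name]
        report.removed.append(name)
    return report


# Hypothetical state for one store.
desired = {"pos": "image: pos:2.1", "kiosk": "image: kiosk:1.0"}
actual = {"pos": "image: pos:2.0", "old-app": "image: old:0.9"}

report = reconcile("store-0001", desired, actual)
print(report)
```

Running the same reconciliation twice changes nothing the second time, which is the property that keeps a pull-based agent safe to run on a schedule.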
4. Secure for a Hostile Environment
Unlike data center computing, your edge device may be in an unfriendly environment, and so you must take extra precautions to guard against breaches.
If necessary, you need a way to kill and erase the contents of a device, through Trusted Platform Module (TPM)- or Hardware Security Module (HSM)-based devices, Henry suggested.
Most importantly, you have to figure out how to send out the kill signal “without making it a vulnerability,” Henry said. This should also be done in a way that doesn’t brick the device itself, unless you want to be traveling to some remote location to install replacement units.
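One common way to keep a remote command from becoming an attack vector is to authenticate it and reject replays. The sketch below is an assumption about how that could look, not the Air Force’s method: it uses an HMAC over a shared key held in an ordinary variable, whereas a real device would keep the key in a TPM or HSM and would likely use asymmetric signatures. All names, and the five-minute freshness window, are hypothetical.

```python
import hashlib
import hmac
import json
import time
from typing import Optional

MAX_AGE_SECONDS = 300  # assumed freshness window for a command


def sign_command(key: bytes, device_id: str, action: str,
                 nonce: str, issued_at: float) -> dict:
    """Build a command message carrying an HMAC-SHA256 tag over its payload."""
    payload = json.dumps(
        {"device_id": device_id, "action": action,
         "nonce": nonce, "issued_at": issued_at},
        sort_keys=True,
    )
    tag = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def verify_command(key: bytes, device_id: str, message: dict,
                   seen_nonces: set, now: Optional[float] = None) -> bool:
    """Accept only an authentic, fresh, not-yet-seen command for this device."""
    now = time.time() if now is None else now
    expected = hmac.new(key, message["payload"].encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the tag matched.
    if not hmac.compare_digest(expected.encode(), str(message["tag"]).encode()):
        return False
    body = json.loads(message["payload"])
    if body["device_id"] != device_id:
        return False  # command was signed for a different device
    if abs(now - body["issued_at"]) > MAX_AGE_SECONDS:
        return False  # too old, or timestamped too far in the future
    if body["nonce"] in seen_nonces:
        return False  # replay of a command already handled
    seen_nonces.add(body["nonce"])
    return True
```

The checks run before any destructive action, so a forged, stale or replayed message is simply dropped rather than wiping the device.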
5. Use Telemetry Sparingly
Telemetry is vital for understanding your remote system, but it can also be a bandwidth hog.
Chick-fil-A had an incident where the credit card payment system stopped working across about 100 stores, Chambers recalled. The culprit was edge applications that were “consuming all of the available bandwidth,” by just updating their logs, he shared.
So, when designing the telemetry for your remote system, you have to figure out what data you truly need, and what the network constraints are for sending that data.
During routine operations, when everything is running fine, you probably won’t need all the logging data.
Chick-fil-A has a number of apps that run on a store server, as well as a number of edge-connected devices, such as fryers. It collects telemetry data for all of these resources and sends it back to the cloud, where it is dispersed to the appropriate management apps in the cloud.
“Our edge is a platform for many different software engineering teams that build different applications, and so they may have their own preferences about what monitoring tools they use to troubleshoot their app, whether it’s AWS’ CloudWatch or Datadog, Grafana or Splunk,” Chambers said.
Use tools that can help deliver the data only when you need it. Datadog’s open source Vector tool helped Chick-fil-A pull detailed data only when it was needed, such as during a short debugging session. The app team can, for instance, get all the data for a 30-minute time window and switch it off afterward.
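The time-boxed idea can be sketched independently of any particular tool. The class below is a hypothetical illustration, not Vector’s configuration or API: it forwards only warnings and errors by default, and forwards everything while a debugging window is open.

```python
from dataclasses import dataclass

LEVELS = {"DEBUG": 10, "INFO": 20, "WARNING": 30, "ERROR": 40}


@dataclass
class TelemetryGate:
    """Decides which log records leave the store (hypothetical helper)."""
    baseline: str = "WARNING"   # minimum level forwarded in normal operation
    verbose_until: float = 0.0  # timestamp at which the debug window closes

    def open_window(self, now: float, minutes: float = 30) -> None:
        """Forward everything for the next `minutes` minutes."""
        self.verbose_until = now + minutes * 60

    def should_forward(self, level: str, now: float) -> bool:
        if now < self.verbose_until:
            return True  # debug window is open: send all levels
        return LEVELS[level] >= LEVELS[self.baseline]


gate = TelemetryGate()
print(gate.should_forward("DEBUG", now=0.0))     # quiet by default
gate.open_window(now=0.0, minutes=30)
print(gate.should_forward("DEBUG", now=600.0))   # inside the window
print(gate.should_forward("DEBUG", now=1800.0))  # window has closed
```

Because the window expires on its own, nobody has to remember to turn verbose logging back off.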
“We love our logs in the Air Force,” Henry agreed. “Our logs completely consume things.”
But sending everything back to a data center can be expensive from remote locations.
“You need to push things back across the wire as minimally as possible, but yet still understand the operation,” Henry said.
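One way to enforce a hard ceiling on telemetry traffic is a token bucket, a standard rate-limiting technique. Neither speaker named it; the sketch below is an assumption about how a byte budget could be applied, with hypothetical names and numbers.

```python
class ByteBudget:
    """Token bucket that caps how many telemetry bytes may be sent over time."""

    def __init__(self, bytes_per_second: float, burst_bytes: float, now: float = 0.0):
        self.rate = bytes_per_second
        self.capacity = burst_bytes
        self.tokens = burst_bytes  # start with a full bucket
        self.last = now

    def try_send(self, size: int, now: float) -> bool:
        """Return True and spend budget if `size` bytes fit, else False."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False  # caller can drop, sample or buffer the record locally


# Hypothetical budget: 100 bytes per second with a 500-byte burst allowance.
budget = ByteBudget(bytes_per_second=100, burst_bytes=500)
print(budget.try_send(400, now=0.0))
print(budget.try_send(200, now=0.0))
print(budget.try_send(200, now=1.0))
```

With a cap like this, a chatty application runs out of its own budget instead of crowding out other traffic on the same link.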
6. Sort Your Storage
Edge devices require multiple forms of storage, Henry advised.
Of course, every edge setup has its own needs, so you need to determine what kind of storage is best, be it block, file or object storage. What is your file system, and what requirements does it have? Ceph, for instance, requires multiple disks.
You also need to figure out a disaster recovery plan. You need to answer, “What happens when things go wrong?” Henry said. Does the data need to be retrieved, or can you just swap it out for a new unit?
Joab Jackson is a senior editor for The New Stack, covering cloud native computing and system operations. He has reported on IT infrastructure and development for over 25 years, including stints at IDG and Government Computer News. Before that, he…