Stress Testing Networks

UMKC researcher works to minimize the threat natural disasters and other calamities pose to computer networks.

UMKC researcher Deep Medhi is working at the leading edge of an increasingly critical field: understanding how electronic communication networks operate — and fail — under conditions of extreme stress.

His primary goal: creating computer, cellular and other network infrastructure that can survive, or be quickly restored, after a major natural disaster such as a tornado or earthquake.

The project is not about making sure you can post a selfie while sitting in the rubble. In the 21st century, police, fire, medical, emergency management and transportation management teams all depend on networked systems. If a natural disaster obliterates street signs and familiar landmarks such as buildings and trees, network connectivity is vital for emergency responders to know where they are and where they are going.

At the same time, a natural disaster sends people flying to their smartphones, flooding all platforms as they try to connect with friends and relatives, clogging an already severely taxed system. In the days that follow, insurance and financial systems become critical to recovery and rebuilding.

“As a researcher, I sit at the crossroads between science and engineering,” Medhi says. “From a science perspective, I want to understand if the current computer networks work well under a number of situations, and what could deter them from working well. As an engineering researcher, I want to see if we can devise new methods or solutions to improve the current situation. What can we do to make computer networks more resilient or self-healing if a failure event occurs?”

Step one is figuring out how to measure the causes and impact of network failures, which will make it possible to construct mathematical models and algorithms for testing different approaches.

Such modeling is critical to developing solutions. Researchers can’t wait for another tornado to test each theoretical approach. Modeling also allows Medhi to factor cost-benefit analysis into his calculations. He’s looking for real-world solutions and recognizes that a cost-prohibitive solution is no solution at all.

One promising approach is to prioritize network traffic based on the source — giving first responders priority access during emergencies.
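To make the idea concrete, here is a minimal sketch of source-based prioritization. It is an illustration only, not Medhi's method: the class names, the priority values and the `serve` function are all hypothetical, and a real network would enforce priority in its switching and signaling equipment rather than in a single queue.

```python
import heapq
from itertools import count

# Hypothetical priority classes (an assumption for illustration);
# a lower number is served first.
PRIORITY = {"first_responder": 0, "public": 1}

def serve(requests, capacity):
    """Return the labels of requests served when only `capacity` fit.

    `requests` is a list of (source_class, label) pairs in arrival order.
    """
    order = count()  # tie-breaker that preserves arrival order within a class
    heap = [(PRIORITY[src], next(order), label) for src, label in requests]
    heapq.heapify(heap)
    served = []
    while heap and len(served) < capacity:
        _, _, label = heapq.heappop(heap)
        served.append(label)
    return served

arrivals = [
    ("public", "call-1"),
    ("first_responder", "dispatch-1"),
    ("public", "call-2"),
    ("first_responder", "dispatch-2"),
]
print(serve(arrivals, capacity=3))
# prints ['dispatch-1', 'dispatch-2', 'call-1']
```

With room for only three of the four requests, both first-responder requests get through and one public call has to wait.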

“There was an earthquake in Los Angeles causing physical damage to the communication network. People found out about the earthquake from CNN and called their friends and families to find out if they were okay. That created a huge surge in traffic to L.A.,” Medhi says. “My work looks at resilient network design. First, redundancy should be built in a cost-effective manner with a goal of providing limited services, especially for prioritized users under duress. Second, we look at ways to create alternate routing options so some traffic can still go through under severe situations.”
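The second point, alternate routing, can be sketched in a few lines. The example below assumes a tiny made-up network of five nodes and uses a plain breadth-first search to find a fewest-hop path; the topology, the node names and the `shortest_path` function are hypothetical, and the sketch ignores link capacity and cost, which any real design would have to weigh.

```python
from collections import deque

def shortest_path(links, src, dst, failed=frozenset()):
    """Breadth-first search for a fewest-hop path, skipping failed links.

    `links` is a list of (node, node) pairs; `failed` is a set of
    frozenset({node, node}) entries naming links that are down.
    Returns a list of nodes, or None if no path survives.
    """
    neighbors = {}
    for a, b in links:
        if frozenset((a, b)) in failed:
            continue
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    previous = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in neighbors.get(node, []):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None

# Hypothetical topology: a short route A-B-D and a longer detour A-C-E-D.
links = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "E"), ("E", "D")]
primary = shortest_path(links, "A", "D")
backup = shortest_path(links, "A", "D", failed={frozenset(("B", "D"))})
print(primary)  # prints ['A', 'B', 'D']
print(backup)   # prints ['A', 'C', 'E', 'D']
```

The redundant links cost more to build, which is why the quote above ties redundancy to cost-effectiveness: the detour exists only because someone paid for the A-C, C-E and E-D links in advance.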

Through modeling, Medhi is also looking for ways to prevent or inhibit “butterfly effects” — local failures that spread repercussions through far-flung systems. A 2008 Boston Globe article described the “butterfly effect” as “the concept that small events can have large, widespread consequences.” The name stems from former MIT meteorologist Edward Lorenz’s suggestion that “a massive storm might have its roots in the faraway flapping of a tiny butterfly’s wings.”

“Consider an undersea fiber optic cable cut somewhere in the Mediterranean,” Medhi says. “This can potentially affect all call center traffic between workers in India and customers in the U.S.”

He says a lot of people aren’t even aware that such “butterfly effects” are possible.

“The primary limitation of most current approaches is that failures are considered very local,” he says.

Medhi’s research seeks to identify what new approaches are needed to contain such problems so that the failure does not have a cascading effect on parts of networks that are far away from the failure.

Now, as vast quantities of data are moving to “the cloud,” Medhi is examining vulnerabilities there.

“Most recently I’ve focused on networks behind cloud computing. For massive cloud services, we rely on complex data center networks,” he says. “How do we evaluate, design and evolve data center networks for dynamic traffic and dynamic service requirements? And we need to understand where and how big data fits in the design and analysis of large data center networks.”

Medhi cautions that people should look for slow but steady progress, rather than dramatic breakthroughs, in this type of research.

He has spent more than a quarter of a century studying failures in computer communication networks: first single, isolated failures, and then interdependent ones.

One important factor underlying recovery from a failure is the routing and rerouting behavior of a computer network.

A science problem in computer networks that has intrigued him for many years is whether multipath routing provides substantial benefits. The conventional wisdom in the field has long been that multipath routing is better than single-path routing. That is, if there are multiple ways to go between any two points in a network, it would be better than just one path between the same two points.

Intuitively, that makes sense. But the question Medhi addressed with his research collaborators is: When every pair of points (or nodes or routers) in a network has traffic between them, does it make sense for every pair to resort to multipath routing when they are all contending for the same resources (capacity) in the network at the same time? Medhi recently established a counterintuitive theoretical result that holds for a range of goals, and he quantified the impact for a number of scenarios. In general, as the network size grows, the benefits of multipath routing diminish, and the performance is nearly the same as if everyone were to use single-path routing. The finding matters because it suggests that routers can keep smaller routing tables without compromising efficiency in sending traffic such as web pages.

Lately, Medhi has been fascinated by the problem of quantifying and improving the quality of experience of watching video on the Internet. His research team has developed tools to measure what could make the quality of experience vary for different people watching from different locations. His team has also developed an algorithm for improving the quality of experience of watching movies over the Internet; in measurements, it performed better than the method used by Netflix.

“Proposing a new scheme or algorithm or model is not sufficient; it must be backed by a strong set of studies appropriate for the problem. In the process, it is also important to point out the strengths and weaknesses of a method,” Medhi says. “The vision is important — where we want to go — but innovation requires that we take incremental steps toward the vision.”
