A Common Flaw in UDP Policy Routing in Mainstream Censorship-Circumvention Proxy Software

Introduction

Three years ago I wrote a blog post exploring how V2Ray implements two UDP filtering strategies. Many developers (myself included) found V2Ray’s approach puzzling at the time, but after reading this article, you’ll likely see the thinking behind that unusual design.

A strange kind of policy routing

Many censorship-circumvention proxy programs come with policy routing features that classify traffic based on IP, port, and simple DPI, among other signals. However, many of them (V2Ray excluded) exhibit counterintuitive behavior for UDP policy routing. Rather than picking on any specific project, I’ll go straight to the crux of the issue with a few tables.

Assume we configure the proxy with the following routing rules, which send traffic for different destination IPs to different servers. This is one of the most common uses of policy routing in such software.

Destination IP CIDR	Outbound Tag
1.1.1.1/32	proxy_a
8.8.8.8/32	proxy_b

Now our program creates a socket bound to source_ip:source_port and sends a UDP datagram to 1.1.1.1:53. Unsurprisingly, the proxy forwards this datagram via server proxy_a. The end-to-end state looks like this:

Client Source	Outbound Tag	Server Source
source_ip:source_port	proxy_a	proxy_a_ip:proxy_a_source_port

So far so good. Next, the same socket sends another UDP datagram, this time to 8.8.8.8:5353. What will the proxy do?

A. Forward it via proxy_a and let proxy_a send it out.
B. Forward it via proxy_b and let proxy_b send it out.

The correct answer is A. If you picked B, you’re a victim of these proxies; if you picked A, you’re both a victim and a developer.

Why does this odd behavior happen? To implement Endpoint-Independent Mapping in a simple way, these proxies perform policy routing only once per source_ip:source_port, when the first datagram from that source arrives, and remember the chosen outbound. Subsequent datagrams from the same source reuse that outbound and bypass policy routing entirely. This all but turns policy routing’s access-control capability into window dressing: a user program need only send a single UDP datagram to a chosen host to steer all subsequent datagrams from that socket to a specific proxy server.
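The behavior can be reproduced with a small simulation. This is a minimal sketch, not code from any particular proxy: the class name FlawedUdpNat, the route helper, the "direct" default tag, and the client address 192.0.2.10:40000 are all assumptions made for illustration. Only the two routing rules come from the table above.

```python
import ipaddress

# Routing table from the example above: (destination CIDR, outbound tag).
RULES = [
    (ipaddress.ip_network("1.1.1.1/32"), "proxy_a"),
    (ipaddress.ip_network("8.8.8.8/32"), "proxy_b"),
]


def route(dst_ip: str, default: str = "direct") -> str:
    """Return the outbound tag of the first rule matching dst_ip."""
    addr = ipaddress.ip_address(dst_ip)
    for network, tag in RULES:
        if addr in network:
            return tag
    return default  # hypothetical fallback when no rule matches


class FlawedUdpNat:
    """Hypothetical session table that routes only the first datagram per source."""

    def __init__(self) -> None:
        # (source_ip, source_port) -> outbound tag chosen for the first datagram
        self.sessions: dict[tuple[str, int], str] = {}

    def forward(self, src: tuple[str, int], dst: tuple[str, int]) -> str:
        if src not in self.sessions:
            # Policy routing runs here, once per source address.
            self.sessions[src] = route(dst[0])
        # Later datagrams reuse the stored outbound, whatever their destination.
        return self.sessions[src]


nat = FlawedUdpNat()
client = ("192.0.2.10", 40000)  # illustrative source_ip:source_port
first = nat.forward(client, ("1.1.1.1", 53))
second = nat.forward(client, ("8.8.8.8", 5353))
print(first, second)  # prints: proxy_a proxy_a
```

The second datagram matches the proxy_b rule, yet it leaves through proxy_a because the rule table is never consulted again for that source.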

How to exploit it

Given the flaw above, we can design a probing tool that doesn’t require controlling many servers to map a proxy’s policy routing. Since most users base their UDP policy routing on the destination IP’s country, we can prepare a list of IPs from 200+ countries to “steer” the proxy. Each probe would work like this:

1. Create a new UDP socket.
2. Send any UDP datagram to one IP in the “steering list”.
3. Send a UDP datagram to a server we control, embedding in it the “steering IP” used for this probe.
4. On our server, record the “steering IP” and the proxy server’s IP (as the source IP observed on our server).
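The probe steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not a finished tool: the function names probe and collect_one are made up, the payload format (the steering IP as ASCII text) is an arbitrary choice, and the sketch assumes the probing process's UDP traffic is already being captured by the proxy under test (for example through a transparent or TUN-style setup).

```python
import socket


def probe(steering_addr, collector_addr):
    """Run one probe: send a steering datagram, then report to the collector.

    Both arguments are (ip, port) tuples. The port used for the steering
    datagram is arbitrary; only the destination IP matters to the proxy.
    """
    # A fresh socket gets a fresh source port, so the proxy creates a new mapping.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Steering datagram: any payload will do.
        sock.sendto(b"\x00", steering_addr)
        # Report datagram: embed the steering IP so the collector can correlate.
        sock.sendto(steering_addr[0].encode("ascii"), collector_addr)


def collect_one(sock):
    """Collector side: return (steering_ip, observed_source_ip) for one report."""
    payload, (source_ip, _source_port) = sock.recvfrom(512)
    return payload.decode("ascii"), source_ip
```

On the collector, sock would be a UDP socket bound to a public address. Each (steering_ip, observed_source_ip) pair then reveals which proxy server the user's rules select for that steering IP. UDP is lossy, so a real tool would need retries and timeouts.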

Some users also use these proxies as an access-control mechanism. This flaw allows bypassing that control as well, which poses an even more serious security risk.

Correct examples from SD‑WAN solutions

If you’re familiar with networking, you’ve probably noticed the resemblance between these proxies and SD‑WAN setups with multiple WAN uplinks, where policy routing decides which WAN uplink traffic egresses through. The only difference is that the “uplinks” here are proxy tunnels. In several open-source router/firewall systems I examined, even with one-to-one NAT configured, the policy routing that selects the WAN is always applied before SNAT (masquerade) on that WAN. In other words, policy routing is still enforced strictly, even where doing so breaks Endpoint-Independent Mapping.

Revisiting V2Ray

V2Ray does not suffer from the problem described above.

In its early versions, V2Ray implemented only Address-and-Port-Dependent Mapping, which allowed its policy routing to match precisely on destination IP and destination port. Based on that design, VMess required datagrams with different <source_ip, source_port, destination_ip, destination_port> tuples to be forwarded through different VMess connections. V2Ray introduced policy routing a decade ago, earlier than most mainstream circumvention proxies, and even back then avoided the flaw described here. This speaks to the team’s solid fundamentals and depth of engineering experience. Contrary to what a certain overly self-assured developer has claimed, Victoria Raymond was neither “incompetent” nor “wrong” in designing VMess’s UDP encapsulation.
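To see why keying sessions on the full tuple keeps routing precise, consider this minimal sketch. It is not V2Ray's code: PerFlowSessions, toy_route, and the client address are hypothetical names and values chosen for illustration, and a real proxy would hold one proxy-protocol connection per entry rather than just a tag.

```python
from typing import Callable

Addr = tuple[str, int]  # (ip, port)


class PerFlowSessions:
    """Address-and-Port-Dependent Mapping: one session per (source, destination)."""

    def __init__(self, route: Callable[[Addr], str]) -> None:
        self.route = route
        # (source, destination) -> outbound tag
        self.sessions: dict[tuple[Addr, Addr], str] = {}

    def forward(self, src: Addr, dst: Addr) -> str:
        key = (src, dst)
        if key not in self.sessions:
            # A new destination creates a new session, so routing runs again.
            self.sessions[key] = self.route(dst)
        return self.sessions[key]


def toy_route(dst: Addr) -> str:
    """Hypothetical rule set that matches on destination IP and port."""
    return "proxy_a" if dst == ("1.1.1.1", 53) else "proxy_b"


table = PerFlowSessions(toy_route)
client = ("192.0.2.10", 40000)  # illustrative source_ip:source_port
tags = [
    table.forward(client, ("1.1.1.1", 53)),
    table.forward(client, ("8.8.8.8", 5353)),
]
print(tags)  # prints: ['proxy_a', 'proxy_b']
```

Because the destination is part of the session key, the same socket can never carry a routing decision over from one destination to another.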

When implementing Endpoint-Independent Mapping, V2Ray does not allow destination IP and port to be used as policy-routing criteria. While not ideal, this ensures policy routing always behaves as expected. A possible improvement would be to maintain, for the same source_ip:source_port, a mapping from outbound tag to proxy-protocol connection; then apply policy rules per UDP datagram on top of that, further improving flexibility and control.
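The suggested improvement might look something like the sketch below. It is one possible shape under stated assumptions, not an existing implementation: PerDatagramRouter, fake_connection, the "direct" default tag, and the client address are hypothetical, and the routing rules are reused from the example table earlier in this article.

```python
import ipaddress
from typing import Callable

Addr = tuple[str, int]  # (ip, port)

# Routing table from the earlier example: (destination CIDR, outbound tag).
RULES = [
    (ipaddress.ip_network("1.1.1.1/32"), "proxy_a"),
    (ipaddress.ip_network("8.8.8.8/32"), "proxy_b"),
]


def route(dst_ip: str, default: str = "direct") -> str:
    """Return the outbound tag of the first rule matching dst_ip."""
    addr = ipaddress.ip_address(dst_ip)
    for network, tag in RULES:
        if addr in network:
            return tag
    return default  # hypothetical fallback when no rule matches


class PerDatagramRouter:
    """Routes every datagram, keeping one connection per (source, outbound tag)."""

    def __init__(self, open_connection: Callable[[Addr, str], object]) -> None:
        self.open_connection = open_connection
        # source -> {outbound tag -> proxy-protocol connection}
        self.connections: dict[Addr, dict[str, object]] = {}

    def forward(self, src: Addr, dst: Addr):
        tag = route(dst[0])  # policy routing runs for every datagram
        per_source = self.connections.setdefault(src, {})
        if tag not in per_source:
            per_source[tag] = self.open_connection(src, tag)
        return tag, per_source[tag]


def fake_connection(src: Addr, tag: str) -> str:
    """Stand-in for opening a real proxy-protocol connection."""
    return f"{tag}<-{src[0]}:{src[1]}"


router = PerDatagramRouter(fake_connection)
client = ("192.0.2.10", 40000)  # illustrative source_ip:source_port
a1 = router.forward(client, ("1.1.1.1", 53))
b1 = router.forward(client, ("8.8.8.8", 5353))
a2 = router.forward(client, ("1.1.1.1", 853))
print(a1[0], b1[0], a2[1] is a1[1])  # prints: proxy_a proxy_b True
```

Each datagram is routed on its own destination. Datagrams from one source that land on the same outbound share a single connection, so the mapping stays endpoint-independent within each outbound.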

Recommendations

If you have strict requirements for policy routing, I strongly recommend making V2Ray your proxy of choice.

Other options worth considering:

VMess, as implemented in most proxy software, does not suffer from this flaw.
Choose not to set policy routing rules for UDP.

For activists, my advice is not to use the policy‑routing features of any censorship‑circumvention proxy software; they come with many potential risks.

One must recognize that circumvention tools are not designed for access control. Modern VPNs such as WireGuard are a safer and more reliable choice.