
June 16, 2008

Comments

publius,

I agree that the glut-of-long-haul-fiber theory is bogus, but I'm confused about two related issues:

1. Is last-mile congestion a larger problem for providers than the disparity between internal and external transport costs? For example, Comcast pays a great deal less to transport a byte of data between two points on its internal network than it does to transport that same byte between its internal network and a system hosted on an external network, even if the distance is the same.

2. Is last-mile congestion an equally serious problem for different provider types? I thought that telco central offices had much larger aggregate bandwidth connections to the internet than the equivalent cable concentration points and thus suffered from peer congestion effects a great deal less. If so, does that difference translate into any variation in net neutrality lobbying?

The prospects for inventing substantially better compression techniques are quite limited; there is a fundamental limit to how far data can be compressed, and current compression algorithms are already pushing it. Algorithms that accomplish the same degree of compression with less processing power might be in the offing, but that's really no substitute for having actual bandwidth available.
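
To make that limit concrete: Shannon entropy is the floor, in bits per symbol, below which no lossless compressor can go, and off-the-shelf compressors already sit close to it for simple sources. A throwaway sketch (toy source and numbers invented purely for illustration):

    # Toy check of how close a stock compressor gets to the entropy floor.
    import gzip
    import math
    import random
    from collections import Counter

    random.seed(0)
    # A toy source: 100,000 bytes drawn i.i.d. from a skewed four-symbol alphabet.
    symbols, weights = b"abcd", [0.7, 0.15, 0.1, 0.05]
    data = bytes(random.choices(symbols, weights, k=100_000))

    counts = Counter(data)
    entropy = -sum(c / len(data) * math.log2(c / len(data)) for c in counts.values())
    gzip_bits = 8 * len(gzip.compress(data)) / len(data)

    print(f"entropy floor: {entropy:.2f} bits/symbol")
    print(f"gzip actual:   {gzip_bits:.2f} bits/symbol")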

T - #2 is easier so I'll start with that. My understanding is that cable is actually more susceptible to congestion, for a few reasons. One, households on the same street often share capacity. Thus, if your neighbor is downloading porn, it could slow your connection down. Two, the coaxial lines are more limited in terms of capacity -- for instance, I think adding upload capacity is very difficult. Verizon's fiber (FIOS) blows cable out of the water, it's just hard to get unless you're in a city or dense suburb (and that's only in Verizon's area).

On #1 - I don't think external costs are a big deal b/c the backbone is by all accounts competitive. I think Level 3 is their provider, which is good because it's a non-telco. If the non-telco backbone providers suddenly disappeared, then it becomes a much bigger problem.

The backbone is at a "watch closely but no regulation needed" stage right now.

This is one of your best posts ever, publius. I've had trouble formulating a succinct response to people who think that a cable-TV-like tiered model is the only one possible. You've provided an outstanding one.
Thanks.

Brett-

I think you're taking 'compression' a bit too literally. The point is that there is room for all kinds of innovations that can use bandwidth more efficiently.

Bittorrent is an example. The total data transferred is more or less the same, but it dramatically changes where the bandwidth is used. And that can improve things a lot if, say, I'm downloading a chunk from someone on the same local segment. Then there's 'web 3.0' type things: smart, deep caches, off-line applications. Who knows what else. (Why not have everyone cache part of the google map data and then peer-download from each other?)

So, lots of room yet for innovation, I think.

(You're not even entirely right about compression. Lossless compression techniques may be pushing Shannon's limit, but AFAIK that doesn't really apply to the lossy audio/video perceptual encodings which make up the bulk of internet transfers.)

To get to complete agreement, though, I'd change one passage:

The ... importance of net neutrality is not [only] to protect people from discrimination, but [also] to create rational incentives for a better, more efficient Internet in the face of capacity constraints.

And I'd disagree that your last sentence is an "aside". It's an important part of the argument, because the model being cooked up by major providers fuses bandwidth charges with non-neutrality (bundled 'big sites') and, of course, non-transparency -- the agreements about who gets inside the 'basic bundle' will all be private and commercial.

I think Brett's right: while A/V compression research advances apace, I don't think we're going to see significant improvements over the next decade. Perceptual coding is a hard problem and users have little incentive to experiment with lower bitrate codecs that may involve significant quality tradeoffs.

Why not have everyone cache part of the google map data and then peer-download from each other?

That sounds like a disaster in the making. First, most users have significantly less upload bandwidth available than download bandwidth, which means their ability to upload is already quite limited. The asymmetry in ADSL is significant here, and I suspect it represents a best-case scenario compared with most cable ISPs' last-mile capabilities. Secondly, map access patterns are bursty: what happens when someone makes a post to a high-traffic site including a link to a google map? In short order, you'll get tens of thousands of requests for that exact same map section.

You could mitigate some of these problems by using a complex and highly variable distribution scheme that increases the number of extant copies in proportion to the frequency of requests, but that is an extremely hard problem, especially if you don't want lots and lots of bandwidth wasted on coordinating your distributed cache protocol.

Another issue is jitter. From Google's perspective, there are two issues that matter when it comes to users downloading map slices: how long the average user has to wait and how long the unluckiest 10% of users have to wait. Given the likely heavy-tail distribution on request satisfaction time, customer satisfaction has a lot more to do with the slowest 10% of interactions than it does with the median interaction. Google servers can make bounded response time guarantees that a user-controlled cache distribution network cannot, and in terms of not pissing off users, those guarantees are a lot more important than reducing the average-case delay by 10-20%.
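
To put rough numbers on that (all invented, just to show the shape of the problem): with a heavy-tailed delay distribution, the 90th percentile sits far above the median, so shaving the average barely moves what the unlucky users experience.

    # Made-up numbers illustrating the tail-latency point: under a heavy-tailed
    # delay distribution, the 90th percentile dwarfs the median, so the unlucky
    # 10% dominate perceived quality.
    import random

    random.seed(1)
    delays_ms = sorted(random.paretovariate(1.5) * 50 for _ in range(100_000))

    median = delays_ms[len(delays_ms) // 2]
    p90 = delays_ms[int(len(delays_ms) * 0.9)]
    mean = sum(delays_ms) / len(delays_ms)

    print(f"median delay: {median:6.1f} ms")
    print(f"mean delay:   {mean:6.1f} ms")
    print(f"90th pct:     {p90:6.1f} ms")  # these are the users who complain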

Yet another problem is that such a distribution makes denial of service attacks much easier. They don't even have to be narrowly targeted at the victim in order to work: as long as some users who share a CO or cable concentration point participate in map slice distribution, you can still ruin your victim's day. Now, real anycast support could alleviate some of these problems but this sort of thing goes far beyond what anycast implementations can do in practice.

publius, thanks for your answers. What I'm confused about then is why the DSL/fiber providers are throwing their lot in with the cable providers. If internal/external transport cost disparities are not significant and if last-mile congestion issues are much more significant for cable than for DSL/fiber, what exactly is the scarcity problem that motivates the DSL/fiber providers? I know that AT&T does both fiber and cable, so I can understand their perspective, but Verizon only does DSL and fiber, so their lobbying confuses me. Or do you think that there is no scarcity issue for these companies and they're only pushing for tiered access because they think doing so will improve profitability?

Great post, but there's all kinds of caveats to add to it.

1. It seems to me that there are good scenarios for metering leading to non-net-neutral policies. Nell names one, but the most likely to me is that once people start coming up with ways to circumvent bandwidth meters (e.g., by XORing your data with another user on the same network), providers will start arguing that they need to do deep packet inspection to prevent "fraud," and that'll be the pretext for monitoring. The DoJ is already behind DPI, and CALEA provides the "in."

2. I don't think external costs are a big deal b/c the backbone is by all accounts competitive.

I strongly disagree with this. Even assuming that the current peering arrangements hold, much of the problem right now is that Comcast is trying to throttle traffic because they're not a Tier 1 provider and their costs are comparatively higher b/c they have to pay Level 3.

This is why Verizon weighed in against Comcast in the recent FCC battles over content management. Verizon is aggressively rolling out their fiber network and by some accounts aren't even bothering to support their copper lines anymore. That means that per-byte provision is inherently cheaper for them, whereas secondary providers like Comcast have to squeeze more out of each byte. That's a serious competition problem since no one wants to build redundant infrastructure and Verizon is already way out ahead on this one.

3. Verizon's fiber (FIOS) blows cable out of the water, it's just hard to get unless you're in a city or dense suburb (and that's only in Verizon's area).

But they're aggressively pushing out to rural areas, even subsidizing new fiber pipes -- and at the very least they're just as good as everyone else. They're never going to be worse; they already have the big markets and there's no way for the cable companies to gain back that market share -- it'll just be slow bleed.

There are more subtle problems here, though. First, Verizon's bandwidth is more symmetric, so they're able to provide BitTorrent-like services more efficiently. Secondly, and more importantly, Verizon owns so much of the network that they'll be able to cache data that's moving internally and make it comparatively cheaper. (In fact, they've been working on this for a while now.) That'll let them box out competitors for consumers and for hosting.

4. Is last-mile congestion an equally serious problem for different provider types? ... If so, does that difference translate into any variation in net neutrality lobbying?

Yes, but it's not really about telcos. It's more about Verizon vs. everyone else, as implied above. Verizon has FiOS, and no one else does. They also have greater coverage since they own all of UUNet's old lines.

That's why Verizon has been backing net-neutral policies, or at least "per-byte" policies -- they can provide it cheaper than everyone else. The problem is that they've got no real problem with content management, just so long as it doesn't put them at a competitive disadvantage (like Comcast's traffic throttling did -- because it was opaque to the outside).

I think Brett's right: while A/V compression research advances apace, I don't think we're going to see significant improvements over the next decade.

There's big, big improvements that can be made if you treat the packets differently. You can order them to reduce successive failures and do tricks with caching since the data characteristics are fundamentally different. The problem, of course, is that you have to differentiate the pipe, meaning it's non-net-neutral.

What I'm confused about then is why the DSL/fiber providers are throwing their lot in with the cable providers. If internal/external transport cost disparities are not significant and if last-mile congestion issues are much more significant for cable than for DSL/fiber, what exactly is the scarcity problem that motivates the DSL/fiber providers?

DSL and cable pretty much suck equally in this case. Verizon is where they are not because they are DSL/fiber (again, they are neglecting their DSL capability), but because they have the best fiber network there is by a long shot. AT&T is somewhere in the middle. Everyone else is out of luck.

IOW, the scarcity problem is that now and for the foreseeable future, upload and inter-network transit is more scarce for everyone except Verizon. They have copper and fiber, so they can play it both ways, but fiber is where they hold the winning hand.

Massive over-provisioning is just a reality of statistical multiplexing. Access tiering doesn't change that, it just prioritizes some packets over others. Therefore, without extreme changes to the protocols (and other types of multiplexing have shown even worse line utilization), the only way of addressing network congestion is to (a) add capacity or (b) reduce the number of packets. (a) is always nice, but not always feasible. And (b) can really only happen if everyone plays nice (all these streaming VOD services need proper caching, and maybe they can leverage p2p). But hardware/software maintenance is expensive.

So I say not only do we need to enshrine net neutrality in law, but we need to emphasize open protocols. Web caches work because everyone uses HTTP; an ISP can very cheaply install a web cache and greatly improve performance. If everyone were using the same streaming video protocol, for example, ISPs could install caches for much less than it would cost every VOD service to do it.
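
To make the cache point concrete, here is a minimal sketch (illustration only, nothing like a production cache; the TinyCache name and TTL are invented): the first request for a URL costs upstream bandwidth, and every repeat within the TTL is served locally for free.

    # Illustration only: the first request for a URL goes upstream; repeats within
    # the TTL are served from the shared local copy and cost no upstream bandwidth.
    import time
    import urllib.request

    class TinyCache:
        def __init__(self, ttl_seconds=300):
            self.ttl = ttl_seconds
            self.store = {}  # url -> (expires_at, body)

        def fetch(self, url):
            now = time.time()
            hit = self.store.get(url)
            if hit and hit[0] > now:
                return hit[1]                           # cache hit
            body = urllib.request.urlopen(url).read()   # cache miss: one upstream fetch
            self.store[url] = (now + self.ttl, body)
            return body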

"The point is that there is room for all kinds of innovations that can use bandwidth more efficiently."

Thought of that; the problem is that most of the innovations use backbone bandwidth more efficiently, not last-mile bandwidth. Conserving what's in excess isn't much help. Really, there's not much getting around the need for more bandwidth in that last mile.

Now, to the extent that individual users predictably download the same content, like me checking my email every morning, you might more efficiently utilize that bandwidth by pre-fetching that content during slow moments. Perhaps that's what you meant, but I don't think "compression" is a good term for that sort of strategy.

Good points, Turbulence.

I had in mind a (hopefully) near-futurish, high-speed fiber/wireless/whatever last mile, rather than existing cable or DSL, which makes a difference, I think. But the google maps thing was just off the top of my head, I'm not arguing for it specifically. Also, I do think perceptual codecs have a lot of headroom: fine-tuning the psychology of the perceptual tradeoffs, trading encoding complexity for bandwidth. (And it's not really a matter of people experimenting, since it's youtube or netflix or whatever that would be pushing it out.)

The takeaway point is that while a theorem about the maximum efficiency of lossless compression algorithms exists, no such theorem exists for the internet as a whole. I'd be really careful about predicting that we're already doing everything in anywhere near the best way possible.

(That should be "OR trading encoding complexity for bandwidth.")

I suppose I should clarify what I said above -- a collusive darknet would involve a lot more than just XORing data, but it is possible. E.g., via network coding.
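
The simplest textbook version of the idea is the two-way relay case: two parties each hold what the other wants, and a relay broadcasts their XOR instead of forwarding both packets separately. A minimal sketch (names and sizes invented; this shows the coding trick itself, not a claim that it defeats per-byte metering):

    # Classic XOR network-coding exchange: Alice holds packet a and wants b;
    # Bob holds b and wants a. A relay that has both broadcasts a single coded
    # packet (a XOR b), and each side decodes it with what it already has.
    import os

    def xor(x: bytes, y: bytes) -> bytes:
        return bytes(p ^ q for p, q in zip(x, y))

    a = os.urandom(1024)   # Alice's packet (Bob wants this)
    b = os.urandom(1024)   # Bob's packet (Alice wants this)

    coded = xor(a, b)      # the relay broadcasts one packet instead of two

    assert xor(coded, a) == b   # Alice recovers b using her own packet a
    assert xor(coded, b) == a   # Bob recovers a using his own packet b
    print("one broadcast packet served both receivers")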

trading encoding complexity for bandwidth

This leads to an interesting thought -- in a world of per-byte bandwidth pricing, I would bet money that most of the distributed-computing projects like SETI@home and Folding@home would be very rapidly replaced by people farming out their spare CPU cycles for passthrough compression.

Adam, as always, thanks for the thoughtful input...

Nell names one, but the most likely to me is that once people start coming up with ways to circumvent bandwidth meters (e.g., by XORing your data with another user on the same network), providers will start arguing that they need to do deep packet inspection to prevent "fraud," and that'll be the pretext for monitoring.

Can you explain what on Earth you're talking about here where you describe ways to circumvent bandwidth meters? Are you talking about a world where intra-network data transport is billed to the consumer at much lower cost than inter-network transport? If that's the case, why would providers want to keep their own customers from collaborating to reduce inter-network transport costs? I mean, sure, they lose some cash there, but they also reduce customer costs and they increase the total value of their network. Also, I don't see any technical way of "proving" that a group of nodes are colluding on sharing data transported from other networks. If you combine encrypted links with the ability to route data via intermediaries and occasional random packets, even hard core traffic analysis will never give you the certainty needed to justify charging someone's credit card without risking fraud.

The DoJ is already behind DPI, and CALEA provides the "in."

There's a big difference between shunting off traffic to/from persons of interest for DPI and doing DPI for everyone. These things are similar in some ways but seem so different operationally that I have trouble imagining how DoJ's CALEA work moves us further down that path to global DPI.

Verizon's fiber (FIOS) blows cable out of the water, it's just hard to get unless you're in a city or dense suburb (and that's only in Verizon's area).

Alas, this is not completely true. Here in Cambridge MA, despite being in a pretty dense part of the city, FIOS is not available. Lots of nearby suburbs (with lower density) have it, but the city goes hungry.

There's big, big improvements that can be made if you treat the packets differently. You can order them to reduce successive failures and do tricks with caching since the data characteristics are fundamentally different. The problem, of course, is that you have to differentiate the pipe, meaning it's non-net-neutral.

Can you clarify what you mean here and what problems you're referring to? I can imagine some problems, like VOIP QoS, that can be satisfied in net neutral ways (carriers could agree to honor QoS bits for any customer that agreed to max rate limits on high priority traffic [i.e., no get out of jail free card by declaring that all of your traffic is high priority]).
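
To make that bracketed idea concrete, here is a loose sketch (class name and all rate numbers invented): priority marks are honored only while the subscriber stays under an agreed rate budget, and anything beyond that is treated as ordinary best-effort traffic.

    # Loose sketch of "honor QoS bits, but only up to an agreed rate": a token
    # bucket per subscriber; marked packets keep their priority while tokens
    # last, otherwise they are demoted to best effort. Parameters are invented.
    import time

    class PriorityBudget:
        def __init__(self, rate_bytes_per_s=64_000, burst_bytes=16_000):
            self.rate = rate_bytes_per_s
            self.capacity = burst_bytes
            self.tokens = burst_bytes
            self.last = time.monotonic()

        def classify(self, packet_len, marked_high_priority):
            now = time.monotonic()
            self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if marked_high_priority and self.tokens >= packet_len:
                self.tokens -= packet_len
                return "high"        # within the agreed budget: honor the mark
            return "best-effort"     # over budget (or unmarked): no special treatment

    budget = PriorityBudget()
    print(budget.classify(1200, marked_high_priority=True))   # "high" while under budget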

DSL and cable pretty much suck equally in this case. Verizon is where they are not because they are DSL/fiber (again, they are neglecting their DSL capability), but because they have the best fiber network there is by a long shot. AT&T is somewhere in the middle. Everyone else is out of luck.

See, this is what I was trying to get at with my earlier questions to publius. It seems that you disagree with him on the cause of scarcity for different entities: you're claiming it's the disparity between inter- and intra-network transport costs and that last-mile congestion is equally crummy for everyone, whereas he seems to be claiming the opposite.

So here's a question for both of you: do you know of any data that substantiates your position on either of those questions?

Seeing Brett's follow-up, I think maybe we're talking past each other here.

I was addressing my comments at the idea that we're facing a 'looming capacity problem', which I interpreted to mean a medium-ish term problem if and when everyone has 50-100+Mbps to the house (or on their cell phone), and we find ourselves backbone limited rather than last-mile limited.

That'd obviously be a situation where more smart distributed caching and peer-to-peer-like stuff might be helpful (especially coupled with cheap storage), or who-knows-what-else that isn't necessarily totally appropriate in our current last-mile limited world. (And I stand by my comments on audio/video compression.)

But, it could be I misinterpreted. (And maybe the economics is such that if we run low on backbone in 15 years, we'll just have another massive build out and be good for another 30.)

Can you explain what on Earth you're talking about here where you describe ways to circumvent bandwidth meters? Are you talking about a world where intra-network data transport is billed to the consumer at much lower cost than inter-network transport?

See the network-coding link I just posted. I was envisioning an equal-cost scenario. Gaming a network with differing inter- and intra-network transport costs is trivial.

I mean, sure, they lose some cash there, but they also reduce customer costs and they increase the total value of their network.

Those costs are different for different providers. Verizon, e.g., wants more transport b/c their per-byte costs are lower than Comcast's.

Also, I don't see any technical way of "proving" that a group of nodes are colluding on sharing data transported from other networks.

I don't think it could be proved, I think it would be a pretext for doing DPI and implementing non-net-neutral policies.

These things are similar in some ways but seem so different operationally that I have trouble imagining how DoJ's CALEA work moves us further down that path to global DPI.

Yeah, you'd think that, but CALEA was the reason the FCC flipped its position on traffic monitoring a few years back. They used to be completely hands-off. I can find you a link for this, and I acknowledge that it's bizarre, but I assure you that it's true.

Here in Cambridge MA, despite being in a pretty dense part of the city, FIOS is not available. Lots of nearby suburbs (with lower density) have it, but the city goes hungry.

Sure, but the point is that you'll eventually have it, and the cable companies don't have a decent alternative in the pipe. Their infrastructure isn't up to it, and Verizon has all that copper-loop capability anyway. It's a matter of "when," not "if."

I can imagine some problems, like VOIP QoS, that can be satisfied in net neutral ways

It seems to me that the problem is that you either (a) implement a tiering system that's "neutral," i.e. opaque to the provider, and it gets gamed by users, or (b) you implement a tiering system with DPI or other controls that are opaque to the users, and, well, you know the story. I don't see the middle ground.

So here's a question for both of you: do you know of any data that substantiates your position on either of those questions?

Yup:

Because P2P apps in particular can lead to consumer complaints from other subscribers on the local node, Comcast has taken to resetting certain types of traffic to keep the load lighter. In Feld's view, this just shows up problems with cable's network that the DSL and fiber providers (AT&T, Verizon, etc.) are keen to drive home by keeping this issue in the public eye.

Instead of spending tons of cash to upgrade its hybrid fiber-coax (HFC) network to fiber, Comcast cheaped out and instead bought some Sandvine gear to disrupt certain traffic instead.

"Looked at this way, you can see why the telcos (and therefore their sock-puppet HOTI) would be a shade peeved about Comcast's decision to 'manage' their network in such a 'cost effective' but deceptive manner," writes Feld. "Because if Comcast can keep pretending its network is just as good as FIOS when it isn't, and can even lie when asked about it directly by customers, then Verizon just wasted a couple of bazillion dollars and took a two-year stock beating for NOTHING."

Let me put it a different way. The problem here is not telcos versus ISPs or DSL versus cable or fiber. The problem is that the transport providers are not content providers, and their interests are not aligned. Verizon does not give a rat's ass about content management -- they just don't want Comcast to do it opaquely, because it decreases Comcast's effective cost-per-byte, and Verizon can beat Comcast on transport costs.

It used to be the case that per-byte transport costs on the backbone were relatively equal (with some exceptions, but you paid for what you got). That parity is unraveling. Verizon would be happy to do content management if it were done transparently, because tiered access would let them put the screws on content providers. That's just not in their immediate interest -- not as much as beating Comcast, anyway, since neither is doing tiered access right now, and if it did happen, it would probably be universal.

When publius says per-byte pricing is "consistent" with network neutrality, that's true but it misses the issue -- the problem is that it's not incompatible with a non-neutral policy. You can charge per byte and still channelize traffic and implement price discrimination. Universal per-byte pricing would just invite gaming, and we've seen how the industry treats BitTorrent users.

The single central maxim of net neutrality is that intelligence resides at the edges of the network. "Capacity constraints" are a pretext for centralizing control. "Efficiency" is a pretext. "Security" is a pretext. The fundamental issue is privacy and the ability of users to control their own traffic. Everything else is secondary -- the decision is ultimately whether we're willing to sacrifice that control for other concerns.

See the network-coding link I just posted. I was envisioning an equal-cost scenario. Gaming a network with differing inter- and intra-network transport costs is trivial.

Can you please explain in more detail what you're talking about then? I know something about network coding and I'm not seeing the relevance here. I don't understand how combining your data with someone else's gets you any benefit if someone has to pay to get the data from the intra-network peer and if you have to pay to schlep the data to the other local user. It seems that the total cost should be the same as if both parties requested their data separately and didn't bother communicating together.

Also, maybe I'm just being slow today, but it seems like your use of network coding boils down to magical thinking. Can you please describe how you think peers would cooperate to use network coding and what specific problems doing so would solve?

I don't think it could be proved, I think it would be a pretext for doing DPI and implementing non-net-neutral policies.

Well, anything could be used as a pretext. I mean, child porn or terrorism can be used as a pretext, so the mere existence of pretexts doesn't concern me overmuch. What I'm more interested in is the existence of plausible pretexts, and I don't see a plausible way in which DPI could possibly detect this sort of collusion with any reliability in a world where encryption and cooperating intermediaries are the norm. So, what am I missing: what is a plausible scenario by which DPI could detect a collusive system involving encryption and intermediaries?

Instead of spending tons of cash to upgrade its hybrid fiber-coax (HFC) network to fiber, Comcast cheaped out and instead bought some Sandvine gear to disrupt certain traffic instead.

OK, now I'm officially confused about your position. The excerpt that you quote seems to back the notion that cable providers (like Comcast) are at a disadvantage compared with DSL/fiber providers because of last-mile congestion. Note that I read your excerpt as putting DSL and fiber providers in the same boat: they don't suffer from the last-mile congestion issue, even if DSL can't provide the same last-mile bandwidth that fiber can. Yet you earlier claimed that "DSL and cable pretty much suck equally in this case. Verizon is where they are not because they are DSL/fiber, but because they have the best fiber network there is by a long shot." So: if Comcast's problems are due to congestion at the local node (Because P2P apps in particular can lead to consumer complaints from other subscribers on the local node...), why does the differential between inter- and intra- transport costs matter at all? I mean, if Comcast had Verizon's advantage regarding a massive long-haul network that allowed it to transport bytes intra-network for free, that would not alleviate the choke point problem at the local node...

When publius says per-byte pricing is "consistent" with network neutrality, that's true but it misses the issue -- the problem is that it's not incompatible with a non-neutral policy. You can charge per byte and still channelize traffic and implement price discrimination. Universal per-byte pricing would just invite gaming, and we've seen how the industry treats BitTorrent users.

OK, but again, how: how do you think this gaming would work?

A few points here:

There is enough long-haul fiber in the ground in Europe to support a massive upgrade of transmission capacity. The problem is that the active components which were put in during the boom days are now reaching their limits and need to be replaced/upgraded/extended. That costs real money, especially versus buying a network from a bankrupt carrier and just operating it.

The bandwidth increase is especially serious at the core interconnection points: they see a higher rate of growth than individual networks. Have a look at the graphs at http://www.ams-ix.net/technical/stats/ and http://www.de-cix.net/content/network/Traffic-Statistics.html -- they are running hard against technology limits.

Local-loop bandwidth is only a problem for cable; DSL is fine. A problem for both is the backhaul and distribution networks, especially when you're dealing with reseller structures, where the links that pass thousands of DSL streams to resellers are easily overloaded.

Flat-rate pricing is usually more expensive than usage-based pricing, but users love the predictability. See http://blog.tomevslin.com/2005/02/subscription_pr.html

(I once implemented the mediation layer of an ISPs usage based billing system. Fun fun fun.)

I read this post in violent agreement but couldn't help wondering whether it was analogous to Oil?

For instance, in a world of [OPEC quotas], companies have less of an incentive to increase capacity – that’s because scarcity will have been built into the business model. For that reason, a world of [OPEC quotas] will depress innovation and be less receptive to new, [gas-guzzling] applications/services. Thus, in this world, companies would solve the capacity problem by [invading Iraq] and increasingly relying on [OIL] scarcity for [control].

The net-neutrality debate is only about one thing: censorship. It has never been about anything else, it cannot be and will never be about anything else. There are no economic factors. It is exclusively about censorship.

Trying to understand this with an extremely limited grasp of the technology.

Is the main point here that if you don't allow carriers to use complex pricing then you force innovation to be in technology, rather than pricing schemes?

Bernard, I think your assessment is correct, at least regarding publius' original point. At least on the surface, much of the discussion revolves around the question of what mechanism we should use to remedy the problem of bandwidth scarcity. In other words: people want more bandwidth than we have available, so picking a framework for deciding how to allocate that scarce resource and encourage future investment is both critical and contentious. The only caveat I would add is that there is some disagreement as to the nature of this scarcity and how it varies based on provider and underlying technology.

If you have specific questions, I'd be happy to address them in a more explanatory manner. Looking back, I think the discussion may have taken a turn towards impenetrable arguments by folk who know lots of details and I feel a mite bit guilty for that.

Turb: sorry for the delay. I was working on a VoIP project, appropriately enough. :) However, this week I have my legal hat on -- you should have caught me last week when I still had my amateur-network-engineer hat on. :)

As I'm an amateur, I don't claim to be an expert on network theory. Nevertheless, it seems to me that where there is entropy (or a lack thereof), there is the opportunity to game a usage-based system. I imagine that, given a stable network topology, there's no reason that a set of users couldn't multiplex their traffic -- after all, that's essentially what BitTorrent does now. Split the traffic x different ways, run it through a fountain code, and distribute it back to the peers. Given sufficiently stable throughput, you save bandwidth.

Granted, it's not a trivial problem, but there's two reasons why it seems inevitable, at least to me: (1) The decoding has to happen at the network endpoints (unlike caching), and it strikes me as unlikely that Verizon or anyone else will ever be able to exercise sufficient control at the last hop to provide a solution, and (2) In a world of per-byte pricing, even incremental transmission savings are worth it. 10% off a $50 monthly bill is $60 a year, and I don't think an overall 10% savings is really that implausible for most traffic.

Again, I'm not a software engineer, but that's my general take.

Turb,

Thanks, (from a fellow Cantabrigian, BTW).

So, what am I missing: what is a plausible scenario by which DPI could detect a collusive system involving encryption and intermediaries?

Well, I don't think that's the point, but collusion can be detected through traffic profiling. And multiplexing schemes can be disrupted pretty opaquely by just adding a little noise to the network topology -- for example, you could simply flag the users that are maintaining a constant bandwidth level and you send their packets out-of-order and maybe add some random latency. The data would still get there -- and would look the same to the average user -- but the advantages of network-coding tricks would essentially disappear.

But ultimately, they don't have to detect anything -- Comcast was shutting down all kinds of traffic, and they were just stupid enough to get caught. It's about how deep they're allowed to go into your data. The ability to do DPI means they can prioritize specially-tagged traffic (and charge it separately, esp. if there is a usage-based pricing scheme in place, conveniently), which relatively deprioritizes everything else. The point is that once you grant the providers the ability to inspect your packets and redirect them, control over the network has moved from the edges to the center.

I think that the real issue here is using packet inspection to tier access and then to leverage that against content providers like Google. Users are just the pretext. Again, it's the same thing that happened with CALEA. The FCC was all gung-ho on net neutrality and then the telcos got the DoJ to lean on them so they could siphon and monitor. If not for that move, traffic inspection wouldn't even be on the radar.

Yet you earlier claimed that "DSL and cable pretty much suck equally in this case...

Well, they suck for different reasons -- I just mean that they both suck in a relative sense; the pipes are often asymmetrical, esp. DSL; cable has huge congestion problems at the last hop; DSL networks are built on legacy tech, are expensive to maintain, and they scale poorly.

The bottom line, though, is that:

(1) Verizon owns their infrastructure (and indeed they own most of the internet), whereas most everyone else pays for it. FiOS notwithstanding, Verizon always has the option of using their transport as a loss leader. Comcast doesn't have that option.

(2) Comcast, et al., have no alternative for transport anyway. That's why they're trying to squeeze more out of each byte by throttling. Verizon will just keep building and building their infrastructure -- it's the inevitable march. Verizon is too far ahead and no one is going to invest to catch up with them except maybe AT&T -- and they're not exactly the saint of the telecom industry, to put it mildly.

Another thing to keep in mind (maybe the main thing) is that the government merger settlement deals with Verizon/MCI and AT&T/SBC expire very soon. The rate caps on Verizon (basically the only thing holding them to any form of net neutrality) expire next month. That's the economic driver here. The equal-opportunity peering structure as we've known it is about to end. Verizon and AT&T are going to be putting the screws on everyone else, and things are going to change real fast.

sorry for the delay. I was working on a VoIP project, appropriately enough. :)

No worries; I won't demand fast response time until after I start paying your retainer.

As I'm an amateur, I don't claim to be an expert on network theory. Nevertheless, it seems to me that where there is entropy (or a lack thereof), there is the opportunity to game a usage-based system. I imagine that, given a stable network topology, there's no reason that a set of users couldn't multiplex their traffic -- after all, that's essentially what BitTorrent does now. Split the traffic x different ways, run it through a fountain code, and distribute it back to the peers. Given sufficiently stable throughput, you save bandwidth.

I'm sorry to harass you about this, I really am, but: I have no idea what you're talking about. Can you please explain? I know how bittorrent works and I'm just not seeing practical ways that bt techniques can be readily generalized.

If it helps, consider this framework for discussion:

Provider P operates a network N connected to the global internet. Two subscribers of P, Alice and Bob, both wish to get different pieces of data from their friend Charlie, whose system is not hosted on N. Alice and Bob each must pay P for each byte of data they send or receive, regardless of whether that byte is exchanged with a system hosted on N or on the global internet. Given all that, please:

1. explain a protocol by which Alice and Bob can send their different requests to Charlie and each get back their different answers

2. explain what per-byte costs Alice and Bob incur

3. explain why the costs incurred in (2) are less than the costs Alice and Bob would incur if they both communicated directly with Charlie

4. explain how Alice and Bob discover each other given that N may host hundreds of thousands of systems

I don't care about details or terminology per se, but this is the framework I have in mind when reading your writings that I find confusing. I'm not expecting a detailed explanation of exactly how network coding would operate in this context, but I'd like to at least understand how you think it would be used; I want to see what's connected to the black box labeled network coding without necessarily peering into the box. If my framework makes no sense to you, feel free to describe your own framework. I really don't care about details, I just want to see some explanation for how parties divvy up data and how doing so saves them transmission costs.

Turb, I'm not an expert, but I can give it a shot:

A stable network topology is a prerequisite, so we have to assume that A and B know each other and can rely on each other. We also have to assume that each packet is sufficiently large to make the transaction costs of joint requests worthwhile. I think these are fairly realistic assumptions -- this is essentially the problem addressed by BitTorrent (which is all I meant by the comparison).

The key is lossless compression over a continuous channel -- say A and B want M and N, respectively. Both request one half of M, compress it, and send it to the other. Then both request one half of N, compress it using N as a key, and send it to each other. If the mutual information of the data is sufficiently high (which I imagine is the case for generic web traffic) there should be transmission savings that increase proportionally to the colluding agents and the stability of the network topology (the conditional entropy), at least until the transaction costs of synching become too high (but there could be a way to distribute those costs among the agents).
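
The load-bearing assumption there is high mutual information between the payloads. A cheap way to see how much that is worth (payloads invented; zlib's preset-dictionary feature standing in for "compress it using the other transfer as a key"):

    # Rough sanity check of the mutual-information assumption: compress N on its
    # own versus compressing N against M as a preset dictionary (zlib's zdict).
    # The conditioned version only wins to the extent M and N genuinely share
    # content. Payloads here are invented stand-ins.
    import os
    import zlib

    shared = os.urandom(4000)                 # content common to both "pages"
    m = shared + b" ...bits only page one has..."
    n = shared + b" ...bits only page two has..."

    plain = zlib.compress(n)                  # what N costs with no shared context

    co = zlib.compressobj(zdict=m)            # compress N given M as side information
    conditioned = co.compress(n) + co.flush()

    do = zlib.decompressobj(zdict=m)
    assert do.decompress(conditioned) == n    # a peer already holding M recovers N

    print(f"N alone:   {len(plain)} bytes")
    print(f"N given M: {len(conditioned)} bytes")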

Again, I am not an expert, but that seems approximately right to me. I may well be wrong. :)

Well, I don't think that's the point, but collusion can be detected through traffic profiling.

I don't see how this could work. Any distribution network will involve encrypted links, transmissions of random data, random splitting of packets into smaller chunks, random aggregation of multiple packets into single packets, random delays, and random packet drops. In fact, adding random packet drops may be necessary for high performance. Given all that randomness, tracking transmissions between hundreds of thousands of peers and determining who is colluding with who is going to be very very hard. It might be doable, but I don't see how.

And multiplexing schemes can be disrupted pretty opaquely by just adding a little noise to the network topology -- for example, you could simply flag the users that are maintaining a constant bandwidth level and you send their packets out-of-order and maybe add some random latency. The data would still get there -- and would look the same to the average user -- but the advantages of network-coding tricks would essentially disappear.

Today, right now, even when connected to the nicest most cooperative provider on Earth, every host has to expect that any of its packets will be reordered or significantly delayed. If your application can't deal with that, you can't function on the internet, or, indeed, on most local networks.

Also, I don't see how you could flag hosts maintaining a constant bandwidth level: most hosts today do just that for all sorts of reasons. And granularity makes a big difference. Because of the way that TCP operates, large or fast connections don't really use a static amount of bandwidth when you look closely; they're constantly ramping up their transfer rate and then cutting it back when packets drop.
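
A toy model of that ramp-up/cut-back pattern (loss events made up at random, standing in for real congestion) shows the point: the sending rate is a sawtooth, not a constant level anyone could usefully flag.

    # Toy model of TCP-style additive increase, multiplicative decrease.
    # Loss events here are simulated at random; real losses come from congestion.
    import random

    random.seed(2)
    cwnd = 10.0                      # congestion window, in packets
    trace = []
    for rtt in range(60):
        if random.random() < 0.08:   # pretend a packet was dropped this round trip
            cwnd = max(1.0, cwnd / 2)    # multiplicative decrease
        else:
            cwnd += 1.0                  # additive increase, one packet per RTT
        trace.append(round(cwnd, 1))

    print(trace)                     # a sawtooth, not a flat constant rate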

But ultimately, they don't have to detect anything

No, but they have to have a plausible explanation for how they could detect these networks, otherwise their pretext is just a joke -- and if all they need is a joke pretext, then "terrorists are coming to eat my baby" works as well as any other.

I think that the real issue here is using packet inspection to tier access and then to leverage that against content providers like Google. Users are just the pretext. Again, it's the same thing that happened with CALEA. The FCC was all gung-ho on net neutrality and then the telcos got the DoJ to lean on them so they could siphon and monitor. If not for that move, traffic inspection wouldn't even be on the radar.

But they can do that right now without DPI! If Comcast wanted to shake down Google today, couldn't they just say "Youtube traffic disproportionately causes congestive losses at node concentrators, ergo, we're going to increase the drop rate for flows that terminate at google/youtube and we're going to do that by looking at the IP destination address"? I mean, nothing is less DPI than the IP address fields and it is not like there's some huge secret as to what IP blocks correspond to Google/youtube. Heck, doing it this way is a lot cheaper and easier than going with DPI.

-- The key here is that a bit as a unit of information (a shannon) is smaller than an actual binary bit, depending on the conditional entropy of the channel. There are many different ways to take advantage of that fact, up to a certain famous Limit. :)

If Comcast wanted to shake down Google today, couldn't they just say "Youtube traffic disproportionately causes congestive losses at node concentrators, ergo, we're going to increase the drop rate for flows that terminate at google/youtube and we're going to do that by looking at the IP destination address"? I mean, nothing is less DPI than the IP address fields and it is not like there's some huge secret as to what IP blocks correspond to Google/youtube. Heck, doing it this way is a lot cheaper and easier than going with DPI.

Battery is dying, but, reasons why this isn't happening today, off the top of my head:
(1) The two biggest providers, Verizon and AT&T, are forbidden from doing this based on their merger agreements (for now), and lean on everyone else not to.
(2) If this happened, Google would sue the snot out of Comcast. But Comcast can accomplish the same thing in reverse by tiering traffic and not get sued.
(3) The DPI arms race hasn't started yet. IP-based profiling would mean that people would just start tunneling traffic, and then Comcast would claim they need DPI for inspection to prevent the tunneling.

The key here is that a bit as a unit of information (a shannon) is smaller than an actual binary bit, depending on the conditional entropy of the channel.

Sure, but you could exploit that by having your web server and browser use gzip encoding. You get all the benefits without having to discover, trust, and coordinate with other parties. Plus, decoding gzip compression is a lot easier than any form of network coding. And that analysis assumes there is substantial mutual information between two web pages from the same server, but I don't think that's true at all. I mean, if I tell you that the 3971st character of the Wikipedia page on Dragons is 'x', does that really give you any information regarding what the 3971st character of the Wikipedia page on the Clean Air Act is?

Alternatively, if you're talking about video traffic a la youtube, a bit isn't actually any smaller than a binary bit: the data's already been compressed enough that it is effectively random. In addition, there's going to be almost no mutual information between any pair of bits from any pair of different youtube video streams. Come to think of it, there likely won't be significant mutual information between any pair of bits from the same youtube video stream.
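
That "effectively random" point is easy to check with a stand-in payload (random bytes in place of encoded video; purely illustrative): a general-purpose compressor gains nothing on it, and in fact adds a little overhead.

    # gzip shrinks redundant text a lot, but makes random bytes (a stand-in for
    # already-encoded video) slightly *larger*: there is nothing left to take out.
    import gzip
    import os

    text = b"the quick brown fox jumps over the lazy dog " * 1000
    already_random = os.urandom(len(text))     # proxy for a compressed video payload

    print(f"text:   {len(text)} -> {len(gzip.compress(text))} bytes")
    print(f"random: {len(already_random)} -> {len(gzip.compress(already_random))} bytes")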

I think the peer coordination issues are more substantial than I suspect you do. Bittorrent works because some nodes are dedicated to providing swarm metadata and consequently, it incurs relatively high costs per object distributed. Those costs are low today because we don't distribute that many different objects over bittorrent (compare the number of unique .torrent files in the world with the number of unique webpages in the world), but they could become quite significant in a world where we needed bittorrent style distributed peer management for accessing even a small fraction of the URIs we handle today.

(2) If this happened, Google would sue the snot out of Comcast. But Comcast can accomplish the same thing in reverse by tiering traffic and not get sued.

Out of curiosity, on what grounds would Google sue Comcast? I'm not disagreeing with you, I'm just curious.

(3) The DPI arms race hasn't started yet. IP-based profiling would mean that people would just start tunneling traffic, and then Comcast would claim they need DPI for inspection to prevent the tunneling.

Any tunneling would be encrypted, so I don't see how DPI could help. This case is much simpler than the full-blown colluding distribution network we discussed earlier. There's already lots of encrypted traffic on the net (most client email access, all corporate VPNs, a large fraction of bittorrent traffic, etc.), and if you add lots more, there will still be no way to determine which requests are intended for Google except by some sort of traffic analysis, but traffic analysis will work just as well without DPI.

Again, Comcast can claim whatever they want, but since DPI is useless for detecting tunneling in the presence of encryption, that claim doesn't make any sense.

Sure, but you could exploit that by having your web server and browser use gzip encoding.

Sure, if your provider lets you. But that's not a generically applicable solution, and there'd still be relative benefits to muxing traffic together.

I mean, if I tell you that the 3971st character of the Wikipedia page on Dragons is 'x', does that really give you any information regarding what the 3971st character of the Wikipedia page on the Clean Air Act is?

Yes, it does, if there's an agreed encoding/decoding solution. I could send you the number of bits to increment past 'x' to reach the next character in the channel, or the factorization of 'x' and the next character, etc.

Depending on the conditional entropy of the information and the channel memory, the amount of information required to send a bit should decrease as log(2).

Come to think of it, there likely won't be significant mutual information between any pair of bits from the same youtube video stream.

No, that's backwards. There's actually significant mutual information between video streams, even if they're individually compressed. You're thinking about two separate channels with separate encoders and decoders. Neither YouTube nor your ISP has control of the decoders at the edges of the network, at least not jointly.

I think the peer coordination issues are more substantial than I suspect you do.

Well, there is no issue today, because people pay for pipes, not bytes. The network coding solution is only efficient if there's multiple end users incurring the transmission costs. If you're paying for the pipe, the incremental cost per byte is effectively zero. If you're paying per byte, it's effectively constant.

This may be getting too technical :)

Out of curiosity, on what grounds would Google sue Comcast? I'm not disagreeing with you, I'm just curious.

Oh, tortious interference or something. Who knows.

Comcast would arguably lose their common-carrier status, so Google could file a class action for copyright infringement or something along those lines, I suppose. :)

-- OK, I really need to put my legal hat back on for real now. There are some people depending on me to crank some stuff out tonight. :)

... oh, but on the issue of pretext (this is kind of legal, so it counts), Verizon blocked Usenet last week. Yes, Usenet in general, not specific groups. Guess why?

-- That was unclear. They blocked everything but the big 8, but that means they blocked the entire alt.* hierarchy of over 100,000 groups.

This has been a great thread -- I've been out so I'm still digesting it, but I may want to follow up with some of you on the more technical aspects that I don't understand.

OK, I finished the legal work, so now I can revisit this. Maybe I'm a glutton for punishment, but I love it when publius posts net-neutrality threads. :)

(Firstly, as a disclaimer, I obviously like Ars Technica a lot -- they're on top of this issue, and it's hard to resist posting links to them. I read a few telecom news sites, but even so Ars does a great job of tying things together.)

That said, here is an absolutely awesome article from Ars Technica that explains DPI and net neutrality, touches on the CALEA issue I mentioned earlier, and ties them all together. It's a long read, but very good. The bottom line, in my view, is that CALEA compliance requires DPI at some level, so it really sets the stage for access tiering. Before the FBI, DoJ, et al. filed their joint petition with the FCC demanding DPI for CALEA compliance, the FCC really did seem to be trending the other way. The CALEA move was really the foot in the door.

Also creepy is that the DoJ's Antitrust Division recently filed a brief with the FCC arguing against preemptive net neutrality rules. Given that they're supposed to be the ones monitoring the merger expirations I mentioned earlier, and that the merger expirations are one of the main (underreported) issues creating the need for preemptive regulations in the first place, this is somewhat discomfiting.

Now, turning back to the discussion -- Turb makes the key point:

Again, Comcast can claim whatever they want, but since DPI is useless for detecting tunneling in the presence of encryption, that claim doesn't make any sense.

And again, this is a red herring -- it's a pretext for centralizing network control. It's a designed flaw in the infrastructure. The "capacity crisis" is just a way of forcing the issue -- if there's a "crisis," we could make the choice to commit to new infrastructure or make changes in a way that preserves net neutrality. But that's not what's happening.

Verizon and Comcast are taking different routes to get to the same destination -- at the end of the day, what they want is to be able to look at your traffic, either to verify your usage rates (Verizon) or to throttle the traffic for artificial QoS (Comcast). The goal is the same.

It's hard to tell how much DPI is really feasible because the companies implementing it don't talk about it (Comcast just dropped the ball here -- everyone does some DPI, because CALEA requires them to), but I think they're a lot better at it than Turb seems to suggest. (I also think that Verizon's method is potentially more insidious than Comcast's -- there's less incentive to obfuscate your data under the Verizon regime.)

But ultimately, the profiling doesn't even have to be accurate or correct -- they could just do traffic profiling like Comcast did, and who cares if it's right or not? At the end of the day, your bandwidth is still being throttled, and network control is being taken away from the users. You're encrypting your traffic? Brzzzt. Your packets are now being deprioritized. No, we don't care if the traffic is legit, we're just ensuring QoS for our dedicated VOD pipe -- it's just business. Why, do you have something to hide?

The key phrase here is "arms race." Take a look at the tricks Skype is already using to mask their traffic. VoIP providers should not be forced to write programs that act like viruses just because their service happens to compete with the telcos' monopoly on voice communications. It's absurd what Skype has to go through (legally and technically) to provide what in any rational world would be considered a straightforward, legitimate business. These guys are playing for keeps.

This has been a great thread -- I've been out so I'm still digesting it, but I may want to follow up with some of you on the more technical aspects that I don't understand.

Again, thanks for the thread, publius. Now that my legal-related business is wrapped up, I can also turn to supplementing my technical knowledge. :)

The bottom line, in my view, is that CALEA compliance requires DPI at some level, so it really sets the stage for access tiering.

Adam, do you think cell phone companies record every single minute of all cell phone conversations that they carry and store that data indefinitely? If you don't, why not? I mean, CALEA compliance requires that cell phone companies have this capability, so surely we must assume that CALEA compliance therefore necessitates that cell phone providers have deployed this capability universally and are employing it continuously, right?

Of course they do no such thing. While they have the infrastructure to record any conversation, the costs of recording all conversations are prohibitively high especially since doing so is really really pointless. Likewise, CALEA requires that network providers maintain the capability to do all manner of inspection on SOME traffic, but that inspection is not cheap and it makes no sense to believe that providers are therefore doing it on ALL traffic.

Honestly, all this talk about CALEA leading to access tiering boils down to fear mongering. Just because the government requires you to do an expensive thing on a small scale does not mean that you have the capability to do so on a large scale, let alone that you actually are doing it on a large scale.

either to verify your usage rates (Verizon)

Why does Verizon need DPI to verify usage rates? I mean, counting bytes on the wire is the easiest thing on earth, and it works regardless of whether or not the traffic is encrypted.

I know you think that Verizon and Comcast are quaking with fear at the prospect of network coding, but do you really think that Google and the like are going to significantly expand their power budget calculating like mad in order to optimize coding and reduce bandwidth by 10-20% when their data centers are already starving for power and density limited by the heat associated with their current computational load? And that assumes that you could come up with an awesome distributed network coding control protocol that was both highly efficient and did not impose significant latency. I mean, if we assume a can opener...

Honestly, all this talk about CALEA leading to access tiering boils down to fear mongering.

Well, I definitely don't want to overstate the case. But I do think that the issue of how deep providers are able to and allowed to drill down into users' packets is important, and I do find it worrisome. The issue I'm trying to get at is that CALEA sets the legal stage for access tiering. But I agree that it's a serious claim, so I'll try to back it up as best I can before going to bed...

First, the technical capability is definitely there -- from one of the Ars Technica articles I linked earlier:

DPI vendors ... top-of-the-line products can set you back several hundred thousand dollars, but some of them can inspect and shape every single packet—in real time—for nearly a million simultaneous connections while handling 10-gigabit Ethernet speeds and above.

Legally, things are murkier, because it involves decoding some arcane FCC pronouncements, but I think that if you do decode them, there is at least arguably some cause for concern. The issue starts with this FCC hearing -- "In the Matter of Communications Assistance for Law Enforcement Act and Broadband Access and Services," FCC 04-187, 2004 WL 1774542 (Aug. 9, 2004):

There are potentially several kinds of information about broadband access service that Law Enforcement may seek under section 103’s requirements. For broadband access these potentially include, but are not necessarily restricted, to the following: (1) information about the subject’s access sessions, including start and end times and assigned IP addresses, for both mobile and fixed access sessions; (2) information about changes to the subject’s service or account profile, which could include, for example, new or changed logins and passwords; and (3) information about packets sent and received by the subject, including source and destination IP addresses, information related to the detection and control of packet transfer security such as those in Virtual Private Networks (“VPNs”), as well as packet filtering to favor certain traffic going to or from certain customers.

Maybe I'm reading that wrong, but I find it creepy as hell. As far as I can tell, there's nothing following this declaration contravening it.

Now, from the First Report and Order implementing CALEA:

...we find that facilities-based providers of any type of broadband Internet access service, including but not limited to wireline, cable modem, satellite, wireless, fixed wireless, and broadband access via powerline are subject to CALEA.
It's hard to explain just how bizarre it is that the FCC decided to treat internet and telco providers the same for the purposes of CALEA; in almost every other significant respect the two are separated, basically to protect legacy voice service from VoIP competition. Seriously, this finding is weird.

Finally, from the Second Report and Order implementing CALEA, from May 2006:

CII [Call Identifying Information] may be found within several encapsulated layers of protocols, and as a packet makes its way through the network of the broadband Internet access service provider, these providers’ equipment generally do not examine or process information in the layers used to control packet-mode services such as VoIP, and in fact operate at layers below the ones that carry control information for broadband access services.
In other words, DPI is fair game even if the traffic is in a secured wrapper.

Now, I wouldn't try to claim that ISPs record all your traffic, but the CALEA decision was a major turning point -- indeed, a reversal -- in terms of what they're allowed (required) to do as far as traffic monitoring. I don't think you have to be paranoid to find it a little unnerving.

I know you think that Verizon and Comcast are quaking with fear at the prospect of network coding

That's not what I intended to say. I just think that Verizon will use that as a pretext for packet inspection on a usage-based network just like Comcast used QoS (i.e., Bittorrent) as a pretext on their pipe-based network -- they're just different ways of circumventing net neutrality.

I agree with publius' original post: I don't think that usage-based pricing is incompatible with net neutrality. But QoS provisioning on a piped network doesn't necessarily contravene those principles either. The question is always, still, how it's being done. ISPs will tell you with a straight face that tiered access is net-neutral because users can pay more for better service -- it's the free market at work!

Similarly Verizon will tell you that their pay-as-you-go methods are the ultimate in net neutrality, because, well, you control it! If you want to pay more for that Bittorrent traffic, go for it!

But that's smoke 'n' mirrors. Look at what happens in the real world. On the pipe-based network, BT users take advantage of their idle transmission capacity and engage in some really pretty brilliant and ultimately very efficient means of moving data around the network cheaply. However, they're putting massive strain on Comcast's expensive infrastructure, and since this is all last-hop decoding, Comcast can't capture any of the benefits.

So instead they realize that these guys follow the old 80-20 rule: 20% of the users are using 80% of the bandwidth. Other customers are revolting, so they pull out their Sandvine box and decide, well, we can already profile these people -- it's easy to see a bittorrent swarm -- how about we channelize them through a pipe that can handle only 40% of our capacity, max? There, we've just made our network blazing fast for everyone but the losers who were eating up everyone's bandwidth, and they're still getting twice as much as everyone else anyway.
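
To make the mechanics concrete, here's a toy sketch of that kind of policy -- a shared token bucket sized at 40% of the link for whoever lands in the top 20% by usage. All the rates, thresholds, and names here are made up; this is the shape of the thing, not anyone's actual implementation.

```python
import time
from collections import defaultdict

class TokenBucket:
    """Token bucket: refills at rate_bps, capped at burst_bytes."""
    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0              # bytes per second
        self.capacity = float(burst_bytes)
        self.tokens = float(burst_bytes)
        self.last = time.monotonic()

    def allow(self, nbytes):
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= nbytes:
            self.tokens -= nbytes
            return True
        return False

LINK_BPS = 100_000_000                          # hypothetical 100 Mbps segment
heavy_pipe = TokenBucket(rate_bps=int(0.4 * LINK_BPS), burst_bytes=1_500_000)
usage = defaultdict(int)                        # bytes seen per subscriber this period

def heavy_users(fraction=0.2):
    """The top `fraction` of subscribers ranked by bytes transferred."""
    ranked = sorted(usage, key=usage.get, reverse=True)
    return set(ranked[:max(1, int(len(ranked) * fraction))]) if ranked else set()

def admit(subscriber, nbytes):
    """Heavy users contend for the constrained shared bucket; everyone else passes."""
    usage[subscriber] += nbytes
    if subscriber in heavy_users():
        return heavy_pipe.allow(nbytes)
    return True
```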

Comcast would like real access tiering, but I don't think they're gonna get it. They're small fry. They'll try to get by on bad QoS.

Verizon, that's a different story I'll tell tomorrow. See if you can reconstruct my Comcast/QoS-based service/Bittorrent story, but with Verizon/usage-based service/network coding.

At the end of the day I just don't think that Verizon is taking us in a good direction, and I don't think anyone realizes it...

... guess I burnt everyone out, huh?

Don't take it personally Adam; I for one am drowning in work. I'll try to read up on your links and give you a proper considered response later tonight. I do greatly appreciate the time and energy you've spent trying to explain things.

I lied earlier. Here's a partial response.

But I do think that the issue of how deep providers are able to and allowed to drill down into users' packets is important, and I do find it worrisome.

But providers could always drill down into users' packets. That's exactly what transparent proxying involves and I'd be willing to bet that transparent proxying is deployed on a much wider basis than DPI. In any event, the ability to do DPI seems like a red herring to me: what matters is what you do with the information you glean from DPI. Most of the horrible things you can do with DPI-gleaned data can be easily done with non-DPI data. You want access-tiering? IP headers + DNS lookups + random drops or disconnects can do that. You can do very effective tiering in a completely stochastic manner, without having to worry about checking every single packet.
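
A toy sketch of what I mean by stochastic tiering, with a made-up prefix and drop rate -- classification happens entirely on the destination address (or DNS logs), and dropping even a few percent of packets is enough to wreck TCP throughput to that destination:

```python
import ipaddress
import random

# Hypothetical policy table: drop probability by destination prefix.
# Only the IP header is consulted; no payload inspection anywhere.
DEGRADE = {
    "198.51.100.0/24": 0.05,    # made-up "disfavored" destination
}

def drop_probability(dst_ip):
    addr = ipaddress.ip_address(dst_ip)
    for prefix, p in DEGRADE.items():
        if addr in ipaddress.ip_network(prefix):
            return p
    return 0.0

def forward(dst_ip):
    """Return False to silently drop this packet."""
    return random.random() >= drop_probability(dst_ip)
```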

DPI vendors ... top-of-the-line products can set you back several hundred thousand dollars, but some of them can inspect and shape every single packet—in real time—for nearly a million simultaneous connections while handling 10-gigabit Ethernet speeds and above.

I'm sure these products exist and I'm sure that every provider has purchased them. But, again, there's a difference between purchasing one magical box that can do DPI for the 0.1% of your subscribers that are under investigation and purchasing hundreds or thousands of these boxes so that you can do DPI for every single subscriber simultaneously. Rack space at peering points is at a premium and installing thousands of these extremely expensive boxes means taking up a lot of very expensive space.

Here's how you could convince me: look up the financial data on the companies that make these boxes, backtrack to a very rough estimate of how many boxes they've sold, and give me a super rough guess as to what aggregate data rate they're capable of scanning at once. My guess is that you'll get a number that's 1% of the aggregate customer bandwidth employed by the top five providers.
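
The calculation I have in mind is literally this simple -- every number below is a placeholder, not an estimate of the actual market, and filling them in from the vendors' financials is the whole exercise:

```python
def dpi_coverage(boxes_sold, gbps_per_box, subscribers, avg_mbps_per_sub):
    """Fraction of aggregate subscriber traffic the installed DPI fleet could
    inspect simultaneously. Every input here is a placeholder."""
    dpi_gbps = boxes_sold * gbps_per_box
    traffic_gbps = subscribers * avg_mbps_per_sub / 1_000
    return dpi_gbps / traffic_gbps

# Purely illustrative placeholder values:
print(dpi_coverage(boxes_sold=1_000, gbps_per_box=10,
                   subscribers=50_000_000, avg_mbps_per_sub=1.0))
```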

Look, saying that the technical capability is there is like saying that because I own a car, I'm capable of launching an invasion of Germany. Owning one box is not the same as owning thousands of boxes.

It's hard to explain just how bizarre it is that the FCC decided to treat internet and telco providers the same for the purposes of CALEA; in almost every other significant respect the two are separated, basically to protect legacy voice service from VoIP competition. Seriously, this finding is weird.

I just don't get this. From a law enforcement perspective, VOIP and legacy voice are the same damn thing. The FBI doesn't care. Heck, from a technical perspective, VOIP and legacy voice are the same damn thing: lots of long haul digital voice networks are running IP under the covers. So yes, the FCC has built a Chinese wall to maintain the fiction that these two things are radically different, but law enforcement is not the FCC and they don't care about that fiction, especially when they think that fiction might end up killing people. Why is this hard to understand? If you were in law enforcement, wouldn't you feel exactly the same way? Wouldn't you be pressuring the FCC via your superiors to make them look at things from your perspective? Besides, treating them the same for the purposes of CALEA helps protect legacy voice from VOIP: by raising the administrative costs, it helps level the playing field.

In other words, DPI is fair game even if the traffic is in a secured wrapper.

I don't see anything in the bit you quoted about a "secured" wrapper. If the traffic is in a secured wrapper, then DPI is totally useless. In any event, providers can always look into packets, so who cares? What matters is what they do with that information.

Now, I wouldn't try to claim that ISPs record all your traffic, but the CALEA decision was a major turning point -- indeed, a reversal -- in terms of what they're allowed (required) to do as far as traffic monitoring. I don't think you have to be paranoid to find it a little unnerving.

Did it used to be against the law for providers to look at the complete contents of your packets? Didn't providers always need the capability to respond to court orders at some level?

In any event, the ability to do DPI seems like a red herring to me: what matters is what you do with the information you glean from DPI.

Ah, this is what I'm not explaining. Before CALEA, the FCC drew a very sharp distinction between information services and telecommunications services, and there was a significantly heightened expectation of privacy for the former. Common carrier safe harbors, e.g., require that all information be treated equally.

In applying CALEA to VoIP and other internet technologies, the FCC could have easily limited the scope of the wiretap authority to voice applications (i.e., not email, etc.) -- which would have been consistent with past decisions -- but it didn't. It held that information services had to be treated like voice services solely for the purposes of CALEA, which is messed up because (a) it legitimizes and even requires deep packet inspection tied to users, which definitely wasn't the case before, and (b) it preserves all of the monopoly protections of legacy voice carriers. Moreover, the DoJ's recent anti-net-neutrality brief seems to make it pretty clear that this is just a foot in the door for non-neutral access. CALEA was just a cheap excuse.

Rack space at peering points is at a premium and installing thousands of these extremely expensive boxes means taking up a lot of very expensive space.

I'm aware of the technical limitations and costs involved, but I am quite confident that the monitoring capability on the backbone is extensive and very powerful, though it's totally opaque to the outside. CALEA requires all of this. However, since we can't exactly waltz into MAE-East or PAIX to settle this, we'll probably have to agree to disagree on this one.

I just don't get this. From a law enforcement perspective, VOIP and legacy voice are the same damn thing. The FBI doesn't care. Heck, from a technical perspective, VOIP and legacy voice are the same damn thing: lots of long haul digital voice networks are running IP under the covers.

Yes, agreed, but the FCC's CALEA decisions extend wiretap coverage to all packet-based traffic, not just voice. Just voice would have made sense.

Besides, treating them the same for the purposes of CALEA helps protect legacy voice from VOIP: by raising the administrative costs, it helps level the playing field.

Actually, the FCC's decision forces providers to eat the costs of compliance, so this has become a barrier to VoIP deployment. Another reason why it's messed up.

Did it used to be against the law for providers to look at the complete contents of your packets? Didn't providers always need the capability to respond to court orders at some level?

My understanding is that "information services" used to be protected -- the major distinction was between traffic and storage. Transport providers weren't allowed to snoop into the traffic running across their network short of routing it, but emails, e.g., might still be subject to a subpoena if they were stored somewhere. Under the new CALEA standard all traffic moving across the network becomes fair game, even if it's transient.

Ah, this is what I'm not explaining. Before CALEA, the FCC drew a very sharp distinction between information services and telecommunications services, and there was a significantly heightened expectation of privacy for the former.

Did that apply to s**m filtering in ISP email systems?

How does increasing the drop rate for packets sent to a particular destination impact anyone's privacy?

Common carrier safe harbors, e.g., require that all information be treated equally.

I don't understand what it means to treat all information equally. I can guess at how that would work in a voice network, but I don't know how to translate those rules into a packet switched realm. I mean, with voice networks, there is one and only one class of service, so the only issue is connectivity: you either provide it or you don't. But surely common carrier status does not require that all providers purchase suitable bandwidth to ensure that connections from any one node on their network to any other network must have identical bandwidth and loss characteristics, right? That would be insane. Some networks may be more expensive to peer with than others, and so providers might perfectly legitimately decide to skimp on connectivity with those nets. The result would be that those networks would still be accessible, but the performance of flows exchanged with those networks would occasionally (or maybe always) suck.

I don't understand how this process could be forbidden by common carrier regulations, and I also don't understand how this process materially differs from Comcast increasing the random drop rate for youtube bound packets. Can you explain?

In applying CALEA to VoIP and other internet technologies, the FCC could have easily limited the scope of the wiretap authority to voice applications (i.e., not email, etc.) -- which would have been consistent with past decisions -- but it didn't.

How would that work exactly? If I tunnel my VOIP protocol through AIM or a really awesome pair of specially modified email servers, wouldn't that allow me to evade a narrower scoped CALEA? I just don't see how the FCC could have limited the scope of the authority without giving all users a get-out-of-VOIP-wiretapping-free card. At the end of the day, it's all bits, and bits are fungible.

I could write a fake mail or http server and client pair that effectively tunnel an encrypted VOIP stream over a bunch of sockets that appear to be transmitting legitimate mail messages or web requests. It would be slightly annoying and delays might be higher than you'd get otherwise, but it wouldn't be very hard either.
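
For instance, something along these lines -- the endpoint is hypothetical and you'd need a cooperating server on the far side to unwrap the chunks, but to a middlebox reading headers it's just web traffic:

```python
import base64
import urllib.request

TUNNEL_URL = "https://mail.example.net/submit"   # hypothetical cooperating server

def send_chunk(encrypted_audio):
    """Wrap an (already encrypted) media chunk in what looks like an ordinary
    form POST; the far end unwraps it and plays it out. Latency suffers, but
    nothing in the headers says 'voice'."""
    body = b"msg=" + base64.b64encode(encrypted_audio)
    req = urllib.request.Request(
        TUNNEL_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
    return urllib.request.urlopen(req)
```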

It held that information services had to be treated like voice services solely for the purposes of CALEA, which is messed up because (a) it legitimizes and even requires deep packet inspection tied to users, which definitely wasn't the case before, and (b) it preserves all of the monopoly protections of legacy voice carriers. Moreover, the DoJ's recent anti-net-neutrality brief seems to make it pretty clear that this is just a foot in the door for non-neutral access. CALEA was just a cheap excuse.

I don't understand what you mean by (b).

I'm aware of the technical limitations and costs involved, but I am quite confident that the monitoring capability on the backbone is extensive and very powerful, though it's totally opaque to the outside. CALEA requires all of this. However, since we can't exactly waltz into MAE-East or PAIX to settle this, we'll probably have to agree to disagree on this one.

When you say extensive and powerful, do you mean "capable of simultaneous DPI on 1% of a provider's subscriber base" or do you mean "capable of simultaneous DPI on 100% of subscribers"? The difference is significant, no?

Yes, agreed, but the FCC's CALEA decisions extend wiretap coverage to all packet-based traffic, not just voice. Just voice would have made sense.

No, just voice would not have made sense. See my earlier comments about making a mail or http server/client pair that hides VOIP traffic.

Actually, the FCC's decision forces providers to eat the costs of compliance, so this has become a barrier to VoIP deployment. Another reason why it's messed up.

Yes, this is what I meant by leveling the playing field: it helps the incumbent legacy voice providers by saddling VOIP with extra costs.

Turb, I have to admit that I'm getting a bit worn out on this. My response to a lot of this just amounts to either "the regulations don't make sense" or "the parties involved say they're doing one thing, but I think they're actually doing something else." I'll fully cop to a fair amount of editorializing on my part.

Did that apply to s**m filtering in ISP email systems?

Yes and no. The question is whether you treat the traffic in a neutral fashion. The (admittedly somewhat fuzzy) distinction lies in whether you discriminate based on content or source.

The default position is no traffic discrimination. But if a provider applies the same automatic rules to all traffic, they're protected by common-carrier safe harbor rules.

The gray area is where carriers apply rules that are neutral in form but discriminatory in effect. It's case-by-case, except where CALEA is concerned.

How does increasing the drop rate for packets sent to a particular destination impact anyone's privacy?

Well, again, it depends on the criteria you use. If you drop packets for legit QoS reasons, that's fine. The problem is where you have situations like Comcast throttling Bittorrent users on the sly. Their argument was that this was a QoS policy, but that was pretty transparently false if only because they were forging TCP resets.
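
As an aside, forged resets are detectable precisely because they're injected from somewhere mid-path. Here's a sketch of one common heuristic -- flagging RSTs whose IP TTL doesn't match the rest of the flow -- though comparing captures taken at both endpoints is the reliable check:

```python
from collections import defaultdict

flow_ttls = defaultdict(list)   # (src, dst, sport, dport) -> TTLs seen so far

def check_packet(flow, ttl, is_rst, tolerance=3):
    """Record TTLs per flow and flag RSTs whose TTL is far from the flow's norm,
    a hint that the reset was injected by something closer than the real peer."""
    if is_rst and flow_ttls[flow]:
        seen = sorted(flow_ttls[flow])
        typical = seen[len(seen) // 2]          # rough median
        if abs(ttl - typical) > tolerance:
            return "possible forged RST"
    flow_ttls[flow].append(ttl)
    return "ok"
```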

That said, if they'd been smarter about it I think it would be a tougher question. Still wrong, but a tougher question, and it probably would have gone undetected for longer.

The difficult issue is that there's an obvious need for network management, but if QoS is opaque to end users, there will always be questions about why providers do certain kinds of network management, as well as a temptation to do management that, while not blatantly discriminatory, might not be truly fair either. From the user's standpoint, it's hard to tell the difference.

But surely common carrier status does not require that all providers purchase suitable bandwidth to ensure that connections from any one node on their network to any other network must have identical bandwidth and loss characteristics, right?

There are lots of different places where the phrase "common carrier" is used, so this is a hard question to answer. The transport issues you're talking about have generally been covered by peering arrangements, and major ISPs have been de-peered for providing substandard service. There are safe harbors for copyright and junk mail for providers who treat the traffic in certain ways (respect DMCA notices, no forged headers, etc.).

The CALEA safe harbor, though, requires that carriers have certain monitoring capabilities in order to be in compliance, so it's unusual in that it's a positive requirement rather than a prohibition or a carve-out; it's not really a "safe harbor" at all.

I just don't see how the FCC could have limited the scope of the authority without giving all users a get-out-of-VOIP-wiretapping-free card. At the end of the day, it's all bits, and bits are fungible.

That's a good question. The short answer is that VoIP services ultimately have to terminate on the public switched telephone network (PSTN), where the FCC still has plenty of control. The FCC could have required VoIP providers to comply with CALEA in order to terminate on the PSTN.

Or, the FCC could have just made a straightforward regulatory pronouncement -- "all VoIP providers have to comply with CALEA." But that's what's weird -- all the decisions discuss VoIP, but then just turn around and say, well VoIP travels over the internet, so instead all internet providers have to comply with CALEA, and traffic monitoring would happen through DPI.

As you point out, that doesn't make any technical sense, because anyone who really wants to avoid monitoring can just encrypt their traffic. So why do it that way? Why not just focus on voice solutions?

I could write a fake mail or http server and client pair that effectively tunnel an encrypted VOIP stream over a bunch of sockets that appear to be transmitting legitimate mail messages or web requests. It would be slightly annoying and delays might be higher than you'd get otherwise, but it wouldn't be very hard either.

Well, exactly. (a) Why should you have to encrypt your traffic to provide VoIP? (b) If it's not that hard, what's the point of regulating it like that? (c) Why cast the net that wide? The CALEA holding regulates all internet traffic, whereas FCC policy in literally every other area distinguishes between information services (data) and telecommunications services (voice). It's overbroad and incoherent.

I don't understand what you mean by (b).

I'm saying that the DoJ's positions make no sense if viewed from the perspective of national security, which is ostensibly their rationale. They do make sense if the goal is to pave the way for a non-net-neutral policy, because they undermine the fundamental rule that you treat all traffic equally.

It's analogous to the way the current Admin has treated FISA -- once you change the default position from "don't monitor unless there's a positive reason to" to "monitor unless there's a reason not to," you've crossed an important line.

When you say extensive and powerful, do you mean "capable of simultaneous DPI on 1% of a provider's subscriber base" or do you mean "capable of simultaneous DPI on 100% of subscribers"? The difference is significant, no?

I don't think it's possible to answer that -- it depends on how they profile, whether packets get shunted off for more analysis, how much analysis you do and what type. My feeling is that the capability is technically very advanced, and I imagine that if you read the press releases by the companies that manufacture traffic-shaping gear, you might agree.

But I still think that's sort of beside the point. CALEA requires that all packets pass through some sort of gate, so everything is fair game. At that point it's just a matter of throwing hardware at it. As long as you spread the latency evenly, there's pretty much nothing you can't do.

Yes, this is what I meant by leveling the playing field: it helps the incumbent legacy voice providers by saddling VOIP with extra costs.

Hmm. I suppose this is a matter of opinion and politics, but I don't think the incumbent telcos are the ones that really need protection in this case. That's a different discussion, though.

Another way of putting all that might be that the gap between "send fake TCP resets to users whose number of outbound connections is one standard deviation above the mean over a 24-hour period" and "screw the Bittorrent users, they're mostly pirates and no one will defend them" is not very big technically, but legally and politically the distinction is important.
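
And to be clear about how thin the technical side of that gap is, the facially neutral rule really is just a few lines of arithmetic -- a sketch only, with a made-up data structure:

```python
import statistics

def flag_heavy_connectors(conn_counts):
    """conn_counts: {subscriber: outbound connections over 24 hours}.
    Returns the subscribers more than one standard deviation above the mean."""
    values = list(conn_counts.values())
    if len(values) < 2:
        return set()
    mean, stdev = statistics.mean(values), statistics.stdev(values)
    return {sub for sub, count in conn_counts.items() if count > mean + stdev}
```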

If you allow network providers to centralize routing decisions rather than treating all traffic equally, then the distinction between efficiency and abuse is essentially just in how you describe it. When intelligence resides in the edges of the network, it doesn't prevent reasonable network management, but it does level the playing field somewhat. The latter situation might not necessarily be a race to the top, but the former is almost certainly a race to the bottom.

Turb, you were right about compressing the YouTube streams. I wasn't thinking clearly at all. That's definitely end-to-end compression, so probably not efficient under my scheme. It's very amenable to caching, of course, but that doesn't make a difference under a per-byte pricing system (unless inter- and intra-network transport are priced separately).
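
A quick toy calculation of that pricing point, with made-up prices -- under a single flat per-byte rate the cache does nothing for the customer's bill, and only split pricing rewards keeping the bytes on-net:

```python
# Toy numbers, purely illustrative.
bytes_streamed = 500e6                 # one hypothetical video session
gb = bytes_streamed / 1e9

flat_rate = 0.50                       # $/GB, single per-byte price, any source
split_external = 0.50                  # $/GB crossing the provider boundary
split_internal = 0.10                  # $/GB served from an on-net cache

print("flat rate, cache or not:  ", gb * flat_rate)
print("split pricing, cache miss:", gb * split_external)
print("split pricing, cache hit: ", gb * split_internal)
```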

Still, there's a lot of uncompressed data at the last hop -- and under a per-byte system even the incremental improvements make a difference. No network providers today have complete end-to-end control -- and really, now that I think about it, I suppose I could rephrase my argument along those lines:

I.e., since end-to-end control is necessary to maximize information flow in any network, and under a per-byte pricing scheme you're essentially paying for the information flow directly, it creates a huge incentive for end-to-end control that simply doesn't exist under a per-pipe system.

Of course, a smart network (centralized routing decisions) is in theory more efficient than a dumb network (decentralized routing decisions). On the other hand, one of the amazing things about the Internet is how well it's managed to self-regulate. "Capacity crisis" notwithstanding, it still works.

In the end, I think that what we're really talking about is whether network providers can be trusted with end-to-end control of the network. Based on my reading of the history of the telecom industry -- long- and short-term -- my answer is a fairly emphatic "no," but I'll admit it's an issue where reasonable people can disagree.

What I think is less controversial, though, is that sacrificing that much control for a nominal amount of "efficiency" is not a fair deal. The proposed solutions simply don't credibly address the purported problem.

Moreover, there is good reason to be suspicious of doomsday predictions by incumbent carriers -- Bell did essentially the same song 'n' dance for decades, and the social cost of their stonewalling was not trivial. Modems would supposedly melt the network; asynchronous routing would never work. Those technical claims were always ludicrous -- much like the claim that deep packet inspection is necessary to monitor voice traffic in the interests of national security -- but waving that bloody shirt was still an effective strategy, and it was usually effectuated by gaming the FCC.

I'm concerned that the net-neutrality debate is going down a similar path right now. The predictions of an inevitable crisis are not new -- but the simple fact is that the network works well right now, and the claim that future constraints on bandwidth will melt the Internet is simply not very credible.

What, then, is driving the debate here? publius says, perceptively, that "In short, net neutrality encourages better solutions. The entire debate – to me – is about what sort of incentives we want to create."

So the point is this: paying per byte rather than per pipe changes the network in a very fundamental way, pushing it toward a model that is inherently less amenable to decentralized control. Scarcity is not the real problem here; this is a way of restructuring the incentives built into the network model.

Basically, I'm not convinced that the per-pipe model is so wildly inefficient that it necessitates retooling the design of the network so radically. The claimed danger is being overstated; the solutions on the table are self-serving. What's being said and what's being done do not match up.

Oh, man. As if on cue, check out the anti-spyware bill that's on deck right now. One guess who gets the exemption allowing them to monitor for "unauthorized" traffic.

I finally found, hiding on my drive, the paper that explains what I was trying to explain. I can't recommend it highly enough.
