
Living on the edge


The previous posts in this series sketched out how the route from 10 Gbps to 100 Gbps and beyond approaches the theoretical capacity limit of a DWDM channel. Any system operated at the edge of the envelope tends to fail spectacularly, and high channel capacity optics are no exception. Lower bit rate transceivers had a narrow range of degraded operation where bit error rate (BER) would increase as the received signal level approached the lower limit. As we push the channel capacity to the limit, operating margins are reduced, and the margin for error all but disappears.

From Information Theory 101, we know that increasing throughput by a factor of 10, from 10 Gbps to 100 Gbps, would require a 10x (10 dB) improvement in OSNR, all other things being equal. Transmitting 100 Gbps with more sophisticated PM-QPSK modulation (also known as DP-QPSK), rather than simple OOK, provided a 4x reduction in symbol rate by coding two bits per symbol in each of the two polarization modes. That left a 2.5x (roughly 4 dB) gap that needed to be filled for full backwards compatibility of 100 Gbps waves on existing systems designed for 10 Gbps per DWDM channel.
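The arithmetic behind that gap is easy to check. The short sketch below only restates the ratios from the paragraph above in decibels; the helper name `to_db` is mine, and the numbers are the rule-of-thumb ratios used in this post, not measurements from any particular system.

```python
import math

def to_db(ratio: float) -> float:
    """Convert a linear power ratio to decibels."""
    return 10 * math.log10(ratio)

rate_increase = 100 / 10        # 10x more bits per second
symbol_rate_relief = 2 * 2      # 2 bits per symbol x 2 polarization modes
remaining_gap = rate_increase / symbol_rate_relief

print(f"OSNR penalty for 10x the bit rate: {to_db(rate_increase):.1f} dB")
print(f"Recovered by 4x lower symbol rate: {to_db(symbol_rate_relief):.1f} dB")
print(f"Gap left to close: {remaining_gap}x, or {to_db(remaining_gap):.1f} dB")
```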

If this OSNR gap could not be filled, then deployment of 100 Gbps waves would require costly and disruptive re-engineering of installed networks, limiting their utility. Once again, technology originally developed and deployed for wireless communications provided a solution. The secret weapon used to close this gap was improved forward error correction (FEC). But FEC is a double-edged sword.

By adding redundant bits to the bit stream, FEC allows bit errors from forward transmission to be corrected at the receiver, without requesting retransmission on a backward channel. This is analogous to redundant RAID arrays in disk drive storage. By including an additional disk drive, and adding redundant data to each disk, a RAID disk array can tolerate complete failure of any one disk without data loss. Likewise, by breaking a bit stream into blocks and adding redundant bits, FEC can correct a limited number of random bit errors, recovering the corrupted receive data as originally transmitted, without loss.
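To make the idea concrete, here is a toy example using the classic Hamming(7,4) block code, which adds three parity bits to every four data bits and can correct any single bit error in the block. This is purely an illustration of the principle; the FEC used in optical transport is far stronger and works on much larger blocks, and the function names here are my own.

```python
def hamming74_encode(data):
    """Encode 4 data bits as a 7-bit codeword laid out p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(codeword):
    """Correct at most one flipped bit, then return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3   # 1-based position of the error, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1          # flip the bit the syndrome points at
    return [c[2], c[4], c[5], c[6]]

data = [1, 0, 1, 1]
sent = hamming74_encode(data)

one_error = list(sent)
one_error[4] ^= 1                     # one bit corrupted in transit
print(hamming74_decode(one_error) == data)    # True: the error is corrected

two_errors = list(sent)
two_errors[1] ^= 1
two_errors[4] ^= 1                    # two bits corrupted: beyond the code's limit
print(hamming74_decode(two_errors) == data)   # False: the decoder miscorrects
```

Note what happens with two errors: the decoder does not just give up, it confidently "fixes" the wrong bit and hands back bad data.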

But like everything else, FEC has limits. For a given number of redundant bits, only a corresponding number of bit errors can be corrected. Once the input bit error rate reaches a particular FEC algorithm’s limit, the error correction process breaks down, and bit errors appear in the output data. The FEC algorithm fails completely if the bit error rate increases further, and the output data becomes unusable. This catastrophic failure mode exacerbates the so-called “cliff effect” of rapid degradation in digital transmission on noisy links.

Without FEC, the bit error rate would increase more gradually as the OSNR decreased. With FEC, the BER remains near zero as the OSNR degrades, because the algorithm cleans up low-level bit errors. But once the received BER approaches the limit of what the FEC algorithm can correct, a small decrease in OSNR produces a much larger increase in output BER with FEC than without. So, FEC delays the onset of degraded performance, but it can only do this by reducing the margin for error.
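A rough way to see the cliff is to ask how often a block arrives with more errors than the code can fix. The sketch below assumes a hypothetical binary block code with 255-bit blocks that corrects up to 8 random, independent bit errors per block; both parameters are chosen only for illustration and do not describe any particular product. Without FEC, the output error rate simply tracks the input error rate.

```python
import math

def block_failure_prob(p: float, n: int = 255, t: int = 8) -> float:
    """Probability that an n-bit block has more than t bit errors,
    assuming independent errors with probability p per bit."""
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(t + 1, n + 1)
    )

# Sweep the input (pre-FEC) bit error rate and watch the uncorrectable
# block rate climb much faster than the input BER does.
for input_ber in (1e-4, 3e-4, 1e-3, 3e-3, 1e-2, 3e-2):
    print(f"input BER {input_ber:.0e} -> "
          f"uncorrectable blocks {block_failure_prob(input_ber):.2e}")
```

Under these assumptions, each step in the sweep worsens the input BER by about a factor of three, while the uncorrectable block rate moves by orders of magnitude per step. That is the slope of the cliff.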

Getting throughput closer to the theoretical OSNR limit requires more efficient FEC algorithms. With these more efficient algorithms, bit errors are corrected down to an even lower level of OSNR. FEC does not move the theoretical OSNR limit, however; it just allows error-free operation closer to that edge. Once OSNR approaches the limit, the more efficient FEC algorithm still breaks down, but the slippery slope is even steeper.

The key takeaway here is that empirical “plug-and-pray” deployments of optical gear become even more untenable as data rates increase, leading to brick-wall failure modes that provide little or no warning of impending failure. Many operators have foolishly relied on degradation of output BER to serve as a warning system. Increasing dependence on FEC to improve throughput makes this pure folly.

Without proper design up front, rigorous validation of the as-built system against the design parameters, and constant vigilance over the system lifetime, reliable operation will just be an elusive goal. The margin for degraded operation, where intervention can preempt catastrophic failure, becomes vanishingly small as the channel capacity is stretched. Poor practices that have worked in the past will no longer produce the desired results.

The rapid increase in BER near the OSNR limit with FEC does not matter in the case of a fiber cut, but this sudden failure mode is relatively rare. It is much more common to see a gradual degradation of the fiber link over a time span of several days or months. This can be caused by an accumulation of many small macro-bending losses over time, or a single mechanical instability that slowly gets worse (e.g. a loose connector, cracked fiber, or kinked cable). With proper network performance monitoring, the erosion in optical margin or quality factor (Q-factor) can be detected and addressed at the network operations level in the normal course of business.
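As a sketch of what that monitoring might look like, the snippet below converts a pre-FEC BER reading to Q-factor in dB (using the usual Gaussian-noise relationship) and fits a straight line to a series of daily Q readings to estimate how long until the trend reaches an alarm threshold. The function names, the sample readings and the threshold are all made up for illustration; a real system would take these from the transponder's performance counters and the vendor's FEC limit.

```python
import math
from statistics import NormalDist

def q_factor_db(pre_fec_ber: float) -> float:
    """Q-factor in dB for a pre-FEC BER below 0.5, assuming Gaussian noise."""
    q_linear = -NormalDist().inv_cdf(pre_fec_ber)
    return 20 * math.log10(q_linear)

def days_until_threshold(daily_q_db, threshold_db):
    """Least-squares line through daily Q readings; returns the number of
    days after the last reading at which the line reaches threshold_db,
    or None if Q is not trending down."""
    n = len(daily_q_db)
    mean_x = (n - 1) / 2
    mean_y = sum(daily_q_db) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_q_db))
    slope = sxy / sxx
    if slope >= 0:
        return None
    intercept = mean_y - slope * mean_x
    crossing_day = (threshold_db - intercept) / slope
    return max(0.0, crossing_day - (n - 1))

# Hypothetical readings: Q eroding by about 0.1 dB per day.
readings = [12.0, 11.9, 11.8, 11.7, 11.6]
print(days_until_threshold(readings, threshold_db=10.6))
```

The point is not the curve fit, which is deliberately crude. The point is that the margin is trended and acted on while the output BER still looks perfect.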

Without proactive maintenance, problems propagate up the layers in the network stack. Adverse influences accumulating in the network at layer-0 eventually produce bit errors at layer-1. In an IP network, this causes CRC errors at layer-2 that require packet retransmission under TCP at layer-4. This leads to sluggish application performance at layer-7, which generates angry phone calls at layer-8. At this point, the problem is no longer a purely technical issue, because too many people outside the networking organization are adversely affected.

With FEC, this cascading failure chain snaps more quickly. The next post in this series will address how to make FEC an asset, rather than a liability, and expand on improving network reliability as more complex transmission schemes are necessarily employed to increase fiber capacity.

Doug Haluza,

CTO, Metro|NS

Ed. note: this is the fourth post in a series. The previous post is here. The first post is here.




In the last post, I reviewed how coherent optics allowed 40 Gbps waves to be dropped into existing 10 Gbps DWDM systems without major modifications. That was good news for network operators who had a much more difficult upgrade path from 2.5 to 10 Gbps. There’s more good news: the optical magicians have pulled another rabbit out of their hat. The new generation of 100 Gbps transponders will also play with 10 and 40 Gbps waves in existing 50 GHz DWDM windows. The bad news is that it looks like there are no more rabbits in the hat.

At 100 Gbps, the optical to electrical conversion is problematic, because processing a 100 Gbps native stream would require very specialized electronics today. One way to mitigate these problems is to divide the 100 Gbps stream using wavelength division multiplexing into 10 x 10 Gbps or 4 x 25 Gbps optical channels. These lower speed streams can be transmitted by separate lasers and processed using less specialized optoelectronics. This works well for short-range links where fiber capacity is relatively inexpensive. For longer reach where system capacity is valuable, and suitable lasers are expensive, a native 100 Gbps optical channel using a single laser is desirable.

But increasing modulation rate using the same method is not a viable option for upgrading existing networks, making more sophisticated modulation schemes necessary. Encoding two bits per symbol doubles the data rate without increasing the optical bandwidth, or sensitivity to dispersion. Encoding two of these signals, one in each polarization mode of the fiber, allows a further doubling of the data rate, still with the same bandwidth and dispersion tolerance. This scheme, known as dual-polarization quadrature phase-shift keying (DP-QPSK), is now the standard for commercial development of long-haul 100 Gbps on a single wavelength.

Encoding four bits per symbol interval not only enables transmission using a single channel, it also facilitates signal processing without expensive ultra high-speed electronics. The four bits can be processed as four parallel and uncorrelated 25 Gbps payloads on the line side, and then multiplexed into a single 100 Gbps serial handoff on the drop side.
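Here is a minimal sketch of that four-bits-per-symbol mapping. It assumes a simple mapping where each bit independently sets the sign of one quadrature (I or Q) of one polarization (X or Y), which is one common way to draw QPSK; actual transponders may map bits differently, and the line rate below ignores FEC and framing overhead.

```python
import math

def dp_qpsk_modulate(bits):
    """Map 4 bits to a pair of QPSK symbols, one per polarization mode."""
    b0, b1, b2, b3 = bits
    scale = 1 / math.sqrt(2)                        # unit-energy symbols
    x_pol = complex(1 - 2 * b0, 1 - 2 * b1) * scale
    y_pol = complex(1 - 2 * b2, 1 - 2 * b3) * scale
    return x_pol, y_pol

def dp_qpsk_demodulate(x_pol, y_pol):
    """Recover the 4 bits from the sign of each quadrature."""
    return [int(x_pol.real < 0), int(x_pol.imag < 0),
            int(y_pol.real < 0), int(y_pol.imag < 0)]

LINE_RATE_GBPS = 100
BITS_PER_SYMBOL = 2 * 2                             # 2 bits x 2 polarizations
symbol_rate_gbaud = LINE_RATE_GBPS / BITS_PER_SYMBOL

x, y = dp_qpsk_modulate([1, 0, 0, 1])
print(dp_qpsk_demodulate(x, y))                     # [1, 0, 0, 1]
print(f"{symbol_rate_gbaud} Gbaud per tributary")   # 25.0 Gbaud per tributary
```

Because each bit rides on its own quadrature, the four tributaries can be handled by four parallel 25 Gbps data paths rather than one 100 Gbps path.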

Decoding a polarization multiplexed signal presents a problem, though. Ordinary single mode fiber does not maintain polarization state along its length. So, complex and expensive dynamic polarization controllers were needed in the past to align the receiver with the transmitted polarization state in the optical domain. A coherent detector moves the polarization state into the electrical domain, allowing it to be estimated by the DSP algorithm. The problem of receiving the two scrambled polarization modes is analogous to transmitting data in free space using two antennas and two receivers, known in wireless communications as multiple-input, multiple-output (MIMO). Algorithms developed for MIMO have been adapted to decode the scrambled polarization state in a coherent receiver, making polarization multiplexing feasible.
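The 2x2 nature of the problem can be sketched in a few lines. Below, the fiber is modeled as an unknown-to-the-transmitter polarization rotation (a unitary 2x2 Jones matrix) that mixes the X and Y signals, and the receiver undoes it by applying the inverse matrix. The big simplification is that this sketch is handed the matrix; a real coherent receiver has to estimate it adaptively from the received signal and track it as it drifts. The angles and function names are arbitrary choices for illustration.

```python
import cmath
import math

def fiber_jones(theta: float, phi: float):
    """Unitary 2x2 Jones matrix for an arbitrary polarization rotation."""
    a = math.cos(theta) * cmath.exp(1j * phi)
    b = complex(math.sin(theta))
    return [[a, b], [-b, a.conjugate()]]

def conj_transpose(m):
    """Conjugate transpose, which is the inverse of a unitary matrix."""
    return [[m[0][0].conjugate(), m[1][0].conjugate()],
            [m[0][1].conjugate(), m[1][1].conjugate()]]

def apply(m, v):
    """Multiply a 2x2 matrix by a 2-element vector [x_pol, y_pol]."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

scale = 1 / math.sqrt(2)
sent = [complex(1, 1) * scale, complex(-1, 1) * scale]   # one symbol per polarization

fiber = fiber_jones(theta=0.7, phi=1.1)      # arbitrary, assumed rotation
received = apply(fiber, sent)                # X and Y are now scrambled together
recovered = apply(conj_transpose(fiber), received)

print(received)    # neither element looks like a clean QPSK symbol
print(recovered)   # matches `sent` to within rounding error
```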

With these advancements, 100 Gbps DP-QPSK waves can be added to an existing DWDM system engineered for 10 Gbps. In fact, 100 Gbps transponders using all digital dispersion compensation could be used on links that would require dispersion compensation just to pass 10 Gbps. This can bring new life to older fiber routes that are capacity limited and not easily upgraded, or add value to old fiber obtained on long-term IRU.

Of course, there has to be a down side, and naturally it’s cost. Dual polarization adds optical elements and doubles the number of transmitter and receiver elements. The coherent detector doubles the number of receiver elements again. Each of the four receiver elements must employ high speed ADC and sophisticated real-time DSP. So the cost of 100 Gbps DP-QPSK transponders will probably not be too much less than ten times the cost of 10 Gbps, when they become available. Right now the standard is just a multi-source agreement to develop common components that each optical equipment vendor can use in their proprietary implementation. These components are just entering production now.

That does not mean that you can’t deploy 100 Gbps over a single DWDM wave now. Ciena has 100 Gbps line cards for the OME 6500 platform that have been deployed for more than a year. The former Nortel engineers who developed these had to use an additional trick to split the payload into 12.5 Gbps slices so readily available integrated circuit technology could be used to decode the data. In addition to splitting the signal in phase and polarization, they also split the optical carrier into two sub-carriers using frequency division multiplexing in the optical domain. Each sub-carrier carries half the data à la WDM, but the two carriers are only separated by 20 GHz so they can fit in a single 50 GHz DWDM window.
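The slice sizes fall straight out of the number of ways the payload is split. This small helper (my own naming, and ignoring FEC and framing overhead) just does the division for the single-carrier and dual-carrier cases described above.

```python
def tributary_rate_gbps(payload_gbps, subcarriers, polarizations, bits_per_symbol):
    """Payload rate carried by each parallel electrical tributary."""
    return payload_gbps / (subcarriers * polarizations * bits_per_symbol)

# Single-carrier DP-QPSK: 2 polarizations x 2 bits per symbol.
print(tributary_rate_gbps(100, subcarriers=1, polarizations=2, bits_per_symbol=2))  # 25.0

# Dual-carrier DP-QPSK: the same split again across two sub-carriers.
print(tributary_rate_gbps(100, subcarriers=2, polarizations=2, bits_per_symbol=2))  # 12.5
```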

Technology adapted from wireless to optical communication has allowed an order of magnitude growth in capacity of existing DWDM networks, without costly and disruptive upgrades to the installed plant. But this has taken us pretty close to the theoretical throughput limit under the Shannon–Hartley theorem given the typical parameters of existing large-scale networks. It is possible to get higher data rates with better OSNR, or with more bandwidth; but it’s doubtful that we will see a 400 Gbps transponder suitable for general deployment in existing 50 GHz DWDM amplified networks originally engineered to carry only 10 Gbps.
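A back-of-the-envelope calculation shows why the next step is so much harder. Rearranging the Shannon–Hartley formula C = B log2(1 + SNR) gives the minimum signal-to-noise ratio for a given bit rate and bandwidth. The sketch below makes idealized assumptions that are mine, not the vendors': a linear channel with Gaussian noise, perfect coding, the full 50 GHz window usable, two independent polarization modes, and no fiber nonlinearity or overhead. Real systems need considerably more than these floor values, so treat the result as a trend rather than a design number.

```python
import math

def min_snr_db(bit_rate_gbps: float, bandwidth_ghz: float, polarizations: int = 2) -> float:
    """Minimum SNR in dB implied by C = B * log2(1 + SNR), per polarization."""
    bits_per_hz = bit_rate_gbps / (bandwidth_ghz * polarizations)
    return 10 * math.log10(2 ** bits_per_hz - 1)

for rate in (100, 200, 400):
    print(f"{rate} Gbps in 50 GHz: at least {min_snr_db(rate, 50):.1f} dB SNR")
```

Under these assumptions, going from 100 to 400 Gbps in the same window raises the floor by nearly 12 dB. That is a great deal to ask of a line system whose amplifier spacing and noise budget were set for 10 Gbps.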

Doug Haluza,

CTO, Metro|NS

Ed. Note: this is the second post in a series. Click here for the first post. The next post is here.


