File: rfc7786.html

<pre>Internet Engineering Task Force (IETF)                M. Kuehlewind, Ed.
Request for Comments: 7786                                    ETH Zurich
Category: Experimental                                  R. Scheffenegger
ISSN: 2070-1721                                             NetApp, Inc.
                                                                May 2016


           <span class="h1">TCP Modifications for Congestion Exposure (ConEx)</span>

Abstract

   Congestion Exposure (ConEx) is a mechanism by which senders inform
   the network about expected congestion based on congestion feedback
   from previous packets in the same flow.  This document describes the
   necessary modifications to use ConEx with the Transmission Control
   Protocol (TCP).

Status of This Memo

   This document is not an Internet Standards Track specification; it is
   published for examination, experimental implementation, and
   evaluation.

   This document defines an Experimental Protocol for the Internet
   community.  This document is a product of the Internet Engineering
   Task Force (IETF).  It represents the consensus of the IETF
   community.  It has received public review and has been approved for
   publication by the Internet Engineering Steering Group (IESG).  Not
   all documents approved by the IESG are a candidate for any level of
   Internet Standard; see <a href="./rfc5741#section-2">Section&nbsp;2 of RFC 5741</a>.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   <a href="http://www.rfc-editor.org/info/rfc7786">http://www.rfc-editor.org/info/rfc7786</a>.

















<span class="grey">Kuehlewind &amp; Scheffenegger    Experimental                      [Page 1]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-2" ></span>
<span class="grey"><a href="./rfc7786">RFC 7786</a>               TCP Modifications for ConEx              May 2016</span>


Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to <a href="https://www.rfc-editor.org/bcp/bcp78">BCP 78</a> and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (<a href="http://trustee.ietf.org/license-info">http://trustee.ietf.org/license-info</a>) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   <a href="#section-1">1</a>.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   <a href="#page-3">3</a>
     <a href="#section-1.1">1.1</a>.  Requirements Language . . . . . . . . . . . . . . . . . .   <a href="#page-4">4</a>
   <a href="#section-2">2</a>.  Sender-Side Modifications . . . . . . . . . . . . . . . . . .   <a href="#page-4">4</a>
   <a href="#section-3">3</a>.  Counting Congestion . . . . . . . . . . . . . . . . . . . . .   <a href="#page-5">5</a>
     <a href="#section-3.1">3.1</a>.  Loss Detection  . . . . . . . . . . . . . . . . . . . . .   <a href="#page-6">6</a>
       <a href="#section-3.1.1">3.1.1</a>.  Without SACK Support  . . . . . . . . . . . . . . . .   <a href="#page-7">7</a>
     <a href="#section-3.2">3.2</a>.  Explicit Congestion Notification (ECN)  . . . . . . . . .   <a href="#page-8">8</a>
       <a href="#section-3.2.1">3.2.1</a>.  Accurate ECN Feedback . . . . . . . . . . . . . . . .  <a href="#page-10">10</a>
       <a href="#section-3.2.2">3.2.2</a>.  Classic ECN Support . . . . . . . . . . . . . . . . .  <a href="#page-10">10</a>
   <a href="#section-4">4</a>.  Setting the ConEx Flags . . . . . . . . . . . . . . . . . . .  <a href="#page-11">11</a>
     <a href="#section-4.1">4.1</a>.  Setting the E or the L Flag . . . . . . . . . . . . . . .  <a href="#page-11">11</a>
     <a href="#section-4.2">4.2</a>.  Setting the Credit Flag . . . . . . . . . . . . . . . . .  <a href="#page-11">11</a>
   <a href="#section-5">5</a>.  Loss of ConEx Information . . . . . . . . . . . . . . . . . .  <a href="#page-14">14</a>
   <a href="#section-6">6</a>.  Timeliness of the ConEx Signals . . . . . . . . . . . . . . .  <a href="#page-14">14</a>
   <a href="#section-7">7</a>.  Open Areas for Experimentation  . . . . . . . . . . . . . . .  <a href="#page-15">15</a>
   <a href="#section-8">8</a>.  Security Considerations . . . . . . . . . . . . . . . . . . .  <a href="#page-17">17</a>
   <a href="#section-9">9</a>.  References  . . . . . . . . . . . . . . . . . . . . . . . . .  <a href="#page-18">18</a>
     <a href="#section-9.1">9.1</a>.  Normative References  . . . . . . . . . . . . . . . . . .  <a href="#page-18">18</a>
     <a href="#section-9.2">9.2</a>.  Informative References  . . . . . . . . . . . . . . . . .  <a href="#page-19">19</a>
   Acknowledgements  . . . . . . . . . . . . . . . . . . . . . . . .  <a href="#page-20">20</a>
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  <a href="#page-20">20</a>













</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-3" ></span>


<span class="h2"><a class="selflink" id="section-1" href="#section-1">1</a>.  Introduction</span>

   Congestion Exposure (ConEx) is a mechanism by which senders inform
   the network about expected congestion based on congestion feedback
   from previous packets in the same flow.  ConEx concepts and use cases
   are further explained in [<a href="./rfc6789" title="&quot;Congestion Exposure (ConEx) Concepts and Use Cases&quot;">RFC6789</a>].  The abstract ConEx mechanism is
   explained in [<a href="./rfc7713" title="&quot;Congestion Exposure (ConEx) Concepts, Abstract Mechanism, and Requirements&quot;">RFC7713</a>].  This document describes the necessary
   modifications to use ConEx with the Transmission Control Protocol
   (TCP).

   The markings for ConEx signaling are defined in the ConEx Destination
   Option (CDO) for IPv6 [<a href="./rfc7837" title="&quot;IPv6 Destination Option for Congestion Exposure (ConEx)&quot;">RFC7837</a>].  Specifically, the use of four flags
   is defined: X (ConEx-capable), L (loss experienced), E (ECN
   experienced), and C (credit).

   ConEx signaling is based on the use of either loss or Explicit
   Congestion Notification (ECN) marks [<a href="./rfc3168" title="&quot;The Addition of Explicit Congestion Notification (ECN) to IP&quot;">RFC3168</a>] as congestion
   indication.  The sender collects this congestion information based on
   existing TCP feedback mechanisms from the receiver to the sender.  No
   changes are needed at the receiver side to implement ConEx signaling.
   Therefore, no additional negotiation is needed to implement and use
   ConEx at the sender side.  This document specifies the sender's
   actions that are needed to provide meaningful ConEx information to
   the network.

   <a href="#section-2">Section 2</a> provides an overview of the modifications needed for TCP
   senders to implement ConEx.  First, congestion information has to be
   extracted from TCP's loss or ECN feedback as described in <a href="#section-3">Section 3</a>.
   <a href="#section-4">Section 4</a> details how to set the CDO marking based on this congestion
   information.  <a href="#section-5">Section 5</a> discusses the loss of packets carrying ConEx
   information.  <a href="#section-6">Section 6</a> discusses the timeliness of the ConEx
   feedback signal, given that congestion is a temporary state.

   This document describes congestion accounting for TCP with and
   without the Selective Acknowledgement (SACK) extension [<a href="./rfc2018" title="&quot;TCP Selective Acknowledgment Options&quot;">RFC2018</a>] (in
   <a href="#section-3.1">Section 3.1</a>).  However, ConEx benefits from the more accurate
   information that SACK provides about the number of bytes dropped in
   the network, and it is therefore preferable to use the SACK extension
   when using TCP with ConEx.  The detailed mechanism to set the L flag
   in response to the loss-based congestion feedback signal is given in
   <a href="#section-4.1">Section 4.1</a>.

   While loss has to be minimized, ECN can provide more fine-grained
   feedback information.  ConEx-based traffic measurement or management
   mechanisms could benefit from this.  Unfortunately, the current ECN
   feedback mechanism does not reflect multiple congestion markings if
   they occur within the same Round-Trip Time (RTT).  A more accurate




</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-4" ></span>


   feedback extension to ECN (AccECN) is proposed in a separate document
   [<a href="#ref-ACCURATE" title="&quot;More Accurate ECN Feedback in TCP&quot;">ACCURATE</a>], as this is also useful for other mechanisms.

   Congestion accounting for both classic ECN feedback and AccECN
   feedback is explained in detail in <a href="#section-3.2">Section 3.2</a>.  Setting the E flag
   in response to ECN-based congestion feedback is again detailed in
   <a href="#section-4.1">Section 4.1</a>.

<span class="h3"><a class="selflink" id="section-1.1" href="#section-1.1">1.1</a>.  Requirements Language</span>

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [<a href="./rfc2119" title="&quot;Key words for use in RFCs to Indicate Requirement Levels&quot;">RFC2119</a>].

<span class="h2"><a class="selflink" id="section-2" href="#section-2">2</a>.  Sender-Side Modifications</span>

   This section gives an overview of actions that need to be taken by a
   TCP sender modified to use ConEx signaling.

   In the TCP handshake, a ConEx sender MUST negotiate for SACK and ECN,
   preferably with AccECN feedback.  Therefore, a ConEx sender MUST also
   implement SACK and ECN.  Depending on the capability of the receiver,
   the following operation modes exist:

   o  SACK-accECN-ConEx (SACK and accurate ECN feedback)

   o  SACK-ECN-ConEx (SACK and classic instead of accurate ECN)

   o  accECN-ConEx (no SACK but accurate ECN feedback)

   o  ECN-ConEx (no SACK and no accurate ECN feedback, but classic ECN)

   o  SACK-ConEx (SACK but no ECN at all)

   o  Basic-ConEx (neither SACK nor ECN)
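
   As a non-normative illustration, the mapping from the negotiated
   receiver capabilities to the operation modes listed above can be
   sketched as follows.  The function name and the encoding of the ECN
   feedback type ("accurate", "classic", or None) are assumptions made
   for this sketch only; they are not defined by this document.

```python
from typing import Optional


def conex_mode(sack: bool, ecn: Optional[str]) -> str:
    """Return the operation mode name for the negotiated capabilities.

    `ecn` is "accurate", "classic", or None (hypothetical encoding).
    """
    if ecn not in ("accurate", "classic", None):
        raise ValueError("unknown ECN feedback type")
    parts = []
    if sack:
        parts.append("SACK")
    if ecn == "accurate":
        parts.append("accECN")
    elif ecn == "classic":
        parts.append("ECN")
    if not parts:
        # Neither SACK nor ECN was negotiated.
        return "Basic-ConEx"
    return "-".join(parts + ["ConEx"])
```

   For example, conex_mode(True, "classic") yields "SACK-ECN-ConEx".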

   A ConEx sender MUST expose all congestion information to the network
   according to the congestion information received by ECN or based on
   loss information provided by the TCP feedback loop.  A TCP sender
   SHOULD count congestion byte-wise (rather than packet-wise; see next
   paragraph).  After any congestion notification, a sender MUST mark
   subsequent packets with the appropriate ConEx flag in the IP header.
   Furthermore, a ConEx sender must send enough credit to cover all
   experienced congestion for the connection so far, as well as the risk
   of congestion for the current transmission (see <a href="#section-4.2">Section 4.2</a>).






</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-5" ></span>


   With SACK, the number of lost payload bytes is known, but not the
   number of packets carrying these bytes.  With classic ECN, only an
   indication is given that a marking occurred, but not the exact number
   of payload bytes nor packets.  As network congestion is usually byte-
   congestion [<a href="./rfc7141" title="&quot;Byte and Packet Congestion Notification&quot;">RFC7141</a>], the byte-size of a packet marked with a CDO
   flag is defined to represent that number of bytes of congestion
   signaling [<a href="./rfc7837" title="&quot;IPv6 Destination Option for Congestion Exposure (ConEx)&quot;">RFC7837</a>].  Therefore, the exact number of bytes should be
   taken into account, if available, to make the ConEx Signal as exact
   as possible.

   Detailed mechanisms for congestion counting in each operation mode
   are described in the next section.

<span class="h2"><a class="selflink" id="section-3" href="#section-3">3</a>.  Counting Congestion</span>

   A ConEx TCP sender maintains two counters: one that counts congestion
   based on the information retrieved by loss detection, and a second
   that accounts for ECN-based congestion feedback.  These counters hold
   the number of outstanding bytes that should be ConEx-Marked with,
   respectively, the L flag or the E flag in subsequent packets.

   The outstanding bytes for congestion indications based on loss are
   maintained in the Loss Exposure Gauge (LEG), as explained in
   <a href="#section-3.1">Section 3.1</a>.

   The outstanding bytes counted based on ECN feedback information are
   maintained in the Congestion Exposure Gauge (CEG), as explained in
   <a href="#section-3.2">Section 3.2</a>.

   When the sender sends a ConEx-capable packet with the E or L flag
   set, it reduces the respective counter by the byte-size of the
   packet.  This is explained for both counters in <a href="#section-4.1">Section 4.1</a>.
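
   The bookkeeping described above can be sketched as follows.  This is
   a minimal, non-normative illustration: the class and method names are
   hypothetical, and the decision of when to set the L or E flag (see
   Section 4.1) is left to the caller.

```python
from dataclasses import dataclass


@dataclass
class Gauges:
    """The two signed byte counters kept by a ConEx TCP sender."""

    leg: int = 0  # Loss Exposure Gauge: bytes still to be L-marked
    ceg: int = 0  # Congestion Exposure Gauge: bytes still to be E-marked

    def on_marked_packet_sent(self, size: int, l_flag: bool,
                              e_flag: bool) -> None:
        """Reduce the respective gauge by the byte-size of the packet."""
        if l_flag:
            self.leg -= size
        if e_flag:
            self.ceg -= size
```

   Both counters are plain signed integers here, so sending more marked
   bytes than were outstanding simply drives the gauge below zero.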

   Note that all bytes of an IP packet must be counted in the LEG or CEG
   to capture the right number of bytes that should be marked.
   Therefore, the sender SHOULD take the payload and headers into
   account, up to and including the IP header.  However, in TCP the
   information regarding how large the headers of a lost or marked
   packet were is usually not available, as only payload data will be
   acknowledged.

   If equal-sized packets, or at least equally distributed packet sizes,
   can be assumed, the sender MAY only add and subtract TCP payload
   bytes.  In this case, there should be about the same number of ConEx-
   Marked packets as the original packets that were causing the
   congestion.  Thus, both contain about the same number of header bytes
   so they will cancel out.  This case is assumed for simplicity in the
   following sections.



</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-6" ></span>


   Otherwise, if a sender sends different sized packets (with unequally
   distributed packet sizes), the sender needs to memorize or estimate
   the number of lost or ECN-marked packets.  If the sender has
   sufficient memory available, the most accurate way to reconstruct the
   number of lost or marked packets is to remember the sequence number
   of all sent but not acknowledged packets.  In this case, a sender is
   able to reconstruct the number of packets, and thus the header bytes
   that were sent during the last RTT.  Otherwise (e.g., if not enough
   memory is available), the sender would need to estimate the packet
   size.  The average packet size can be estimated if the distribution
   pattern of packet sizes in the last RTT is known; alternatively, the
   minimum packet size seen in the last RTT can be used as the most
   conservative estimate.

   If the number of newly sent-out packets with the ConEx L or E flag
   set is smaller (or larger) than this estimated number of lost/ECN-
   marked packets, the additional header bytes should be added to (or
   can be subtracted from) the respective gauge.
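
   One possible way to compute this header-byte correction is sketched
   below.  The sketch is non-normative and rests on assumptions that
   this document does not make: every packet is taken to carry the same
   number of header bytes (header_size), and the number of lost or
   marked packets is estimated from the minimum payload size seen in
   the last RTT.  All names are hypothetical.

```python
import math


def header_byte_correction(congested_payload_bytes: int,
                           min_payload_size: int,
                           marked_packets_sent: int,
                           header_size: int) -> int:
    """Return header bytes to add to a gauge (negative: may subtract)."""
    if 0 >= min_payload_size:
        raise ValueError("min_payload_size must be positive")
    # Conservative estimate: the smallest packet size gives the largest
    # number of packets for a given number of payload bytes.
    estimated_packets = math.ceil(congested_payload_bytes / min_payload_size)
    return (estimated_packets - marked_packets_sent) * header_size
```

   A positive result means that fewer ConEx-Marked packets were sent
   than were estimated to be lost or marked, so header bytes remain to
   be added to the respective gauge.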

<span class="h3"><a class="selflink" id="section-3.1" href="#section-3.1">3.1</a>.  Loss Detection</span>

   This section applies whether or not SACK support is available.  The
   following subsection (<a href="#section-3.1.1">Section 3.1.1</a>) handles the case when SACK is
   not available.

   A TCP sender detects losses and subsequently retransmits the lost
   data.  Therefore, the ConEx sender can simply set the ConEx L flag on
   all retransmissions in order to cover at least the number of bytes
   lost.  If this approach is taken, no LEG is needed.

   However, any retransmission may be spurious.  In this case, more
   bytes have been marked than necessary.  To compensate for this
   effect, a ConEx sender can maintain a local signed counter (the LEG)
   that indicates the number of outstanding bytes to be sent with the
   ConEx L flag and that can also become negative.

   Using the LEG, when a TCP sender decides that a data segment needs to
   be retransmitted, it will increase the LEG by the size of the TCP
   payload bytes in the retransmission (assuming equal sized segments
   such that the retransmitted packet will have the same number of
   header bytes as the original ones):

   For each retransmission:

   LEG += payload

   <a href="#section-4">Section 4</a> describes how the LEG is reduced when the ConEx L
   marking is set.



</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-7" ></span>


   Further, to accommodate spurious retransmissions, a ConEx sender
   SHOULD make use of heuristics to detect such spurious retransmissions
   (e.g., F-RTO [<a href="./rfc5682" title="&quot;Forward RTO-Recovery (F-RTO): An Algorithm for Detecting Spurious Retransmission Timeouts with TCP&quot;">RFC5682</a>], DSACK [<a href="./rfc3708" title="&quot;Using TCP Duplicate Selective Acknowledgement (DSACKs) and Stream Control Transmission Protocol (SCTP) Duplicate Transmission Sequence Numbers (TSNs) to Detect Spurious Retransmissions&quot;">RFC3708</a>], and Eifel [<a href="./rfc3522" title="&quot;The Eifel Detection Algorithm for TCP&quot;">RFC3522</a>],
   [<a href="./rfc4015" title="&quot;The Eifel Response Algorithm for TCP&quot;">RFC4015</a>]), if already available in a given implementation.  If no
   mechanism for detecting spurious retransmissions is available, the
   ConEx sender MAY choose to implement one of the mechanisms stated
   above.  However, given the inaccuracy that ConEx may have anyway and
   the timeliness of ConEx information, a ConEx sender MAY also choose
   not to compensate for spurious retransmissions.  In this case, if
   spurious retransmissions occur, the ConEx sender has simply sent too
   many ConEx Signals, which, e.g., would unnecessarily decrease the
   congestion allowance in a ConEx policer.

   If a heuristic method is used to detect spurious retransmission and
   has determined that a certain number of packets were retransmitted
   erroneously, the ConEx sender subtracts the payload size of these TCP
   packets from LEG.

   If a spurious retransmission is detected:

   LEG -= payload

   Note that LEG can become negative if too many L markings have already
   been sent.  This case is further discussed in <a href="#section-6">Section 6</a>.
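
   The two LEG updates in this section can be sketched as follows.  The
   class and method names are hypothetical, and the detection of
   spurious retransmissions (e.g., by one of the heuristics cited above)
   is assumed to happen elsewhere in the TCP stack.

```python
class LossExposureGauge:
    """Signed counter of outstanding bytes to be sent with the L flag."""

    def __init__(self) -> None:
        self.leg = 0

    def on_retransmission(self, payload: int) -> None:
        # For each retransmission: LEG += payload
        self.leg += payload

    def on_spurious_retransmission(self, payload: int) -> None:
        # If a spurious retransmission is detected: LEG -= payload.
        # The result may be negative if L markings were already sent.
        self.leg -= payload
```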

<span class="h4"><a class="selflink" id="section-3.1.1" href="#section-3.1.1">3.1.1</a>.  Without SACK Support</span>

   If multiple losses occur within one RTT and SACK is not used, it may
   take several RTTs until all lost data is retransmitted.  With the
   scheme described above, the ConEx information will be delayed
   considerably, but timeliness is important for ConEx.  For ConEx, it
   is important to know how much data was lost; it is not important to
   know what data is lost.  During the first RTT after the initial loss
   detection, the amount of received data, and thus also the amount of
   lost data, can be estimated based on the number of received ACKs.

   Therefore, a ConEx sender can use the following algorithm to
   estimate the number of lost bytes with an additional delay of one
   RTT using an additional Loss Estimation Counter (LEC):

      flight_bytes:      current flight size in bytes
      retransmit_bytes:  payload size of the retransmission









</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-8" ></span>


      At the first retransmission in a congestion event, LEC is set:

         LEC = flight_bytes - 3*SMSS

         (At this point in the transmission, in the worst case, all
         packets in flight minus the three that triggered the dupACKs
         could have been lost.)

      Then, during the first RTT of the congestion event:

         For each retransmission:
            LEG += retransmit_bytes
            LEC -= retransmit_bytes

         For each ACK:
            LEC -= SMSS

      After one RTT:

         LEG += LEC

         (The LEC now estimates the number of outstanding bytes
         that should be ConEx L-marked.)

      After the first RTT, for each following retransmission:

         if (LEC &gt; 0): LEC -= retransmit_bytes
         else if (LEC==0): LEG += retransmit_bytes

         if (LEC &lt; 0): LEG += -LEC

         (The LEG is not increased for those bytes that were
         already counted.)
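
   The algorithm above can be sketched in runnable form as follows.
   This is a non-normative illustration with hypothetical names.  It
   makes three assumptions that the pseudocode leaves open: the first
   retransmission of a congestion event is itself counted in the
   "for each retransmission" step, the caller signals the end of the
   first RTT, and LEC is reset to zero once a negative LEC has been
   credited to the LEG.

```python
class LossEstimator:
    """LEG and LEC bookkeeping for a sender without SACK support."""

    def __init__(self, smss: int) -> None:
        self.smss = smss
        self.leg = 0            # Loss Exposure Gauge (signed)
        self.lec = 0            # Loss Estimation Counter (signed)
        self.first_rtt = False  # True during the first RTT of an event

    def start_congestion_event(self, flight_bytes: int) -> None:
        # Worst case: everything in flight except the three segments
        # that triggered the duplicate ACKs was lost.
        self.lec = flight_bytes - 3 * self.smss
        self.first_rtt = True

    def on_retransmission(self, retransmit_bytes: int) -> None:
        if self.first_rtt:
            self.leg += retransmit_bytes
            self.lec -= retransmit_bytes
            return
        if self.lec > 0:
            self.lec -= retransmit_bytes
        elif self.lec == 0:
            self.leg += retransmit_bytes
        if 0 > self.lec:
            self.leg += -self.lec
            self.lec = 0  # assumption: not stated in the pseudocode

    def on_ack(self) -> None:
        if self.first_rtt:
            self.lec -= self.smss

    def end_first_rtt(self) -> None:
        self.leg += self.lec
        self.first_rtt = False
```

   Without the reset of LEC, a negative value would be added to the LEG
   again on every later retransmission; the reset is one way to avoid
   this and is a choice of this sketch.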

<span class="h3"><a class="selflink" id="section-3.2" href="#section-3.2">3.2</a>.  Explicit Congestion Notification (ECN)</span>

   ECN [<a href="./rfc3168" title="&quot;The Addition of Explicit Congestion Notification (ECN) to IP&quot;">RFC3168</a>] is an IP/TCP mechanism that allows network nodes to
   mark packets with the Congestion Experienced (CE) mark instead of
   dropping them when congestion occurs.

   A receiver might support classic ECN, the more accurate ECN feedback
   scheme (AccECN), or neither.  In the case that ECN is not supported
   for a connection, of course no ECN marks will occur; thus, the sender
   will never set the E flag.  Otherwise, a ConEx sender needs to
   maintain a signed counter, the Congestion Exposure Gauge (CEG), for
   the number of outstanding bytes that have to be ConEx-Marked with the
   E flag.




</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-9" ></span>


   The CEG is increased when ECN information is received from an ECN-
   capable receiver supporting the classic ECN scheme or the accurate
   ECN feedback scheme.  When the ConEx sender receives an ACK
   indicating one or more segments were received with a CE mark, CEG is
   increased by the appropriate number of bytes as described further
   below.

   Unfortunately, in case of duplicate acknowledgements, the number of
   newly acknowledged bytes will be zero even though (CE-marked) data
   has been received.  Therefore, we increase the CEG by DeliveredData,
   as defined below:

   DeliveredData = acked_bytes + SACK_diff + (is_dup)*1SMSS -
   (is_after_dup)*num_dup*1SMSS

   DeliveredData covers the number of bytes that has been newly
   delivered to the receiver.  Therefore, on each arrival of an ACK,
   DeliveredData will be increased by the newly acknowledged bytes
   (acked_bytes) as indicated by the current ACK, relative to all past
   ACKs.  The formula depends on whether SACK is available: if SACK is
   not available, SACK_diff is always zero, whereas if SACK information
   is available, is_dup and is_after_dup are always zero.

   With SACK, DeliveredData is increased by the number of bytes provided
   by (new) SACK information (SACK_diff).  Note that if fewer
   unacknowledged bytes are announced in the new SACK information than
   in the previous ACK, SACK_diff can be negative.  In this case, data
   is newly acknowledged (in acked_bytes) that was previously
   accumulated into DeliveredData, based on SACK information.

   Otherwise, without SACK, DeliveredData is increased by 1 Sender
   Maximum Segment Size (SMSS) on duplicate acknowledgements because
   duplicate acknowledgements do not acknowledge any new data (and
   acked_bytes will be zero).  For the subsequent partial or full ACK,
   acked_bytes cover all newly acknowledged bytes including those
   already accounted for with the receipt of any duplicate
   acknowledgement.  Therefore, DeliveredData is reduced by one SMSS for
   each preceding duplicate ACK.  Consequently, is_dup is one if the
   current ACK is a duplicated ACK without SACK, and zero otherwise.
   is_after_dup is one only for the next full or partial ACK after a
   number of duplicated ACKs without SACK, and num_dup counts the number
   of duplicated ACKs in a row (which usually is 3 or more).
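
   The DeliveredData formula can be written as a small function, shown
   here as a non-normative sketch.  The function and parameter names
   follow the terms of the formula; the explicit smss parameter is an
   addition made for this sketch.

```python
def delivered_data(acked_bytes: int, sack_diff: int, is_dup: bool,
                   is_after_dup: bool, num_dup: int, smss: int) -> int:
    """Bytes newly delivered to the receiver, as seen on one ACK."""
    return (acked_bytes
            + sack_diff
            + int(is_dup) * smss                    # dupACK without SACK
            - int(is_after_dup) * num_dup * smss)   # undo earlier dupACKs
```

   As a worked example without SACK: three duplicate ACKs each
   contribute one SMSS.  If the following ACK newly acknowledges four
   full-sized segments, it contributes 4*SMSS - 3*SMSS = 1 SMSS, so the
   four ACKs together account for 4 SMSS of delivered data.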

   With classic ECN, one congestion-marked packet causes continuous
   congestion feedback for a whole round trip, thus hiding the arrival
   of any further congestion-marked packets during that round trip.  A
   more accurate ECN feedback scheme (AccECN) is needed to ensure that
   feedback properly reflects the extent of congestion marking.  The two



</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-10" ></span>


   cases, with and without a receiver capable of AccECN, are discussed
   in the following sections.

<span class="h4"><a class="selflink" id="section-3.2.1" href="#section-3.2.1">3.2.1</a>.  Accurate ECN Feedback</span>

   With a more accurate ECN feedback scheme (AccECN) that is supported
   by the receiver, either the number of marked packets or the number of
   marked bytes will be fed back from the receiver to the sender and is
   therefore known at the sender side.  In the latter case, the CEG can
   be increased directly by the number of marked bytes.  Otherwise, with
   D denoting the number of marks, the gauge (CEG) will be
   conservatively increased by one SMSS for each marking or, at the
   maximum, by the number of newly delivered bytes (DeliveredData):

   CEG += min(SMSS*D, DeliveredData)
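
   Both AccECN cases can be sketched as follows.  This is a
   non-normative illustration; the function name and the use of
   optional parameters to distinguish byte-based from packet-based
   feedback are assumptions of this sketch.

```python
from typing import Optional


def ceg_increase_accecn(delivered: int, smss: int,
                        marked_bytes: Optional[int] = None,
                        marked_packets: Optional[int] = None) -> int:
    """Return the number of bytes to add to the CEG for one ACK."""
    if marked_bytes is not None:
        # The feedback reports marked bytes: use them directly.
        return marked_bytes
    if marked_packets is None:
        raise ValueError("need marked_bytes or marked_packets")
    # The feedback reports D marks: CEG += min(SMSS*D, DeliveredData)
    return min(smss * marked_packets, delivered)
```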

<span class="h4"><a class="selflink" id="section-3.2.2" href="#section-3.2.2">3.2.2</a>.  Classic ECN Support</span>

   With classic ECN, as soon as a CE mark is seen at the receiver side,
   it will feed this information back to the sender by setting the Echo
   Congestion Experienced (ECE) flag in the TCP header of subsequent
   ACKs.  Once the sender receives the first ECE of a congestion
   notification, it sets the Congestion Window Reduced (CWR) flag in the
   TCP header once.  When this packet with the CWR flag in the TCP
   header arrives at the receiver side acknowledging its first ECE
   feedback, the receiver stops setting the ECE flag.

   If the ConEx sender fully conforms to the semantics of ECN signaling
   as defined by [<a href="./rfc3168" title="&quot;The Addition of Explicit Congestion Notification (ECN) to IP&quot;">RFC3168</a>], it will receive one full RTT of ACKs with
   the ECE flag set whenever at least one CE mark was received by the
   receiver.  As the sender cannot estimate how many packets have
   actually been CE-marked during this RTT, the most conservative
   assumption MAY be taken, namely assuming that all packets were
   marked.  This can be achieved by increasing the CEG by DeliveredData
   for each ACK with the ECE flag:

   CEG += DeliveredData
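
   A minimal Python sketch of this conservative rule follows.  It is
   illustrative only; the function name and the example ACK sequence are
   assumptions made for this sketch.

```python
def classic_ecn_ceg_increase(ece: bool, delivered_data: int) -> int:
    """Bytes to add to the CEG for one ACK under the conservative rule."""
    # Every ACK carrying ECE is treated as if all newly acknowledged
    # bytes had been CE-marked.
    return delivered_data if ece else 0


# Hypothetical example: three ACKs, the first two carrying ECE.
ceg = 0
for ece, delivered in [(True, 1460), (True, 2920), (False, 1460)]:
    ceg += classic_ecn_ceg_increase(ece, delivered)
```

   In this example, the CEG grows by 4380 bytes even if only a single
   packet was actually CE-marked, which shows how conservative the
   assumption is.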

   Optionally, a ConEx sender could implement the following technique
   (that does not conform to [<a href="./rfc3168" title="&quot;The Addition of Explicit Congestion Notification (ECN) to IP&quot;">RFC3168</a>]), called "advanced compatibility
   mode", to considerably improve its estimate of the number of ECN-
   marked packets:

   To extract more than one ECE indication per RTT, a ConEx sender could
   set the CWR flag continuously to force the receiver to signal only
   one ECE per CE mark.  Unfortunately, the use of delayed ACKs
   [<a href="./rfc5681" title="&quot;TCP Congestion Control&quot;">RFC5681</a>] (which is common) will prevent feedback of every CE mark;
   if a CWR confirmation is received before the ECE can be sent out on



<span class="grey">Kuehlewind &amp; Scheffenegger    Experimental                     [Page 10]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-11" ></span>
<span class="grey"><a href="./rfc7786">RFC 7786</a>               TCP Modifications for ConEx              May 2016</span>


   the next ACK, ECN feedback information could get lost (depending on
   the actual receiver implementation).  Thus, a sender SHOULD set CWR
   only on those data segments that will presumably trigger a (delayed)
   ACK.  The sender would need an additional control loop to estimate
   which data segments will trigger an ACK in order to extract more
   timely congestion notifications.  Still, the CEG SHOULD be increased
   by DeliveredData, as one or more CE-marked packets could be
   acknowledged by one delayed ACK.

<span class="h2"><a class="selflink" id="section-4" href="#section-4">4</a>.  Setting the ConEx Flags</span>

   By setting the X flag, a packet is marked as ConEx-capable.  All
   packets carrying payload MUST be marked with the X flag set,
   including retransmissions.  The X flag SHOULD be zero only if no
   congestion feedback information is (currently) available (e.g., for
   control packets on a connection that has not sent any user data for
   some time and, therefore, is sending only pure ACKs that do not
   carry any payload).

<span class="h3"><a class="selflink" id="section-4.1" href="#section-4.1">4.1</a>.  Setting the E or the L Flag</span>

   As described in <a href="#section-3.1">Section 3.1</a>, the sender needs to maintain a CEG
   counter and might also maintain a LEG counter.  If no LEG is used,
   all retransmissions will be marked with the L flag.

   Further, as long as the LEG or CEG counter is positive, the sender
   marks each ConEx-capable packet with L or E respectively, and
   decreases the LEG or CEG counter by the TCP payload bytes carried in
   the marked packet (assuming headers are not being counted because
   packet sizes are regular).  No matter how small the value of LEG or
   CEG, if the value is positive the sender MUST NOT defer packet
   marking; this ensures that ConEx Signals are timely.  Therefore, the
   value of LEG and CEG will commonly be negative.

   If both the LEG and CEG are positive, the sender MUST mark each
   ConEx-capable packet with both L and E.  If a credit signal is also
   pending (see the next section), the C flag can be set as well.
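
   The marking rule of this section can be sketched as follows.  The
   sketch is non-normative and assumes that a LEG is maintained (the
   case without a LEG, where all retransmissions are L-marked, is not
   modeled).  The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Gauges:
    leg: int = 0  # Loss Exposure Gauge, in bytes
    ceg: int = 0  # Congestion Exposure Gauge, in bytes


def mark_packet(g: Gauges, payload_bytes: int) -> set:
    """Return the flags ('L', 'E') to set on the next ConEx-capable packet.

    A gauge that is positive, however small, triggers marking right
    away; the gauge is then reduced by the payload bytes of the marked
    packet and may become negative as a result.
    """
    flags = set()
    if g.leg > 0:
        flags.add("L")
        g.leg -= payload_bytes
    if g.ceg > 0:
        flags.add("E")
        g.ceg -= payload_bytes
    return flags
```

   With a LEG of 100 bytes and a 1460-byte payload, the packet is
   L-marked immediately and the LEG drops to -1360, which illustrates
   why the gauges are commonly negative.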

<span class="h3"><a class="selflink" id="section-4.2" href="#section-4.2">4.2</a>.  Setting the Credit Flag</span>

   The ConEx abstract mechanism [<a href="./rfc7713" title="&quot;Congestion Exposure (ConEx) Concepts, Abstract Mechanism, and Requirements&quot;">RFC7713</a>] requires that sufficient
   credit MUST be signaled in advance to cover the expected congestion
   during the feedback delay of one RTT.

   To monitor the credit state at the audit, a ConEx sender needs to
   maintain a Credit State Counter (CSC) in bytes.  If congestion
   occurs, credits will be consumed and the CSC is reduced by the number
   of bytes that were lost or estimated to be ECN-marked.  If the risk



<span class="grey">Kuehlewind &amp; Scheffenegger    Experimental                     [Page 11]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-12" ></span>
<span class="grey"><a href="./rfc7786">RFC 7786</a>               TCP Modifications for ConEx              May 2016</span>


   of congestion was estimated wrongly, and thus too few credits were
   sent, the CSC becomes zero but cannot go negative.

   To be sure that the credit state in the audit never reaches zero, the
   number of credits should always equal the number of bytes in flight
   as all packets could potentially get lost or congestion-marked.  In
   this case, a ConEx sender also monitors the number of bytes in flight
   F.  If F ever becomes larger than the CSC, the ConEx sender sets the
   C flag on each ConEx-capable packet and increases the CSC by the
   payload size of each marked packet until the CSC is no less than F
   again.  However, a ConEx sender might also be less conservative and
   send fewer credits if it, e.g., assumes that the congestion will be
   low on a certain path based on previous experience.

   Recall that the CSC will be decreased whenever congestion occurs;
   therefore the CSC will need to be replenished as soon as the CSC
   drops below F.  Also recall that the sender can set the C flag on a
   ConEx-capable packet whether or not the E or L flags are also set.
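
   The conservative credit scheme described above can be sketched as
   follows.  This is an illustration only: the class and method names
   are hypothetical, and the sketch assumes that the bytes in flight (F)
   passed to on_send() already include the packet being sent.

```python
class CreditState:
    """Sender-side view of the credit state at the audit (a sketch)."""

    def __init__(self) -> None:
        self.csc = 0  # Credit State Counter, in bytes

    def on_congestion(self, congested_bytes: int) -> None:
        # Credits are consumed by lost or (estimated) ECN-marked bytes;
        # the CSC can reach zero but cannot go negative.
        self.csc = max(self.csc - congested_bytes, 0)

    def on_send(self, payload_bytes: int, bytes_in_flight: int) -> bool:
        """Return True if the C flag should be set on this packet."""
        if bytes_in_flight > self.csc:
            # F exceeds the CSC: signal credit and account for it.
            self.csc += payload_bytes
            return True
        return False
```

   A less conservative sender could replace the comparison in on_send()
   with a smaller target than F, at the risk of being penalized by an
   audit function.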

   In TCP Slow Start, the congestion window might grow much larger than
   during the rest of the transmission.  A sender could therefore
   consider sending fewer than F credits, at the risk of being penalized
   by an audit function.  However, the credits should at least cover the
   increase in sending rate.  Given the exponential increase as
   implemented in the TCP Slow Start algorithm, which means that the
   sending rate doubles every RTT, a ConEx sender should cover at least
   half the number of packets in flight with credits.

   Note that the number of losses or markings within one RTT does not
   depend solely on the sender's actions.  In general, the behavior of
   the cross traffic, whether Active Queue Management (AQM) is used and
   how it is parameterized influence how many packets might be dropped
   or marked.  As long as any AQM encountered is not overly aggressive
   with ECN marking, sending half the flight size as credits should be
   sufficient whether congestion is signaled by loss or ECN.

   To maintain half of the packets in flight as credits, half of the
   packets of the initial window must also be C-marked.  In Slow Start,
   marking every fourth packet introduces the correct amount of credit,
   as can be seen in Figure 1.











<span class="grey">Kuehlewind &amp; Scheffenegger    Experimental                     [Page 12]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-13" ></span>
<span class="grey"><a href="./rfc7786">RFC 7786</a>               TCP Modifications for ConEx              May 2016</span>


                                        in_flight  credits
                RTT1  |------XC------&gt;|     1         1
                      |------X-------&gt;|     2         1
                      |------XC------&gt;|     3         2
                      |               |
                RTT2  |------X-------&gt;|     3         2
                      |------X-------&gt;|     4         2
                      |------X-------&gt;|     4         2
                      |------XC------&gt;|     5         3
                      |------X-------&gt;|     5         3
                      |------X-------&gt;|     6         3
                      |               |
                RTT3  |------X-------&gt;|     6         3
                      |------XC------&gt;|     7         4
                      |------X-------&gt;|     7         4
                      |------X-------&gt;|     8         4
                      |------X-------&gt;|     8         4
                      |------XC------&gt;|     9         5
                      |------X-------&gt;|     9         5
                      |------X-------&gt;|    10         5
                      |------X-------&gt;|    10         5
                      |------XC------&gt;|    11         6
                      |------X-------&gt;|    11         6
                      |------X-------&gt;|    12         6
                      |      .        |
                      |      :        |

       Figure 1: Credits in Slow Start (with an initial window of 3)
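
   The pattern in Figure 1 can be reproduced with a small Python sketch.
   It is only an illustration of the figure, counted in packets, and
   makes several assumptions: an initial window of 3, no losses or
   marks, each ACK releasing two packets of which only the second grows
   the in_flight value, and the first, third, and then every fourth
   packet being C-marked.  The function name is hypothetical.

```python
INITIAL_WINDOW = 3  # packets, as assumed in Figure 1


def slow_start_credits(num_packets: int) -> list:
    """Return (packet_no, c_marked, in_flight, credits) rows as in Figure 1."""
    rows = []
    credits = 0
    for n in range(1, num_packets + 1):
        # Packets 1 and 3 of the initial window, then every fourth packet.
        c_marked = n == 1 or n % 4 == 3
        if c_marked:
            credits += 1
        if INITIAL_WINDOW >= n:
            in_flight = n
        else:
            # Two packets per ACK: the first replaces the acknowledged
            # packet, the second increases the value by one.
            in_flight = INITIAL_WINDOW + (n - INITIAL_WINDOW) // 2
        rows.append((n, c_marked, in_flight, credits))
    return rows
```

   For the 21 packets shown in Figure 1, the credits in every row are at
   least half the in_flight value, ending at 12 in flight with 6
   credits.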

   It is possible that a TCP flow will encounter an audit function
   without relevant flow state due to, e.g., rerouting or memory
   limitations.  Therefore, the sender needs to detect this case and
   resend credits.  A ConEx sender might reset the credit counter CSC to
   zero if losses occur in subsequent RTTs (assuming that the sending
   rate was correctly reduced based on the received congestion signal
   and using a conservatively large RTT estimation).

   This section proposes a concrete algorithm for determining how much
   credit to signal (with a separate approach used for Slow Start).
   However, experimentation in credit setting algorithms is expected and
   encouraged.  The wider goal of ConEx is to reflect the "cost" of the
   risk of causing congestion on those that contribute most to it.
   Thus, experimentation is encouraged to improve or maintain
   performance while reducing the risk of causing congestion and,
   therefore potentially reducing the need to signal so much credit.






<span class="grey">Kuehlewind &amp; Scheffenegger    Experimental                     [Page 13]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-14" ></span>
<span class="grey"><a href="./rfc7786">RFC 7786</a>               TCP Modifications for ConEx              May 2016</span>


<span class="h2"><a class="selflink" id="section-5" href="#section-5">5</a>.  Loss of ConEx Information</span>

   Packets carrying ConEx Signals could be discarded themselves.  This
   will be a second-order problem (e.g., if the loss probability is
   0.1%, the probability of losing a ConEx L signal will be 0.1% of 0.1%
   = 0.0001%).  Further, the penalty an audit induces should be
   proportional to the mismatch of expected ConEx marks and observed
   congestion; therefore, the audit might only slightly increase the
   loss level of this flow.  Hence, an implementer MAY choose to ignore
   this problem, accepting instead the risk that an audit function might
   wrongly penalize a flow.

   Nonetheless, a ConEx sender is responsible for always signaling
   sufficient congestion feedback, and therefore SHOULD remember which
   packets were marked with the L, the E, or the C flag.  If one of
   these packets is detected as lost, the sender SHOULD increase the
   respective gauge(s), LEG or CEG, by the number of lost payload bytes
   in addition to increasing LEG for the loss.
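
   One possible reading of this rule is sketched below.  The sketch is
   non-normative; the function name is hypothetical, and the handling of
   a lost C-marked packet is deliberately left out because the text
   above names only the LEG and the CEG.

```python
def on_marked_packet_lost(leg: int, ceg: int, flags: set,
                          lost_bytes: int) -> tuple:
    """Return the updated (leg, ceg) after a sent packet is detected lost."""
    # The loss itself is exposed through the LEG, as for any lost packet.
    leg += lost_bytes
    if "L" in flags:
        # The lost packet carried an L signal that must be sent again.
        leg += lost_bytes
    if "E" in flags:
        # The lost packet carried an E signal that must be sent again.
        ceg += lost_bytes
    return leg, ceg
```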

<span class="h2"><a class="selflink" id="section-6" href="#section-6">6</a>.  Timeliness of the ConEx Signals</span>

   ConEx Signals will only be useful to a network node within a time
   delay of about one RTT after the congestion occurred.  To avoid
   further delays, a ConEx sender SHOULD send the ConEx signaling on the
   next available packet.

   Any or all of the ConEx flags can be used in the same packet, which
   allows delays to be minimized when multiple signals are pending.  The
   need to set multiple ConEx flags at the same time can occur if, e.g.,
   an ACK is received by the sender that simultaneously indicates that
   at least one ECN mark was received, and that one or more segments
   were lost.  This may happen during excessive congestion, if the
   queues overflow even though ECN was used and currently all forwarded
   packets are marked, while others have to be dropped.  Another case
   when this might happen is when ACKs are lost, so that a subsequent
   ACK carries summary information not previously available to the
   sender.

   If a flow becomes application-limited, there could be insufficient
   bytes to send to reduce the gauges to zero or below.  In such cases,
   the sender cannot help but delay ConEx Signals.  Nonetheless, as long
   as the sender is marking all outgoing packets, an audit function is
   unlikely to penalize ConEx-Marked packets.  Therefore, no matter how
   long a gauge has been positive, a sender MUST NOT reduce the gauge by
   more than the ConEx-Marked bytes it has sent.






<span class="grey">Kuehlewind &amp; Scheffenegger    Experimental                     [Page 14]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-15" ></span>
<span class="grey"><a href="./rfc7786">RFC 7786</a>               TCP Modifications for ConEx              May 2016</span>


   If the CEG or LEG counter is negative, the respective counter MAY be
   reset to zero within one RTT after it was decreased the last time, or
   one RTT after recovery if no further congestion occurred.

<span class="h2"><a class="selflink" id="section-7" href="#section-7">7</a>.  Open Areas for Experimentation</span>

   All proposed mechanisms in this document are experimental, and
   therefore further large-scale experimentation on the Internet is
   required to evaluate if the signaling provided by these mechanisms is
   accurate and timely enough to produce value for ConEx-based (traffic
   management or other) mechanisms.

   The current ConEx specifications assume that congestion is counted in
   the number of bytes (including the IP header that directly
   encapsulates the CDO and everything that the IP header encapsulates)
   [<a href="./rfc7837" title="&quot;IPv6 Destination Option for Congestion Exposure (ConEx)&quot;">RFC7837</a>].  This decision was taken because most network devices
   today experience byte-congestion where the memory is filled exactly
   with the number of bytes a packet carries [<a href="./rfc7141" title="&quot;Byte and Packet Congestion Notification&quot;">RFC7141</a>].  However, there
   are also devices that may allocate a certain amount of memory per
   packet, no matter how large a packet is.  These devices get congested
   based on the number of packets in their memory and therefore, in this
   case, congestion is determined by the number of packets that have
   been lost or marked.  Furthermore, a transport-layer endpoint, such
   as a TCP sender or receiver, might not know the exact number of bytes
   that a lower layer was carrying.  Therefore, a TCP endpoint may only
   be able to estimate the number of congested bytes (assuming that all
   lower-layer headers have the same length).  Whether this estimation
   is sufficient for ConEx signaling needs to be further evaluated in
   tests on the Internet together with different auditor
   implementations.

   Further, the proposed marking schemes in this document are designed
   under the assumption that all TCP packets of a ConEx-capable flow are
   of equal size or that flows have a constant mean packet size over a
   rather small time frame, like one RTT or less.  In most
   implementations, this assumption might be taken as well and is
   probably true for most of the traffic flows.  If this proposed scheme
   is used, it is necessary to evaluate how much accuracy degrades if
   this precondition is not met.  Evaluating with real traffic from
   different applications is especially important in making the decision
   regarding whether the proposed schemes are sufficient or whether a
   more complex scheme is needed.

   In this context, the proposed scheme to set credit markings in Slow
   Start runs the risk of providing an insufficient number of markings,
   which can cause an audit function to penalize this flow.  Both the
   proposed credit scheme for Slow Start as well as the scheme in
   Congestion Avoidance must be evaluated together with one or more



<span class="grey">Kuehlewind &amp; Scheffenegger    Experimental                     [Page 15]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-16" ></span>
<span class="grey"><a href="./rfc7786">RFC 7786</a>               TCP Modifications for ConEx              May 2016</span>


   specific implementations of a ConEx auditor to ensure that both
   algorithms, in the sender and in the auditor, work properly together
   with a low risk of false positives (which would lead to penalization
   of an honest sender).  However, if a sender is wrongly assumed to
   cheat, the penalization of the audit should be adequate and should
   allow an honest sender using a congestion control scheme that is
   commonly used today to recover quickly.

   Another open issue is the accuracy of the ECN feedback signal.  At
   the time of this document's publication, there is no AccECN mechanism
   specified yet and, further, AccECN will also take some time to be
   widely deployed.  This document proposes an advanced compatibility
   mode for classic ECN.  The proposed mechanism can provide more
   accurate feedback by utilizing the way classic ECN is specified but
   has a higher risk of losing information.  To figure out how high this
   risk is in a real deployment scenario, further experimental
   evaluation is needed.  The following argument is intended to show
   that suppressing repetitions of ECE is, nonetheless, still safe
   against possible congestion collapse due to lost congestion feedback;
   it should be further verified by experimentation:

   Repetition of ECE in classic ECN is intended to ensure reliable
   delivery of congestion feedback.  However, with advanced
   compatibility mode, it is possible to miss congestion notifications.
   This can happen in some implementations if delayed acknowledgements
   are used.  Further, an ACK containing ECE can simply get lost.  If
   only a few CE marks are received within one congestion event (e.g.,
   only one), the loss of one acknowledgement due to (heavy) congestion
   on the reverse path can prevent any congestion notification from
   being received by the sender.

   However, if loss of feedback exacerbates congestion on the forward
   path, more forward packets will be CE-marked, increasing the
   likelihood that feedback from at least one CE will get through per
   RTT.  As long as one ECE reaches the sender per RTT, the sender's
   congestion response will be the same as if CWR were not continuous.
   The only way that heavy congestion on the forward path could be
   completely hidden would be if all ACKs on the reverse path were lost.
   If total ACK loss persisted, the sender would time out and do a
   congestion response anyway.  Therefore, the problem seems confined to
   potential suppression of a congestion response during light
   congestion.

   Furthermore, even if loss of all ECN feedback leads to no congestion
   response, the worst that could happen would be loss instead of ECN-
   signaled congestion on the forward path.  Given that compatibility
   mode does not affect loss feedback, there would be no risk of
   congestion collapse.



<span class="grey">Kuehlewind &amp; Scheffenegger    Experimental                     [Page 16]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-17" ></span>
<span class="grey"><a href="./rfc7786">RFC 7786</a>               TCP Modifications for ConEx              May 2016</span>


<span class="h2"><a class="selflink" id="section-8" href="#section-8">8</a>.  Security Considerations</span>

   General ConEx security considerations are covered extensively in the
   ConEx abstract mechanism [<a href="./rfc7713" title="&quot;Congestion Exposure (ConEx) Concepts, Abstract Mechanism, and Requirements&quot;">RFC7713</a>].  This section covers TCP-specific
   concerns that may occur with the addition of ConEx to TCP (while not
   discussing generally well-known attacks against TCP).  It is assumed
   that any altering of ConEx information can be detected by protection
   mechanisms in the IP layer and is, therefore, not discussed here but
   in [<a href="./rfc7837" title="&quot;IPv6 Destination Option for Congestion Exposure (ConEx)&quot;">RFC7837</a>].  Further, [<a href="./rfc7837" title="&quot;IPv6 Destination Option for Congestion Exposure (ConEx)&quot;">RFC7837</a>] describes how to use ConEx to
   mitigate flooding attacks by using preferential drop where the use of
   ConEx can even increase security.

   The ConEx modifications to TCP provide no mechanism for a receiver to
   force a sender not to use ConEx.  A receiver can degrade the accuracy
   of ConEx by claiming that it does not support SACK, AccECN, or ECN,
   but the sender will never have to turn ConEx off.  Further, the
   receiver cannot force the sender to have to mark ConEx more
   conservatively, in order to cover the risk of any inaccuracy.
   Instead, it is always the sender's choice to either mark very
   conservatively, which ensures that the audit always sees enough
   markings to not penalize the flow, or estimate the needed number of
   markings more tightly.  This second case can lead to inaccurate
   marking and, therefore, increases the likelihood of loss at an audit
   function, which will only harm the receiver itself.

   Assuming the sender is limited in some way by a congestion allowance
   or quota, a receiver could spoof more loss or ECN congestion feedback
   than it actually experiences, in an attempt to make the sender draw
   down its allowance faster than necessary.  However, over-declaring
   congestion simply makes the sender slow down.  If the receiver is
   interested in the content, it will not want to harm its own
   performance.

   However, if the receiver is solely interested in making the sender
   draw down its allowance, the net effect will depend on the sender's
   congestion control algorithm, as continually adding more congestion
   feedback would cause the sender to reduce its sending rate further
   and further.  Therefore, a receiver can only maintain a certain
   congestion level that corresponds to a certain sending rate.
   With NewReno [<a href="./rfc6582" title="&quot;The NewReno Modification to TCP's Fast Recovery Algorithm&quot;">RFC6582</a>], doubling congestion feedback causes the
   sender to reduce its sending rate such that it would only consume
   sqrt(2) = 1.4 times more congestion allowance.  However, to improve
   scaling, congestion control algorithms are tending towards less
   responsive algorithms like Cubic or Compound TCP, and ultimately to
   linear algorithms like Data Center TCP (DCTCP) [<a href="#ref-DCTCP" title="&quot;Data Center TCP (DCTCP)&quot;">DCTCP</a>] that aim to
   maintain the same congestion level independent of the current sending
   rate and always reduce their sending window if the signaled congestion
   feedback is higher.  In each case, if the receiver doubles congestion



<span class="grey">Kuehlewind &amp; Scheffenegger    Experimental                     [Page 17]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-18" ></span>
<span class="grey"><a href="./rfc7786">RFC 7786</a>               TCP Modifications for ConEx              May 2016</span>


   feedback, it causes the sender to respectively consume more allowance
   by a factor of 1.2, 1.15, or 1, where 1 implies the attack has become
   completely ineffective as no further congestion allowance is consumed
   but the flow will decrease its sending rate to a minimum instead.
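
   The factors quoted above can be reproduced with a short calculation.
   If the steady-state sending rate is assumed to scale as p^(-b) with
   the congestion level p, the allowance consumed per unit of time
   scales as rate * p = p^(1-b).  The exponents used below are commonly
   cited approximations and are assumptions of this sketch, not values
   taken from this document; the function and dictionary names are
   hypothetical.

```python
# Assumed response-function exponents b (rate proportional to p**(-b)).
RESPONSE_EXPONENTS = {
    "NewReno": 0.5,
    "Cubic": 0.75,
    "Compound TCP": 0.8,
    "DCTCP": 1.0,
}


def allowance_factor(b: float, feedback_factor: float = 2.0) -> float:
    """Factor by which allowance consumption grows when the congestion
    feedback is multiplied by feedback_factor, assuming rate ~ p**(-b)."""
    return feedback_factor ** (1.0 - b)


factors = {name: allowance_factor(b)
           for name, b in RESPONSE_EXPONENTS.items()}
```

   Under these assumptions, doubling the feedback yields factors of
   roughly 1.4, 1.2, 1.15, and 1, in the order listed.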

<span class="h2"><a class="selflink" id="section-9" href="#section-9">9</a>.  References</span>

<span class="h3"><a class="selflink" id="section-9.1" href="#section-9.1">9.1</a>.  Normative References</span>

   [<a id="ref-RFC2018">RFC2018</a>]  Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
              Selective Acknowledgment Options", <a href="./rfc2018">RFC 2018</a>,
              DOI 10.17487/RFC2018, October 1996,
              &lt;<a href="http://www.rfc-editor.org/info/rfc2018">http://www.rfc-editor.org/info/rfc2018</a>&gt;.

   [<a id="ref-RFC2119">RFC2119</a>]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", <a href="https://www.rfc-editor.org/bcp/bcp14">BCP 14</a>, <a href="./rfc2119">RFC 2119</a>,
              DOI 10.17487/RFC2119, March 1997,
              &lt;<a href="http://www.rfc-editor.org/info/rfc2119">http://www.rfc-editor.org/info/rfc2119</a>&gt;.

   [<a id="ref-RFC3168">RFC3168</a>]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              <a href="./rfc3168">RFC 3168</a>, DOI 10.17487/RFC3168, September 2001,
              &lt;<a href="http://www.rfc-editor.org/info/rfc3168">http://www.rfc-editor.org/info/rfc3168</a>&gt;.

   [<a id="ref-RFC5681">RFC5681</a>]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", <a href="./rfc5681">RFC 5681</a>, DOI 10.17487/RFC5681, September 2009,
              &lt;<a href="http://www.rfc-editor.org/info/rfc5681">http://www.rfc-editor.org/info/rfc5681</a>&gt;.

   [<a id="ref-RFC7713">RFC7713</a>]  Mathis, M. and B. Briscoe, "Congestion Exposure (ConEx)
              Concepts, Abstract Mechanism, and Requirements", <a href="./rfc7713">RFC 7713</a>,
              DOI 10.17487/RFC7713, December 2015,
              &lt;<a href="http://www.rfc-editor.org/info/rfc7713">http://www.rfc-editor.org/info/rfc7713</a>&gt;.

   [<a id="ref-RFC7837">RFC7837</a>]  Krishnan, S., Kuehlewind, M., Briscoe, B., and C. Ralli,
              "IPv6 Destination Option for Congestion Exposure (ConEx)",
              <a href="./rfc7837">RFC 7837</a>, DOI 10.17487/RFC7837, May 2016,
              &lt;<a href="http://www.rfc-editor.org/info/rfc7837">http://www.rfc-editor.org/info/rfc7837</a>&gt;.














<span class="grey">Kuehlewind &amp; Scheffenegger    Experimental                     [Page 18]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-19" ></span>
<span class="grey"><a href="./rfc7786">RFC 7786</a>               TCP Modifications for ConEx              May 2016</span>


<span class="h3"><a class="selflink" id="section-9.2" href="#section-9.2">9.2</a>.  Informative References</span>

   [<a id="ref-ACCURATE">ACCURATE</a>] Briscoe, B., Kuehlewind, M., and R. Scheffenegger, "More
              Accurate ECN Feedback in TCP", Work in Progress,
              <a href="./draft-ietf-tcpm-accurate-ecn-00">draft-ietf-tcpm-accurate-ecn-00</a>, December 2015.

   [<a id="ref-DCTCP">DCTCP</a>]    Alizadeh, M., Greenberg, A., Maltz, D., Padhye, J., Patel,
              P., Prabhakar, B., Sengupta, S., and M. Sridharan, "Data
              Center TCP (DCTCP)", ACM SIGCOMM Computer Communication
              Review, Volume 40, Issue 4, pages 63-74,
              DOI 10.1145/1851182.1851192, October 2010,
              &lt;<a href="http://portal.acm.org/citation.cfm?id=1851192">http://portal.acm.org/citation.cfm?id=1851192</a>&gt;.

   [<a id="ref-ECNTCP">ECNTCP</a>]   Briscoe, B., Jacquet, A., Moncaster, T., and A. Smith,
              "Re-ECN: Adding Accountability for Causing Congestion to
              TCP/IP", Work in Progress, <a href="./draft-briscoe-conex-re-ecn-tcp-04">draft-briscoe-conex-re-ecn-</a>
              <a href="./draft-briscoe-conex-re-ecn-tcp-04">tcp-04</a>, July 2014.

   [<a id="ref-RFC3522">RFC3522</a>]  Ludwig, R. and M. Meyer, "The Eifel Detection Algorithm
              for TCP", <a href="./rfc3522">RFC 3522</a>, DOI 10.17487/RFC3522, April 2003,
              &lt;<a href="http://www.rfc-editor.org/info/rfc3522">http://www.rfc-editor.org/info/rfc3522</a>&gt;.

   [<a id="ref-RFC3708">RFC3708</a>]  Blanton, E. and M. Allman, "Using TCP Duplicate Selective
              Acknowledgement (DSACKs) and Stream Control Transmission
              Protocol (SCTP) Duplicate Transmission Sequence Numbers
              (TSNs) to Detect Spurious Retransmissions", <a href="./rfc3708">RFC 3708</a>,
              DOI 10.17487/RFC3708, February 2004,
              &lt;<a href="http://www.rfc-editor.org/info/rfc3708">http://www.rfc-editor.org/info/rfc3708</a>&gt;.

   [<a id="ref-RFC4015">RFC4015</a>]  Ludwig, R. and A. Gurtov, "The Eifel Response Algorithm
              for TCP", <a href="./rfc4015">RFC 4015</a>, DOI 10.17487/RFC4015, February 2005,
              &lt;<a href="http://www.rfc-editor.org/info/rfc4015">http://www.rfc-editor.org/info/rfc4015</a>&gt;.

   [<a id="ref-RFC5682">RFC5682</a>]  Sarolahti, P., Kojo, M., Yamamoto, K., and M. Hata,
              "Forward RTO-Recovery (F-RTO): An Algorithm for Detecting
              Spurious Retransmission Timeouts with TCP", <a href="./rfc5682">RFC 5682</a>,
              DOI 10.17487/RFC5682, September 2009,
              &lt;<a href="http://www.rfc-editor.org/info/rfc5682">http://www.rfc-editor.org/info/rfc5682</a>&gt;.

   [<a id="ref-RFC6582">RFC6582</a>]  Henderson, T., Floyd, S., Gurtov, A., and Y. Nishida, "The
              NewReno Modification to TCP's Fast Recovery Algorithm",
              <a href="./rfc6582">RFC 6582</a>, DOI 10.17487/RFC6582, April 2012,
              &lt;<a href="http://www.rfc-editor.org/info/rfc6582">http://www.rfc-editor.org/info/rfc6582</a>&gt;.

   [<a id="ref-RFC6789">RFC6789</a>]  Briscoe, B., Ed., Woundy, R., Ed., and A. Cooper, Ed.,
              "Congestion Exposure (ConEx) Concepts and Use Cases",
              <a href="./rfc6789">RFC 6789</a>, DOI 10.17487/RFC6789, December 2012,
              &lt;<a href="http://www.rfc-editor.org/info/rfc6789">http://www.rfc-editor.org/info/rfc6789</a>&gt;.



<span class="grey">Kuehlewind &amp; Scheffenegger    Experimental                     [Page 19]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-20" ></span>
<span class="grey"><a href="./rfc7786">RFC 7786</a>               TCP Modifications for ConEx              May 2016</span>


   [<a id="ref-RFC7141">RFC7141</a>]  Briscoe, B. and J. Manner, "Byte and Packet Congestion
              Notification", <a href="https://www.rfc-editor.org/bcp/bcp41">BCP 41</a>, <a href="./rfc7141">RFC 7141</a>, DOI 10.17487/RFC7141,
              February 2014, &lt;<a href="http://www.rfc-editor.org/info/rfc7141">http://www.rfc-editor.org/info/rfc7141</a>&gt;.

Acknowledgements

   The authors would like to thank Bob Briscoe, who contributed the
   initial ideas [<a href="#ref-ECNTCP" title="&quot;Re-ECN: Adding Accountability for Causing Congestion to TCP/IP&quot;">ECNTCP</a>] and valuable feedback.  Moreover, thanks
   to Jana Iyengar who also provided valuable feedback.

Authors' Addresses

   Mirja Kuehlewind (editor)
   ETH Zurich
   Switzerland

   Email: mirja.kuehlewind@tik.ee.ethz.ch


   Richard Scheffenegger
   NetApp, Inc.
   Am Euro Platz 2
   Vienna  1120
   Austria

   Email: rs.ietf@gmx.at

























Kuehlewind &amp; Scheffenegger    Experimental                     [Page 20]
</pre>