解密协议层的攻击——HTTP请求走私-京东云开发者社区

最近一直在研究一些比较有意思的攻击方法与思路，在查阅本地文档的时候(没错，本地，我经常会将一些有意思的文章但是没时间看就会被我保存pdf到本地），一篇2019年Black hat的议题——HTTP请求走私，进入我的视野，同时我也查阅到在2020 Blackhat中该攻击手法再次被分析。我对此产生浓厚学习兴趣，于是便有了这篇文章。

HTTP请求走私是一种HTTP协议的攻击利用方法，该攻击产生的原因在于HTTP代理链中HTTP Server的实现中存在不一致的问题。

时间线

2004年，@Amit Klein提出HTTP Response Splitting技术，为HTTP Smuggling攻击雏形；
2005年，第一次被@Watchfire所提出，并对其进行了详细介绍；
2016年，DEFCON 24上，@regilero在他的议题——Hiding Wookiees in HTTP中在对前面报告进行丰富与扩充；
2019年，Blackhat USA上，PortSwigger的@James Kettle在其议题——HTTP DESYNC ATTACKS SMASHING INTO THE CELL NEXT DOOR中对当前网络环境进行了分析，同时在其利用上加入chunked技术，对现有攻击面进行了拓展；
2020年，Blackhat USA上，@Amit Klein在其议题——HTTP Request Smuggling in 2020中最新变种手法进行分析，同时对各类环境场景下进行了分析。

漏洞利用场景分析

HTTP协议请求走私并不像其他web攻击手法那么直观，而是在更加复杂的网络环境中，因不同服务器基于不同的RFC标准实现的针对HTTP协议包的不同处理方式而产生的一种安全风险。

在对其漏洞进行分析前，首先需要了解目前被广泛使用的HTTP 1.1协议特性——Keep-Alive、Pipeline技术。

简单来说，在HTTP 1.0及其以前版本的协议中，在每次进行交互的时候，C/S两端都需要进行TCP的三次握手链接。而如今的web页面大部分主要还是由大量静态资源所组成。如果依然按照HTTP 1.0及其以前版本的协议设计，会导致服务器大量的负载被浪费。于是在HTTP 1.1中，增加了Keep-Alive、Pipeline技术。

KEEP-ALIVE

根据RFC7230规范中section-6.3可以得知，HTTP 1.1中默认使用persistent connections方式。其实现手法是在HTTP通信包中加入Connection: Keep-Alive标识：在一次HTTP通信后不会关闭TCP连接，而在后续相同目标服务器请求中复用该空闲的TCP通道，避免了由于新建TCP连接产生的时延和服务器资源消耗，提升用户资源访问速度。

PIPELINE

而在Keep-Alive中后续又有了Pipeline机制,这样客户端就可以像流水线一样不用等待某个包的响应而持续的向服务器发包。而服务器也会遵循先进先出原则对客户端请求进行响应。

如图，我们可以看到相比于no pipelining模式，pipelining模式下服务器在响应时间上有了很大的提升。

现如今，为了提高用户浏览速度、加强服务稳定性、提升使用体验以及减轻网络负担。大部分厂商都会使用CDN加速服务或负载均衡LB等部署业务。当用户访问服务器静态资源时，将直接从CDN上获取详情，当存在真正服务器交互时，才会与后端服务器产生交互。如图所示：

但是，该模式中reverse proxy部分将长期与back-end部分通信，一般情况下这部分连接会重用TCP通道。通俗来说，用户流量来自四面八方，user端到reverse proxy端通信会建立多条TCP通道，而rever proxy与back-end端通信ip固定，这两者重用TCP连接通道来通信便顺理成章了。

在这种场景下，当不同服务器实现时参考的RFC标准不同时，我们向reverse proxy发送一个比较模糊的HTTP请求时，因为reverse proxy与back-end基于不同标准进行解析，可能产生reverse proxy认为该HTTP请求合法，并转发到back-end，而back-end只认为部分HTTP请求合法，剩下的多余请求，便就算是夹带走私的HTTP请求了。当该部分对正常用户的请求造成了影响之后，就实现了HTTP走私攻击。如图所示：深色为正常请求，橙色为走私请求，绿色为正常用户请求。一起发包情况下，走私的请求内容被拼接到正常请求中。

CHUNKED数据包格式

分块传输编码（Chunked transfer encoding）是超文本传输协议（HTTP）中的一种数据传输机制，允许HTTP的数据可以分成多个部分。

如下图所示，为jdcloud.com未进行数据包进行chunked。

当对jdcloud.com进行分块时，如下图所示。

常见攻击

注：后续文章中所提到CL=Content-Length，TE=Transfer-Encoding，如需使用burpsuite进行数据包调试时，需去除Repeater中Update Content-Length选项。

场景1：GET请求中CL不为0的情况

主要指在GET中设置Content-Length长度，使用body发送数据。当然这里也不仅仅限制与GET请求中，只是GET的理解比较典型，所以我们用在做例子。

在RFC7230 Content-Length部分提到：

For example, a Content-Length header field is normally sent in a POST request even when the value is 0 (indicating an empty payload body). A user agent SHOULD NOT send a Content-Length header field when the request message does not contain a payload body and the method semantics do not anticipate such a body.

在最新的RFC7231 4.3.1 GET中也仅仅提了一句：

A payload within a GET request message has no defined semantics; sending a payload body on a GET request might cause some existing implementations to reject the request.

从官方规范文档可以了解到：RFC规范并未严格的规范Server端处理方式，对该类请求的规范也适当进行了放松，但是也是部分情况。由于这些中间件没有一个严格的标准依据，所以也会产生解析差异导致HTTP Smuggling攻击。

构造数据包

1GET / HTTP/1.1\r\n
2Host: example.com\r\n
3Content-Length: 44\r\n
4
5GET /secret HTTP/1.1\r\n
6Host: example.com\r\n
7\r\n

<左右滑动以查看完整代码>

由于GET请求，服务器将不对Content-Length进行处理，同时因为Pipeline的存在，后端服务器会将该数据包视为两个GET请求。分别为：

请求——1

1GET / HTTP/1.1\r\n
2Host: example.com\r\n

<左右滑动以查看完整代码>

请求——2

1GET /secret HTTP/1.1\r\n
2Host: example.com\r\n

<左右滑动以查看完整代码>

这就导致了请求走私。

场景2：CL-CL

在RFC7230的第3.3.3节中的第四条中，规定当服务器收到的请求中包含两个Content-Length，而且两者的值不同时，需要返回400错误。

If a message is received without Transfer-Encoding and with either multiple Content-Length header fields having differing field-values or a single Content-Length header field having an invalid value, then the message framing is invalid and the recipient MUST treat it as an unrecoverable error. If this is a request message, the server MUST respond with a 400 (Bad Request) status code and then close the connection. If this is a response message received by a proxy, the proxy MUST close the connection to the server, discard the received response, and send a 502 (Bad Fielding & Reschke Standards Track [Page 32] RFC 7230 HTTP/1.1 Message Syntax and Routing June 2014 Gateway) response to the client. If this is a response message received by a user agent, the user agent MUST close the connection to the server and discard the received response.

但是某些服务器并没遵循规范进行实现，当服务器未遵循该规范时，前后服务器都不会响应400。可能造成代理服务器使用第一个Content-Length获取长度，而后端按照第二个Content-Length获取长度。

构造数据包

1POST / HTTP/1.1\r\n
2Host: example.com\r\n
3Content-Length: 8\r\n
4Content-Length: 7\r\n
5
612345\r\n
7a

<左右滑动以查看完整代码>

这个时候，当后端服务器接受到数据包时，Content-Length长度为7。实际上接受到的body为12345\r\n，而我们前面所提到的，代理服务器会与后端服务器重用TCP通道，这个时候a就会拼接到下一个请求。这个时候如果存在一个用户发起GET请求。则该用户GET请求实际为：

1aGET / HTTP/1.1\r\n
2Host: example.com\r\n

<左右滑动以查看完整代码>

同时该用户也会收到一个类似aGET request method not found的报错响应，其实这样就已经实现了一次HTTP协议走私攻击，对正常用户造成了影响，而且后续可以扩展成类似于CSRF的攻击方式。

但是两个Content-Length这种请求包还是太过于理想化了，一般的服务器都不会接受这种存在两个请求头的请求包，但是在RFC2616的第4.4节中，规定：

The transfer-length of a message is the length of the message-body as it appears in the message; that is, after any transfer-codings have been applied. When a message-body is included with a message, the transfer-length of that body is determined by one of the following (in order of precedence):

If a Transfer-Encoding header field (section 14.41) is present and has any value other than "identity", then the transfer-length is defined by use of the "chunked" transfer-coding (section 3.6), unless the message is terminated by closing the connection.

也就是说，当Content-Length与Transfer-Encoding同时被定义使用时，可忽略Content-Length。也就是说当Transfer-Encoding的加入，两个Content-Length并不影响代理服务器与后端服务器的响应。

场景3：CL-TE

这里的情况是指代理服务器处理Content-Length，后端服务器会遵守RFC2616的规定，处理Transfer-Encoding的情况(这里也就是场景2后边所提到的情况)。

构造数据包

 1POST / HTTP/1.1\r\n
 2Host: example.com\r\n
 3User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:56.0) Gecko/20100101 Firefox/56.0\r\n
 4Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n
 5Accept-Language: en-US,en;q=0.5\r\n
 6Connection: keep-alive\r\n
 7Content-Length: 6\r\n
 8Transfer-Encoding: chunked\r\n
 9\r\n
100\r\n
11\r\n
12G

<左右滑动以查看完整代码>

因前后服务器规范不同，解析如下：

请求——1 (代理服务器的解析）

 1POST / HTTP/1.1\r\n
 2Host: example.com\r\n
 3User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:56.0) Gecko/20100101 Firefox/56.0\r\n
 4Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n
 5Accept-Language: en-US,en;q=0.5\r\n
 6Connection: keep-alive\r\n
 7Content-Length: 6\r\n
 8Transfer-Encoding: chunked\r\n
 9\r\n
100\r\n
11\r\n
12G

<左右滑动以查看完整代码>

请求——2 (代理服务器的解析）

其中G被留在缓存区中，当无其他用户请求时候，该数据包会不会产生解析问题，但TCP重用情况，当一个正常请求过来时候。将出现如下情况：

1GPOST / HTTP/1.1\r\n
2Host: example.com\r\n
3....

<左右滑动以查看完整代码>

这个时候HTTP包，再一次通过TCP通道进行走私。

场景4：TE-CL

即代理服务器处理Transfer-Encoding请求，后端服务器处理Content-Length请求。

构造数据包

 1POST / HTTP/1.1\r\n
 2Host: example.com\r\n
 3User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:56.0) Gecko/20100101 Firefox/56.0\r\n
 4Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n
 5Accept-Language: en-US,en;q=0.5\r\n
 6Content-Length: 4\r\n
 7Transfer-Encoding: chunked\r\n
 8\r\n
 912\r\n
10GPOST / HTTP/1.1\r\n
11\r\n
120\r\n
13\r\n

<左右滑动以查看完整代码>

由于Transfer-Encoding遇到0\r\n\r\n才结束解析。此时后端将解析Content-Length,真正到达后端数据将为：

1POST / HTTP/1.1\r\n
2Host: example.com\r\n
3User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:56.0) Gecko/20100101 Firefox/56.0\r\n
4Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n
5Accept-Language: en-US,en;q=0.5\r\n
6Content-Length: 4\r\n
7\r\n
812\r\n

<左右滑动以查看完整代码>

同时将出现第二个数据包：

1GPOST / HTTP/1.1\r\n
2\r\n
30\r\n
4\r\n

<左右滑动以查看完整代码>

场景5：TE-TE

当收到存在两个请求头的请求包时，前后端服务器都处理Transfer-Encoding请求头，这确实是实现了RFC的标准。不过前后端服务器毕竟不是同一种，这就有了一种方法，我们可以对发送的请求包中的Transfer-Encoding进行某种混淆操作(这里主要指Content-Length)，从而使其中一个服务器不处理Transfer-Encoding请求头。从某种意义上还是CL-TE或者TE-CL。

构造数据包

 1POST / HTTP/1.1\r\n
 2Host: example.com\r\n
 3User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:56.0) Gecko/20100101 Firefox/56.0\r\n
 4Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n
 5Content-length: 4\r\n
 6Transfer-Encoding: chunked\r\n
 7Transfer-encoding: cow\r\n
 8\r\n
 95c\r\n
10GPOST / HTTP/1.1\r\n
11Content-Type: application/x-www-form-urlencoded\r\n
12Content-Length: 15\r\n
13\r\n
14x=1\r\n
150\r\n
16\r\n

<左右滑动以查看完整代码>

攻击场景分析

使用PortSwigger的实验环境环境进行实际攻击演示。

复用TCP进行管理员操作

*靶场链接：

https://portswigger.net/web-security/request-smuggling/exploiting/lab-bypass-front-end-controls-cl-te

This lab involves a front-end and back-end server, and the front-end server doesn't support chunked encoding. There's an admin panel at /admin, but the front-end server blocks access to it.

To solve the lab, smuggle a request to the back-end server that accesses the admin panel and deletes the user carlos.

实验目的：访问admin页，并利用认证对carlos用户进行删除。

SETP 1、因直接访问/admin目录被提示拦截，同时题目提示CL.TE。这里通过构造CL.TE格式数据包，尝试访问。/admin路由。

SETP 2、访问提示管理员界面只允许为本地用户访问，尝试直接访问localhost,并获取到删除用户路由地址。

SETP 3、通过构造请求访问即可，最终再次访问/admin显示页面已经没有删除carlos用户选项。

结合业务获取用户请求包 -1

*链接：

https://portswigger.net/web-security/request-smuggling/exploiting/lab-reveal-front-end-request-rewriting

This lab involves a front-end and back-end server, and the front-end server doesn't support chunked encoding.

There's an admin panel at /admin, but it's only accessible to people with the IP address 127.0.0.1. The front-end server adds an HTTP header to incoming requests containing their IP address. It's similar to the X-Forwarded-For header but has a different name.

To solve the lab, smuggle a request to the back-end server that reveals the header that is added by the front-end server. Then smuggle a request to the back-end server that includes the added header, accesses the admin panel, and deletes the user carlos.

实验目的同上，不过这里在前端服务器做了限制。不支持chunked。同时前端到后端做了检查,在headers中自定义了一个类似于X-Forwarded-For的头。

SETP1、通过页面search处直接构造走私数据包，在页面返回中间服务器到后端服务器数据包内容(走私数据包长度当前为200，若实际场景中显示不全则可通过增加CL长度解决)，获取到X-uNiqsg-Ip头。同时,这里之所以选择search处，主要是因为该处在页面存在输出。

SETP2、通过伪造同样请求发包即可。

结合业务获取用户请求包 -2

*链接：

https://portswigger.net/web-security/request-smuggling/exploiting/lab-capture-other-users-requests

This lab involves a front-end and back-end server, and the front-end server doesn't support chunked encoding.

To solve the lab, smuggle a request to the back-end server that causes the next user's request to be stored in the application. Then retrieve the next user's request and use the victim user's cookies to access their account.

实验目的：通过写页面的方式，获取下一个请求数据包中cookie数据。

SETP1、发现post?postId=路由下存在写页面操作，通过修改数据包。

SETP2、访问当前页面查看website处即可获取到下一个请求包的数据。（这里同样也可以控制下一个请求包数据在评论区，只需将最后一个评论参数comment放至最后即可）

反射XSS

*链接：

https://portswigger.net/web-security/request-smuggling/exploiting/lab-deliver-reflected-xss

This lab involves a front-end and back-end server, and the front-end server doesn't support chunked encoding.

The application is also vulnerable to reflected XSS via the User-Agent header.

To solve the lab, smuggle a request to the back-end server that causes the next user's request to receive a response containing an XSS exploit that executes alert(1).

应用场景：当业务存在反射型XSS时，可通过缓存投毒的方式在其他用户页面写入脏数据。

SETP1、进入任意评论区发现页面存在userAgent回显，通过走私协议修改userAgent即可。

进行缓存投毒

*链接：

https://portswigger.net/web-security/request-smuggling/exploiting/lab-perform-web-cache-poisoning

This lab involves a front-end and back-end server, and the front-end server doesn't support chunked encoding. The front-end server is configured to cache certain responses.

To solve the lab, perform a request smuggling attack that causes the cache to be poisoned, such that a subsequent request for a JavaScript file receives a redirection to the exploit server. The poisoned cache should alert document.cookie.

应用场景：劫持下一用户请求页面。（实际场景中可劫持跳转至钓鱼等页面）

SETP1、缓存注入修改Host为恶意请求。

关于防御

从前面的案例我们可以看到HTTP请求走私的危害性，那么如何防御呢？

禁用代理服务器与后端服务器之间的TCP连接重用。
使用HTTP/2协议。
前后端使用相同的服务器。

但是这些修复方法又存在一些现实困难：

HTTP/2推行过于困难，尽管HTTP/2兼容HTTP/1.1。
取消TCP重用将增大服务器负载，服务器资源吃不消。
使用相同的服务器，在一些厂商其实也很难实现。其主要原因还是前后端实现标准不一致的问题。

那么没有解决方案了嘛？

其实不然，上云就是个很好的方案。云主机、CDN、WAF都统一实现编码规范，可以很好地避免该类问题的产生。

参考链接

*https://media.defcon.org/DEF%20CON%2024/DEF%20CON%2024%20presentations/DEF%20CON%2024%20-%20Regilero-Hiding-Wookiees-In-Http.pdf
*https://portswigger.net/research/http-desync-attacks-request-smuggling-reborn
*https://regilero.github.io/english/security/2019/10/17/securityapachetrafficserverhttp_smuggling/
*https://paper.seebug.org/1048
*https://tools.ietf.org/html/rfc2616
*http://blog.zeddyu.info/2019/12/05/HTTP-Smuggling/
*https://tools.ietf.org/html/rfc7230
*https://tools.ietf.org/html/rfc7231