mirror of https://github.com/beyondx/Notes.git
synced 2026-02-04 02:43:32 +08:00

Commit: Add New Notes

Zim/Programme/APUE/socket/How_do_you_get_ECONNRESET_on_recv.txt (new file, 411 lines)
Content-Type: text/x-zim-wiki
Wiki-Format: zim 0.4
Creation-Date: 2012-02-27T16:36:05+08:00

====== How do you get ECONNRESET on recv ======
Created Monday 27 February 2012

http://fixunix.com/unix/84635-how-do-you-get-econnreset-recv.html

The man pages for __recv__ and __read__ list the error __ECONNRESET__ as an error
condition that happens when "A connection was forcibly closed __by a
peer__." ~~I take this to mean that, assuming a TCP connection, if a
client is recv'ing from a server, and the server suddenly crashes,
then on the client side recv will return -1 and set errno to
ECONNRESET.~~

For the purpose of **robust error handling**, I'm trying to integrate
routines to take care of this sort of thing in my program. But I
simply **can't actually get recv to return ECONNRESET** in any of my
tests. For testing purposes, I set up a simple server and a simple
client. The server sends data, and a background thread raises a
__SIGSEGV__ while the server is sending, causing the whole program to
crash. Meanwhile, the client is recv'ing. But when the server
crashes, the client does not report an ECONNRESET error. Rather, recv
__returns 0 and errno is set to 0__. No error condition is generated at
all. But the man page says that recv should only return 0 if "the
peer has performed an orderly shutdown". And a SIGSEGV is certainly
not my idea of an "orderly shutdown"!

So, is the behavior of recv in this respect implementation-defined,
i.e. not identical across platforms? Maybe some UNIX environments
return ECONNRESET on recv, but others don't? Or does recv never
return ECONNRESET with TCP?

--------------------------------------------------------
10-04-2007 12:09 AM #2
Re: How do you get ECONNRESET on recv?

On 2007-08-09, chsalvia@gmail.com wrote:
> The man page for recv and read list the error ECONNRESET as an error
> condition that happens when "A connection was forcibly closed by a
> peer." I take this to mean that, assuming a TCP connection, if a
> client is recv'ing from a server, and the server suddenly crashes,
> then on the client side recv will return -1 and set errno to
> ECONNRESET.

Well, your understanding is __probably wrong__.

TCP answers with RESET when you try to __send__ some data to a peer that
does not want to __read__ that data. In other words, the peer has __closed__
the connection or has done a __shutdown of reading__.

Normally, if the peer __closes__ the connection, recv returns 0 without any
error. The same applies when the peer application crashes.

Now, if you try to send data to the peer after you got 0 from recv,
you should get RESET.

__A return of 0 from recv means the peer has closed the connection (close), or has shut down its write side (shutdown(SHUT_WR)).__
If the local side also calls close and then issues another send, an RST
segment comes back.

If it is the former (the peer closed the connection), the server's TCP
answers the client's send with an RST segment; the send fails with errno
set to **ECONNRESET.**

If you try to send data after you got RESET, you'll __get EPIPE or SIGPIPE__.

So, theoretically, you can see ECONNRESET in recv only if the peer does
__shutdown(SHUT_RD)__ and you try to send some data after this, which
usually never happens. More often the peer __closes__ the socket unexpectedly
while you are sending many chunks of data, and as a result you get
SIGPIPE: your first send triggers RESET, and your second send
triggers SIGPIPE, because you didn't see the RESET.
--
Minds, like parachutes, function best when open

--------------------------------------------------------
10-04-2007 12:09 AM #3
Re: How do you get ECONNRESET on recv?

chsalvia@gmail.com wrote:
> The man page for recv and read list the error ECONNRESET as an error
> condition that happens when "A connection was forcibly closed by a
> peer." I take this to mean that, assuming a TCP connection, if a
> client is recv'ing from a server, and the server suddenly crashes,
> then on the client side recv will return -1 and set errno to
> ECONNRESET.

Actually no. When the server _system_ suddenly crashes, your
application receives nothing.

ECONNRESET means the connection has received a ReSeT (RST) segment
(ostensibly) from the remote TCP. There are a multitude of reasons
such a segment could be received, including, but not limited to:

*) the remote abused SO_LINGER and did an abortive close of the
connection

*) your application sent data which arrived after the remote called
shutdown(SHUT_RD) or close()

*) the remote TCP hit a retransmission limit and aborted (yes, if the
data segments weren't getting through, the chances of the RST making
it are slim, but still non-zero)

*) there was some actual TCP protocol error between the two systems

99 times out of ten, if the server _application_ terminates
(prematurely), the normal close() which happens on almost all platforms
will cause TCP to emit a FINished (FIN) segment. That would then be a
recv/read return of zero at your end. Of course, if your application
ignored that and then tried to send something, that brings us to the
second bullet item above.

> But the man page says that recv should only return 0 if "the peer
> has performed an orderly shutdown". But a SIGSEGV is certainly not
> my idea of an "orderly shutdown"!

Ah, but as per above, 99 times out of ten, when the OS is cleaning up
after the SIGSEGV'd application, it goes ahead and calls (the moral
equivalent of) close(), which, unless perhaps the application has set
the abortive-close SO_LINGER options, will result in a FIN being sent.
The TCP code doesn't know the difference between a close() from the
app making a direct call, the system making a close() call on normal
program termination, or one from abnormal termination.

I suppose you could try setting the SO_LINGER options on the server
code to cause an RST when close() is called and then see what killing
the process does. Just be sure that you only do that in a debug
version and/or have code to put SO_LINGER back the way it should be
when doing a "normal" close() in your server app. Hmm, that might be
one of the few valid (IMO) reasons to use that otherwise heinous
direct-to-RST SO_LINGER option... Perhaps one day I will try that
with netperf.
rick jones
--
Process shall set you free from the need for rational thought.
these opinions are mine, all mine; HP might not want them anyway...
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...

--------------------------------------------------------
10-04-2007 12:09 AM #4
Re: How do you get ECONNRESET on recv?

Andrei Voropaev writes:
> So, theoretically, you can see ECONNRESET in recv only if the peer does
> shutdown(SHUT_RD) and you try to send some data after this. Which
> usually never happens. More often the peer closes the socket unexpectedly
> while you are sending many chunks of data, and as a result you get
> SIGPIPE, because your first send triggers RESET, and your second send
> triggers SIGPIPE, because you didn't see the RESET.

Actually, a more common cause is that the peer uses the SO_LINGER
option, sets l_onoff to 1 (true) and l_linger to 0 (zero time), then
closes the socket. On systems that implement BSD sockets properly,
that causes the system to emit TCP RST and blow away the connection.
Your application will then see ECONNRESET or SIGPIPE or EPIPE,
depending on where it was when the message was received.

--
James Carlson, Solaris Networking
Sun Microsystems / 1 Network Drive 71.232W Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677

--------------------------------------------------------
10-04-2007 12:09 AM #5
Re: How do you get ECONNRESET on recv?

Rick Jones wrote:
> > [snip: the original question]
> Actually no. When the server _system_ suddenly crashes, your
> application receives nothing.

But if you were sending something at the time that it crashed, your
system will keep retransmitting. When the server reboots, it will
respond to the retransmission with a RST, and this will cause you to get
ECONNRESET.

--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***

--------------------------------------------------------
10-04-2007 12:09 AM #6
Re: How do you get ECONNRESET on recv?

Barry Margolin wrote:
> But if you were sending something at the time that it crashed, your
> system will keep retransmitting. When the system reboots, it will
> respond to the retransmission with a RST, and this will cause you to
> get ECONNRESET.

I thought one got some sort of timed-out or unreachable errno or
somesuch?

rick jones
--
No need to believe in either side, or any side. There is no cause.
There's only yourself. The belief is in your own precision. - Jobert

--------------------------------------------------------
10-04-2007 12:09 AM #7
Re: How do you get ECONNRESET on recv?

Rick Jones wrote:
> I thought one got some sort of timed-out or unreachable errno or
> somesuch?

Only if the reboot takes longer than the retransmission limit. In the
days when a reboot took several minutes that would be likely, but these
days many systems can reboot in under a minute (unless they have to do
lengthy fsck's), so the ECONNRESET is a possibility.

--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA

--------------------------------------------------------
10-04-2007 12:09 AM #8
Re: How do you get ECONNRESET on recv?

Barry Margolin wrote:
> Only if the reboot takes longer than the retransmission limit. [snip]
> so the ECONNRESET is a possibility.

But what of the RFC-suggested (or is it mandated?) quiet time on stack
start? :-)

rick jones
--
a wide gulf separates "what if" from "if only"

--------------------------------------------------------
10-04-2007 12:09 AM #9
Re: How do you get ECONNRESET on recv?

Rick Jones wrote:
> But what of the RFC-suggested (or is it mandated?) quiet time on stack
> start? :-)

If I understand it correctly, this just prohibits the rebooted system
from initiating connections during the quiet time. It doesn't affect
responding to segments received. In fact, the point of the quiet time
is to ensure that new connections don't inadvertently reuse the port and
sequence numbers of connections from before the reboot, which would
prevent responding to those packets with RST.

--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA

--------------------------------------------------------
10-04-2007 12:10 AM #10
Re: How do you get ECONNRESET on recv?

I have written a multi-threaded C client application running on HP-UX
11 PA-RISC that sends a SOAP request via send() to a webservice
residing on a Windows PC. The send() is always successful, and I
perform no other socket calls until I issue a recv() to get the
webservice response. Very occasionally the application either gets no
response or an ECONNRESET error. We suspect some sort of network
issue between the client and server. I have built and deployed my
client application for several other platforms including Solaris and
Windows and have never experienced this issue.

I feel that my application should handle this situation more
gracefully, and to this end my questions are:

Is it safe to assume anything regarding the state of the send()
request on the webservice server? More specifically, what would be the
proper way for my client application to recover?

On Aug 9, 3:27 pm, Rick Jones wrote:
> [snip: reply #3 quoted in full]

--------------------------------------------------------
10-04-2007 12:10 AM #11
Re: How do you get ECONNRESET on recv?

Fish Maker wrote:
> [snip]
> Is it safe to assume anything regarding the state of the send()
> request on the webservice server? More specifically, what would be
> the proper way for my client application to recover?

If you have received nothing but the ECONNRESET on the recv(), you can
assume nothing about the state of the request on the server. You do
not know if the server application received the data, nor if it acted
upon the data if it did receive it. To know that, you need to receive
some sort of message from the server application.

I'm not fully up on all the terminology, but you may want to web
search on "two phase commit."

rick jones
--
web2.0 n, the dot.com reunion tour...
--------------------------------------------------------

Zim/Programme/APUE/socket/TCP连接关闭总结.txt (new file, 81 lines)

Content-Type: text/x-zim-wiki
Wiki-Format: zim 0.4
Creation-Date: 2012-02-27T20:06:34+08:00

====== TCP Connection Close Summary ======
Created Monday 27 February 2012

http://blog.csdn.net/shallwake/article/details/5250467

This topic is too broad to cover fully, so here is only a brief summary. For details see **UNIX Network Programming**, Volume 1, Sections 5.7, 5.12, 5.14, 5.15, 6.6, and 7.5 (the SO_LINGER option).

Take a simple echo server as an example: the client reads characters from standard input and sends them to the server; the server sends them back unchanged; the client prints what it receives to standard output.

A socket can then be closed in the following scenarios:

1. The client actively closes the connection:

1.1 The client calls close()
1.2 The client process exits
1.3 The client calls shutdown()
1.4 The client calls close() with the SO_LINGER option
1.5 The client host crashes (sudden power loss, network cable pulled, abnormal shutdown: the kernel never sends a FIN, and the host does not reboot)

2. The server closes the connection:

2.1 The server calls close()
2.2 The server process exits
2.3 The server host crashes
2.4 The server host crashes, with the SO_KEEPALIVE option set

====================================================

Cases 1.1 and 1.2 are equivalent: even when the client process just exits, the kernel calls close(socket) automatically. Note that the close actually happens only __when the socket's reference count drops to 0__, and that __close() always returns immediately; the kernel then tries to send any data remaining in the socket buffer, and only after that sends the FIN__. So __data a process has sent may not yet have reached the peer by the time the process has exited__.

This brings us to TCP's four-segment close handshake, which can be seen as two FIN/ACK pairs. The side that closes first sends a FIN; after receiving the ACK it enters FIN_WAIT_2, also called the __"half-closed" state__. Note in particular that in this state __the closing side's socket can still receive data, but can no longer send__.
Notes:
1. "Send" here means a send or write system call executed __after the local TCP has sent the FIN and received its ACK (the FIN being caused by close() or shutdown(SHUT_WR))__; it does **not include data still unsent in the sender's kernel TCP buffer** (the send calls for that data were made, and returned successfully, before the close).
2. If you keep sending data after close or shutdown, send/write may __receive SIGPIPE and then fail with errno set to EPIPE__.

The passive side, having received the FIN, normally __learns of the peer's close when read(socket) returns 0 (and it can still send data locally)__. When it then calls close(socket), the second FIN/ACK pair follows, the active side enters the TIME_WAIT state, and the four-segment handshake is complete.

That is how a connection is closed in the normal case.

Now consider 1.3. shutdown() differs from close() in two main ways:

* __shutdown() ignores the descriptor's reference count and acts on the connection directly; shutting down the send side transmits what is left in the send buffer and then the FIN__;
* shutdown() can close __just one direction__ of the connection: the send side, the receive side, or both;

__In fact, after shutdown(SHUT_WR) you are in exactly the half-closed state described above, and the four-segment handshake can still complete.__

===== Case 1.4: why set SO_LINGER? =====

The purpose of SO_LINGER is to __change the default behavior of close()__: you can decide in which state close() returns, or make the socket __send an RST immediately (with no TIME_WAIT state)__. No FIN is sent; the receiving side gets an ECONNRESET error and the connection is **torn down at once**.

To sum up cases 1.1 through 1.4: with so many ways to close a connection, which is best?

The way to choose is to consider the worst case: the peer host crashes, or a network failure stalls packet delivery.

* The RST option is out: there is not even a TIME_WAIT state, so after a network glitch **a newly created socket may receive segments destined for a socket that has already been destroyed**.
* close() cannot guarantee that the peer receives the FIN (close always **returns immediately**; the kernel then tries to flush the TCP buffer before sending the FIN, but __the sending process may already have exited__ by then).
* close() with SO_LINGER can make close() __return only after the ACK is received__, but still cannot guarantee that the four-segment handshake completes.
* shutdown() first enters the half-closed state; if a subsequent read() returns 0 (the peer's FIN arrived), the four-segment handshake is proceeding normally. __This is the best option.__
On reflection, though, the general case need not be this complicated. Take a game server: if the client calls close() and the server doesn't find out, that reduces to case 1.5; if the server calls close() and the client doesn't know, that reduces to case 2.3. Either way there is a remedy.

Case 1.5 is now simple: the server just needs a link-failure detection mechanism, which every large TCP server has anyway: __periodically send small probe packets to detect clients that have exited abnormally__.

====================================================

On the server side:

2.1 and 2.2 are equivalent, and generally also equivalent to 1.1 and 1.2; only the actively closing side is now the server.
2.3: the server host crashes; the client never receives ACKs and keeps retransmitting. A standard socket returns an error only after roughly __9 minutes__.
2.4: the server host crashes while the client exchanges no data with it for a long time; setting the __SO_KEEPALIVE__ option lets the client find out.
====================================================

Postscript: networking is a complex subject, and the closing of a TCP connection alone shows it. Ordinary programmers rarely think it through in this much detail, but I believe these issues trouble them all the same.

Addendum: experiments show that on Windows, cases __1.2 and 2.2 behave like close() with SO_LINGER, sending an RST directly__, presumably because the system must reclaim resources promptly. This is **different from Linux**; try it yourself if you're interested.
--------------------------------------------------------

Zim/Programme/APUE/socket/理解套接字recv(),send().txt (new file, 52 lines)

Content-Type: text/x-zim-wiki
Wiki-Format: zim 0.4
Creation-Date: 2012-02-27T20:43:52+08:00

====== Understanding socket recv() and send() ======
Created Monday 27 February 2012

http://blog.csdn.net/shallwake/article/details/5273727

While reading UNP today I found a very good pair of diagrams; once they are understood, little else about these calls stays confusing, so here is a brief write-up. The diagrams assume input is read from stdin and handed to send(), while data from recv() is written to stdout.

===== send: the kernel send buffer (note that the send and receive buffers are circular) =====
{{./1.jpg}}

* tooptr: points to the next byte __to be handed to the socket__
* toiptr: points to the next position that can __accept data from the application layer__

Therefore:
* The amount of data to hand to the socket is toiptr - tooptr.
* The space available in the kernel buffer for data from stdin is &to[MAXLINE] - toiptr.
* Blocking mode: the call returns as soon as the application's data has been copied into the kernel buffer; if there is not enough buffer space for the __whole payload__ (e.g. the network is slow), it blocks until there is.
* Non-blocking mode: if the buffer is __full__, it returns EWOULDBLOCK immediately; if there is some space, it immediately returns __the number of bytes actually copied__.

====================================================

===== recv: the kernel receive buffer =====
{{./2.jpg}}
* froptr: points to the next byte __to be handed to the application layer__
* friptr: points to the next position that can __accept data from the socket__

Therefore:
* The amount of data __to hand to the application layer__ is friptr - froptr.
* The space available in the kernel buffer for __data arriving from the socket__ is &fr[MAXLINE] - friptr.
* Blocking mode: if the buffer holds no readable data, the call __blocks until some data arrives; the amount returned is indeterminate__: it may be 1 byte or a complete packet.
* Non-blocking mode: if the buffer holds no data, it returns EWOULDBLOCK immediately; if it does hold data, behavior is as above.

====================================================

===== Summary =====

* Blocking or not, never expect a single send(n) or recv(n) to transfer exactly n bytes.
* A clear mental model of the kernel buffers helps greatly in understanding network programming.

===== Food for thought =====

A well-known server design principle is "__never use any blocking operation__".
The reasons are easy to see: first, it keeps the CPU busy; second, it is a matter of safety, since a malicious client can easily make the server block on it.

On __the safety of non-blocking I/O__: I have seen a lot of code that puts a non-blocking send() inside a loop that does not exit until all n bytes have been sent. This works in the normal case, but if the network is slow, the diagrams above suggest the while() loop will also be slow to exit, which inevitably delays the server's sends to every other socket. And that is before considering a malicious peer that, say, reads one byte and then sleep()s.
So, in my view, a high-performance server must not block, and must not put any I/O operation inside a loop that runs until the expected amount of data has been transferred. More on this another time.
(Use poll, epoll, select, etc. instead.)

Zim/Programme/APUE/socket/理解套接字recv(),send()/1.jpg (new binary file, 12 KiB)
Zim/Programme/APUE/socket/理解套接字recv(),send()/2.jpg (new binary file, 12 KiB)