Quantcast

Re: Fwd: backing up to a Drobo CIFS VFS: Error -11 sending data on socket to server, Bufferbloat?

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Fwd: backing up to a Drobo CIFS VFS: Error -11 sending data on socket to server, Bufferbloat?

Samba - samba-technical mailing list
On Fri, May 12, 2017 at 6:45 AM, Robert Kudyba <[hidden email]> wrote:

>> Given your kernel log message I would expect seeing the reconnecting happen.
>
> I scheduled a tshark -c over night and was able to get it to run
> within minutes of the reconnect attempts.It seems pretty consistent,
> about 2 hours after the start of the cron the Error -11's start to
> happen. Last night it also happened about 30 minutes into the job:
>
> May 11 23:47:51 localworkstation kernel: CIFS VFS: Server drobo-rh has
> not responded in 120 seconds. Reconnecting...
> May 11 23:49:59 localworkstation kernel: CIFS VFS: Server drobo-rh has
> not responded in 120 seconds. Reconnecting...
> May 11 23:52:57 localworkstation kernel: CIFS VFS: Server drobo-rh has
> not responded in 120 seconds. Reconnecting...
>
> May 12 01:19:44 localworkstation cupsd: REQUEST localhost - - "POST /
> HTTP/1.1" 200 181 Renew-Subscription successful-ok
> May 12 01:21:20 localworkstation kernel: CIFS VFS: sends on sock
> ffff9f2817bede00 stuck for 15 seconds
> May 12 01:21:20 localworkstation kernel: CIFS VFS: Error -11 sending
> data on socket to server
>
> Here's the tshark -r tshark.pcap -Y 'smb||smb2' results from 2 files:
>
>   593 0.273687063 drobo → localserver SMB 117 [TCP ACKed unseen
> segment] Write AndX Response, 65536 bytes
>   666 0.282264835 drobo → localserver SMB 117 Write AndX Response, 65536 bytes
>     8 0.136979937 drobo → localserver SMB 117 [TCP ACKed unseen
> segment] [TCP Previous segment not captured] Write AndX Response,
> 65536 bytes
>    36 0.142752037 drobo → localserver SMB 117 Write AndX Response, 65536 bytes
>    63 0.149017707 drobo → localserver SMB 117 Write AndX Response, 65536 bytes
>
>> I would wrap your backup script like so to get the right capture:
>>
>>     capfile="$(date -Ihours)"
>>     tcpdump -w "$capfile" port 445 &
>>     cappid=$!
>>     sleep 2
>>     .....<your script>....
>>     sleep 2
>>     kill -6 $cappid
>
> OK just did that and started running the backup manually so far I've
> seen variations/multiples of these (cut for easier reading):
> 38735  18.521926 localserver → drobo SMB 10202 Write AndX Request,
> FID: 0x1fca, 65536 bytes at offset 87949312[TCP segment of a
> reassembled PDU]
> 38760  18.527189 localserver → drobo SMB 7306 Write AndX Request, FID:
> 0x1fca, 65536 bytes at offset 88014848[TCP segment of a reassembled
> PDU]
> 38771  18.529817 drobo → localserver SMB 117 Write AndX Response, FID:
> 0x1fca, 65536 bytes
> 44079  19.615349 drobo → localserver SMB 117 [TCP ACKed unseen
> segment] Write AndX Response, 65536 bytes
> 44102  19.626642 drobo → localserver SMB 117 [TCP ACKed unseen
> segment] Write AndX Response, 65536 bytes
> --
> To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
> the body of a message to [hidden email]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

You are seeing some packet drops which might make it hard to figure
out the issue, but those are simply the writes.

What I would look for is places where the writes stop.

You will likely see it better in wireshark by using Statistics->I/O
Graph as it will show you where the throughput rate drops to zero.

Then you can click on the graph at the point where it drops to zero to
get the main window to go to that packet and then you can relate it to
the tshark output.

The earliest message you posted suggests that the client (Linux) was
complaining of no activity for 120 seconds, which suggests that the
Drobo stopped responding at some point. Thus I would expect to see one
or more writes that did not get responses, followed by a connection
drop and reconnect.

--
Regards,
Richard Sharpe
(何以解憂?唯有杜康。--曹操)

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Fwd: backing up to a Drobo CIFS VFS: Error -11 sending data on socket to server, Bufferbloat?

Samba - samba-technical mailing list
On Fri, May 12, 2017 at 7:11 AM, Richard Sharpe
<[hidden email]> wrote:

> On Fri, May 12, 2017 at 6:45 AM, Robert Kudyba <[hidden email]> wrote:
>>> Given your kernel log message I would expect seeing the reconnecting happen.
>>
>> I scheduled a tshark -c over night and was able to get it to run
>> within minutes of the reconnect attempts.It seems pretty consistent,
>> about 2 hours after the start of the cron the Error -11's start to
>> happen. Last night it also happened about 30 minutes into the job:
>>
>> May 11 23:47:51 localworkstation kernel: CIFS VFS: Server drobo-rh has
>> not responded in 120 seconds. Reconnecting...
>> May 11 23:49:59 localworkstation kernel: CIFS VFS: Server drobo-rh has
>> not responded in 120 seconds. Reconnecting...
>> May 11 23:52:57 localworkstation kernel: CIFS VFS: Server drobo-rh has
>> not responded in 120 seconds. Reconnecting...
>>
>> May 12 01:19:44 localworkstation cupsd: REQUEST localhost - - "POST /
>> HTTP/1.1" 200 181 Renew-Subscription successful-ok
>> May 12 01:21:20 localworkstation kernel: CIFS VFS: sends on sock
>> ffff9f2817bede00 stuck for 15 seconds
>> May 12 01:21:20 localworkstation kernel: CIFS VFS: Error -11 sending
>> data on socket to server
>>
>> Here's the tshark -r tshark.pcap -Y 'smb||smb2' results from 2 files:
>>
>>   593 0.273687063 drobo → localserver SMB 117 [TCP ACKed unseen
>> segment] Write AndX Response, 65536 bytes
>>   666 0.282264835 drobo → localserver SMB 117 Write AndX Response, 65536 bytes
>>     8 0.136979937 drobo → localserver SMB 117 [TCP ACKed unseen
>> segment] [TCP Previous segment not captured] Write AndX Response,
>> 65536 bytes
>>    36 0.142752037 drobo → localserver SMB 117 Write AndX Response, 65536 bytes
>>    63 0.149017707 drobo → localserver SMB 117 Write AndX Response, 65536 bytes
>>
>>> I would wrap your backup script like so to get the right capture:
>>>
>>>     capfile="$(date -Ihours)"
>>>     tcpdump -w "$capfile" port 445 &
>>>     cappid=$!
>>>     sleep 2
>>>     .....<your script>....
>>>     sleep 2
>>>     kill -6 $cappid
>>
>> OK just did that and started running the backup manually so far I've
>> seen variations/multiples of these (cut for easier reading):
>> 38735  18.521926 localserver → drobo SMB 10202 Write AndX Request,
>> FID: 0x1fca, 65536 bytes at offset 87949312[TCP segment of a
>> reassembled PDU]
>> 38760  18.527189 localserver → drobo SMB 7306 Write AndX Request, FID:
>> 0x1fca, 65536 bytes at offset 88014848[TCP segment of a reassembled
>> PDU]
>> 38771  18.529817 drobo → localserver SMB 117 Write AndX Response, FID:
>> 0x1fca, 65536 bytes
>> 44079  19.615349 drobo → localserver SMB 117 [TCP ACKed unseen
>> segment] Write AndX Response, 65536 bytes
>> 44102  19.626642 drobo → localserver SMB 117 [TCP ACKed unseen
>> segment] Write AndX Response, 65536 bytes
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
>> the body of a message to [hidden email]
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
> You are seeing some packet drops which might make it hard to figure
> out the issue, but those are simply the writes.
>
> What I would look for is places where the writes stop.
>
> You will likely see it better in wireshark by using Statistics->I/O
> Graph as it will show you where the throughput rate drops to zero.
>
> Then you can click on the graph at the point where it drops to zero to
> get the main window to go to that packet and then you can relate it to
> the tshark output.
>
> The earliest message you posted suggests that the client (Linux) was
> complaining of no activity for 120 seconds, which suggests that the
> Drobo stopped responding at some point. Thus I would expect to see one
> or more writes that did not get responses, followed by a connection
> drop and reconnect.

BTW, I should mention that if the capture is large you will need lots
of memory to deal with it. I have a Lenovo P50 with 64GB of ram and a
1TB SSD and I can deal with 5GB captures reasonably comfortably, but
with only 8 or 16GB, forget it.

--
Regards,
Richard Sharpe
(何以解憂?唯有杜康。--曹操)

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Fwd: backing up to a Drobo CIFS VFS: Error -11 sending data on socket to server, Bufferbloat?

Samba - samba-technical mailing list
>> You will likely see it better in wireshark by using Statistics->I/O
>> Graph as it will show you where the throughput rate drops to zero.
>>
>> Then you can click on the graph at the point where it drops to zero to
>> get the main window to go to that packet and then you can relate it to
>> the tshark output.

I posted the graph at storm.cis.fordham.edu/~rkudyba/wireshark.pdf

>> The earliest message you posted suggests that the client (Linux) was
>> complaining of no activity for 120 seconds, which suggests that the
>> Drobo stopped responding at some point. Thus I would expect to see one
>> or more writes that did not get responses, followed by a connection
>> drop and reconnect.

Some random messages from tshark not exactly sure what I'm looking for
as the time stamps aren't easy to interpret:
41674 217.172644 drobo → localserver SMB 117 [TCP ACKed unseen
segment] Write AndX Response, FID: 0x4ca4, 65536 bytes
41702 217.177652 localserver → drobo SMB 14546 Write AndX Request,
FID: 0x4ca4, 65536 bytes at offset 54329344[TCP segment of a
reassembled PDU]
41726 217.183095 drobo → localserver SMB 117 Write AndX Response, FID:
0x4ca4, 65536 bytes
 4156 136.143502 drobo → localserver SMB 252 Trans2 Response<unknown>
 4254 161.145741 drobo → localserver SMB 117 [TCP ZeroWindow] Write
AndX Response, 65536 bytes
 4260 161.147530 drobo → localserver SMB 117 Write AndX Response, 65536 bytes
 4566 161.219832 drobo → localserver SMB 117 [TCP ACKed unseen
segment] Write AndX Response, 65536 bytes
 4594 161.225586 drobo → localserver SMB 117 Write AndX Response, 65536 bytes
 4690 162.010514 drobo → localserver SMB 117 [TCP ZeroWindow] Write
AndX Response, 65536 bytes
 4696 162.011564 drobo → localserver SMB 117 Write AndX Response, 65536 bytes
 4697 162.011830 drobo → localserver SMB 204 Trans2 Response<unknown>
 1056   4.654442 drobo → localserver SMB 117 [TCP ACKed unseen
segment] [TCP Previous segment not captured] Write AndX Response,
65536 bytes
 1057   4.660219 drobo → localserver SMB 117 [TCP ACKed unseen
segment] Write AndX Response, 65536 bytes
 1082   4.808276 drobo → localserver SMB 117 [TCP ACKed unseen
segment] [TCP Previous segment not captured] Write AndX Response,
65536 bytes
  2718  33.792075 localserver → drobo SMB 14546 Write AndX Request,
FID: 0x38ff, 65536 bytes at offset 167563264[TCP segment of a
reassembled PDU]
 2723  33.795265 localserver → drobo SMB 20338 Write AndX Request,
FID: 0x38ff, 65536 bytes at offset 167628800[TCP segment of a
reassembled PDU]
 2726  33.806707 drobo → localserver SMB 117 [TCP ACKed unseen
segment] Write AndX Response, 65536 bytes
 2727  33.806736 drobo → localserver SMB 117 Write AndX Response, 65536 bytes
 2730  33.807123 localserver → drobo SMB 17442 Write AndX Request,
FID: 0x38ff, 65536 bytes at offset 167694336[TCP segment of a
reassembled PDU]
 2736  33.811345 localserver → drobo SMB 8754 Write AndX Request, FID:
0x38ff, 65536 bytes at offset 167759872[TCP segment of a reassembled
PDU]
 2739  33.814173 drobo → localserver SMB 117 Write AndX Response, 65536 bytes
 2744  33.818007 localserver → drobo SMB 13098 Write AndX Request,
FID: 0x38ff, 65536 bytes at offset 167825408[TCP segment of a
reassembled PDU]
 41544 300.578408 localserver → drobo SMB 5858 Write AndX Request,
FID: 0x38ff, 65536 bytes at offset 2089480192[TCP segment of a
reassembled PDU]
41548 300.582822 drobo → localserver SMB 117 Write AndX Response, 65536 bytes
41551 300.584432 localserver → drobo SMB 13098 Write AndX Request,
FID: 0x38ff, 65536 bytes at offset 2089545728Write AndX Request, FID:
0x38ff, 4096 bytes at offset 2089828352
41555 300.589855 drobo → localserver SMB 117 Write AndX Response, 65536 bytes
41558 300.590108 localserver → drobo SMB 13098 Write AndX Request,
FID: 0x38ff, 65536 bytes at offset 2089611264[TCP segment of a
reassembled PDU]
41559 300.594149 drobo → localserver SMB 117 Write AndX Response, 65536 bytes
41560 300.594438 drobo → localserver SMB 129 Write AndX Response, 4096 bytes
41584 300.602694 drobo → localserver SMB 129 [TCP ACKed unseen
segment] Write AndX Response, 65536 bytes

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Fwd: backing up to a Drobo CIFS VFS: Error -11 sending data on socket to server, Bufferbloat?

Samba - samba-technical mailing list
On Fri, May 12, 2017 at 10:17 AM, Robert Kudyba <[hidden email]> wrote:
> I posted the graph at storm.cis.fordham.edu/~rkudyba/wireshark.pdf

OK. There looks like some gaps where there is little or no write traffic.

>>> The earliest message you posted suggests that the client (Linux) was
>>> complaining of no activity for 120 seconds, which suggests that the
>>> Drobo stopped responding at some point. Thus I would expect to see one
>>> or more writes that did not get responses, followed by a connection
>>> drop and reconnect.
>
> Some random messages from tshark not exactly sure what I'm looking for
> as the time stamps aren't easy to interpret:
> 41674 217.172644 drobo → localserver SMB 117 [TCP ACKed unseen
> segment] Write AndX Response, FID: 0x4ca4, 65536 bytes
> 41702 217.177652 localserver → drobo SMB 14546 Write AndX Request,
> FID: 0x4ca4, 65536 bytes at offset 54329344[TCP segment of a
> reassembled PDU]
> 41726 217.183095 drobo → localserver SMB 117 Write AndX Response, FID:
> 0x4ca4, 65536 bytes
>  4156 136.143502 drobo → localserver SMB 252 Trans2 Response<unknown>
>  4254 161.145741 drobo → localserver SMB 117 [TCP ZeroWindow] Write
> AndX Response, 65536 bytes
>  4260 161.147530 drobo → localserver SMB 117 Write AndX Response, 65536 bytes
>  4566 161.219832 drobo → localserver SMB 117 [TCP ACKed unseen
> segment] Write AndX Response, 65536 bytes
>  4594 161.225586 drobo → localserver SMB 117 Write AndX Response, 65536 bytes
>  4690 162.010514 drobo → localserver SMB 117 [TCP ZeroWindow] Write
> AndX Response, 65536 bytes
>  4696 162.011564 drobo → localserver SMB 117 Write AndX Response, 65536 bytes
>  4697 162.011830 drobo → localserver SMB 204 Trans2 Response<unknown>
>  1056   4.654442 drobo → localserver SMB 117 [TCP ACKed unseen
> segment] [TCP Previous segment not captured] Write AndX Response,
> 65536 bytes
>  1057   4.660219 drobo → localserver SMB 117 [TCP ACKed unseen
> segment] Write AndX Response, 65536 bytes
>  1082   4.808276 drobo → localserver SMB 117 [TCP ACKed unseen
> segment] [TCP Previous segment not captured] Write AndX Response,
> 65536 bytes
>   2718  33.792075 localserver → drobo SMB 14546 Write AndX Request,
> FID: 0x38ff, 65536 bytes at offset 167563264[TCP segment of a
> reassembled PDU]
>  2723  33.795265 localserver → drobo SMB 20338 Write AndX Request,
> FID: 0x38ff, 65536 bytes at offset 167628800[TCP segment of a
> reassembled PDU]
>  2726  33.806707 drobo → localserver SMB 117 [TCP ACKed unseen
> segment] Write AndX Response, 65536 bytes
>  2727  33.806736 drobo → localserver SMB 117 Write AndX Response, 65536 bytes
>  2730  33.807123 localserver → drobo SMB 17442 Write AndX Request,
> FID: 0x38ff, 65536 bytes at offset 167694336[TCP segment of a
> reassembled PDU]
>  2736  33.811345 localserver → drobo SMB 8754 Write AndX Request, FID:
> 0x38ff, 65536 bytes at offset 167759872[TCP segment of a reassembled
> PDU]
>  2739  33.814173 drobo → localserver SMB 117 Write AndX Response, 65536 bytes
>  2744  33.818007 localserver → drobo SMB 13098 Write AndX Request,
> FID: 0x38ff, 65536 bytes at offset 167825408[TCP segment of a
> reassembled PDU]
>  41544 300.578408 localserver → drobo SMB 5858 Write AndX Request,
> FID: 0x38ff, 65536 bytes at offset 2089480192[TCP segment of a
> reassembled PDU]
> 41548 300.582822 drobo → localserver SMB 117 Write AndX Response, 65536 bytes
> 41551 300.584432 localserver → drobo SMB 13098 Write AndX Request,
> FID: 0x38ff, 65536 bytes at offset 2089545728Write AndX Request, FID:
> 0x38ff, 4096 bytes at offset 2089828352
> 41555 300.589855 drobo → localserver SMB 117 Write AndX Response, 65536 bytes
> 41558 300.590108 localserver → drobo SMB 13098 Write AndX Request,
> FID: 0x38ff, 65536 bytes at offset 2089611264[TCP segment of a
> reassembled PDU]
> 41559 300.594149 drobo → localserver SMB 117 Write AndX Response, 65536 bytes
> 41560 300.594438 drobo → localserver SMB 129 Write AndX Response, 4096 bytes
> 41584 300.602694 drobo → localserver SMB 129 [TCP ACKed unseen
> segment] Write AndX Response, 65536 bytes

Hmmm, there seem like gaps but that could be an artifact of you
selecting stuff at random. You are probably not going to find the
problem that way.

One thing you can do is to search for all those SMB requests where
there is no response. Of course since there are packet drops you might
find a few of those, but you can search for (!smb2.request_in) &&
(smb2.flags.response == 0) and that will show you all the requests for
which there is no response.

Another thing you can do is look for where the connection drops.
Search on (tcp.flags.reset == 1) || (tcp.flags.fin == 1) select one,
and then filter for all packets on the connection (tcp,port ==
portnum) where portnum is not 445 so you get just one TCP connection
and then you can look at what happened at the end of that tcp
connection.

This last approach might be the best if the problem is that the Linux
CIFS module is forgetting to send data at some point and we are not
looking for a Write request that the Drobo did not respond to.

--
Regards,
Richard Sharpe
(何以解憂?唯有杜康。--曹操)

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Fwd: backing up to a Drobo CIFS VFS: Error -11 sending data on socket to server, Bufferbloat?

Samba - samba-technical mailing list
> One thing you can do is to search for all those SMB requests where
> there is no response. Of course since there are packet drops you might
> find a few of those, but you can search for (!smb2.request_in) &&
> (smb2.flags.response == 0) and that will show you all the requests for
> which there is no response.

Sorry still getting used to WS but I couldn't find (!smb2.request_in)
== but  (smb2.flags.response == 0) was there. Am I supposed to run
tshark on the exported results?

> Another thing you can do is look for where the connection drops.
> Search on (tcp.flags.reset == 1) || (tcp.flags.fin == 1) select one,
> and then filter for all packets on the connection (tcp,port ==
> portnum) where portnum is not 445 so you get just one TCP connection
> and then you can look at what happened at the end of that tcp
> connection.
>
> This last approach might be the best if the problem is that the Linux
> CIFS module is forgetting to send data at some point and we are not
> looking for a Write request that the Drobo did not respond to.

How do I ' filter for all packets on the connection (tcp,port ==
portnum) where portnum is not 445'?

Using the 2 searches here I get these kind of logs:

140622  25.417347 drobo → fedora TCP 66 445 → 34050 [ACK] Seq=228277
Ack=293608973 Win=26508 Len=0 TSval=632244604 TSecr=2339550377
140623  25.417645 drobo → fedora TCP 66 445 → 34050 [ACK] Seq=228277
Ack=293611869 Win=26508 Len=0 TSval=632244604 TSecr=2339550377
140624  25.417912 drobo → fedora TCP 66 445 → 34050 [ACK] Seq=228277
Ack=293614765 Win=26508 Len=0 TSval=632244604 TSecr=2339550378
140625  25.417925 fedora → drobo TCP 11650 [TCP segment of a reassembled PDU]
140626  25.417942 drobo → fedora TCP 66 445 → 34050 [ACK] Seq=228277
Ack=293617661 Win=26508 Len=0 TSval=632244604 TSecr=2339550378
140627  25.418193 drobo → fedora TCP 66 445 → 34050 [ACK] Seq=228277
Ack=293620557 Win=26508 Len=0 TSval=632244604 TSecr=2339550378
140628  25.418662 drobo → fedora TCP 66 445 → 34050 [ACK] Seq=228277
Ack=293623453 Win=26508 Len=0 TSval=632244604 TSecr=2339550378
140629  25.418672 drobo → fedora TCP 66 445 → 34050 [ACK] Seq=228277
Ack=293626349 Win=26508 Len=0 TSval=632244604 TSecr=2339550379
140630  25.419048 drobo → fedora TCP 66 445 → 34050 [ACK] Seq=228277
Ack=293629245 Win=26508 Len=0 TSval=632244604 TSecr=2339550379
140631  25.419076 fedora → drobo TCP 14546 [TCP segment of a reassembled PDU]
140632  25.419328 drobo → fedora TCP 66 445 → 34050 [ACK] Seq=228277
Ack=293632141 Win=26508 Len=0 TSval=632244604 TSecr=2339550379

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Fwd: backing up to a Drobo CIFS VFS: Error -11 sending data on socket to server, Bufferbloat?

Samba - samba-technical mailing list
On Fri, May 12, 2017 at 7:40 PM, Robert Kudyba <[hidden email]> wrote:
>> One thing you can do is to search for all those SMB requests where
>> there is no response. Of course since there are packet drops you might
>> find a few of those, but you can search for (!smb2.request_in) &&
>> (smb2.flags.response == 0) and that will show you all the requests for
>> which there is no response.
>
> Sorry still getting used to WS but I couldn't find (!smb2.request_in)
> == but  (smb2.flags.response == 0) was there. Am I supposed to run
> tshark on the exported results?

Hmmm, in Wireshark it should have an area where you can enter a filter
expression. I use the Windows version so it shows up under the icon
bar. On Linux with the gtk version it shows up under the icon bar as
well with the word Filter: to the left.

In that text box enter (!smb2.request_in) && (smb2.flags.response == 0)

It is difficult to do this with tshark because of the single pass
nature of tshark.

>> Another thing you can do is look for where the connection drops.
>> Search on (tcp.flags.reset == 1) || (tcp.flags.fin == 1) select one,
>> and then filter for all packets on the connection (tcp,port ==
>> portnum) where portnum is not 445 so you get just one TCP connection
>> and then you can look at what happened at the end of that tcp
>> connection.
>>
>> This last approach might be the best if the problem is that the Linux
>> CIFS module is forgetting to send data at some point and we are not
>> looking for a Write request that the Drobo did not respond to.
>
> How do I ' filter for all packets on the connection (tcp,port ==
> portnum) where portnum is not 445'?

OK, you need to find the port numbers used in the early packets in the
capture for SMB traffic. They should be 445 and another port. Use that
other port for the expression above. If the two ports are 445 and
12743 then you want (tcp.port == 12743). That will isolate all packets
on the first connection, and then you can look towards the end to see
what is going wrong. You could even save just those packets and maybe
send them to us (compressed).

> Using the 2 searches here I get these kind of logs:
>
> 140622  25.417347 drobo → fedora TCP 66 445 → 34050 [ACK] Seq=228277
> Ack=293608973 Win=26508 Len=0 TSval=632244604 TSecr=2339550377
> 140623  25.417645 drobo → fedora TCP 66 445 → 34050 [ACK] Seq=228277
> Ack=293611869 Win=26508 Len=0 TSval=632244604 TSecr=2339550377
> 140624  25.417912 drobo → fedora TCP 66 445 → 34050 [ACK] Seq=228277
> Ack=293614765 Win=26508 Len=0 TSval=632244604 TSecr=2339550378
> 140625  25.417925 fedora → drobo TCP 11650 [TCP segment of a reassembled PDU]
> 140626  25.417942 drobo → fedora TCP 66 445 → 34050 [ACK] Seq=228277
> Ack=293617661 Win=26508 Len=0 TSval=632244604 TSecr=2339550378
> 140627  25.418193 drobo → fedora TCP 66 445 → 34050 [ACK] Seq=228277
> Ack=293620557 Win=26508 Len=0 TSval=632244604 TSecr=2339550378
> 140628  25.418662 drobo → fedora TCP 66 445 → 34050 [ACK] Seq=228277
> Ack=293623453 Win=26508 Len=0 TSval=632244604 TSecr=2339550378
> 140629  25.418672 drobo → fedora TCP 66 445 → 34050 [ACK] Seq=228277
> Ack=293626349 Win=26508 Len=0 TSval=632244604 TSecr=2339550379
> 140630  25.419048 drobo → fedora TCP 66 445 → 34050 [ACK] Seq=228277
> Ack=293629245 Win=26508 Len=0 TSval=632244604 TSecr=2339550379
> 140631  25.419076 fedora → drobo TCP 14546 [TCP segment of a reassembled PDU]
> 140632  25.419328 drobo → fedora TCP 66 445 → 34050 [ACK] Seq=228277
> Ack=293632141 Win=26508 Len=0 TSval=632244604 TSecr=2339550379



--
Regards,
Richard Sharpe
(何以解憂?唯有杜康。--曹操)

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Fwd: backing up to a Drobo CIFS VFS: Error -11 sending data on socket to server, Bufferbloat?

Samba - samba-technical mailing list
>> Sorry still getting used to WS but I couldn't find (!smb2.request_in)
>> == but  (smb2.flags.response == 0) was there. Am I supposed to run
>> tshark on the exported results?
>
> Hmmm, in Wireshark it should have an area where you can enter a filter
> expression. I use the Windows version so it shows up under the icon
> bar. On Linux with the gtk version it shows up under the icon bar as
> well with the word Filter: to the left.

> In that text box enter (!smb2.request_in) && (smb2.flags.response == 0)

Yep that's what I thought here's the error:
(!smb2.request_in) && (smb2.flags.response == 0) isn't a valid display
filter: "smb2.request_in" is neither a field nor a protocol name.

> OK, you need to find the port numbers used in the early packets in the
> capture for SMB traffic. They should be 445 and another port. Use that
> other port for the expression above. If the two ports are 445 and
> 12743 then you want (tcp.port == 12743). That will isolate all packets
> on the first connection, and then you can look towards the end to see
> what is going wrong. You could even save just those packets and maybe
> send them to us (compressed).

So the filter should be this as the port # is 53854 when I just use the
"or" clause?

( (tcp.flags.reset == 1) || (tcp.flags.fin == 1) )&& (tcp.port == 53854)

Just exprting the one packet I see this:
No.     Time           Source                Destination           Protocol
Length Info
   7940 120.507576397  fedora-ws         drobo         TCP      66
53854 → 445 [FIN, ACK] Seq=277 Ack=1 Win=312 Len=0 TSval=831046843
TSecr=655916326

Frame 7940: 66 bytes on wire (528 bits), 66 bytes captured (528 bits)
Ethernet II, Src: HewlettP_25:8e:ee (2c:27:d7:25:8e:ee), Dst:
CheckPoi_3f:32:29 (00:1c:7f:3f:32:29)
    Destination: CheckPoi_3f:32:29 (00:1c:7f:3f:32:29)
    Source: HewlettP_25:8e:ee (2c:27:d7:25:8e:ee)
    Type: IPv4 (0x0800)
Internet Protocol Version 4, Src: fedora-ws, Dst: drobo
Transmission Control Protocol, Src Port: 53854, Dst Port: 445, Seq: 277,
Ack: 1, Len: 0
    Source Port: 53854
    Destination Port: 445
    [Stream index: 3]
    [TCP Segment Len: 0]
    Sequence number: 277    (relative sequence number)
    Acknowledgment number: 1    (relative ack number)
    Header Length: 32 bytes
    Flags: 0x011 (FIN, ACK)
    Window size value: 312
    [Calculated window size: 312]
    [Window size scaling factor: -1 (unknown)]
    Checksum: 0xb14e [unverified]
    [Checksum Status: Unverified]
    Urgent pointer: 0
    Options: (12 bytes), No-Operation (NOP), No-Operation (NOP), Timestamps
        No-Operation (NOP)
        No-Operation (NOP)
        Timestamps: TSval 831046843, TSecr 655916326

0000  00 1c 7f 3f 32 29 2c 27 d7 25 8e ee 08 00 45 00   ...?2),'.%....E.
0010  00 34 ef 97 40 00 40 06 9a 04 96 6c 40 38 96 6c   .4..@.@....[hidden email]
0020  44 17 d2 5e 01 bd d3 ec 30 f3 9e 4b c6 91 80 11   D..^....0..K....
0030  01 38 b1 4e 00 00 01 01 08 0a 31 88 c4 bb 27 18   .8.N......1...'.
0040  7d 26                                             }&

63 MB gzipped pcap file uploaded here:
http://storm.cis.fordham.edu/rkudyba/wireshark-capture.pcap.gz
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Fwd: backing up to a Drobo CIFS VFS: Error -11 sending data on socket to server, Bufferbloat?

Samba - samba-technical mailing list
On Mon, May 15, 2017 at 1:39 PM, Robert Kudyba <[hidden email]> wrote:

>>> Sorry still getting used to WS but I couldn't find (!smb2.request_in)
>>> == but  (smb2.flags.response == 0) was there. Am I supposed to run
>>> tshark on the exported results?
>>
>> Hmmm, in Wireshark it should have an area where you can enter a filter
>> expression. I use the Windows version so it shows up under the icon
>> bar. On Linux with the gtk version it shows up under the icon bar as
>> well with the word Filter: to the left.
>
>> In that text box enter (!smb2.request_in) && (smb2.flags.response == 0)
>
> Yep that's what I thought here's the error:
> (!smb2.request_in) && (smb2.flags.response == 0) isn't a valid display
> filter: "smb2.request_in" is neither a field nor a protocol name.

OK, that wouldn't have helped because it is SMB1 traffic ... different
syntax. Sorry.

>> OK, you need to find the port numbers used in the early packets in the
>> capture for SMB traffic. They should be 445 and another port. Use that
>> other port for the expression above. If the two ports are 445 and
>> 12743 then you want (tcp.port == 12743). That will isolate all packets
>> on the first connection, and then you can look towards the end to see
>> what is going wrong. You could even save just those packets and maybe
>> send them to us (compressed).
>
> So the filter should be this as the port # is 53854 when I just use the "or"
> clause?
>
> ( (tcp.flags.reset == 1) || (tcp.flags.fin == 1) )&& (tcp.port == 53854)
>
> Just exprting the one packet I see this:
> No.     Time           Source                Destination           Protocol
> Length Info
>    7940 120.507576397  fedora-ws         drobo         TCP      66     53854
> → 445 [FIN, ACK] Seq=277 Ack=1 Win=312 Len=0 TSval=831046843 TSecr=655916326
>
> Frame 7940: 66 bytes on wire (528 bits), 66 bytes captured (528 bits)
> Ethernet II, Src: HewlettP_25:8e:ee (2c:27:d7:25:8e:ee), Dst:
> CheckPoi_3f:32:29 (00:1c:7f:3f:32:29)
>     Destination: CheckPoi_3f:32:29 (00:1c:7f:3f:32:29)
>     Source: HewlettP_25:8e:ee (2c:27:d7:25:8e:ee)
>     Type: IPv4 (0x0800)
> Internet Protocol Version 4, Src: fedora-ws, Dst: drobo
> Transmission Control Protocol, Src Port: 53854, Dst Port: 445, Seq: 277,
> Ack: 1, Len: 0
>     Source Port: 53854
>     Destination Port: 445
>     [Stream index: 3]
>     [TCP Segment Len: 0]
>     Sequence number: 277    (relative sequence number)
>     Acknowledgment number: 1    (relative ack number)
>     Header Length: 32 bytes
>     Flags: 0x011 (FIN, ACK)
>     Window size value: 312
>     [Calculated window size: 312]
>     [Window size scaling factor: -1 (unknown)]
>     Checksum: 0xb14e [unverified]
>     [Checksum Status: Unverified]
>     Urgent pointer: 0
>     Options: (12 bytes), No-Operation (NOP), No-Operation (NOP), Timestamps
>         No-Operation (NOP)
>         No-Operation (NOP)
>         Timestamps: TSval 831046843, TSecr 655916326
>
> 0000  00 1c 7f 3f 32 29 2c 27 d7 25 8e ee 08 00 45 00   ...?2),'.%....E.
> 0010  00 34 ef 97 40 00 40 06 9a 04 96 6c 40 38 96 6c   .4..@.@....[hidden email]
> 0020  44 17 d2 5e 01 bd d3 ec 30 f3 9e 4b c6 91 80 11   D..^....0..K....
> 0030  01 38 b1 4e 00 00 01 01 08 0a 31 88 c4 bb 27 18   .8.N......1...'.
> 0040  7d 26                                             }&
>
> 63 MB gzipped pcap file uploaded here:
> http://storm.cis.fordham.edu/rkudyba/wireshark-capture.pcap.gz

OK, that helped, although I had to insert ~ before your username to download it.

The problem appears to be firewall related because I am seeing ICMP
Dest Unreachable responses to an SMB Write&X Response (frame 5816 and
5817).

Do a search on tcp.port == 42166 and you will see the response and
then a DEST Unreachable, then a retransmit in 120 seconds and another
DEST Unreachable etc.

However, there is also another SMB session on tcp.port == 53858 where
the Drobo is returning INVALID handle on the WRITES ... which seems to
be the fundamental problem.

However, stuff is missing from the capture.

This session seems to indicate the problem: tcp.port == 53860

There are a bunch of writes, but eventually the Windows fills up and
the Drobo stops responding at the TCP layer and eventually one party
or other RSTs the connection. Looks like a Drobo problem to me.

--
Regards,
Richard Sharpe
(何以解憂?唯有杜康。--曹操)

Loading...