Huge number of small files performance regression from 3.5.16 to 4.6.5 with identical smb.conf

Samba - General mailing list
Hello Samba experts,

I have just successfully replaced the old default Thecus Samba version
3.5.16 on my Thecus NAS (32-bit Intel Atom, 3 GB RAM, i686 Thecus kernel
2.6.33) by a current Samba 4.6.5 build that I have cross-compiled myself
from scratch.

Note that so far, I am using the unchanged (i.e. exactly identical) old
3.5.16 smb.conf file for 4.6.5.

Everything seems to work fine in terms of functionality. From Windows
PowerShell, I have also been able to verify with "Get-SmbConnection"
that I am now using SMB3 ("dialect 3.1.1"), and I have noticed decent
performance gains for copying large files, as expected:

  * single large file copy, Win10 NTFS client to Thecus NAS ("write"):
    before: Samba 3.5.16/SMB 1.5: ~18 MB per second
    after: Samba 4.6.5/SMB 3.1.1: ~23 MB per second

  * single large file copy, Thecus NAS to Win10 NTFS client ("read"):
    before: Samba 3.5.16/SMB 1.5: ~60 MB per second
    after: Samba 4.6.5/SMB 3.1.1: ~85 MB per second


But unfortunately, I have also run into a severe performance degradation
when copying a huge number of small files between a Win10 client and the
NAS in both directions. My test scenario here is copying a single
directory containing 5400 files, each between less than 1 kB and at most
8 kB in size, in both directions (Win10 -> Thecus and the other way round).

To the best of my knowledge, performance-related parameters in smb.conf
had already been tuned for 3.5.16 with good results - IIRC, it was
especially crucial for write performance to switch from

case sensitive = auto

to

case sensitive = true
default case = lower
preserve case = yes
short preserve case = yes

in order to stop smbd from repeatedly looping through those ~5000 files
to check whether each new file name was unique in a case-insensitive way...
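
To illustrate why that setting matters, here is a minimal Python sketch of
the cost model described above. It is not smbd's actual code: the function
names and the synthetic file names are assumptions for illustration only,
and it simply assumes that every create triggers one linear, case-folded
scan over the names already present in the directory.

```python
def scan_comparisons(names):
    """Count name comparisons when each new name is checked against every
    existing entry in a case-insensitive way (one linear scan per create).
    Simplified model only, not Samba's implementation."""
    existing = []
    comparisons = 0
    for name in names:
        folded = name.lower()
        for other in existing:
            comparisons += 1
            if other == folded:
                break  # a case-insensitive clash was found
        existing.append(folded)
    return comparisons


def closed_form(n):
    """Comparisons for n unique names: 0 + 1 + ... + (n - 1)."""
    return n * (n - 1) // 2


# Hypothetical file names; only their count matters for the model.
sample = [f"file{i:04d}.txt" for i in range(200)]
print(scan_comparisons(sample), closed_form(200))
print("5400 files:", closed_form(5400), "comparisons")
```

Under this model the work grows quadratically with the number of files in
the directory, which would match a copy that gets slower the more files
have already been written.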

But with the exact same smb.conf as attached (global section), I now
observe

  * many small files copy, Win10 NTFS client to Thecus NAS ("write"):
    before: Samba 3.5.16/SMB 1.5: ~120 kB per second
    after: Samba 4.6.5/SMB 3.1.1: ~4 (!!!) kB per second (this is
    completely unacceptable: on average, only 1-2 files per second!!!)

  * many small files copy, Thecus NAS to Win10 NTFS client ("read"):
    before: Samba 3.5.16/SMB 1.5: ~400 kB per second
    after: Samba 4.6.5/SMB 3.1.1: ~130 kB per second (even though still
    acceptable, slower by a factor of 3!!!)

Note that also, in both the "write" and the "read" scenario, CPU load
for the smbd process on the NAS is much higher for the 4.6.5 than the
3.5.16 scenario - and in the "write" scenario, CPU load even increases
over time (i.e. with the number of files that are already copied) and
continuously uses up to about 80% of one of my four (hyper-threaded)
Atom cores (1.8 GHz)...

Unfortunately, this performance issue makes Samba 4.6.5 pretty much a
pain to use for any "development" scenarios with huge numbers of small
files - so for now, I had to revert my NAS to 3.5.16 to keep it usable,
even though I definitely want to upgrade asap in order to remove the
risk of being affected by the "SambaCry" issue (CVE-2017-7494)... :-(


My questions now are:

  * Am I still doing something wrong in terms of configuration, i.e.
    using inappropriate settings in smb.conf?
  * Am I perhaps hitting an already known performance issue, and if so,
    do you have any plans/timeline for fixing it?
  * And if this issue is indeed new to you, then how can I help the
    Samba team in tracking down the root cause of this and hopefully
    fixing the issue, i.e. can I enable my build to support code
    profiling to see where all that CPU power and time is lost?

Many thanks in advance for looking into this and advising how to
proceed! :-)

Best regards
Andreas



--
To unsubscribe from this list go to the following URL and read the
instructions:  https://lists.samba.org/mailman/options/samba

smb-global.conf (2K) Download Attachment

Re: Huge number of small files performance regression from 3.5.16 to 4.6.5 with identical smb.conf

Samba - General mailing list
On Tue, Jun 13, 2017 at 12:09:41PM +0200, awl1 via samba wrote:

> Hello Samba experts,
>
> I have just successfully replaced the old default Thecus Samba
> version 3.5.16 on my Thecus NAS (32-bit Intel Atom, 3 GB RAM, i686
> Thecus kernel 2.6.33) by a current Samba 4.6.5 build that I have
> cross-compiled myself from scratch.
>
> Note that so far, I am using the unchanged (i.e. exactly identical)
> old 3.5.16 smb.conf file for 4.6.5.
>
> Everything seems to work fine in terms of functionality. From
> Windows PowerShell, I have been also able to verify with
> "Get-SmbConnection" that I am now using SMB3 ("dialect 3.1.1"), and
> I have noticed decent performance gains for copying large files as
> expected:
>
>  * single large file copy, Win10 NTFS client to Thecus NAS ("write"):
>    before: Samba 3.5.16/SMB 1.5: ~18 MB per second
>    after: Samba 4.6.5/SMB 3.1.1: ~23 MB per second
>
>  * single large file copy, Thecus NAS to Win10 NTFS client ("read"):
>    before: Samba 3.5.16/SMB 1.5: ~60 MB per second
>    after: Samba 4.6.5/SMB 3.1.1: ~85 MB per second
>
>
> But unfortunately, I have also run into a severe performance
> degradation for copying a huge number of small files between a Win10
> client and the NAS in both directions. My test scenario here is
> copying a single directory containing 5400 files of between less
> than 1kB and max 8kB in size in both directions (Win10 -> Thecus and
> the other way round).
>
> To the best of my knowledge, performance-related parameters in
> smb.conf had already been tuned for 3.5.16 with good results - IIRC,
> it was especially crucial for write performance to switch from

Can you get comparative wireshark traces for the two cases?

That would help discover what the bottleneck is.


Re: Huge number of small files performance regression from 3.5.16 to 4.6.5 with identical smb.conf

Samba - General mailing list
Hello Jeremy,

thanks a million for your help and interest in tracking this down! :-)

On 13.06.2017 at 18:36, Jeremy Allison wrote:
> Can you get comparative wireshark traces for the two cases?
>
> That would help discover what the bottleneck is.
I am not at all a network guy, but I hope that - maybe with a little
more help from your part once I have tried to do so in practice - I
should be able to do so...

Follow-up questions:

Which machine do you want me to run wireshark on? Can this be the
Windows machine, or will I need to cross-compile a wireshark version to
run on my NAS first (which might take some days)?

Based on the description here:

     https://wiki.samba.org/index.php/Capture_Packets

I assume that you want me to record ports 139 and 445?

As a first step, I will try to make four recordings, right?

  * the "write" scenario, copying from Win10 to the NAS with 3.5.16 and
    4.6.5
  * the "read" scenario, copying from the NAS to Win10 with 3.5.16 and 4.6.5

How many files should I copy during each recording?

I'm a little bit worried whether a network trace will indeed point to the
root cause: judging from the smbd CPU usage on the NAS, I fear that some
CPU-bound busy activity (i.e. not just I/O wait) is at the heart of the
issue, but we will hopefully see this during the process.

And finally: Where do you want me to upload the recorded traces to?

Many thanks so far & best regards
Andreas


Re: Huge number of small files performance regression from 3.5.16 to 4.6.5 with identical smb.conf

Samba - General mailing list
Hello again, Jeremy,

first of all, I am terribly sorry for my late reply. I tried to send my
posting many times, but my mail has always been silently discarded by
the Samba mail servers due to my main mail provider (GMX - a very large
German mail provider with millions of customers) having been blacklisted
by SORBS.

For the time being, SORBS is still unwilling to delist them for unknown
reasons (which I consider a clear malpractice by SORBS, as GMX has
sophisticated spam/abuse management in place), so I had to switch to
another mail provider just in order to be able to post again on the
Samba list... :-(


On 13.06.2017 at 19:00, awl1 wrote:
> On 13.06.2017 at 18:36, Jeremy Allison wrote:
>> Can you get comparative wireshark traces for the two cases?
>>
>> That would help discover what the bottleneck is.
> I am not at all a network guy, but I hope that - maybe with a little
> more help from your part once I have tried to do so in practice - I
> should be able to do so...

OK, so it looks like I have been able to successfully produce Wireshark
capture files for the four scenarios... :-)

As I am almost certain that these packet captures will contain at least
some sensitive information from my environment - such as e.g. user,
share and machine names, IP addresses (possibly in the old SMB dialect
1.5 even the clear-text password for the share?) - I will only send the
link to the captures ZIP file stored in my cloud space to you via
private mail. So please keep the packet dumps confidential, and only
share them with other Samba developers after getting my explicit consent!

The ZIP file contains four Wireshark captures for the two scenarios
(write to and read from share) and the two Samba/SMB versions
(4.6.5/SMB2/dialect 3.1.1 and 3.5.16/SMB/dialect 1.5) in "pcapng" format:

  * smb311_write - Win10 client writing to Samba 4.6.5 using SMB2
    protocol (dialect 3.1.1), copying ~ 1000 files from local hard disk
    onto the share, documenting the issue with very slow throughput of
    below 10 kB/sec (especially in the range of file 300-600, most
    interestingly throughput improved again after some time)
  * smb15_write - Win10 client writing to Samba 3.5.16 using SMB
    protocol (dialect 1.5), copying ~ 1000 files from local hard disk
    onto the share, with much better throughput than in smb311_write

  * smb311_read - Win10 client reading from Samba 4.6.5 using SMB2
    protocol (dialect 3.1.1), copying ~ 2000 files from the share to
    local hard disk, with acceptable throughput, but consistently slower
    than in smb15_read
  * smb15_read - Win10 client reading from Samba 3.5.16 using SMB
    protocol (dialect 1.5), copying ~ 2000 files from the share to local
    hard disk, with consistently better throughput than in smb311_read

Fingers crossed that you will be able to determine why 4.6.5 is slower
in both scenarios, and especially why it is so much slower when writing
to the share (smb311_write). One more time, thanks a million for digging
into these packet dumps...

Best regards
Andreas



Friendly Reminder: Huge number of small files performance regression from 3.5.16 to 4.6.5 with identical smb.conf

Samba - General mailing list
Hello again, Jeremy and other Samba experts,

I'm sorry to be such a pain in your neck(s), but I still need your help
in finding out why SMB2/3.1.1 in Samba 4.6.5 performs so much worse than
SMB/1.5 in Samba 3.5.16 in scenarios with a huge number of small files.

As requested by Jeremy, I have done wireshark "pcapng" captures of the
four scenarios as described in my original post below:

  * smb311_write - Win 10 client storing ~ 1000 small files onto Samba
    4.6.5/Thecus NAS
  * smb15_write - Win 10 client storing ~ 1000 small files onto Samba
    3.5.16/Thecus NAS
  * smb311_read - Win 10 client reading ~ 2000 small files from Samba
    4.6.5/Thecus NAS
  * smb15_read - Win 10 client reading ~ 2000 small files from Samba
    3.5.16/Thecus NAS

These recordings do indeed contain confidential data from my machine,
which is why I have so far only sent a download link to Jeremy via
private mail.

In case others from the Samba team would also like to look into the
wireshark capture traces, please get back to me directly and request
access: I will then also send you a download link/password to the
capture files ZIP via private mail.

I would really appreciate being able to switch this old Thecus N4200PRO
NAS away from Thecus' outdated 3.5.16 version (prone to "SambaCry") to a
self-compiled, but secure 4.6.x version asap.

Many thanks one more time for your kind help with this & best regards
Andreas


-------- Forwarded Message --------
Subject: Re: [Samba] Huge number of small files performance regression
from 3.5.16 to 4.6.5 with identical smb.conf
Date: Tue, 20 Jun 2017 14:30:22 +0200
From: awl1 <[hidden email]>
To: Jeremy Allison <[hidden email]>, [hidden email]


Re: Friendly Reminder: Huge number of small files performance regression from 3.5.16 to 4.6.5 with identical smb.conf

Samba - General mailing list
Andreas,

A few thoughts regarding your system:
1) If it's a home system and you're specifically concerned about mitigating
CVE-2017-7494, (a) check whether your share is mounted 'noexec' - if it is,
then you're safe; (b) if it is not, then add the [global] parameter
"nt pipe support = no". This will break functionality that relies on
support for named pipes, but downloading / uploading files should still
work normally.

2) If you need to use Samba 4.6.5, try starting with a minimal smb.conf
with logging turned up. Then review your Samba logs. Note that setting log
level to "10" will probably be more verbose than you want. Choose an
appropriate level. Here's one from one of my testing machines:

[global]
   guest account = awalker
   map to guest = Bad User
   log level = 10

[Donkey Vol]
   path = "/mnt/Donkey/Vol1"
   writeable = yes
   vfs objects = zfs_space
   guest ok = yes
   guest only = yes

3) The last firmware update looks like it was from 2014. You're probably
vulnerable to a lot more than just that single Samba CVE. If this is in a
business environment, perhaps look into migrating to a new appliance /
server that's not EOL.

If it's a home environment, and you like to tinker with things, look for
guides on installing stock Debian on the Thecus (it looks like the Thecus
has x86 hardware and an IDE DOM), and then add Louis's Samba repo and
install the package. I did this with an old WD MyCloud about a year or
so ago, and was much happier with the system afterwards. It's hackish, but
can be a fun side-project.

Andrew

Re: Friendly Reminder: Huge number of small files performance regression from 3.5.16 to 4.6.5 with identical smb.conf

Samba - General mailing list
Hello Andrew,

many thanks for your thoughts and clarifications - here's my take on them:

Regarding 1), I had indeed started to look into simply adding "nt pipe
support = no" to the original 3.5.16 Samba smb.conf first. But in order
to do so, I would also need to create my own Samba module, replacing the
Thecus built-in: Without deactivating the built-in Samba, it is not
possible to add custom smb.conf entries (anything not settable through
the admin GUI), and the admin GUI reverts/overwrites any custom changes
made to the default smb.conf from an SSH command line every time I
execute any non-readonly operation from the admin GUI... :-(

So I ended up with not simply providing my own smb.conf, but also
compiling my own Samba version, and I have indeed successfully
cross-compiled Samba 4.6.5 including all dependencies without any
issues: Besides the performance regression for huge numbers of small
files, my self-compiled 4.6.5 works completely fine and as expected.

Do you really think that the performance issue (SMB2 3.1.1 notably
slower than SMB 1.5 when copying a huge number of small files) will
show up in debug level log entries? As these log entries will also need
to be written to the exact same NAS HDDs, the debug logging itself would
inevitably add throughput/performance overhead...!? (With Thecus'
default log level settings, there are no error messages at all in any of
the Samba logs when I execute the file copy tests...)

And regarding 3): Is there any particular reason why you seem to expect
Louis' Debian modules for 4.6.5 to perform better than my self-compiled
version? Could this indeed be due to the fact that my Samba 4.6.5 on the
NAS is still running on Thecus kernel 2.6.33 (I'd rather question this,
as I have compiled Samba itself as well as all its required dependencies
in up-to-date versions with gcc-5.2 from the latest crosstools-ng that
works on the NAS's kernel)?

This is my private NAS located in my Home Office LAN, and it also runs a
custom (and more or less up-to-date) OpenSSH and Dovecot service
compiled by myself. The NAS is mainly used as a backup target and for
development purposes, so the huge number of small files scenario
unfortunately is not that rare...


Bottom line:
My gut feeling is that comparing the Wireshark packet captures, as
proposed by Jeremy, would most probably be a good way to move forward and
find out why the number of packets in the Samba 4.6.5/SMB2 3.1.1
scenario seems to be higher than with Samba 3.5.16/SMB 1.5, but I will
clearly need some help from you, the SMB protocol/Wireshark experts, to
know what to look for in those packet captures... ;-)

So I'd still prefer to stay with my self-compiled 4.6.x version and am
looking forward to some help from the experts on how to read/analyze
the Wireshark captures, trying to find out why SMB2/3.1.1 in Samba 4.6.5
seems to be hitting a performance regression...

Thanks anyway & best regards
Andreas


On 29.06.2017 at 22:14, Andrew Walker via samba wrote:

> Andreas,
>
> A few thoughts regarding your system
> 1) If it's a home system and you're specifically concerned about mitigating
> CVE 2017-7494, (a) verify that your share isn't mounted 'noexec' - if it's
> mounted this way then you're safe (b) if not (a), then add the [global]
> parameter "nt pipe support = no". This will break functionality that relies
> on support for named pipes, but downloading / uploading files should still
> work normally.
>
> 2) If you need to use Samba 4.6.5, try starting with a minimal smb.conf
> with logging turned up. Then review your samba logs. Note that setting log
> level to "10' will probably be more verbose than you want. Choose an
> appropriate level. Here's one from on of my testing machines. :
>
> [global]
>     guest account = awalker
>     map to guest = Bad User
>     log level = 10
>
> [Donkey Vol]
>     path = "/mnt/Donkey/Vol1"
>     writeable = yes
>     vfs objects = zfs_space
>     guest ok = yes
>     guest only = yes
>
> 3) The last firmware update looks like it was from 2014. You're probably
> vulnerable to a lot more than just that single Samba CVE. If this is in a
> business environment, perhaps look into migrating to a new appliance /
> server that's not EOL.
>
> If it's a home environment, and you like to tinker with things look for
> guides on installing stock Debian on the Thecus (it looks the Thecus has
> x86 hardware and an IDE DOM), and then adding Louis's Samba repo /
> installing the package. I did this with an old WD MyCloud about a year or
> so ago, and was much happier with the system afterwards. It's hackish, but
> can be a fun side-project.
>
> Andrew




Re: Friendly Reminder: Huge number of small files performance regression from 3.5.16 to 4.6.5 with identical smb.conf

Samba - General mailing list
Hello one more time, Jeremy & fellow Samba experts/developers,

over the weekend, I have done some more reading about wireshark
tooling/statistics - the answer to

https://ask.wireshark.org/questions/58970/analysing-performance-issues-with-storage-smb2

was very helpful - and am now able to provide very clear and simple
proof of the performance regression that I am seeing between SMB/1.5 in
Samba 3.5.16 and SMB2/3.1.1 in Samba 4.6.5, using Wireshark's
"Statistics -> Service Response Times -> SMB(2)" tool.

I really hope that someone from the development team is now interested
in taking over and starts looking into this with me. Please rest assured
that I will be happy to do everything I can to support the analysis and
testing process. Please get back to me, and I will be sending you the
access information and password for the respective wireshark PCAPNG
traces ZIP file.

My test scenario: Win 10 client, copying files from/to the share with
Total Commander; Thecus N4200pro NAS (Linux kernel 2.6.33) with 3 GB RAM
and either the original Thecus Samba 3.5.16 or self-compiled (using
gcc-5.2) Samba 4.6.5; exact same smb.conf for both versions; definitely
no other load on / access to the NAS during my testing.


A) "Write" Scenario: Win 10 client copying/storing ~ 1000 small files
onto a Samba share on Thecus NAS

A1) smb15_write: Samba share running Thecus Samba 3.5.16 (writing 1024
small files to share)

======================================================================================
SMB Service Response Time Statistics - smb15_write:
Procedure             Index  Calls  Min SRT (s)  Max SRT (s)  Avg SRT (s)  Sum SRT (s)
--------------------------------------------------------------------------------------
SMB Commands
Close                     4   2791     0.000210     0.020178     0.000831     2.320021
Negotiate Protocol      114      1     0.001986     0.001986     0.001986     0.001986
NT Create AndX          162   3066     0.000650     0.019631     0.003122     9.570728
Session Setup AndX      115      2     0.001048     0.005596     0.003322     0.006644
Tree Connect AndX       117      1     0.012328     0.012328     0.012328     0.012328
Write AndX               47   1024     0.000473     0.350664     0.001528     1.564348
Transaction2 Sub-Commands
FIND_FIRST2               1   2042     0.001599     0.017518     0.003191     6.515678
QUERY_FILE_INFO           7   3071     0.000260     0.007440     0.000421     1.292491
QUERY_FS_INFO             3   1024     0.000259     0.017036     0.000363     0.371787
QUERY_PATH_INFO           5      2     0.000437     0.001040     0.000739     0.001477
SET_FILE_INFO             8   2037     0.001218     0.008207     0.001812     3.691841
NT Transaction Sub-Commands
--------------------------------------------------------------------------------------
Grand Total Sum of SRT (s)                                                   25.349329

A2) smb31_write: Samba share running self-compiled Samba 4.6.5 (writing
1015 small files to share)

======================================================================================
SMB2 Service Response Time Statistics - smb31_write:
Procedure             Index  Calls  Min SRT (s)  Max SRT (s)  Avg SRT (s)  Sum SRT (s)
--------------------------------------------------------------------------------------
SMB2
Close                     6   3059     0.000496     0.590329     0.004469    13.670456
Create                    5   3061     0.001683     0.023491     0.005686    17.405265
Find                     14   1607     0.001383     0.746684     0.193413   310.814294
GetInfo                  16     46     0.000424     0.009404     0.001441     0.066298
Ioctl                    11      1     0.000510     0.000510     0.000510     0.000510
Negotiate Protocol        0      1     0.043381     0.043381     0.043381     0.043381
Session Setup             1      2     0.001535     0.037150     0.019343     0.038685
SetInfo                  17   2033     0.000757     0.007015     0.001111     2.259172
Tree Connect              3      2     0.001591     0.031307     0.016449     0.032898
Tree Disconnect           4      1     0.004142     0.004142     0.004142     0.004142
Write                     9   1015     0.000590     0.358855     0.001798     1.825072
--------------------------------------------------------------------------------------
Grand Total Sum of SRT (s)                                                  346.160173

As you can easily see, the big difference here is in the "Find" operation:
A2a) While in Samba 4.6.5, 1607 calls to SMB2 "Find" take 310.81
seconds, in Samba 3.5.16, 2042 calls to SMB "FIND_FIRST2" take only 6.52
seconds.
The obvious consequence is that the overall grand total sum of service
response times for copying ~ 1000 small files goes up from 25 seconds to
a really annoying 346 seconds.
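
For anyone who wants to double-check the arithmetic behind that
comparison, here is a small Python sketch. It uses only the call counts
and SRT sums from the two "write" tables above; the dictionary layout and
helper names are my own assumptions and not part of any Wireshark or
Samba API.

```python
# Figures copied from the two "write" SRT tables (sums in seconds).
WRITE_FIND = {
    "smb1": {"calls": 2042, "sum_srt": 6.515678, "grand_total": 25.349329},
    "smb2": {"calls": 1607, "sum_srt": 310.814294, "grand_total": 346.160173},
}


def avg_srt_ms(row):
    """Average service response time per call, in milliseconds."""
    return row["sum_srt"] / row["calls"] * 1000.0


def share_of_total(row):
    """Fraction of the capture's grand total SRT spent in this command."""
    return row["sum_srt"] / row["grand_total"]


for label, row in WRITE_FIND.items():
    print(f"{label}: {avg_srt_ms(row):8.3f} ms per call, "
          f"{share_of_total(row):6.1%} of total SRT")

slowdown = avg_srt_ms(WRITE_FIND["smb2"]) / avg_srt_ms(WRITE_FIND["smb1"])
print(f"per-call slowdown: about {slowdown:.0f}x")
```

The per-call averages reproduce the "Avg SRT" column of the tables, so the
slowdown is per directory listing call and not caused by a higher number
of calls.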


B) "Read" Scenario: Win 10 client copying/reading ~ 2000 small files
from a Samba share on Thecus NAS

B1) smb15_read: Samba share running Thecus Samba 3.5.16 (reading 2040
small files from share)

======================================================================================
SMB Service Response Time Statistics - smb15_read:
Procedure             Index  Calls  Min SRT (s)  Max SRT (s)  Avg SRT (s)  Sum SRT (s)
--------------------------------------------------------------------------------------
SMB Commands
Close                     4   1630     0.000234     0.051417     0.001525     2.486313
Negotiate Protocol      114      1     0.001941     0.001941     0.001941     0.001941
NT Create AndX          162   2040     0.009347     0.170227     0.012715    25.938098
Read AndX                46   2185     0.000267     0.008728     0.000842     1.838953
Session Setup AndX      115      2     0.001074     0.005605     0.003340     0.006679
Tree Connect AndX       117      1     0.012529     0.012529     0.012529     0.012529
Transaction2 Sub-Commands
FIND_FIRST2               1     10     0.014211     0.817830     0.218505     2.185050
FIND_NEXT2                2    117     0.167130     0.868053     0.547014    64.000643
QUERY_FILE_INFO           7  10200     0.000267     0.003532     0.000363     3.706283
QUERY_FS_INFO             3   2048     0.000254     0.017356     0.000334     0.683642
QUERY_PATH_INFO           5     54     0.000424     0.055262     0.007493     0.404625
NT Transaction Sub-Commands
--------------------------------------------------------------------------------------
Grand Total Sum of SRT (s)                                                  101.264756

B2) smb31_read: Samba share running self-compiled Samba 4.6.5 (reading
2033 small files from share)

======================================================================================
SMB2 Service Response Time Statistics - smb31_read:
Procedure             Index  Calls  Min SRT (s)  Max SRT (s)  Avg SRT (s)  Sum SRT (s)
--------------------------------------------------------------------------------------
SMB2
Close                     6   1754     0.000443     0.188892     0.006119    10.732004
Create                    5   2038     0.001203     2.296105     0.026880    54.782150
Find                     14     52     0.000435     2.540043     1.351859    70.296650
GetInfo                  16   6091     0.000656     0.066154     0.001229     7.485929
Negotiate Protocol        0      1     0.032793     0.032793     0.032793     0.032793
Read                      8   2033     0.000300     0.071186     0.000742     1.507737
Session Setup             1      2     0.001522     0.036670     0.019096     0.038192
Tree Connect              3      1     0.030543     0.030543     0.030543     0.030543
--------------------------------------------------------------------------------------
Grand Total Sum of SRT (s)                                                  144.905998

In this case, the overall grand total sum of response times "only" goes
up from 101 seconds to 145 seconds, which is much less drastic, but still
quite noticeable.
Unfortunately, it seems less clear here what the main culprit is - the
difference rather seems to come from three different SMB(2) calls:
B2a) While we have 10+117 = 127 calls to "FIND_FIRST2"/"FIND_NEXT2" in
SMB that take ~ 66 seconds, only 52 calls to SMB2 "Find" take ~ 70 seconds.
B2b) SMB2 "Close" takes ~ 10 seconds for 1754 calls, while SMB "Close"
takes only ~ 2.5 seconds for 1630 calls.
B2c) SMB2 "Create" takes ~ 55 seconds for 2038 calls, while SMB "NT
Create AndX" takes only ~ 26 seconds for 2040 calls.
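
As a rough cross-check of B2a) to B2c), the per-call averages and the
extra time per operation can be recomputed from the figures quoted above.
This is only a sketch: the SMB "Close" and "NT Create AndX" sums are the
rounded values from the text (~ 2.5 s and ~ 26 s), and the dictionary and
function names are made up for illustration.

```python
# (calls, sum of SRT in seconds) per operation, taken from the figures above.
SMB1 = {
    "find": (10 + 117, 2.185050 + 64.000643),  # FIND_FIRST2 + FIND_NEXT2
    "close": (1630, 2.5),                      # rounded value from the text
    "create": (2040, 26.0),                    # NT Create AndX, rounded value
}
SMB2 = {
    "find": (52, 70.296650),
    "close": (1754, 10.732004),
    "create": (2038, 54.782150),
}
TOTAL_SMB1 = 101.264756  # Grand Total Sum of SRT, smb15_read
TOTAL_SMB2 = 144.905998  # Grand Total Sum of SRT, smb31_read


def breakdown():
    """Return average SRT per call and the SMB2-minus-SMB1 delta per operation."""
    rows = {}
    for op, (calls1, sum1) in SMB1.items():
        calls2, sum2 = SMB2[op]
        rows[op] = {
            "avg_smb1": sum1 / calls1,
            "avg_smb2": sum2 / calls2,
            "delta": sum2 - sum1,
        }
    return rows


rows = breakdown()
explained = sum(row["delta"] for row in rows.values())
for op, row in rows.items():
    print(f"{op:6s} avg SMB1 {row['avg_smb1']:.4f}s  "
          f"avg SMB2 {row['avg_smb2']:.4f}s  delta {row['delta']:+.1f}s")
print(f"explained {explained:.1f}s of {TOTAL_SMB2 - TOTAL_SMB1:.1f}s")
```

With these inputs, the three calls together account for roughly 41 of the
44 seconds of difference, and "Create" alone for roughly 29 of them.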


I truly hope we will be able to improve general Samba 4.6 / SMB2
performance for the "huge number of small files" scenario as a result of
this exercise...

Many thanks in advance for your kind help & best regards
Andreas


On 29.06.2017 at 11:03, awl1 wrote:

> Hello again, Jeremy and other Samba experts,
>
> I'm sorry to be such a pain in your neck(s), but I still need your
> help to find out why SMB2/3.1.1 in Samba 4.6.5 performs so much worse
> than SMB/1.5 in Samba 3.5.16 in scenarios with a huge number of small
> files.
>
> As requested by Jeremy, I have done wireshark "pcapng" captures of the
> four scenarios as described in my original post below:
>
>  * smb311_write - Win 10 client storing ~ 1000 small files onto Samba
>    4.6.5/Thecus NAS
>  * smb15_write - Win 10 client storing ~ 1000 small files on Samba
>    3.5.16/Thecus NAS
>  * smb311_read - Win 10 client reading ~ 2000 small files from Samba
>    4.6.5/Thecus NAS
>  * smb15_read - Win 10 client reading ~ 2000 small files from Samba
>    3.5.16/Thecus NAS
>
> These recordings do indeed contain confidential data from my machine,
> which is why I have so far only sent a download link to Jeremy via
> private mail.
>
> In case others from the Samba team would also like to look into the
> wireshark capture traces, please get back to me directly and request
> access: I will then also send you a download link/password to the
> capture files ZIP via private mail.
>
> I would really appreciate being able to switch this old Thecus
> N4200PRO NAS away from Thecus' outdated 3.5.16 version (prone to
> "SambaCry") to a self-compiled, but secure 4.6.x version asap.
>
> Many thanks one more time for your kind help with this & best regards
> Andreas
>
> On 13.06.2017 at 18:36, Jeremy Allison wrote:
>>> Can you get comparative wireshark traces for the two cases ?
>>>
>>> That would help discover what the bottleneck is.



Third Try: Huge number of small files performance regression from 3.5.16 to 4.6.5 with identical smb.conf

Samba - General mailing list
Hello again, Jeremy, hello again, Samba experts/developers,

as "all good things come in threes" and "third time's the charm",
following kind advice from Björn Jacke, I am trying once more to raise
your interest on this list, giving an even shorter summary of the issue
and having tested with a number of older Samba versions between 3.5.x
and 4.6.x to pinpoint exactly when the issue started.

As I am 99.99% confident that this is not a configuration issue on my
side, I would really appreciate it if somebody from the Samba team would
be interested in tracking down why - for the specific scenario with a
huge number of small files - performance is (so) much worse with Samba
4.x/SMB2 than it used to be with Samba 3.x/SMB1.

(Please note that, for a small number of larger or even huge files, as
expected, I can also confirm from my observations that Samba 4.x/SMB2 is
typically faster than Samba 3.x/SMB1, sometimes even considerably, so
the issue is NOT with Samba 4.x/SMB2 in general, but seems to be
specific to the scenario of a huge number of small files.)

Summary:

  * Win10 client using TotalCommander 9.0a to copy files
  * Copying files from/to a Samba share running on my Home Office Thecus NAS
  * Thecus N4200pro NAS (Intel(R) Atom(TM) CPU D525, 2 cores/4 HT
    threads @ 1.80GHz, Linux kernel 2.6.33, 3 GB RAM) and either Thecus
    original Samba 3.5.16 or several self-compiled (using gcc-5.2) Samba
    versions:
      o Samba 4.6.5, SMB2 dialect 3.1.1
      o Samba 4.2.14, SMB2 dialect 3.0
      o Samba 4.0.26, SMB2 dialect 3.0
      o Samba 3.6.25, SMB2 dialect 2.0.2 (single line "min protocol =
        SMB2" added to smb.conf)
      o Samba 3.6.25, SMB1 dialect 1.5
      o Thecus original Samba 3.5.16, SMB1 dialect 1.5
  * Exact same hardware, network, complete software stack for all cases
    (except varying Samba version on Thecus NAS)
  * Exact same smb.conf for all versions (see attached)
  * Definitely no other load on/access to the NAS during my testing
  * Recorded Wireshark captures in pcapng format for both Write/Read
    scenarios in all above Samba versions
  * Looking at Grand Total Sum of Wireshark "Service Response Time
    Statistics" (SRT) in seconds for all captures to compare performance
    below


*A) "Write" Scenario:*
Write ~ 1000 Small Files (between <1kB and ~ 20kB) to Samba share on
Thecus NAS, copying from a directory of ~ 5000 files stored on Win10
local NTFS

    Samba version   SMB/SMB2 dialect   Total SRT (sec)
    3.5.16          1.5                25
    3.6.25          1.5                21
    3.6.25          2.0.2              341
    4.0.26          3.0                387
    4.2.14          3.0                355
    4.6.5           3.1.1              346


*B) "Read" Scenario:*
Read ~ 2000 Small Files (between <1kB and ~ 20kB) from a directory of ~
5000 files from Samba share on Thecus NAS, copy to local NTFS on Win10

    Samba version   SMB/SMB2 dialect   Total SRT (sec)
    3.5.16          1.5                101
    3.6.25          1.5                100
    3.6.25          2.0.2              139
    4.0.26          3.0                152
    4.2.14          3.0                140
    4.6.5           3.1.1              144

(Note that in all cases, i.e. even in 3.x/SMB 1.5, the read scenario
spends most of its time enumerating all ~ 5000 files in this directory
before Total Commander even starts copying the ~ 2000 files.)

Summary of findings:

  * For both Write and Read scenario and a huge number of small files,
    performance with SMB2/dialect 2.0/3.0/3.1.1 in all Samba versions >=
    3.6 up to most recent 4.6 is (much) worse than SMB performance with
    SMB/dialect 1.5 in Samba 3.6 and before.
  * While in the Read scenario, performance is "only" about 40% worse
    (which might possibly at least partly be explained by additional
    complexity in SMB2), for the Write scenario, performance is *about
    fourteen times worse* (roughly 1400% of the SMB1 response time), a
    finding which definitely cannot be explained as "working as
    designed".
  * While SMB/1.5 performance is still fine in the latest 3.6.25, *all
    SMB2-capable releases of Samba from the very first SMB2/2.x
    implementation in Samba 3.6 onwards seem to be affected* by the
    performance regression.
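
The factors quoted in these findings can be recomputed from the two
tables. The following is a minimal sketch that assumes the Thecus
original 3.5.16/SMB 1.5 run is the baseline (my choice, the tables would
equally allow 3.6.25/SMB 1.5); all names in it are illustrative.

```python
# Total SRT in seconds per (Samba version, dialect), copied from the tables above.
WRITE_SRT = {
    ("3.5.16", "1.5"): 25,
    ("3.6.25", "1.5"): 21,
    ("3.6.25", "2.0.2"): 341,
    ("4.0.26", "3.0"): 387,
    ("4.2.14", "3.0"): 355,
    ("4.6.5", "3.1.1"): 346,
}
READ_SRT = {
    ("3.5.16", "1.5"): 101,
    ("3.6.25", "1.5"): 100,
    ("3.6.25", "2.0.2"): 139,
    ("4.0.26", "3.0"): 152,
    ("4.2.14", "3.0"): 140,
    ("4.6.5", "3.1.1"): 144,
}
BASELINE = ("3.5.16", "1.5")  # assumption: compare everything to the stock firmware


def slowdown(table, baseline=BASELINE):
    """Return total SRT of each run divided by the baseline run's total SRT."""
    base = table[baseline]
    return {key: srt / base for key, srt in table.items()}


for name, table in (("write", WRITE_SRT), ("read", READ_SRT)):
    for (version, dialect), factor in slowdown(table).items():
        print(f"{name:5s} Samba {version:7s} dialect {dialect:5s} {factor:6.2f}x")
```

Against that baseline, the 4.6.5 write run comes out at about 13.8 times
the total SRT and the 4.6.5 read run at about 1.43 times.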

I have attached the (anonymized) smb.conf (global section and particular
share definition) as well as an Excel sheet and PDF with the detailed
Wireshark Service Response Time Statistics for Write and Read scenario.

On 13.06.2017 at 18:36, Jeremy Allison wrote:
> Can you get comparative wireshark traces for the two cases ?
> That would help discover what the bottleneck is.

As requested by Jeremy, the Wireshark "pcapng" packet traces/recordings
are available for all Samba versions mentioned above in both Read and
Write scenario. Unfortunately, these recordings do indeed contain
confidential data both from my machine and the share, so please get back
to me directly and request access: I will then send you a download
link/password to the capture files ZIP via private mail.

I hereby promise that I will do everything I can in order to support
your analysis, including running follow-up tests on my
platform/scenario, digging deeper into packet traces or even doing
source code investigations based on your instructions.

*I truly hope we will be able to improve general Samba 4.x / SMB2
performance for the "huge number of small files" scenario as a result of
this exercise... *

Many thanks one more time for your kind help with this!

Best regards,
Andreas



smb.conf (3K) Download Attachment

Third Try: Huge number of small files performance regression from 3.5.16 to 4.6.5 with identical smb.conf

Samba - General mailing list
In reply to this post by Samba - General mailing list
Hello again, Jeremy, hello again, Samba experts/developers,

as "all good things come in threes" and "third time's the charm",
following kind advice from Björn Jacke, I am trying once more on this
list to raise your interest, giving an even shorter summary of the
issue - and having tested with a number of older Samba versions between
3.5.x and 4.6.x to pinpoint exactly when the issue started...

As I am 99.99% confident that this is not a configuration issue on my
side, I would really appreciate if somebody from the Samba team would be
interested in tracking down why - for the specific scenario with a huge
number of small files - performance is (so) much worse with Samba
4.x/SMB2 than it used to be with Samba 3.x/SMB1.

(Please note that, for a small number of larger or even huge files, as
expected, I can also confirm from my observations that Samba 4.x/SMB2 is
typically faster than Samba 3.x/SMB1, sometimes even considerably, so
the issue is NOT with Samba 4.x/SMB2 in general, but seems to be
specific to the scenario of a huge number of small files.)

Summary:

    * Win10 client using TotalCommander 9.0a to copy files
    * Copying files from/to a Samba share running on my Home Office
      Thecus NAS
    * Thecus N4200pro NAS (Intel(R) Atom(TM) CPU D525, 2 cores/4 HT
      threads @ 1.80GHz, Linux kernel 2.6.33, 3 GB RAM) and either
      Thecus original Samba 3.5.16 or several self-compiled
      (using gcc-5.2) Samba versions:
        - Samba 4.6.5, SMB2 dialect 3.1.1
        - Samba 4.2.14, SMB2 dialect 3.0
        - Samba 4.0.26, SMB2 dialect 3.0
        - Samba 3.6.25, SMB2 dialect 2.0.2 (single line
          "min protocol = SMB2" added to smb.conf)
        - Samba 3.6.25, SMB1 dialect 1.5
        - Thecus original Samba 3.5.16, SMB1 dialect 1.5
    * Exact same hardware, network, complete software stack for all
      cases (except varying Samba version on Thecus NAS)
    * Exact same smb.conf for all versions (see link below)
    * Definitely no other load on/access to the NAS during my testing
    * Recorded Wireshark captures in pcapng format for both Write/Read
      scenarios in all above Samba versions
    * Looking at Grand Total Sum of Wireshark "Service Response Time
      Statistics" (SRT) in seconds for all captures to compare
      performance below


A) "Write" Scenario:
Write ~ 1000 Small Files (between <1kB and ~ 20kB) to Samba share on
Thecus NAS, copying from a directory of ~ 5000 files stored on Win10
local NTFS

      Samba version   SMB/SMB2 dialect   Total SRT (sec)
      3.5.16          1.5                25
      3.6.25          1.5                21
      3.6.25          2.0.2              341 (!!!)
      4.0.26          3.0                387 (!!!)
      4.2.14          3.0                355 (!!!)
      4.6.5           3.1.1              346 (!!!)

B) "Read" Scenario:
Read ~ 2000 Small Files (between <1kB and ~ 20kB) from a directory of ~
5000 files from Samba share on Thecus NAS, copy to local NTFS on Win10

      Samba version   SMB/SMB2 dialect   Total SRT (sec)
      3.5.16          1.5                101
      3.6.25          1.5                100
      3.6.25          2.0.2              139 (!)
      4.0.26          3.0                152 (!)
      4.2.14          3.0                140 (!)
      4.6.5           3.1.1              144 (!)

(Note that the read scenario spends most of the time - even in 3.x/SMB
1.5 - determining the whole number of ~ 5000 files in this directory,
before Total Commander even starts copying the ~ 2000 files.)

Summary of findings:

    * For both Write and Read scenario and a huge number of small files,
      performance with SMB2/dialect 2.0/3.0/3.1.1 in all Samba versions
      >= 3.6 up to most recent 4.6 is (much) worse than SMB performance
      with SMB/dialect 1.5 in Samba 3.6 and before.

    * While in the Read scenario, performance is "only" about 40% worse
      (which might possibly at least partly be explained by additional
      complexity in SMB2), for the Write scenario, performance is about
      *fourteen times* worse (roughly 1400% of the SMB1 response time),
      a finding which definitely cannot be explained as "working as
      designed".

    * While SMB/1.5 performance is still fine in the latest 3.6.25, *all
      SMB2-capable releases of Samba from the very first SMB2/2.x
      implementation in Samba 3.6 onwards* seem to be affected by the
      performance regression.

As it seems prohibited to attach Excel or PDF documents when posting to
this list, I am providing my (anonymized) smb.conf (global section and
particular share definition) as well as an Excel sheet and a PDF with
the detailed Wireshark Service Response Time Statistics for Write and
Read scenario over here:

http://home.mnet-online.de/awl1/smb.conf
http://home.mnet-online.de/awl1/Performance%20Regression.xls
http://home.mnet-online.de/awl1/Performance%20Regression.pdf

On 13.06.2017 at 18:36, Jeremy Allison wrote:
> Can you get comparative wireshark traces for the two cases ?
> That would help discover what the bottleneck is.

As requested by Jeremy, the Wireshark "pcapng" packet traces/recordings
are available for all Samba versions mentioned above in both Read and
Write scenario. Unfortunately, these recordings do indeed contain
confidential data both from my machine and the share, so please get back
to me directly and request access: I will then send you a download link
and password to the capture files ZIP via private mail.

I also hereby promise that I will do everything I can in order to
support your analysis, including running follow-up tests on my
platform/scenario, digging deeper into packet traces or even doing
source code investigations based on your instructions.

I truly hope we will be able to improve general Samba 4.x / SMB2
performance for the "huge number of small files" scenario as a result of
this exercise...

Many thanks one more time for your kind help with this!

Best regards,
Andreas


Re: Third Try: Huge number of small files performance regression from 3.5.16 to 4.6.5 with identical smb.conf

Samba - General mailing list
On Fri, 14 Jul 2017 17:37:11 +0200
awl1 via samba <[hidden email]> wrote:

> Hello again, Jeremy, hello again, Samba experts/developers,
>
> as "all good things come in threes" and "third time is a charm",
> following kind advice from Björn Jacke, I do indeed try again on this
> list to arouse your interest one more time, giving an even shorter
> summary of the issue - and having tested with a number of older Samba
> versions between 3.5.x and 4.6.x to exactly pinpoint when the issue
> started...
>
> As I am 99.99% confident that this is not a configuration issue on my
> side, I would really appreciate if somebody from the Samba team would
> be interested in tracking down why - for the specific scenario with a
> huge number of small files - performance is (so) much worse with
> Samba 4.x/SMB2 than it used to be with Samba 3.x/SMB1.
>
> (Please note that, for a small number of larger or even huge files,
> as expected, I can also confirm from my observations that Samba
> 4.x/SMB2 is typically faster than Samba 3.x/SMB1, sometimes even
> considerably, so the issue is NOT with Samba 4.x/SMB2 in general, but
> seems to be caused to the specific scenario of a huge number of small
> files.)
>
> Summary:
>
>     * Win10 client using TotalCommander 9.0a to copy files
>     * Copying files from/to a Samba share running on my Home Office
>       Thecus NAS
>     * Thecus N4200pro NAS (Intel(R) Atom(TM) CPU D525, 2 cores/4 HT
>       threads @ 1.80GHz, Linux kernel 2.6.33, 3 GB RAM) and either
>       Thecus original Samba 3.5.16 or several self-compiled
>       (using gcc-5.2) Samba versions:
>         - Samba 4.6.5, SMB2 dialect 3.1.1
>         - Samba 4.2.14, SMB2 dialect 3.0
>         - Samba 4.0.26, SMB2 dialect 3.0
>         - Samba 3.6.25, SMB2 dialect 2.0.2 (single line
>           "min protocol = SMB2" added to smb.conf)
>         - Samba 3.6.25, SMB1 dialect 1.5
>         - Thecus original Samba 3.5.16, SMB1 dialect 1.5
>     * Exact same hardware, network, complete software stack for all
>       cases (except varying Samba version on Thecus NAS)
>     * Exact same smb.conf for both versions (see attached)
>     * Definitely no other load on/access to the NAS during my testing
>     * Recorded Wireshark captures in pcapng format for both Write/Read
>       scenarios in all above Samba versions
>     * Looking at Grand Total Sum of Wireshark "Service Response Time
>       Statistics" (SRT) in seconds for all captures to compare
>       performance below
>
>
> A) "Write" Scenario:
> Write ~ 1000 Small Files (between <1kB and ~ 20kB) to Samba share on
> Thecus NAS, copying from a directory of ~ 5000 files stored on Win10
> local NTFS
>
>       Samba version   SMB/SMB2 dialect   Total SRT (sec)
>       3.5.16          1.5                25
>       3.6.25          1.5                21
>       3.6.25          2.0.2              341 (!!!)
>       4.0.26          3.0                387 (!!!)
>       4.2.14          3.0                355 (!!!)
>       4.6.5           3.1.1              346 (!!!)
>
> B) "Read" Scenario:
> Read ~ 2000 Small Files (between <1kB and ~ 20kB) from a directory of
> ~ 5000 files from Samba share on Thecus NAS, copy to local NTFS on
> Win10
>
>       Samba version   SMB/SMB2 dialect   Total SRT (sec)
>       3.5.16          1.5                101
>       3.6.25          1.5                100
>       3.6.25          2.0.2              139 (!)
>       4.0.26          3.0                152 (!)
>       4.2.14          3.0                140 (!)
>       4.6.5           3.1.1              144 (!)
>
> (Note that the read scenario spends most of the time - even in
> 3.x/SMB 1.5 - determining the whole number of ~ 5000 files in this
> directory, before Total Commander even starts copying the ~ 2000
> files.)
>
> Summary of findings:
>
>     * For both Write and Read scenario and a huge number of small
>       files, performance with SMB2/dialect 2.0/3.0/3.1.1 in all Samba
>       versions >= 3.6 up to most recent 4.6 is (much) worse than SMB
>       performance with SMB/dialect 1.5 in Samba 3.6 and before.
>
>     * While in the Read scenario, performance is "only" worse by a
>       factor of 40% (which might possibly at least partly be explained
>       by additional complexity in SMB2), for the Write scenario,
>       performance is about *fourteen times* (1400%) worse, a finding
>       which definitely cannot be explained to be "working as designed".
>
>     * While SMB/1.5 performance is still fine in the latest 3.6.25,
>       *all SMB2-capable releases of Samba from the very first SMB2/2.x
>       implementation in Samba 3.6 onwards* seem to be affected by the
>       performance regression.
>
> As it seems prohibited to attach Excel or PDF documents when posting
> to this list, I am providing my (anonymized) smb.conf (global section
> and particular share definition) as well as an Excel sheet and a PDF
> with the detailed Wireshark Service Response Time Statistics for
> Write and Read scenario over here:
>
> http://home.mnet-online.de/awl1/smb.conf
> http://home.mnet-online.de/awl1/Performance%20Regression.xls
> http://home.mnet-online.de/awl1/Performance%20Regression.pdf
>
> On 13.06.2017 at 18:36, Jeremy Allison wrote:
> > Can you get comparative wireshark traces for the two cases ?
> > That would help discover what the bottleneck is.
>
> As requested by Jeremy, the Wireshark "pcapng" packet
> traces/recordings are available for all Samba versions mentioned
> above in both Read and Write scenario. Unfortunately, these
> recordings do indeed contain confidential data both from my machine
> and the share, so please get back to me directly and request access:
> I will then send you a download link and password to the capture
> files ZIP via private mail.
>
> I also hereby promise that I will do everything I can in order to
> support your analysis, including running follow-up tests on my
> platform/scenario, digging deeper into packet traces or even do
> source code investigations based on your instructions.
>
> I truly hope we will be able to improve general Samba 4.x / SMB2
> performance for the "huge number of small files" scenario as a result
> of this exercise...
>
> Many thanks one more time for your kind help with this!
>
> Best regards,
> Andreas

Can I ask a question here?
Are the windows machines part of an AD domain ?

Rowland


Re: Third Try: Huge number of small files performance regression from 3.5.16 to 4.6.5 with identical smb.conf

Samba - General mailing list
Hello Rowland,

On 14.07.2017 at 18:54, Rowland Penny via samba wrote:
> Can I ask a question here? Are the windows machines part of an AD
> domain ?
No, this is (nearly) the simplest scenario you can imagine (Home
Office NAS).

As you can see from smb.conf:

passdb backend = tdbsam

http://home.mnet-online.de/awl1/smb.conf

All users are strictly local on the NAS (and match those in the Windows
workgroup).

Thanks & best regards,
Andreas



Re: Friendly Reminder: Huge number of small files performance regression from 3.5.16 to 4.6.5 with identical smb.conf

Samba - General mailing list
In reply to this post by Samba - General mailing list
On Mon, Jul 03, 2017 at 12:08:53PM +0200, awl1 wrote:

> Hello one more time, Jeremy & fellow Samba experts/developers,
>
> over the weekend, I have done some more reading about wireshark
> tooling/statistics - the answer to
>
> https://ask.wireshark.org/questions/58970/analysing-performance-issues-with-storage-smb2
>
> was very helpful - and am now able to provide very clear and simple
> proof of the performance regression that I am seeing between SMB/1.5
> in Samba 3.5.16 and SMB2/3.1.1 in Samba 4.6.5, using Wireshark's
> "Statistics -> Service Response Times -> SMB(2)" tool.
>
> I really hope that someone from the development team is now
> interested in taking over and starts looking into this with me.
> Please rest assured that I will be happy to do everything I can to
> support the analysis and testing process. Please get back to me, and
> I will be sending you the access information and password for the
> respective wireshark PCAPNG traces ZIP file.

It would be quicker for you to help, I'm afraid. As
you have nicely identified the SMB2_QUERY_DIRECTORY
as the cause of the regression, can you look into the
wireshark traces and tell me what info level the
SMB1 client is asking for and what info level the
SMB2 client is asking for ?

If the SMB2 client is also asking for security
descriptors, this may be part of it.
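
One possible way to answer this without clicking through every packet is
to export the info level of each directory query as text and tally the
values. The sketch below only does the tallying. It assumes the values
were exported one packet per line, for example with tshark's
"-T fields -e <field>" output; the field name "smb2.find.infolevel" (and
its SMB1 counterpart) is my assumption and should be checked against the
Wireshark display filter reference. The sample values are made up.

```python
from collections import Counter


def count_info_levels(field_lines):
    """Tally info-level values from lines of exported field text.

    Each line holds the value(s) for one packet; several values in one
    packet are assumed to be comma-separated, empty lines are skipped.
    """
    counts = Counter()
    for line in field_lines:
        for value in line.strip().split(","):
            if value:
                counts[value] += 1
    return counts


# Hypothetical sample input, not taken from the real captures.
sample = ["0x25", "0x25,0x03", "", "0x25"]
for level, calls in count_info_levels(sample).most_common():
    print(f"info level {level}: {calls} request(s)")
```

Running the same tally on the SMB1 and the SMB2 capture would show
directly whether the two clients ask for different info levels.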


Re: Third Try: Huge number of small files performance regression from 3.5.16 to 4.6.5 with identical smb.conf

Samba - General mailing list
In reply to this post by Samba - General mailing list
On Fri, Jul 14, 2017 at 04:19:09PM +0200, awl1 wrote:
> [global]
> server string = %h
> max open files = 100000
> deadtime = 15
> dead time = 15
> hide unreadable = yes

Remove the above. Causes security descriptor lookup for every directory entry.

> load printers = no
> log file = /var/log/samba.%m
> max log size = 50
> strict locking = no
> lock directory = /var/samba
> encrypt passwords = yes
> case sensitive = true
> default case = lower
> preserve case = yes
> short preserve case = yes
> passdb backend = tdbsam
> socket options = TCP_NODELAY IPTOS_LOWDELAY SO_KEEPALIVE

Remove the above - voodoo bullsh*t not needed on modern kernels.

> aio read size = 1
> aio write size = 1
> write cache size = 2097152

REMOVE THE ABOVE WITH PREJUDICE !!!!!

Where did this come from ? That code can have significant
effects on allowing async/oplocks etc.

> read raw = yes
> write raw = yes

Remove the above - only needed for SMB1.

> min receivefile size = 0
> use sendfile = yes
> large readwrite = yes

Remove the above. SMB1 only.

> max xmit = 32768

Remove the above. May have devastating effects.

> getwd cache = true
> map untrusted to domain = yes
> os level = 1
> local master = yes
> unix extensions = yes
> domain master = no
> preferred master = no
> dns proxy = no
> dos charset = CP850
> unix charset = utf8
> client ldap sasl wrapping = seal
> allow trusted domains = yes
> idmap uid = 20000-60000000
> idmap gid = 20000-60000000
> winbind separator = +
> winbind nested groups = yes
> winbind enum users = yes
> winbind enum groups = yes
> create mask = 0644
> winbind use default domain = yes
> map acl inherit = yes
> nt acl support = yes
> #map system = yes
> bind interfaces only = yes
> interfaces = lo,bond*
> guest account = nobody
> map to guest = Bad User
> guest only = yes
> follow symlinks = no
> block size = 262144
> dfree cache time = 5
> large readwrite = yes

Remove the above. SMB1 only.

> getwd cache = yes
> oplocks = yes
> kernel oplocks = yes

Remove the above. If you're not sharing via
multiple protocols then this will make things
worse.

> veto files = /folder.db/.AppleDouble/.AppleDB/.bin/.AppleDesktop/Network Trash Folder/:2eDS_Store/.DS_Store/TheFindByContentFolder/TheVolumeSettingsFolder/Temporary Items/.AppleDBcnid.lock/.VolumeIcon.icns/.Temporary Items/.Parent/.HSicon/._*/:*/
> veto oplock files = /J0*.WMF/*_.GIF/J0*.JPG/*_.WMF/

Remove the two lines above. Why are they needed ?

> workgroup = WORKGROUP
> password server = *
> security = user
> auth methods = guest sam_ignoredomain
> realm =
> idmap backend = rid:WORKGROUP=20000-60000000
> wins server = 192.168.1.1
> client ntlmv2 auth = no
> server signing = disabled
> smb encrypt = disabled
> delete veto files = yes
>
> [Work]
> comment = Work
> browseable = yes
> guest only = no
> path = /raid0/data/Work
> map acl inherit = yes
> inherit acls = yes
> read only = no
> create mask = 0777
> force create mode = 0000
> inherit permissions = Yes
> map archive = yes
> map hidden = no
> store dos attributes = no
> valid users = @smbadmin,@smbwork,user1
> invalid users = user2,upload
> read list = backup
> write list = @smbadmin,@smbwork,user1

Start with a small, clean smb.conf file. This one is a horrible
mess that looks like it accumulated over many years.

Jeremy.


Re: Friendly Reminder: Huge number of small files performance regression from 3.5.16 to 4.6.5 with identical smb.conf

Samba - General mailing list
In reply to this post by Samba - General mailing list
Hello Jeremy,

many thanks for getting back to me! :-)

Am 14.07.2017 um 19:33 schrieb Jeremy Allison:
> It would be quicker for you to help I'm afraid. As you have nicely
> identified the SMB2_QUERY_DIRECTORY as cause of the regression, can
> you look into the wireshark traces and tell me what info level the
> SMB1 client is asking for and what info level the SMB2 client is
> asking for ? If the SMB2 client is also asking for security
> descriptors, this may be part of it.
I will try to do my best in helping you, but I will need more
information, as I am not yet clear what exactly you want me to look for
in Wireshark:

How did you derive SMB2_QUERY_DIRECTORY from the information I listed?
For me, the main "Write" bottleneck pointed to SMB2 "Find", how do I get
to SMB2_QUERY_DIRECTORY from there?

Index               Procedure  Calls  Min SRT (s)  Max SRT (s)  Avg SRT (s)  Sum SRT (s)
Find                       14   1607     0.001383     0.746684     0.193413   310.814294
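
As a quick sanity check on the row above, here is a minimal sketch that recomputes the average service response time from the call count and the summed SRT. The numbers are copied from the Wireshark statistics row; the variable names are only illustrative.

```python
# Values copied from the "Find" row of the Wireshark SRT statistics above.
calls = 1607
avg_srt = 0.193413      # seconds, as reported by Wireshark
sum_srt = 310.814294    # seconds, as reported by Wireshark

# The average should be the summed response time divided by the call count.
derived_avg = sum_srt / calls

print(f"derived avg SRT: {derived_avg:.6f} s (reported: {avg_srt:.6f} s)")
print(f"total time spent waiting on Find responses: {sum_srt / 60:.1f} min")
```

If the derived and reported averages agree, the row is internally consistent, and the 1607 "Find" calls alone account for roughly five minutes of server response time during the copy.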

So in order to check I should open both traces and compare exactly what
information? It would be great if you can describe what I have to do in
the Wireshark application in as much detail as possible...

Sorry, I am a complete newbie in terms of packet analysis/using
Wireshark... :-(

Thanks heaps & best regards
Andreas



Re: Friendly Reminder: Huge number of small files performance regression from 3.5.16 to 4.6.5 with identical smb.conf

Samba - General mailing list
On Fri, Jul 14, 2017 at 07:44:38PM +0200, awl1 wrote:

> Hello Jeremy,
>
> many thanks for getting back to me! :-)
>
> Am 14.07.2017 um 19:33 schrieb Jeremy Allison:
> >It would be quicker for you to help I'm afraid. As you have nicely
> >identified the SMB2_QUERY_DIRECTORY as cause of the regression,
> >can you look into the wireshark traces and tell me what info level
> >the SMB1 client is asking for and what info level the SMB2 client
> >is asking for ? If the SMB2 client is also asking for security
> >descriptors, this may be part of it.
> I will try to do my best in helping you, but I will need more
> information, as I am not yet clear what exactly you want me to look
> for in Wireshark:
>
> How did you derive SMB2_QUERY_DIRECTORY from the information I
> listed? For me, the main "Write" bottleneck pointed to SMB2 "Find",
> how do I get to SMB2_QUERY_DIRECTORY from there?
>
> Index               Procedure  Calls  Min SRT (s)  Max SRT (s)  Avg SRT (s)  Sum SRT (s)
> Find                       14   1607     0.001383     0.746684     0.193413   310.814294
>
> So in order to check I should open both traces and compare exactly
> what information? It would be great if you can describe what I have
> to do in the Wireshark application in as much detail as possible...

First try fixing the smb.conf in the way I reviewed. Then
let's look.


Re: Third Try: Huge number of small files performance regression from 3.5.16 to 4.6.5 with identical smb.conf

Samba - General mailing list
In reply to this post by Samba - General mailing list
Hi again,

Am 14.07.2017 um 19:41 schrieb Jeremy Allison:
> Start with a small, clean smb.conf file. This one is a horrible mess
> looks like accumulated over many years.
the complete content of the global section was provided by Thecus (who
produced the NAS). Without "breaking into" the NAS through ssh as a
"module developer", an end user is not even able to see the smb.conf file...

So everything in the [global] section is nothing but Thecus' seemingly
poor defaults (dating back to Samba 3.5.16), except for one small change
I made when I switched from "case sensitive = auto" to "case sensitive =
true" in order to speed up the "write a huge number of files into a
single directory" scenario for the original Samba 3.5 some years ago.

When you stated "the above" in your previous mail, did you only refer to
the one line immediately above, or to the complete block of lines
between two of your remarks?
Just one example:
> aio read size = 1
> aio write size = 1
> write cache size = 2097152
You stated: "REMOVE THE ABOVE WITH PREJUDICE !!!!!"

So do you want me to remove just one line: "write cache size", or all
three lines up to your previous comment?

Will try to apply the changes asap and report back...

Thanks & best regards
Andreas



Re: Third Try: Huge number of small files performance regression from 3.5.16 to 4.6.5 with identical smb.conf

Samba - General mailing list
On Fri, Jul 14, 2017 at 07:56:11PM +0200, awl1 wrote:
> Hi again,
>
> Am 14.07.2017 um 19:41 schrieb Jeremy Allison:
> >Start with a small, clean smb.conf file. This one is a horrible
> >mess looks like accumulated over many years.
> the complete content of the global section was provided by Thecus
> (who produced the NAS). Without "breaking into" the NAS through ssh
> as a "module developer", an end user is not even able to see the
> smb.conf file...

Yes, but I'm not Thecus support, I'm only helping you :-).

> So everything in the [global] section is nothing but Thecus'
> seemingly poor defaults (dating back from Samba 3.5.16) except one
> small change I did when I switched from "case sensitive = auto" to
> "case sensitive = true" in order to speed up the "write huge number
> of files into a single directory scenario" for the original Samba
> 3.5 some years ago.

Yes, but I'm guessing that some of the interactions
here with these smb.conf params have never been tested
with the move to SMB2.

> When you stated "the above" in your previous mail, did you only
> refer to the one line immediately above, or to the complete block of
> lines between two of your remarks?

One line.

> Just one example:
> >aio read size = 1
> >aio write size = 1
> >write cache size = 2097152
> You stated: "REMOVE THE ABOVE WITH PREJUDICE !!!!!"
>
> So do you want me to remove just one line: "write cache size", or
> all three lines up to your previous comment?

Just the "write cache size".
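
With the "one line" reading confirmed, the review boils down to a short list of parameters: hide unreadable, socket options, write cache size, write raw, large readwrite, max xmit, kernel oplocks, plus the two veto lines. The following is only a rough sketch of how that list could be applied to an existing file. The helper name strip_flagged is made up for illustration and is not part of Samba, and whether the veto lines should really go depends on the answer to "Why are they needed ?". The sketch relies on the smb.conf rule that case and whitespace in parameter names are irrelevant.

```python
# Parameters flagged for removal in the review (one line per remark,
# two lines for the veto remark). This list is taken from the thread.
FLAGGED = {
    "hide unreadable", "socket options", "write cache size", "write raw",
    "large readwrite", "max xmit", "kernel oplocks", "veto files",
    "veto oplock files",
}


def _norm(name):
    # smb.conf ignores case and whitespace inside parameter names.
    return "".join(name.split()).lower()


_FLAGGED_NORM = {_norm(p) for p in FLAGGED}


def strip_flagged(conf_text):
    """Hypothetical helper: return (cleaned_text, removed_lines)."""
    kept, removed = [], []
    for line in conf_text.splitlines():
        stripped = line.strip()
        # Only "name = value" lines are candidates; skip comments and sections.
        is_param = "=" in stripped and not stripped.startswith(("#", ";", "["))
        if is_param and _norm(stripped.split("=", 1)[0]) in _FLAGGED_NORM:
            removed.append(stripped)
        else:
            kept.append(line)
    return "\n".join(kept) + "\n", removed


sample = (
    "[global]\n"
    "hide unreadable = yes\n"
    "load printers = no\n"
    "Write Cache Size = 2097152\n"
    "aio read size = 1\n"
    "delete veto files = yes\n"
)
cleaned, removed = strip_flagged(sample)
print(cleaned)
print("removed:", removed)
```

Working on a copy of smb.conf and running testparm on the result before restarting smbd would be the cautious way to use something like this.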


Re: Third Try: Huge number of small files performance regression from 3.5.16 to 4.6.5 with identical smb.conf

Samba - General mailing list
In reply to this post by Samba - General mailing list
On Fri, 14 Jul 2017 19:20:05 +0200
awl1 <[hidden email]> wrote:

> Hello Rowland,
>
> Am 14.07.2017 um 18:54 schrieb Rowland Penny via samba:
> > Can I ask a question here? Are the windows machines part of an AD
> > domain ?
> no, this is (nearly) the most simple scenario that you can imagine
> (Home Office NAS).
>
> As you can see from smb.conf:
>
> passdb backend = tdbsam

You cannot tell from that; it is also used on a Unix domain member.

>
> http://home.mnet-online.de/awl1/smb.conf
>
> All users are strictly local on the NAS (and match those in the
> Windows workgroup).

As you are running a workgroup, the NAS should be set up as a
standalone server. Samba has changed greatly since 3.5.x, and quite a
lot of your smb.conf should be changed to take account of this. You
don't need winbind, for one thing.

Rowland
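
To see how much of the posted configuration is affected by the winbind remark, a small sketch like the one below could list the candidate lines. The function name winbind_related is hypothetical, and counting the "idmap ..." parameters as winbind-related is an assumption made here, not something stated in the thread.

```python
def winbind_related(conf_text):
    """Hypothetical helper: list parameter lines whose name starts with
    'winbind' or 'idmap' (the idmap part is an assumption)."""
    hits = []
    for line in conf_text.splitlines():
        stripped = line.strip()
        # Skip blank lines, comments and section headers.
        if "=" not in stripped or stripped.startswith(("#", ";", "[")):
            continue
        # Parameter names are compared without case or whitespace.
        name = "".join(stripped.split("=", 1)[0].split()).lower()
        if name.startswith(("winbind", "idmap")):
            hits.append(stripped)
    return hits


sample = (
    "[global]\n"
    "passdb backend = tdbsam\n"
    "idmap uid = 20000-60000000\n"
    "winbind separator = +\n"
    "winbind enum users = yes\n"
    "security = user\n"
)
for hit in winbind_related(sample):
    print(hit)
```

The output is only a list of lines to review by hand; it does not decide which of them are safe to drop on a given setup.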


