Performance optimizations for small files and case sensitive-option

Samba - General mailing list
Hello,

I am running a Samba server (version 4.6.9+git.59.c2cff9cea4c) on a
SLES12SP3 machine. The Samba server offers only the smb service; the file
data are stored on separate NFS servers because our users access their
files from both Linux and Windows systems. So every read or write through
this Samba server results in a read or write on the NFS servers.

The problem is speed when many very small files are copied from Windows.
I have a test directory of 70 MB consisting of 5528 mostly very
small (<= 70 KB) files. The directory is stored on a local disk on the
Windows machine. "Copy via smb" below means that I copy the data from a
Windows 10 VM to a share on the Samba server (which results in
writing the data to the NFS file servers). The time it takes to copy
this directory is as follows:

Copy testdir 70MB with 5500 single files via smb:   300 sec (5min)
Copy zip archive of testdir (1 file, 46MB) via smb:  ~2 sec
Copy this directory on smb server via NFS, cp -ar:   90 sec
Copy test dir via smb to local fs on smb server:     90 sec

I tried many things to speed up the copying of many small files and
finally came across Samba's case sensitive option:

Copy testdir 70MB with 5500 single files via smb:    180 sec (3min)
using "case sensitive = yes"

This takes about 40% less time than the 300 sec without the case
sensitive option. Unfortunately, case-sensitive file names combined with
Windows lead to strange effects that many users will not understand, so
this is not a real option for production use (think of creating a file
hellO.txt and another file hello.txt, and a user deleting the only one he
sees in Explorer).
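
As a rough check on the numbers above, the totals can be converted into an average time per file. This is only arithmetic on the reported timings; it assumes all 5528 files contribute equally, which is a simplification.

```python
# Timings reported above (seconds) for the 5528-file test directory.
FILES = 5528
timings = {
    "smb -> NFS, default": 300,
    "smb -> NFS, case sensitive = yes": 180,
    "NFS cp -ar on the smb server": 90,
    "smb -> local fs on the smb server": 90,
}


def per_file_ms(total_seconds, files=FILES):
    """Average wall-clock time per file in milliseconds."""
    return total_seconds * 1000.0 / files


for label, seconds in timings.items():
    print(f"{label:36s} {per_file_ms(seconds):6.1f} ms/file")

# Relative reduction in total copy time from "case sensitive = yes".
reduction = 1 - 180 / 300
print(f"time saved by case sensitive = yes: {reduction:.0%}")
```

Seen this way, the default setup spends roughly 54 ms per file, the case sensitive run roughly 33 ms, and the two 90 sec runs roughly 16 ms.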

My thinking was this: if, for every file copied, Samba has to check whether
the file already exists in any upper/lower case spelling, and that check
is slow over NFS, then bigger caches might help. So I set
max stat cache size and directory name cache size to higher values
(see config below). Unfortunately this did not help (still ~300 sec).
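
The suspected cost difference can be illustrated with a small sketch. This is not Samba's actual code path; the two function names are hypothetical and only model the idea that an exact lookup needs one existence check, while a case-insensitive lookup of a name that does not exist yet has to scan the whole directory before it can answer "no such file".

```python
import os
import tempfile


def lookup_case_sensitive(directory, name):
    """One existence check: the name either matches exactly or it does not."""
    return name if os.path.lexists(os.path.join(directory, name)) else None


def lookup_case_insensitive(directory, name):
    """Try the exact name first; on a miss, scan the whole directory for an
    entry that differs only in case. Returns the on-disk spelling or None."""
    if os.path.lexists(os.path.join(directory, name)):
        return name
    wanted = name.casefold()
    for entry in os.listdir(directory):  # cost grows with directory size
        if entry.casefold() == wanted:
            return entry
    return None


with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "hellO.txt"), "w").close()
    # A file that is about to be created does not exist in any spelling,
    # so the case-insensitive lookup always runs the full scan for it.
    print(lookup_case_insensitive(d, "new-file.txt"))
    print(lookup_case_insensitive(d, "hello.txt"))
```

When a directory tree is copied, almost every lookup is for a name that does not exist yet, so a cache of names that were found earlier would not obviously shorten these scans. Whether this matches what the stat cache does internally is an assumption here, not something stated in this thread.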

Does anyone have a hint why caching, especially max stat cache size, does
not seem to have any effect, whereas "case sensitive = yes" has a big
positive effect?

Does anyone have any other idea what I could try to get better smb
performance for many small files?

Thanks a lot
Rainer

My smb.conf:
[global]
        workgroup = MYWORKGROUP
        server string = Samba (version %v)
        encrypt passwords = Yes
        log file = /var/log/samba/log.%m
        log level = 1 winbind:1
        max log size = 0

        unix extensions = no
        wide links = yes
        kernel oplocks = no
        oplocks = yes
        posix locking = no
        blocking locks = no
        acl allow execute always = yes
        max open files = 32808
        max xmit = 262144
        dead time = 15
        browseable = no
        server signing = No
        getwd cache = yes
        directory name cache size = 65536
        stat cache = yes
        max stat cache size = 65536

        use sendfile = true
        aio read size = 32768
        aio write size = 32768
        disable netbios = yes
        smb ports = 445

        name resolve order = host wins bcast
        netbios name = smbtesthost
        netbios aliases = smbtestalias1
        passdb backend = tdbsam
        vfs objects = fileid
        realm = MYREALM
        security = ADS
        winbind use default domain = yes
        winbind max domain connections = 10
        winbind max clients = 1000
        winbind reconnect delay = 20
        map untrusted to domain = yes
        map to guest = never
        idmap config MYREALM : backend = nss
        idmap config MYREALM : range = 0-2000000
        idmap config MYREALM : read only = yes
        idmap config * : backend = tdb
        idmap config * : range = 3000000-4000000
        idmap config * : read only = no

        map acl inherit = yes
--
Rainer Krienke, Uni Koblenz, Rechenzentrum, A22, Universitaetsstrasse 1
56070 Koblenz, Tel: +49261287 1312 Fax +49261287 100 1312
Web: http://userpages.uni-koblenz.de/~krienke
PGP: http://userpages.uni-koblenz.de/~krienke/mypgp.html


--
To unsubscribe from this list go to the following URL and read the
instructions:  https://lists.samba.org/mailman/options/samba
Re: Performance optimizations for small files and case sensitive-option

Samba - General mailing list
On Tue, Jan 09, 2018 at 01:44:22PM +0100, Rainer Krienke via samba wrote:

> My thinking was this: if, for every file copied, Samba has to check whether
> the file already exists in any upper/lower case spelling, and that check
> is slow over NFS, then bigger caches might help. So I set
> max stat cache size and directory name cache size to higher values
> (see config below). Unfortunately this did not help (still ~300 sec).

You must canonicalize the filenames on the NFS server. Make them
all upper case, then set:

case sensitive = True
default case = upper
preserve case = no
short preserve case = no

See this (old) page for details:

https://www.samba.org/samba/docs/man/Samba-HOWTO-Collection/largefile.html
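
The canonicalization step could be prepared with a dry run along the following lines. This is a minimal sketch, not a tested migration tool: the function name plan_uppercase_renames is hypothetical, and it assumes that Python's str.upper() maps names the same way Samba's "default case = upper" would, which may not hold for all non-ASCII names. It only reports what would be renamed and which names would collide; it changes nothing on disk.

```python
import os
import tempfile
from collections import defaultdict


def plan_uppercase_renames(root):
    """Walk `root` bottom-up and return (renames, collisions).

    renames:    (old_path, new_path) pairs, children listed before parents
    collisions: (directory, [names]) where several entries would map to
                the same upper-case name and cannot be renamed safely
    """
    renames, collisions = [], []
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        by_upper = defaultdict(list)
        for name in dirnames + filenames:
            by_upper[name.upper()].append(name)
        for upper, names in sorted(by_upper.items()):
            if len(names) > 1:
                collisions.append((dirpath, sorted(names)))
            elif names[0] != upper:
                renames.append((os.path.join(dirpath, names[0]),
                                os.path.join(dirpath, upper)))
    return renames, collisions


# Small demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as demo:
    os.makedirs(os.path.join(demo, "docs"))
    open(os.path.join(demo, "docs", "readme.txt"), "w").close()
    planned, clashes = plan_uppercase_renames(demo)
    for old, new in planned:
        print("would rename", old, "->", new)
    for directory, names in clashes:
        print("collision in", directory, ":", names)
```

Because children are listed before their parent directories, the planned renames could be applied in list order without invalidating later paths. Any reported collision (such as hellO.txt next to hello.txt) would have to be resolved by hand first.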

Re: Performance optimizations for small files and case sensitive-option

Samba - General mailing list
Hello,

thank you very much for the link. Unfortunately I cannot make all files
of all our users upper case. All files stored on our file servers, for
all of our roughly 12000 accounts, can be accessed via Samba but also via
NFS from Linux. Moreover, even users who usually work with Windows
sometimes log in on a Linux machine and want to access their files there
as well.

What I am unable to understand is why the efficient behaviour of "case
sensitive = True" cannot be achieved by a cache that learns all existing
files in the current directory. Samba would then simply have to look into
this cache, and things should be as efficient as in the "case sensitive
= True" case, without disadvantages such as the need to convert all
files to e.g. upper case.

Thanks
Rainer


Re: Performance optimizations for small files and case sensitive-option

Samba - General mailing list
On Wed, Jan 10, 2018 at 09:04:06AM +0100, Rainer Krienke via samba wrote:

> What I am unable to understand is why the efficient behaviour of "case
> sensitive = True" cannot be achieved by a cache that learns all existing
> files in the current directory. Samba would then simply have to look into
> this cache, and things should be as efficient as in the "case sensitive
> = True" case, without disadvantages such as the need to convert all
> files to e.g. upper case.

Because the contents of directories change arbitrarily.
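
A small sketch of that problem, assuming a deliberately naive cache (the NaiveDirCache class is hypothetical and not how Samba caches anything): once a second writer, such as an NFS client, adds a file, a cache built from an earlier directory scan gives the wrong answer until the directory is scanned again.

```python
import os
import tempfile


class NaiveDirCache:
    """Remembers the case-folded names of one directory from a single scan."""

    def __init__(self, directory):
        self.directory = directory
        self.names = {n.casefold(): n for n in os.listdir(directory)}

    def lookup(self, name):
        """Return the on-disk spelling as of the scan, or None."""
        return self.names.get(name.casefold())


with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "hello.txt"), "w").close()
    cache = NaiveDirCache(d)

    # Another writer (for example an NFS client) adds a file that the
    # cache never hears about.
    open(os.path.join(d, "Report.TXT"), "w").close()

    stale = cache.lookup("report.txt")             # the cache says "no such file"
    fresh = NaiveDirCache(d).lookup("report.txt")  # found only after a rescan
    print("cached answer:", stale)
    print("after rescan: ", fresh)
```

To be trustworthy, such a cache would have to be revalidated on each lookup, and the rescan is exactly the cost it was meant to avoid.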
