The speed of copy is very very slow when much files in the folder

Samba - samba-technical mailing list
Hi,

I have 100,000 files in my share.
When I copy a few small files into the folder, the copy is very slow.

I added some logging to smbd, and the “get_real_filename_full_scan” function takes about 1 s. That is too slow.
I think this function compares the destination name against the existing files to check whether the destination file already exists.

When I add “case sensitive = yes” to the configuration file, the copy is much faster.

Is there some way to improve the performance?
Could you give us some suggestions for this case?

Environment:
   Ubuntu 14.04
   Samba 4.3.11

Client:
   Windows 7

Configuration:
   clustering = yes
   ctdbd socket = /var/run/ctdb/ctdbd.socket
   max protocol = SMB3
   large readwrite = yes
   idmap config *:range = 1000000-1999999
   log level = 2
   use sendfile = yes
   store dos attributes = yes
   acl_xattr:ignore system acls = yes
   aio read size = 1024
   deadtime = 10
   aio write behind = true

Could you add my Bugzilla account to the Samba domain?
Email: [hidden email]

Br,
Zhang Xiaoxu.

-------------------------------------------------------------------------------------------------------------------------------------
本邮件及其附件含有杭州华三通信技术有限公司的保密信息,仅限于发送给上面地址中列出
的个人或群组。禁止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、
或散发)本邮件中的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本
邮件!
This e-mail and its attachments contain confidential information from H3C, which is
intended only for the person or entity whose address is listed above. Any use of the
information contained herein in any way (including, but not limited to, total or partial
disclosure, reproduction, or dissemination) by persons other than the intended
recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender
by phone or email immediately and delete it!

Re: The speed of copy is very very slow when much files in the folder

On Fri, Apr 28, 2017 at 12:36:27PM +0000, Zhangxiaoxu via samba-technical wrote:

> I have 100,000 files in my share.
> When I copy a few small files into the folder, the copy is very slow.
>
> I added some logging to smbd, and the “get_real_filename_full_scan” function takes about 1 s. That is too slow.
> I think this function compares the destination name against the existing files to check whether the destination file already exists.
>
> When I add “case sensitive = yes” to the configuration file, the copy is much faster.
>
> Is there some way to improve the performance?
> Could you give us some suggestions for this case?
>
> Environment:
>        Ubuntu 14.04
>        Samba 4.3.11
>
> Client:
>        Windows 7
>
> Configuration:
>        clustering = yes

What's your file system?

Volker

Re: The speed of copy is very very slow when much files in the folder

On Fri, Apr 28, 2017 at 5:50 AM, vl--- via samba-technical
<[hidden email]> wrote:

> What's your file system?
>
> Volker
>
In addition, this is the same problem that someone from H3C asked
about a couple of days ago.

It is a consequence of implementing NTFS semantics (case-preserving,
case-insensitive) on a case-sensitive file system.

Volker is asking that question because some Linux file systems can
provide case-insensitive semantics for lookup.

--
Regards,
Richard Sharpe
(What can dispel sorrow? Only Du Kang. --Cao Cao)

Re: The speed of copy is very very slow when much files in the folder

On Fri, Apr 28, 2017 at 8:36 AM, Zhangxiaoxu via samba-technical
<[hidden email]> wrote:
> Hi,
>
> I have 100,000 files in my share.
> When I copy a few small files into the folder, the copy is very slow.

In one directory, with no subdirectories? Maybe you should stop doing
this? No matter how clever the network file system, getting the
directory information sorted out and even sorting out permissions is
complicated by so many files in one directory. It can be workable,
especially if you're not trying to sort out a list of the files, but
it's not free.

> I added some logging to smbd, and the “get_real_filename_full_scan” function takes about 1 s. That is too slow.
> I think this function compares the destination name against the existing files to check whether the destination file already exists.
>
> When I add “case sensitive = yes” to the configuration file, the copy is much faster.

Well, yes. Finding all the file names and sorting them is
computationally expensive, especially to verify whether a mixed case
file in that long list of files matches yours and there is a conflict.

> Is there some way to improve the performance?
> Could you give us some suggestions for this case?

You just said one: "case sensitive = yes". The other is "don't put
that many files in one directory without splitting it out into
subdirectories".
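
One way to split a flat directory into subdirectories is to bucket files by a hash of their names. The following is only a minimal sketch and nothing in this thread prescribes it: the helper names `shard_dir_for` and `shard_directory`, the choice of SHA-1, and the two-hex-digit fan-out are all illustrative assumptions. The name is case-folded before hashing so that names differing only by case land in the same bucket.

```python
import hashlib
import os
import shutil


def shard_dir_for(name: str, width: int = 2) -> str:
    """Return the bucket for a file: the first `width` hex digits of a
    hash of its case-folded name (hypothetical scheme, for illustration)."""
    digest = hashlib.sha1(name.casefold().encode("utf-8")).hexdigest()
    return digest[:width]


def shard_directory(root: str, width: int = 2) -> int:
    """Move every regular file directly under `root` into root/<bucket>/.

    Returns the number of files moved.
    """
    moved = 0
    # Snapshot the listing first so that bucket directories created
    # during the loop are not visited.
    for entry in list(os.scandir(root)):
        if not entry.is_file(follow_symlinks=False):
            continue
        bucket = os.path.join(root, shard_dir_for(entry.name, width))
        os.makedirs(bucket, exist_ok=True)
        shutil.move(entry.path, os.path.join(bucket, entry.name))
        moved += 1
    return moved
```

With `width=2` there are at most 256 buckets, so 100,000 files would average roughly 390 per directory, assuming the hash spreads names evenly. Whatever process writes and reads the files has to compute the same bucket, which is the real cost of this approach.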


Re: The speed of copy is very very slow when much files in the folder

On Fri, Apr 28, 2017 at 7:03 PM, Nico Kadel-Garcia via samba-technical
<[hidden email]> wrote:

> You just said one: "case sensitive = yes". The other is "don't put
> that many files in one directory without splitting it out into
> subdirectories".

The problem with "case sensitive = yes" is that it is only for file
systems that are case insensitive. Yes, that sounds backwards, but
that is what it is for.

While it will solve the copying problem if you don't actually have a
case-insensitive file system, it introduces other big issues. One is
that it will actually let you create two files whose names
differ only by case, e.g., File1.txt and file1.txt. Then you have a
problem, because which one you get will depend on the exact case used
when asking for the file. These (and worse) are the dragons I
mentioned.

To use "case sensitive = yes" you need a file system that supports
case-insensitive lookups, like XFS, ZFS (on Linux) and GPFS, it seems.

--
Regards,
Richard Sharpe
(What can dispel sorrow? Only Du Kang. --Cao Cao)

Re: The speed of copy is very very slow when much files in the folder

On Fri, Apr 28, 2017 at 11:25 PM, Richard Sharpe via samba-technical
<[hidden email]> wrote:

> The problem with "case sensitive = yes" is that it is only for file
> systems that are case insensitive. Yes, that sounds backwards, but
> that is what it is for.
>
> While it will solve the copying problem if you don't actually have a
> case-insensitive file system, it introduces other big issues. One is
> that it will actually let you create two files whose names
> differ only by case, e.g., File1.txt and file1.txt. Then you have a
> problem, because which one you get will depend on the exact case used
> when asking for the file. These (and worse) are the dragons I
> mentioned.

Getting the file you asked for, with the spelling you asked for,
rather than trying to mangle case to match a file with an alternative
name is what I'd *want* and expect. The ancient use of case
insensitive file systems is both confusing and dangerous. This is not
what I would consider "a problem", I'd consider it desired behavior.

> To use "case sensitive = yes" you need a file system that supports
> case-insensitive lookups, like XFS, ZFS (on Linux) and GPFS, it seems.

Respectfully, you could and should give up on case insensitive file
names. They are an unnecessary computational and file system
organizational burden.

Re: The speed of copy is very very slow when much files in the folder

Hi,

File System: EXT4

The shared folder was created on the system disk.


Re: Re: The speed of copy is very very slow when much files in the folder

On Sat, Apr 29, 2017 at 1:50 AM, Zhangxiaoxu via samba-technical
<[hidden email]> wrote:
> Hi,
>
> File System: EXT4
>
> The shared folder was created on the system disk.

As the Doctor says: When it hurts, don't do that.

EXT4 is not the file system you are looking for.

--
Regards,
Richard Sharpe
(What can dispel sorrow? Only Du Kang. --Cao Cao)

Re: Re: The speed of copy is very very slow when much files in the folder

On Sat, Apr 29, 2017 at 08:50:57AM +0000, Zhangxiaoxu wrote:
> File System: EXT4
>
> The shared folder was created on the system disk.

Ok, I had hoped that you were an OEM with a special file system that
you have control over. With EXT4, the behaviour you describe is "works
as designed".

You can of course always create a special share with the option to set
case sensitivity for the directories that have a problem.

Samba just has to implement case insensitive behavior on a case
sensitive file system, and this only works with traversals unless the
file system has a special case-insensitive index behind the scenes.

Volker
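
To illustrate the traversal described above: on a case-sensitive file system an exact-name lookup is a single call, while a case-insensitive lookup that misses has to read every directory entry before it can conclude that no variant of the name exists. The sketch below is a rough Python illustration of that idea and is not Samba's code; `find_exact` and `find_case_insensitive` are hypothetical helpers, and `str.casefold` merely stands in for whatever case mapping an SMB server really uses.

```python
import os
from typing import Optional


def find_exact(directory: str, name: str) -> Optional[str]:
    """Exact-name lookup: a single lstat(), independent of directory size."""
    path = os.path.join(directory, name)
    return name if os.path.lexists(path) else None


def find_case_insensitive(directory: str, name: str) -> Optional[str]:
    """Case-insensitive lookup by traversal.

    Returns the on-disk spelling of the first entry that matches `name`
    ignoring case, or None after every entry has been examined. A miss,
    which is the normal outcome when a new file is being created, always
    costs a full pass over the directory.
    """
    wanted = name.casefold()
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.name.casefold() == wanted:
                return entry.name
    return None
```

In a directory holding 100,000 entries, each miss in `find_case_insensitive` compares against all 100,000 names, whereas `find_exact` does the same amount of work regardless of how many files are present.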

Re: Re: The speed of copy is very very slow when much files in the folder

On Mon, May 1, 2017 at 2:42 AM, vl--- via samba-technical
<[hidden email]> wrote:

> On Sat, Apr 29, 2017 at 08:50:57AM +0000, Zhangxiaoxu wrote:
>> File System: EXT4
>>
>> The shared folder was created on the system disk.
>
> Ok, I had hoped that you were an OEM with a special file system that
> you have control over. With EXT4, the behaviour you describe is "works
> as designed".
>
> You can of course always create a special share with the option to set
> case sensitivity for the directories that have a problem.
>
> Samba just has to implement case insensitive behavior on a case
> sensitive file system, and this only works with traversals unless the
> file system has a special case-insensitive index behind the scenes.

Or: abandon case insensitivity. A directory that has 100,000 files
with no subdirectories is typically machine generated, as part of
some process. An audit for case-insensitive name collisions, and a
review of the process that generates or uploads such files, might be
easy to set up for Zhangxiaoxu's environment.
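
Such an audit could look roughly like the sketch below. It is an illustration only: `collisions_in_names` and `case_collisions` are hypothetical helper names, and `str.casefold` is assumed to be a close enough stand-in for the case mapping that Windows clients apply.

```python
import os
from collections import defaultdict
from typing import Dict, Iterable, List


def collisions_in_names(names: Iterable[str]) -> Dict[str, List[str]]:
    """Group names that differ only by case.

    Returns a mapping from the case-folded name to the sorted list of
    spellings, restricted to groups with more than one spelling.
    """
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    return {key: sorted(found) for key, found in groups.items() if len(found) > 1}


def case_collisions(directory: str) -> Dict[str, List[str]]:
    """Run the collision check over the entries of one directory."""
    with os.scandir(directory) as entries:
        return collisions_in_names(entry.name for entry in entries)
```

An empty result from `case_collisions` on the share's directory means no two existing names differ only by case. It says nothing about names that clients may create or request later.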

Re: The speed of copy is very very slow when much files in the folder

On Sun, Apr 30, 2017 at 6:31 AM, Nico Kadel-Garcia <[hidden email]> wrote:

> On Sat, Apr 29, 2017 at 9:58 AM, Richard Sharpe
> <[hidden email]> wrote:
>> On Fri, Apr 28, 2017 at 9:18 PM, Nico Kadel-Garcia <[hidden email]> wrote:
>>> On Fri, Apr 28, 2017 at 11:25 PM, Richard Sharpe via samba-technical
>>> <[hidden email]> wrote:
>
>>>> The problem with "case sensitive = yes" is that it is only for file
>>>> systems that are case insensitive. Yes, that sounds backwards, but
>>>> that is what it is for.
>>>>
>>>> While it will solve the copying problem if you don't actually have a
>>>> case-insensitive file system, it introduces other big issues. One is
>>>> that it will actually let you create two files whose names
>>>> differ only by case, e.g., File1.txt and file1.txt. Then you have a
>>>> problem, because which one you get will depend on the exact case used
>>>> when asking for the file. These (and worse) are the dragons I
>>>> mentioned.
>>>
>>> Getting the file you asked for, with the spelling you asked for,
>>> rather than trying to mangle case to match a file with an alternative
>>> name is what I'd *want* and expect. The ancient use of case
>>> insensitive file systems is both confusing and dangerous. This is not
>>> what I would consider "a problem", I'd consider it desired behavior.
>>>
>>>> To use "case sensitive = yes" you need a file system that supports
>>>> case-insensitive lookups, like XFS, ZFS (on Linux) and GPFS, it seems.
>>>
>>> Respectfully, you could and should give up on case insensitive file
>>> names. They are an unnecessary computational and file system
>>> organizational burden.
>>
>> Excuse me while I regard you as a sciolist.
>
> I actually had to look that one up! A simple "you don't know what
> you're talking about" would have sufficed. And sorry, but I've
> considerable experience dealing with filesystems as part of my
> professional responsibilities since..... dear lord, 1988? And with
> Samba since roughly 1993, when I ported it to SunOS and published
> notes.
>
> Large directories with many files in the same folder have *always*
> been a burden. Filesystems have improved in their ability to handle
> them: I'll note that ext, in particular, has improved in this regard
> since its earliest releases. But attempts to list, to analyze, and
> especially to sort these large directories are a real computational
> burden for the kernel itself. Layering complexity into the kernel to
> optimize performance for large directories more efficiently is not
> free, and has often degraded performance for more modest operations
> and even destabilized the file system itself. ext3, for example, is
> *still* a faster filesystem than ext4 if you don't require the
> journaling (for failure recovery) or improved handling of large
> directories (such as our original poster is seeking). Large
> directories are an old, old problem, with many references.

Wow, all that experience but you still do not understand that Windows
does not have that problem. Hell, even ZFS added case-insensitive
lookups for that reason and I made it work for ZFS on Linux a couple
of years ago (with a lot of help from others, of course).

The point is that Windows does not show this sort of O(N^2) behavior and can
handle case-insensitivity with large numbers of files in directories.
If you want to deploy an enterprise-grade solution with Samba, you
cannot tell customers that they need to change their environment to
suit your product.

--
Regards,
Richard Sharpe
(What can dispel sorrow? Only Du Kang. --Cao Cao)
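
The O(N^2) figure can be made concrete with a little arithmetic. The sketch below assumes, as a simplification, that creating each new file triggers exactly one full scan of the destination directory and that nothing is cached between creates; both function names are hypothetical.

```python
def entries_scanned(existing: int, new_files: int) -> int:
    """Total directory entries examined while creating `new_files` files in
    a directory that starts with `existing` entries, when each create
    scans the whole directory once (simplifying assumption)."""
    total = 0
    for i in range(new_files):
        # The i-th create sees the original entries plus the i files
        # created before it.
        total += existing + i
    return total


def entries_scanned_closed_form(existing: int, new_files: int) -> int:
    """Same quantity in closed form: N*M + N*(N-1)/2."""
    return new_files * existing + new_files * (new_files - 1) // 2
```

Under these assumptions, copying 1,000 files into a directory that already holds 100,000 entries examines 100,499,500 entries in total, and filling an empty directory with N files costs N(N-1)/2, which is where the quadratic growth comes from.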
