On Sun, 2011-12-18 at 06:17 -0300, Fernando Cassia wrote:
On Sun, Dec 18, 2011 at 04:14, Joe Zeff <joe(a)zeff.us> wrote:
> > Basically, the system tries to find a place big enough to hold the entire
> > file instead of putting the first chunk into the first place it finds.
> Please confirm if I understood this right, as I'm not familiar with
> the low-level APIs involved with file creation.
No offense, but my advice to anyone playing around with Linux or Unix
systems at a level beyond the end-user is to read up on the basics of
the core file APIs, even if you're never going to write a program.
Understanding how the file abstraction works is to my mind a question of
basic culture around here. As a piece of engineering design balancing
functional elegance with practicality it's a wonderful thing to
contemplate and a key element of the success of the Unix model,
especially when you compare it to the competition (of course by now the
competition has essentially lifted the best parts but it wasn't always
that way).
You should at least look at creat(2), open(2), lseek(2), read(2),
write(2) and unlink(2).
Anyway ...
> Is there a way to tell
> the filesystem that you're creating a file with total size "x" before
> any such data is written to it, I mean, as part of the file creation
> call?
No.
> I mean, it is one thing to create a file with size 0, then start
> appending data to it in chunks, and another to say "hey, I'm creating an
> 8-gigabyte-long file, with name xyz". If the latter exists, I'm
> curious whether there's logic at the filesystem level to try to find a
> chunk of free space big enough to allocate it (to reduce
> fragmentation).
Some filesystem implementations may allow this and some not, but the
basic APIs are a lowest common denominator and don't include it
directly. Whether or not you preallocate space has no effect on the
semantics of accessing the file, so it's an optimization issue and
different implementations may do it in different ways, e.g. some may use
extents -- so they always preallocate a certain minimum amount -- but
not all do.
> Is that what you are saying?
> I do know that, for instance, some BitTorrent clients (Vuze, formerly
> Azureus, comes to mind) allocate the full size of the file being
> retrieved, then start populating (writing) segments as those are
> downloaded, but I never knew if the file creation call was a single
> one or whether it actually consisted of the file creation call first,
> and then a write of the x gigabytes of zeroes...
The BT clients do this not for speed optimization (irrelevant for this
use case) but as a way of reserving space. That way there's no danger of
running out of room in the middle of a large BT transfer.
> I can't believe that in this day and age (I briefly looked at the
> Win32 API and it seems there's no API to create a fixed-size empty
> file) there's no API for this, and that one has to rely on a per-app
> implementation (i.e. filling zeroes).
Believe it. However, don't think that allocation is done by writing
zeroes; in some implementations, writing a block- or extent-aligned
buffer of zeroes won't actually send any data to the disk.
Also, take a look at fallocate(2), but note that it's Linux-specific.
> Why am I asking this? Because of this lament about the lack of a
> "mkfile" command in Linux as there is in Solaris:
> http://madbodger.livejournal.com/114433.html
Mkfile is a *command*, i.e. a program written using the API. You could
just as easily write mkfile in Linux (maybe someone has done it, I don't
know).
> Just curious... (I know, you will tell me "it isn't the job of a
> filesystem to populate the contents of an empty file!"). And maybe
> you'd be right. Still, I wonder if perhaps fixed-size, empty-file
> creation wouldn't be much faster if it was implemented at the
> filesystem level.
Consider the following program:
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd;
    fd = creat("myfile", 0666);      /* create an empty file */
    lseek(fd, 100000, SEEK_END);     /* seek 100000 bytes past the end */
    write(fd, "end", 4);             /* write 4 bytes there */
    return 0;
}
Save as (say) hole.c and do:
$ make hole
$ ./hole
$ ls -l myfile
$ du myfile
$ cat myfile
Now see if you understand what's happening.
poc