
Hello, I seem to get intermittent systemd timeout errors when trying to mount btrfs, which throw the system into systemd rescue mode, e.g. "Job dev-sdb1.device/start timed out." I can't actually see any reason for the problem in the systemd journal. Furthermore, "mount -a" from rescue mode works perfectly. Maybe the hard disks just aren't "waking up" fast enough in the boot sequence. I have reported slowness here before when the system is cold, which I suspect is hard disk related (spinning WD disks). Hmm. It is now booting fine again; maybe I need to leave the hard disks on for a while before they start working properly... smartctl reports nothing abnormal. Any ideas? Maybe time to get new hard disks? Thanks
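For reference, a minimal sketch of what can be checked from rescue mode, assuming the failing unit is dev-sdb1.device as in the error above (adjust names to match):

systemctl status dev-sdb1.device                  # current state of the device unit that timed out
journalctl -b | grep -iE 'sdb|btrfs|timed out'    # kernel/udev/systemd messages for this boot
smartctl -a /dev/sdb                              # full SMART report rather than just the health summary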

I had that happen occasionally, but in my case when btrfs failed to mount it was because the disk never started up at all, and it was always the same disk.
Are the disks in question spun up by the BIOS, or by Linux?
Is it always the same disk, or a different disk each time?
James

On Fri, 24 Jul 2015 at 21:07 James Harper <james@ejbdigital.com.au> wrote:
I had that happen occasionally, but in my case when btrfs failed to mount it was because the disk never started up at all, and it was always the same disk.
Yes, that would make more sense.
Are the disks in question spun up by the BIOS, or by Linux?
Not sure. I would have assumed the BIOS spins up all the disks it detects before starting GRUB. I am not exactly keeping up with modern BIOS trends, though (it is a UEFI boot), so I could be wrong here. The disks containing the btrfs partitions are not used for booting, and are not used for / either.
Is it always the same disk, or a different disk each time?
Same disk, which isn't terribly surprising - I only have one btrfs filesystem. It is RAID1; however, I refer to /dev/sdb1 in /etc/fstab - which should automatically bring in /dev/sdc1 if my understanding is correct. I am very doubtful it is btrfs' fault; regardless, it seems very odd that I only encountered hard failures after converting to btrfs.
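For reference, a quick way to confirm which devices btrfs considers members of the array (device name as above, purely an example):

btrfs filesystem show /dev/sdb1    # lists every device belonging to the filesystem that contains sdb1
btrfs filesystem show              # or list all btrfs filesystems the kernel currently knows about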

On Fri, 24 Jul 2015 at 21:07 James Harper <james@ejbdigital.com.au> wrote:
Are the disks in question spun up by the BIOS, or by Linux?
Not sure.
I would have assumed the BIOS spins up all the disks it detects before starting GRUB.
I am not exactly keeping up with modern BIOS trends, though (it is a UEFI boot), so I could be wrong here. The disks containing the btrfs partitions are not used for booting, and are not used for / either.
I think it would spin up all the disks it knew about. If you had a SAS card or something with its own BIOS then things might be different. Also, spinning up a disk is a pretty quick thing, and would only add a second or two to the mount time.
James

On Sun, Jul 26, 2015 at 12:47:58AM +0000, Brian May wrote:
Is it always the same disk, or a different disk each time?
Same disk, which isn't terribly surprising - I only have one btrfs filesystem. It is RAID1; however, I refer to /dev/sdb1 in /etc/fstab -
do you get the same error if you edit /etc/fstab to refer to /dev/sdc1 ? if that works, it may be that sdb is slower at spinning up than sdc.
which should automatically bring in /dev/sdc1 if my understanding is correct.
yep. btrfs doesn't care which drive/partition of an array you tell it to mount, it'll find the other members and assemble the array anyway.
craig
ps: if you haven't already done so, check the power and both ends of the data cables for the drives - loose connections can cause all sorts of annoying intermittent problems.
--
craig sanders <cas@taz.net.au>
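Along the same lines, a sketch of making the fstab entry independent of which /dev/sdX name the kernel hands out; the UUID and mount point below are placeholders, not values from this thread:

blkid /dev/sdb1        # note the filesystem UUID (both RAID1 members report the same one)
btrfs device scan      # make sure the kernel has registered every member device
# /etc/fstab entry using the UUID instead of a device name:
# UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /data  btrfs  defaults  0  0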

On Sun, 26 Jul 2015 at 10:59 Craig Sanders <cas@taz.net.au> wrote:
do you get the same error if you edit /etc/fstab to refer to /dev/sdc1 ?
Yes, except now the error is for sdc1, not sdb1.
ps: if you haven't already done so, check the power and both ends of the data cables for the drives - loose connections can cause all sorts of annoying intermittent problems.
Good suggestion. I have never opened up this computer; I probably should check. Also, the disks are standard SATA, so I think the BIOS should detect them and power them up.

On 26 July 2015 at 10:47, Brian May <brian@microcomaustralia.com.au> wrote:
On Fri, 24 Jul 2015 at 21:07 James Harper <james@ejbdigital.com.au> wrote:
Is it always the same disk, or a different disk each time?
Same disk, which isn't terribly surprising - I only have one btrfs filesystem. It is RAID1; however, I refer to /dev/sdb1 in /etc/fstab - which should automatically bring in /dev/sdc1 if my understanding is correct.
I am very doubtful it is btrfs' fault; regardless, it seems very odd that I only encountered hard failures after converting to btrfs.
I have the same issue, with btrfs on /dev/sdb1 and /dev/sdc1. Around half the time, it doesn't mount on boot. When debugging, I discovered that it does actually mount for less than a second and then it automatically unmounts. Changing from /dev/sdb1 to /dev/sdc1 made no difference. I asked on the #btrfs irc channel recently and there was a suggestion that the issue was related to udev rules: http://logs.tvrrug.org.uk/logs/%23btrfs/2015-06-11.html#2015-06-11T12:07:48 but I haven't had time to do any further investigation.
--
Marcus Furlong
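For reference, a rough sketch of catching whatever unmounts it, assuming a hypothetical mount point of /data and reproducing the problem by hand (run the monitors in separate terminals):

udevadm monitor --kernel --udev    # print uevents for the disks as they happen
journalctl -f                      # follow systemd and kernel messages at the same time
mount /dev/sdb1 /data
systemctl status data.mount        # systemd's view of the mount unit for /data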

On Mon, 27 Jul 2015 at 10:49 Marcus Furlong <furlongm@gmail.com> wrote:
I have the same issue, with btrfs on /dev/sdb1 and /dev/sdc1. Around half the time, it doesn't mount on boot. When debugging, I discovered that it does actually mount for less than a second and then it automatically unmounts. Changing from /dev/sdb1 to /dev/sdc1 made no difference.
Sounds like my problem. How did you work out that systemd unmounts it?
I asked on the #btrfs irc channel recently and there was a suggestion that the issue was related to udev rules:
So no solution or workaround then? Thanks

On 27 July 2015 at 12:19, Brian May <brian@microcomaustralia.com.au> wrote:
On Mon, 27 Jul 2015 at 10:49 Marcus Furlong <furlongm@gmail.com> wrote:
I have the same issue, with btrfs on /dev/sdb1 and /dev/sdc1. Around half the time, it doesn't mount on boot. When debugging, I discovered that it does actually mount for less than a second and then it automatically unmounts. Changing from /dev/sdb1 to /dev/sdc1 made no difference.
Sounds like my problem. How did you work out that systemd unmounts it?
I haven't yet worked out what unmounts it exactly. But to see if it mounted, I did something like the following:
for i in `seq 1 100` ; do mount /dev/sdb1 ; mount | grep sdb1 ; done
and on some iterations it would be mounted, but unmounted on subsequent iterations.
I asked on the #btrfs irc channel recently and there was a suggestion that the issue was related to udev rules:
So no solution or workaround then?
My current workaround is to keep mounting in rc.local until it stays mounted for longer than 10 seconds.
Marcus.
--
Marcus Furlong
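A rough sketch of the kind of rc.local retry loop described above; the device and mount point are examples only:

# keep retrying until the filesystem is still mounted 10 seconds after a successful mount
until mountpoint -q /data && sleep 10 && mountpoint -q /data; do
    mount /dev/sdb1 /data 2>/dev/null
    sleep 2
done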

On Mon, 27 Jul 2015 at 12:40 Marcus Furlong <furlongm@gmail.com> wrote:
I haven't yet worked out what unmounts it exactly. But to see if it mounted, I did something like the following:
for i in `seq 1 100` ; do mount /dev/sdb1 ; mount | grep sdb1 ; done
and on some iterations it would be mounted, but unmounted on subsequent iterations.
I was going to try debugging this further this cold morning, but it seems it is working perfectly. I believe that the default x-systemd.timeout is 90 seconds. That should be more than enough, I think. mount -a works much faster.
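For reference, a sketch of checking where boot time actually goes, assuming a hypothetical mount unit name of data.mount:

systemd-analyze blame | head -20            # units sorted by how long they took to start
systemd-analyze critical-chain data.mount   # what the mount unit waited on, and for how long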

On Mon, 27 Jul 2015 11:12:52 PM Brian May wrote:
I believe that the default x-systemd.timeout is 90 seconds. That should be more than enough, I think. mount -a works much faster.
Mount times of 30 minutes have been reported for a filesystem that's ~82TB large with ~62TB of data.
One of the Fujitsu developers responded with:
# Quite common, especial when it grows large.
# But it would be much better to use ftrace to show which btrfs
# operation takes the most time.
#
# We have some guess on this, from reading space cache to reading chunk
# info. But didn't know which takes the most of time.
cheers,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
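A rough sketch of the ftrace approach mentioned above, assuming root and a mounted debugfs; the device and mount point are examples:

cd /sys/kernel/debug/tracing
echo function_graph > current_tracer
echo 'btrfs_*' > set_ftrace_filter    # restrict tracing to btrfs functions
echo 1 > tracing_on
mount /dev/sdb1 /data
echo 0 > tracing_on
less trace                            # look for the calls with the largest durations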

Chris Samuel wrote:
Mount times of 30 minutes have been reported for a filesystem that's ~82TB large with ~62TB of data.
I wonder how many subvolumes were involved.
It took a bit (2 or 3 seconds) longer when I had a lot of ZFS filesystems to mount (maybe between 50 and 100, I guess now). That was on 2 terabytes at most, so smaller than the reported overall size.
Regards
Peter

Brian May writes:
Are the disks in question spun up by the BIOS, or by Linux?
I would have assumed the BIOS spins up all the disks it detects before starting GRUB.
Apropos? ==> https://wiki.archlinux.org/index.php/Improve_Boot_Performance#Staggered_spin...
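For reference, a quick way to see whether a drive has spun down and how long it takes to wake; the device name is an example:

hdparm -C /dev/sdb                                            # reports active/idle or standby
time dd if=/dev/sdb of=/dev/null bs=1M count=1 iflag=direct   # forces a real read, timing any spin-up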

On Fri, 24 Jul 2015 10:21:49 AM Brian May wrote:
I seem to get intermittent systemd timeout errors when trying to mount btrfs, which throw the system into systemd rescue mode.
Yeah, btrfs can do that as it can take a while to mount a filesystem, it's been mentioned on the btrfs list these last few days.
# 50% of the time when booting, the system go in safe mode because
# my 12x 4TB RAID10 btrfs is taking too long to mount from fstab.
to which there was the reply:
# man systemd.mount, search for ”x-systemd.device-timeout=”.
#
# But maybe that's a hint for developers, long mount time for
# btrfs are quite common – it would be cool if it could be reduced.
Best of luck,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
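For reference, a sketch of what that fstab option looks like; the mount point and timeout value are only examples:

# /etc/fstab
/dev/sdb1  /data  btrfs  defaults,x-systemd.device-timeout=300s  0  0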

On Sun, 26 Jul 2015 at 14:38 Chris Samuel <chris@csamuel.org> wrote:
Yeah, btrfs can do that as it can take a while to mount a filesystem, it's been mentioned on the btrfs list these last few days.
The systemd timeout appears to be 90 seconds, and I would like to think 90 seconds is enough. Of course, maybe it is another systemd timeout that fails, and it doesn't become apparent until the 90 seconds elapse. I am a bit doubtful. "time mount -a" reports 1.014 seconds real time.
# 50% of the time when booting, the system go in safe mode because
# my 12x 4TB RAID10 btrfs is taking too long to mount from fstab.
to which there was the reply:
# man systemd.mount, search for ”x-systemd.device-timeout=”.
#
# But maybe that's a hint for developers, long mount time for
# btrfs are quite common – it would be cool if it could be reduced.
Wonder what the default is?
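One way to check the timeouts systemd is actually applying; the unit name is an example:

systemctl show | grep -i timeout                   # manager-wide defaults (DefaultTimeoutStartUSec etc.)
systemctl show dev-sdb1.device -p JobTimeoutUSec   # job timeout on the device unit itself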
participants (7):
- Brian May
- Chris Samuel
- Craig Sanders
- James Harper
- Marcus Furlong
- Peter Ross
- trentbuck@gmail.com