
Linux has used demand-loading of executables and shared objects for ages. This means that if an executable or shared object has some pages that happen to never get called (i.e. for corner conditions that don't happen in the usual case) then they never get read from disk. The same applies to debugging data when the object isn't being debugged. Even if you have unused data on the same pages as executable data, thus increasing the number of executable pages loaded into RAM, it shouldn't be a big deal, as RAM keeps getting bigger so that only a small portion of the multi-gigabytes of RAM in a low-end system is used for executable pages.
Disks are getting bigger all the time. Nowadays it would be silly to consider purchasing a disk for a desktop system that's less than 2TB in size, and laptops have had hundreds of GB for ages. Currently the biggest root filesystem on a system I run is 12G; that is 0.6% of the space on a desktop disk you might purchase (*) and less than 10% of the low-end laptop disks that were on sale a couple of years ago. If I had to make that 24G for the root filesystem it wouldn't be a big deal.
In terms of reducing binary size there has been some discussion about a port of Linux that uses 32-bit instructions with 64-bit data operations and registers on the AMD64 architecture. The idea is to save RAM and TLB entries by not using 64/64 while still getting some performance benefits of a 64-bit CPU. But there is little interest in this and it seems that Debian won't support it due to no one caring.
So the question is, why strip binaries? Back in the days when we ran servers with 100MB hard drives there was a real need to save space. When a 128Kb/64Kb ADSL link was considered fast there was a real need to reduce download time. But now most of us have ADSL links that allow 100KB/s (800Kb/s) UPLOAD speeds and significantly faster download speeds.
The thing about the debugging symbol table is that you never know when you will need it. Having it always there seems to have no cost that matters but it can provide significant benefits. So why ship a program or shared object that's stripped?
(*) The system in question has a 160G disk I got from a junk pile. But as half the disk space is on my /junk filesystem and there is still unallocated space I think this supports my general point.
--
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/

On 29/05/12 22:50, Russell Coker wrote:
Disks are getting bigger all the time. Nowadays it would be silly to consider purchasing a disk for a desktop system that's less than 2TB in size and
Except that for anything I'm building now I'll want an SSD, even desktops.
I have three servers which still use spinning boot disks (two with clients use RAID1, one personal one doesn't), and one with an SSD (Debian GNU/kFreeBSD ZFS NAS, SSD also used for ZIL/L2ARC). My two T410s use SSDs, and even my MacBook uses a Seagate hybrid drive. My desktop is also SSD boot.
Two of the servers date from 2009 and one from 2007. I didn't become comfortable with SSDs until the 2010 election (which is when I installed the SSD currently in my laptop, which according to SMART still has 38% of life left).
I still don't use huge root filesystems, although on new ones I'd use 20GB. My laptop has a 15GB root and that's starting to get tight, although I keep all sorts of random packages installed.
The thing about the debugging symbol table is that you never know when you will need it. Having it always there seems to have no cost that matters but it can provide significant benefits. So why ship a program or shared object that's stripped?
Ubuntu at least does this by storing all symbols for all relevant package versions "in the cloud" so you can use them when needed. I think Debian were planning to deploy this as well, but I've no idea if that happened.

On Tue, 29 May 2012 23:09:32 +1000, Julien Goodwin <luv-lists@studio442.com.au> wrote:
Ubuntu at least does this by storing all symbols for all relevant package versions "in the cloud" so you can use them when needed. I think Debian were planning to deploy this as well, but I've no idea if that happened.
FWIW the "in the cloud" is how Microsoft has stored symbols for most of the Windows stack for at least 10 years.
Basically, debuggers can download them on demand - kind of useful if you find yourself in WinDBG (which I swear is where I've spent the majority of my Windows using time in the past decade).
-- Stewart Smith

On Tue, 29 May 2012 23:09:32 +1000, Julien Goodwin <luv-lists@studio442.com.au> wrote:
Ubuntu at least does this by storing all symbols for all relevant package versions "in the cloud" so you can use them when needed. I think Debian were planning to deploy this as well, but I've no idea if that happened.
FWIW the "in the cloud" is how Microsoft has stored symbols for most of the Windows stack for at least 10 years.
Basically, debuggers can download them on demand - kind of useful if you find yourself in WinDBG (which I swear is where I've spent the majority of my Windows using time in the past decade).
Me too. For all its faults, WinDBG is a beautiful thing. Kernel debugging would be so much harder without it!
James

Stewart Smith <stewart@flamingspork.com> wrote:
FWIW the "in the cloud" is how Microsoft has stored symbols for most of the Windows stack for at least 10 years.
Basically, debuggers can download them on demand - kind of useful if you find yourself in WinDBG (which I swear is where I've spent the majority of my Windows using time in the past decade).
As I remember, tools such as Apport (developed by Ubuntu, heading toward Debian - see the python-apport package) store the symbols on the machine that decodes the generated report, not on the machine where the segfault happens.
From memory, the data retrieved after the crash isn't a full core file, but I suppose it's enough to obtain a backtrace.

Stewart Smith wrote:
On Tue, 29 May 2012 23:09:32 +1000, Julien Goodwin <luv-lists@studio442.com.au> wrote:
Ubuntu at least does this by storing all symbols for all relevant package versions "in the cloud" so you can use them when needed. I think Debian were planning to deploy this as well, but I've no idea if that happened.
FWIW the "in the cloud" is how Microsoft has stored symbols for most of the Windows stack for at least 10 years.
Basically, debuggers can download them on demand - kind of useful if you find yourself in WinDBG (which I swear is where I've spent the majority of my Windows using time in the past decade).
You can download them on-demand in Debian, too. They're the -dbg packages. (Well, OK, you need to have enabled core dumps in advance, or to rerun the app. So probably not as awesome as what you describe in windbg.)
The reason why they're separate packages, and not installed by default, is because they're a non-negligible increase in size. To take an example at random:
    twb@cigar:/srv/apt/debian/pool/main$ du -sch o/otcl/*deb | tail -1
    660K total
    twb@cigar:/srv/apt/debian/pool/main$ du -sch o/otcl/*dbg*deb | tail -1
    148K total
That's 22% -- nearly a quarter -- of the package's space, consumed by debugging symbols alone. But OK, maybe that's because it's such a small package. How about a bigger one?
    twb@cigar:/srv/apt/debian/pool/main$ du -sch a/apache2/*dbg*deb | tail -1
    11M total
    twb@cigar:/srv/apt/debian/pool/main$ du -sch a/apache2/*deb | tail -1
    25M total
Hmm, now we're up to 44% -- nearly half! -- of the space being used by debugging symbols. OK, what about the whole of Debian?
    twb@cigar:/srv/apt/debian/pool/main$ find -name \*.deb -print0 | du -sch --files0-from - | tail -1
    130G total
    twb@cigar:/srv/apt/debian/pool/main$ find -name \*dbg*.deb -print0 | du -sch --files0-from - | tail -1
    46G total
35% of Debian's entire binary mirror[*] is debugging symbols!
[*] OK technically I am only mirroring main for i386 and amd64, and exclude sources. But you get the idea.
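Each percentage above is the -dbg total divided by the total for all .deb files in the same directory (the *deb glob also matches the -dbg packages, so that total already includes them). A minimal sketch of the arithmetic, using only the du figures quoted above; the function name debug_share is made up for illustration:

```python
def debug_share(dbg_size, total_size):
    """Return the -dbg packages' share of the total, as a percentage."""
    if total_size <= 0:
        raise ValueError("total_size must be positive")
    return 100.0 * dbg_size / total_size


# Sizes as reported by du in the message; units only need to match per pair.
samples = {
    "otcl (K)": (148, 660),
    "apache2 (M)": (11, 25),
    "all of main (G)": (46, 130),
}

for name, (dbg, total) in samples.items():
    print(f"{name}: {debug_share(dbg, total):.0f}% debugging symbols")
```

The du output is already rounded, so the results are only good to about a percentage point.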

On Wed, 30 May 2012, "Trent W. Buck" <trentbuck@gmail.com> wrote:
The reason why they're separate packages, and not installed by default, is because they're a non-negligible increase in size. To take an example at random: [...] Hmm, now we're up to 44% -- nearly half! -- of the space being used by debugging symbols. OK, what about the whole of Debian?
10 years ago a 40G hard drive was considered big. Now SSD is the storage form which has size limits, and you can cheaply and easily get 200G or more of SSD. Debugging symbols taking up 44% of the package space is a cost that most people should be able to afford. But the ability to download symbol tables at debug time from the cloud sounds like a better option for permanently connected systems.
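A share of the total and an increase over the stripped size are different numbers. Assuming the du totals quoted earlier in the thread already include the -dbg packages (an assumption, based on the *deb glob matching them too), the growth relative to the stripped packages can be sketched as follows; increase_over_stripped is a hypothetical helper, not part of any tool:

```python
def increase_over_stripped(dbg_size, total_size):
    """Percentage growth when symbols are added to the stripped packages.

    Assumes total_size already includes dbg_size, so the stripped size
    is total_size - dbg_size.
    """
    stripped = total_size - dbg_size
    if stripped <= 0:
        raise ValueError("total_size must exceed dbg_size")
    return 100.0 * dbg_size / stripped


# Figures taken from the du output quoted in the thread.
print(f"apache2: {increase_over_stripped(11, 25):.0f}% larger")
print(f"all of main: {increase_over_stripped(46, 130):.0f}% larger")
```

Under that assumption, a 44% share corresponds to roughly a 79% increase for apache2, and the 35% share for the whole mirror to roughly 55%.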

Stewart Smith wrote:
On Tue, 29 May 2012 23:09:32 +1000, Julien Goodwin <luv-lists@studio442.com.au> wrote:
Ubuntu at least does this by storing all symbols for all relevant package versions "in the cloud" so you can use them when needed. I think Debian were planning to deploy this as well, but I've no idea if that happened.
FWIW the "in the cloud" is how Microsoft has stored symbols for most of the Windows stack for at least 10 years.
Basically, debuggers can download them on demand - kind of useful if you find yourself in WinDBG (which I swear is where I've spent the majority of my Windows using time in the past decade).
You can download them on-demand in Debian, too. They're the -dbg packages. (Well, OK, you need to have enabled core dumps in advance, or to rerun the app. So probably not as awesome as what you describe in windbg.)
The reason why they're separate packages, and not installed by default, is because they're a non-negligible increase in size. To take an example at random:
If done properly, the dbg builds would have various compiler optimisations turned off too. Under Windows, a debug ('checked') build includes all the ASSERTS and DebugPrint messages too which are stripped out in the 'free' build.
James

On Wed, 30 May 2012, James Harper <james.harper@bendigoit.com.au> wrote:
If done properly, the dbg builds would have various compiler optimisations turned off too. Under Windows, a debug ('checked') build includes all the ASSERTS and DebugPrint messages too which are stripped out in the 'free' build.
Yes compiler optimisation often makes debugging difficult. Compilers are free to optimise out arithmetic and logical steps.
But even knowing approximately which line of the function had the error can really help.

On 30/05/12 14:14, Russell Coker wrote:
On Wed, 30 May 2012, James Harper <james.harper@bendigoit.com.au> wrote:
If done properly, the dbg builds would have various compiler optimisations turned off too. Under Windows, a debug ('checked') build includes all the ASSERTS and DebugPrint messages too which are stripped out in the 'free' build.
Yes compiler optimisation often makes debugging difficult. Compilers are free to optimise out arithmetic and logical steps.
But even knowing approximately which line of the function had the error can really help.
Only if you're a developer for that distribution.
Everyone else will either be:
An end-user - in which case they'll hit "restart" and continue, and possibly complain a bit in a forum somewhere.
An upstream developer - in which case they'll already be compiling it from source and have debug symbols if desired.
According to one page I found, there are 2916 Debian developers, total. So, by adding debug symbols to everything, you'd be helping out at most 3000 people, and inconveniencing many, many more. And those developers could just install the -dbg packages if they really want..
-Toby
PS. Am not sure if the figure is accurate; is debian.org/devel/people a reasonable indicator?

Toby Corkindale <toby.corkindale@strategicdata.com.au> wrote:
Only if you're a developer for that distribution.
Everyone else will either be: An end-user - in which case they'll hit "restart" and continue, and possibly complain a bit in a forum somewhere.
An upstream developer - in which case they'll already be compiling it from source and have debug symbols if desired.
I've reported bugs, with backtraces from gdb, to developers more than once, so I'm one of the exceptions to the above.

On Wed, 30 May 2012, Toby Corkindale <toby.corkindale@strategicdata.com.au> wrote:
Only if you're a developer for that distribution.
It's also good for bug reports. Yes bug reports can be said to only help developers, but they help developers make the distribution better.

On 30/05/12 15:00, Russell Coker wrote:
On Wed, 30 May 2012, Toby Corkindale <toby.corkindale@strategicdata.com.au> wrote:
Only if you're a developer for that distribution.
It's also good for bug reports. Yes bug reports can be said to only help developers, but they help developers make the distribution better.
But as Ubuntu has demonstrated, you can still have users make bug reports without needing the users to have installed the dbg packages.

On 30/05/12 14:14, Russell Coker wrote:
On Wed, 30 May 2012, James Harper <james.harper@bendigoit.com.au> wrote:
If done properly, the dbg builds would have various compiler optimisations turned off too. Under Windows, a debug ('checked') build includes all the ASSERTS and DebugPrint messages too which are stripped out in the 'free' build.
Yes compiler optimisation often makes debugging difficult. Compilers are free to optimise out arithmetic and logical steps.
But even knowing approximately which line of the function had the error can really help.
Only if you're a developer for that distribution.
Everyone else will either be: An end-user - in which case they'll hit "restart" and continue, and possibly complain a bit in a forum somewhere.
An upstream developer - in which case they'll already be compiling it from source and have debug symbols if desired.
(here I go again about MS)
When an app crashes under Windows a minidump is collected and the user is given the option to send the dump info to Microsoft for collection and analysis (despite popular belief, this analysis does actually happen!).
Such a thing would be really cool under Linux as it would allow the aggregation of a huge amount of information and allow the developers to spot the bugs that most require attention, eg "hmmm... 1000 reports in the last 24 hours of a crash at line 42... that's probably worth looking at!". Like Windows, it would also allow a response to be returned when the crash is submitted like "This bug is resolved in the latest version. Please upgrade.".
You still don't need the symbols on your PC though.
James
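The aggregation idea can be sketched very roughly as counting reports per crash signature and ranking them. This is not how Windows Error Reporting or any Linux tool actually works; the report fields ('package', 'version', 'top_frame') and the function name are hypothetical, chosen only to illustrate the "1000 reports at line 42" case:

```python
from collections import Counter


def rank_crash_signatures(reports, top=3):
    """Count crash reports per signature and return the most common ones.

    Each report is a dict with hypothetical keys 'package', 'version'
    and 'top_frame' (e.g. "main.c:42"). A real system would need a much
    richer signature than the top frame alone.
    """
    signatures = Counter(
        (r["package"], r["version"], r["top_frame"]) for r in reports
    )
    return signatures.most_common(top)


# Made-up sample data: one crash site dominates the incoming reports.
reports = (
    [{"package": "foo", "version": "1.0", "top_frame": "main.c:42"}] * 1000
    + [{"package": "foo", "version": "1.0", "top_frame": "util.c:7"}] * 3
)

for signature, count in rank_crash_signatures(reports):
    print(count, signature)
```

Resolving addresses to a file and line for the signature needs symbols only on the machine doing the analysis, which matches the point that they need not be on the user's PC.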

James Harper <james.harper@bendigoit.com.au> wrote:
(here I go again about MS) When an app crashes under Windows a minidump is collected and the user is given the option to send the dump info to Microsoft for collection and analysis (despite popular belief, this analysis does actually happen!). Such a thing would be really cool under Linux as it would allow the aggregation of a huge amount of information and allow the developers to spot the bugs that most require attention, eg "hmmm... 1000 reports in the last 24 hours of a crash at line 42... that's probably worth looking at!". Like Windows, it would also allow a response to be returned when the crash is submitted like "This bug is resolved in the latest version. Please upgrade.". You still don't need the symbols on your PC though.
I think apport is meant to do most of the above. Fedora also have a tool but I've forgotten the name and I don't know exactly what it does.

On 05/30/2012 03:01 PM, James Harper wrote:
(here I go again about MS) When an app crashes under Windows a minidump is collected and the user is given the option to send the dump info to Microsoft for collection and analysis (despite popular belief, this analysis does actually happen!). Such a thing would be really cool under Linux as it would allow the aggregation of a huge amount of information and allow the developers to spot the bugs that most require attention, eg "hmmm... 1000 reports in the last 24 hours of a crash at line 42... that's probably worth looking at!". Like Windows, it would also allow a response to be returned when the crash is submitted like "This bug is resolved in the latest version. Please upgrade.". You still don't need the symbols on your PC though.
James
Hi, so something like this in Fedora:
https://fedoraproject.org/wiki/Features/ABRT
Steve

Toby Corkindale wrote:
According to one page I found, there are 2916 Debian developers, total.
Those are the class of "Debian Developers", those who have voting responsibilities in referenda (e.g. electing the DPL, deciding whether GFDL invariant sections are DFSG compatible).
The class called "Debian Maintainers" is (I assume) much larger; they are people who can upload new versions of packages they maintain, but cannot upload new packages, and are not responsible for voting. They must be part of the web of trust, but do not need to pass a written test.
Of course there are also people who write documentation, do bug triage, translate messages, &c and these individuals are not necessarily part of either of the above groups. It is my impression that in recent years there has been increased recognition that their contributions are just as important as those of people doing uploads.

James Harper wrote:
The reason why they're separate packages, and not installed by default, is because they're a non-negligible increase in size. To take an example at random:
If done properly, the dbg builds would have various compiler optimisations turned off too
AIUI they're not separate builds. They just take the debugging symbols from foo.so and move them into a separate file. I could be completely wrong about that, I don't really understand if/how that would work, I just got that impression...

James Harper wrote:
The reason why they're separate packages, and not installed by default, is because they're a non-negligible increase in size. To take an example at random:
If done properly, the dbg builds would have various compiler optimisations turned off too
AIUI they're not separate builds. They just take the debugging symbols from foo.so and move them into a separate file. I could be completely wrong about that, I don't really understand if/how that would work, I just got that impression...
It would appear that you are correct:
http://wiki.debian.org/DebugPackage
http://wiki.debian.org/HowToGetABacktrace also has some useful info about how to use them.
If I was king they would be called symbol packages :)
James
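For the "if/how that would work" question: GNU binutils can copy the debugging sections of a binary into a side file, strip the original, and record a link between the two. The sketch below only builds the command lines and runs nothing. The function name and the .debug suffix are illustrative, and it is an assumption that Debian's packaging tools do something along these lines rather than exactly this.

```python
import shlex


def split_debug_commands(binary, debug_file=None):
    """Build the binutils commands that move debug info into a side file.

    Nothing is executed here; each argument list could be passed to
    subprocess.run() on a system with binutils installed.
    """
    debug_file = debug_file or binary + ".debug"
    return [
        # Copy only the debugging sections into the side file.
        ["objcopy", "--only-keep-debug", binary, debug_file],
        # Remove the debugging sections from the binary that gets shipped.
        ["strip", "--strip-debug", binary],
        # Record the side file's name in the binary so a debugger can find it.
        ["objcopy", "--add-gnu-debuglink=" + debug_file, binary],
    ]


for argv in split_debug_commands("foo.so"):
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Because the code itself is not rebuilt, the symbols describe the same optimised build that users run, which is consistent with the -dbg packages not being separate builds.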

On Tue, May 29, 2012 at 10:50:38PM +1000, Russell Coker wrote:
The thing about the debugging symbol table is that you never know when you will need it. Having it always there seems to have no cost that matters but it can provide significant benefits. So why ship a program or shared object that's stripped?
because libreoffice(*) is already way too big and slow to upgrade when i run apt-get, even on an ~16Mbit ADSL2 link?
from my point of view, it's just bloat. potentially useful in some circumstances, yes. actually useful in (my) real life, no.
if someone really needs it, they can install the -dbg version or recompile without stripping. most people, however, don't need it and won't make any use of it if it's there.
also, my rootfs is an 80G partition (approx 20GB free) on a 120GB SSD. if every binary i installed was unstripped, i'd probably have to double the size of /
re: "you never know when you'll need [the symbol table]": i rarely use gdb, so it would all be for something that's AFAICR never been of any use to me in about 18 years of running linux at home and work.
(*) or choose your own bloatware app. i picked LO because when i see it in the apt-get upgrade list, i know it's going to be a long and slow upgrade, even from a local mirror (disk i/o still takes time)
craig
--
craig sanders <cas@taz.net.au>
BOFH excuse #126: it has Intel Inside

On Tue, 29 May 2012, Craig Sanders <cas@taz.net.au> wrote:
because libreoffice(*) is already way too big and slow to upgrade when i run apt-get, even on an ~16Mbit ADSL2 link?
If a never-strip policy was adopted by a distribution then LO would probably be an exception. With the way that it does Java etc there isn't going to be a simple gdb option that gives useful results.

Linux has used demand-loading of executables and shared objects for ages. This means that if an executable or shared object has some pages that happen to never get called (i.e. for corner conditions that don't happen in the usual case) then they never get read from disk. The same applies to debugging data when the object isn't being debugged. Even if you have unused data on the same pages as executable data, thus increasing the number of executable pages loaded into RAM, it shouldn't be a big deal, as RAM keeps getting bigger so that only a small portion of the multi-gigabytes of RAM in a low-end system is used for executable pages.
Disks are getting bigger all the time. Nowadays it would be silly to consider purchasing a disk for a desktop system that's less than 2TB in size, and laptops have had hundreds of GB for ages. Currently the biggest root filesystem on a system I run is 12G; that is 0.6% of the space on a desktop disk you might purchase (*) and less than 10% of the low-end laptop disks that were on sale a couple of years ago. If I had to make that 24G for the root filesystem it wouldn't be a big deal.
In terms of reducing binary size there has been some discussion about a port of Linux that uses 32-bit instructions with 64-bit data operations and registers on the AMD64 architecture. The idea is to save RAM and TLB entries by not using 64/64 while still getting some performance benefits of a 64-bit CPU. But there is little interest in this and it seems that Debian won't support it due to no one caring.
So the question is, why strip binaries? Back in the days when we ran servers with 100MB hard drives there was a real need to save space. When a 128Kb/64Kb ADSL link was considered fast there was a real need to reduce download time. But now most of us have ADSL links that allow 100KB/s (800Kb/s) UPLOAD speeds and significantly faster download speeds.
The thing about the debugging symbol table is that you never know when you will need it. Having it always there seems to have no cost that matters but it can provide significant benefits. So why ship a program or shared object that's stripped?
(*) The system in question has a 160G disk I got from a junk pile. But as half the disk space is on my /junk filesystem and there is still unallocated space I think this supports my general point.
Disk space can be expensive for hosted servers. Why not reach a compromise and ship the symbols separately for anyone who wants them? (I'm extrapolating from my knowledge of Windows systems here... I assume Linux doesn't force you to keep the symbols in the same file as the binary, but that's only an assumption.)
James

On 29/05/12 22:50, Russell Coker wrote:
So the question is, why strip binaries? Back in the days when we ran servers with 100MB hard drives there was a real need to save space. When a 128Kb/64Kb ADSL link was considered fast there was a real need to reduce download time. But now most of us have ADSL links that allow 100KB/s (800Kb/s) UPLOAD speeds and significantly faster download speeds.
Speak for yourself re ADSL speeds.. I'm in the inner suburbs, and still have to put up with sub-ADSL1 speeds that fluctuate with the seasons.. and then have to share said bandwidth with other members of the house.
It's not pretty.
I tried to find some stats on average speeds for Australia.. Closest I could find was this:
http://www.lifehacker.com.au/2011/09/australias-average-broadband-speed-384k...
Still, if the current federal govt manage to hang on for another term (which is looking unlikely), then we might get an NBN in a few years which'll help things out.

Toby Corkindale <toby.corkindale@strategicdata.com.au> wrote:
Speak for yourself re ADSL speeds.. I'm in the inner suburbs, and still have to put up with sub-ADSL1 speeds that fluctuate with the seasons.. and then have to share said bandwidth with other members of the house.
It's not pretty.
Indeed. I'm comparatively fortunate: I regularly get well over 8 Mbps on my ADSL2+ connection. Faster would be better, of course, as would greater reliability.
This phone line has been patched several times by Telstra in the last five years to repair faults, and the ADSL connection still drops out occasionally. (The only way to fix this is to move to a high reliability profile that drops the speed very considerably, or to have the line replaced with a new copper pair, which Telstra aren't about to do.)
For dist-upgrades it's good enough, though, including large packages.
participants (9)
- Craig Sanders
- James Harper
- Jason White
- Julien Goodwin
- Russell Coker
- Steve Roylance
- Stewart Smith
- Toby Corkindale
- Trent W. Buck