
At the bottom of this message is an extract from the monitoring system for the LUV server. Load average spikes to over 20, but at the time of monitoring there was only 1 D state process and nothing was using much CPU time or much RAM. At the same time other VMs didn't report high load, so it wasn't an issue of the disk capacity of the hardware being saturated (which would be difficult for a RAID array of SSDs on a mostly quiet server).

09:55:01     CPU     %user     %nice   %system   %iowait    %steal     %idle
12:35:02     all      4.28      0.00      1.15     17.87      1.40     75.29
12:45:01     all      4.82      0.00      1.54     17.73      1.57     74.35
12:55:01     all      5.00      0.00      1.20     18.15      0.91     74.75
13:05:01     all      5.60      0.00      1.33     17.79      1.00     74.28
13:15:01     all     14.11      0.00      2.70     16.95      1.56     64.69
13:25:01     all      4.21      0.00      1.52     20.83      0.89     72.56
Average:     all      5.15      0.00      1.31     19.99      1.09     72.45

Above is part of the sar output. Note that at around the time of the high load average there was a higher than usual amount of user CPU time.

The iowait, while not correlated with this issue, was higher than I expected. I ran "iotop -o -d5 -b -P", which indicated that writes from mysqld were the main disk access. I ran "fatrace -f W", which indicated that mysqld was writing to deleted files in /tmp.

| 68991 | luv_drupal | localhost | luv_drupal | Query | 0 | Creating sort index | SELECT v.vid, v.*, n.type FROM vocabulary v LEFT JOIN vocabulary_node_types n ON v.vid = n.vid WHERE |

The only time I caught an access with the "show processlist;" SQL command was the above. Might "Creating sort index" mean writing to deleted files in /tmp?
ALERT itmustbe/loadavg: 21.27 6.84 3.01 >= 7 5 4 (Wed May 13 13:06:39)

Summary output : 21.27 6.84 3.01 >= 7 5 4

Detailed text (if any) follows:
-------------------------------
Here are D state processes:
USER          PID     VSZ     RSS TTY   COMMAND
root          157      0B      0B       [jbd2/vda-8]

Here are processes with the top CPU percentages:
USER          PID   CPU TTY   COMMAND
mon          1630   7.0       /usr/bin/perl /usr/lib/mon/mon-local.d/loadavg.monitor 7 5 4
mon          1629   6.0       /usr/bin/perl /usr/lib/mon/mon.d/msql-mysql.monitor --mode m
www-data     1445   5.1       /usr/sbin/apache2 -k start

Here are processes with the top RAM use:
USER          PID    VIRT     RES TTY   COMMAND
clamav        335  1.12GB   874MB       /usr/sbin/clamd --foreground=true
mysql        1392   1.7GB   613MB       /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --p
spamassassin 20347   101MB  89.3MB      spamd child
spamassassin 29351    97MB  84.9MB      spamd child
root         1047  94.5MB  84.1MB       /usr/bin/perl -T -w /usr/sbin/spamd -d --pidfile=/var/run/sp

Swap Used: 7.8MB / 256MB

--
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/
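As a quick sanity check on the numbers above, the %iowait column (field 6 of the sar CPU report) can be filtered with a small awk one-liner. This is just an illustrative sketch: two of the rows from the report are inlined as sample data, and the threshold of 18 is arbitrary; on the live server you would pipe "sar -u" output into the filter instead.

```shell
# Print sar intervals where %iowait (field 6) exceeds a threshold.
# Sample rows from the report above are inlined; on a real system,
# pipe the output of "sar -u" into the awk filter instead.
sar_excerpt='12:35:02 all 4.28 0.00 1.15 17.87 1.40 75.29
13:25:01 all 4.21 0.00 1.52 20.83 0.89 72.56'
echo "$sar_excerpt" | awk '$6 > 18 {print $1, "iowait:", $6}'
# -> 13:25:01 iowait: 20.83
```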

Hi Russell

I happened to spot that you tagged me - mind that I don't do much DB stuff these days, but happy to help if I can.
The iowait, while not correlated with this issue, was higher than I expected. I ran "iotop -o -d5 -b -P", which indicated that writes from mysqld were the main disk access. I ran "fatrace -f W", which indicated that mysqld was writing to deleted files in /tmp.
| 68991 | luv_drupal | localhost | luv_drupal | Query | 0 | Creating sort index | SELECT v.vid, v.*, n.type FROM vocabulary v LEFT JOIN vocabulary_node_types n ON v.vid = n.vid WHERE |
The only time I caught an access with the "show processlist;" SQL command was the above. Might "Creating sort index" mean writing to deleted files in /tmp?
If sort_buffer_size is too small (generally 2M or 4M is good), or the SELECT columns contain TEXT or BLOB fields (using * in the SELECT list is not a good habit), the sort operation will have to use disk files to do its thing. Something else might also be "wrong" with this query: perhaps the LEFT JOIN wasn't supposed to be a LEFT JOIN and is returning way too many rows, a borky WHERE clause, or other factors. Drupal is pretty decent these days (say Drupal 8), but it depends on the version, and there are borky plugins of course.

Btw if you use SHOW FULL PROCESSLIST you'll get the full rather than a possibly truncated query string.

If you are using MariaDB, you can set up the slow query log and enable extra options so that sorts that go to disk get logged. But if you just set up the slow query log with a 1 second threshold, you can already see what shows; if a query takes that long it'll show up. Or go further: long_query_time=0.1.

I think the result set of this query is way bigger than it should be, otherwise it possibly wouldn't go to disk, and wouldn't produce enough disk I/O to blip. Tuning the server a bit might help. Did you change any settings from the defaults?

If you have enough RAM you could take a sneaky shortcut and set tmpdir=/dev/shm, but if the tmp files that MySQL does need during normal operations are big enough, you'll get memory troubles instead. And as you'll know, swapping is never a good thing on a DB server.

But tuning the server just a bit is always a good idea; the defaults on most distros are to minimise RAM usage with minimal logging. Also set stuff like innodb_flush_method=O_DIRECT to optimise the I/O further.

Regards,
Arjen.
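For reference, the settings mentioned above would land in the [mysqld] section of my.cnf roughly as follows. This is only an illustrative sketch: the values and the log file path are assumptions, not a recommendation tuned for this particular server.

```ini
[mysqld]
# Log slow statements; start with 1 second, drop to 0.1 to catch more.
slow_query_log      = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time     = 1
# MariaDB only: also log why a query was slow, e.g. sorts that hit disk.
# log_slow_verbosity = query_plan

# Per-session sort buffer; 2M-4M rather than the old small default.
sort_buffer_size = 2M

# Sneaky shortcut: put MySQL's temp files on tmpfs (watch RAM usage).
tmpdir = /dev/shm

# Avoid double-buffering InnoDB data files through the OS page cache.
innodb_flush_method = O_DIRECT
```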

On Wednesday, 13 May 2020 8:42:16 PM AEST Arjen Lentz wrote:
I happened to spot that you tagged me - mind that I don't do much DB stuff these days, but happy to help if I can.
Thanks.
The iowait, while not correlated with this issue, was higher than I expected. I ran "iotop -o -d5 -b -P", which indicated that writes from mysqld were the main disk access. I ran "fatrace -f W", which indicated that mysqld was writing to deleted files in /tmp.
| 68991 | luv_drupal | localhost | luv_drupal | Query | 0 | Creating sort index | SELECT v.vid, v.*, n.type FROM vocabulary v LEFT JOIN vocabulary_node_types n ON v.vid = n.vid WHERE |
According to "show table status;" the vocabulary and vocabulary_node_types tables each have 16K of data.
The only time I caught an access with the "show processlist;" SQL command was the above. Might "Creating sort index" mean writing to deleted files in /tmp?

If sort_buffer_size is too small (generally 2M or 4M is good), or the SELECT
It's running MySQL 5.6, which apparently has 256K as the default sort_buffer_size, and the configuration doesn't seem to specify anything different. So I guess we shouldn't have problems in this regard.

Will I gain anything from changing to MariaDB? For reasons that I never worked out, the LUV server didn't get changed to MariaDB during the Debian upgrade process while other Debian servers I run did.
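One way to check whether the sort buffer default is actually biting, rather than guessing from the config, is to look at the server's sort and temp-table counters from the mysql client. This is a generic diagnostic sketch, not something from the thread:

```sql
-- Current value of the sort buffer (256K by default on MySQL 5.6).
SHOW VARIABLES LIKE 'sort_buffer_size';

-- Sorts that spilled to disk: if Sort_merge_passes keeps climbing,
-- the sort buffer is too small for the queries being run.
SHOW GLOBAL STATUS LIKE 'Sort_merge_passes';

-- Implicit temporary tables that went to disk (e.g. for TEXT/BLOB columns).
SHOW GLOBAL STATUS LIKE 'Created_tmp_disk_tables';
```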
Btw if you use SHOW FULL PROCESSLIST you'll get the full rather than a
Thanks, I'll try that.
possibly truncated query string. If you are using MariaDB, you can set up the slow query log and enable extra options so that sorts that go to disk get logged. But if you just set up the slow query log with a 1 second threshold, you can already see what shows; if a query takes that long it'll show up. Or go further: long_query_time=0.1.
OK, I guess that's one reason for changing to MariaDB.
I think the result set of this query is way bigger than it should be, otherwise it possibly wouldn't go to disk, and wouldn't produce enough disk I/O to blip. Tuning the server a bit might help. Did you change any settings from the defaults?
If you have enough RAM you could take a sneaky shortcut and set tmpdir=/dev/shm, but if the tmp files that MySQL does need during normal operations are big enough, you'll get memory troubles instead. And as you'll know, swapping is never a good thing on a DB server.
The VM has 4G of RAM, which is a reasonable amount for what is being done. There is 16G of RAM allocated to VMs and 48G in the system, so I can easily allocate more RAM to that VM. The total of all databases on the LUV server is 3G of storage, so I could allocate another 3G of RAM to it to make sure it all stays in cache. Using /dev/shm for tmp seems like a good idea.
But tuning the server just a bit is always a good idea, the defaults on most distros are to minimise RAM usage and minimal logging. Also set stuff like innodb_flush_method=O_DIRECT to optimise the I/O further.
Thanks, I'll check that out too.

--
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/

Hi Russell

So through either the slow query log, or catching it via SHOW FULL PROCESSLIST, or for a more extreme solution enabling the general query log (you can set it up with the path, but only enable it for a bit and then turn it off again - it would create a lot of stuff), catch the actual query. Then run it with EXPLAIN <query> and post the output from that here.

Also it might be good to paste your SHOW TABLE STATUS for those tables here, I might be able to see more from that. And perhaps SHOW CREATE TABLE <tblname> \G from the mysql command line client for the table structure of each of these tables.

At this point I wouldn't fuss with switching from MySQL to MariaDB, MySQL does a fine job generally. MySQL 5.6 is EOL now though, so something a bit newer might be good.

Regards,
Arjen.
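The checklist above, written out as statements for the mysql command line client. The WHERE clause is left elided because the processlist output earlier in the thread truncated it:

```sql
-- Catch the full statement text (plain SHOW PROCESSLIST truncates it).
SHOW FULL PROCESSLIST;

-- Then inspect the query plan for the suspect query.
EXPLAIN SELECT v.vid, v.*, n.type
FROM vocabulary v
LEFT JOIN vocabulary_node_types n ON v.vid = n.vid
WHERE ...;  -- truncated in the processlist output

-- Table sizes/engines and full definitions for both tables.
SHOW TABLE STATUS LIKE 'vocabulary%';
SHOW CREATE TABLE vocabulary \G
SHOW CREATE TABLE vocabulary_node_types \G
```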