Forums | Mahara Community

Performance issues when viewing profiles

Chris Myers
Posts: 5

11 August 2017, 7:32 AM

Please forgive my Mahara naivety; we've got one group on campus that uses the software, and I just manage the VM that it's running on.


Long story short, we're running Mahara 17.04.2 on SLES 12 SP2 x64, with PHP 5.5.14, Apache 2.4.23-29.3.2, and MariaDB Ver 15.1 Distrib 10.0.31. The server is a very low-usage VM with two vCPUs and 1.5GB of memory.


The issue I'm about to describe started when we upgraded from Mahara 1.x (I don't remember the exact version at this point) to 16.10.2, and at the same time had to migrate from a VM running SLES 11 SP4 to a new one running SLES 12 SP2, since the newer versions of Mahara require a version of PHP not available on SLES 11. When we were on the old version on the old VM, performance was about as good as we could expect. The problems described below started following the upgrade and migration. Since we were experiencing this issue on 16.10.2, and there was a new version available, I tried upgrading to 17.04.2 today, and things are a tiny bit better, but not much.


I apologize in advance for the very lengthy post, but I'm going to try to sum up my several-days-of-troubleshooting-on-my-own-before-asking-for-help.


Here's basically what we're running into -- all of these actions are tested on the same VM ::


If I try to pull up a random static image through Apache and mash F5 on my web browser, it redisplays the image as quickly as it can, no timeouts and no lags. So, it doesn't look like there's an issue with Apache itself.

If I pull up a PHPINFO page and do the same, it also runs as quickly as it can, no timeouts and no lags. So, it doesn't look like there's an issue with PHP running in Apache. Also, other PHP-based applications, such as phpbb, that are running on the same server don't have any issues.

If I pull up an image that has to involve the database, it's fine. If I hit F5 as quickly as I can, it's fine. If I mash F5, it spins for several minutes before returning. However, during that time I'm able to run really intensive queries directly against the MySQL instance and they return very quickly. Also, other PHP-based applications that use the same MySQL, PHP, and Apache instances on that VM function without any delays, even when mashing F5, even while Mahara is still drowning. So it's not like MySQL or Apache itself is freezing. It seems to be a Mahara-specific issue.


Now, I know that normally someone wouldn't be so obnoxious, and usually you'd expect adverse reactions for doing something like that. But what I'm trying to do is simulate the number of database hits that occur from simply viewing someone's profile. Why would I want to do that, you might ask? Well...


When we actually use Mahara, even simple stuff like viewing someone's profile, uploading a document, etc., it takes f o r e v e r (as in, 5+ minutes to do a simple task.) So far all of my troubleshooting hasn't turned up anything at all useful, but what I have noticed is this ::

If I pull up a server console and run this:

watch -n .5 'netstat -alnp | grep "0" | grep ESTABLISHED | wc -l'

it checks every 500ms and lets me keep tabs on how many MySQL connections are currently in the ESTABLISHED state. What I've observed is that something basic like hitting F5 as fast as I can on the URL might generate one or two concurrent connections, and all is happy. But as soon as you get #3, everything stops. Any further requests just pile up, until you get 50 or 60 or 150 in the ESTABLISHED state. When that happens, you can still execute queries through the MySQL console without issue, and other PHP-Apache applications talking to the same MySQL instance perform fine (even if you mash F5 on those applications.) It's just Mahara that pretty much throws its hands up in the air and sits there doing nothing for several minutes.
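
For anyone wanting to repeat this measurement, here is a minimal sketch of a more targeted counter. It is not the command used above: it assumes the database listens on the default MySQL/MariaDB port 3306 and that the input looks like `netstat -ant` output, and `count_established` is a hypothetical helper name chosen for illustration.

```shell
# Count ESTABLISHED TCP connections whose local or remote port equals $1.
# Reads netstat-style lines on stdin (columns: proto recv-q send-q local remote state ...).
# Assumption: the port to watch is passed by the caller, e.g. 3306.
count_established() {
    port="$1"
    awk -v port="$port" '
        $1 ~ /^tcp/ && $6 == "ESTABLISHED" {
            # The port is the last colon-separated piece of each address.
            nl = split($4, l, ":")
            nr = split($5, r, ":")
            if (l[nl] == port || r[nr] == port) count++
        }
        END { print count + 0 }
    '
}
```

It could be polled with something like `while sleep 0.5; do netstat -ant | count_established 3306; done`. Matching on the port column rather than a loose substring avoids counting unrelated sockets.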


When this is occurring, doing a "show processlist;" in the MySQL console shows all of those threads, but they all have a "Command" of "Sleep". So it's not like the db is even trying to process them.

MariaDB [(none)]> show processlist;
| Id  | User      | Host            | db     | Command | Time | State | Info             | Progress |
|  87 | maharausr | localhost:34252 | mahara | Sleep   |  176 |       | NULL             |    0.000 |
|  88 | maharausr | localhost:34254 | mahara | Sleep   |  176 |       | NULL             |    0.000 |
|  89 | maharausr | localhost:34256 | mahara | Sleep   |  176 |       | NULL             |    0.000 |
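
With dozens of rows, a quick tally is easier to read than the raw list. The following is a rough sketch, assuming the pipe-delimited layout shown above (Command in the fifth column, Time in the sixth); `count_sleeping` is a hypothetical helper name, not part of MariaDB or Mahara.

```shell
# Count processlist rows whose Command is "Sleep" and whose Time is at least
# $1 seconds (default 0). Reads pipe-delimited "show processlist" rows on stdin.
# Assumption: column order matches the listing above (Id, User, Host, db, Command, Time, ...).
count_sleeping() {
    min_secs="${1:-0}"
    awk -F'|' -v min="$min_secs" '
        {
            # Field 1 is the empty text before the leading pipe,
            # so Command is field 6 and Time is field 7.
            cmd = $6; secs = $7
            gsub(/[ \t]/, "", cmd)
            gsub(/[ \t]/, "", secs)
            if (cmd == "Sleep" && secs + 0 >= min) n++
        }
        END { print n + 0 }
    '
}
```

For example, `mysql -u root -t -e 'show processlist' | count_sleeping 60` would report how many connections have been idle for a minute or more.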

Doing a "show open tables;" returns none that are "in_use" or "name_locked".



mysqladmin -u root status

doesn't seem to show anything out of the ordinary:

Uptime: 1992  Threads: 60  Questions: 7360  Slow queries: 0  Opens: 208  Flush tables: 1  Open tables: 271  Queries per second avg: 3.694


And when the issue is happening, not only is the CPU usage very low, but Apache and mySql don't really even show up in the top 10 processes in top:


webserver:~ # netstat -alnp | grep "0" | grep ESTABLISHED | wc -l

webserver:~ # netstat -alnp | grep "0" | grep ESTABLISHED | tail -3
tcp        0      0         ESTABLISHED 2133/mysqld         
tcp        0      0         ESTABLISHED 2133/mysqld         
tcp        0      0         ESTABLISHED 2133/mysqld

webserver:~ # top
top - 13:55:17 up  6:38,  2 users,  load average: 1.62, 0.38, 0.12
Tasks: 272 total,   1 running, 270 sleeping,   0 stopped,   1 zombie
%Cpu(s):  0.9 us,  1.3 sy,  0.0 ni, 97.7 id,  0.1 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   1419476 total,  1340816 used,    78660 free,    86744 buffers
KiB Swap:  2095100 total,     5752 used,  2089348 free.   401180 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                 
10107 root      20   0   15500   2584   2100 R 6.250 0.182   0:00.01 top                                                     
    1 root      20   0   37484   5652   3992 S 0.000 0.398   0:07.68 systemd                                                 
    2 root      20   0       0      0      0 S 0.000 0.000   0:00.01 kthreadd                                                
    3 root      20   0       0      0      0 S 0.000 0.000   0:01.50 ksoftirqd/0                                             
    5 root       0 -20       0      0      0 S 0.000 0.000   0:00.00 kworker/0:0H                                            
    7 root      20   0       0      0      0 S 0.000 0.000   0:08.07 rcu_sched                                               
    8 root      20   0       0      0      0 S 0.000 0.000   0:00.00 rcu_bh                                                  
    9 root      rt   0       0      0      0 S 0.000 0.000   0:00.30 migration/0                                             
   10 root      rt   0       0      0      0 S 0.000 0.000   0:00.10 watchdog/0                                              



And, looking at the mySql threads list shows nothing exciting either:

webserver:~ # mysql -u root
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 764
Server version: 10.0.31-MariaDB SLE 12 SP1 package

Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> show global status like '%thread%';
| Variable_name                            | Value |
| Delayed_insert_threads                   | 0     |
| Innodb_master_thread_active_loops        | 473   |
| Innodb_master_thread_idle_loops          | 23545 |
| Performance_schema_thread_classes_lost   | 0     |
| Performance_schema_thread_instances_lost | 0     |
| Slow_launch_threads                      | 0     |
| Threadpool_idle_threads                  | 13    |
| Threadpool_threads                       | 14    |
| Threads_cached                           | 0     |
| Threads_connected                        | 69    |
| Threads_created                          | 541   |
| Threads_running                          | 1     |
12 rows in set (0.00 sec)


Some extra info in case it helps at all:

webserver:~ # cat /etc/SuSE-release
SUSE Linux Enterprise Server 12 (x86_64)
# This file is deprecated and will be removed in a future service pack or release.
# Please check /etc/os-release for details about this release.

webserver:~ # ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 5475
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 5475
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

webserver:~ # uname -a
Linux webserver 4.4.74-92.32-default #1 SMP Thu Jul 27 15:07:08 UTC 2017 (eba0211) x86_64 x86_64 x86_64 GNU/Linux

webserver:~ # free -h
             total       used       free     shared    buffers     cached
Mem:          1.4G       881M       504M       5.2M        78M       417M
-/+ buffers/cache:       386M       999M
Swap:         2.0G        16M       2.0G


Is there anything at all that I'm missing, or that I should look at? As things sit right now, the software isn't usable.

Chris Myers
Posts: 5

15 August 2017, 7:55 AM

I've been racking my brain trying to figure out what's going on, and so far haven't resolved the issue. However, I have discovered some things --

The issue seems to be related to our load balancer being in the mix (a Barracuda 440.) But what I can't figure out is why the problem affects only this piece of software, and only newer versions of the software, since nothing's changed on the server or load balancer, other than the version of Mahara.

Specifically, if I connect to the web server directly, I don't experience these issues. It's only when the traffic is routed through the load balancer. I also noticed in a Wireshark capture that a lot of HTTP keepalive requests are sent between the browser and the load balancer.

But the thing that I still don't understand is that this issue didn't occur on the previous version of Mahara, nor does it occur with other PHP applications on the same VM (which also share the same rule and settings on the load balancer.)

It looks like the issues with MySQL are only coincidental, because if I kill the MySQL threads off, the Apache sessions stay live.

Is there anything peculiar about how Mahara is built with regards to stuff like HTTP keepalives, etc.?

Chris Myers
Posts: 5

16 August 2017, 1:02 AM

Just a quick update --


I was thinking about this issue on my drive into the office this morning, and decided to try a different route. I left the database on the previous application server, but moved the Mahara web application to a different application server, and performance is spectacular now. So it doesn't look like the issue is with the load balancer or mySql, but somewhere with how Mahara interacts with PHP, Apache, and the load balancer on that particular machine.

I should note that the second application server is pretty much the same configuration as the first -- same versions of SLES, Apache, and PHP, with the same plugins.

I'd rather not split things up like this, but if that's what it takes for performance, so be it. It still bothers me not knowing what the underlying issue is, however, so if anyone has any thoughts, I'd appreciate it.

Mark Kirkwood
Posts: 3

03 October 2017, 11:09 AM

Is it possible that you were running out of memory with everything on one host?

Chris Myers
Posts: 5

04 October 2017, 2:39 AM

It didn't look like it; when I was running my load tests, the CPU and memory usage stayed pretty steady (and low.)

Chris Myers
Posts: 5

04 October 2017, 2:40 AM

At one point I also bumped up the number of CPUs and memory, but that didn't help, so I took it back down again.

It also didn't appear to be disk latency, etc. -- this is backed by a Nimble SAN, and the iowait stayed very low.
