Forums | Mahara Community
Server + Sizing help
11 June 2009, 10:45
I am looking at architecting a solution to implement Mahara in a large-scale environment and I cannot find much on the server requirements - i.e. processor/memory/disk/IO.
Is there any documentation which can help around this - basically I have no hardware and I need to know what to purchase and data to put together sizing volumetrics to include in a support model.
Can anyone help?
11 June 2009, 21:24
Mahara will run on any modern server, and even most older ones. We recommend Debian etch or later as an OS, so anything that runs that should be fine.
As for how many resources you'll actually need, this depends on the number of users you're expecting - both in total, and the maximum number of concurrent users. Do you have any idea how many users you'll have?
12 June 2009, 3:32
Thanks for the response. With regard to the environment, I'd like your opinion on the following please - we are looking at using Mahara in a fairly large enterprise environment for education. The architecture for the overall solution has a full Shibboleth Identity Management system for ~1,000,000 users, Moodle for ~100,000, Zimbra email for ~100,000 users and Hive. I'm looking at plugging Mahara into this infrastructure using Shibboleth, Moodle and Hive. Our systems are built on the latest IBM blades using VMWare technology - so I would be looking at using the same for Mahara, based on RedHat, which is our chosen Linux OS.
In addition the system has to provide 99.9% availability, so I would need to implement it such that it can be scaled in a cluster environment.
So a couple of key questions out of this:
- Can I use VMWare?
- Can I use Blade based technology?
- Can I use RedHat Enterprise?
- How can I implement Mahara so it can scale in a cluster as the system grows?
From a user basis we are looking at a year-on-year uptake starting with ~60,000 users, with a similar increase year on year, with a target user base in 3 years of ~180,000-200,000 users actively using our systems. We have based most of our current system calculations on a concurrency model of Yr1=5%, Yr2=10% and Yr3=33%.
Any advice on the above would be much appreciated.
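For anyone following the numbers, here is a rough sketch of what that concurrency model implies. The Year 2 figure of 120,000 users is an assumption (reading "a similar increase year on year" as another 60,000), and Year 3 uses the lower end of the stated 180,000-200,000 target. The names `PLAN` and `peak_concurrent` are made up for illustration.

```python
# Year -> (active users, concurrency percent).
# Year 1 and the percentages come from the post above.
# Year 2 (120,000) is an ASSUMPTION: "similar increase" read as +60,000.
# Year 3 uses the lower end of the stated 180,000-200,000 target.
PLAN = {
    1: (60_000, 5),
    2: (120_000, 10),
    3: (180_000, 33),
}


def peak_concurrent(users, percent):
    """Concurrent users implied by a whole-number concurrency percentage."""
    return users * percent // 100


for year, (users, percent) in PLAN.items():
    concurrent = peak_concurrent(users, percent)
    print(f"Year {year}: {users:,} users at {percent}% = {concurrent:,} concurrent")
```

On those assumptions the model gives roughly 3,000 concurrent users in Year 1, 12,000 in Year 2 and 59,400 in Year 3 (66,000 at the 200,000 end of the range), so the Year 3 load is about twenty times the Year 1 load rather than three times.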
13 June 2009, 1:22
Well, it sounds like you have the hardware required, at least.
Mahara requires a pretty simple LAMP-ish based stack, and therefore can scale just like any other product on such a stack. We've deployed Mahara on clusters before, and it works just fine.
I would suggest that any calculations you've done for Moodle will reflect reasonably well for Mahara, as they're basically the same stack. If anything, Mahara will perform better as it does less per page (we've made sure that, for example, no writes are done on the average page load).
It's likely people will use more file space in Mahara than in Moodle - but even if you gave all your users a 1G quota, it's highly unlikely most of them will use anywhere near that amount, which means you can get away with greatly overselling it in the short term, and tracking the usage over time.
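To put the overselling idea in numbers, a minimal sketch follows. The 1 GB quota and the 60,000 starting users come from this thread; the 5% average usage figure is purely an assumption for illustration and should be replaced with whatever your own tracking shows. The function name `storage_estimate_gb` is hypothetical.

```python
def storage_estimate_gb(users, quota_gb, avg_used_percent):
    """Return (nominal_gb, expected_gb).

    nominal_gb:  space needed if every user filled their quota.
    expected_gb: space needed if users fill avg_used_percent of it on average.
    """
    nominal = users * quota_gb
    expected = nominal * avg_used_percent // 100
    return nominal, expected


# ASSUMPTION: 5% average quota usage is an illustrative guess, not a measurement.
nominal, expected = storage_estimate_gb(users=60_000, quota_gb=1, avg_used_percent=5)
print(f"Nominal: {nominal:,} GB, expected at 5% usage: {expected:,} GB")
```

The gap between the two figures is the amount being oversold, which is why tracking real usage over time matters before the user base grows.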
Can you use all of those technologies? Yes, as they'll all handle LAMP stuff. As for the stack itself, I would strongly recommend you use PostgreSQL as your database if at all possible.
I guess the only other question is whether Mahara can handle the load you're talking about. As I said before, Mahara does less per page than Moodle so I believe it will scale better, and we've certainly designed it with scalability in mind (not having a complicated roles system really helps here), but we haven't deeply investigated the performance yet, so you may come across one or two foulups/slow queries on such a large deployment. We'll be happy to help you if you do!
14 June 2009, 20:47
Dude, that's some insane specifications. Just reading that was amazing.
15 June 2009, 2:25
Thanks for the help - I'll base my calculations around our Moodle, which we already have in place. Disk space isn't too much of an issue - our Zimbra/Hive currently has 20TB allocated.
Could you give me a bit more info around what you did in the deployments - was this done using Hardware network balancing, or did you use Tomcat/Apache load balancing?
Really appreciate the offer of help if we need it.
This is a bit of a critical project as it is for the entire school estate of one of the largest metropolitan councils in Europe. Our Shibboleth alone is being built to support ~1,000,000 users covering students, parents, teachers and government bodies. We are using Mahara as our official recommendation as the ePortfolio system for this deployment.
I'll keep everyone up to date with how we get on. At the moment we have Moodle deployed, Zimbra being deployed (just sorting out an issue with the proxy), Hive already deployed and linked to Moodle, and our Identity Management solution will have a first drop in July. We are working on Shibboleth components for all of these. Might also be worth mentioning that we are using SIF (linked into a multi-node ZIS) for all our identity and person details. This is all linked into the schools' MIS systems so we have direct provisioning of identity from the schools - and the IdM + SIF will be driving the desktop provisioning of AD based services.
Keep ya posted.
15 June 2009, 3:29
We typically do L4 load balancing using LVS. That handles the web server load balancing; we've never had to do any kind of database replication. The dataroot is NFS mounted on each webserver from a file server. Because sessions are in the dataroot, there's no need for any webserver affinity per client, which keeps things simple. Cron is another issue - you want it running on only one server every minute. We have a script on all the webservers that grabs a lock in dataroot to do that.
Other than that, it's all pretty simple. The code is the same on all the servers of course, as is the Apache configuration.
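The actual lock script isn't shown in this thread, so the following is only one possible sketch of the idea, not the script described above. It assumes the web servers' clocks are kept in sync and that the shared dataroot honours exclusive file creation (worth verifying on your NFS setup). The function name `claim_minute` and the `cronlock-` marker prefix are made up for illustration.

```python
import os
import time


def claim_minute(dataroot, now=None):
    """Try to claim the current minute's cron run for this server.

    Creates a marker file named after the minute using O_CREAT | O_EXCL,
    so only the first server to create it gets True. `now` is seconds
    since the epoch (defaults to the current time).
    ASSUMPTION: all servers share `dataroot` and have synchronised clocks.
    """
    stamp = time.strftime("%Y%m%d%H%M", time.gmtime(now))
    marker = os.path.join(dataroot, "cronlock-" + stamp)
    try:
        fd = os.open(marker, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # another server already claimed this minute
    os.close(fd)
    return True
```

Each web server's crontab entry would call something like this first and only run the Mahara cron job when it returns True. Cleaning up old marker files is left out of the sketch.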
19 June 2009, 3:42
Another question - with regard to disk storage and file based objects. Is everything stored in the database, or can you upload and store file objects? If you have file objects, can these be served via Hive or is it better to have it as a simple shared file system on the SAN?
19 June 2009, 4:29
Sorry, another question - about network requirements. I will be looking at this being accessed through a sub-domain - mahara.myvle.org - are there any other sub-domains which I should be defining for Mahara? What ports do I need to have opened?
A lot of our other systems run SSL - would you recommend SSL? If so, what ports would I need, and what type of certificates would Mahara need?
Really appreciate the help so far.
21 June 2009, 19:25
Hi - Mahara just needs one domain to function. Please pay close attention to the instructions on the installation page about setting up aliases, such as www, for your installation.
Would I recommend SSL? Well, with SSL you balance security with performance, so it's really up to you. There is a patch floating around for doing just SSL on the login page (though it doesn't work for the password reset page yet).
As for open ports - Mahara needs to be able to make outbound connections to port 80 on other machines, and also to 443 if you want https RSS feeds to work. You'll also need to configure the web servers to be able to send e-mail so the cronjobs can mail out forum post updates. This may or may not require opening ports, it depends on how you set up mail. Mahara supports SMTP if you need that.
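If you want to confirm from a web server that outbound connections on 80 and 443 are actually allowed, a small check along these lines may help. This is just a sketch, not part of Mahara; the function names are hypothetical, and the target host is whatever external site you choose to test against.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


def check_outbound(host, ports=(80, 443)):
    """Return a {port: reachable} mapping for the given host."""
    return {port: can_connect(host, port) for port in ports}
```

Calling `check_outbound("example.org")` (a placeholder host) from each web server returns a dictionary showing which of the two ports could be reached. A False result only tells you the connection failed, not whether a firewall, DNS or the remote site was the cause.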