Forums | Mahara Community

Cleaning up disk space


dan attwood

30 May 2017, 20:05

Hello

 

We've been running Mahara for a good few years now and our maharadata is now at ~120 GB. We really need to reclaim some disk space.

I went through and deleted all users that hadn't logged in since 2014. I was expecting this to remove each user and any files attributed to them, and thus reclaim some space.

However, it has barely dented the disk usage, and when I go into artefact/file/originals and look in the numbered folders I can see a lot of content last modified in 2014. I would have expected this to be deleted.

Is there a script or something I need to run in order to reclaim this space?

 

Dan


Robert Lyon

31 May 2017, 9:02

Hi Dan

Normally when a user is deleted, so are the pages and artefacts that belong to them.
As part of this deletion there is a check to make sure no one else is using the artefact,
so even though the owner of the file may be deleted, if it is being used by others (via a shallow copy of an artefact, maybe)
then the file will stay around.

If no one else is using the artefact then the artefact is deleted via the PHP unlink() function.

To check if files in the dataroot artefact/file/originals are still being used you can check by looking
at the artefact_file_files table in the database and comparing the fileid column to what is in the dataroot.

To do this you need to take the fileid modulo 256, e.g.:
- fileid = 186, then 186 % 256 = 186, so the file lives at artefact/file/originals/186/186
- fileid = 292, then 292 % 256 = 36, so the file lives at artefact/file/originals/36/292
- fileid = 12345, then 12345 % 256 = 57, so the file lives at artefact/file/originals/57/12345
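
The mapping in the examples above can be sketched in a few lines of Python. This is only an illustration of the modulo rule as described here; the function name original_path and the optional dataroot argument are made up for the example and are not part of Mahara.

```python
from pathlib import PurePosixPath


def original_path(fileid: int, dataroot: str = "") -> PurePosixPath:
    """Return where a file with this fileid is expected to live.

    The numbered folder is fileid modulo 256 and the file name is the
    fileid itself. `dataroot` is a hypothetical prefix for your own
    data directory; leave it empty to get the relative path.
    """
    bucket = fileid % 256
    return (
        PurePosixPath(dataroot)
        / "artefact" / "file" / "originals"
        / str(bucket) / str(fileid)
    )


if __name__ == "__main__":
    for fid in (186, 292, 12345):
        print(fid, "->", original_path(fid))
```

Running it on the three fileids above prints the same three paths as in the list.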

So a script could easily be created to find all the 'valid' files. Then a way could be worked out to delete the ones not in the 'valid' set.
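
As a rough sketch of what such a script might look like, assuming you have already exported the fileid column of artefact_file_files into a list of numbers (how you export it is up to you), the following walks the numbered folders and reports files whose name is not in that list. It only reports and never deletes; the function name find_orphans and the example path are hypothetical.

```python
from pathlib import Path


def find_orphans(originals_dir, valid_ids):
    """Yield (path, size_in_bytes) for files not in the 'valid' set.

    `originals_dir` is the artefact/file/originals folder inside your
    dataroot. `valid_ids` is an iterable of fileid values taken from
    the artefact_file_files table (assumed to be exported beforehand).
    """
    valid = {str(i) for i in valid_ids}
    for bucket in sorted(Path(originals_dir).iterdir()):
        if not bucket.is_dir():
            continue  # only the numbered folders are of interest
        for f in sorted(bucket.iterdir()):
            if f.is_file() and f.name not in valid:
                yield f, f.stat().st_size


if __name__ == "__main__":
    # Hypothetical location and ids; replace with your own values.
    originals = "/path/to/maharadata/artefact/file/originals"
    ids_from_db = [186, 292, 12345]
    if Path(originals).is_dir():
        total = 0
        for path, size in find_orphans(originals, ids_from_db):
            print(size, path)
            total += size
        print("bytes not referenced by artefact_file_files:", total)
```

Reviewing the printed list by hand before removing anything seems the safer way to use it.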

Note: if you are going to do this, remember to back up the site first and/or test on a copy of the site.

Also, if you have archiving turned on, a user's archives are not deleted when the user is deleted, so they may be taking up a large amount of space. They live under dataroot/submissions.

Also, if users are allowed to export their portfolios, then data under dataroot/export could also be taking up room.

Deleting data from either of those two places should be fine, provided you put your site into maintenance mode first so you know no user is submitting or exporting at the time.
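
Before deleting anything, it may help to see how much space those two folders actually use. Here is a minimal sketch, assuming the folders are named submissions and export directly under your dataroot as described above; the function names and the example path are hypothetical.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files below `path` (0 if it is missing)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):  # skip symlinks, count real files only
                total += os.path.getsize(full)
    return total


def report(dataroot, subdirs=("submissions", "export")):
    """Return {subdir: size_in_bytes} for the given folders under dataroot."""
    return {sub: dir_size_bytes(os.path.join(dataroot, sub)) for sub in subdirs}


if __name__ == "__main__":
    # Hypothetical dataroot location; replace with your own.
    for sub, size in report("/path/to/maharadata").items():
        print(f"{sub}: {size / 1024 ** 3:.2f} GiB")
```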

Cheers

Robert

dan attwood

31 May 2017, 21:14

Hello

 

Thank you for your reply.

I understand now how to link files in the artefact_file_files table to the actual file in the file system which is great.

Where I'm lost now is how to tell whether they are 'valid'.

I took the largest file in the table and found it happily on the file system. It has a last modified time of 2013. This means that it probably isn't valid as the student is likely to have moved on. How do I check if it's still linked to any user?

Could I possibly be seeing a situation where:
- the user is deleted
- it tries to delete a file attached to the user
- it fails due to a permissions error
- it therefore doesn't delete the file or the corresponding row in artefact_file_files
- it continues to delete the user?


Robert Lyon

01 June 2017, 9:22

Hi Dan,

To help find out the user/file status, you can run this query on the database:

SELECT u.username, u.deleted, a.artefacttype, a.title
FROM usr u
JOIN artefact a ON a.owner = u.id
JOIN artefact_file_files aff ON aff.artefact = a.id
WHERE aff.fileid = [id of file]

Here [id of file] is the fileid you want to query. It should show you the username, whether the user is deleted or not, and the name/type of the file.
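
To illustrate what the query returns, here is a small self-contained sketch that runs it against toy tables in an in-memory SQLite database. The tables contain only the columns named in the query, the rows are invented, and SQLite is just a stand-in for your real database, so treat it as a demonstration of the result shape rather than of the actual Mahara schema.

```python
import sqlite3

# The query from above, with the fileid passed in as a parameter.
QUERY = """
SELECT u.username, u.deleted, a.artefacttype, a.title
FROM usr u
JOIN artefact a ON a.owner = u.id
JOIN artefact_file_files aff ON aff.artefact = a.id
WHERE aff.fileid = ?
"""


def owners_of_file(conn, fileid):
    """Return one row per artefact that points at this fileid."""
    return conn.execute(QUERY, (fileid,)).fetchall()


def demo_connection():
    """Build toy tables with invented rows (not the real Mahara schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE usr (id INTEGER PRIMARY KEY, username TEXT, deleted INTEGER);
        CREATE TABLE artefact (id INTEGER PRIMARY KEY, owner INTEGER,
                               artefacttype TEXT, title TEXT);
        CREATE TABLE artefact_file_files (artefact INTEGER, fileid INTEGER);
        INSERT INTO usr VALUES (1, 'alice', 0), (2, 'bob', 1);
        INSERT INTO artefact VALUES
            (10, 1, 'image', 'photo.png'),
            (11, 2, 'file', 'essay.pdf'),
            (12, 1, 'file', 'essay-copy.pdf');
        INSERT INTO artefact_file_files VALUES (10, 292), (11, 12345), (12, 12345);
    """)
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    for row in owners_of_file(conn, 12345):
        print(row)
```

In the toy data, fileid 12345 is referenced by two artefacts, one owned by a deleted user and one by an active user, which is the kind of case where the file on disk would be kept.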

Cheers

Robert
