Forums | Mahara Community
Support
/
Serious corruption in Artefacts Table
05 July 2018, 23:58
Hi,
I'm posting here in the hope that someone can help with an issue that has arisen in our Mahara system.
A few weeks ago, users started reporting that Group pages containing attachments, which have been set up as templates to allow students to copy them were not working correctly. Specifically, when they attempted to download the attached template documents they were getting errors like
Not found
The page you are looking for could not be found.Artefact with id 23596273627 not found
$length = strlen($oldparent->path) + 1;
$params = array($newparent->path, $length, $this->path, db_like_escape("{$this->path}/") . '%');
$sql = "UPDATE {artefact} SET path = ? || SUBSTR(path, ?) WHERE (path = ? OR path LIKE ? )";
execute_sql($sql, $params);
// Make sure that the value of a new path is set.
$this->path = $newparent->path . substr($this->path, $length - 1);
09 July 2018, 13:40
Hi Mike,
This is not a problem I've heard of before but I don't think special chars would have caused the issue as the path only involves the integers from the artefact id fields plus the '/' char.
It might have been caused by a blank 'path' field for an artefact.
In htdocs/lib/db/upgrade.php the lines 3300 -> 3368 have the code that made the 'path' data in the first place.
You could try re-running the population of path information by doing the following:
1) Make a copy of the database first to run the test against.
2) Edit the htdocs/local/upgrade.php file and add those lines wrapped in:
if ($oldversion < 2018070900) {
require_once(get_config('docroot') . 'artefact/lib.php');
// Fill the new field with path data.
... the lines 3300 -> 3368 from htdocs/lib/db/upgrade.php ...
}
3) Update htdocs/local/version.php to set the $config->version = 2018070900;
to match the number in the htdocs/local/upgrade.php
4) You then should be able to run those lines via a link on your site's Admin -> Dashboard page.
If all works fine on database copy then run again against production site.
If it recreates bad paths then could you let me know
Cheers
Robert
10 July 2018, 1:42
Hi Robert,
Thanks very much for your post. I've been continuing to investigate this myself over the last few days and had also come to the conclusion that the most likely cause was a null path parameter.
I took the update query UPDATE {artefact} SETpath=?||SUBSTR(path,?) WHERE (path=? OR path LIKE ?) from the code and ran similar select statements with a null value for path and as anticipated, they return all the records in the table except for two records in the table that actually have NULL values for path (not sure how or why this happened).
In addition, I've noticed that ALL the records that were created AFTER the 24th of May do seem to have valid paths etc, and all the records before this have an invalid path, all pointing to the same parent. This certainly lends weight to the theory that a bug in Mahara Core somehow led to every record on the system being updated by the UPDATE statement when a particular user did *something* unusual, probably on or around May 24th.
Thanks for your suggested fix. I was planning to write a bespoke SQL script but will have a look at doing what you've suggested instead as potentially it will save a lot of time.
I'll let you know how we get on.
Regards,
Mike
10 July 2018, 9:03
Hi Mike
It might be worthwhile to upgrade or apply this patch
https://reviews.mahara.org/#/c/8492/3
which restricts the updating of path
Cheers
Robert
10 July 2018, 21:34
Hello again Robert,
Thanks for pointing me in the direction of the patch. I went through the (rather involved) process of creating an OpenID account and a public key etc but even having done that I get the mesage
Code Review - Error
The page you requested was not found, or you do not have permission to view this page
If click on Go Anonymous instead, I just get a blank page with no code on it.
Could you perhaps please point to the the Bug number so I can hopefully find it in Source Tree?
Anyway, the immediate task is to proceed with the Artefacts Path fix as discussed but I did want to just check whether or not we should be concerned that thousands of Artefact IDs in the artefacts table do not appear to have corresponding files in the filestore (where artefacttype is one of "file", "image", "video", "audio", "archive")?
I ask because I want to avoid the risk that we do all the work to fix the Artefacts table path but then find that our Mahara system is still broken because of the discrepancies between the IDs in the table and the files in the filestore.
Regards,
Mike
10 July 2018, 23:12
I've looked through the commits. Looks like Bug 1724603 - Adding 'owner' column to update for update_hierarchy_path() as per
https://bugs.launchpad.net/mahara/+bug/1724603
Is that correct?
Thanks,
Mike
11 July 2018, 7:37
Hi Mike,
Ah sorry about that - just after I replied our gerrit server was to be upgraded and so is temporarily not working
but yes that is the correct bug report
the patch change can also be seen merged into the mahara repo at
https://git.mahara.org/mahara/mahara/commit/4b1f57b90374b3cc772b1560e1d3d05184716128
Cheers
Robert
11 July 2018, 22:06
Thanks again for getting back to me. I can see the fix on the link you sent. Looks like it was mainly a performance improvement enhancement but it should certainly reduce the risk of the problem reoccurring once we've fixed the data.
Do you have any comment whether or not we should be concerned that thousands of Artefact IDs in the artefacts table do not appear to have corresponding files in the filestore (where artefacttype is one of "file", "image", "video", "audio", "archive")?
I'm just concerned about the risk that we fix the data issues in the artefacts table but still find that our users are having similar problems due to the discrepancies with the filestore highlighted above.
12 July 2018, 10:37
Hi Mike
The files saved into the dataroot (datastore) are done by the fileid value and not the artefact id value - though in most cases they are the same value.
To double check if files are missing from dataroot you can get a list of the file name with path relative to dataroot path by doing the following query:
SELECT a.id AS artefact_id, aff.fileid AS file_id,
CASE WHEN a.artefacttype = 'profileicon'
THEN CONCAT('artefact/file/profileicons/originals/', (aff.fileid % 256), '/', aff.fileid)
ELSE CONCAT('artefact/file/originals/', (aff.fileid % 256), '/', aff.fileid)
END AS dataroot_path
FROM artefact a
JOIN artefact_file_files aff ON aff.artefact = a.id
ORDER BY a.id ASC;
If the files are indeed missing - this could have been due to:
1) a server admin deleting files from dataroot directly (not likely)
2) on deleting a file in mahara, via artefact/file.lib.php delete() function the unlinking of file from dataroot worked but the subsequent delete records from database failed, possibly due to bad path info (more likely)
Cheers
Robert