Posted

May 28, 2014

Tags

MODX, SQL, PHP, MySQL, Snippet

MODX Resources, TV's and MySQL GROUP_CONCAT in Action

Quick and dirty way of fetching resources with template variables in single query using MySQL CONCAT and GROUP_CONCAT string functions. Resulting in JSON string of TV's for fetched resource.

Every MODX developer most likely knows getResources snippet, the ultimate swiss knife for fetching resources to be shown in lift-ups, paginated lists and so on. Not to forget new rival from pdoTools extra which comes with drop-in replacement for getResources with different query processing logic. These both are excellent extras and every MODX site builder should know at least first of two. But are these really necessary? As fetching data from database is generally quite easy, just to fetch resources without TV's is basics of SQL. Below example of simple query that fetches all resources that have parent with id 5.

PHP / xPDO

<?php

$c = $modx->newQuery('modResource');
$c->where(array(
    'parent' => 5
));

$results = $modx->getCollection('modResource', $c);

SQL

SELECT id, pagetitle, parent ...  # Avoid * queries
    FROM modx_site_content
    WHERE parent = 5;

To add TV's to these queries is tricky as the amount of how many TV's there is unknown before hand. Typically people fetch the modResource objects and then use getTemplateVars() method to fetch the TV's. That has slight issue with it, for each resource there will be one SQL query issued, so fetching 20 resource will lead to extra 20 queries. If the TV's would be joined directly to query, then user needs to read many lines with same resource information just to get the TV and TV value.

Using CONCAT and GROUP_CONCAT

Overview of the functions

CONCAT is used to concatenate fields and strings to single string. For example using CONCAT(pagetitle, '_desc') in SELECT would give results of 'example_desc' if the pagetitle happens to be 'example'. More on CONCAT in MySQL Documentation.

GROUP_CONCAT is aggregate group by function which in simplest form returns field values separated with comma. For example GROUP_CONCAT(pagetitle) could return 'example,example2,example3' if we had pages with those names. More on GROUP_CONCAT in MySQL Documentation as GROUP_CONCAT has lot to offer which will be seen with the query to come.

The SQL Query

The query does not have all fields from modx_site_content to keep the example clean. The full PHP snippet later on has the functions to easily add all or specific fields to query without having to memorize whole modx_site_content table.

SELECT c.id, c.pagetitle, c.content, 
	CONCAT("[",GROUP_CONCAT("{\"",vars.name, "\":\"", 
		REPLACE(REPLACE(v.value, "\"", "\\\""), "\n", ""),"\"}"),"]") 
FROM modx_site_content c
	LEFT JOIN modx_site_tmplvar_contentvalues v ON v.contentid = c.id
	LEFT JOIN modx_site_tmplvars vars ON vars.id = v.tmplvarid
GROUP BY c.id;

 From top to bottom:

  1. CONCAT Is used to start the JSON array
  2. GROUP_CONCAT is used to create JSON objects with TV name and value. Value has quotes escaped and line changes replaced with empty characters as JSON does not allow text to span over multiple lines
  3. LEFT JOINS to template variable tables. If inner join would be used the query would not return rows where resource has no template variables
  4. And the important part, group by resource id

Example Output for JSON part beautified

[
    {
        "demoTv": "I am a some TV"
    },
    {
        "demoTv2": "Another TV"
    },
    {
        "demoTv3": "Last TV"
    }
]

This is valid JSON output and you can test your JSON strings with JSONLint

Complete "hastyResources" Snippet Before Security and TV Processing

The Query in PHP

<?php

$parents = $modx->getOption('parents', $scriptProperties, '0');
$rowTpl = $modx->getOption('rowTpl', $scriptProperties, 'exampleRow');
$tvPrefix = $modx->getOption('tvPrefix', $scriptProperties, 'tv.');
$resourceTable = $modx->getTableName('modResource');
$resourceFields = $modx->getSelectColumns('modResource', 'modResource', '');
// By leaving out the array after quotes, all fields will be selected
$tvValueTable = $modx->getTableName('modTemplateVarResource');
$tvNameTable = $modx->getTableName('modTemplateVar');

$sql = 'SELECT '. $resourceFields .',
CONCAT("[",GROUP_CONCAT("{\"'.$tvPrefix.'",vars.name, "\":\"",
		REPLACE(REPLACE(v.value, "\"", "\\\\\\""), "\n", ""),"\"}"),"]") as tvs
FROM '. $resourceTable .' modResource
LEFT JOIN ' . $tvValueTable . ' v ON v.contentid = modResource.id
LEFT JOIN ' . $tvNameTable . ' vars ON vars.id = v.tmplvarid
WHERE modResource.parent IN('.$parents.')
GROUP BY modResource.id';
$modx->exec('SET group_concat_max_len=15000'); // Required for long json's
$stmt = $modx->prepare($sql);
$stmt->execute();
$modx->exec('SET group_concat_max_len=1024'); // Back to default
$out = array();

foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
    $tvTemp = array();
    if ($row['tvs'] !== '') {
        $tvs = json_decode($row['tvs'], true);
        foreach ($tvs as $k => $tv) {
            foreach ($tv as $a => $v) {
                $tvtemp[$a] =$v;
            }

        }
    }
    unset($row['tvs']);
    $out[] = $modx->getChunk($rowTpl, array_merge($row, $tvtemp));
}

return implode("\n", $out);

SQL Output for Query

SELECT `modResource`.`id`,
       `modResource`.`type`,
       `modResource`.`contentType`,
       `modResource`.`pagetitle`,
       `modResource`.`longtitle`,
       `modResource`.`description`,
       <... lots of fields ...>,
       `modResource`.`uri`,
       `modResource`.`uri_override`,
       `modResource`.`hide_children_in_tree`,
       `modResource`.`show_in_tree`,
       `modResource`.`properties`,
       CONCAT("[",GROUP_CONCAT("{\"",vars.name, "\":\"",
           REPLACE(REPLACE(v.value, "\"", "\\\""), "\n", ""),"\"}"),"]") AS tvs
FROM `modx_site_content` modResource
LEFT JOIN `modx_site_tmplvar_contentvalues` v ON v.contentid = modResource.id
LEFT JOIN `modx_site_tmplvars` vars ON vars.id = v.tmplvarid
WHERE modResource.parent IN(7)
GROUP BY modResource.id

To compare how the snippet performs against getResources and pdoResources. I ran siege with 8 concurrent users with fixed url list that was used in db engine tests for 10 minutes. The DBMS is MariaDB 10.0.11 and storage engine is TokuDB. Each snippet was set to load all siblings with &parent=`25` and included 3 different TV's with lot of text content. Also the content field was loaded with each.

Speed comparison between hastyResources, getResources, and pdoResources

Timings

* Green marks the best performer in category

Snippet Page loads Min transaction in seconds Max transaction in seconds Avg Transaction in seconds
hastyResources 19303 0.17 0.95 0.0310832513
getResources 14688 0.2 0.78 0.0408496732
pdoResources 15153 0.17 0.76 0.0395961196

hastyResources took clear win in 10 minute tests having over 4000 pageloads more than next competitor.

Difference in Percentages

  hastyResources getResources pdoResources
hastyResources   27.1542467124 24.0886928256
getResources -27.1542467124   -3.116517543
pdoResources  -24.0886928256 3.116517543  

Adding Policy Check for Load Policy

As the query does not produce modResource objects that does policy check on row load we can mimic this functionality by loading empty modResource object before foreach loop. You can replace lines starting from 25 to end with next piece of code.

$principal = $modx->newObject('modResource');
foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
    $principal->id = $row['id'];
    $principal->_policies = array();
    if ($principal->checkPolicy('load') !== true) {
        continue;
    }
    $tvTemp = array();
    if ($row['tvs'] !== '') {
        $tvs = json_decode($row['tvs'], true);
        foreach ($tvs as $k => $tv) {
            foreach ($tv as $a => $v) {
                $tvtemp[$a] =$v;
            }

        }
    }
    unset($row['tvs']);
    $out[] = $modx->getChunk($rowTpl, array_merge($row, $tvtemp));
}

return implode("\n", $out);

 The performance difference is not that drastic with policy check. With same 10 minute run on hastyResources the benchmark resulted in 18784 page loads which is roughly 2.5% drop in performance. So no biggie issue really.

Processing Template Variables without Unnecessary Queries

If you want to render template variable through their processing methods, it can be done by cheating the TV processing logic by calling directly to output modifiers and by modifying the query bit to have bit more fields that are required by renderer.

SQL Query Modification (can copy paste over the old query)

$sql = 'SELECT '. $resourceFields .',
CONCAT("[",GROUP_CONCAT("{\"name\":\"",vars.name, "\",",
    "\"value\":\"",REPLACE(REPLACE(v.value, "\"", "\\\\\\""), "\n", ""),"\",\"display\":\"",
    vars.display,"\",\"output_properties\":\"",
    REPLACE(vars.output_properties, "\"", "\\\\\\""),"\",
    \"default_text\":\"",vars.default_text,"\"}"),"]") as tvs
FROM '. $resourceTable .' modResource
LEFT JOIN ' . $tvValueTable . ' v ON v.contentid = modResource.id
LEFT JOIN ' . $tvNameTable . ' vars ON vars.id = v.tmplvarid
WHERE modResource.parent IN('.$parents.')
GROUP BY modResource.id';

Modified TV foreach loop

$tvtemp = array();
if ($row['tvs'] !== null) {
    $tvs = json_decode($row['tvs'], true);
    $tmplvar = $modx->newObject('modTemplateVar');
    foreach ($tvs as $k => $tv) {
        $value = (empty($tv['value'])) ? $tv['default_text'] : $tv['value'];
        $tvtemp[$tvPrefix.$tv['name']] = $tmplvar->getRender(unserialize($tv['output_properties']),$value,
            $tmplvar->getRenderDirectories('OnTVOutputRenderList','output'),'output',null,$tv['display']);
    }
}

Closure

To use getResources or pdoResources really is not an question. You should definitely use one of them, you may use hastyResources if you know SQL or are willing to learn more about SQL. If you need count for getPage for example. Just run query with count(id) with the same resource where conditions without the joins. I highly discourage to use getCount() method, all my benchmarks with different sites and using slow log to see what queries cause problems. There is always getCount() queries in the top.

No TV based filtering? hastyResource does not have TV value filtering and there is one good reason why not. The TV value field is not indexed, what ever you do with that column in the query, you end up with query that does not use indexes properly. TV's were not originally even designed to have data that is used for filtering, custom schemas are for that. The indexing issue is not that bad on small sites, but more resources you'll end up having. More likely you'll start to see high load times.

Will there be a loadable extra? Most likely not, unless there is clear need for it. My intention with this article was to bring advanced SQL topics to knowledged of MODX community. The GROUP_CONCAT can be used in many other places also.

What about processing of TV's? I'll try to read the modTemplateVar object class someday and find out the easy way to cheat the processing like I've done with the policy check. This has been done.

Please do leave a comment if you want this to be new extra or drop me a line on twitter, retweet or just reply to tweet where I gave link to this article.

The Snippet in it's Full Height

<?php
$parents = $modx->getOption('parents', $scriptProperties, '0');
$rowTpl = $modx->getOption('rowTpl', $scriptProperties, 'exampleRow');
$tvPrefix = $modx->getOption('tvPrefix', $scriptProperties, 'tv.');
$resourceTable = $modx->getTableName('modResource');
$resourceFields = $modx->getSelectColumns('modResource', 'modResource', '');
// By leaving out the array after quotes, all fields will be selected
$tvValueTable = $modx->getTableName('modTemplateVarResource');
$tvNameTable = $modx->getTableName('modTemplateVar');

$sql = 'SELECT '. $resourceFields .',
CONCAT("[",GROUP_CONCAT("{\"name\":\"",vars.name, "\",",
    "\"value\":\"",REPLACE(REPLACE(v.value, "\"", "\\\\\\""), "\n", ""),"\",\"display\":\"",
    vars.display,"\",\"output_properties\":\"",
    REPLACE(vars.output_properties, "\"", "\\\\\\""),"\",
    \"default_text\":\"",vars.default_text,"\"}"),"]") as tvs
FROM '. $resourceTable .' modResource
LEFT JOIN ' . $tvValueTable . ' v ON v.contentid = modResource.id
LEFT JOIN ' . $tvNameTable . ' vars ON vars.id = v.tmplvarid
WHERE modResource.parent IN('.$parents.')
GROUP BY modResource.id';
$modx->exec('SET group_concat_max_len=15000'); // Required for long json's
$stmt = $modx->prepare($sql);
$stmt->execute();
$modx->exec('SET group_concat_max_len=1024'); // Back to default
$out = array();

$principal = $modx->newObject('modResource');
foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
    $principal->id = $row['id'];
    $principal->_policies = array();
    if ($principal->checkPolicy('load') !== true) {
        continue;
    }
$tvtemp = array();
if ($row['tvs'] !== null) {
    $tvs = json_decode($row['tvs'], true);
    $tmplvar = $modx->newObject('modTemplateVar');
    foreach ($tvs as $k => $tv) {
        $value = (empty($tv['value'])) ? $tv['default_text'] : $tv['value'];
        $tvtemp[$tvPrefix.$tv['name']] = $tmplvar->getRender(unserialize($tv['output_properties']),$value,
            $tmplvar->getRenderDirectories('OnTVOutputRenderList','output'),'output',null,$tv['display']);
    }
}
    unset($row['tvs']);
    $out[] = $modx->getChunk($rowTpl, array_merge($row, $tvtemp));
}

return implode("\n", $out);

Errata / Changes

  1. Added if null check for TV's when there is not TV's for resource so foreach loop does not throw error. Also added tvTemp array so array_merge wont fail.
  2. Added possibility for processing TV's
  3. Added whole script with security check and TV processing
  4. Added MAX_GROUP_CONCAT_LEN variable

Share your Thoughts

0 Responses

Your email address will not be published

Please enter your name.
Please enter valid email address.
Please enter your website address.
Please enter your message.
TwitterX Error: Could not load tweets as Twitter responded with the error: 'Invalid or expired token.'.
-->

Contact me with email or add my account to skype

the_dunnock@outlook.com

the_dunnock@outlook.com