1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830
|
<?php
/**
* Matomo - free/libre analytics platform
*
* @link https://matomo.org
* @license https://www.gnu.org/licenses/gpl-3.0.html GPL v3 or later
*/
namespace Piwik\Plugins\Actions;
use PDOStatement;
use Piwik\Config;
use Piwik\DataTable\Row\DataTableSummaryRow;
use Piwik\DataTable;
use Piwik\DataTable\Row;
use Piwik\Metrics as PiwikMetrics;
use Piwik\Piwik;
use Piwik\RankingQuery;
use Piwik\Tracker\Action;
use Piwik\Tracker\PageUrl;
use Piwik\Tracker\GoalManager;
use Zend_Db_Statement;
/**
* This static class provides:
* - logic to parse/cleanup Action names,
* - logic to efficiently process aggregate the array data during Archiving
*
*/
class ArchivingHelper
{
public const OTHERS_ROW_KEY = '';
/**
* Ideally this should use the DataArray object instead of custom data structure
*
* @param Zend_Db_Statement|PDOStatement $query
* @param string|bool $fieldQueried
* @param array $actionsTablesByType
* @return int
*/
public static function updateActionsTableWithRowQuery($query, $fieldQueried, $actionsTablesByType, $metricsConfig)
{
$rowsProcessed = 0;
while ($row = $query->fetch()) {
if (empty($row['idaction'])) {
$row['type'] = ($fieldQueried == 'idaction_url' ? Action::TYPE_PAGE_URL : Action::TYPE_PAGE_TITLE);
// This will be replaced with 'X not defined' later
$row['name'] = '';
// Yes, this is kind of a hack, so we don't mix 'page url not defined' with 'page title not defined' etc.
$row['idaction'] = -$row['type'];
}
if ($row['type'] != Action::TYPE_SITE_SEARCH) {
unset($row[PiwikMetrics::INDEX_SITE_SEARCH_HAS_NO_RESULT]);
}
if (in_array($row['type'], array(Action::TYPE_CONTENT, Action::TYPE_EVENT, Action::TYPE_EVENT_NAME, Action::TYPE_CONTENT_PIECE, Action::TYPE_CONTENT_TARGET))) {
continue;
}
$hasRowName = !empty($row['name']) && $row['name'] != RankingQuery::LABEL_SUMMARY_ROW;
// This will appear as <url /> in the API, which is actually very important to keep
// eg. When there's at least one row in a report that does not have a URL, not having this <url/> would break HTML/PDF reports.
$url = '';
$pageTitlePath = null;
if (
$row['type'] == Action::TYPE_SITE_SEARCH
|| $row['type'] == Action::TYPE_PAGE_TITLE
) {
$url = null;
if ($hasRowName) {
$pageTitlePath = $row['name'];
}
} elseif ($hasRowName) {
$url = PageUrl::reconstructNormalizedUrl((string)$row['name'], $row['url_prefix']);
}
if (
isset($row['name'])
&& isset($row['type'])
) {
$actionName = $row['name'];
$actionType = $row['type'];
$urlPrefix = $row['url_prefix'];
$idaction = $row['idaction'];
// in some unknown case, the type field is NULL, as reported in #1082 - we ignore this page view
if (empty($actionType)) {
if ($idaction != DataTable::LABEL_SUMMARY_ROW) {
self::setCachedActionRow($idaction, $actionType, false);
}
continue;
}
$actionRow = self::getActionRow($actionName, $actionType, $urlPrefix, $actionsTablesByType);
self::setCachedActionRow($idaction, $actionType, $actionRow);
} else {
$actionRow = self::getCachedActionRow($row['idaction'], $row['type']);
// Action processed as "to skip" for some reasons
if ($actionRow === false) {
continue;
}
}
if (is_null($actionRow)) {
continue;
}
// Here we do ensure that, the Metadata URL set for a given row, is the one from the Pageview with the most hits.
// This is to ensure that when, different URLs are loaded with the same page name.
// For example https://piwik.org and https://id.piwik.org are reported in Piwik > Actions > Pages with /index
// But, we must make sure https://piwik.org is used to link & for transitions
// Note: this code is partly duplicated from Row->sumRowMetadata()
if (
!is_null($url)
&& !$actionRow->isSummaryRow()
) {
if (($existingUrl = $actionRow->getMetadata('url')) !== false) {
if (
!empty($row[PiwikMetrics::INDEX_PAGE_NB_HITS])
&& $row[PiwikMetrics::INDEX_PAGE_NB_HITS] > $actionRow->maxVisitsSummed
) {
$actionRow->setMetadata('url', $url);
$actionRow->maxVisitsSummed = $row[PiwikMetrics::INDEX_PAGE_NB_HITS];
}
} else {
$actionRow->setMetadata('url', $url);
$actionRow->maxVisitsSummed = !empty($row[PiwikMetrics::INDEX_PAGE_NB_HITS]) ? $row[PiwikMetrics::INDEX_PAGE_NB_HITS] : 0;
}
}
if (
$pageTitlePath !== null
&& !$actionRow->isSummaryRow()
) {
$actionRow->setMetadata('page_title_path', $pageTitlePath);
}
if (
$row['type'] != Action::TYPE_PAGE_URL
&& $row['type'] != Action::TYPE_PAGE_TITLE
) {
// only keep performance metrics when they're used (i.e. for URLs and page titles)
if (array_key_exists(PiwikMetrics::INDEX_PAGE_SUM_TIME_GENERATION, $row)) {
unset($row[PiwikMetrics::INDEX_PAGE_SUM_TIME_GENERATION]);
}
if (array_key_exists(PiwikMetrics::INDEX_PAGE_NB_HITS_WITH_TIME_GENERATION, $row)) {
unset($row[PiwikMetrics::INDEX_PAGE_NB_HITS_WITH_TIME_GENERATION]);
}
if (array_key_exists(PiwikMetrics::INDEX_PAGE_MIN_TIME_GENERATION, $row)) {
unset($row[PiwikMetrics::INDEX_PAGE_MIN_TIME_GENERATION]);
}
if (array_key_exists(PiwikMetrics::INDEX_PAGE_MAX_TIME_GENERATION, $row)) {
unset($row[PiwikMetrics::INDEX_PAGE_MAX_TIME_GENERATION]);
}
}
unset($row['name']);
unset($row['type']);
unset($row['idaction']);
unset($row['url_prefix']);
foreach ($row as $name => $value) {
// in some edge cases, we have twice the same action name with 2 different idaction
// - this happens when 2 visitors visit the same new page at the same time, and 2 actions get recorded for the same name
// - this could also happen when 2 URLs end up having the same label (eg. 2 subdomains get aggregated to the "/index" page name)
if (($alreadyValue = $actionRow->getColumn($name)) !== false) {
$newValue = self::getColumnValuesMerged($name, $alreadyValue, $value, $metricsConfig);
$actionRow->setColumn($name, $newValue);
} else {
$actionRow->addColumn($name, $value);
}
}
// if the exit_action was not recorded properly in the log_link_visit_action
// there would be an error message when getting the nb_hits column
// we must fake the record and add the columns
if ($actionRow->getColumn(PiwikMetrics::INDEX_PAGE_NB_HITS) === false) {
// to test this code: delete the entries in log_link_action_visit for
// a given exit_idaction_url
foreach (self::getDefaultRow()->getColumns() as $name => $value) {
$actionRow->addColumn($name, $value);
}
}
$rowsProcessed++;
}
// just to make sure php copies the last $actionRow in the $parentTable array
$actionRow =& $actionsTablesByType;
return $rowsProcessed;
}
/**
* Update the existing action datatable with goal columns
*
* @param Zend_Db_Statement|PDOStatement $resultSet Result set from the goals data query
* @param bool $isPages True if page view goals metrics should be used, else entry goal metrics
*
* @return int Number of rows processed
* @throws \Exception
*/
public static function updateActionsTableWithGoals($resultSet, bool $isPages): int
{
$rowsProcessed = 0;
while ($row = $resultSet->fetch()) {
if (self::updateActionsTableRowWithGoals($row, $isPages)) {
$rowsProcessed++;
}
}
return $rowsProcessed;
}
/**
* Add goals metrics to a single row of the actions table
*
* @param array $row The array of goals metric data to add to the action table row
* @param bool $isPages True if page view goals metrics should be used, else entry goal metrics
*
* @throws \Exception
*/
private static function updateActionsTableRowWithGoals(array $row, bool $isPages): bool
{
if (!isset($row['idaction']) || !isset($row['type'])) {
return false;
}
// Match the existing action row in the datatable
$actionRow = self::getCachedActionRow($row['idaction'], $row['type']);
if ($actionRow === false || is_null($actionRow)) {
return false;
}
// Define the possible goal metrics available in the goals data resultset
if ($isPages) {
// Page view metrics
$possibleMetrics = [
PiwikMetrics::INDEX_GOAL_NB_CONVERSIONS => 'nb_conversions', // 1
PiwikMetrics::INDEX_GOAL_REVENUE => 'revenue', // 2
PiwikMetrics::INDEX_GOAL_NB_PAGES_UNIQ_BEFORE => 'nb_conv_pages_before', // 9
PiwikMetrics::INDEX_GOAL_NB_CONVERSIONS_ATTRIB => 'nb_conversions_attrib', // 10
PiwikMetrics::INDEX_GOAL_NB_CONVERSIONS_PAGE_RATE => 'nb_conversions_page_rate', // 11
PiwikMetrics::INDEX_GOAL_NB_CONVERSIONS_PAGE_UNIQ => 'nb_conversions_page_uniq', // 12
PiwikMetrics::INDEX_GOAL_REVENUE_ATTRIB => 'revenue_attrib', // 15
];
} else {
// Entry page metrics
$possibleMetrics = [
PiwikMetrics::INDEX_GOAL_REVENUE_ENTRY => 'revenue_entry', // 17
PiwikMetrics::INDEX_GOAL_NB_CONVERSIONS_ENTRY_RATE => 'nb_conversions_entry_rate', // 12
PiwikMetrics::INDEX_GOAL_REVENUE_PER_ENTRY => 'revenue_per_entry', // 13
PiwikMetrics::INDEX_GOAL_NB_CONVERSIONS_ENTRY => 'nb_conversions_entry', // 16
];
}
unset($row['type']);
unset($row['idaction']);
if (!isset($row['idgoal'])) {
return false;
}
if (
$isPages &&
isset($row[PiwikMetrics::INDEX_GOAL_NB_CONVERSIONS_ATTRIB]) &&
isset($row[PiwikMetrics::INDEX_GOAL_NB_PAGES_UNIQ_BEFORE])
) {
/**
* Ensures this metric is available. It will be calculated later using a filter.
* @see \Piwik\Plugins\Goals\DataTable\Filter\CalculateConversionPageRate
*/
$row[PiwikMetrics::INDEX_GOAL_NB_CONVERSIONS_PAGE_RATE] = 0;
}
if (isset($row[PiwikMetrics::INDEX_GOAL_NB_CONVERSIONS_ATTRIB])) {
$row[PiwikMetrics::INDEX_GOAL_NB_CONVERSIONS_ATTRIB] = (float) $row[PiwikMetrics::INDEX_GOAL_NB_CONVERSIONS_ATTRIB];
}
if (isset($row[PiwikMetrics::INDEX_GOAL_REVENUE_ATTRIB])) {
$row[PiwikMetrics::INDEX_GOAL_REVENUE_ATTRIB] = (float) $row[PiwikMetrics::INDEX_GOAL_REVENUE_ATTRIB];
}
if (!$isPages) {
$nbEntrances = $actionRow->getColumn(PiwikMetrics::INDEX_PAGE_ENTRY_NB_VISITS);
$conversions = $row[PiwikMetrics::INDEX_GOAL_NB_CONVERSIONS_ENTRY];
if ($nbEntrances !== false && is_numeric($nbEntrances) && $nbEntrances > 0) {
// Calculate conversion entry rate
if (isset($row[PiwikMetrics::INDEX_GOAL_NB_CONVERSIONS_ENTRY])) {
$row[PiwikMetrics::INDEX_GOAL_NB_CONVERSIONS_ENTRY_RATE] = Piwik::getQuotientSafe(
$conversions,
$nbEntrances,
GoalManager::REVENUE_PRECISION + 1
);
}
// Calculate revenue per entry
if (isset($row[PiwikMetrics::INDEX_GOAL_REVENUE_ENTRY])) {
$row[PiwikMetrics::INDEX_GOAL_REVENUE_PER_ENTRY] = (float) Piwik::getQuotientSafe(
$row[PiwikMetrics::INDEX_GOAL_REVENUE_ENTRY],
$nbEntrances,
GoalManager::REVENUE_PRECISION + 1
);
$row[PiwikMetrics::INDEX_GOAL_REVENUE_ENTRY] = (float) $row[PiwikMetrics::INDEX_GOAL_REVENUE_ENTRY];
}
}
}
// Get goals column
$goalsColumn = $actionRow->getColumn(PiwikMetrics::INDEX_GOALS);
if ($goalsColumn === false) {
$goalsColumn = [];
}
// Create goal subarray if not exists
if (!isset($goalsColumn[$row['idgoal']])) {
$goalsColumn[$row['idgoal']] = [];
}
// Find metric columns in the goal query row and add them to the actions data table row
foreach ($possibleMetrics as $metricKey => $columnName) {
if (isset($row[$metricKey])) {
// Add metric
if (!isset($goalsColumn[$row['idgoal']][$metricKey])) {
$goalsColumn[$row['idgoal']][$metricKey] = $row[$metricKey];
} else {
if ($metricKey == PiwikMetrics::INDEX_GOAL_NB_PAGES_UNIQ_BEFORE) {
if ($goalsColumn[$row['idgoal']][$metricKey] < $row[$metricKey]) {
$goalsColumn[$row['idgoal']][$metricKey] = $row[$metricKey];
}
} else {
$goalsColumn[$row['idgoal']][$metricKey] += $row[$metricKey];
}
}
// Write goals column back to datatable
$actionRow->setColumn(PiwikMetrics::INDEX_GOALS, $goalsColumn);
}
}
return true;
}
public static function removeEmptyColumns($dataTable)
{
// Delete all columns that have a value of zero
$dataTable->filter('ColumnDelete', array(
$columnsToRemove = array(PiwikMetrics::INDEX_PAGE_IS_FOLLOWING_SITE_SEARCH_NB_HITS),
$columnsToKeep = array(),
$deleteIfZeroOnly = true,
));
}
/**
* For rows which have subtables (eg. directories with sub pages),
* deletes columns which don't make sense when all values of sub pages are summed.
*
* @param $dataTable DataTable
*/
public static function deleteInvalidSummedColumnsFromDataTable($dataTable)
{
foreach ($dataTable->getRows() as $id => $row) {
if (
($idSubtable = $row->getIdSubDataTable()) !== null
|| $id === DataTable::ID_SUMMARY_ROW
) {
$subTable = $row->getSubtable();
if ($subTable) {
self::deleteInvalidSummedColumnsFromDataTable($subTable);
}
if ($row instanceof DataTableSummaryRow) {
$row->recalculate();
}
foreach (Metrics::$columnsToDeleteAfterAggregation as $name) {
$row->deleteColumn($name);
}
}
}
// And this as well
ArchivingHelper::removeEmptyColumns($dataTable);
}
/**
* Returns the limit to use with RankingQuery for this plugin.
*
* @return int
*/
public static function getRankingQueryLimit()
{
$configGeneral = Config::getInstance()->General;
$configLimit = $configGeneral['archiving_ranking_query_row_limit'];
$limit = $configLimit == 0 ? 0 : max(
$configLimit,
$configGeneral['datatable_archiving_maximum_rows_actions'],
$configGeneral['datatable_archiving_maximum_rows_subtable_actions']
);
// FIXME: This is a quick fix for #3482. The actual cause of the bug is that
// the site search & performance metrics additions to
// ArchivingHelper::updateActionsTableWithRowQuery expect every
// row to have 'type' data, but not all of the SQL queries that are run w/o
// ranking query join on the log_action table and thus do not select the
// log_action.type column.
//
// NOTES: Archiving logic can be generalized as follows:
// 0) Do SQL query over log_link_visit_action & join on log_action to select
// some metrics (like visits, hits, etc.)
// 1) For each row, cache the action row & metrics. (This is done by
// updateActionsTableWithRowQuery for result set rows that have
// name & type columns.)
// 2) Do other SQL queries for metrics we can't put in the first query (like
// entry visits, exit vists, etc.) w/o joining log_action.
// 3) For each row, find the cached row by idaction & add the new metrics to
// it. (This is done by updateActionsTableWithRowQuery for result set rows
// that DO NOT have name & type columns.)
//
// The site search & performance metrics additions expect a 'type' all the time
// which breaks the original pre-rankingquery logic. Ranking query requires a
// join, so the bug is only seen when ranking query is disabled.
if ($limit === 0) {
$limit = 100000;
}
return $limit;
}
/**
* @param $columnName
* @param $alreadyValue
* @param $value
* @return mixed
*/
private static function getColumnValuesMerged($columnName, $alreadyValue, $value, $metricsConfig)
{
if (array_key_exists($columnName, $metricsConfig)) {
$config = $metricsConfig[$columnName];
if (!empty($config['aggregation'])) {
if ($config['aggregation'] == 'min') {
if (empty($alreadyValue)) {
$newValue = $value;
} elseif (empty($value)) {
$newValue = $alreadyValue;
} else {
$newValue = min($alreadyValue, $value);
}
return $newValue;
}
if ($config['aggregation'] == 'max') {
$newValue = max($alreadyValue, $value);
return $newValue;
}
}
}
$newValue = $alreadyValue + $value;
return $newValue;
}
public static $maximumRowsInDataTableLevelZero;
public static $maximumRowsInSubDataTable;
public static $maximumRowsInDataTableSiteSearch;
public static $columnToSortByBeforeTruncation;
protected static $actionUrlCategoryDelimiter = null;
protected static $actionTitleCategoryDelimiter = null;
protected static $defaultActionName = null;
protected static $defaultActionNameWhenNotDefined = null;
protected static $defaultActionUrlWhenNotDefined = null;
public static function reloadConfig()
{
// for BC, we read the old style delimiter first (see #1067)
$actionDelimiter = @Config::getInstance()->General['action_category_delimiter'];
if (empty($actionDelimiter)) {
self::$actionUrlCategoryDelimiter = Config::getInstance()->General['action_url_category_delimiter'];
self::$actionTitleCategoryDelimiter = Config::getInstance()->General['action_title_category_delimiter'];
} else {
self::$actionUrlCategoryDelimiter = self::$actionTitleCategoryDelimiter = $actionDelimiter;
}
self::$defaultActionName = Config::getInstance()->General['action_default_name'];
self::$columnToSortByBeforeTruncation = PiwikMetrics::INDEX_NB_VISITS;
self::$maximumRowsInDataTableLevelZero = Config::getInstance()->General['datatable_archiving_maximum_rows_actions'];
self::$maximumRowsInSubDataTable = Config::getInstance()->General['datatable_archiving_maximum_rows_subtable_actions'];
self::$maximumRowsInDataTableSiteSearch = Config::getInstance()->General['datatable_archiving_maximum_rows_site_search'];
DataTable::setMaximumDepthLevelAllowedAtLeast(self::getSubCategoryLevelLimit() + 1);
}
/**
* The default row is used when archiving, if data is inconsistent in the DB,
* there could be pages that have exit/entry hits, but don't yet
* have a record in the table (or the record was truncated).
*
* @return Row
*/
private static function getDefaultRow()
{
static $row = false;
if ($row === false) {
// This row is used in the case where an action is know as an exit_action
// but this action was not properly recorded when it was hit in the first place
// so we add this fake row information to make sure there is a nb_hits, etc. column for every action
$row = new Row(array(
Row::COLUMNS => array(
PiwikMetrics::INDEX_NB_VISITS => 1,
PiwikMetrics::INDEX_NB_UNIQ_VISITORS => 1,
PiwikMetrics::INDEX_PAGE_NB_HITS => 1,
)));
}
return $row;
}
/**
* Given a page name and type, builds a recursive datatable where
* each level of the tree is a category, based on the page name split by a delimiter (slash / by default)
*
* @param string $actionName
* @param int $actionType
* @param int $urlPrefix
* @param array $actionsTablesByType
* @return DataTable\Row
*/
public static function getActionRow($actionName, $actionType, $urlPrefix, $actionsTablesByType)
{
// we work on the root table of the given TYPE (either ACTION_URL or DOWNLOAD or OUTLINK etc.)
/* @var DataTable $currentTable */
$currentTable = $actionsTablesByType[$actionType];
if (is_null($currentTable)) {
throw new \Exception("Action table for type '$actionType' was not found during Actions archiving.");
}
// check for ranking query cut-off
if ($actionName == RankingQuery::LABEL_SUMMARY_ROW) {
$summaryRow = $currentTable->getRowFromId(DataTable::ID_SUMMARY_ROW);
if ($summaryRow === false) {
$summaryRow = $currentTable->addSummaryRow(self::createSummaryRow());
}
return $summaryRow;
}
// go to the level of the subcategory
$actionExplodedNames = self::getActionExplodedNames($actionName, $actionType, $urlPrefix);
list($row, $level) = $currentTable->walkPath(
$actionExplodedNames,
self::getDefaultRowColumns(),
self::$maximumRowsInSubDataTable
);
return $row;
}
/**
* Returns the configured sub-category level limit.
*
* @return int
*/
public static function getSubCategoryLevelLimit()
{
return Config::getInstance()->General['action_category_level_limit'];
}
/**
* Returns default label for the action type
*
* @param $type
* @return string
*/
public static function getUnknownActionName($type)
{
if (empty(self::$defaultActionNameWhenNotDefined)) {
self::$defaultActionNameWhenNotDefined = Piwik::translate('General_NotDefined', Piwik::translate('Actions_ColumnPageName'));
self::$defaultActionUrlWhenNotDefined = Piwik::translate('General_NotDefined', Piwik::translate('Actions_ColumnPageURL'));
}
if ($type == Action::TYPE_PAGE_TITLE) {
return self::$defaultActionNameWhenNotDefined;
}
return self::$defaultActionUrlWhenNotDefined;
}
/**
* Explodes action name into an array of elements.
*
* NOTE: before calling this function make sure ArchivingHelper::reloadConfig(); is called
*
* for downloads:
* we explode link https://piwik.org/some/path/piwik.zip into an array( 'piwik.org', '/some/path/piwik.zip' );
*
* for outlinks:
* we explode link https://dev.piwik.org/some/path into an array( 'dev.piwik.org', '/some/path' );
*
* for action urls:
* we explode link https://piwik.org/some/path into an array( 'some', 'path' );
*
* for action names:
* we explode name 'Piwik / Category 1 / Category 2' into an array('Matomo', 'Category 1', 'Category 2');
*
* @param string $name action name
* @param int $type action type
* @param int $urlPrefix url prefix (only used for TYPE_PAGE_URL)
* @return array of exploded elements from $name
*/
public static function getActionExplodedNames($name, $type, $urlPrefix = null)
{
// Site Search does not split Search keywords
if ($type == Action::TYPE_SITE_SEARCH) {
return array($name);
}
$name = str_replace("\n", "", $name);
if ($type == Action::TYPE_PAGE_TITLE && self::$actionTitleCategoryDelimiter === '') {
if ($name === '' || $name === false || $name === null || trim($name) === '') {
$name = self::getUnknownActionName($type);
}
return array(' ' . trim($name));
}
$name = self::parseNameFromPageUrl($name, $type, $urlPrefix);
// outlinks and downloads
if (is_array($name)) {
return $name;
}
$split = self::splitNameByDelimiter($name, $type);
if (empty($split)) {
$defaultName = self::getUnknownActionName($type);
return array(trim($defaultName));
}
$lastPageName = end($split);
// we are careful to prefix the page URL / name with some value
// so that if a page has the same name as a category
// we don't merge both entries
if ($type != Action::TYPE_PAGE_TITLE) {
$lastPageName = '/' . $lastPageName;
} else {
$lastPageName = ' ' . $lastPageName;
}
$split[count($split) - 1] = $lastPageName;
return array_values($split);
}
/**
* Gets the key for the cache of action rows from an action ID and type.
*
* @param int $idAction
* @param int $actionType
* @return string|int
*/
private static function getCachedActionRowKey($idAction, $actionType)
{
return $idAction == RankingQuery::LABEL_SUMMARY_ROW
? $actionType . '_others'
: $idAction;
}
/**
* Static cache to store Rows during processing
*/
protected static $cacheParsedAction = array();
public static function clearActionsCache()
{
self::$cacheParsedAction = array();
}
/**
* Get cached action row by id & type. If $idAction is set to -1, the 'Others' row
* for the specific action type will be returned.
*
* @param int $idAction
* @param int $actionType
* @return Row|false
*/
private static function getCachedActionRow($idAction, $actionType)
{
$cacheLabel = self::getCachedActionRowKey($idAction, $actionType);
if (!isset(self::$cacheParsedAction[$cacheLabel])) {
// This can happen when
// - We select an entry page ID that was only seen yesterday, so wasn't selected in the first query
// - We count time spent on a page, when this page was only seen yesterday
return false;
}
return self::$cacheParsedAction[$cacheLabel];
}
/**
* Set cached action row for an id & type.
*
* @param int $idAction
* @param int $actionType
* @param \Piwik\DataTable\Row
*/
private static function setCachedActionRow($idAction, $actionType, $actionRow)
{
$cacheLabel = self::getCachedActionRowKey($idAction, $actionType);
self::$cacheParsedAction[$cacheLabel] = $actionRow;
}
/**
* Returns the default columns for a row in an Actions DataTable.
*
* @return array
*/
private static function getDefaultRowColumns()
{
return array(PiwikMetrics::INDEX_NB_VISITS => 0,
PiwikMetrics::INDEX_NB_UNIQ_VISITORS => 0,
PiwikMetrics::INDEX_PAGE_NB_HITS => 0,
PiwikMetrics::INDEX_PAGE_SUM_TIME_SPENT => 0);
}
/**
* Creates a summary row for an Actions DataTable.
*
* @return Row
*/
private static function createSummaryRow()
{
$summaryRow = new Row(array(
Row::COLUMNS =>
array('label' => DataTable::LABEL_SUMMARY_ROW) + self::getDefaultRowColumns(),
));
$summaryRow->setIsSummaryRow(); // this should be set in DataTable::addSummaryRow(), but we set it here as well to be safe
return $summaryRow;
}
private static function splitNameByDelimiter($name, $type)
{
if (is_array($name)) {
return $name;
}
if ($type == Action::TYPE_PAGE_TITLE) {
$categoryDelimiter = self::$actionTitleCategoryDelimiter;
} else {
$categoryDelimiter = self::$actionUrlCategoryDelimiter;
}
if (empty($categoryDelimiter)) {
return array(trim($name));
}
$split = explode($categoryDelimiter, $name, self::getSubCategoryLevelLimit());
// trim every category and remove empty categories
$split = array_map('trim', $split);
$split = array_filter($split, 'strlen');
// forces array key to start at 0
$split = array_values($split);
return $split;
}
private static function parseNameFromPageUrl($name, $type, $urlPrefix)
{
$urlRegexAfterDomain = '([^/]+)[/]?([^#]*)[#]?(.*)';
if ($urlPrefix === null) {
// match url with protocol (used for outlinks / downloads)
$urlRegex = '@^http[s]?://' . $urlRegexAfterDomain . '$@i';
} else {
// the name is a url that does not contain protocol and www anymore
// we know that normalization has been done on db level because $urlPrefix is set
$urlRegex = '@^' . $urlRegexAfterDomain . '$@i';
}
$matches = array();
preg_match($urlRegex, $name, $matches);
if (!count($matches)) {
return $name;
}
$urlHost = $matches[1];
$urlPath = $matches[2];
$urlFragment = $matches[3];
if (in_array($type, array(Action::TYPE_DOWNLOAD, Action::TYPE_OUTLINK))) {
$path = '/' . trim($urlPath);
if (!empty($urlFragment)) {
$path .= '#' . $urlFragment;
}
return array(trim($urlHost), $path);
}
$name = $urlPath;
if ($name === '' || substr($name, -1) == '/') {
$name .= self::$defaultActionName;
}
$urlFragment = PageUrl::processUrlFragment($urlFragment);
if (!empty($urlFragment)) {
$name .= '#' . $urlFragment;
}
return $name;
}
public static function setFolderPathMetadata(DataTable $dataTable, $isUrl, $prefix = '')
{
$configGeneral = Config::getInstance()->General;
$separator = $isUrl ? '/' : $configGeneral['action_title_category_delimiter'];
$metadataName = $isUrl ? 'folder_url_start' : 'page_title_path';
foreach ($dataTable->getRows() as $row) {
$subtable = $row->getSubtable();
if (!$subtable) {
continue;
}
$metadataValue = $prefix . $row->getColumn('label');
$row->setMetadata($metadataName, $metadataValue);
self::setFolderPathMetadata($subtable, $isUrl, $metadataValue . $separator);
}
}
}
|