<?php
/*
*
* @Author: Zuhair Mirza
* @Implementation Of Page Rank in PHP
* @Date : 18-November-2014
*
<!--
PageRank is an algorithm used by Google Search to rank websites in their search engine results. PageRank was named after Larry Page, one of the founders of Google. PageRank is a way of measuring the importance of website pages. According to Google:PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is.The underlying assumption is that more important websites are likely to receive more links from other websites.
-->
$links = array(
1 => array(5),
2 => array(4, 7, 8, 9),
3 => array(1, 3, 7),
4 => array(1, 2, 4, 8),
5 => array(1, 6, 7, 9),
6 => array(1, 5),
8 => array(3, 4, 6, 8),
9 => array(1, 8)
);
$pageRank = calculatePageRank($links);
var_dump($pageRank);
?>
<!--If we look at this link graph, a few things stand out. Page 1 is pointed to by almost everything, so is presumably very good. Page 4 is also very popular, so should be expected to rank well. We can then calculate the PageRank by running our iterative process.
We've got a hard limit of 100 iterations, but we also keep track of the amount of change between one set of PageRanks and the next.
If that change drops below a set level (here 0.00005) the process decides it's stable enough and stops.-->
<?php
function calculatePageRank($linkGraph, $dampingFactor = 0.15) {
$pageRank = array();
$tempRank = array();
$nodeCount = count($linkGraph);
// initialise the PR as 1/n
foreach ($linkGraph as $node => $outbound) {
$pageRank[$node] = 1 / $nodeCount;
$tempRank[$node] = 0;
}
$change = 1;
$i = 0;
while ($change > 0.00005 && $i < 100) {
$change = 0;
$i++;
// distribute the PR of each page
foreach ($linkGraph as $node => $outbound) {
$outboundCount = count($outbound);
foreach ($outbound as $link) {
$tempRank[$link] += $pageRank[$node] / $outboundCount;
}
}
$total = 0;
// calculate the new PR using the damping factor
foreach ($linkGraph as $node => $outbound) {
$tempRank[$node] = ($dampingFactor / $nodeCount) + (1 - $dampingFactor) * $tempRank[$node];
$change += abs($pageRank[$node] - $tempRank[$node]);
$pageRank[$node] = $tempRank[$node];
$tempRank[$node] = 0;
$total += $pageRank[$node];
}
// Normalise the page ranks so it's all a proportion 0-1
foreach ($pageRank as $node => $score) {
$pageRank[$node] /= $total;
}
}
return $pageRank;
}
?>
/*
*
* @Author: Zuhair Mirza
* @Implementation Of Page Rank in PHP
* @Date : 18-November-2014
*
<!--
PageRank is an algorithm used by Google Search to rank websites in their search engine results. PageRank was named after Larry Page, one of the founders of Google. PageRank is a way of measuring the importance of website pages. According to Google:PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is.The underlying assumption is that more important websites are likely to receive more links from other websites.
-->
$links = array(
1 => array(5),
2 => array(4, 7, 8, 9),
3 => array(1, 3, 7),
4 => array(1, 2, 4, 8),
5 => array(1, 6, 7, 9),
6 => array(1, 5),
8 => array(3, 4, 6, 8),
9 => array(1, 8)
);
$pageRank = calculatePageRank($links);
var_dump($pageRank);
?>
<!--If we look at this link graph, a few things stand out. Page 1 is pointed to by almost everything, so is presumably very good. Page 4 is also very popular, so should be expected to rank well. We can then calculate the PageRank by running our iterative process.
We've got a hard limit of 100 iterations, but we also keep track of the amount of change between one set of PageRanks and the next.
If that change drops below a set level (here 0.00005) the process decides it's stable enough and stops.-->
<?php
function calculatePageRank($linkGraph, $dampingFactor = 0.15) {
$pageRank = array();
$tempRank = array();
$nodeCount = count($linkGraph);
// initialise the PR as 1/n
foreach ($linkGraph as $node => $outbound) {
$pageRank[$node] = 1 / $nodeCount;
$tempRank[$node] = 0;
}
$change = 1;
$i = 0;
while ($change > 0.00005 && $i < 100) {
$change = 0;
$i++;
// distribute the PR of each page
foreach ($linkGraph as $node => $outbound) {
$outboundCount = count($outbound);
foreach ($outbound as $link) {
$tempRank[$link] += $pageRank[$node] / $outboundCount;
}
}
$total = 0;
// calculate the new PR using the damping factor
foreach ($linkGraph as $node => $outbound) {
$tempRank[$node] = ($dampingFactor / $nodeCount) + (1 - $dampingFactor) * $tempRank[$node];
$change += abs($pageRank[$node] - $tempRank[$node]);
$pageRank[$node] = $tempRank[$node];
$tempRank[$node] = 0;
$total += $pageRank[$node];
}
// Normalise the page ranks so it's all a proportion 0-1
foreach ($pageRank as $node => $score) {
$pageRank[$node] /= $total;
}
}
return $pageRank;
}
?>
Thank you again for all the knowledge you distribute,Good post. I was very interested in the article, it's quite inspiring I should admit. I like visiting you site since I always come across interesting articles like this one.Great Job, I greatly appreciate that.Do Keep sharing! Regards,
ReplyDeletecheck pagerank
You’ve got some interesting points in this article. I would have never considered any of these if I didn’t come across this. Thanks!.
ReplyDeletecheck pr