SALSA algorithm explained

Stochastic Approach for Link-Structure Analysis (SALSA) is a web page ranking algorithm designed by R. Lempel and S. Moran to assign high scores to hub and authority web pages based on the quantity of hyperlinks among them.[1]

Origins

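SALSA is inspired by two other link-based ranking algorithms, HITS and PageRank. Like HITS, it assigns each web page two scores, a hub score and an authority score, and it operates on a focused, query-dependent subgraph. Like PageRank, it derives those scores from random walks on Markov chains built from the link structure.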

Properties

SALSA can be seen as an improvement of HITS.

SALSA is computationally lighter than HITS because its ranking is equivalent to a weighted in/out-degree ranking. Computational cost is a crucial factor because both HITS and SALSA are computed at query time and can therefore significantly affect the response time of a search engine. By contrast, query-independent algorithms such as PageRank can be computed offline.
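The equivalence with degree ranking can be illustrated with a minimal Python sketch. It assumes the usual formulation of the authority walk (from a page, step backward along a random in-link, then forward along a random out-link) started from a uniform distribution over pages that have at least one in-link. Under those assumptions a page's authority score is its share of the in-links in its connected component, weighted by the component's share of all authorities. The function names and the toy graph are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def salsa_authority_closed_form(edges):
    """Authority scores from the weighted in-degree formula (no iteration)."""
    edges = set(edges)                      # ignore duplicate links
    in_deg = defaultdict(int)
    parent = {}                             # union-find over ("h", page) / ("a", page)

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for u, v in edges:
        in_deg[v] += 1
        parent[find(("h", u))] = find(("a", v))   # a link joins hub side and authority side

    members_of = defaultdict(list)          # authorities grouped by component
    for v in in_deg:
        members_of[find(("a", v))].append(v)

    scores = {}
    for members in members_of.values():
        comp_edges = sum(in_deg[v] for v in members)
        comp_weight = len(members) / len(in_deg)
        for v in members:
            scores[v] = comp_weight * in_deg[v] / comp_edges
    return scores

def salsa_authority_random_walk(edges, steps=200):
    """Same scores obtained by iterating the two-step authority walk."""
    edges = set(edges)
    out_links, in_links = defaultdict(list), defaultdict(list)
    for u, v in edges:
        out_links[u].append(v)
        in_links[v].append(u)
    scores = {v: 1.0 / len(in_links) for v in in_links}   # uniform start
    for _ in range(steps):
        nxt = dict.fromkeys(in_links, 0.0)
        for v, mass in scores.items():
            for u in in_links[v]:                          # step back to a hub
                share = mass / len(in_links[v]) / len(out_links[u])
                for w in out_links[u]:                     # step forward to an authority
                    nxt[w] += share
        scores = nxt
    return scores

# Toy graph with two components: {h1, h2, a1, a2} and {h3, a3}.
edges = [("h1", "a1"), ("h1", "a2"), ("h2", "a2"), ("h3", "a3")]
closed = salsa_authority_closed_form(edges)   # a1 = 2/9, a2 = 4/9, a3 = 1/3
walked = salsa_authority_random_walk(edges)   # converges to the same values
```

Hub scores follow the same pattern with out-degrees in place of in-degrees. The closed form needs only one pass over the links plus a component search, which is why no eigenvector iteration is required at query time.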

SALSA is less vulnerable to the Tightly Knit Community (TKC) effect than HITS. A TKC is a topological structure within the Web that consists of a small set of highly interconnected pages. The presence of TKCs in a focused subgraph is known to negatively affect the detection of meaningful authorities by HITS.
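The effect can be reproduced on a small hypothetical graph. The sketch below is an illustration under stated assumptions, not a benchmark: three hubs that all link to the same three pages form the tightly knit community, while a fourth page `b` receives four links from otherwise unrelated hubs. HITS is run as a plain power iteration, and because the graph is connected, SALSA's authority score is taken to be each page's share of the in-links. All page names and both helper functions are invented for the example.

```python
from collections import defaultdict

def hits_authority(edges, iterations=100):
    """HITS authority scores by power iteration, scaled to sum to 1."""
    hubs = {u for u, _ in edges}
    auths = {v for _, v in edges}
    hub = dict.fromkeys(hubs, 1.0)
    auth = dict.fromkeys(auths, 1.0)
    for _ in range(iterations):
        auth = dict.fromkeys(auths, 0.0)
        for u, v in edges:                 # authority = sum of in-linking hub scores
            auth[v] += hub[u]
        hub = dict.fromkeys(hubs, 0.0)
        for u, v in edges:                 # hub = sum of linked authority scores
            hub[u] += auth[v]
        total = sum(hub.values())
        hub = {u: s / total for u, s in hub.items()}
    total = sum(auth.values())
    return {v: s / total for v, s in auth.items()}

def in_degree_share(edges):
    """SALSA authority scores for a connected graph: in-degree / number of links."""
    deg = defaultdict(int)
    for _, v in edges:
        deg[v] += 1
    return {v: d / len(edges) for v, d in deg.items()}

# Tightly knit community: hubs t1..t3 each link to a1, a2 and a3.
tkc = [(f"t{i}", f"a{j}") for i in (1, 2, 3) for j in (1, 2, 3)]
# Page b has four in-links; hub s4 also links to a1, which keeps the graph connected.
other = [("s1", "b"), ("s2", "b"), ("s3", "b"), ("s4", "b"), ("s4", "a1")]
edges = tkc + other

hits = hits_authority(edges)
salsa = in_degree_share(edges)
# HITS ranks b below a2 and a3 even though b has more in-links than either;
# SALSA ranks b level with a1 and above a2 and a3.
```

In this toy graph the mutual reinforcement inside the community pulls almost all of the HITS authority weight onto `a1`, `a2` and `a3`, whereas the degree-based SALSA scores are unaffected by how densely the community's hubs overlap.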

Twitter uses a SALSA-style algorithm to suggest accounts to follow.[2]

References

  1. Ziyang Wang. Improved Link-Based Algorithms for Ranking Web Pages. cs.nyu.edu. New York University, Department of Computer Science. 7 August 2023.
  2. Pankaj Gupta, Ashish Goel, Jimmy Lin, Aneesh Sharma, Dong Wang, and Reza Bosagh Zadeh. WTF: The Who-to-Follow System at Twitter. Proceedings of the 22nd International Conference on World Wide Web.