Competitive analysis (online algorithm) explained

Competitive analysis is a method invented for analyzing online algorithms, in which the performance of an online algorithm (which must satisfy an unpredictable sequence of requests, completing each request without being able to see the future) is compared to the performance of an optimal offline algorithm that can view the sequence of requests in advance. An algorithm is competitive if its competitive ratio—the ratio between its performance and the offline algorithm's performance—is bounded. Unlike traditional worst-case analysis, where the performance of an algorithm is measured only for "hard" inputs, competitive analysis requires that an algorithm perform well both on hard and easy inputs, where "hard" and "easy" are defined by the performance of the optimal offline algorithm.

For many algorithms, performance is dependent not only on the size of the inputs, but also on their values. For example, sorting an array of elements varies in difficulty depending on the initial order. Such data-dependent algorithms are analysed for average-case and worst-case data. Competitive analysis is a way of doing worst case analysis for on-line and randomized algorithms, which are typically data dependent.

In competitive analysis, one imagines an "adversary" which deliberately chooses difficult data, to maximize the ratio of the cost of the algorithm being studied and some optimal algorithm. When considering a randomized algorithm, one must further distinguish between an oblivious adversary, which has no knowledge of the random choices made by the algorithm pitted against it, and an adaptive adversary which has full knowledge of the algorithm's internal state at any point during its execution. (For a deterministic algorithm, there is no difference; either adversary can simply compute what state that algorithm must have at any time in the future, and choose difficult data accordingly.)

For example, the quicksort algorithm chooses one element, called the "pivot", that is, on average, not too far from the center value of the data being sorted. Quicksort then separates the data into two piles, one of which contains all elements with value less than the value of the pivot, and the other containing the rest of the elements. If quicksort chooses the pivot in some deterministic fashion (for instance, always choosing the first element in the list), then it is easy for an adversary to arrange the data beforehand so that quicksort will perform in worst-case time. If, however, quicksort chooses some random element to be the pivot, then an adversary without knowledge of what random numbers are coming up cannot arrange the data to guarantee worst-case execution time for quicksort.

The classic on-line problem first analysed with competitive analysis is the list update problem: Given a list of items and a sequence of requests for the various items, minimize the cost of accessing the list where the elements closer to the front of the list cost less to access. (Typically, the cost of accessing an item is equal to its position in the list.) After an access, the list may be rearranged. Most rearrangements have a cost. The Move-To-Front algorithm simply moves the requested item to the front after the access, at no cost. The Transpose algorithm swaps the accessed item with the item immediately before it, also at no cost. Classical methods of analysis showed that Transpose is optimal in certain contexts. In practice, Move-To-Front performed much better. Competitive analysis was used to show that an adversary can make Transpose perform arbitrarily badly compared to an optimal algorithm, whereas Move-To-Front can never be made to incur more than twice the cost of an optimal algorithm.

In the case of online requests from a server, competitive algorithms are used to overcome uncertainties about the future. That is, the algorithm does not "know" the future, while the imaginary adversary (the "competitor") "knows". Similarly, competitive algorithms were developed for distributed systems, where the algorithm has to react to a request arriving at one location, without "knowing" what has just happened in a remote location. This setting was presented in .

See also

References