Web browsing history explained

Web browsing history refers to the list of web pages a user has visited, as well as associated metadata such as page title and time of visit. It is usually stored locally by web browsers[1] [2] in order to provide the user with a history list to go back to previously visited pages. It can reflect the user's interests, needs, and browsing habits.[3]

All major browsers have a private browsing mode in which browsing history is not recorded. This is to protect against browsing history being collected by third parties for targeted advertising or other purposes.

Applications

Local history

Locally stored browsing history can help users rediscover previously visited web pages that they remember only vaguely, or pages that are hard to find because they are located in the deep web. Browsers also use it to power autocompletion in the address bar, allowing quicker and more convenient navigation to frequently visited pages.[4]
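
As an illustration of history-based autocompletion, the following is a minimal sketch, not the algorithm of any particular browser. The `HistoryEntry` type and the `suggest` function are hypothetical, and ranking by visit count and then by recency is an assumption; real browsers use more elaborate scoring.

```python
from dataclasses import dataclass


@dataclass
class HistoryEntry:
    url: str
    title: str
    visit_count: int
    last_visit: float  # Unix timestamp in seconds


def suggest(history, typed, limit=3):
    """Return up to `limit` URLs whose URL or title contains the typed text.

    Assumed ranking: most-visited first, ties broken by most recent visit.
    """
    needle = typed.lower()
    matches = [
        entry for entry in history
        if needle in entry.url.lower() or needle in entry.title.lower()
    ]
    matches.sort(key=lambda e: (e.visit_count, e.last_visit), reverse=True)
    return [entry.url for entry in matches[:limit]]


# Made-up history entries for illustration only.
history = [
    HistoryEntry("https://example.org/news", "Example News", 42, 1_700_000_000),
    HistoryEntry("https://example.org/newsletter", "Newsletter signup", 3, 1_700_500_000),
    HistoryEntry("https://example.com/shoes", "Shoe shop", 7, 1_699_000_000),
]

print(suggest(history, "news"))
```

Typing "news" matches the first two entries, and the more frequently visited page is listed first.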

The retention period of browsing history varies between browsers. Mozilla Firefox (desktop version) records history indefinitely by default in a file named places.sqlite, but automatically erases the oldest entries when disk space is exhausted. Google Chrome (desktop version) stores history for ten weeks by default and automatically prunes older entries. Chrome once kept an indefinite history file named Archived History, but this file was removed, and existing copies automatically deleted, in version 37, released in September 2014.[5] [6]
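
Because places.sqlite is an SQLite database, history stored this way can be read with ordinary SQL. The sketch below builds a small in-memory stand-in rather than opening a real profile. The `moz_places` table and the four columns used here are a simplified subset assumed for illustration (a real file contains more tables and columns), the timestamp is assumed to be in microseconds since the Unix epoch, and `recent_history` is a hypothetical helper.

```python
import sqlite3
from datetime import datetime, timezone


def recent_history(conn, limit=10):
    """Return (url, title, visit_count, last_visit) tuples, newest first."""
    rows = conn.execute(
        "SELECT url, title, visit_count, last_visit_date "
        "FROM moz_places "
        "WHERE last_visit_date IS NOT NULL "
        "ORDER BY last_visit_date DESC LIMIT ?",
        (limit,),
    ).fetchall()
    # Assumption: timestamps are stored as microseconds since the Unix epoch.
    return [
        (url, title, count,
         datetime.fromtimestamp(ts / 1_000_000, tz=timezone.utc))
        for url, title, count, ts in rows
    ]


# In-memory stand-in with a simplified, assumed schema and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE moz_places ("
    "id INTEGER PRIMARY KEY, url TEXT, title TEXT, "
    "visit_count INTEGER, last_visit_date INTEGER)"
)
conn.executemany(
    "INSERT INTO moz_places (url, title, visit_count, last_visit_date) "
    "VALUES (?, ?, ?, ?)",
    [
        ("https://example.org/", "Example", 5, 1_600_000_000_000_000),
        ("https://example.com/shoes", "Shoe shop", 2, 1_600_000_500_000_000),
        ("https://example.net/never", "Never visited", 0, None),
    ],
)

for url, title, count, when in recent_history(conn):
    print(when.isoformat(), count, url)
```

To try this against real data, it is safer to query a copy of the file, since a running browser may hold a lock on the original.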

Browser extensions such as History Trends Unlimited for Google Chrome (desktop version) allow the indefinite local storage of browsing history, exporting into a portable file, and self-analysis of browsing habits and statistics.[7]

Browsing history is not recorded when using the private browsing mode provided by most browsers.

Targeted advertising

See main article: Targeted advertising. Targeted advertising means presenting users with advertisements that are made more relevant to them on the basis of their browsing history.[8] A typical example is a user who searches for shoes on shopping websites and then receives advertisements for shoes while browsing other websites. One study found that targeted advertising doubles the conversion rate of classical online advertising.[9]

Real-time bidding (RTB) is the method behind targeted advertising. It is an automated system in which advertisers bid against one another for the chance to present advertisements on particular websites. Advertisers decide how much they are willing to pay based on the target audience of each website, so more information about users can encourage advertisers to pay higher prices. User information, such as browsing history, is provided to all firms involved in the bidding.[10] Because the process runs in real time, information is usually collected without the consent of the user and transferred in unencrypted form.[11] Users have very limited knowledge of how their information is collected, stored, and used.[12] [13]
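
The idea that more user information leads to higher bids can be sketched as a toy auction. This is an illustrative model only: the `run_auction` function, the advertiser names, the bid formula (a base bid plus a bonus per matching interest), and the highest-bid-wins rule are all assumptions, and real ad exchanges follow their own protocols and pricing rules.

```python
def run_auction(user_interests, advertisers):
    """Return (winner, winning_bid) for one ad impression.

    `advertisers` maps a name to (target_interests, base_bid, bonus),
    with all amounts in integer cents. Each advertiser is assumed to add
    `bonus` cents for every interest the user shares with its targets.
    """
    bids = {}
    for name, (targets, base_bid, bonus) in advertisers.items():
        overlap = len(user_interests & targets)
        bids[name] = base_bid + bonus * overlap
    winner = max(bids, key=bids.get)
    return winner, bids[winner]


# Hypothetical advertisers and targeting data.
advertisers = {
    "shoe_store": ({"shoes", "running"}, 10, 25),
    "bank": ({"loans"}, 30, 20),
}

# Interests inferred from browsing history versus an unknown user.
print(run_auction({"shoes", "running", "news"}, advertisers))
print(run_auction(set(), advertisers))
```

In this toy model the shoe store outbids the bank only when the user's interests are known, which mirrors the incentive described above.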

Users' responses to targeted advertising depend on whether they know that their information is being collected. If users know ahead of time that information is being collected, targeted advertisements can have a positive effect, increasing their intention to click through the link. If users are not informed about the collection, however, they become more concerned about privacy, which decreases their intention to click through. In addition, users who consider a website reliable are more likely to click through the link and accept the personalization service.[14]

To resolve the conflict between privacy and profit, one newly proposed system is pay-per-tracking, in which a broker sits between users and advertisers. Users decide whether to provide their personal information to the broker, and the broker passes the information that users have offered on to advertisers. In return, users can receive monetary rewards for sharing their personal information. This could help protect privacy while preserving tracking efficiency, but would incur extra cost.[15]

Personalized pricing

See main article: Personalized pricing. Personalized pricing is based on the idea that if a user purchases a certain product frequently or pays a higher price for that product, the user could be charged a higher price for this product. Web browsing history could give reliable predictions on the purchasing behaviors of users. When using personalized pricing, the profit of firms could increase by 12.99% compared to status quo cases.[16]

Research

Web browsing history can be used to facilitate research, for example into people's browsing behavior. When a user browses extensively on one site, the probability of requesting an additional page increases. When a user visits more sites, the likelihood of requesting additional pages decreases.[17]

Web browsing history could also be used to create personal web libraries. A personal web library is created by collecting and analyzing the web browsing history of the user. It could help the user to notice browsing trends, time distribution, and the most frequently used websites. Some users regard this function as helpful.
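
The kind of summary such a library might present can be sketched as follows. This is a minimal illustration rather than the method of the cited work: the `summarize` function is hypothetical, visits are assumed to be (URL, Unix timestamp) pairs, and hours are computed in UTC for simplicity.

```python
from collections import Counter
from datetime import datetime, timezone
from urllib.parse import urlparse


def summarize(visits):
    """Return (visits per site, visits per hour of day) for a visit list."""
    sites = Counter(urlparse(url).netloc for url, _ in visits)
    hours = Counter(
        datetime.fromtimestamp(ts, tz=timezone.utc).hour for _, ts in visits
    )
    # Sites ordered from most to least visited; hours in chronological order.
    return sites.most_common(), sorted(hours.items())


# Made-up visits: two on example.org, one on example.com an hour later.
visits = [
    ("https://example.org/a", 1_600_000_000),
    ("https://example.org/b", 1_600_000_600),
    ("https://example.com/", 1_600_003_600),
]

top_sites, by_hour = summarize(visits)
print(top_sites)
print(by_hour)
```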

Privacy

Concerns

Web browsing history stored locally is not published anywhere by default. However, visits to almost all websites are tracked by adware and potentially unwanted programs (PUPs), which collect users' information without their consent.[18] These tracking methods are usually allowed by platforms by default. Web browsing history is also collected through cookies, which fall into two kinds: first-party cookies and third-party cookies. Third-party cookies are usually embedded in first-party websites and collect information from them.[19] They are more efficient and better at aggregating data than first-party cookies: while a first-party cookie only has access to users' data on one website, third-party cookies can combine data collected from different websites to build a more complete picture of the user. Several third-party cookies can also exist on the same website.

With enough information available, users could be identified without logging into their accounts.[20]

When third-party cookies collect the web browsing history of users from multiple websites, the additional information raises additional privacy concerns. For example, suppose a user reads news on one website and searches for medical information on another. When the browsing history from these two websites is combined, the user may be inferred to be interested in news related to medical topics. The more websites the combined history covers, the more complete the resulting picture of the person.
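
The aggregation step in this example can be sketched in a few lines. The log format, the cookie identifiers, the site names, and the `build_profiles` function are all hypothetical; the point is only that records from unrelated sites can be joined once they share the same third-party cookie identifier.

```python
from collections import defaultdict


def build_profiles(tracker_log):
    """Group (cookie_id, site, topic) records into one profile per cookie."""
    profiles = defaultdict(lambda: {"sites": set(), "topics": set()})
    for cookie_id, site, topic in tracker_log:
        profiles[cookie_id]["sites"].add(site)
        profiles[cookie_id]["topics"].add(topic)
    return dict(profiles)


# Records a single third-party tracker might see across two first-party sites.
tracker_log = [
    ("abc123", "news.example", "news"),
    ("abc123", "health.example", "medical"),
    ("xyz789", "news.example", "news"),
]

profiles = build_profiles(tracker_log)
print(sorted(profiles["abc123"]["topics"]))
```

A first-party cookie on news.example alone would see only the "news" topic for this user; the combined profile also reveals the medical interest.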

Scandals

In 2006, AOL released a large amount of data about its users, including search history. Although users were identified only by numbers rather than by names, they could be identified from the search history released.[21] For example, user No. 4417749 was identified from her search history over three months.[22]

In 2020, Avast, the maker of a popular antivirus program, was accused of selling browsing history to third parties, and officials of the Czech Republic opened a preliminary investigation into the accusation. According to the report, Avast sold users' data through Jumpshot, a marketing analytics tool. Avast claimed that users' personal information was not included in the data. However, browsing history can be used to identify users. Avast shut down Jumpshot in response to the issue.[23]

Protection

When users perceive a risk to their privacy, their intention to disclose personal information decreases, but their actual behavior is not affected.[24] However, some studies find no significant difference between the intention to disclose private information and actual disclosure, meaning that users who are concerned about privacy do share less personal information and take more protective measures.[25] Users with privacy concerns make less use of online services. They also take more protective measures, such as refusing to provide their information, providing false information, removing their information from the internet, and complaining to people around them or to relevant organizations.[26]

However, it is hard for users to protect their privacy, for several reasons. First, users lack privacy awareness: they are not concerned about being tracked unless it has a substantial impact on them, and they are unaware that their data has commercial value. Users generally find it difficult to notice privacy policy links on websites, with female users and older users being more likely to overlook these notices. Even when users notice privacy links, their information disclosure may not be affected.[27] In addition, users often lack the technical knowledge to protect themselves even when they notice a privacy leak. They are left in a passive position with little room to change the situation.

To keep their web browsing history from being collected, most users rely on ad blockers, delete cookies, and avoid websites that collect personal information.[28] However, most ad blockers offer users little guidance on improving their privacy awareness. More importantly, they rely on standard blacklists and whitelists,[29] which often do not include the websites that are tracking users. Ad blockers can only be effective if these tracking domains are blocked.[30]
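
The dependence on list coverage can be illustrated with a minimal domain blocklist check. This is a sketch under stated assumptions, not how any specific ad blocker works: the `is_blocked` function and the example domains are hypothetical, and matching is done on the hostname and each of its parent domains.

```python
from urllib.parse import urlparse


def is_blocked(url, blocklist):
    """Return True if the URL's host, or any parent domain, is on the list."""
    host = urlparse(url).hostname or ""
    parts = host.split(".")
    # Check "cdn.tracker.example", then "tracker.example", then "example".
    return any(".".join(parts[i:]) in blocklist for i in range(len(parts)))


# Hypothetical list containing a single known tracking domain.
blocklist = {"tracker.example"}

print(is_blocked("https://cdn.tracker.example/pixel.gif", blocklist))
print(is_blocked("https://unlisted-tracker.example/pixel.gif", blocklist))
```

The second request is not blocked because its domain is missing from the list, even though it serves the same kind of tracking resource.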

Several open-source projects try to protect users' privacy by collecting their browsing history on the hard drive instead of in the browser.[31] This addresses issues such as users being unable to view their browsing history once they have deleted it in the browser.

Notes and References

  1. "Wiederherstellen wichtiger Daten aus einem alten Profil: Hilfe zu Firefox". support.mozilla.org (in German).
  2. "Google Chrome History Location: Chrome History Viewer". www.foxtonforensics.com.
  3. Du, Weidman, Zhenyu Cheryl Qian, Paul Parsons, Yingjie Victor Chen. 2018. “Personal Web Library: organizing and visualizing Web browsing history”. International Journal of Web Information Systems 14(2): 212-232.
  4. "Autocompletion in Chrome's Omnibox is getting smarter". MSPoweruser. 24 August 2020.
  5. Benson, Ryan. "Archived History files removed from Chrome v37". Obsidian Forensics. Archived from the original on 2014-10-10: https://web.archive.org/web/20141010125418/http://www.obsidianforensics.com/blog/archived-history-files-removed-from-chrome-v37/.
  6. "[chrome] Revision 275159". src.chromium.org.
  7. "3 Simple Yet Useful Extensions to Enhance Chrome's History". Make Tech Easier. 7 October 2018.
  8. Hennig, Nicole. 2018. “Privacy and security online: best practices for cybersecurity”. Library Technology Reports 54(3): 1-37.
  9. Beales, Howard. 2010. “The Value of Behavioral Targeting”. Network Advertising Initiative.
  10. Aguirre, Elizabeth, Dominik Mahr, Dhruv Grewal, Ko de Ruyter, Martin Wetzels. 2015. “Unraveling the Personalization Paradox: The Effect of Information Collection and Trust-Building Strategies on Online Advertisement Effectiveness”. Journal of Retailing 91(1): 34-49.
  11. Estrada-Jimenez, Jose, Javier Parra-Arnau, Ana Rodriguez-Hoyos, Jordi Forne. 2017. “Online advertising: Analysis of privacy threats and protection approaches”. Computer Communications 100(1): 32-51.
  12. Evans, David S. 2009. "The Online Advertising Industry: Economics, Evolution, and Privacy". Journal of Economic Perspectives 23 (3): 37-60.
  13. Estrada-Jimenez, Jose, Javier Parra-Arnau, Ana Rodríguez-Hoyos, Jordi Forne. 2019. “On the regulation of personal data distribution in online advertising platforms”. Engineering Applications of Artificial Intelligence 82(1): 13-29.
  14. Chellappa, Ramnath K., Raymond G. Sin. 2005. “Personalization versus Privacy: An Empirical Examination of the Online Consumer’s Dilemma”. Information Technology and Management 6(1): 181-202.
  15. Parra-Arnau, Javier. 2017. “Pay-per-tracking: A collaborative masking model for web browsing”. Information Sciences 385-386(1): 96-124.
  16. Shiller, Benjamin Reed. 2020. “Approximating purchase propensities and reservation prices from broad consumer tracking”. International Economic Review 61(2): 847-870.
  17. Bucklin, Randolph E., Catarina Sismeiro. 2003. “A Model of Web Site Browsing Behavior Estimated on Clickstream Data”. Journal of Marketing Research 40(3): 249-267.
  18. Urban, Tobias, Dennis Tatang, Thorsten Holz, Norbert Pohlmann. 2019. “Analyzing leakage of personal information by malware”. Journal of Computer Security 27(4): 459-481.
  19. Binns, Reuben, and Elettra Bietti. 2020. “Dissolving Privacy, One Merger at a Time: Competition, Data, and Third Party Tracking”. Computer Law & Security Review: The International Journal of Technology Law and Practice 16(1): 1-19.
  20. Puglisi, Silvia, David Rebollo-Monedero, Jordi Forne. 2017. “On-web user tracking of browsing patterns for personalized advertising”. International Journal of Parallel, Emergent & Distributed Systems 32(5): 502-521.
  21. Kawamoto, Dawn. Aug 9, 2006. “AOL apologizes for release of user search data”. CNET. Retrieved Nov 27, 2020.
  22. Barbaro, Michael, Tom Zeller Jr. Aug 9, 2006. “A Face Is Exposed for AOL Searcher No. 4417749”. The New York Times. Retrieved Nov 27, 2020.
  23. Morris, Chris. Feb 13, 2020. “Popular antivirus software Avast under investigation for selling user browsing histories”. Fortune. Retrieved Nov 27, 2020.
  24. Norberg, Patricia A., Daniel R. Horne, David A. Horne. 2007. “The Privacy Paradox: Personal Information Disclosure Intentions versus Behaviors”. The Journal of Consumer Affairs 41(1): 100-126.
  25. Baruh, Lemi, Ekin Secinti, Zeynep Cemalcilar. 2017. “Online Privacy Concerns and Privacy Management: A Meta-Analytical Review”. Journal of Communication 67(1): 26-53.
  26. Son, Jai-Yeol, Sung S. Kim. 2008. “Internet Users' Information Privacy-Protective Responses: A Taxonomy and a Nomological Model”. MIS Quarterly 32(3): 503-529.
  27. Rodríguez-Priego, Nuria, Rene van Bavel, Shara Monteleone. 2016. “The disconnection between privacy notices and information disclosure: an online experiment”. Economia Politica: Journal of Analytical and Institutional Economics 33(3): 433-461.
  28. Wills, Craig H., Mihajlo Zeljkovic. 2011. “A personalized approach to web privacy: awareness, attitudes and actions”. Information Management & Computer Security 19(1): 53-73.
  29. Malandrino, Delfina, Vittorio Scarano. 2013. “Privacy leakage on the Web: Diffusion and countermeasures”. Computer Networks 57(14): 2833-2855.
  30. Bashir, Muhammad Ahmad, Christo Wilson. 2018. “Diffusion of User Tracking Data in the Online Advertising Ecosystem”. Proceedings on Privacy Enhancing Technologies 2018(4): 85-103.
  31. "Visited: Securely collect browsing history over browsers". github.com. 12 May 2022.