What is personally identifiable information (PII) harvesting?

Understanding personally identifiable information harvesting

Personally identifiable information (PII) harvesting is a type of attack in which criminals manipulate the forms on your web pages to collect the personally identifiable information that users submit, typically on a login or checkout page. PII may include Social Security numbers, usernames, passwords, PINs and addresses. After collection, this data is used by the criminal or resold on the dark web. The list of companies that have fallen prey to a PII harvesting attack (sometimes called formjacking) includes well-known brands such as British Airways. Digital skimming and Magecart attacks also harvest and steal PII, but they primarily target credit card data.

How does PII harvesting work?

To gain access to PII, attackers exploit security vulnerabilities in JavaScript and other third-party code components used to build websites and web applications. The vast majority of developers use such components to improve performance or add new capabilities faster. In fact, research has shown that third-party scripts account for 50-70% of the code on websites. Security flaws in this client-side code give attackers a way in, because it is relatively easy to inject malicious code into third-party components, particularly where a vulnerability in a widely used component is broadly known.

Why is PII harvesting difficult to detect?

Client-side attacks on front-end code can be hard to detect because the behavior changes on the pages are often small and selective. Malicious scripts are frequently designed to load dynamically and evade detection by external scanners. They may purposefully target a small percentage of users, load only in a real client-side environment or remove themselves from memory when they detect code analysis taking place. This makes it unlikely that malicious code will be running during any particular point-in-time scan.
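
To make the last point concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a simplified model that is not from any measured attack: the malicious script is served to a fixed fraction of page loads, and each scan load is sampled independently. The function name and the 1% figure are illustrative only. Real scripts that detect scanners outright would make the odds worse than this model suggests.

```python
def detection_probability(target_fraction: float, scan_loads: int) -> float:
    """Chance that at least one of `scan_loads` independent page loads
    is served the malicious script, under the simplified model above."""
    if not 0.0 <= target_fraction <= 1.0:
        raise ValueError("target_fraction must be between 0 and 1")
    if scan_loads < 0:
        raise ValueError("scan_loads must not be negative")
    # P(at least one hit) = 1 - P(every load misses)
    return 1.0 - (1.0 - target_fraction) ** scan_loads


if __name__ == "__main__":
    # Hypothetical attacker that targets 1% of sessions.
    for loads in (1, 10, 100):
        print(loads, detection_probability(0.01, loads))
```

Under these assumptions, a scan that loads the page ten times has less than a one-in-ten chance of ever seeing the script, which is why occasional scans are a weak signal compared with observing every session.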

Because client-side scripts load in users’ browsers, they often fall outside the purview of traditional security controls like web application firewalls (WAFs). More than 90% of website decision-makers do not have complete visibility into the third-party scripts on their website. Attackers can breach a site’s client-side code and steal users’ PII, and it could be months before anyone is aware of the breach.

Third-party scripts are constantly changing and could be compromised at any point between scans or when they load downstream. In many cases, your third-party code refers to other third-party code, creating a long supply chain of 4th-, 5th- and nth-party vendors (that is, your vendors’ vendors). A security vulnerability may occur in the nth spot in the chain, but if it leads to a PII harvesting attack on your site, you are liable for the resulting damage.
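
The chain described above can be pictured as a graph of "who loads whom". The following Python sketch walks such a graph breadth-first and reports how many hops each script host is from your own page. The host names and the `LOADS` mapping are hypothetical; in practice this data would have to come from observing real page loads.

```python
from collections import deque


def chain_hops(loads: dict[str, list[str]], root: str) -> dict[str, int]:
    """Return the number of hops from `root` to every host it pulls in,
    directly or indirectly. Hop 1 is a third party, hop 2 a 4th party, etc."""
    hops = {root: 0}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for child in loads.get(current, []):
            if child not in hops:  # keep the shortest path, ignore cycles
                hops[child] = hops[current] + 1
                queue.append(child)
    return hops


# Hypothetical example: which host loads scripts from which other hosts.
LOADS = {
    "shop.example": ["cdn.analytics.example", "chat.vendor.example"],
    "cdn.analytics.example": ["tags.adtech.example"],
    "tags.adtech.example": ["pixel.tracker.example"],
}

if __name__ == "__main__":
    for host, depth in chain_hops(LOADS, "shop.example").items():
        print(depth, host)
```

In this made-up chain, `pixel.tracker.example` sits three hops away from the site, yet it still runs in the same page as the checkout form.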

How can organizations prevent PII harvesting?

In order to protect web and mobile applications from client-side data theft, website owners must have visibility into the scripts running on the client side. This means continuously monitoring browser script behavior during every user session. By being able to detect and track suspicious scripts, as well as new scripts and changes in the behavior of existing ones, you can block attackers from accessing your users’ data.
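
As a minimal sketch of the "new scripts and changes" idea, the Python below compares the scripts observed in a session against a stored baseline. It is an assumption-laden simplification: it uses a SHA-256 hash of the script content as a stand-in for behavior, so it flags any content change (including legitimate vendor updates) and would miss a script whose behavior changes without its content changing. The function names and URLs are hypothetical.

```python
import hashlib


def fingerprint(script_body: bytes) -> str:
    """Content hash used here as a crude proxy for 'what the script is'."""
    return hashlib.sha256(script_body).hexdigest()


def diff_scripts(
    baseline: dict[str, str], observed: dict[str, bytes]
) -> dict[str, list[str]]:
    """Compare observed scripts (URL -> body) to a baseline (URL -> hash)."""
    report: dict[str, list[str]] = {"new": [], "changed": [], "missing": []}
    for url, body in observed.items():
        if url not in baseline:
            report["new"].append(url)        # never seen before
        elif baseline[url] != fingerprint(body):
            report["changed"].append(url)    # same URL, different content
    # Scripts that were in the baseline but did not load this time.
    report["missing"] = [url for url in baseline if url not in observed]
    return report


if __name__ == "__main__":
    baseline = {"https://cdn.vendor.example/widget.js": fingerprint(b"v1")}
    observed = {
        "https://cdn.vendor.example/widget.js": b"v1 + injected code",
        "https://unknown.example/collect.js": b"...",
    }
    print(diff_scripts(baseline, observed))
```

A report like this only says that something is new or different; deciding whether the change is malicious still requires looking at what the script actually does with form data.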

Here are some tips:

  • Vet your third-party code vendors, especially those that support your web forms and underlying software. Only use code from trusted vendors to ensure that all web forms and JavaScript come from known sources.
  • Inventory your forms and any unique software components that support them, and try to identify any nth-party developers.
  • Ask your vendors for regular software updates for your web forms, especially any available updates for JavaScript. Use sandbox testing to ensure that any updates don’t present new vulnerabilities before trusting and installing them on your site.
  • Test your site and forms for vulnerabilities with external code scanning tools and service providers. Penetration testers typically scan your software once or twice a year, on a schedule you determine, while bug bounty services can test your web forms around the clock.
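
As a starting point for the inventory tip above, the following Python sketch parses a page's HTML with the standard library and lists its forms alongside the scripts it references, split into same-host and other-host sources. The class name, host names and sample markup are hypothetical. Because it only reads static markup, it will not see scripts that are injected dynamically at runtime, which is exactly the gap that session-level monitoring is meant to cover.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class FormScriptInventory(HTMLParser):
    """Collect form actions and script sources from one HTML document."""

    def __init__(self, site_host: str):
        super().__init__()
        self.site_host = site_host
        self.forms: list[str] = []
        self.first_party: list[str] = []
        self.third_party: list[str] = []
        self.inline_scripts = 0

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "form":
            self.forms.append(attr.get("action") or "(same page)")
        elif tag == "script":
            src = attr.get("src")
            if not src:
                self.inline_scripts += 1
                return
            host = urlparse(src).netloc
            # Relative URLs have no host, so they count as first party.
            if host and host != self.site_host:
                self.third_party.append(src)
            else:
                self.first_party.append(src)


SAMPLE = """
<form action="/checkout" method="post"><input name="card"></form>
<script src="/js/app.js"></script>
<script src="https://cdn.vendor.example/widget.js"></script>
<script>console.log("inline")</script>
"""

if __name__ == "__main__":
    inventory = FormScriptInventory("shop.example")
    inventory.feed(SAMPLE)
    print("forms:", inventory.forms)
    print("first-party scripts:", inventory.first_party)
    print("third-party scripts:", inventory.third_party)
    print("inline scripts:", inventory.inline_scripts)
```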

These methods can help you detect malicious code, so you can quickly mitigate vulnerabilities with patches or implement a temporary workaround on your site. However, the best way to prevent PII harvesting is to enable automatic monitoring of all client-side JavaScript behavior on your website, so you can catch malicious code activity in action. Monitoring your website and network forms for anomalous behavior allows you to see client-side JavaScript at work.

How does HUMAN prevent PII harvesting?

HUMAN Client-Side Defense protects digital businesses from client-side attacks that harvest users’ sensitive data. The solution monitors third-party script activities in real time, during every user session, to detect suspicious activity and changes in script behavior. Unwanted scripts are stopped from accessing specific form fields or blocked entirely.

By leveraging real-time, behavior-based analysis and machine learning models, Client-Side Defense provides full visibility and control over first-, third- and nth-party scripts running on the client side. The solution detects and mitigates unauthorized PII access, data exfiltration events and known script vulnerabilities to prevent digital skimming.