Scraping Data from LinkedIn and Clubhouse

Once again, personal data of social network users has been scraped and offered on hacker forums. This time, LinkedIn and Clubhouse are affected. While the providers deny having been hacked, we spoke with an expert about the legal and economic classification of the data collection.

Scraping at LinkedIn and Clubhouse

After we recently reported on the scraping of user data at Facebook, the platforms LinkedIn and Clubhouse have now apparently also become the target of similar attacks. Details of this have been published by the portal Cybernews.

This information was siphoned off and offered

At Clubhouse, information on 1.3 million users was siphoned off and published free of charge on a “popular” hacker forum. In detail, the data sets included user ID, name, photo URL, username, Twitter handle, Instagram handle, number of followers, number of accounts followed, date of account creation, and profile name of the inviting user.

Meanwhile, 500 million users are affected at LinkedIn. The records were offered for sale on a similarly popular hacker forum; two million records were published as proof of concept.

Subsequently, another user extended the offer further, adding another 327 million LinkedIn profiles to the 500 million records already mentioned. Buyers of the data thus gain access to LinkedIn IDs, full names, email addresses, phone numbers, gender, and linked profiles inside and outside LinkedIn.

As Cybernews correctly notes, the sum of the offered profiles, 827 million, exceeds the number of current LinkedIn users, which the platform itself currently puts at over 740 million.

Scraping versus hacking

This raises further questions about how current the data is and where it comes from. In a blog post, LinkedIn denies that its own systems were breached. According to the company, the information now on offer was aggregated from various sources, apparently including publicly accessible data from LinkedIn itself. Incidentally, the provider prohibits such scraping in its terms of use.

Clubhouse argues similarly on Twitter: according to the company, only data that is already generally available via the API or app was published.

Technical security measures required

The security service provider BlackBerry is not satisfied with this argument. As Global Senior Vice President Adam Enterkin says, “Companies must remember that all personal data in their care is equally valuable. When it is collected, it must be protected. It’s imperative to ensure that appropriate security controls are implemented to protect the data from unauthorized access.”

Even after multiple inquiries, however, BlackBerry was unable to tell us how these security controls would be implemented in concrete terms. A statement from Facebook nevertheless shows that technical options exist: according to it, vulnerabilities in the “Contact Importer” have been fixed to make scraping more difficult.

Web application firewalls can also prevent the automated reading of web pages.
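
To illustrate what such a measure can look like in principle, here is a minimal Python sketch of per-client rate limiting, one technique commonly used to slow down automated page retrieval. It is an assumption for illustration only and does not describe what LinkedIn, Clubhouse, Facebook, or any particular firewall product actually does; the class name, the client identifier, and the limits are hypothetical.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical limiter: at most max_requests per client per time window."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Maps a client identifier (e.g. an IP address) to recent request times.
        self._hits = defaultdict(deque)

    def allow(self, client_id: str, now: float) -> bool:
        """Return True if the request at time `now` (in seconds) may proceed."""
        hits = self._hits[client_id]
        # Forget requests that have fallen out of the time window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            # Over the limit: a real system might block or show a challenge here.
            return False
        hits.append(now)
        return True


# Illustrative limit: three requests per minute and client.
limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60.0)
decisions = [limiter.allow("203.0.113.7", t) for t in (0.0, 1.0, 2.0, 3.0)]
print(decisions)
```

A limit like this only raises the cost of scraping. Requests spread across many addresses or accounts would stay below the threshold, which is why such controls are usually combined with other checks.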

Comprehensive podcast on scraping

But the scraping debate doesn’t end with technical precautions. There are also questions about the law, data sovereignty, and possible monopolies: Is data disclosed on social networks public property? How worthy of protection is such information? And can service providers effectively prohibit scraping via their terms of use?