Data loss detection tool mines the ephemeral world of 'pastes'

It’s not easy to figure out if your data has been collected by hackers, but an online tool has been expanded to hunt through one of the most prolific sources of leaked data, known as “pastes.”

The most well-known paste website, Pastebin, has long been used as a public yet relatively anonymous way for attention seekers to showcase data they’ve collected through intrusions.

Similar sites, including Pastie and Slexy, publish pastes that contain more than 19,000 email addresses per day. Often passwords are also published with those email addresses, putting people at risk.

Sydney-based software architect Troy Hunt last year launched “Have I Been Pwned” (HIBP), a website where people could see if their usernames and email addresses had turned up in caches of data from massive breaches that affected companies such as Adobe Systems, Stratfor and Sony.

He’s now added a capability to quickly input email addresses that pop up in pastes into his database, giving his subscribers quick alerts just minutes after their data is published.

“A lot of the big breaches we’ve seen have had partial dumps on Pastebin,” Hunt said in a phone interview Tuesday.

The latest feature in Hunt’s service uses a Twitter feed “@dumpmon,” short for Dump Monitor, which is a project by Jordan Wright. Dump Monitor is a bot that monitors posts on Pastebin and other sites for email addresses, hashes and APIs that may indicate a data breach.

Dump Monitor then regularly tweets links to those pastes. Hunt uses Dump Monitor’s feed, collecting the email addresses from the pastes and storing them in HIBP’s breach database. The process of collecting the email addresses and putting them into the database takes a little over 30 seconds.
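The extraction step can be illustrated with a short Python sketch. This is not Hunt's code: the regular expression, the function name extract_emails and the sample paste are assumptions made for illustration, and the pattern is deliberately simpler than real-world address matching.

```python
import re

# Assumed, deliberately simple pattern; real address matching is messier.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(paste_text):
    """Return unique addresses found in paste_text, lowercased,
    in order of first appearance."""
    seen = set()
    result = []
    for match in EMAIL_RE.findall(paste_text):
        addr = match.lower()
        if addr not in seen:
            seen.add(addr)
            result.append(addr)
    return result


if __name__ == "__main__":
    # Made-up paste in the common "address:password" layout.
    sample = (
        "alice@example.com:hunter2\n"
        "Bob@Example.org:letmein\n"
        "alice@example.com:other\n"
    )
    print(extract_emails(sample))
```

Only the addresses come out of the function; the passwords that sit next to them in the sample are never captured.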

Users can sign up for alerts that are sent when their email address shows up in a paste. It means that within a few minutes, someone could know when their data appears on the Web, allowing them to take quick action, such as changing their password.

Pastes are notoriously ephemeral. Pastebin forbids the publishing of personal data in its terms and conditions, and pastes may be removed. HIBP pulls the email addresses from the pastes, but does not store the actual paste, according to a writeup on Hunt’s blog.

The reason, Hunt said, is that many of the pastes contain sensitive information such as passwords, which he doesn’t want to store for obvious reasons. That decision may make it harder for people to take action, but mitigates the risk that comes with storing large batches of live credentials, he said.
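That design choice, keeping the addresses while discarding the paste itself, can be sketched in Python as follows. The class PasteIndex, its methods and the example URL are hypothetical, and the idea of keeping a link back to each paste is an assumption; the article only states that the paste content is not stored.

```python
from collections import defaultdict


class PasteIndex:
    """Maps an email address to the pastes it appeared in.

    Only addresses and paste URLs are kept; paste bodies, and any
    passwords they contain, are never passed in or stored.
    """

    def __init__(self):
        self._by_email = defaultdict(set)

    def record(self, paste_url, emails):
        # Normalise case so lookups are case-insensitive.
        for email in emails:
            self._by_email[email.lower()].add(paste_url)

    def lookup(self, email):
        # .get() avoids creating empty entries for unknown addresses.
        return sorted(self._by_email.get(email.lower(), ()))


if __name__ == "__main__":
    index = PasteIndex()
    # Hypothetical paste URL and addresses.
    index.record("https://paste.example/abc123", ["Alice@example.com"])
    print(index.lookup("alice@example.com"))
    print(index.lookup("nobody@example.com"))
```

The trade-off is the one Hunt describes: if the paste is later removed, a record like this can say that an address was exposed but not what was exposed alongside it.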

Best of all, Hunt’s service is free, unlike those offered by several companies that specialize in monitoring underground forums for stolen data. HIBP may not have the comprehensive scope of those services, but it means consumers no longer have to rely on the companies they’ve dealt with to notify them of a breach.
