Compendium:

Data-mining Usenet



Now, no wisecracks about how data-mining the Internet's oldest public space would mean coming up with a mountain of X-rated JPEGs and make-money-fast spams.

Marc Smith, a research sociologist at Microsoft, has begun looking at ways of extracting trend data and other information from the network. His Netscan software sucks in the messages from 50,000 or so Usenet newsgroups and then analyzes them every which way, including the average number of posts, the size of the posts, cross-linked newsgroups and so on. (The software is mounted on his site, so you can play with the info yourself.)

The goal of the Netscan project is to collect baseline measures of the Usenet, its structure and dynamics, so as to map the kinds and qualities of the groups and institutions that form when people use the net to interact with one another. Netscan provides a range of measures of activity in the Usenet, including the number of messages in each of the groups studied and the number of people who participate in them. This can reveal some interesting patterns when the data is analyzed over a period of hours, days, weeks or longer. Other network media like email lists, chat rooms, and proprietary discussion systems could also be studied in this way.
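To make those measures concrete, here is a minimal Python sketch of how per-group message counts, participant counts, average post size and cross-posting links might be tallied. It is not Netscan's actual code. The record layout (`author`, `groups`, `size`), the sample data and the `summarize` function are all assumptions made up for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical, simplified message records. Real Usenet articles carry
# this information in their From and Newsgroups headers.
messages = [
    {"author": "alice@example.com", "groups": ["comp.lang.c", "comp.programming"], "size": 1200},
    {"author": "bob@example.com", "groups": ["comp.lang.c"], "size": 800},
    {"author": "alice@example.com", "groups": ["comp.lang.c"], "size": 400},
    {"author": "carol@example.com", "groups": ["rec.arts.books"], "size": 2000},
]


def summarize(messages):
    """Return (per-group stats, cross-post counts per pair of groups)."""
    posts = Counter()            # messages seen in each group
    total_size = Counter()       # summed message size per group
    posters = defaultdict(set)   # distinct authors per group
    crosslinks = Counter()       # messages shared by each pair of groups

    for msg in messages:
        groups = sorted(set(msg["groups"]))
        for group in groups:
            posts[group] += 1
            total_size[group] += msg["size"]
            posters[group].add(msg["author"])
        # A message posted to several groups links every pair of them.
        for pair in combinations(groups, 2):
            crosslinks[pair] += 1

    stats = {
        group: {
            "posts": posts[group],
            "posters": len(posters[group]),
            "avg_size": total_size[group] / posts[group],
        }
        for group in posts
    }
    return stats, crosslinks


stats, crosslinks = summarize(messages)
for group, group_stats in sorted(stats.items()):
    print(group, group_stats)
```

Running the same tally over messages bucketed by hour, day or week would give the kind of over-time view the project description mentions.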
Those of you a little wary of anything in which "Microsoft" and "personal data" are mentioned in the same sentence might not be thrilled by this statement from the project FAQ:
The ultimate goal is to shed light on the vast invisible continent of social cyberspace and to see the crowds that are gathered there.

Because while sociology is the study of groups, it doesn't take too much imagination to figure out how something like this could be used to track specific individuals - or at least, the online names of individuals. Used properly, something like this might be helpful in law enforcement, but even there, recent news would give one pause.

Via Anil Dash.
