Blog spam, content theft, and other fun things

When I converted my site to a “for real” blog using WordPress about a year ago, after nearly a decade of “pretend” blogging, it was like I’d moved into a nice, upscale neighborhood. The neighbors were friendly and reasonably intelligent, and everyone’s yard was tidy. The rare comments to my site were thoughtful, and were obviously written by real people who were actual visitors.

Now the new, evil side of the blogosphere has moved in. My site is receiving a hundred or so spam comments a week. They are just like spam emails- only sometimes even more stupid. It really started in earnest in the January time-frame, and thank goodness a new version of WordPress came along at about that time with an anti-spam feature. Akismet is the name of that feature, and it’s caught 1,250 spam comments since I activated it about three months ago.

About a week ago, something new (for me, at least) happened. I noticed several new sites linking to mine. When I went to check them out, I discovered they were meaningless template sites with no original content of their own. They had large excerpts of my postings with author taglines implying the site “owner” had written the content. The one redeeming feature was that each such posting had a link to my site.

Those sites have since disappeared (at least I can’t find them any more- the incoming links seem to have gone away)- maybe my content wasn’t working out for them 🙂 But it does make me a bit uncomfortable. I mean, I don’t write my posts to line someone else’s pocket…heck, I don’t line my pocket with my posts. My words are my words- warts and all- and I’d prefer them only to exist on my site. Link to me- sure! Excerpt a sentence or two? Fantastic! But add your own comments and make the blog “universe” grow, not shrink.

None of what I’m experiencing with my blog is in any way “new”. Other bloggers who’ve been officially blogging for a while longer have been experiencing blog spam and content theft for a long time- it’s just new to me because I’ve only become a target recently. And it doesn’t really surprise me at all- it’s part of the wild west Internet. But it is a bit sad…

Update: After writing this post, I did some googling and came across Jonathan Bailey’s site- Plagiarism Today. It has a ton of information, tips, and reports of real-world experiences with content theft and “site scraping”. Shortly after I found this site, Jonathan himself posted a comment here- he must keep an eagle-eye on the Technorati keyword lists or something 🙂

6 thoughts on “Blog spam, content theft, and other fun things”

  1. I’m sorry to hear that happened to you. I’d venture a guess that they were blogspot blogs that were in turn shut down by Google. Google has been doing a decent job clamping down on these spam blogs, or splogs, and that has caused many of the spammers to just take up residence elsewhere. MSN Spaces is popular at the moment as well.

    Anyway, I’m glad that the matter was resolved. If I can help in any way in the future, don’t hesitate to write me using the contact form on my site.

  2. Greetings, Jonathan! Apologies for the delay in your comment appearing- all comments here are moderated, and I keep meaning to put an indicator to that effect in the comment submission form.

    Funny that you comment here- I was just at your site, PlagiarismToday, an hour or so ago. I was doing my usual Google-powered “research” (which quite often I do *after* I write my article- bad me!) and found your site to be helpful in framing the problem of content theft. I knew about the problem of blog content theft long ago, but I had never expected to be a “victim”. It seemed to me to be more something that would impact high-traffic bloggers, the ones like Robert Scoble. I guess I was surprised when someone targeted my site, and it’s proof, I think, that the folks doing this are increasing their scope.

    I appreciate your response!

  3. Kelly,

    A lot of people think that, if they aren’t A-list bloggers, they are shielded from the content theft problem. Oddly enough, the opposite is true.

    Plagiarists target mid-range bloggers: sites with decent followings and good content that aren’t household names. A-List bloggers are too well known to be good targets (they will be called on it immediately) and Z-listers just don’t have the content that they’re looking for.

    Do A-listers get hit? Every day. But those in between are both more numerous and more appealing targets.

    Some will always target Scoble and others like him, but the majority of theft will hit people like you and me…

    PS: No worries about the delay, didn’t even notice it really. Many of my comments get moderated as well, though Akismet makes that choice for me.

  4. It’s an interesting social commentary on the devaluation of thought. At one time, if someone plagiarized, it was because they found value in your words or ideas; value they wished to claim and capitalize on as their own.

    Now, computers automatically plagiarize sites without any value judgement on the content or words, and use the plagiarized “intellectual property” simply as filler.

    It’s almost enough to make one appreciate good ole face to face conversation ( well it would be if I wasn’t a non social grouch that didn’t like people much 😉 )

  5. It’s an interesting social commentary on the devaluation of thought. At one time, if someone plagiarized, it was because they found value in your words or ideas; value they wished to claim and capitalize on as their own.

    I still think the content thieves want content that has value: they want to attract people to their site, and to do that they need some sort of “interesting” content. Pure “filler” is recognized as such by the folks who browse the web- if it’s true crap, visitors won’t be back. The thieves need content that’s good enough to attract visitors, but not immediately identifiable as belonging to someone “important”.

    You are right, though, that there isn’t much (if any) human judgement of value. The thieves primarily look at site popularity and, as Jonathan said, they try to hit folks in the “middle class” popularity range. Once they target you, the process from then on is largely automated.

  6. That’s just it, they aren’t stealing your ideas, they are stealing your traffic. That is an entirely different thing than someone copying, say, your novel, putting their name on it, and trying to pass themselves off as the real author.

    In the case of traditional plagiarism, there is usually a little of the “I wish I’d thought of that” factor. I doubt the spammers that stole bits of your site ever said “gee, this is good stuff, I wish I wrote like a middle aged geek” 😉 I doubt they looked at the content at all. They probably checked for certain keywords, indexed against traffic numbers and search engine responses and voila, you are “in”.

    The computer and the internet are in some ways doing what assembly line automation did to manual labour, turning it into a commodity defined by external standards. We talk of such things as “man hours” and “full time equivalents”. The labour itself is interchangeable ( much less the person doing it! ) Not a lot of thought goes into the value or quality of the work itself, but only in terms of numbers of output or throughput. Content doesn’t matter, only production figures.

    And that’s the way these plagiarists look at intellectual thought. Content is irrelevant, only the production numbers ( traffic, click-throughs etc ) count.

    It’s interesting
