How long does it take for you to write a quality article? Let’s assume an hour. And let’s just say over the course of a year, you post 100 of these articles on your site. How would you feel if you stumbled upon another site one day with each and every one of those articles? That’s 100 hours of work that someone didn’t have to do because you did it for them! Would that make you want to hunt them down and … well, use your imagination for the rest.
When you pour all your hard work into writing fresh content for your site and someone copies your article, literally word by word, and makes the same post on their site without ever giving you credit for it, that’s called scraping (aka web scraping or blog scraping).
Blog scrapers are new media thieves who are simply looking to benefit from your energy with no intention of ever letting you know about it. Some scrapers do this manually while others use software tools to “harvest” or “extract” the data. Regardless, what they are doing is stealing from you.
Other than accidentally finding your article posted somewhere or having one of your visitors inform you, there are ways to do your own detective work.
There are, of course, many tools that you could use. From Google Webmaster Tools and Google Blog Search to Yahoo SiteExplorer and Technorati, there seems to be an abundance of sites out there that can show you link data. However, I found the easiest way is to simply Google search your own articles. Here’s how I manually do it:
- Copy a paragraph of your article.
- Paste the text into a Google search box.
- Place quotes before and after it.
- See the results.
I know, there’s no rocket science involved in that approach, but it works! It’s the KISS method, but why complicate things? You might get the occasional scrape that slips through the cracks, but you save yourself a lot of time and energy. Searching your own articles this way once a month will tell you whether scrapers are stealing every single article from you.
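If you have a lot of articles to check, the copy, paste, and quote steps can be scripted. Here is a minimal Python sketch, not a finished tool. The function name `exact_phrase_search_url` and the `max_words` parameter are made up for illustration, and it assumes the common `https://www.google.com/search?q=` URL form. It trims the paragraph to 25 words on the assumption that very long queries may get cut off, so adjust that to taste.

```python
from urllib.parse import quote_plus


def exact_phrase_search_url(paragraph: str, max_words: int = 25) -> str:
    """Build a Google search URL that looks for the paragraph as an exact phrase.

    Hypothetical helper for illustration; max_words=25 is an assumed limit.
    """
    # Keep only the first max_words words so the query stays short.
    snippet = " ".join(paragraph.split()[:max_words])
    # Drop straight double quotes inside the text so they don't end the phrase early.
    snippet = snippet.replace('"', "")
    # Wrap the snippet in quotes (exact-phrase search) and URL-encode it.
    return "https://www.google.com/search?q=" + quote_plus(f'"{snippet}"')


if __name__ == "__main__":
    sample = "When you pour all your hard work into writing fresh content for your site"
    print(exact_phrase_search_url(sample))
```

Open the printed URL in your browser and read the results just as you would with the manual steps.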
What happens if you actually find one of these thieves? In my next article, I’ll tell you what I have personally done as well as recommended for others to do when we find a web scraper.
Have you found anyone scraping your work before?

I am a business professional with an entrepreneurial spirit. Although I have an MBA and managed websites and IT departments for several Fortune 500 companies ...
I can’t wait for your next article… I am currently tracking at LEAST 5 sites that do nothing but steal our content and never even credit us for it, and well over 10-15 more that occasionally steal. I gave up sending DMCA forms, as that doesn’t even deter them anymore; most know they can just respond and negate it or even start a new blog just as fast. I’m SO sick of these people, but I cannot find any way to combat them. If I retaliate in any way, it just gets my site spammed.
Hope you have some good tips that may help. Honestly, I just don’t think there is much I can do but ignore it and hope I win out over the lamers. A couple of these sites are more popular than mine, so it’s hard to get ahead, and even with proof, their users are “loyal” and don’t seem to care.
.-= Jake´s last blog ..Super Awesome Unreleased Animals! =-.
Try contacting the host and CC’ing the site. If they’re hosted in the US, a DMCA to the host is almost certain to shut them down.
.-= Anne´s last blog ..Ok, so now this blog runs on Thesis… =-.
I tried that originally, but some of the more popular or knowledgeable sites know how to get around that. Most hosts won’t shut them down without first giving them notice to take down the content. Unfortunately, if they respond to a DMCA with a counter DMCA, it takes all responsibility away from the host and you are expected to settle it in court with the offender. So they are more like a warning that the offender can say no to…
I cannot afford the time or money to take anything like this to court. I still do DMCAs occasionally for the smaller sites that don’t know about this, but then they are the ones I care least about if they steal my content… It’s a big circle of drama that I have yet to find a decent solution to. At the moment I’m just ignoring it and hoping eventually they will give up. I have little faith in my current plan, though. lol
.-= Jake´s last blog ..Super Awesome Unreleased Animals! =-.
Regardless of whether or not you are able to stop them, if you’re getting scraped often, then you should probably have deep links on your articles so when it’s copied you get your own backlinks.
Also, I would include an article footer that describes you and your site.
Exactly… linking to other articles on your blog will actually direct users to your blog. That way, users will come to think that yours is the original. I’m actually confident that SEs won’t penalize your site. SEs also check the maturity of the blog and of the individual posts. The worst thing here is when the duplicates rank better than the original…
.-= Cebu Tech Blogger´s last blog ..Learn Blogging Using Blogger =-.
Oh, I’ve been a victim of article scraping before… They used a plugin on that blog to scrape the entire contents of my blog. I emailed the owner but didn’t get a response. Anyway, Google is intelligent enough to distinguish which is the original among duplicate content. I also use Copyscape. I’ll try what you’ve done…
.-= Cebu Tech Blogger´s last blog ..Learn Blogging Using Blogger =-.
So, what can we do if our article has been copied?
.-= Robin Benedict @Internet Security´s last blog ..Kaspersky License Key & Activation Code =-.
I think you have to read this: http://freebloghelp.com/found-a-web-scrape-thief-now-what/
.-= Cebu Tech Blogger´s last blog ..Line2 VOIP App for iPhone is down =-.
I like the deep linking idea. I need to start doing that so that at least when my sites get scraped I get some value out of it. Copyscape is good but I find it a bit cumbersome.
I was outraged when this happened to me. Adding to injury was the insult of seeing my article posted on Digg attributed to someone else.
Here’s what I’m doing about it: very little. I just don’t care, because I believe this behavior has a limited lifespan preying on unsophisticated users. These days I don’t trust anything on the web that doesn’t have a real face behind it. Such information literally has no value to me.
That being said, there are things you can do:
* Avoid using the letters ESS EEE OHH in the title element, url or any header element. Articles with those letters get fully scraped literally within seconds of hitting your feed. It’s truly amazing. It’s like pr0n or something.
* Avoid trending topics in those same elements. I published 2 articles on the EYE PAT this week, and again, both were scraped instantly.
* Consider moving to partial feeds. WordPress doesn’t really handle partial feeds very well, and people generally stink at writing teaser copy, but a partial feed will stop a lot of this.
* Internal linking is how I catch most of it.
* I don’t fight it anymore because the last time I tried, I spent hours attempting to determine where the site was actually hosted. Turns out it had some sort of shrouding or something, and none of the hosting companies would own up to it. “Not our problem.”
* Posting your stuff to Digg right after publishing gets you a time stamp, should that ever be necessary. It will certainly help when someone else posts it to Digg with their attribution.
* I assume the universe will send me customers who are reading my stuff because I wrote it, not because they found some article in a search engine, which is displayed on a content scraper.
Anyways, I’m done for now, and I’m going to use this for tomorrow’s article as well. Gabe, I had to mass unsubscribe from RSS and am rebuilding my feed now, so I’ll be spending a little more time over here in the near future. I’m also planning on linking to this one, and I’ll link to your next when you publish it.
.-= Dave Doolin´s last blog ..How to Practice Blogging Like a Master – You can do this =-.
Thanks for the list of tips. Just keep those deep links in your articles and you’re all set.
You should be getting some traffic from my followup article.
.-= Dave Doolin´s last blog ..7 Excellent Tips for Handling Content Robbers (’cause you cain’t shoot ‘em) =-.
Sheesh! Thieves have no shame!
I had no idea this was such a widespread problem. I haven’t done any checking, but it’s worth taking a half hour or so a month to see what the dastardly cowards are up to …. and I’ll read your next post, Gabe, to see how to dispense with these ignoramouses (if I was a mouse I would be big time cheesed off).
.-= Valentina´s last blog ..WordPress Direct Review =-.
It is a sad thing to see my work scraped, and I discovered that most of it is a result of submitting my site to some RSS directories.
Thanks for the nice post.
.-= Onibalusi Bamidele´s last blog ..Blogging your passion =-.
Hi Gabe, it’s been a while.
I don’t bother about it, simply because I have prevented my blog’s content from being copied.
Anyone can go over to my blog, try to copy content, and see what happens.
.-= Olusegun´s last blog ..I’m back – Rising from the Ashes like a Phoenix =-.
I have used this on many occasions but a lot of articles are recycled instead of being verbatim, that’s if the thief has any sense!
Some sites that pull data from multiple blogs and aggregate it are alright. As for the ones that just blatantly steal the whole of another blogger’s site, that is out of order.
Aggregating is fine as long as credit is given. The scrapers I’m talking about are the ones who intentionally steal others’ work and post as their own.
It makes me mad seeing my own article posted on someone’s website without my consent or even credit for doing so. It’s plagiarism, I think.
.-= cebu attractions´s last blog ..Gadgets on Vacation Travel =-.
Generally I use Copyscape to catch those rats who are stealing my blog cheese, but here I find an easier way and I will surely try this one. Thanks, Gabe!
.-= Rakesh Solanki´s last blog ..How to Get Your Twitter Account Verified? =-.
It’s happened to me twice. Both times I managed to track down who it was from the IP, which I got from my sitemeter.
One time it was a seemingly respectable Indian businessman living in the US – a gentleman called Tushar Matar who claimed his innocence, but neither I nor my blog readers were convinced. http://thebookaholic.blogspot.com/2009/03/tushar-matar-blog-content-thief.html
The second time it was a guy called Aubrey Hall in Ireland who, I discovered from his IP address, had a whole bank of computers (he was also logged into the NASA website). He had numerous blogs – all of them scraping content. Googling his IP further, I found his name and all the porn sites he was subscribed to. I wrote to him and he more or less laughed in my face (not only a thief but bloody rude as well) and said I had a Creative Commons license on my blog, which he said gave him permission to take my stuff. He did eventually remove my content after some acrimonious emails in which he huffed as if he were the wronged party. More here – http://thebookaholic.blogspot.com/2008/05/my-blog-content-stolen.html
The moral of the story: stay alert to the possibility of content being stolen. Install a site meter on your site that tracks visitors’ IP addresses. Do some detective work with that IP number. Be prepared to name and shame the perp. Inform other people whose work has been stolen, and maybe you can act together.
.-= bibliobibuli´s last blog ..Best Kids’ Books =-.
That is one fantastic story, bib.
Thanks for taking the time to write it up.
I don’t bother with these clowns any more, too much to lose if one of them decides to run a DOS on my server.
.-= Dave Doolin´s last blog ..SEO for Writers and Artists (or, how to date your search engine) =-.
I have to say that freebloghelp.com is really a good website
A useful guide
Hi, I am new here.
The problem is that it returns results from the Related Websites plugin and also excerpts from social bookmarking sites.
I found a blog with a Copyscape problem and emailed the owner, and he has corrected it.
If possible, as you gain knowledge, please add to this blog with more information. I have found it enormously useful.
Cool post. Waiting for you to continue the topic.
Interesting post. I would like to visit regularly.
Thanks for sharing.
I like this post, adding to my knowledge of SEO techniques. Thank you.
This is definitely the nastiest thing another blogger could do to a fellow blogger. Others even spin your articles and change everything while the idea stays the same, though that is a bit better than scraping everything. I once saw my article published on another site, and when I tried to submit it myself, the site wouldn’t allow it because the title already existed. I had tried to make that title as unique as possible, but that nasty blogger just took it from me. Sigh! I will try your method. Do you have any idea how we can make sure that our blogs are unscrapable? lol No such word!
Once you know that someone is stealing your blog articles, how do you fight it?
Gabe,
I agree with you. I usually run Copyscape and then also check using blekko.com to make sure that my hours of work don’t go to waste.
Free Hindi SMS’s last post…Happy New Year Sms – Who Else Wants Hindi New Year Sms 2012?