How long does it take for you to write a quality article? Let’s assume an hour. And let’s just say that over the course of a year, you post 100 of these articles on your site. How would you feel if you stumbled upon another site one day with each and every one of those articles? That’s 100 hours of work that someone didn’t have to do because you did it for them! Would that make you want to hunt them down and … well, use your imagination for the rest.
When you pour all your hard work into writing fresh content for your site and someone copies your article, literally word for word, and posts it on their site without ever giving you credit for it, that’s called scraping (aka web scraping or blog scraping).
Blog scrapers are new media thieves who are simply looking to benefit from your energy with no intention of ever letting you know about it. Some scrapers do this manually while others use software tools to “harvest” or “extract” the data. Regardless, what they are doing is stealing from you.
Other than accidentally finding your article posted somewhere or having one of your visitors inform you, there are ways to do your own detective work.
There are, of course, many tools that you could use. From Google Webmaster Tools and Google Blog Search to Yahoo SiteExplorer and Technorati, there seems to be an abundance of sites out there that can show you link data. However, I found the easiest way is to simply Google search your own articles. Here’s how I manually do it:
- Copy a paragraph of your article.
- Paste the text into a Google search box.
- Place quotes before and after it.
- See the results.
I know, there’s no rocket science involved in that approach, but it works! It’s the KISS method, but why complicate things? The occasional scrape might slip through the cracks, but you save yourself a lot of time and energy. Searching your own articles this way once a month will tell you whether scrapers are stealing every single article from you.
What happens if you actually find one of these thieves? In my next article, I’ll tell you what I have personally done, and what I recommend others do, when we find a web scraper.
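If you’d rather not paste by hand every month, the manual steps above can be sketched in a few lines of Python. This is only a rough sketch, not a finished tool: the function name `exact_phrase_search_url` and the `max_words` cutoff are my own assumptions (long queries tend to get truncated, so the sketch keeps only the first 32 words by default). It just builds the search link for you to open in a browser; it doesn’t fetch or parse any results.

```python
from urllib.parse import quote_plus


def exact_phrase_search_url(paragraph: str, max_words: int = 32) -> str:
    """Build a Google search URL that looks for the paragraph as an exact phrase.

    `max_words` is an assumed cutoff to keep the query short; adjust as needed.
    """
    # Drop any double quotes inside the text so they don't break the phrase.
    cleaned = paragraph.replace('"', "")
    # Collapse whitespace and keep only the first `max_words` words.
    snippet = " ".join(cleaned.split()[:max_words])
    # Wrap the snippet in quotes (the "place quotes before and after it" step)
    # and URL-encode it.
    return "https://www.google.com/search?q=" + quote_plus(f'"{snippet}"')


if __name__ == "__main__":
    text = "Blog scrapers are new media thieves"
    print(exact_phrase_search_url(text))
```

Open the printed link in your browser and look through the results the same way you would after a manual search.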
Have you found anyone scraping your work before?

I am a business professional with an entrepreneurial spirit. Although I have an MBA and managed websites and IT departments for several Fortune 500 companies ...
I have found people copying content from one of my niche sites, but I decided not to do anything because people said it was a lot of effort, so I’ll just stick with the links that they are sending back to me (because all my posts have internal linking, and they kept them).
Many scrapers copy the entire post so you do get backlinks. However, you also get duplicate content.
I have found some of them copying my content. First I would ask them gently, by sending an email, to remove the content or give me credit. Until now everyone has responded well and given me due credit. I will keep this in mind when something goes really wrong.
Rajesh Kanuri´s last blog ..4 Useful & Essential SEO WordPress Plugins
Once in a while you can find someone who made an “innocent” mistake but I’m referring to the obvious theft of your writing, your ideas, your time.
For example, someone who scrapes every single article is usually someone I’d rather just ask questions later.
Good points, but the best way to check is with http://www.copyscape.com/

Free Vector Graphics´s last blog ..Rolled Canvas Giveaway from Uprinting
I love Copyscape. The problem is that it returns results from the Related Websites plugin and also excerpts from social bookmarking sites.
I have and as Rajesh said, the first email will usually do the trick.
Shoot me an email and I’ll send you some info on a really cool plugin I was introduced to recently… takes care of this nicely.

Dennis Edell´s last blog ..Would You Like a FREE Banner Ad Position?
I’ve had people steal content, you just have to roll with the punches.
patsypill´s last blog ..SquishyCash Proof Of Payment
Copyscape (http://copyscape.com/) also somewhat works. It helps a bit in finding copied content.
0 Results for FBH.
Brad´s last blog ..Opera Mini Soon Will Be Available For The iPhone
As I mentioned in my response to Free Vector Graphics, it really depends on when you check. This site is 0 sometimes and other times has results from the Related Websites plugin.
Still, Copyscape is probably the best tool out there if you don’t want to check manually.
you can include your keywords in the article so that it helps to promote your site in search engines
yes, it seems good for seo to promote your site.
mp4 convertion software´s last blog ..Hello world!
I can’t wait for your next article… I am currently tracking at LEAST 5 sites that do nothing but steal our content and never even credit us for it, and well over 10-15 more that occasionally steal. I gave up sending DMCA forms as that doesn’t even deter them anymore; most know they can just respond and negate it or even start a new blog just as fast. I’m SO sick of these people, but I cannot find any way to combat them. If I retaliate in any way it just gets my site spammed.
Hope you have some good tips that may help. Honestly, I just don’t think there is much I can do but ignore it and hope I win out over the lamers. A couple of these sites are more popular than mine, so it’s hard to get ahead, and even with proof their users are “loyal” and don’t seem to care.
Jake´s last blog ..Super Awesome Unreleased Animals!
Try contacting the host and CC’ing the site. If they’re hosted in the US, a DMCA to the host is almost certain to shut them down.
Anne´s last blog ..Ok, so now this blog runs on Thesis…
I tried that originally, but some of the more popular or knowledgeable sites know how to get around that. Most hosts won’t shut them down without first giving notice to take down the content. Unfortunately, if the offender responds to a DMCA with a counter DMCA, it takes all responsibility away from the host and you are expected to settle it in court with the offender. So they are more like a warning that the offender can say no to…
I cannot afford the time or money to take anything like this to court. I still do DMCAs occasionally for the smaller sites that don’t know about this, but then they are the ones I care least about if they steal my content… It’s a big circle of drama that I have yet to find a decent solution to. At the moment I’m just ignoring it and hoping they will eventually give up. I have little faith in my current plan though. lol
Jake´s last blog ..Super Awesome Unreleased Animals!
Regardless of whether or not you are able to stop them, if you’re getting scraped often, then you should probably have deep links on your articles so when it’s copied you get your own backlinks.
Also, I would include an article footer that describes you and your site.
Exactly… linking to other articles of your blog will actually direct users to your blog. That way, users will come to think that yours is the original. I’m actually confident that SEs won’t penalize your site. SEs also check the maturity of the blog and the individual contents. The worst thing here is when the duplicates rank better than the original…
Cebu Tech Blogger´s last blog ..Learn Blogging Using Blogger
Oh, I’ve been a victim of article scraping before… They used a plugin on that blog to scrape the entire contents from my blog. I just emailed the owner, but didn’t get a response. Anyway, Google is intelligent enough to distinguish which is the original between duplicate contents. I use copyscape also. I’ll try what you’ve done…
Cebu Tech Blogger´s last blog ..Learn Blogging Using Blogger
So, what can we do if our article has been copied?
Robin Benedict @Internet Security´s last blog ..Kaspersky License Key & Activation Code
I think you have to read this: http://freebloghelp.com/found-a-web-scrape-thief-now-what/
Cebu Tech Blogger´s last blog ..Line2 VOIP App for iPhone is down
I like the deep linking idea. I need to start doing that so that at least when my sites get scraped I get some value out of it. Copyscape is good but I find it a bit cumbersome.
I was outraged when this happened to me. Adding to injury was the insult of seeing my article posted on Digg attributed to someone else.
Here’s what I’m doing about it: very little. I just don’t care, because I believe this behavior has a limited lifespan preying on unsophisticated users. These days I don’t trust anything on the web that doesn’t have a real face behind it. Such information literally has no value to me.
That being said, there are things you can do:
* Avoid using the letters ESS EEE OHH in the title element, url or any header element. Articles with those letters get fully scraped literally within seconds of hitting your feed. It’s truly amazing. It’s like pr0n or something.
* Avoid trending topics in those same elements. I published 2 articles on the EYE PAT this week, and again, both were scraped instantly.
* Consider moving to partial feeds. WordPress doesn’t really handle partial feeds very well, and people generally stink at writing teaser copy, but a partial feed will stop a lot of this.
* Internal linking is how I catch most of it.
* I don’t fight it anymore because the last time I tried, I spent hours attempting to determine where the site was actually hosted. Turns out it had some sort of shrouding or something, and none of the hosting companies would own up to it. “Not our problem.”
* Posting your stuff to Digg right after publishing gets you a time stamp, should that ever be necessary. It will certainly help when someone else posts it to Digg with their attribution.
* I assume the universe will send me customers who are reading my stuff because I wrote it, not because they found some article in a search engine, which is displayed on a content scraper.
Anyways, I’m done for now, and I’m going to use this for tomorrow’s article as well. Gabe, I had to mass unsubscribe from RSS and am rebuilding my feed now, so I’ll be spending a little more time over here in the near future. I’m also planning on linking to this one, and I’ll link to your next when you publish it.
Dave Doolin´s last blog ..How to Practice Blogging Like a Master – You can do this
Thanks for the list of tips. Just keep those deep links in your articles and you’re all set.
You should be getting some traffic from my followup article.
Dave Doolin´s last blog ..7 Excellent Tips for Handling Content Robbers (’cause you cain’t shoot ‘em)
Sheesh! Thieves have no shame!
I had no idea this was such a widespread problem. I haven’t done any checking but it’s worth taking a half hour or so a month to see what the dastardly cowards are up to …. and I’ll read your next post, Gabe, to see how to dispense with these ignoramouses (if I was a mouse I would be big time cheesed off).
Valentina´s last blog ..WordPress Direct Review
It is a sad thing to see my work scraped, and I discovered most of it is a result of submitting my site to some RSS directories.
Thanks for the nice post.
Onibalusi Bamidele´s last blog ..Blogging your passion
Hi Gabe, it’s been a while.
I don’t bother about it simply because I have prevented my blog’s content from being copied.
Anyone can go over to my blog and try to copy content and see what happens.
Olusegun´s last blog ..I’m back – Rising from the Ashes like a Phoenix
I have used this on many occasions but a lot of articles are recycled instead of being verbatim, that’s if the thief has any sense!
Some sites that pull data from multiple blogs and aggregate them are alright. As for the ones that just blatantly steal the whole of another blogger’s site, that is out of order.
Aggregating is fine as long as credit is given. The scrapers I’m talking about are the ones who intentionally steal others’ work and post as their own.
It makes me mad seeing my own article posted on someone’s website without my consent or even credit for doing so.. it’s plagiarism, I think
cebu attractions´s last blog ..Gadgets on Vacation Travel
Generally I use Copyscape to catch those rats who are stealing my blog cheese, but here I have found an easier way and I will surely try this one. Thanks Gabe!
Rakesh Solanki´s last blog ..How to Get Your Twitter Account Verified?
It’s happened to me twice. Both times I managed to track down who it was from the IP, which I got from my sitemeter.
One time it was a seemingly respectable Indian businessman living in the US – a gentleman called Tushar Matar who claimed his innocence but neither I nor my blog readers were convinced. http://thebookaholic.blogspot.com/2009/03/tushar-matar-blog-content-thief.html
The second time it was a guy called Aubrey Hall in Ireland who I discovered from his IP address had a whole bank of computers (he was also logged into the NASA website). He had numerous blogs – all of them scraping content. Googling his IP further I found his name and all the porn sites he was subscribed to. I wrote to him and he more or less laughed in my face (not only a thief but bloody rude as well) and said I had a creative commons license on my blog which he said gave him permission to take my stuff. He did eventually remove my content after some acrimonious emails in which he huffed as if he were the wronged party. More here – http://thebookaholic.blogspot.com/2008/05/my-blog-content-stolen.html
The moral of the story: stay alert to the possibility of content being stolen. Install a site meter on your site that tracks visitors’ IP addresses. Do some detective work with that IP number. Be prepared to name and shame the perp. Inform other people whose work has been stolen and maybe you can act together.
bibliobibuli´s last blog ..Best Kids’ Books
That is one fantastic story, bib.
Thanks for taking the time to write it up.
I don’t bother with these clowns any more, too much to lose if one of them decides to run a DOS on my server.
Dave Doolin´s last blog ..SEO for Writers and Artists (or, how to date your search engine)
very Interesting stuff
Recent Issue Today is provides latest issues, sports news, UFC 114,ufc online,mma, UFC results and scoops buzzing the internet. Delivering sizzling Issues just got better here in recent issue today
Recent issue today´s last blog ..Steve Clancy Hill AKA Steve Driver Dies in Police Standoff
I have to say that freebloghelp.com is really a good website
A useful guide
Hi, I am new here.
I found a blog with a Copyscape problem, emailed the owner, and he has corrected the error.
I am glad that new technologies are coming out in web design that make things easier, improved, and better looking for design.
koziol products´s last blog ..Quality UK department store
If possible, as you gain knowledge, please add to this blog with more information. I have found it enormously useful.