
So, last March, after two startup ideas bombed, I was doing some research on where online advertising revenue was being spent. I figured that if I was going to even partially monetize with ad revenue, it made sense to figure out what topics and niches were paying.

As part of that research, I ended up purchasing a list of the top 50k paying keywords off a shady looking website. The site only took PayPal, and they were only charging 50 bucks, so I figured it was worth it, and I wasn't risking too much aside from exposing my PayPal account, so I bought it.

I was pretty skeptical of the data at first, but I randomly spot checked around 50 search terms, with my Adsense account, and they seemed to correlate pretty closely.

I then wrote a script to cross reference how much people were paying for each set of keywords with how many search results were returned for those keywords. That gave me a ratio of keyword cost to number of results for each entry, which was enlightening.
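For anyone curious what that kind of cross reference looks like, here is a minimal sketch in Python. It is not the original script. The two-column CSV layout (keyword, cost), the function names, and the way result counts are supplied (a plain dict) are all assumptions for illustration, and the sample numbers are made up rather than taken from the purchased list.

```python
import csv
import io


def read_keyword_costs(csv_text):
    """Parse 'keyword,cost' lines into (keyword, cost) tuples.

    The two-column layout is an assumption; the real file may differ.
    """
    rows = []
    for record in csv.reader(io.StringIO(csv_text)):
        if len(record) < 2:
            continue  # skip blank or malformed lines
        rows.append((record[0].strip(), float(record[1])))
    return rows


def rank_by_cost_per_result(rows, result_counts):
    """Return (keyword, cost, results, cost / results), highest ratio first.

    result_counts maps keyword -> number of search results. How those
    counts are collected is left out here; keywords with a missing or
    zero count are skipped so the division is always defined.
    """
    ranked = []
    for keyword, cost in rows:
        count = result_counts.get(keyword)
        if not count:
            continue
        ranked.append((keyword, cost, count, cost / count))
    ranked.sort(key=lambda row: row[3], reverse=True)
    return ranked


if __name__ == "__main__":
    # Made-up placeholder values, only to show the shape of the data.
    sample_csv = "example keyword one,10.0\nexample keyword two,2.0\n"
    sample_counts = {"example keyword one": 1000, "example keyword two": 100}
    for row in rank_by_cost_per_result(read_keyword_costs(sample_csv), sample_counts):
        print(row)
```

A high ratio means an expensive keyword with relatively few competing results, which is the kind of entry the ratio is meant to surface.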

Mind you, this data is suspect. From what I understand, Google doesn't charge a fixed rate for advertising: each time an ad is served, the price is essentially set by an auction and varies accordingly. Also, I bought this data off a shady looking website during a really low point in one of the worst economies in decades. Caveat emptor.

That being said, I thought you might find it interesting.



Do you have the rights to re-distribute this data?


There was no guidance whatsoever on the website I bought it from as to whether or not I could redistribute. I agreed to nothing, and I didn't click "next" on any licensing terms. The site I got it from was really pretty sketchy. The site was only 3 or 4 pages: a description of the product, a price list, a "buy now" page and a download page. I clicked the buy now button, paid $50 and downloaded a CSV. No license, no agreements. Nothing.

Also, it was March '09 that I did this, not March '10. Sorry about that.

I'm really not trying to be shady at all with this, and I really had no idea that people would be uncomfortable with something like this or think it's wrong.

If people have serious moral qualms about this being here, I'm happy to take it down. I just thought it was an interesting hack so I thought I'd share it.


By the Berne Convention, everything is copyrighted on creation. And, to the best of my knowledge, the default is that you can't distribute a copyrighted work unless you have an explicit license to do so.

For example, I can walk into a book store and buy Harry Potter for cash, without signing a license agreement. But, that doesn't mean I can redistribute its contents.

You are usually allowed to resell what you bought under the first sale doctrine (but not to sell copies of it).

IANAL, etc etc


I think there is an exception for "facts," although it may only apply to individual facts, not a collection of them. Anyone here know more about this?


I'm not a lawyer, but per Feist v. Rural, facts cannot be copyrighted in the United States. Collections of facts may be eligible for copyright, but it requires authorship, and hinges on the creativity of presentation (the author must select which facts to include, how to present them, etc.) but the facts within the collection are not entitled to copyright protection. If this collection was generated automatically by accessing the Google API, it would probably not qualify for copyright protection based on a lack of creativity in its creation.

http://caselaw.lp.findlaw.com/scripts/getcase.pl?court=US...


Facts can't be copyrighted. There's no problem with redistributing the data.


FWIW in the UK (and Europe I think) there are database rights too: http://en.wikipedia.org/wiki/Database_rights

Where copyright is termed so as to protect a unique arrangement of data then an ordered listing would appear to be protected.


Facts and ideas can't be copyrighted, but their expression and structure can.

http://www.templetons.com/brad/copymyths.html

This is definitely a copyright violation, and I am shocked and enraged that this was voted so highly on HN and not flagged.

[Edits]

1. The standard "Recipes and collection of recipes" argument. This is a collection of recipes.

2. I am not shocked that there is a copyright violation. I am shocked that such an egregious violation has been voted so highly on HN. We are all digital workers on HN, and such disregard for digital work is (still) shocking.


Enraged? Your reaction doesn't fit the crime, as is so often the case with copyright.


Expression/structure can be copyrighted, but that's also not the whole story: http://en.wikipedia.org/wiki/Feist_Publications_v._Rural_Tel...

If this document merely lists the keywords in ascending/descending order of popularity, it is unlikely that the author's expression is creative enough to warrant copyright.


It is ill-regard of the system, not disregard of the work.


Since when has Hacker News become the Pirate Bay?


Vincent has a point. I don't feel comfortable looking at this, even if the source was shady.


Then, don't look at it.



