Right, but where do they get their data? Are they rolling their own JS API that ...

executive · on March 6, 2017

They buy it direct from credit card companies, retail stores, 'free' online widgets (AddThis biz model: https://www.quora.com/Whats-the-business-model-for-AddThis-a...), etc

AndrewKemendo · on March 6, 2017

Right, that all makes sense but seems like it would be more of a grind than a simple JS API. You are effectively creating a marketplace.

executive · on March 6, 2017

Correct - see http://www.crosspixel.net for example:

"Cross Pixel's DMP is powered by our proprietary data relationships with more than 5,500 web sites and mobile apps where we identify and harvest the shopping and researching behaviors on over 650 million unique browsers. Our data partners are leading e-Commerce sites, search directories, comparison shopping engines, coupon sites and toolbars across North America and Latin America."

In general, the 'marketplace' is usually the DMP (Data Management Platform) where two parties can meet and share segments without data leakage (for example - Krux is a DMP used by a lot of Fortune 500 companies).

However, the lines between DMP and Data Provider have been blurring in recent years...

AndrewKemendo · on March 6, 2017

Great answer, thanks! I hadn't heard the term DMP.

x0x0 · on March 6, 2017

BlueKai is one of the biggest; it's a data marketplace for cookie-tagged data. They were bought by Oracle a few years ago.

GFischer · on March 6, 2017

And Krux's website says they have been acquired by Salesforce.

http://www.krux.com/blog/general/salesforce-krux/

xerxes777 · on March 6, 2017

Big thanks for the info, that was insightful.

dataflow · on March 6, 2017

> Right, but where do they get their data?

Yeah, I'm basically looking for the companies whose answer to this question is "by actually mining the data ourselves from your doctor, grocery store, Facebook, etc.".

dsp1234 · on March 6, 2017

Why do you think it works this way and not the other way around as well? Grocery stores shop their data around to see who will pay the most for it. A person who is out of work goes to the local courthouse and requests a bunch of records, compiles them into a spreadsheet and then cold (or warm with something like LinkedIn) calls to see if anyone is interested in the data. An online quiz company is going out of business, and as part of their bankruptcy settlement, they sell off their database of answers at auction. etc, etc, etc.

As pointed out elsewhere, it's a marketplace, and as such there are going to be buyers and sellers. Some of those sellers are going to be primary sources themselves.

dataflow · on March 6, 2017

Good question. I thought it works that way because it takes a lot of work to sanitize and cross-link people's data to other datasets accurately, so even if it's a "push" model, I still can't believe that every single website that does this does their own data cleaning & ML & whatnot. It's far too much repeated work and a good business to just do the work and sell it off to others. So I'd assume a few companies have to be making profits at the lower layer regardless of whether it's a pull or a push model.

dsp1234 · on March 6, 2017

As mentioned elsewhere, you seem to have a lot of unfounded assumptions, and misconceptions about this sector.

> It's far too much repeated work

Companies will repeat work over and over again if it's cheaper than buying it, if they have custom needs that the available data doesn't meet, etc. Businesses repeat work all the time, and this is not any different. Additionally, many businesses in the sector are themselves the primary source for data. For them it's not repeated work.

> a good business to just do the work and sell it off to others.

Yes, that's why some aggregators exist. They make money by brokering the data from multiple sources, some primary and some resold. But they are the tip of the iceberg.

You seem to be under the impression that there is some small list of companies who are all working from primary sources, and that everyone then gets feeds of data from those companies. This would make sense if gathering data were very difficult, or had natural-resource-like limitations. That model works well for something like diamond mining (as compared to diamond growing), because the number of diamond mines is limited, and there is a natural entry barrier. However, that doesn't take into account the fact that gathering this data is generally easy. Sometimes it's very easy, such as an SFTP feed of data from a government records database. Sometimes it's a bit harder, such as needing to be physically present to obtain the data.

That means there is very little barrier to entry, and thus generally there is going to be a lot of competition, and thus many companies vying to make money.

Personal data has value just like any other commodity. So a bit of economic theory goes a long way toward understanding what the boundaries of a market might be. Goods with low production costs and high profits generally attract a large number of companies to the market.

VT_Drew · on March 6, 2017

> Right, but where do they get their data?

You give it to them when you sign up for that stupid rewards card.