Training our own AI models

Posted by tartieret | 2 hours ago | 94 comments

JimDabell an hour ago

“Opt-in by default” is an oxymoron. If it’s default then I haven’t opted into anything. It’s been enabled by default.

Waterluvian an hour ago

PostHog was a system we set up once, generally don't think about, and review from time to time, providing some occasional value. It was mostly harmless to leave around.

But it's apparently yet one more thing we have to be actively suspicious of as it defaults towards an intolerable state. So it's easier to just rip it out of the system and move on.

sixtyj 2 hours ago

Most companies would bury this change in a deceptively boring T&Cs update, but we value transparency, so here's what you need to know in an internet-friendly numbered list:

1. Users on our EU cloud instance are opted out by default

2. So too users with agreements that prevent training (e.g. BAA, MSA, or similar)

3. All other users on our US cloud instance are opted in by default

4. We will anonymize all data before it's used for training

5. We will only use data that already exists in your PostHog instance

6. We will do all the model training ourselves, which means...

7. We won't sell or send your data to third-party model providers

8. You can opt out at any time via your org settings in PostHog (admin access required)

9. Training won't start until June 29, so there's plenty of time to decide

Dave_Rosenthal 18 minutes ago

They say, "our goal here is to improve PostHog as a product for our customers, not to expose or sell models trained on your data" but then don't actually list that as a limitation in the bulleted points.

AFAICT this now gives them default permission to train an LLM on your code (as PostHog telemetry data is inextricably tied to your code), use it, and even sell it if they wanted to (as it's not your data anymore, it's their model). Yikes.

thecatapps 29 minutes ago

It's probably very obvious by now, but there's something to be said about companies with the "SF Quirky" vibes:

- The OS Redesign

- "Sexy Legal Documents"

- Emails with "<relevant hedgehog meme goes here>" as the subject line

- Having a merch shop with action figures of your CEO

It works both ways. When you're looking for adoption and making very pro-user moves, I guess it can be a benefit. However, when you're now looking to grow revenue and making very anti-user moves, it adds insult to injury.

I'm the last person to say that tech "shouldn't be fun" or something overly broad like that, but if your messaging doesn't match the decisions of leadership, you're gonna have a bad time.

frankest an hour ago

What a great reminder to build my own analytics and self-host. PostHog just lost a customer. They could easily send an email to each customer asking if we want this. The assumption means they have no product intuition about their own customers, let alone the customers of their customers. Bye.

infecto an hour ago

Thanks for posting. I had been on the fence about switching for the past few months. The new AI products combined with the weird UIs had been irking me for a while. This is the final nail in the coffin. Opt-in is a terrible business model imo.

tines an hour ago

“Opt-in by default” = opt-out?

brauhaus an hour ago

Every day I'm more glad about EU legislation, that's all I have to say for now

abustamam 37 minutes ago

> Why this is opt out, not opt in

> Put simply, because otherwise we will not have enough data to train a model that's actually useful.

AKA we won't be able to make as much money if we required you to give us permission to use your data.

rad_val 14 minutes ago

All of them do if you don't do something about it (e.g. migrate to self-hosted solutions). Trusting a ToS in 2026 is as naive as it gets.

freshnode an hour ago

Why won't companies explain what anonymisation means for them?

PostHog has unfettered logged-in access to some sensitive stuff. What steps are they actually taking to scrub sensitive data from my replays before they are used to train a model?

stevoski 8 minutes ago

I’ve been evaluating PostHog for our company.

I’ve now made our decision. We won’t be using them.

If they are going to position themselves as the non-slimy no-BS guys, they can’t pull this nonsense.

the__alchemist 44 minutes ago

How much are they paying the users?

ASinclair an hour ago

Mostly unrelated but the name of this company makes me think it's a Dick-Pics-as-a-Service provider.

mrcwinn an hour ago

Gross.

They’ll use your product and your data to later sell a product back to you.

gyoridavid 35 minutes ago

I feel that the US should step up its legislation game and make sure these companies can't retroactively make rules to steal their users' data. I know it's trendy to hate the EU, but its legislation actually protects the users, and not the companies' interests.

jen20 an hour ago

Perhaps if they hopped on a quick call for five minutes with some customers, they'd realize quite how little appetite there is for putting up with being opted into things automatically in the US but not in the EU.

As an aside, this also means the EU rules are working.

bigstrat2003 an hour ago

This is the fastest way possible to ensure I will never do business with you, or stop doing business with you if I already am.

tartieret 2 hours ago

I initially used PostHog as an alternative to Google Analytics with more privacy. Now they want to use the data for a business purpose. Working hard towards enshittification?

calmbonsai an hour ago

LOL. You stay classy PostHog.

Henchman21 an hour ago

You can’t “opt in” to something that is the default. The choice is made for you. And when the choice is made for you? You haven’t opted in or out.

dzonga 38 minutes ago

Another would-be excellent product company destroyed, or being slowly destroyed, by VCs and the never-ending chase for 'growth'.

mikkelam 37 minutes ago

The enshittification has begun. Time to move on!

TZubiri an hour ago

Today I was thinking: if I start a company in the LLM tooling space, I would put it in the company mission, in the incorporation documents, that client data will not be used for training.

The temptation and the value are too great, and the opt-in/opt-out consent thing ends up being a fuckery where the company tries to trick the user into allowing them to take a look into the data, presumably because they are selling the product at a loss and need an alternative revenue model.

Just make it impossible from the get-go. The fine print would be that the data can be shared out-of-band explicitly, in an email or by copy-pasting it into a support chatbox, but there would be no mechanism for us to read the data from the databases, much less from the client.

I don't mean it would be an air-tight mechanism like Signal or ProtonMail. If a court order asked us to produce client info, we would still reserve the right to produce the data, but only exceptionally, and definitely not for training models.

slopinthebag an hour ago

PostHog had better transition to an AI company soon, because they are one of the SaaS companies that are absolutely cooked by vibe coding. What it does is extremely amenable to LLMs and it's also non-critical for a business, making it an excellent candidate for replacement by in-house solutions. And if it means never having to use their website again, that's even better.

I wonder if they regret going open source, considering people will be replacing them using LLMs that have surely been trained on their code.
