this post was submitted on 04 Jan 2024
Fediverse
Are there any instances that aren’t free? Instances which run ads and tracking?
Today almost no instances run ads (misskey is as far as I know the only platform that's got support for ads) and Threads is the only one that does tracking. I'm using "free fediverses" the way https://freefediverse.org/index.php/Main_Page does -- instances that reject federation with Meta.
You do realize that federating with meta doesn’t mean that the instances which allow federation with meta will be tracked more, right? Meta users are going to be tracked as much as meta users are going to be tracked. The old tricks that facebook used to do to track everyone with a facebook account everywhere on the net don’t really work any more in modern browsers and won’t ever work on an instance that doesn’t have a facebook “share this link” or “like button” integration.
Technically, if an instance did have those buttons and facebook users in older browsers used those instances, that would be tracked by meta, even if the instance itself didn’t federate with meta.
You do realize that instances federating with Threads will share data with Threads, and that Meta's supplemental privacy policy specifically says that they'll use all activity that federates to meta for tracking and ad targeting, right?
So for example, if you're on an instance that federates with Threads, and somebody on Threads is following you, all of your posts -- including your followers-only posts -- will get tracked by Meta. Or if somebody boosts your post and they've got followers on Threads, your post will be tracked by Meta. Or if you like, boost, or reply to a post that originated on Threads, it gets tracked by Meta. And these are just the most obvious cases. What about if somebody on an instance that's not Threads replies to a Threads post, and you reply to the reply? It depends on how the various software implements replies -- ActivityPub allows different possibilities here. And there are plenty of other potential data flows to Meta as well.
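To make the first case concrete, here's a minimal sketch of how a server might work out which other servers receive a copy of a post. It's a simplified model, not any real implementation: the function name, the inbox URLs, and the `threads.example` domain are all made up for illustration, and real software also deals with shared inboxes, boosts, and replies.

```python
from urllib.parse import urlparse


def delivery_domains(follower_inboxes, blocked_domains=frozenset()):
    """Return the set of server domains that would receive a copy of a post.

    Hypothetical helper: models delivery as "one copy per follower's server",
    skipping any server the instance has defederated from.
    """
    domains = set()
    for inbox in follower_inboxes:
        host = urlparse(inbox).hostname
        if host and host not in blocked_domains:
            domains.add(host)
    return domains


# Made-up follower inboxes; threads.example stands in for a Threads-like server.
followers = [
    "https://instance-a.example/users/alice/inbox",
    "https://threads.example/users/bob/inbox",
    "https://instance-b.example/users/carol/inbox",
]

# Instance that federates with everyone: the post reaches all three servers.
federating = delivery_domains(followers)

# Instance that defederates from threads.example: that server gets no copy.
defederated = delivery_domains(followers, blocked_domains={"threads.example"})
```

In this toy model a single follower on a server is enough for that server to get the post, which is the flow described above.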
Of course they're still just at the early stages of federation so it's hard to know just how it'll work out. Individually blocking Threads might well provide a lot of protection. But in general, instances which federate with Meta will almost certainly be tracked significantly more than instances that don't.
I just wrote a comment explaining in detail how tracking works and why it wouldn’t work with lemmy. I suggest you read it or skim it first https://lemmy.world/comment/6404079
If those instances choose to share data with Threads, you should not join those instances. Federating with threads shares “data” in the form of content, which is how the fediverse works. But this data is the content we are looking for - posts. The “data” you’re worried about being shared (tracking info, identifying info) won’t be shared. See the linked post for more details.
I’ve got some bad news for you buddy: there’s defederating and there’s blocking. If meta or any other company wants to, right now they can create a crawler for the entire fediverse, follow everyone and log everything. We have no evidence that people aren’t already doing this so I would assume that they are. Lemmy isn’t an isolated island, it’s a public internet-type software where content exists on the internet. Don’t want your content used by AI or linked to your pseudonymous lemmy account? Your only option is to join an instance that isn’t connected to the Internet (or at least one that doesn’t publicly allow access to accounts, something that restricts all communities to community members only). Defederation simply means that Threads users can’t interact with your instance and vice versa.
Please keep in mind that there are open source developers who understand that facebook is just another silly site (i.e., isn’t the internet, isn’t the gods of the internet). The only way this tracking nightmare you’re describing comes true is if Lemmy developers decide to make instances track users and ship that private tracking data to facebook.
As for “site A tracks users who interact with site A” yeah, that’s the internet for you.
Federation isn’t complex. I explained this in the linked post. The one point I want to put across here is, if your instance decides to defederate from threads, your instance is still going to be tracked by meta and everyone else, and you probably won’t care because you haven’t in the past. It’s a different kind of tracking, not the 3rd party web-based tracking we’re used to when just visiting any site. There’s some exceptions to this which I’ve outlined in the linked post.
𝕯𝖎𝖕𝖘𝖍𝖎𝖙: If those instances choose to share data with Threads, you should not join those instances.
Also 𝕯𝖎𝖕𝖘𝖍𝖎𝖙: Federating with threads shares “data” in the form of content
I appreciate all the time and energy you're putting into the comments here, but what it comes down to is that you're not concerned about the difference between the federation scenario -- where this data is given to Threads under an agreement that explicitly consents to giving Meta the right to use the data for virtually whatever they want -- and the situation today -- where Meta and others can do the work to non-consensually scrape public data on sites that don't put up barriers.
We're not going to convince each other, and we've both got enough walls of text up that at this point neither of us is going to convince people reading the thread who aren't already convinced, so let's save ourselves the time and energy and leave it here.
I pretty clearly described the difference in “data” between the data we want and data we don’t want. Try re-reading that and if it still doesn’t make sense I’ll explain in more detail.
Buddy, that’s the internet:
This is how data is exchanged on the internet. It’s less of an agreement and more of a protocol. If you’re trying to claim here that an artist putting up a piece of work on a fediverse site means that facebook now has full rights to that work, I think you’re mistaken. Yes, as part of how the fediverse operates, if you are federated with my server, you are giving me permission to federate the federated content with my server’s users. This is happening right now across all federated instances.
It’s a shame, because we have the same goal in mind: not being tracked by facebook / meta. We aren’t on opposing sides of this issue. My point is that defederating from meta doesn’t stop meta from tracking you online. If you want to stop meta from tracking you online, you need to do the following:
Defederating isn’t on the list because defederating from meta doesn’t stop meta from tracking you.
Yes, you described what you see as the difference between data and "data" clearly. And I described what I see as the implications clearly. If anybody's still reading the thread, they can make their own conclusions.
Threads Supplemental Privacy Policy begs to differ that there's not an agreement here.
I never claimed it did. It eliminates one path of consensually sharing data (or "data", in your terms) with Meta.
In terms of your list, my perspective is that a server that federates with Threads is part of Meta's ecosystem -- #1 in your list. You don't seem to see it that way, and that's what we're not going to convince each other about.
Threads is a company and it needs a legal document to describe how the social media site operates. A website needs to copy and distribute content in order to show content. Federated websites need to copy and distribute content from other federated servers and their own server in order to show content. This is how lemmy.world and blahaj.zone can speak as if we are on the same server. Each fediverse instance should also include a similar privacy policy, as it describes how their content is distributed on the internet. Facebook’s privacy policy likely describes how your content may be seen on other facebook products and in facebook apps. These legal documents spell out how systems operate. You assume a similar risk with each site you operate on.
(data, ‘data’, data, “data”) are all the same term. Let’s use better terms:
Facebook is evil for a lot of reasons but their original sin was 3rd party tracking. Because site owners wanted better SEO and engagement with facebook users, facebook’s assets (images) were put everywhere, so facebook could send a cookie with a random id that identifies a user as they travel from site to site, and then link that random id with the facebook user when that user logs into facebook. This allowed facebook to track its users everywhere on the internet. However, this didn’t allow facebook to identify non-facebook users like it could with facebook users. All facebook would end up with when tracking non-facebook users is information about what a random user viewed on the web - this isn’t great (and can only be stopped if you just block facebook at the router or browser level) but at least that user stayed anonymous.
The motivation for this tracking was to push more targeted advertisements to facebook users, where facebook actually stands to earn a profit. There isn’t a lot of profit in just identifying users online if those users stay anonymous… except of course for internet advertising (programmatic advertisements). This is why it’s also important to interact with sites which do not have programmatic advertisements - these are most advertisements nowadays, especially if that ad feels targeted specifically to you based on a site you’ve been to. If you want to worry about tracking, look into how the programmatic ad industry works - facebook is a part of it, but most importantly it doesn’t involve federation, because that’s something totally separate and something which actually protects against tracking.
Think of federation as a site downloading an image and re-serving it to you. Tracking happens in the serving of an asset (image, video, etc), so if you are getting that content from the site you trust won’t track you, then... you won’t be tracked. A nasty site that wants to track you cannot get your information via the fediverse, since the federated site simply copies and privately redistributes the copy to its users, leaving the nasty site only knowing that “this instance with thousands of users received our content” - not very useful for tracking, advertising or for ad revenue, and it doesn’t provide any data that would be valuable for data sellers either.
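Here’s a rough sketch of that copy-and-re-serve model, assuming the home instance caches everything it receives and never sends its users to the origin for assets (real deployments may hotlink media instead, and likes, boosts and replies do federate back). The class names and instance names are invented for the example.

```python
class OriginInstance:
    """Toy model of the server where the content was originally posted."""

    def __init__(self, name):
        self.name = name
        self.posts = []
        self.access_log = []  # everything the origin can observe about readers

    def publish(self, post):
        self.posts.append(post)

    def fetch_new(self, requester):
        # The origin only learns which *instance* asked for the content.
        self.access_log.append(requester)
        return list(self.posts)


class HomeInstance:
    """Toy model of the instance you log into."""

    def __init__(self, name):
        self.name = name
        self.cache = []

    def sync_from(self, origin):
        # Server-to-server copy; no individual user is involved.
        self.cache = origin.fetch_new(requester=self.name)

    def serve(self, username):
        # Served from the local copy; the origin is never contacted.
        return list(self.cache)


origin = OriginInstance("nasty.example")
home = HomeInstance("home.example")

origin.publish("a meme")
home.sync_from(origin)

# Two different users read the post from their own instance.
seen_by_alice = home.serve("alice")
seen_by_bob = home.serve("bob")
```

After both reads, the origin’s log holds a single entry for `home.example` and nothing about alice or bob.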
Please stop putting words in my mouth. I am capable of speaking on my own terms. Also, by your perspective, a server that receives and processes email from meta is part of meta’s ecosystem. That statement would be correct if you replace “meta” with “the internet”.
Also, why are we discussing this with Meta when there’s a lot bigger threat out there (Alphabet/Google, search engines and scrapers)? And by “threat” I mean “how the web has always operated”. I feel like writing an application that showed users how easily they could be tracked on the fediverse even if that instance were not federated with any other servers.
Meta is a company whose business model depends on exploiting the data it gathers, and its privacy policies are carefully written to give it as much flexibility as possible. It's true that if you're on an instance that federates with Threads you're assuming that risk. If you compare their language to a policy that's written with a goal of privacy -- like eu.social's -- the differences are clear.
OK, then, speak for yourself: do you see instances that federate with Threads as being part of Meta's ecosystem?
That depends on what you mean by “Meta’s ecosystem”. If we consider Meta’s ecosystem to contain only the entities which directly help Meta earn money in exchange for tracking data, then the answer is no, for reasons I have explained.
If you consider “Meta’s ecosystem” to be “ActivityPub federated instances which do not block ActivityPub data from going to Meta” then the answer is yes, but that’s an arbitrary definition. How does meta profit from this? By showing more and different memes to its existing userbase? You could argue that by giving meta more content, the engaged users on meta’s properties will be more engaged and more likely to see an ad served by meta’s property to the user, leading to higher time on site. It’s a weak argument if your concern is being tracked by meta, as ActivityPub doesn’t share tracking information between servers.
I personally define “Meta’s ecosystem” to be meta’s properties and I suppose by extension any site which helps meta do its tracking: any site which shows facebook buttons served directly from facebook, therefore allowing tracking in older browsers. I consider ActivityPub to be a protocol not that far off from RSS or even Email, although RSS is a better comparison as it also happens over HTTP[S]. I define it this way because of my work as a web developer who has also built tracking systems similar to how Facebook’s tracking works, though not as sinister (I used 1st party cookies and pixels, similar to Google Analytics, prior to GA being free years ago). I’ve also worked for some websites that relied on ad views, and learned how programmatic works (tl;dr: each ad you see is an auction for your eyeballs based on the data collected by hundreds of other agencies you may not even know of). I’ve also worked for startups where a large part of generating a high valuation for the company involved simply having contact information of hundreds of users as well as some basic information about their preferences. A startup like that would then have data to sell to a broker for programmatic ads.
Ultimately, websites (and instances) require money to stay up. Money comes from volunteers, donations or ad revenue / subscriptions. Programmatic ads can be included inside any website or app, and most importantly, an instance owner could choose to provide tracking data to facebook and other data brokers just to show advertisements, all while choosing to defederate from Meta’s fediverse properties either knowingly or unknowingly. It’s kind of like adding high-security locks on your doors and then leaving your windows open. If the end goal is to provide an instance which respects privacy of its users, that instance needs to choose to show ethical advertising (not programmatic, databroker-based ads which require sending the user’s data with each ad visit). Federating or defederating from meta’s properties won’t send any data to meta’s properties that every instance owner doesn’t already get as a part of ActivityPub federation.
Thanks for the detailed explanation. I agree that it depends on whether "Meta's ecosystem" is defined as including "ActivityPub federated instances which do not block ActivityPub data from going to Meta”. I do, and I originally said that "you don’t seem to see it that way." You objected that I was putting words into your mouth ... but after your last post I'm pretty sure that I accurately described your position: your definition of "Meta's ecosystem" only includes sites that help Meta do their tracking, and you had previously said you don't consider federating data there as tracking.
Like I said, we're not going to convince each other. I understand your position and why you think that way, I just disagree. It's true that defederating from Threads while still federating with instances that use Meta's services doesn't help, it's true that federating with Threads just sends them the data that goes to other ActivityPub instances, it's true that Google's also a threat -- this is all part of why I frame things in terms of surveillance capitalism, not just whether or not to federate with Threads. We just come to different conclusions about the privacy impact of defederating from Threads. Restating our arguments another time won't change anything.
And in any case, that's not even the reason that most instances are defederating from Threads! Concern about harassment from hate groups there is a much bigger deal. So, as interesting as this conversation is, is it really a good use of our time?
I’d like to understand your definition of tracking, in that case. What is your biggest concern regarding tracking when federating? Am I correct in assuming that you also don’t want to be tracked by other data brokers? I have these conversations because I want to try to steer these conversations in a productive direction. I don’t need to convince you, but it would be interesting to understand the tracking concerns people have with federating. I’m also entertaining the idea of creating a website which shows people what data they share on the fediverse so they can understand where the real risks are (e.g., we probably should reject instances which choose to use targeted advertising, as it sends data not only to facebook but to data brokers, etc.)
There are good reasons for defederating from instances, such as harassment, hate, lax moderation policies, etc, as you mentioned. I’ve discussed that topic a lot too in other posts, mostly boiling down to “yeah it’s going to be really hard to say ‘yes’/‘no’ to what amounts to being one instance with millions of users”. Personally, I like the decentralized nature of the internet, the fediverse and the freedom that brings. I don’t really have any interest in being on an instance which federates with meta properties, but I also don’t really take a strong stance for or against it. I personally see more conversation about defederating from threads and less concern with a route that some instance owners may be forced to head in: targeted advertising. After all, the tactics meta uses are the same tactics any web developer can use.
The only positive I’ll say about federating with Threads is that some people have a lot of friends who are stuck on facebook, and this would be a way for people to stay connected. I think that’s probably why they’re moving in that direction, especially if they are seeing those users migrate away to ActivityPub. But someone else will need to make that impassioned argument - and I’m sure there’ll be a non-zero number of instances who choose to federate, and users will decide which ones they want to engage with at what time in their life. Choice!
I’m certainly not going to make the argument that people should federate with any instance. You don’t like the instance, you sever the connection. That ability was built in for a reason and should remain.
A website like that would be very helpful. A lot of people I talk to think that unlisted gives more protection than it actually does (they're used to how it behaves on YouTube where it's harder to discover), don't realize that it's still likely to get indexed by Google et al even if they haven't opted in to search engines (because their post may well appear in a thread by somebody who has opted in), don't understand the limited protection of blocking if authorized fetch isn't enabled, don't realize that RSS leaves everything open etc.
Yes, I think in terms of protecting data generally, not just from Meta but also data brokers, Google, and other data harvesters -- as well as stalkers. Meta's a concrete and timely example so it's a chance to focus attention and improve privacy protections, both for instances that don't federate and for instances that do. I agree that most (although not all) of the information Meta can get from federating they already can by scraping and they certainly could scrape (and quite possibly are already scraping) most if not all profiles and public and unlisted posts on most instances, and so could everybody else ... it's a great opportunity to make progress on this. https://privacy.thenexus.today/fediverse-threat-modeling-privacy-and-meta/ has more about how I look at it.
Specifically in terms of data that flows to Threads through federating that isn't otherwise easily scrapable today, three specific examples I know of are
That said, this isn't based on a full analysis so there may well be other paths. As far as I know the draft privacy threat model I did last summer is the deepest dive. And the software is buggy enough in general that it wouldn't surprise me if there are paths that shouldn't exist.
In terms of concerns about tracking others have about federating ... like I say for most people this isn't the top concern. To the extent it is about data going to Threads, for a lot of people it's about consent and/or risk management, full stop. They do not want to give Meta or accounts on Threads easy access to data from their fediverse account, even if Meta can get it without consent now (and even if they have some other Meta accounts). There's also a lot of "well Eugen said it's all fine", and especially from techies a lot of "well they can scrape it all anyhow, whatever" and "everything is public anyhow on social networks".
Thanks for this. I’ve checked out your site and you’ve given me a lot to think about here. I also just found this site today which might be helpful for folks like us. not lemmy related, but data broker related. https://databrokerswatch.org/
But where will meta put its ads, and how can it filter what you see if you can't (as a user of an instance blocking meta/threads/...) subscribe to meta instances?
I mean it's just a hot mess, so I'm all for blocking those predatory psychopaths even if it theoretically isn't needed in some cases.
Ask your same question about any other instance.
For example, what’s keeping me from setting up my own instance that just spams ads across every community? Answer: nothing.
What could owners of instances do to prevent seeing advertisements? Answer: defederate.
How much are users of instances that don’t defederate from my spam instance tracked? Answer: the same as any other site.
Tracking online works by communication between a client and a server, ideally with identifying information. Everyone on the web has some identifying information, such as IP address and user agent. The problem is, IP addresses are shared with many people, the same as user agents. The only way to uniquely identify a user online is with cookies. Facebook used 3rd party cookies to track users who visited site A when site A has an image served from facebook servers. Facebook gives the visitor a 3rd party cookie “You are now 93ga3490f” and logs that the cookie was served to visitor 93ga3490f on site A. That visitor then goes to site B and sees a facebook like button image, but this time when the user asks for the like button from facebook, the browser says “Hello, I am 93ga3490f requesting the facebook like button” and facebook records that 93ga3490f also visited Site B. Honestly this is still pretty useless until that user logs into their facebook account. At this point, the browser tells facebook, “Hello, I am 93ga3490f logging into facebook with email xyz@example.com” and facebook records xyz@example.com as visitor of Sites A and B (formerly user 93ga3490f.)
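The cookie flow above can be sketched as a toy simulation. This is only an illustration of the idea, not how facebook’s systems are actually built: the `Tracker` class, its method names and the site names are all hypothetical, and modern browsers restrict 3rd party cookies in ways this ignores.

```python
import uuid


class Tracker:
    """Toy 3rd party tracker whose button/image is embedded on many sites."""

    def __init__(self):
        self.visits = {}      # cookie id -> list of sites where the asset loaded
        self.identities = {}  # cookie id -> email, known only after a login

    def serve_asset(self, site, cookie=None):
        """A browser requests the embedded asset while visiting `site`."""
        if cookie is None:
            # First contact: hand out a random id ("You are now 93ga3490f").
            cookie = uuid.uuid4().hex[:9]
        self.visits.setdefault(cookie, []).append(site)
        return cookie

    def login(self, cookie, email):
        """The same browser logs in, tying the random id to an account."""
        self.identities[cookie] = email

    def history_for(self, email):
        """Sites the tracker can attribute to a known account."""
        sites = []
        for cookie, known_email in self.identities.items():
            if known_email == email:
                sites.extend(self.visits.get(cookie, []))
        return sites


tracker = Tracker()
cookie = tracker.serve_asset("site-a.example")          # visit site A
cookie = tracker.serve_asset("site-b.example", cookie)  # visit site B

# Before login the visits exist but belong to an anonymous id.
anonymous_history = tracker.history_for("xyz@example.com")

tracker.login(cookie, "xyz@example.com")
linked_history = tracker.history_for("xyz@example.com")
```

Before the login the tracker has nothing under that email; afterwards both site visits are attached to the account.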
How does tracking work with federation? Answer: it doesn’t. Federation you can think of as a server subscribing to another server. You’re offline, and instanceXYZ downloads a copy of new content uploaded to instanceABC, since instanceABC and instanceXYZ are federated with one another. You go online, log onto your instance (instanceXYZ) and instanceXYZ serves you the downloaded content that it previously got from instanceABC. instanceABC doesn’t know you’ve viewed this, doesn’t know you’ve downloaded this. All instanceABC knows is that instanceXYZ copied this content for all of its users. The only way you can be tracked here is:
It's all about scale and resources. They have thousands of times more than the whole lemmyverse.
This isn’t new
There are pay to join instances. Not many, but they exist.
There's also at least one ad-supported Misskey instance
And of those pay to join instances, do we know they are tracking their users without their consent? That would be the only real issue here (tracking without consent)