this post was submitted on 24 Jul 2024

Technology

DuckDuckGo, Bing, Mojeek, and other search engines are not returning full Reddit results any more.

[–] reddig33@lemmy.world 0 points 4 months ago (1 children)

I don’t understand what stops a search engine from scraping a publicly accessible website.

[–] Eril@feddit.org 0 points 4 months ago (2 children)

robots.txt, I guess? Yes, you can just ignore it, but you shouldn't if you're developing a responsible web scraper.

[–] reddig33@lemmy.world 0 points 4 months ago (2 children)

Doesn’t seem legal that a robots.txt could pick and choose who scrapes. Seems like legally it would have to be all or nothing. Here’s hoping one of the search engines ignores it and makes it a legal case.

[–] capital@lemmy.world 0 points 4 months ago (2 children)

You'd probably feel differently if it were your service. Should you be able to control who scrapes your sites or should that be all or nothing?

For the record, I fucking hate what the internet is becoming. I naively believed that even if shit got cordoned off into the walled gardens that are mobile phone apps, the web would remain as open as it was. This is a terrible sign of things to come.

[–] Eril@feddit.org 0 points 4 months ago (1 children)

Actually, Reddit's robots.txt currently contains this:

User-agent: *
Disallow: /

Well, that actually is a blanket ban for everyone, so something else must be at play here.
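
If anyone wants to check this kind of thing themselves, here is a rough sketch using Python's standard `urllib.robotparser`. The first file is the two-line content quoted above. The second file, the bot name `ExampleBot`, and the helper `can_fetch` are made up for illustration, to show how a robots.txt could in principle allow one crawler and block the rest. This is not a claim about what Reddit actually serves to anyone.

```python
from urllib.robotparser import RobotFileParser


def can_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The content quoted in the comment above: every crawler is disallowed.
BLANKET_BAN = """\
User-agent: *
Disallow: /
"""

# Hypothetical file (an assumption, not Reddit's): one named bot is allowed,
# everyone else falls through to the catch-all rule.
SELECTIVE = """\
User-agent: ExampleBot
Allow: /

User-agent: *
Disallow: /
"""

URL = "https://www.reddit.com/r/technology/"

if __name__ == "__main__":
    print(can_fetch(BLANKET_BAN, "AnyBot", URL))
    print(can_fetch(SELECTIVE, "ExampleBot", URL))
    print(can_fetch(SELECTIVE, "OtherBot", URL))
```

Keep in mind that robots.txt is only a request. Nothing in the file itself enforces it, so any actual blocking has to happen somewhere else.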

[–] starman@programming.dev 0 points 4 months ago (1 children)
[–] russjr08@bitforged.space 0 points 4 months ago

We believe in the open internet, but we do not believe in the misuse of public content.

That's real rich, coming from Reddit.

[–] hotpot8toe@lemmy.world 0 points 4 months ago

Also, rate limiting. A publicly accessible website doesn't necessarily allow scrapers to read millions of pages each week. The site can easily identify and block scrapers from the pattern of their activity. I don't know if Reddit has rate limiting, but I wouldn't be surprised if they implement it.
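
To make the idea concrete, here is a minimal sketch of one common approach, a per-client sliding-window limiter. The class name, the limits, and the client IDs are all assumptions for illustration. I have no idea what Reddit actually runs.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Timestamps of recently accepted requests, keyed by client ID.
        self._hits: dict[str, deque[float]] = defaultdict(deque)

    def allow(self, client_id: str, now: float) -> bool:
        hits = self._hits[client_id]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # Over the limit: reject, do not record.
        hits.append(now)
        return True


if __name__ == "__main__":
    limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60.0)
    for t in (0.0, 1.0, 2.0, 3.0, 60.0):
        print(t, limiter.allow("scraper-1", now=t))
```

Passing `now` in explicitly keeps the sketch easy to test. A real server would more likely read a monotonic clock and key on IP address or API token.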

[–] dabster291@lemmy.zip 0 points 4 months ago (1 children)
[–] BurnSquirrel@lemmy.world 0 points 4 months ago

Still couldn't get me to use it. I use DDG, which can switch between search engines and search sites very quickly with its ! syntax (everyone goes on about privacy, but this is pretty much its best feature). Google results are consistently the worst for me if I'm hitting multiple search engines.

[–] Fedizen@lemmy.world 0 points 4 months ago

Reddit really fucked themselves. Not as much as Elon fucked twitter but super close.

Also pretty sure DDG uses Bing

[–] sag@lemm.ee 0 points 4 months ago

Fuck You Reddit

[–] JackbyDev@programming.dev 0 points 4 months ago (12 children)

If you use Bing, DuckDuckGo, Mojeek, Qwant or any other alternative search engine that doesn’t rely on Google’s indexing and search Reddit by using “site:reddit.com,” you will not see any results from the last week.

That's absolutely insane... Reddit truly is making things awful. The "just add reddit" or "just add site:reddit.com" trick has been trash for a while because they bombard you with "pwease use the app" prompts and don't show more than like three comments at a time. It's useless.

[–] Imgonnatrythis@sh.itjust.works 0 points 4 months ago (2 children)

Meh, fuck em. The tighter they make their circle the less useful it is.

Reminder that Kagi searches Lemmy which is great.

[–] UnderpantsWeevil@lemmy.world 0 points 4 months ago (6 children)

Kagi

Ah, yes. The "Fuck you, Pay me" search engine.

[–] mrvictory1@lemmy.world 0 points 4 months ago (2 children)
[–] daniskarma@lemmy.dbzer0.com 0 points 4 months ago (18 children)

To be fair, Reddit has not been that good of a source for answers in recent years.

Quality drop in comments is insane. Sometimes it looks like Quora.

[–] Kecessa@sh.itjust.works 0 points 4 months ago (2 children)

I was looking for Bluetooth speaker recommendations and it's the first time I really noticed "generic bot replies" like "I've got this great product to recommend, not only is it good but it offers great sound quality as well! The product is [link to Amazon page]"

Gotta start searching using "before:" to get quality results...

[–] Freefall@lemmy.world 0 points 4 months ago

Oh cool! My searching won't be spammed with Reddit now!

[–] dullbananas@lemmy.ca 0 points 4 months ago
[–] vxx@lemmy.world 0 points 4 months ago

Another nail.

[–] cmrn@lemmy.world 0 points 4 months ago (3 children)

Every time I click a Reddit link now it’s just “download the app to verify your age” regardless of what it is

[–] Wolf314159@startrek.website 0 points 4 months ago (5 children)

I feel your pain.

I edit the URL, replacing the first part with "http://old.reddit.com". That still seems to work, last I checked, but I fully expect it to be killed any day now.
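
For anyone who would rather script that edit than do it by hand, here is a small sketch with Python's `urllib.parse`. The function name `to_old_reddit` is my own, and it only swaps the host, leaving the scheme, path, and query as they were. It assumes old.reddit.com keeps serving the same paths, which may stop being true.

```python
from urllib.parse import urlsplit, urlunsplit


def to_old_reddit(url: str) -> str:
    """Point any reddit.com URL at old.reddit.com; leave other URLs alone."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host == "reddit.com" or host.endswith(".reddit.com"):
        # Replace only the host; scheme, path, query and fragment are kept.
        parts = parts._replace(netloc="old.reddit.com")
    return urlunsplit(parts)


if __name__ == "__main__":
    print(to_old_reddit("https://www.reddit.com/r/technology/comments/abc/"))
```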
