If you write about the messy reality behind "free" internet services: we're seeing #OpenStreetMap hammered by scrapers hiding behind residential proxy/embedded-SDK networks.
-
If you write about the messy reality behind "free" internet services: we're seeing #OpenStreetMap hammered by scrapers hiding behind residential proxy/embedded-SDK networks. We're a volunteer-run service and the costs are real. We'd love to talk to a journalist about what we're seeing + how we're responding. #AI #Bots #Abuse
@osm_tech Why not write the article yourself as a blog post? Would much rather hear the full version of your side of the story than a journo's interpretation of it.
-
I feel for y'all. These residential proxies and the SDK networks are the bane of my existence, and I’m paid to deal with them.
-
@osm_tech @jwildeboer recently wrote about these sdk-based services. His approach might be of use here - or at the very least, it might make useful reading: https://jan.wildeboer.net/2025/02/Blocking-Stealthy-Botnets/ and https://jan.wildeboer.net/2025/04/Web-is-Broken-Botnet-Part-2/
-
@osm_tech Have you heard of Anubis by Xe Iaso? 🥺 Good luck, we need you!!
-
@osm_tech I wonder if there's a way to fail2ban requests coming in faster than typically found in human requests.
Cycling to new IPs is trivial. I ban a few thousand IPs and CIDR ranges in my WAF, and I’ll see 75% of them show up the next time the scraper hits. After that most don’t show up again, and the next scrape comes from a mostly new set of IPs.
I’ve seen a few instances where they will cycle IPs during the same scraping event if some of them are blocked.
I’ve got scrapers that will send every request from a unique IP.
There is a lot of money to be made right now offering hard-to-block scraping services or tools to enable them.
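To make the point above concrete, here is a minimal sketch of a fail2ban-style per-IP rate check. It is not fail2ban itself; the class name, the threshold of 20 requests per 10 seconds, and the documentation-range addresses are all assumptions picked for illustration. It shows why the check catches a single noisy address but never fires when every request arrives from a different IP.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Reject a client once it exceeds max_requests within window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip: str, now: float) -> bool:
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


def count_blocked(limiter: SlidingWindowLimiter, requests) -> int:
    """Count how many (ip, timestamp) requests the limiter rejects."""
    return sum(0 if limiter.allow(ip, t) else 1 for ip, t in requests)


# 100 requests in about 10 seconds from one address (illustrative data).
single_ip = [("203.0.113.7", i * 0.1) for i in range(100)]
# The same volume, but every request comes from a different address.
rotating = [(f"198.51.100.{i}", i * 0.1) for i in range(100)]

blocked_single = count_blocked(SlidingWindowLimiter(20, 10.0), single_ip)
blocked_rotating = count_blocked(SlidingWindowLimiter(20, 10.0), rotating)
print(blocked_single, blocked_rotating)
```

With these made-up numbers the single address is cut off after its first 20 requests, while the rotating traffic passes untouched, which matches the "every request from a unique IP" behaviour described above.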
-
@AliveDevil @utf_7 @osm_tech basically botnet/malware
-
@AliveDevil Yes but they could still be banned when caught. A few devs getting banned would be a big deterrent for others to ship this malware.
The right *technical* defense, however, is not to allow apps arbitrary network access unless they're declared in the manifest as a "browser" or other "client software" that the user can use with any service they want (like IRC clients, mail clients, Mastodon clients, etc.).
Instead, the manifest should declare a single domain the app can contact, or multiple if the developer is willing to pay for more intensive vetting of them, and only allow network access to the declared domain(s).
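A rough sketch of what that manifest check could look like, purely as an illustration: the manifest format and field names (`client_software`, `allowed_domains`) are hypothetical and do not correspond to any real app store or OS. Whether subdomains of a declared domain should be allowed is also an assumption made here.

```python
from urllib.parse import urlsplit

# Hypothetical manifest; the field names are illustrative only.
MANIFEST = {
    "app_id": "org.example.weather",
    "client_software": False,  # True would mean "browser-like", any host allowed
    "allowed_domains": ["api.example.org"],
}


def may_connect(manifest: dict, url: str) -> bool:
    """Return True if the app is allowed to open a connection to url."""
    if manifest.get("client_software", False):
        # Declared as general client software: the user picks the service.
        return True
    host = (urlsplit(url).hostname or "").lower()
    # Assumption: a declared domain also covers its subdomains.
    return any(
        host == domain or host.endswith("." + domain)
        for domain in manifest.get("allowed_domains", [])
    )


print(may_connect(MANIFEST, "https://api.example.org/v1/forecast"))
print(may_connect(MANIFEST, "https://tile.openstreetmap.org/0/0/0.png"))
```

Under a rule like this, an embedded proxy SDK inside the weather app could not relay traffic to arbitrary third-party sites, because only the declared domain would be reachable.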
@dalias @AliveDevil dafuq? if so, "software development kit" sounds wrong in that context. this is plain malware.
imagine using an app and someone downloads child porn or a classic torrent over your connection. how will you prove you're innocent?
-
@osm_tech I wonder if there's a way to fail2ban requests coming in faster than typically found in human requests.
@BalooUriza The problem is, who do you ban? The requests keep changing IPs and user agents.
-
@chillicampari @osm_tech So if it is too late to tag @mfeilner now, I am tagging @evawolfangel
Journa et al.
- #OSM is as important as #wikipedia and wikibase
- Please do not report on critical infrastructure problems only when an OSS project has a birthday …
-
@osm_tech @camwilson fyi
-
@osm_tech @BalooUriza For IPv4, a bitmask of the entire address space is a viable "efficient" implementation of blocking. I wonder if there are tools that can do it that way rather than needing a gigantic list.
@dalias @osm_tech @BalooUriza we have a very efficient implementation in #vinylcache (formerly #varnishcache )
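For anyone curious what the bitmask idea looks like in practice, here is a minimal sketch. It is not the vinylcache implementation mentioned above (that isn't described in this thread); the class name and the choice of /24 granularity are assumptions. One bit per IPv4 address is 2^32 bits = 512 MiB; one bit per /24 is 2^24 bits = 2 MiB, at the cost of blocking whole /24s.

```python
import ipaddress


class IPv4Bitmap:
    """Blocklist with one bit per IPv4 prefix of length prefix_len."""

    def __init__(self, prefix_len: int = 24) -> None:
        if not 0 <= prefix_len <= 32:
            raise ValueError("prefix_len must be between 0 and 32")
        self.shift = 32 - prefix_len
        # One bit per prefix, rounded up to at least one byte.
        self.bits = bytearray(max(1, (1 << prefix_len) // 8))

    def block_network(self, cidr: str) -> None:
        """Set the bit of every prefix that overlaps the given CIDR range."""
        net = ipaddress.IPv4Network(cidr, strict=False)
        first = int(net.network_address) >> self.shift
        last = int(net.broadcast_address) >> self.shift
        for i in range(first, last + 1):
            self.bits[i >> 3] |= 1 << (i & 7)

    def is_blocked(self, address: str) -> bool:
        """Constant-time lookup: one shift, one byte read, one mask."""
        i = int(ipaddress.IPv4Address(address)) >> self.shift
        return bool(self.bits[i >> 3] & (1 << (i & 7)))


blocklist = IPv4Bitmap(prefix_len=24)
blocklist.block_network("203.0.113.0/24")
blocklist.block_network("198.51.100.0/23")

print(len(blocklist.bits))                  # bytes used by the bitmap
print(blocklist.is_blocked("203.0.113.42"))
print(blocklist.is_blocked("192.0.2.1"))
```

The memory cost is fixed no matter how many ranges are banned, and a lookup never scans a list. Note that at /24 granularity, banning a single address bans its whole /24; `prefix_len=32` avoids that but needs the full 512 MiB.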
-
@blub @osm_tech @heiseonline Yeah I already replied.
-
The real solution here is for app stores to give users proper per-app security settings. If an app doesn't have a good reason to be sending email, it shouldn't be trying.
-
@osm_tech Maybe @adfichter for @republik_magazin?
@Linux after vacation ;) @osm_tech @republik_magazin
-
@jorgesanz @osm_tech @civio hmm, it doesn’t fit in Civio’s scope, I’m afraid. But it’s definitely an issue I’m aware of. It’s worse now with all the AI scrapers, and I wonder if we should block them all; they flood my apps too.
Maybe the 404 Media guys would be interested in this? https://www.404media.co/ai-scraping-bots-are-breaking-open-libraries-archives-and-museums/