If you write about the messy reality behind "free" internet services: we're seeing #OpenStreetMap hammered by scrapers hiding behind residential proxy/embedded-SDK networks.
-
@chillicampari @osm_tech So if it is too late to tag @mfeilner now, I am tagging @evawolfangel
Journa et al.
- #OSM is as important as #wikipedia and wikibase
- Please do not report on critical infrastructure problems only when an OSS project has a birthday …
-
If you write about the messy reality behind "free" internet services: we're seeing #OpenStreetMap hammered by scrapers hiding behind residential proxy/embedded-SDK networks. We're a volunteer-run service and the costs are real. We'd love to talk to a journalist about what we're seeing + how we're responding. #AI #Bots #Abuse
@osm_tech @camwilson fyi
-
@osm_tech @BalooUriza For IPv4, a bitmask of the entire address space is a viable "efficient" implementation of blocking. I wonder if there are tools that can do it that way rather than needing a gigantic list.
@dalias @osm_tech @BalooUriza we have a very efficient implementation in #vinylcache (formerly #varnishcache )
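To make the bitmask idea concrete, here is a minimal Python sketch. It is not the #vinylcache implementation mentioned above, and the class name `IPv4Bitmap` and the per-/16 paging are assumptions made for illustration. A flat bitmap of all of IPv4 is 2^32 bits = 512 MiB; this version allocates one 8 KiB page per /16 only when an address in that /16 is first blocked, so lookups stay O(1) without reserving the full 512 MiB up front.

```python
import ipaddress

PAGE_BITS = 1 << 16          # one page covers a /16 (65,536 addresses)
PAGE_BYTES = PAGE_BITS // 8  # 8 KiB of bits per page


class IPv4Bitmap:
    """Hypothetical blocklist: one bit per IPv4 address, paged per /16."""

    def __init__(self):
        # page index (top 16 bits of the address) -> bytearray of bits
        self._pages = {}

    @staticmethod
    def _split(addr):
        n = int(ipaddress.IPv4Address(addr))
        offset = n & 0xFFFF  # position of the address inside its /16
        return n >> 16, offset >> 3, 1 << (offset & 7)

    def block(self, addr):
        page, byte, mask = self._split(addr)
        buf = self._pages.get(page)
        if buf is None:
            # Allocate the page lazily, zero-filled (nothing blocked yet).
            buf = self._pages[page] = bytearray(PAGE_BYTES)
        buf[byte] |= mask

    def unblock(self, addr):
        page, byte, mask = self._split(addr)
        buf = self._pages.get(page)
        if buf is not None:
            buf[byte] &= ~mask & 0xFF

    def is_blocked(self, addr):
        page, byte, mask = self._split(addr)
        buf = self._pages.get(page)
        return buf is not None and bool(buf[byte] & mask)
```

Usage would look like `bm = IPv4Bitmap(); bm.block("192.0.2.1"); bm.is_blocked("192.0.2.1")`. If every /16 ends up with at least one blocked address, the memory use converges on the same 512 MiB as the flat bitmap, plus dictionary overhead.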
-
@blub @osm_tech @heiseonline Yeah I already replied.
-
The real solution here is for app stores to give users proper per-app security settings. If an app doesn't have a good reason to be sending email, it shouldn't be trying.
-
@osm_tech
Maybe @adfichter for @republik_magazin?
@Linux after vacation ;) @osm_tech @republik_magazin
-
@jorgesanz @osm_tech @civio hmm, it doesn’t fit in Civio’s scope I’m afraid. But it’s definitely an issue I’m aware of, it’s worse now with all the AI scrapers and I wonder if we should block them all, they flood my apps too
Maybe the 404 Media guys would be interested in this? https://www.404media.co/ai-scraping-bots-are-breaking-open-libraries-archives-and-museums/
-
@osm_tech
Maybe @La_Directa @donestech
@tunubesecamirio
@albalafarga
@mediapart
@mainichi
@heisec
Not sure if they are already aware

I remember @FediTips shared a list of News Media here in the fediverse, I'll try to find it.... Here it is https://fedi.directory/tag/investigative-journalism/
-
@osm_tech hey, look into spur.us, they can help with the residential proxy issue.
-
@osm_tech in my experience, it helps if you have local representatives so journalists can speak with, and write about, a person in their own region.
I could nudge Dutch (/Belgian) press!
-
@osm_tech I tried to find you on BSky - I'd try over there...
https://bsky.app/profile/joetidy.bsky.social
https://bsky.app/profile/zsk.bsky.social
https://bsky.app/profile/404media.co
-
@ryanvade Was also my idea - and @heiseonline @heisec
-
Maybe this is a topic for @lagedernation?
-
@osm_tech Interesting. Perhaps we could follow up via e-mail or DM? alexander.fanta@ftm.nl
-
@osm_tech Wow. TIL that software development kits are more or less silently embedding internet scrapers in (unrelated) end-user applications to distribute AI data scraping across residential addresses and therefore be harder to defend against.
Hey, Tim, were you expecting stuff like this 35 years down the line?
-
@BalooUriza We use fail2ban to handle some of this with custom rules, but eventually fail2ban becomes a bottleneck after 100,000 IP addresses.
@osm_tech @BalooUriza Is it using ipset hash sets, or the default rule-per-IP rules? Raw namespace or? I don't know the details of the implementation, but if the problem is L7 load (rather than a pure bandwidth DDoS), it might be worth considering whitelisting instead. I.e. whitelist addresses (or /24s) that have *not* made excessive requests lately and put them in a priority network bucket; everything else that is not blacklisted goes into a best-effort bucket (and can maybe migrate to the whitelist later).
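A rough Python sketch of that bucket idea, in case it helps the discussion. Everything here is an assumption for illustration: the class name `PrefixClassifier`, the sliding-window counter, and the `limit` and `window` values are made up, and none of it is how OSM or fail2ban actually work. It counts recent requests per /24 and returns which bucket an address would fall into.

```python
import ipaddress
import time
from collections import defaultdict, deque


class PrefixClassifier:
    """Hypothetical per-/24 classifier: priority, best-effort, or blocked."""

    def __init__(self, limit=100, window=60.0, clock=time.monotonic):
        self.limit = limit      # max requests per window to stay in "priority"
        self.window = window    # sliding window length in seconds
        self.clock = clock      # injectable so the logic can be tested
        self.hits = defaultdict(deque)  # /24 prefix -> request timestamps
        self.blocked = set()            # explicitly blacklisted /24 prefixes

    @staticmethod
    def prefix(addr):
        # Drop the last octet so all addresses in a /24 share one key.
        return int(ipaddress.IPv4Address(addr)) >> 8

    def _expire(self, q, now):
        # Forget requests that have fallen out of the sliding window.
        while q and now - q[0] > self.window:
            q.popleft()

    def bucket(self, addr):
        p = self.prefix(addr)
        if p in self.blocked:
            return "blocked"
        q = self.hits.get(p)
        if q is None:
            return "priority"
        self._expire(q, self.clock())
        return "priority" if len(q) <= self.limit else "best-effort"

    def record(self, addr):
        # Register one request, then report the bucket it lands in.
        self.hits[self.prefix(addr)].append(self.clock())
        return self.bucket(addr)
```

A /24 that goes quiet drifts back into the priority bucket once its old requests expire, which matches the "maybe migrate to whitelist later" part. In a real deployment the bucket decision would feed a traffic shaper or rate limiter rather than just return a string.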
-