Scrapers are crawling these endpoints, and pkgs01 takes a while to call
git on the backend and return data to them. This increases latency
because pkgs01 is busy processing all those blame and history requests
and can't process more important things.
So, let's just block these for now. Any users who need them can easily
git clone locally and run history/blame just fine.
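For illustration only, a rough Python sketch of the kind of path check
this amounts to. The URL layout (a /blame/ or /history/ path segment)
and the names BLOCKED_VIEW and status_for_path are assumptions made for
this sketch, not the actual rule added by this change.

```python
import re

# Assumption: blame and history views show up as a path segment, e.g.
# /<repo>/blame/<file> or /<repo>/history/<file>. The real layout may differ.
BLOCKED_VIEW = re.compile(r"/(blame|history)(/|$)")


def status_for_path(path: str) -> int:
    """Return 403 for blame/history views and 200 for everything else."""
    return 403 if BLOCKED_VIEW.search(path) else 200
```

Anything that does not contain one of those segments is left alone, so
normal tree and commit views would still be served.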
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
I am not sure these are even AI scrapers. If they are, they are broken
and unfit for scraping. They just hit these forks (and nothing else)
over and over from a distributed pile of IPs. They pass Anubis
challenges, so the traffic probably comes from residential users who
the scraper operators don't care about.
Anyhow, if pkgs01 is under high load, see if more blocks need to be
added here.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
There are about 7 million hits a day from clients passing a referrer
of forks/kernel or forks/firefox, fetching static content over and
over and over. This may be because, before they were blocked from the
forks themselves, they were also downloading the js and static
content, and now they are just too dumb to see the 403 and still
want to fetch the old static content. Fortunately, they send a
referrer we can match on.
So, this should cut load by another chunk.
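A minimal Python sketch of the referrer match described above, for
illustration only. The names BLOCKED_REFERER and should_block and the
example.org URLs are hypothetical; the actual block lives in the web
server config and may match differently.

```python
import re

# Assumption: the offending requests carry "forks/kernel" or
# "forks/firefox" somewhere in the Referer header, as described above.
BLOCKED_REFERER = re.compile(r"forks/(kernel|firefox)")


def should_block(referer):
    """Return True when the Referer header points at one of the blocked forks."""
    if not referer:
        # Requests without a Referer header are not affected by this rule.
        return False
    return BLOCKED_REFERER.search(referer) is not None
```

For example, should_block("https://example.org/forks/kernel/x") is True,
while a request with no referrer or an unrelated one passes through.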
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
Some scraper(s) were very aggressively crawling kernel fork repos
and causing all kinds of problems for koji and src.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
Since we no longer have any machines in phx2, I have tried to remove
them from ansible. Note that there are still some places where they
need to be removed: nagios, dhcp, and named were not touched, and in
cases where it wasn't clear what a conditional was doing, I left it to
be cleaned up later.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>