This avoids the TCP timeout problem entirely, as far as I can tell.
Just switch it for now as we continue to work on the underlying problem.
This does mean that we don't use varnish, but apache is able to
keep up ok so far.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
openqa uses apache load balancer now, and doesn't use haproxy at all.
Clean up some things that current haproxy warns about on start.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
After some troubleshooting I was finally able to fix the OpenID
authentication on staging. These are the changes I ended up deploying to fix
the remaining issues.
Signed-off-by: Michal Konecny <mkonecny@redhat.com>
This seems to be similar to the kojipkgs case, where we occasionally
see timeouts from the proxies to pkgs01.
If it's a health check, haproxy will mark the backend down.
If it's a user request, they will get a timeout and a 503 back.
This will help mitigate the second problem by retrying those requests.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
We are having problems with connections sometimes hanging from proxies
to kojipkgs. Let's try and mitigate that at the haproxy level and
hopefully improve things while we try and figure out what the underlying
cause is.
This should retry connections that failed for any 'retryable' output
(including timeout) and also it should try a _different_ backend than
the one that returned the error. This will not eliminate errors, but
should reduce them.
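
For reference, a minimal sketch of what this kind of retry setup can
look like in haproxy. The backend name, server names and addresses
below are hypothetical placeholders, not the values from this change,
and the retry count is an assumption.

```haproxy
backend kojipkgs-backend            # hypothetical backend name
    mode http
    # Retry a failed request up to 3 times (assumed count).
    retries 3
    # Retry on any error haproxy considers retryable, which includes
    # connection failures and response timeouts.
    retry-on all-retryable-errors
    # Pick a different server on every retry instead of only the last one.
    option redispatch 1
    server kojipkgs01 192.0.2.11:80 check   # placeholder address
    server kojipkgs02 192.0.2.12:80 check   # placeholder address
```

Retrying also replays the request, so this is safest for idempotent
traffic such as package downloads.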
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
To be redirected to the openid server during authentication, let's set
a cookie for it and match against that.
This was tested and it's working, but ipsilon is doing something with
the requests and the cookie is gone after redirect.
Signed-off-by: Michal Konecny <mkonecny@redhat.com>
When checking whether the server has openid capabilities we check for
openid_identifier. Let's redirect that to the openid backend as well.
Signed-off-by: Michal Konecny <mkonecny@redhat.com>
The paths are too similar and /openidc ended up being routed to the
wrong ipsilon server. Let's add a specific rule for OIDC as well.
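
A rough sketch of the idea, assuming prefix matching on the path. The
frontend, ACL and backend names here are made up for illustration and
are not the ones in the real config.

```haproxy
frontend ipsilon-frontend                   # hypothetical frontend name
    mode http
    # "path_beg /openid" also matches "/openidc", so the more specific
    # OIDC rule has to come first: use_backend rules are evaluated in
    # order and the first match wins.
    acl is_openidc path_beg /openidc
    acl is_openid  path_beg /openid
    use_backend ipsilon-openidc-backend if is_openidc   # hypothetical
    use_backend ipsilon-openid-backend  if is_openid    # hypothetical
    default_backend ipsilon-backend                     # hypothetical
```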
Signed-off-by: Michal Konecny <mkonecny@redhat.com>
Prior to 38d138e this condition existed with 'iad2' instead of
'rdu3'. @abompard took it out entirely, but that was wrong: it
makes the external proxies include this block. We need to put the
condition back with the correct data center name.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Added host vars for all the control plane VMs and the bootstrap node.
Set latest version for downloading and setting things up.
Set up haproxy in rdu3 prod to load balance the ocp api and internal api.
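
As an illustration only, a TCP passthrough setup for the API can look
roughly like the sketch below. Port 6443 is the usual Kubernetes API
port; the section names, server names and addresses are placeholders
and not taken from this change.

```haproxy
frontend ocp-api                    # hypothetical frontend name
    bind *:6443
    # Pass TLS straight through to the control plane, no HTTP parsing.
    mode tcp
    default_backend ocp-api-backend

backend ocp-api-backend             # hypothetical backend name
    mode tcp
    balance roundrobin
    # Bootstrap node plus control plane nodes (placeholder addresses).
    server bootstrap 192.0.2.10:6443 check
    server ocp01     192.0.2.11:6443 check
    server ocp02     192.0.2.12:6443 check
    server ocp03     192.0.2.13:6443 check
```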
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
Time to retire ODCS. ELN is moved off and that was the last thing using
it. Thanks for all the service ODCS!
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
This commit retires pdc from ansible.
The website should get redirected to a wiki page about the retirement.
If for some reason we need to bring things back, the VMs will still
have their disks and xml saved off so we can bring them back.
We would need to revert this, run proxy playbooks and do a little
cleanup on the redirect, then bring the VMs back up.
Hopefully we don't have to.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
With the EOL of Fedora 38 yesterday, we are no longer building any
modules and can retire our module build service.
Note that toddlers needs to be adjusted still, that will happen after
this.
Thanks for all the modules!
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
This removes osbs and almost all its associated playbooks and files.
It served long and well, but we no longer need it.
Flatpaks are building with a koji-flatpak plugin.
Base/minimal/toolbox containers are building with kiwi.
We aren't building any other containers right now, and if we did they
could be added to kiwi.
This is the end of an era... I look with nostalgia on
ansible-ansible-openshift-ansible (a role to set up ansible on a control
host and run it from our ansible).
Good bye osbs!
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
We are hitting a sporadic and annoying 502 error with ostree pulls.
See https://pagure.io/releng/issue/11439
The problem seems to be between haproxy and varnish on kojipkgs01.
We set the httpclose option in haproxy globally, which closes
connections as soon as it thinks they are done.
Setting the 'http-keep-alive' option instead will keep connections
alive and handle the case of lots of fast connections downloading small
objects much better.
Sadly, we don't have a way to test this in staging, so we would need to
test in prod and roll back if there are problems.
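
A minimal sketch of the switch, assuming the option lives in a
defaults section (the real placement in our config may differ). The
keep-alive option is spelled 'option http-keep-alive' in haproxy's
configuration.

```haproxy
defaults
    mode http
    # Old behaviour: close the connection after each request/response.
    # option httpclose
    # New behaviour: keep client and server connections open so many
    # small, fast requests can reuse them.
    option http-keep-alive
```

As far as I understand the haproxy docs, close wins over keep-alive
when a frontend and backend disagree, so httpclose has to be gone from
both sides for this to take effect.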
Signed-off-by: Kevin Fenzi <kevin@scrye.com>