all_reused_in_prev_batch should only check components in the previous
batch, not components in all batches including future batches. This
was accidentally regressed by some code refactoring in c36bd7ebac.
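A minimal sketch of the intended check (the batch and reused_component_id attributes follow the MBS models, but this is an illustration, not the project's exact code):

    def all_reused_in_prev_batch(module_build):
        # Look only at batch N-1; the regression was iterating over all
        # component builds, including ones scheduled for future batches.
        prev_batch = module_build.batch - 1
        return all(
            c.reused_component_id is not None
            for c in module_build.component_builds
            if c.batch == prev_batch
        )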
We use module builds as an intermediate step for building Flatpak
applications on Fedora. As Flatpaks in Fedora are officially supported
only on aarch64 and x86_64, we wanted to limit the builds to just these
architectures to save Fedora resources. We were able to do this with commit
https://src.fedoraproject.org/modules/flatpak-common/c/65a01f, which
works perfectly in koji/mbs, but doesn't work when run locally: the
build fails with the following errors and exceptions:
info: Getting tag for flatpak-common:f36:3620220516070452
info: Start to handle flatpak-common:f36:3620220516070452:cab77b58 which is in init state.
Traceback (most recent call last):
  File "/usr/lib/python3.10/site-packages/module_build_service/scheduler/handlers/modules.py", line 182, in init
    record_module_build_arches(mmd, build)
  File "/usr/lib/python3.10/site-packages/module_build_service/scheduler/submit.py", line 150, in record_module_build_arches
    arches = get_build_arches(mmd, conf)
  File "/usr/lib/python3.10/site-packages/module_build_service/scheduler/submit.py", line 95, in get_build_arches
    new_arches = _check_buildopts_arches(mmd, arches)
  File "/usr/lib/python3.10/site-packages/module_build_service/scheduler/submit.py", line 131, in _check_buildopts_arches
    print(arches not in unsupported_arches, file=sys.stderr)
TypeError: unhashable type: 'list'
info: State transition: 'init' -> 'failed', <ModuleBuild flatpak-common, id=2, stream=f36, version=3620220516070452, scratch=False, state 'failed', batch 0, state_reason 'An unknown error occurred while validating the modulemd'>
warning: Note that retrieved module state 4 doesn't match message module state 'failed'
Traceback (most recent call last):
  File "/usr/bin/mbs-manager", line 33, in <module>
    sys.exit(load_entry_point('module-build-service==3.6.1', 'console_scripts', 'mbs-manager')())
  File "/usr/lib/python3.10/site-packages/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3.10/site-packages/flask/cli.py", line 596, in main
    return super().main(*args, **kwargs)
  File "/usr/lib/python3.10/site-packages/click/core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3.10/site-packages/click/core.py", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3.10/site-packages/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "/usr/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/lib/python3.10/site-packages/flask/cli.py", line 440, in decorator
    return __ctx.invoke(f, *args, **kwargs)
  File "/usr/lib/python3.10/site-packages/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "/usr/lib/python3.10/site-packages/module_build_service/manage.py", line 171, in build_module_locally
    module_build_service.scheduler.local.main(module_build_ids)
  File "/usr/lib/python3.10/site-packages/module_build_service/scheduler/local.py", line 57, in main
    raise_for_failed_build(module_build_ids)
  File "/usr/lib/python3.10/site-packages/module_build_service/scheduler/local.py", line 39, in raise_for_failed_build
    raise ValueError("Local module build failed.")
ValueError: Local module build failed.
The problem is that the code as it stands fails to proceed if it
detects any unsupported architecture - in this case aarch64 - even
though the local x86_64 is supported. The check should be redone so
that when an unsupported architecture is detected, we verify it isn't
the local one and proceed with the build, and fail only when the local
hardware doesn't support any of the specified architectures.
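A rough sketch of the revised check for local builds (the helper and its names are illustrative, not the real _check_buildopts_arches):

    import platform

    def check_buildopts_arches_local(requested_arches):
        # Warn (don't fail) about arches the local host can't build.
        local_arch = platform.machine()
        unsupported = [a for a in requested_arches if a != local_arch]
        if unsupported:
            print("warning: skipping unsupported arches: %s"
                  % ", ".join(unsupported))
        if local_arch not in requested_arches:
            # Fail only when the local hardware doesn't support any of
            # the specified architectures.
            raise RuntimeError(
                "local arch %s not in buildopts arches %s"
                % (local_arch, requested_arches))
        # Locally we can only build for the machine we are running on.
        return [local_arch]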
config.allow_compatible_base_modules=False does different things for
build-requires selection and for module reuse. Clarify this in the config
key documentation.
(This config key effectively means True: "do what RHEL expects",
False: "do what Fedora expects".)
Event handlers decorated with @celery_app_task don't propagate exceptions -
they just log them, leaving the Moksha hub running. This meant
that any failures in test_build would result in the test suite hanging
rather than failing usefully.
We solve this by adding a new class EventTrap which acts as a context
manager. Any exceptions that occur in event handlers are set on the
current EventTrap, and the test_build tests re-raise the exception.
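A minimal sketch of how such a trap could look (method names are illustrative, not necessarily the new class's exact API):

    class EventTrap:
        current = None  # the trap active for the running test, if any

        def __init__(self):
            self.exception = None

        def __enter__(self):
            EventTrap.current = self
            return self

        def __exit__(self, exc_type, exc_value, traceback):
            EventTrap.current = None
            return False  # don't swallow exceptions from the body

        def set_exception(self, exception):
            # Called from a handler's except block in addition to logging.
            if self.exception is None:
                self.exception = exception

        def check(self):
            # Called by the test; fails fast instead of hanging the hub.
            if self.exception is not None:
                raise self.exception

A test wraps the build in "with EventTrap() as trap:" and calls trap.check() once the build settles.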
SQLAlchemy objects can't be used from multiple threads - so when starting
threads for builds, pass the ComponentBuild id rather than the object.
(Note that despite the comment that the threads were sharing a session,
they weren't - what was passed to the thread was a scoped_session that
acts as a separate thread-local session per-thread.)
BUILD_COMPONENT_DB_SESSION_LOCK - a threading.Lock() object that was
used in a few places, but not nearly enough places to effectively lock
usage of a shared session - is removed.
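A self-contained sketch of the pattern (the model here is a stand-in for the real ComponentBuild):

    import threading

    from sqlalchemy import Column, Integer, String, create_engine
    from sqlalchemy.orm import declarative_base, scoped_session, sessionmaker

    Base = declarative_base()

    class ComponentBuild(Base):  # stand-in for the real MBS model
        __tablename__ = "component_builds"
        id = Column(Integer, primary_key=True)
        package = Column(String)

    engine = create_engine("sqlite:///builds.db")
    db_session = scoped_session(sessionmaker(bind=engine))
    Base.metadata.create_all(engine)

    def build_component(component_build_id):
        # scoped_session hands each thread its own thread-local Session,
        # so re-fetching by id is safe; sharing an ORM instance created
        # in another thread is not.
        component = db_session.get(ComponentBuild, component_build_id)
        print("building", component.package)
        db_session.remove()  # release this thread's session when done

    threads = [
        threading.Thread(target=build_component, args=(cb_id,))
        for (cb_id,) in db_session.query(ComponentBuild.id)
    ]
    for t in threads:
        t.start()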
Because each event handler wrapper would call scheduler.run() at the
end before returning, with a queue of 100 events to process we'd end
up with:

  Event handler 1 wrapper
    scheduler.run()
      Event handler 2 wrapper
        scheduler.run()
          ...
            Event handler 100 wrapper

which would eventually exhaust the Python stack limit. Fix this by
making scheduler.run() a no-op if the scheduler is already processing
the queue in the current thread.
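A minimal sketch of the guard (an illustrative in-process queue, not the real scheduler code):

    import threading

    class Scheduler:
        def __init__(self):
            self.queue = []
            self._local = threading.local()

        def enqueue(self, handler, *args):
            self.queue.append((handler, args))

        def run(self):
            # No-op when re-entered from an event handler: the outer
            # run() on this thread is still draining the queue and will
            # pick up anything the handler enqueued.
            if getattr(self._local, "running", False):
                return
            self._local.running = True
            try:
                while self.queue:
                    handler, args = self.queue.pop(0)
                    handler(*args)  # may call run() again, harmlessly
            finally:
                self._local.running = False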
Using a memory database causes tests/test_build to intermittently
fail, because using the same pysqlite3 connection object from multiple
threads - as was done so that the threads shared the same memory database
- is not, in the end, thread safe. One thread will stomp on the transaction
state of other threads, resulting in errors from starting a new transaction
when another is already in progress, or trying to commit a transaction
that is not in progress.
To avoid a significant speed penalty, the session-scoped fixture sets up
a database in the pytest temporary directory, which will typically be on
tmpfs. Time to complete all tests:
memory backend:               38 seconds
file on tmpfs:                40 seconds
file on nvme ssd with btrfs: 137 seconds
MBSSQLAlchemy, which attempted to make the memory backend work, is removed.
Session hooks are installed on the Session class rather than on the
scoped_session instance - this works better when we're changing from
one database to another at test setup time.
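A rough sketch of both pieces (the fixture shape and hook placement are illustrative):

    import pytest
    from sqlalchemy import create_engine, event
    from sqlalchemy.orm import Session

    @pytest.fixture(scope="session")
    def db_engine(tmp_path_factory):
        # File-backed SQLite in pytest's temporary directory (typically
        # tmpfs), which threads can share safely, unlike :memory:.
        db_file = tmp_path_factory.mktemp("db") / "mbs-test.db"
        return create_engine("sqlite:///%s" % db_file)

    # Registered on the Session class, so the hook keeps firing after
    # the tests point a fresh scoped_session at a different database.
    @event.listens_for(Session, "after_commit")
    def _session_hook(session):
        pass  # whatever the real hooks do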
The base module conflict generation was skipped for local builds
in 6b2e5be93a because libdnf wasn't ported to libmodulemd yet -
that was done in libdnf-0.45, so only warn and skip for versions of
dnf too old to require libdnf-0.45.
(Don't just unconditionally skip check/warning in case someone is
doing local module builds on RHEL 8.)
Ever since local builds were changed to call handlers directly
instead of going through the scheduler, the current module ID wasn't
being set, so no logs were written to the module build log file.
No implementations of MBS are using Greenwave, and there are no
current plans to do so. Koji Resolver will be sufficient for any use
case dependent on gating.
There is a need to rebuild the module builds done in CentOS 9 Stream
internally in MBS to include them in RHEL. This is currently a hard
task, because the RPM components included in a module are usually
taken from the HEAD of the branch defined by their `ref` value.
For the rebuild task, this means we would have to ensure that the HEAD
of every RPM component points to the right commit hash right before we
start rebuilding a CentOS 9 Stream module in the internal MBS. This is
a very hard and fragile thing to do, especially if two different
modules use an RPM component from the same branch. It is prone to race
conditions and makes the rebuilds quite complex and in some cases
impossible without force pushes to RPM component repositories, which
are not acceptable under the internal dist-git policy.
This commit fixes it by allowing the commit hash to be overridden
when submitting the module build. This helps in the situation above,
because we can keep the internal RPM component branches in 1:1 sync
with the CentOS 9 Stream branches, and HEAD can always point to the
same commit in both the internal and CentOS 9 Stream repositories.
When the module rebuild is submitted in the internal MBS, we can use
this new feature to override the `ref` for each RPM component so that
it points to a particular commit, and the requirement for HEAD to
point to that commit no longer applies.
The `ref` is overridden only internally in MBS (but it is recorded in
the logs and in the XMD section), so the input modulemd file is not
altered. This is the same logic as used for other overrides
(`buildrequire_overrides` or `side_tag`).
This does not introduce any security problem, because it is already
possible to use a commit hash in `ref`, so a package maintainer can
already pin any particular commit by using this `ref` value.
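For illustration only, a hypothetical submission; "component_ref_overrides" stands in for the real parameter name, which is not spelled out here:

    import requests

    payload = {
        "scmurl": "https://gitlab.example.com/modules/mymodule.git?#deadbeef",
        "branch": "c9s",
        # Hypothetical field: RPM component name -> overriding commit hash.
        "component_ref_overrides": {
            "acl": "3d4e8dca2bb4d1d4e2bbc9a07b499a24f5b0b8f7",
        },
    }
    requests.post(
        "https://mbs.example.com/module-build-service/2/module-builds/",
        json=payload,
    )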
Signed-off-by: Jan Kaluza <jkaluza@redhat.com>
This works around a case where tagging messages are missed for a build
with high reuse. In such a case, we can start out in the final batch,
but have incorrect tag state for reused components from previous
batches.
Since ComponentBuildTrace(s) get created in a db_session.commit()
call, it is not possible to commit more items in bulk once they have
already been flushed. The current unit-test setup can be significantly
sped up if items can be quickly flushed on the fly and bulk-committed
only once at the end. Moreover, it generally seems more appropriate
and safer to handle this in before_flush, as any implicit or
accidental flush could otherwise cause new build traces not to be
created at all. As flush is implicitly called before every commit
anyway, this change shouldn't pose any harm.
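A minimal sketch of the listener (the model names follow the project, but the trace fields are illustrative):

    from sqlalchemy import event
    from sqlalchemy.orm import Session

    @event.listens_for(Session, "before_flush")
    def create_component_build_traces(session, flush_context, instances):
        # Objects added here ride along with the flush that is about to
        # run, so traces are created even on implicit/accidental flushes
        # and everything is committed in bulk at the end.
        for obj in list(session.new) + list(session.dirty):
            if isinstance(obj, ComponentBuild):
                session.add(ComponentBuildTrace(
                    component_build=obj,
                    state=obj.state,
                ))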
This function could get called multiple times if the init handler
runs more than once. This can happen if the build failed in the
init handler due to external infrastructure being down and the
user resumes their build.
Commit 98b1ac79 ensures the message is sent after data changes are
committed to the database. Hence, these two workarounds can be
removed.
Signed-off-by: Chenxiong Qi <cqi@redhat.com>
In MBS, there are two cases where a message is sent when a module
build moves to a new state. One is creating a new module build, via
ModuleBuild.create in particular, when a user submits a module build.
The other is transitioning a module build to a new state with
ModuleBuild.transition. This commit handles these two cases in
slightly different ways.
For the former, the existing code is refactored by moving the publish
call outside ModuleBuild.create.
For the latter, the message is sent in a hook for the SQLAlchemy ORM
after_commit event rather than immediately inside
ModuleBuild.transition.
Both of these changes ensure the message is sent after the changes
have been committed to the database successfully. The backend can then
be confident that the database contains the module build data when it
receives a message.
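A sketch of the after_commit hook; the way pending messages are staged on the session is an assumption for illustration:

    from sqlalchemy import event
    from sqlalchemy.orm import Session

    @event.listens_for(Session, "after_commit")
    def publish_module_state_changes(session):
        # By the time this runs, the new state is durably in the
        # database, so a consumer reacting to the message will see it.
        for message in session.info.pop("pending_messages", []):
            publish(message)  # the project's message-bus publish call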
Signed-off-by: Chenxiong Qi <cqi@redhat.com>
This also includes `from __future__ import absolute_import`
in every file so that the imports are consistent in Python 2 and 3.
The Python 2 tests fail without this.