There's a lot here, though most of this has come up before (e.g. in the
"koji 2.0" discussions). Have a look at koji.next.md in the source tree.
So, you're touching on some stuff that we've known we want to do for a
while now. It's just that these are major invasive changes, and dev
bandwidth is limited. We're finally getting to the point where we can
really consider stuff like this (we did py3 support finally).
-
Regarding rpc, I think the key word is alternative. While I previously
entertained the idea of dropping xmlrpc in favor of something else, I don't
think we can reasonably drop xmlrpc anytime soon. We can add a new rpc
method, we can use it in Koji itself (though the cli client will want to
support old servers for a while), but a hard drop is not in the cards.
I'll have a look at grpc, though I think the easiest first thing would be
to accept a straightforward json-encoded post request in addition to the
xml-encoded one. This is not a standard, but posting json to a url is
pretty trivial to implement. The hard part of xmlrpc is getting the
encoding right.
Longer term, we can look at rest, but as you correctly point out, this is a
major overhaul.
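A minimal sketch of what a json-encoded call could look like next to the xml-encoded one. The {"method": ..., "params": [...]} layout is an assumption made up for illustration (as noted, there is no standard here), the build NVR is a placeholder, and nothing below talks to a real hub.

```python
import json
import xmlrpc.client


def encode_json_call(method, *params):
    """Encode a call as a JSON request body (hypothetical wire format)."""
    return json.dumps({"method": method, "params": list(params)}).encode("utf-8")


def decode_json_call(body):
    """What the receiving side would do: recover the method name and params."""
    payload = json.loads(body.decode("utf-8"))
    return payload["method"], payload["params"]


# The same call in both encodings, for comparison.
json_body = encode_json_call("getBuild", "bash-5.0-1.fc30")
xml_body = xmlrpc.client.dumps(("bash-5.0-1.fc30",), methodname="getBuild")

method, params = decode_json_call(json_body)
```

The json side is two stdlib calls each way, which is the appeal. Anything json cannot represent natively (dates, for instance) would still need an agreed convention.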
-
Cheetah is not as dead as it used to be, but I'm generally down with
porting to jinja2 anyway. Still, major overhaul, and honestly, the web ui
could use even more overhaul than that.
-
sqlalchemy will be a very large pill to swallow. Sure you can more or less
drop-in replace psycopg2 with it, but then you're not really using
sqlalchemy. Getting all the way to using the orm is basically a complete
rewrite of the hub. Not saying no (it's on the koji.next list after all),
but we need to recognize the difficulties.
-
Concerning pytest. I'm honestly not a fan. Every time I've encountered it,
I've found it frustrating.
-
"Dynamic builders" is not quite the right term for where I want to go, but
yes, we do want to be able to take advantage of the cloud here.
... didn't get all the way through, will reply more later
On Wed, Mar 13, 2019 at 3:56 PM Ken Dreyer <ktdreyer(a)ktdreyer.com> wrote:
Hi folks,
I have some ideas about Koji development. I didn't want to throw a bunch
of ideas up in the air without any code, but at the same time I did want to
at least get the topics out there.
Please let me know what you think!
== API alternative to XML-RPC ==
From time to time I hear complaints about the XML parts of Koji. It's true
that this is showing its age, but XML-RPC is a pretty mature solution with
broad client support in a lot of languages that matter.
Nevertheless I sometimes hear REST offered as a solution. I've worked with
a couple services that added a REST API in addition to the original XML-RPC
API, and unfortunately one of the biggest barriers to completely
transitioning is all the dependencies. Koji's ecosystem is growing more and
more as Koji's architecture becomes more modular and pluggable, and REST
would "break the world". In some of these projects' cases that tried to
transition, I suspect the projects themselves are going to die before they
drop XML-RPC support.
Moreover, there are some things that have no easy analog with an HTTP REST
API:
- Koji has a "list-api" RPC that automatically provides a list of all
calls the hub provides. This is extremely useful when developing code and
services that interact with Koji. There's nothing simple that gives us this
same functionality out of the box.
- Koji has multi-call support, allowing us to send multiple RPCs over a
single HTTP request. This is critical to operating Koji at scale. Doing
requests serially (or even parallelizing them on the client) is incredibly
slow compared to the performance of multicall operations. Given Kojihub's
single "large box" hub architecture, it's important to avoid hammering it
with more requests.
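To make the multi-call point concrete, here is a rough sketch of how several calls can be packed into one request body using only the standard library. The struct layout (methodName/params) follows the usual XML-RPC multicall convention; the method name "multiCall" and the build names are assumptions for illustration, and no request is actually sent.

```python
import xmlrpc.client


def build_multicall(calls):
    """Pack (method, params) pairs into a single multicall request body."""
    batch = [{"methodName": name, "params": list(params)} for name, params in calls]
    # One XML-RPC request whose single argument is the list of calls.
    return xmlrpc.client.dumps((batch,), methodname="multiCall")


# Three lookups that would otherwise be three HTTP round trips.
calls = [("getBuild", (nvr,)) for nvr in ("pkg-1.0-1", "pkg-1.0-2", "pkg-1.0-3")]
body = build_multicall(calls)

# Decoding the body shows what the server would see.
params, method = xmlrpc.client.loads(body)
```

The server answers with one list of results in the same order, so the per-request overhead is paid once rather than once per call.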
It's the "XML" that's bad in "XML-RPC", and I am wondering if gRPC could
be a good solution. I have not played around with it. There is slow
progress towards developing GSSAPI authentication for this at
https://github.com/grpc/proposal/pull/101
== Cheetah -> Jinja ==
Cheetah is essentially dead upstream and there is a lot of support behind
the Jinja2 project.
I could have sworn that I saw some patch from Tomas about this where he
was experimenting with converting over, but maybe I am imagining this.
== SQLAlchemy ==
https://pagure.io/koji/issue/125
I expect an ORM would help with developer velocity and avoiding SQL
injection in a lot of areas. Koji has its own "history" helper methods to
record an audit trail for some changes in the database.
I've had some good experience on a small project using
https://pypi.org/project/sqlacodegen/ to reverse a pre-existing schema
into a series of SQLAlchemy model classes.
I think a SQLAlchemy transition could be 1) swap out the psycopg
connection code to use SQLAlchemy connections instead, and pass all raw SQL
into the SQLAlchemy connection 2) use sqlacodegen to migrate to using rich
models over time.
== pytest ==
Currently the Koji tests use Python's unittest framework, and pytest would
let us have advanced features and cut out a lot of the boilerplate.
pytest is able to execute unittest's tests, so that would help with the
transition instead of having to cut everything over all at once.
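As a small illustration of the two styles side by side: the helper under test is a toy written for this example (not Koji's own parser). The class is an ordinary unittest test case, and the bare function at the end is the pytest style, which pytest would collect from the same file.

```python
import unittest


def parse_nvr(nvr):
    """Toy helper under test: split name-version-release on the last two dashes."""
    name, version, release = nvr.rsplit("-", 2)
    return {"name": name, "version": version, "release": release}


class TestParseNVR(unittest.TestCase):
    # unittest style: a class, a method, and an assert* helper.
    def test_simple(self):
        self.assertEqual(parse_nvr("bash-5.0-1")["name"], "bash")


# pytest style: a plain function and a plain assert, no class needed.
def test_parse_nvr_dashed_name():
    assert parse_nvr("python-requests-2.21.0-1")["name"] == "python-requests"
```

Running pytest on a file like this picks up both tests, which is what would allow converting test modules one at a time.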
== Dynamic builders ==
If Koji's task queue grows beyond what the static list of builders can
handle, there's no way to "burst" to a cloud environment to dynamically add
and remove builder capacity.
I have been brainstorming some kind of an "orchestrator" that can create
the necessary builder credentials and authorize the builders into the hub.
It would need an ability to add and remove Kerberos principals for each
builder's FQDN, or maybe not?
Maybe this could be implemented as an OpenShift operator.
== Event-driven architecture ==
Currently Koji polls a lot. This puts pressure on the hub to continuously
answer all the poll requests from the CLI, web interface, kojid, etc. Big
environments have to tune the kojid's sleep time to use longer timeouts,
which means kojid picks up new builds slowly.
In other projects celery with rabbitmq has been a great combination for
dispatching jobs to workers. I think celery could be a good choice for Koji
as well.
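The difference from polling can be sketched with nothing but the standard library. Here an in-process queue.Queue stands in for a real broker such as rabbitmq, and the "builder" is a thread. This is only an assumption-laden toy and says nothing about how celery or kojid would actually be wired up.

```python
import queue
import threading

tasks = queue.Queue()  # stand-in for a real message broker
done = []


def builder():
    """Block until a task arrives instead of sleeping and re-polling."""
    while True:
        task = tasks.get()  # wakes up as soon as something is queued
        if task is None:  # sentinel value: shut the worker down
            break
        done.append("built " + task)


worker = threading.Thread(target=builder)
worker.start()

for name in ("bash", "glibc"):
    tasks.put(name)
tasks.put(None)
worker.join()
```

The worker never asks "is there anything for me yet?", so there is no sleep interval to tune and no idle load on whatever is handing out the work.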
== Stronger checksums ==
While I was working on content generators, I found Koji relies on md5 in
several areas. This hash is very broken and we'll need a new one.
It would be ideal to have a tool that can scan every existing build
archive, calculate the new hash values, and add the new hash values into
the database.
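The hashing half of such a tool is short. A minimal sketch, assuming sha256 as the new algorithm (the choice of algorithm is an open question here) and reading each file only once; walking the build archives and updating the database are left out entirely.

```python
import hashlib
import io


def checksums(fileobj, algorithms=("md5", "sha256"), chunk_size=1024 * 1024):
    """Read a binary stream once and return {algorithm: hexdigest}."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    # Feed every chunk to every hasher so large archives are read only once.
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        for hasher in hashers.values():
            hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


# In-memory stand-in for an archive; a real scan would use open(path, "rb").
result = checksums(io.BytesIO(b"example archive contents"))
```

Computing the old md5 alongside the new hash lets a scan verify the file against the stored value before recording the new one.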
== Longer GPG key lengths ==
Koji currently stores short key IDs. This has ramifications for Pungi,
productmd, and probably lots more, because they all get these key values
from Koji.
The evil32.com website explains the problem with these short key IDs, and
I'm surprised we don't have attacks on Red Hat's keys already in this area.
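For reference, a short sketch of how the two ID lengths relate for V4 OpenPGP keys: the long key ID is the last 16 hex digits of the fingerprint and the short one is the last 8. The fingerprint below is made up for illustration and does not belong to any real key.

```python
def key_ids(fingerprint):
    """Return (short_id, long_id) for a 40-hex-digit V4 OpenPGP fingerprint."""
    fpr = fingerprint.replace(" ", "").lower()
    if len(fpr) != 40 or any(c not in "0123456789abcdef" for c in fpr):
        raise ValueError("expected a 40-hex-digit V4 fingerprint")
    # Short ID: low 32 bits. Long ID: low 64 bits.
    return fpr[-8:], fpr[-16:]


# Made-up fingerprint, not a real key.
short_id, long_id = key_ids("0123 4567 89AB CDEF 0123  4567 89AB CDEF 0123 4567")
```

With only 32 bits to match, generating a second key with the same short ID is cheap, which is the collision problem evil32.com describes.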
== Storing builds in object storage (S3) ==
Koji assumes a sizable NFS architecture, and in many environments object
storage like S3 is more attractive and scalable.
There are a couple open-source implementations of S3's API, like Ceph.
Maybe S3 buckets could be another "volume" type for the Koji hub. I
haven't looked in depth at what this would mean for how Koji manipulates
builds (eg with createrepo).
- Ken
_______________________________________________
koji-devel mailing list -- koji-devel(a)lists.fedorahosted.org