2025-12-01 08:43:15 We badly need to rethink our mirror storage usage. This is not sustainable. 2025-12-01 08:46:57 what is the main problem? too many packages or too many past releases? 2025-12-01 08:55:16 it is cloud backed, right? 2025-12-01 08:56:09 how much is used/needed? 2025-12-01 08:59:37 lotheac: Too many packages. The past releases are relatively small. Even if we remove them, we gain little space back 2025-12-01 08:59:57 omni: Not cloud backed. Mirrors are all on servers we manage with limited storage 2025-12-01 09:00:18 The only thing 'cloud' about it is that Fastly fronts a CDN covering a lot of bandwidth 2025-12-01 09:03:42 One suggestion I made before, but we haven't done yet, is to start archiving the community repo for unsupported releases (where the entire release is EOL), so starting from 3.19 after the 3.23 release 2025-12-01 09:05:01 Perhaps we can contact archive.org, iirc they do something for archlinux as well 2025-12-01 09:05:32 sounds reasonable to me. what's the size distribution (roughly) like between main/community/testing? 2025-12-01 09:09:16 https://tpaste.us/waRJ 2025-12-01 09:13:08 thanks 2025-12-01 09:15:05 i thought releases/ looks pretty large too, but i guess that has to be edge, and edge seems to have a pretty large history of release binaries 2025-12-01 09:16:12 i'm assuming releases/ on actual release versions is smaller in comparison 2025-12-01 09:21:53 I wonder if we could separate out a games repo 2025-12-01 09:22:25 I mean I wonder how much space that would save us 2025-12-01 09:24:36 Where would we host that repo? 2025-12-01 09:30:16 this horrible oneliner to find >=100MB packages from the edge/community index generated by the http server https://termbin.com/oqyq makes me think there aren't as many big games as there are other big packages 2025-12-01 18:22:37 lotheac: are you able to help getting the new CI to work for x86?
What at least was missing is making sure the container is started with linux32 as entrypoint 2025-12-01 18:23:18 This is what we do for docker: 2025-12-01 18:23:20 https://gitlab.alpinelinux.org/alpine/aports/-/blob/master/.gitlab-ci.yml?ref_type=heads#L93 2025-12-01 20:22:50 Would there be a place to find the stuff that gets archived? Something similar to Ubuntu's old-releases? Otherwise it might be hard for people to upgrade 2025-12-01 20:35:07 We used to have something like old.alpinelinux.org 2025-12-01 23:36:36 ikke: I'm mixing things up, I was thinking of that single disk zfs pool, was that distfiles (and cloud based)? 2025-12-02 00:38:11 ikke: sure i can take a look 2025-12-02 00:40:48 any chance you could provide me with kubectl access to the cluster (could probably be scoped to some unrelated ns) so that i can experiment without setting up a separate cluster on my side? 2025-12-02 01:03:56 i am thinking we could just add another podSpec patch to change the entrypoint. but would like to test that 2025-12-02 01:18:09 might be as simple as this https://gitlab.alpinelinux.org/alpine/infra/k8s/ci-cplane-1/-/merge_requests/13 2025-12-02 11:27:35 lotheac: thanks, I'll check it 2025-12-02 11:27:47 I saw a post the other day of integrating gitlab OIDC in kubernetes 2025-12-02 11:28:03 yeah any oidc should be fine 2025-12-02 11:28:18 well, it might need some mangling though 2025-12-02 11:28:36 https://dille.name/blog/2025/02/24/using-gitlab-oidc-to-authenticate-against-kubernetes/ 2025-12-02 11:28:42 i use authentik for sso (oidc and saml) at $CLIENT and also at home 2025-12-02 11:28:49 including for kube 2025-12-02 11:29:40 iirc one of the gotchas was that the oidc scopes are not configurable so the OP must be configured to return everything with just openid scope 2025-12-02 11:30:21 anyway let me know if i can help :) 2025-12-02 11:31:11 not sure how to configure k0s apiserver to be an oidc client though 2025-12-02 11:32:22 seems like it is supported 2025-12-02
11:32:56 https://docs.k0sproject.io/head/examples/oidc/oidc-cluster-configuration/ 2025-12-02 11:33:28 looks pretty straightforward 2025-12-02 11:33:31 ack 2025-12-02 11:34:15 the blog post you linked fills the rest of the gaps. i also use int128/kubelogin as the kubectl-side glue 2025-12-02 11:34:35 i might package that for alpine later 2025-12-02 11:35:35 There's also https://kubernetes.io/blog/2024/04/25/structured-authentication-moves-to-beta/ instead of the CLI options 2025-12-02 11:35:40 (from the blog post) 2025-12-02 11:36:45 doesn't seem like that is supported in k0s? 2025-12-02 11:38:54 Does it need to? 2025-12-02 11:39:34 Seems like a kubernetes API thing? 2025-12-02 11:40:26 that latest link is describing authentication configuration that you provide to the apiserver process by --authentication-config on the command line 2025-12-02 11:41:07 as opposed to all the previously-used separate oidc command line options 2025-12-02 11:41:30 but regardless it is direct configuration for the api server, _not_ a kube api object 2025-12-02 11:41:55 Right 2025-12-02 11:42:20 and from the k0s docs it does not seem to support that style of config 2025-12-02 11:42:49 doesn't matter much though, the --oidc stuff works just fine 2025-12-02 11:42:59 yup 2025-12-02 12:34:28 can someone kick build-edge-riscv64? 2025-12-02 12:37:07 The host is unreachable. clandmeter, could you restart it? 2025-12-02 12:37:25 (No response from console) 2025-12-02 12:39:42 damn maybe i should disable kanidm on riscv64. building almost 1000 rust crates is not per-se fun 2025-12-02 12:40:58 cosmic is a similar spot :p 2025-12-02 12:41:03 *is in 2025-12-02 12:41:19 yeah 2025-12-02 13:44:46 we need to power cycle the pioneer box again 2025-12-02 13:44:59 the build-3-23-riscv64 is dead 2025-12-02 13:45:03 Host is unreachable 2025-12-02 13:45:22 ikke: can you do that or do we need clandmeter?
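The k0s OIDC wiring discussed above boils down to passing the classic --oidc-* flags to the apiserver via extraArgs in the k0s cluster config, per the linked k0s docs. A minimal sketch; the issuer URL, client ID, and claim names here are placeholders, not the actual Alpine setup:

```yaml
# k0s ClusterConfig fragment (sketch; values are placeholders)
apiVersion: k0s.k0sproject.io/v1beta1
kind: ClusterConfig
metadata:
  name: k0s
spec:
  api:
    extraArgs:
      oidc-issuer-url: "https://sso.example.org"   # the OIDC provider (OP)
      oidc-client-id: "kubernetes"
      oidc-username-claim: "email"
      oidc-groups-claim: "groups"
```

On the client side, a kubectl credential plugin such as int128/kubelogin (mentioned above) then handles the browser-based token flow.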
2025-12-02 13:50:34 We need Carlo 2025-12-02 14:58:44 everybody needs me 2025-12-02 14:58:54 ncopa: i power toggled the box 2025-12-02 14:58:56 check pls 2025-12-02 16:06:10 I'm getting 4 copies of every email notification from gitlab now 2025-12-02 16:06:24 Yeah, it's an ongoing issue, trying to figure out what's causing it 2025-12-02 20:47:53 i have lxc-freeze'd build-edge-riscv64 in an attempt to speed up build-3-23-riscv64 so I can tag -rc2 before going to bed 2025-12-02 21:04:03 ncopa: does edge also need to finish? 2025-12-02 21:04:22 Not for a 3.23 release 2025-12-02 21:04:28 (or rc) 2025-12-02 21:05:01 alright 2025-12-02 21:05:11 then we should get there today 2025-12-02 21:05:19 not for 3.23 as said. we only need the build-3-23-* to get ready 2025-12-02 21:05:58 i have frozen the build-edge-riscv64 to free up resources on the machine, so build-3-23-riscv64 works faster - hopefully 2025-12-03 10:07:35 hiya, can someone remove ~/aports/community/go/tmp manually? there seems to be a really weird bug in go's testsuite removing w permissions from $tmpdir/go-tool-dist-*, leading to abuild not being able to clean it up 2025-12-03 10:38:02 where? 2025-12-03 10:39:14 oh 2025-12-03 10:39:16 everywhere 2025-12-03 10:47:12 https://www.memecreator.org/static/images/memes/5591356.jpg 2025-12-03 10:54:01 oh no 2025-12-03 10:54:03 what just happened 2025-12-03 10:54:10 300 packages to build? 2025-12-03 10:55:26 ikke: can you please post a notification on gitlab to not push any bigger build jobs til 3.23 is out? 2025-12-03 10:57:23 Yes 2025-12-03 11:05:14 Done 2025-12-03 11:06:58 thanks 2025-12-03 11:20:41 im stopping all edge builders 2025-12-03 17:31:30 lotheac: I wonder, you say the emails have different message-ids, but from what I see, gitlab uses the note id as message id, so how can it be different for the same note?
2025-12-03 17:31:40 Message-ID: 2025-12-03 17:45:16 https://gitlab.com/gitlab-org/gitlab/-/issues/231353 2025-12-03 17:58:16 So the cause is that sidekiq for some reason thinks the email was not sent properly and is retrying up to 3 times 2025-12-03 17:58:31 In sidekiq, I see 'Net::ReadTimeout: Net::ReadTimeout with #' 2025-12-03 18:00:51 ncopa: I'm planning to upgrade gitlab tonight. Please let me know when you're finished 2025-12-03 18:02:08 how much downtime? 2025-12-03 18:02:12 Not a lot 2025-12-03 18:02:19 Couple of minutes 2025-12-03 18:02:24 i will go out for an hour or two soonish 2025-12-03 18:02:28 (y) 2025-12-03 18:59:57 btw can someone already create a 3.23.1 milestone? 2025-12-03 19:00:08 Not right now :P 2025-12-03 19:01:02 yeah i mean not in the next minutes but in general :p 2025-12-03 19:01:21 Yeah, wasn't a serious response 2025-12-03 19:01:40 yeah yeah i figured 2025-12-03 19:05:43 every second gitlab release they change the ui in non-intuitive ways, love it 2025-12-03 20:20:01 achill: https://gitlab.alpinelinux.org/groups/alpine/-/milestones/28 2025-12-03 20:20:09 great thx 2025-12-03 21:50:16 thanks! 2025-12-04 03:02:57 ikke: i don't know what to tell you. https://tpaste.us/pygE 2025-12-04 03:03:19 here i have 4 distinct message-id's 2025-12-04 11:46:05 I've cleaned up the riscv64 builders a bit, lots of tmp/ dirs in aports directories taking up quite some space 2025-12-04 11:47:19 thanks 2025-12-04 13:04:26 lol warp-s3's 1.3.1 tarball on distfiles.a.o got corrupted: https://build.alpinelinux.org/buildlogs/build-edge-armhf/testing/warp-s3/warp-s3-1.3.1-r1.log 2025-12-04 13:05:02 warp-s3-1.3.1.tar.gz: data 2025-12-04 13:05:15 guess i'll append a -1 to the source= filename 2025-12-04 13:09:30 achill: I can rm it 2025-12-04 13:10:04 oh, already done 2025-12-04 13:19:35 I've removed it anyway 2025-12-04 14:40:50 Until very recently abuild didn't try to clean up tmp/ by default.
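On the go tmp/ cleanup mentioned earlier: a directory stripped of write permission blocks unlinking its contents, which is why abuild (and a plain rm -rf) couldn't clean these up. A sketch of the manual fix pattern; the go-tool-dist path name is illustrative:

```shell
#!/bin/sh
# Simulate a tmp dir left behind with write permission stripped, the
# way go's test suite left $tmpdir/go-tool-dist-* behind.
tmp=$(mktemp -d)
mkdir -p "$tmp/go-tool-dist-000"
: > "$tmp/go-tool-dist-000/file"
chmod a-w "$tmp/go-tool-dist-000"

# A plain rm -rf may fail here (unless running as root), because
# entries inside a non-writable directory can't be unlinked.
rm -rf "$tmp" 2>/dev/null || true

# The manual fix: restore write permission first, then remove.
[ -e "$tmp" ] && chmod -R u+w "$tmp"
rm -rf "$tmp"
[ -e "$tmp" ] || echo "cleaned up"
```

As root the first rm already succeeds (root bypasses the permission check), which is why builder-side cleanup by an admin works where abuild's unprivileged cleanup failed.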
2025-12-04 14:40:58 yeah 2025-12-04 14:41:11 I figured 2025-12-04 14:41:16 the 3.23 builder wasn't affected 2025-12-04 15:14:22 ohh makes sense, that likely triggered the go failures 2025-12-04 18:32:29 we have a weird problem in tiny-cloud: https://gitlab.alpinelinux.org/alpine/cloud/tiny-cloud/-/jobs/2127420#L642 2025-12-04 18:32:36 the dash tests fail. only dash 2025-12-04 18:33:14 we added set -x for debugging (so now the others also fail). but it seems like a file glob does not work 2025-12-04 18:33:37 https://gitlab.alpinelinux.org/alpine/cloud/tiny-cloud/-/blob/main/lib/tiny-cloud/common?ref_type=heads#L70 2025-12-04 18:33:55 the "$i" has some weird extra chars 2025-12-04 18:34:08 for hetzner 2025-12-04 18:34:29 [ -x /builds/alpine/cloud/tiny-cloud/tests/../lib/tiny-cloud/cloud/hetzner/autodetectXGO ] 2025-12-04 18:34:51 where do those extra XG chars at the end come from? 2025-12-04 18:34:57 does not ring a bell 2025-12-04 18:35:06 it's either a bug in dash, or the filesystem is corrupt? 2025-12-04 18:57:31 i think it's dash 2025-12-04 18:59:14 https://git.kernel.org/pub/scm/utils/dash/dash.git/commit/?id=85ae9ea3b7a9d5bc4e95d1bacf3446c545b6ed8b 2025-12-04 19:02:02 https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/94226 2025-12-04 19:02:09 this needs backport to 3.23 2025-12-04 19:02:23 i need to go 2025-12-04 19:04:13 I can backport it 2025-12-04 19:05:03 thanks!
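For context, the failing construct in tiny-cloud's common library is essentially a glob over per-provider autodetect hooks; with a fixed shell it behaves as expected. A minimal reproduction of the pattern (layout and provider names are illustrative, not the real tiny-cloud tree):

```shell
#!/bin/sh
# tiny-cloud-style provider autodetection: glob over per-cloud
# autodetect hooks and keep the executable ones. Under the buggy dash
# the expanded "$i" carried stray trailing bytes, so the -x test
# failed for no visible reason.
root=$(mktemp -d)
for p in aws hetzner nocloud; do
    mkdir -p "$root/cloud/$p"
    printf '#!/bin/sh\n' > "$root/cloud/$p/autodetect"
    chmod +x "$root/cloud/$p/autodetect"
done

found=""
for i in "$root"/cloud/*/autodetect; do
    [ -x "$i" ] || continue
    p=${i%/autodetect}        # strip the filename ...
    found="$found ${p##*/}"   # ... keep the provider dir name
done
echo "found:$found"
rm -rf "$root"
```

Globs expand in collation order, so the providers come out sorted, which is what the tiny-cloud tests implicitly rely on.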
2025-12-04 20:09:51 lotheac: runner [dumb-init] entrypoint: No such file or directory 2025-12-04 20:16:00 lotheac: I think we need this: https://docs.gitlab.com/runner/executors/kubernetes/#overwrite-generated-pod-specifications 2025-12-04 20:51:27 https://gitlab.com/gitlab-org/gitlab-runner/-/issues/30713 2025-12-04 21:04:15 Seems like the change to fix this has been reverted again and never reapplied: https://gitlab.com/gitlab-org/gitlab-runner/-/merge_requests/4535 2025-12-04 21:26:28 lotheac: what I see is that with FF_KUBERNETES_HONOR_ENTRYPOINT, the container spec uses args instead of command 2025-12-04 21:26:46 I tried to use a patch to include a command, but that never seems to happen 2025-12-04 21:27:04 https://tpaste.us/knbb 2025-12-05 04:03:45 ikke: hmm... that's a bummer. we could of course just patch the image itself to change the command if other ways fail 2025-12-05 04:05:01 however... if we get ENOENT on dumb-init, i think that means that the patch did do what it was supposed to. but maybe it applied to the helper pod or something which it should not have 2025-12-05 04:05:06 not sure 2025-12-05 04:54:56 lotheac: it applied to the runner 2025-12-05 04:55:17 The runner pod, not the build pods 2025-12-05 04:55:24 i see 2025-12-05 05:07:56 seems like custom image would be the way to go then maybe... 2025-12-05 05:14:15 We already use custom images for building 2025-12-05 05:17:52 i forgot about that. where's the configuration for it? 2025-12-05 05:18:49 For aports ci builds: https://gitlab.alpinelinux.org/alpine/infra/docker/alpine-gitlab-ci 2025-12-05 05:19:22 i don't mean the dockerfiles, i mean how does gitlab pick which image to use 2025-12-05 05:19:41 .gitlab-ci in the project root 2025-12-05 05:19:56 ah, right 2025-12-05 05:20:06 https://gitlab.alpinelinux.org/alpine/aports/-/blob/master/.gitlab-ci.yml 2025-12-05 05:22:04 oh, so despite the entrypoint config here, it doesn't work?
2025-12-05 05:26:26 Correct, which is mentioned in the issues I link to 2025-12-05 05:26:30 specific for kubernetes 2025-12-05 05:27:01 The kubernetes executor by default sets the command, which is effectively what the entrypoint would be 2025-12-05 05:27:43 with the FF_KUBERNETES_HONOR_ENTRYPOINT, it switches from setting command to setting args, but it does provide the entrypoint as command 2025-12-05 05:27:57 There was an MR which would do that, but that was reverted 2025-12-05 05:28:14 that's annoying 2025-12-05 05:28:44 it does not* 2025-12-05 05:29:10 So the only option I see for now 2025-12-05 05:29:27 is providing an entrypoint in the image which, based on a flag, execs with linux32 2025-12-05 05:29:35 but we would need to make sure all relevant images do this 2025-12-05 05:29:54 but if the kubernetes executor always overrides pod.spec.containers.command, does the image entrypoint even matter 2025-12-05 05:30:04 not with FF_KUBERNETES_HONOR_ENTRYPOINT 2025-12-05 05:30:31 I mistyped, with FF_KUBERNETES_HONOR_ENTRYPOINT, it does _not_ set the command 2025-12-05 05:30:43 right, understood - it sets args in that case 2025-12-05 05:31:08 then, why do we need a flag in the image entrypoint? couldn't it just be linux32 for the -x86 tagged ones 2025-12-05 05:31:56 they're built from the same source, you cannot have a conditional entrypoint 2025-12-05 05:32:07 right, ok, so build-time flag :) 2025-12-05 05:32:18 well, that would just be two dockerfiles 2025-12-05 05:32:37 or multi-stage build 2025-12-05 05:47:27 another random thought: if we were to run the container runtime on the x86 nodes under linux32, maybe the multiarch image would just work as-is 2025-12-05 05:55:39 Could that cause issues with things like the daemonsets that run on the nodes?
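The "entrypoint with a build-time flag" idea above could look roughly like this; CI_LINUX32 is a made-up flag name (imagined as baked into the -x86 images at build time), not anything the current images define:

```shell
#!/bin/sh
# Sketch: conditional image entrypoint. If the (assumed) CI_LINUX32
# flag is set and linux32 is available, wrap the command so the
# 32-bit personality is reported inside the x86 containers;
# otherwise run the command untouched.
run_entry() {
    if [ "${CI_LINUX32:-no}" = yes ] && command -v linux32 >/dev/null 2>&1; then
        linux32 "$@"
    else
        "$@"
    fi
}

run_entry echo "hello from the entrypoint"
```

With two Dockerfiles (or a multi-stage build) the flag would simply be set via ENV in the x86 variant and left unset everywhere else.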
2025-12-05 05:55:48 possibly 2025-12-05 05:56:08 a separate RuntimeClass might be a better option if we're able to patch the pod specs 2025-12-05 05:56:44 as in we would configure the container runtime to start pods in our custom RuntimeClass with linux32 2025-12-05 05:57:08 and then we'd set pod.spec.runtimeClass for the target pods 2025-12-05 06:00:00 another pretty heavy-weight thing we _could_ do is writing an admission webhook service that modifies the pods; that would enable us to set command/entrypoint as we wish, but it's maybe a bit too big a hammer 2025-12-05 06:14:29 lotheac: could you provide a suggestion for rbac roles/clusterroles for administrative purposes? Then I can give you access to the cluster 2025-12-05 06:15:23 that all depends on the desired level of privilege :) but sure i can give an example or two 2025-12-05 06:17:27 I think for this cluster, any meaningful access would give you some way of accessing the "sensitive" information anyway 2025-12-05 06:18:20 So I'm thinking about some role that we can use for day to day work without relying on the default admin role 2025-12-05 06:22:17 usually for reduced scope i just use namespaced roles... but let's see if we can figure out something that works without being cluster-admin 2025-12-05 06:25:38 something like this would be a starting point https://tpaste.us/Ovp6 ; ref https://kubernetes.io/docs/reference/access-authn-authz/rbac/#default-roles-and-role-bindings 2025-12-05 06:26:10 that would grant read access to most things except secrets 2025-12-05 06:27:06 (for the oidc group "developers") 2025-12-05 07:55:23 ikke: runtimeclass works. https://tpaste.us/r5kV 2025-12-05 07:57:09 Ok, cool. That sounds like a feasible option if we get the runner to apply it 2025-12-05 07:57:17 yup 2025-12-05 07:58:40 need to configure the wrapper script and containerd.d on all nodes though 2025-12-05 08:03:13 Yes, understood.
Maybe we can use an ansible role for that 2025-12-05 08:03:24 Unless you have a better idea 2025-12-05 08:07:34 i've sometimes (ab)used daemonsets with hostPath mounts to configure stuff on the host machines 2025-12-05 08:07:42 just to avoid ansible :p 2025-12-05 08:09:17 https://tpaste.us/PQWy here's an example from my home cluster. it installs a static pod definition into /etc/kubernetes/manifests on the hosts. that makes the kubelet automatically create static pods on the nodes, _but_ i don't have to manage the files manually 2025-12-05 08:09:31 it's a bit hacky but eh :) 2025-12-05 08:10:13 if those machines are already managed by ansible then ansible may make sense anyways 2025-12-05 16:24:28 can someone poke build-edge-ppc64le? i think the messagelib build is stuck 2025-12-05 16:27:24 done 2025-12-05 16:29:21 thanks! 2025-12-05 20:22:42 lotheac: "To use this feature, you must enable the FF_USE_ADVANCED_POD_SPEC_CONFIGURATION feature flag" 2025-12-05 20:22:44 Oooh 2025-12-05 20:22:55 (regarding overwriting pod_spec) 2025-12-05 20:41:37 .. 2025-12-05 20:58:45 lotheac: finally 2025-12-05 20:58:57 I got it to work with the pod_spec patch from yesterday 2025-12-05 20:59:05 https://gitlab.alpinelinux.org/kdaudt/hello-world/-/jobs/2129125 2025-12-05 21:17:17 lotheac: and the RuntimeClass works as well :) 2025-12-05 21:17:22 So 2 options now 2025-12-05 21:25:13 https://gitlab.alpinelinux.org/alpine/infra/k8s/ci-cplane-1/-/merge_requests/15 2025-12-05 23:39:13 nice :) 2025-12-05 23:43:15 ikke: RuntimeClass may be slightly more correct in that all processes including those not started via the entrypoint (eg. kubectl exec) should have the correct personality.
when changing just the command that may not be the case, i’m guessing 2025-12-06 06:31:15 lotheac: indeed, I had a similar feeling 2025-12-06 06:32:01 nice to be on the same page :) 2025-12-06 11:54:46 lotheac: oh: you can directly set the runtime class: https://docs.gitlab.com/runner/executors/kubernetes/#set-the-runtimeclass 2025-12-06 11:54:49 no need for a patch 2025-12-06 12:08:56 nice! 2025-12-06 12:11:32 I'm debugging why setting something like KUBERNETES_MEMORY_REQUEST: 2Gi in gitlab-ci.yml is ignored 2025-12-06 12:12:24 https://docs.gitlab.com/runner/executors/kubernetes/#overwrite-container-resources 2025-12-06 12:15:09 https://tpaste.us/me1V 2025-12-06 12:15:32 The requests for CPU and memory still have the original values 2025-12-06 12:29:48 Oh right, now I remember. You need to set a max before you are allowed to overwrite it 2025-12-06 12:29:59 (even if it's lower) 2025-12-06 12:38:05 hmm 2025-12-06 12:40:58 No, that wasn't the problem :( 2025-12-06 12:51:26 This is annoying, because small jobs request way too much memory and prevent other jobs from being scheduled 2025-12-06 12:56:13 hmm :| 2025-12-06 12:56:57 I thought I had this working with the other cluster 2025-12-06 12:57:04 The temporary one 2025-12-06 13:03:21 oh, checking the config on the runner, it seems the memory_request_overwrite_max_allowed is missing 2025-12-06 13:04:06 type 2025-12-06 13:04:10 typo* 2025-12-06 13:04:37 (I cannot properly spell typo the first time to point out a typo) 2025-12-06 13:05:22 I copied that typo from the temp cluster, so I guess it never properly worked there either.
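For reference, the runner-side settings involved here (spelled correctly) live under the kubernetes section of the runner's config.toml, per the overwrite-container-resources docs linked above; the values below are illustrative, not the Alpine runners' actual numbers:

```toml
# config.toml fragment (sketch; values are illustrative)
[[runners]]
  [runners.kubernetes]
    # defaults requested for build containers
    cpu_request    = "1"
    memory_request = "4Gi"
    # ceilings up to which jobs may overwrite the requests via the
    # KUBERNETES_CPU_REQUEST / KUBERNETES_MEMORY_REQUEST variables
    # in .gitlab-ci.yml
    cpu_request_overwrite_max_allowed    = "4"
    memory_request_overwrite_max_allowed = "8Gi"
```

Without the *_overwrite_max_allowed keys the per-job variables are silently ignored, which matches the behaviour debugged above.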
2025-12-06 13:08:19 The typo was overwite instead of overwrite 2025-12-06 13:11:44 Ok, fixed 2025-12-06 13:15:37 well, glad to be the rubber duck :D 2025-12-06 13:17:17 heh 2025-12-06 20:22:33 Shutting down the 2 last equinix servers used for CI 2025-12-07 22:20:36 clandmeter: ^ boots into a live iso 2025-12-08 06:08:15 clandmeter: I've changed the boot order in the bios, now it boots the OS again 2025-12-08 08:29:11 ok 2025-12-08 09:46:15 .. 2025-12-08 10:14:36 One server is delayed, which is causing some issues 2025-12-08 11:26:57 what's going on with the download servers? 2025-12-08 11:35:09 kunkku: there was one server down. It's now catching up 2025-12-08 11:36:01 I ran apk on one machine and it started downgrading packages, e.g. kernel 6.18 to 6.12 2025-12-08 11:36:10 Oof 2025-12-08 11:36:22 a cache coherency issue perhaps 2025-12-08 11:36:40 It's one of the backend servers for the cdn 2025-12-08 12:00:12 I've stopped the mirror services on that server so that it will not return data 2025-12-08 15:17:38 alpine-edge-riscv64 is out of space 2025-12-08 15:18:19 when someone has a moment, looks like we need to create a /var/lib/image-sync/v3.23 on dev.alpinelinux.org for the 3.23 cloud image release 2025-12-08 17:02:30 I used to have an x86_64 lxc container running on 172.16.26.18, but it seems down now. did it get a new ip address or was it deleted? 2025-12-08 17:02:53 if it was deleted, would it be possible to give me access to a new one? would be very helpful for working on !94355 2025-12-08 20:58:31 riscv64 CI builder is also having issues? 2025-12-09 14:41:15 ikke: I know there's a ton going on still on the infra side, but if you have a spare moment today to get me access to the wiki server I have capacity on my end to look into them :) 2025-12-10 06:49:23 one of the riscv64 machines had a full disk.
I have cleaned it a bit 2025-12-10 07:02:51 tomalok: seems like there is a /var/lib/image-sync/v3.23 on dev.a.o now 2025-12-10 07:03:42 nmeum: i wonder if the 172.16.26.* was an equinix metal machine? i don't know where it was moved 2025-12-10 07:05:00 It has been moved 2025-12-10 12:18:30 nmeum: 172.16.5.18 (nmeum-edge-x86_64.deu-dev-1.alpin.pw) 2025-12-10 12:35:23 thank you! much appreciated 2025-12-11 11:43:48 clandmeter: 172.16.30.2 is down again 2025-12-12 07:49:55 im setting up a new riscv64 CI machine in scaleway 2025-12-12 08:35:26 i have added one riscv64 CI runner in scaleway. will try to add another later today 2025-12-13 07:50:05 FWIW, it seems there might be some networking issues with x86_64 builders: it particularly shows up in (heavy) testcases from the Tailscale package: x86_64 specifically now needs 2 networking-related thresholds to be augmented, while other arches pass OK without such adjustments (ref https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/94576). 2025-12-14 00:08:32 I've never temporarily raised a CI timeout and don't know where to do that 2025-12-14 00:24:18 it is for !94560 2025-12-14 00:24:27 I also wonder why it is taking so much longer than before 2025-12-14 01:26:22 omni: should be on https://gitlab.alpinelinux.org/omni/aports/-/settings/ci_cd#js-general-pipeline-settings 2025-12-14 08:46:35 dne: thanks! it seems like that menu item is not visible for me, until I follow your link 2025-12-14 08:51:17 is build-edge-riscv64 stuck? 2025-12-14 09:18:50 not only stuck. Host is unreachable 2025-12-14 09:19:05 both riscv64 machines are unreachable atm 2025-12-14 09:19:45 i know there is a plan to test the 6.18 kernel on the pioneer machines, but I don't know when that is supposed to happen 2025-12-14 09:19:53 i guess they are just down 2025-12-14 09:42:56 they are both down.
and we currently don't have any remote reset option, so it has to wait til over the weekend 2025-12-14 10:05:07 I added a second CI runner in scaleway 2025-12-14 10:05:25 we should now have 2 riscv64 CI runners in scaleway 2025-12-14 17:41:46 omni: the menu should open up/unfold when you click on "General pipelines" or the arrow on the left of it 2025-12-14 18:00:54 I cannot find a "General pipelines" either... 2025-12-14 21:25:16 oh, hmm… 2025-12-15 09:02:10 clandmeter: thanks! 2025-12-15 09:23:23 yw 2025-12-15 09:23:50 i re-added the wifi powerplug 2025-12-15 09:24:10 so remote power control is back 2025-12-15 15:38:48 lotheac: I think we need to adjust our approach for CI a bit. The default resource requests are quite high. That means for smaller jobs, we need to explicitly lower the requests, otherwise those jobs often fail to run due to requesting too many resources. 2025-12-15 15:39:44 fine with me, but from where do you intend to provide the resource configuration for each job? 2025-12-15 15:39:47 lotheac: what I was thinking is, could we have dedicated runners for the aports build jobs (tag ci-build) and then other runners for the remaining jobs 2025-12-15 15:40:00 ah, yeah that could work 2025-12-15 15:40:29 i've got something similar at $CLIENT for github, differently "sized" runners 2025-12-15 15:41:20 I'm currently traveling, so not a lot of means to make extensive changes 2025-12-15 15:41:58 if you need me to make an MR i can take a look sometime later this week 2025-12-15 15:42:09 Sure 2025-12-15 15:42:22 today is a bit hard, i should sleep soon 2025-12-15 15:43:34 No worries 2025-12-16 10:29:53 it looks like build-edge-riscv64 gets stuck on erlang28, I'm thinking of disabling it for that architecture to see if that would unblock the builder 2025-12-16 10:31:29 i think the builder is dead 2025-12-16 10:33:05 it was stuck on erlang28, then unstuck and built all the main/ aports and a few of the community/ ones, before getting stuck on erlang28 again
2025-12-16 10:33:28 so someone would need to resurrect the builder after erlang28 is disabled for riscv64 2025-12-16 10:42:03 !94733 2025-12-16 10:42:44 I saw build-edge-riscv64 was restarted and got that in before it is done with main/ 2025-12-16 10:43:27 omni: the riscv64 machines are flaky, and we are also testing the 6.18 kernel on them currently 2025-12-16 10:52:27 check 2025-12-17 11:54:40 x86_64 gitlab runner down? 2025-12-17 11:54:46 https://gitlab.alpinelinux.org/funderscore/aports/-/jobs/2143267 2025-12-17 21:56:00 can someone please remind me how I increase the timeout in CI for riscv64 2025-12-17 21:56:28 in the project settings -> ci/cd -> runner timeout 2025-12-17 21:56:41 i guess since we're developers you have to do it in alpine/aports 2025-12-17 21:58:38 thanks, found it 2025-12-17 21:58:43 it's set to 6 hours 2025-12-17 21:59:00 I think I'll just merge it 2025-12-17 21:59:21 we need faster CI runners for riscv64 2025-12-19 16:48:46 what's the ip address of the DNS server used to resolve the alpine.pw domain? 2025-12-19 16:49:10 I think it used to be 172.16.6.3 at some point, but isn't anymore it seems 2025-12-22 01:28:12 nmeum: it hasn't changed 2025-12-23 06:31:34 ikke: couldn't get around to it last week, but something like this i suppose https://gitlab.alpinelinux.org/alpine/infra/k8s/ci-cplane-1/-/merge_requests/16 2025-12-23 06:32:57 don't have the means to test it properly, but i think you get the idea. i didn't add an x86-only variant of that because i'm assuming it would be mainly used for linting etc. non-arch-specific jobs 2025-12-23 20:10:56 > 1 error prohibited this user from being saved: 2025-12-23 20:10:56 > Email is not allowed for sign-up. Please use your regular email address. Check with your administrator. 2025-12-23 20:10:56 > 2025-12-23 20:11:26 Hi there, get the above message when trying to sign up on gitlab.alpnelinux.org, please can someone help?
2025-12-23 20:11:38 s/alpnelinux/alpinelinux/ 2025-12-23 23:45:16 jack_kekzoz[m]: what provider are you using? 2025-12-24 07:00:38 ikke: Hotmail 2025-12-24 11:04:02 jack_kekzoz[m]: microsoft is blocking our emails 2025-12-24 11:04:47 ikke: ok, is gmail ok? 2025-12-24 11:04:54 Yes 2025-12-24 11:05:08 also, please can you add a note on the registration page that explains that? 2025-12-24 11:05:15 Thanks - and Merry Christmas 2025-12-30 07:53:34 lotheac: Thanks, I think that should suffice. I'm trying to figure out though why the job is failing. I've checked the job output, and it appears the helper job does not run properly for some reason 2025-12-30 07:54:08 helper job: "waiting for logs" 2025-12-30 07:54:18 build job: "/usr/local/bin/entrypoint: cd: line 100: can't cd to /builds/lotheac/ci-cplane-1: No such file or directory" 2025-12-30 07:54:30 hmm :) 2025-12-30 07:54:51 if i recall correctly my previous PRs also failed those jobs 2025-12-30 07:54:56 let me verify that 2025-12-30 07:55:52 https://gitlab.alpinelinux.org/lotheac/ci-cplane-1/-/jobs?kind=BUILD yeah seems so 2025-12-30 07:56:14 they failed in a different way though 2025-12-30 07:59:24 what is the pod spec like in the failing pod? 2025-12-30 08:00:06 as in, kubectl get -o yaml -n gitlab-ci pod/runner-gnatciks-project-5730-concurrent-0-cfrybwog 2025-12-30 08:04:41 https://tpaste.us/r5LV 2025-12-30 08:07:46 so /builds is an emptyDir. i am assuming the entrypoint script should be creating /builds/lotheac/ci-cplane-1 before attempting to cd to it, OR there is something that should be mounted there but isn't (in the pod spec nothing under /builds is mounted) 2025-12-30 08:08:51 The helper container is what clones the repository 2025-12-30 08:09:23 there is a /scripts-/ directory in that container, which is empty 2025-12-30 08:09:58 lotheac: Could you try to create an MR directly in the ci-cplane-1 project?
2025-12-30 08:10:04 yeah, sure 2025-12-30 08:10:14 I mean, pushing the branch directly there 2025-12-30 08:10:52 https://gitlab.alpinelinux.org/alpine/infra/k8s/ci-cplane-1/-/merge_requests/17 2025-12-30 08:11:32 Same 2025-12-30 08:11:51 huh 2025-12-30 08:14:28 i'm confused about "line 100" in /usr/local/bin/entrypoint, my copy of registry.alpinelinux.org/alpine/infra/docker/exec/k8s-deploy:latest only has 88 lines in that entrypoint script 2025-12-30 08:18:37 if the helper image is supposed to clone the repo... is it failing to do that? also how is that supposed to happen before the build image executes the entrypoint if they are just two containers in the same pod - is there some sort of synchronization logic? 2025-12-30 08:19:39 I'm not sure how that's working in kubernetes 2025-12-30 08:20:19 containers in the same pod don't have any guaranteed ordering. initContainers run to completion before containers start though 2025-12-30 08:20:28 There is an init container 2025-12-30 08:20:44 but that one only seems to touch the log file 2025-12-30 08:20:55 yeah 2025-12-30 08:22:16 lotheac: my copy of that image has 107 lines 2025-12-30 08:22:32 i pulled it just now 2025-12-30 08:22:43 lotheac: if you use the nl tool, make sure to use -b all 2025-12-30 08:22:48 argh yeah sorry 2025-12-30 08:23:26 yeah, 107 is correct 2025-12-30 08:23:56 but from what the entrypoint script looks like, it does no synchronization whatsoever and just assumes $CI_PROJECT_DIR already exists 2025-12-30 08:24:21 Yes, the entrypoint script itself is unaware of the helper script 2025-12-30 08:24:41 ie. when the script runs, it assumes the environment is already correctly setup 2025-12-30 08:25:03 right, but for that to be the case, the helper must somehow be arranged to run before starting the build container 2025-12-30 08:25:09 eg. 
in an initcontainer 2025-12-30 08:25:26 so i'm just confused now how this was working to begin with :) 2025-12-30 08:27:12 May it have to do with FF_KUBERNETES_HONOR_ENTRYPOINT? 2025-12-30 08:27:22 I guess the runner would inject the command to run later 2025-12-30 08:27:30 The args just run a shell 2025-12-30 08:27:31 hmm, possibly 2025-12-30 08:30:34 is the helper actually cloning the repo? can you exec into the helper container and see if /builds/ contains anything 2025-12-30 08:31:05 No, it's not 2025-12-30 08:31:13 It's doing exactly nothing 2025-12-30 08:32:38 then i don't think the problem is FF_KUBERNETES_HONOR_ENTRYPOINT... if it is the helper that should be cloning stuff, then whatever the build container does or doesn't do on startup sounds inconsequential 2025-12-30 08:33:29 But it could affect perhaps what the helper image does? 2025-12-30 08:33:47 i suppose they could be communicating, yeah 2025-12-30 08:34:22 On a normal build, I see that the helper container logs output 2025-12-30 08:40:45 seems unrelated to the changes in the MR since an essentially empty one has the same issue https://gitlab.alpinelinux.org/alpine/infra/k8s/ci-cplane-1/-/merge_requests/18 2025-12-30 08:41:16 so it must have broken either as a result of the previous change to the runners or a version change 2025-12-30 08:42:35 But it's strange that it's only this project that's affected 2025-12-30 08:43:08 that it is 2025-12-30 08:45:26 i gotta run for now 2025-12-30 08:45:36 o/ 2025-12-30 08:45:40 later 2025-12-30 09:30:44 lotheac: setting FF_KUBERNETES_HONOR_ENTRYPOINT=false immediately makes the job pass 2025-12-30 12:07:48 ikke: that's interesting. now i'm curious if the pod spec is different when setting that...
2025-12-30 12:08:04 as in, what exactly changes with that setting and how that could cause the issue we're seeing 2025-12-30 12:08:22 I think it mostly affects what the gitlab runner is doing 2025-12-30 12:08:56 One change is that it changes whether it uses args or command 2025-12-30 12:11:06 right, i mean it's possible that honoring the entrypoint causes some crucial wrapper script not to be run at startup, but that's kind of why i want to see the diff of the pod specs 2025-12-30 12:11:15 to know if that is the only thing that's going on 2025-12-30 12:11:52 probably won't be able to look very closely today though 2025-12-30 12:22:59 https://tpaste.us/PQVy 2025-12-30 12:31:11 Not sure if it's related, but the entrypoint for k8s-deploy does check if $1 = 'sh' and then executes sh with the arguments 2025-12-30 12:34:49 But not sure how that would affect the helper container 2025-12-30 14:11:45 yo, who knows how i can get an alpinelinux.org email address? 2025-12-30 14:14:53 I should know :) 2025-12-30 14:15:01 Just didn't get to it yet, just back from vacation 2025-12-30 14:16:59 oops yeah no pressure 2025-12-31 01:49:48 ikke: ok i think i finally see what is happening. FF_KUBERNETES_HONOR_ENTRYPOINT=true makes it so that https://gitlab.alpinelinux.org/alpine/infra/docker/exec/-/blob/master/k8s-deploy/entrypoint is the entrypoint that is executed. but it will always fail on line 100 when the container is first created, because nothing has yet created $CI_PROJECT_DIR. 2025-12-31 01:49:48 from the pod specs, what the gitlab runner operator must be doing is: create pod and start "stdin: true" containers "build" and "helper", with either "command" or "args" set to a thing that runs an interactive shell. that's a bit weird - so how does the actual job execution happen? i believe the runner operator must be exec'ing into these containers for that 2025-12-31 01:50:20 so my assumption is: first it execs into the helper container to do the setup, then it will exec into the builder container to run the actual build stuff 2025-12-31 01:51:19 the reason that it works with FF_KUBERNETES_HONOR_ENTRYPOINT=false is that in that case the entrypoint is overridden and both containers just don't run the defined entrypoint script at startup at all 2025-12-31 01:52:50 however the *same* script eventually gets run from the "script: [entrypoint]" definition in .gitlab-ci.yml, via exec'ing into the container (which bypasses the image entrypoint too) 2025-12-31 01:54:32 i think it will work fine even with FF_KUBERNETES_HONOR_ENTRYPOINT=true if we just remove ENTRYPOINT from the dockerfile 2025-12-31 01:55:53 in that case, the containers will start just those interactive shells regardless of whether the entrypoint (which is not defined) is being honored, and the "script:" will run via exec after the helper has set stuff up 2025-12-31 01:56:11 bit of a weird design they have used here, but it is what it is :) 2025-12-31 06:06:08 > i think it will work fine even with FF_KUBERNETES_HONOR_ENTRYPOINT=true if we just remove ENTRYPOINT from the dockerfile 2025-12-31 06:06:17 But that defeats the idea of FF_KUBERNETES_HONOR_ENTRYPOINT=true 2025-12-31 06:06:45 for that specific image, yes 2025-12-31 06:07:08 I want to figure out why it works in some cases, but not in this case 2025-12-31 06:07:34 because _this_ image's entrypoint expects the git repo to be cloned, or it fails 2025-12-31 06:07:44 and that's not the order of operations that happens 2025-12-31 06:07:53 lotheac: every image expects that 2025-12-31 06:08:14 show me another one :) 2025-12-31 06:09:16 What I expect to happen in the entrypoint is that the runner provides sh -c '..'
as arguments to the entrypoint 2025-12-31 06:09:32 it does do that 2025-12-31 06:09:52 It then executes shift 1; sh "$@" 2025-12-31 06:09:57 but '..' is apparently always just this shell script monster that finds a shell and invokes it interactively 2025-12-31 06:10:23 and the actual thing executed for the build is NOT invoked in the context of that shell at all 2025-12-31 06:10:45 (the "script:" part from .gitlab-ci.yml) 2025-12-31 06:11:24 that one, from what i gathered above, is just run as an additional exec against the existing container (whose entrypoint is just running this interactive shell forever) 2025-12-31 06:12:28 since you have the /usr/local/bin/entrypoint script as BOTH the image ENTRYPOINT, _and_ as the script specified in the gitlab-ci.yml file, it actually gets executed in both contexts if the entrypoint is honored 2025-12-31 06:13:08 and if you disable the flag to honor entrypoint, then /usr/local/bin/entrypoint is _only_ invoked via exec 2025-12-31 06:14:27 that said i am assuming a bit here: i haven't seen the actual behavior so i'm only assuming from the rbac permissions that the runner is executing (both in the helper and build containers) by creating pods/exec resources 2025-12-31 06:15:10 it _could_ technically be instead attaching to the container main process 2025-12-31 06:15:23 (which is an interactive shell) 2025-12-31 06:15:31 but that would be even weirder 2025-12-31 06:16:45 but just looking at the command: or args: of the pod specs, it's not possible that either container (build or helper) are doing anything by themselves to initiate the clone: literally they just exec a shell without args 2025-12-31 06:17:47 therefore: the cloning and other initialization must be started from outside of the pod 2025-12-31 06:18:38 and if that is the case, then it does not make sense to run an entrypoint script as the very first shell wrapper that runs in the container, which assumes cloning is already complete 2025-12-31 06:19:02 am i making 
sense? :) 2025-12-31 06:30:07 https://gitlab.alpinelinux.org/alpine/aports/-/blob/master/.gitlab-ci.yml?ref_type=heads#L24 this image for example does not define entrypoint at all. it does have COMMAND though, but that gets overridden whether or not FF_KUBERNETES_HONOR_ENTRYPOINT 2025-12-31 06:30:25 ah, sorry, s/COMMAND/CMD/ 2025-12-31 06:31:39 do we have some other image *with* an ENTRYPOINT in some job? 2025-12-31 06:56:30 lotheac: I guess the problem is that it does the `cd` before checking the arguments 2025-12-31 06:56:50 for the docker executor, that's not a problem, but for kubernetes that might be 2025-12-31 06:57:44 But I have to go now 2025-12-31 07:04:12 ikke: yes, sort of - if you made the script check for sh and execute that _before_ doing cd, then the command which runs at container startup no longer depends on the existence of CI_PROJECT_DIR 2025-12-31 07:04:47 i don't understand why that would not also be a problem for the docker executor though 2025-12-31 07:25:33 unless of course the docker executor provides an already-cloned dir as a mount to the container when it starts 2025-12-31 07:28:32 but also... i don't think we depend on FF_KUBERNETES_HONOR_ENTRYPOINT=true at present, though? looking at the previous commits we just added that as an alternative, but actually depended on the runtimeClass to solve the linux32 problem at the time 2025-12-31 07:33:33 With the docker executor, the helper image / container runs before the actual build job, I guess for kubernetes there is some difference 2025-12-31 07:36:29 yeah, okay, makes sense 2025-12-31 07:36:38 that's certainly what it looks like 2025-12-31 07:37:26 i wonder why they did not make the kubernetes executor run the helper in an initContainer.
that way it would have made it similar to docker 2025-12-31 08:48:34 But indeed, if we do not use the entrypoint at the moment, we can just as well disable that feature flag for now 2025-12-31 09:39:25 the entrypoint is needed for the linux32 prefix, yes? 2025-12-31 09:43:25 For kubernetes, we now use a runtime class to do that 2025-12-31 09:49:33 do we execute `linux32` as a prefix or do we set personality for runc? 2025-12-31 09:49:50 I noticed that runc has a setting to set personality(2) 2025-12-31 09:57:43 https://gitlab.alpinelinux.org/alpine/infra/ansible-playbooks/-/merge_requests/5 2025-12-31 09:58:36 https://gitlab.alpinelinux.org/alpine/infra/k8s/ci-cplane-1/-/commit/ed5f415439a2d19994972ffec91cf348907a62f4 2025-12-31 10:03:33 ncopa: Any reference for that? 2025-12-31 10:15:56 https://github.com/opencontainers/runc/blob/4246d6a0788c121858a1b235140ac5be696b27e3/libcontainer/init_linux.go#L677 2025-12-31 10:16:13 https://github.com/opencontainers/runc/blob/4246d6a0788c121858a1b235140ac5be696b27e3/libcontainer/configs/config.go#L237 2025-12-31 10:16:25 I haven't seen anything in any documentation though 2025-12-31 10:20:09 it's in the OCI runtime spec: https://github.com/opencontainers/runtime-spec/blob/main/config-linux.md#personality 2025-12-31 10:20:26 I have no clue how to use that from either docker or kubernetes 2025-12-31 10:41:30 Ok, the current solution works at least 2025-12-31 11:43:42 lotheac: https://gitlab.alpinelinux.org/alpine/infra/docker/exec/-/merge_requests/19 2025-12-31 12:34:06 lgtm 2025-12-31 12:40:07 Ok, now it still fails, but because K8S_CERT is not set, like before 2025-12-31 12:40:13 So I expect !18 to work 2025-12-31 12:41:18 Hmm, no, it was 17 that was in the project, but that's still failing 2025-12-31 13:07:09 well that's weird 2025-12-31 13:07:12 i'll look at it next year 2025-12-31 13:07:36 have a happy rest of this year :) 2025-12-31 13:55:11 lotheac: thanks, you as well :-) 2025-12-31 13:59:20 is build-edge-x86 stuck?
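The reordering discussed above — dispatching the `sh` argument before the `cd` — could look roughly like this. This is a sketch under the log's assumptions, not the contents of the actual merge request:

```shell
#!/bin/sh
# Sketch of the proposed fix: handle the `sh` argument *before* cd'ing,
# so the command run at container startup no longer depends on
# $CI_PROJECT_DIR existing.
entrypoint() {
    if [ "${1:-}" = sh ]; then
        shift
        sh "$@"            # the wrapper shell the runner passes at startup
        return
    fi
    cd "${CI_PROJECT_DIR:?}" || return 1
    echo "deploy logic would run here, in $PWD"
}

# At container start nothing is cloned yet, but the sh branch still works:
unset CI_PROJECT_DIR
entrypoint sh -c 'echo "startup shell ok"'   # prints: startup shell ok
```

With this shape, honoring the image entrypoint at container start becomes harmless: the startup invocation takes the `sh` branch and never touches $CI_PROJECT_DIR, while the later exec'd `script:` run still reaches the deploy logic.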
2025-12-31 16:18:56 seems build-edge-x86 is stuck, yes. I'm restarting it 2025-12-31 16:26:56 thanks! 2025-12-31 17:44:40 lotheac: the reason is that EXEC_COMMAND is set. The way the container is run in kubernetes is completely different from docker, so the entrypoint does not really work well 2025-12-31 17:45:17 in docker, the environment variables are injected later, while in kubernetes they are already part of the environment of the container 2025-12-31 18:04:01 So for now, easiest is indeed to disable FF_KUBERNETES_HONOR_ENTRYPOINT 2025-12-31 19:33:08 lotheac: Hmm, we need to specify a dedicated secret for the small-config runner, otherwise it will just be an additional runner with the same configuration (eg tags) 2025-12-31 19:43:00 https://gitlab.alpinelinux.org/alpine/infra/k8s/ci-cplane-1/-/merge_requests/20 2025-12-31 19:55:59 lotheac: The new runner works now :-) 2025-12-31 20:10:27 https://gitlab.alpinelinux.org/alpine/infra/k8s/ci-cplane-1/-/merge_requests/21
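For the interim workaround of disabling the flag: GitLab Runner feature flags can be set as ordinary CI variables, either per job or globally. A minimal sketch, with a hypothetical job name:

```yaml
# Hypothetical job showing where the flag goes; FF_* variables are read
# by gitlab-runner itself, not by the job script.
deploy:
  variables:
    FF_KUBERNETES_HONOR_ENTRYPOINT: "false"
  script:
    - entrypoint
```

The `script: [entrypoint]` line mirrors the pattern described earlier in the log, where the same /usr/local/bin/entrypoint script is also invoked via exec.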