2025-10-01 06:24:47 those branches are not used often since they are EOL, so their caches etc are probably dead cold 2025-10-01 06:37:36 ncopa: i have already updated them recently 2025-10-01 06:39:31 but I also wonder if we are low on resources on the gitlab machine, if it is our own machines causing it 2025-10-01 06:40:58 It's a pathological case 2025-10-01 06:41:09 Hard to get enough resources for that 2025-10-01 07:52:12 What we need to fix is the performance cost of those git operations. The memory usage and load caused by the pile-up of many such processes should vanish as a result 2025-10-01 07:55:57 it's only those builders? fetches done by later releases have different perf characteristics? 2025-10-01 07:57:33 Correct 2025-10-01 07:57:44 Possibly git protocol related 2025-10-01 07:57:55 git over https, i assume? 2025-10-01 07:57:59 Yes 2025-10-01 08:11:28 did minor spelunking over the release notes between v3.18 git (2.40.4) and v3.19 git (2.43.7) and came up with this wild guess https://github.com/git/git/commit/eaa0fd658442c2b83dfad918d636bba3ca3b4087 2025-10-01 08:12:14 it may very well be something different though 2025-10-01 10:27:14 lotheac: using GIT_TRACE_CURL=1, it shows protocol v2 is used: Send header: Git-Protocol: version=2 2025-10-01 10:51:44 Just a single git pull takes >1m 2025-10-01 10:53:27 real 2m 18.07s 2025-10-01 10:53:57 which brought in just 2 commits 2025-10-01 12:07:48 so separating gitaly to a dedicated server, with more memory, cpu and faster storage wouldn't help? 2025-10-01 12:08:21 maybe we should have the builders pull from the github mirror? 2025-10-01 12:08:48 im just thinking out loud...
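The protocol check described above can be reproduced locally. A minimal sketch (the /tmp/proto-demo path is an assumption, not one of the builders): pin protocol v2 in git config, and use GIT_TRACE_CURL against an https remote to see which version the client actually advertises.

```shell
# Pin the git wire protocol v2 on a scratch repo (hypothetical path).
git init -q /tmp/proto-demo
git -C /tmp/proto-demo config protocol.version 2
git -C /tmp/proto-demo config protocol.version

# Against a real https remote, the negotiated version shows up in the
# request headers (requires network access):
#   GIT_TRACE_CURL=1 git ls-remote https://gitlab.alpinelinux.org/alpine/aports.git 2>&1 \
#       | grep 'Git-Protocol'
```

Since both the 3.18-era clients and edge send `Git-Protocol: version=2`, the trace header alone does not explain the slowdown.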
2025-10-01 14:14:57 something appears to be wrong with the milk-v pioneer machine 2025-10-01 14:15:09 occasionally processes hang at 99.9% cpu 2025-10-01 14:15:28 first it was a perl process when building openssl in both 3.22-stable and 3.21-stable 2025-10-01 14:15:47 i killed the process and retried and 3.22-stable passed 2025-10-01 14:16:15 now it is a cc1 process that hangs on 3.21-stable 2025-10-01 14:17:21 retrying now 2025-10-01 14:56:38 ikke: are those findings true for both 3.18 and more recent builders? for me the diff looked like it might cause v0 instead of v2 for newer ones 2025-10-01 14:56:48 just fact-finding 2025-10-01 14:57:32 I'll check it 2025-10-01 15:05:58 lotheac: on edge: => Send header: Git-Protocol: version=2 2025-10-01 15:08:52 ncopa: we could also try to point them to git.a.o. My worry though is race conditions (builders get a notification and start pulling before the other repo is updated) 2025-10-01 15:09:34 right, so same protocol - that wasn't the problem 2025-10-01 15:10:40 i also thought about giving them a mirror to hit - it would solve the immediate problem but not the real issue 2025-10-01 15:11:14 that said, if nobody is abusing the real issue, good enough, right? :) 2025-10-01 15:16:52 i suppose we could fix the notifications.
builders could subscribe to a notification that git.a.o is synced 2025-10-01 15:17:50 We can first try and see if it's an actual issue, and then adjust if necessary 2025-10-01 15:18:09 And the other part is that <=3.18 are EOL anyway, so we do not expect many commits on them 2025-10-01 15:54:01 switched the 3.17 builders to git.a.o, and retrying them makes them finish (before they would just hang on pulling git) 2025-10-01 15:57:33 same for 3.18 2025-10-01 16:11:11 https://build.alpinelinux.org/buildlogs/build-edge-ppc64le/testing/fungw/fungw-1.2.2-r0.log 2025-10-01 16:11:23 Can't open ".SIGN.RSA./home/buildozer/.abuild/alpine-devel@lists.alpinelinux.org-58cbb476.rsa.pub" for writing, No such file or directory 2025-10-01 16:11:29 That looks creepy 2025-10-01 18:41:21 achill: not really sure what happened, but I don't see anything out of the ordinary on that builder 2025-10-01 18:41:59 at least now it seems to build again 2025-10-01 18:42:08 before it didn't even pick up new pkgs 2025-10-01 23:29:02 achill: That package pulled in dash-binsh, and the abuild-sign that failed uses /bin/sh as its shebang. And it seems like keyname=${pubkey##*/} didn't work as expected, which seems impossible to fail. 2025-10-01 23:32:11 Would need to re-check the posix spec but this looks like a dash bug: v=foo/bar dash -c 'echo ${v##*/}' 2025-10-02 00:26:58 Could anybody please backport https://git.kernel.org/pub/scm/utils/dash/dash.git/commit/?id=6dcc007a72f13c3e518a65bffef571795ad6678c 2025-10-02 05:56:55 can someone maybe check on the 3.22 riscv64 builder progress? it might be stuck 2025-10-02 10:23:36 ncopa mentioned that he noticed processes started to spin on that builder 2025-10-02 10:23:45 I see a rustc process spinning at 100% now 2025-10-02 10:29:16 I've restarted it 2025-10-02 12:41:25 thanks!
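The `${pubkey##*/}` expansion that misbehaved in abuild-sign is plain POSIX parameter expansion; as a standalone sketch it is effectively a pure-shell basename(1) and should never leave the directory prefix in place:

```shell
# ${var##*/} removes the longest prefix matching '*/', leaving the file name.
pubkey="/home/buildozer/.abuild/alpine-devel@lists.alpinelinux.org-58cbb476.rsa.pub"
keyname=${pubkey##*/}
echo "$keyname"
```

The `.SIGN.RSA./home/buildozer/...` path in the fungw log shows the full path surviving the expansion, which is what the linked dash fix addresses.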
2025-10-02 13:22:50 build-3-21-riscv64 now gets internal compiler errors 2025-10-02 13:23:06 i wonder if we should try a reboot 2025-10-02 13:24:58 We can try 2025-10-02 16:51:45 ncopa: should I reboot it? 2025-10-02 17:03:47 ok 2025-10-02 17:13:29 done 2025-10-02 17:53:14 thanks 2025-10-02 17:55:16 If you have a moment, can you check https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/90971? 2025-10-02 17:55:22 This keeps breaking the ppc64le builder 2025-10-04 20:16:01 nu_: We lost the default route on the builder 2025-10-04 20:16:10 (ipv6) 2025-10-04 20:17:43 I can still access it via the OOB network 2025-10-06 08:50:10 so the build-3-22-aarch64 is unavailable now? 2025-10-06 08:52:30 Yes, I've forwarded an email about it 2025-10-07 12:13:59 still no arm64 builder? 2025-10-07 13:27:47 Nope 2025-10-07 15:36:29 ncopa: now it is ^ 2025-10-07 15:43:39 it is back, sorry about it 2025-10-07 15:45:05 i should check this channel more often 2025-10-07 15:45:39 could i get email notifs when the builder goes down? i remember you mentioned that it would be a possibility 2025-10-07 16:12:26 nu_: right we did, I forgot about it 2025-10-07 16:13:31 To your protonmail address? 2025-10-07 16:22:47 yup 2025-10-08 06:02:06 nu_: thank you! 2025-10-08 12:37:20 welcome! 2025-10-08 18:24:50 nu_: I'm triggering an issue to test if you get the alert for it 2025-10-08 19:42:04 nu_: Did you receive the email now? 2025-10-08 22:41:17 ikke: no:/ 2025-10-09 11:09:10 clandmeter: network issues ^? 2025-10-09 11:21:04 The dmvpn router is back, but the builders not yet. The router was rebooted 2025-10-10 07:01:57 clandmeter: thanks 2025-10-10 08:37:17 looks like the power was down when i was away 2025-10-10 08:39:53 clandmeter: wb 2025-10-11 02:55:02 something's up with the gitlab x86_64 runners: Unschedulable: "0/3 nodes are available: 1 Insufficient cpu, 1 Insufficient memory, 2 node(s) had untolerated taint {node.kubernetes.io/unreachable: }.
preemption: 0/3 nodes are available: 1 No preemption victims found for incoming pod, 2 Preemption is not helpful for scheduling." 2025-10-11 02:56:31 node.kubernetes.io/unreachable sounds like a network partition between the node controller and those nodes 2025-10-12 03:48:38 s390x runner coredumped when unpacking 2025-10-12 06:55:27 qaqland: there's a musl regression affecting s390x 2025-10-14 15:07:22 im bootstrapping 3.23 builders now 2025-10-14 19:34:33 apparently the x86 CI works differently between kubernetes and x86.ci.alpinelinux.org when it comes to 64-bit detection 2025-10-14 19:35:11 https://gitlab.alpinelinux.org/alpine/aports/-/jobs/2053544 2025-10-14 19:35:23 here, when it runs on kubernetes, the architecture is detected as x64 2025-10-14 19:35:34 I'm curious how the CPU bitness leaks 2025-10-14 19:35:40 https://gitlab.alpinelinux.org/alpine/aports/-/jobs/2053531 2025-10-14 19:35:50 here, on the shared runner, it's ia32 (correct) 2025-10-14 19:36:32 panekj: it depends how python's "platform.machine()" works 2025-10-14 19:37:55 os.uname() 2025-10-14 19:37:57 unless it fails 2025-10-14 19:38:33 could it be that on kubernetes it somehow doesn't do `linux32` / whatever the equivalent would be 2025-10-14 19:39:06 well, let's try that i guess 2025-10-14 19:39:07 https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/91519/diffs#9853bba0d23ae23ea74b8cef51bc99a59d0780dd_59_59 2025-10-14 19:39:14 yes thank you algitbot 2025-10-14 19:40:02 interesting results 2025-10-14 19:40:03 https://gitlab.alpinelinux.org/alpine/aports/-/jobs/2053568 2025-10-14 19:40:04 indeed 2025-10-14 19:43:26 how is it setup to run 32-bit mode in k8s? 2025-10-14 19:44:15 I wasn't really ready to commission x86 on kubernetes yet 2025-10-14 19:45:52 oh, so x86 CI shouldn't even be on k8s? 2025-10-14 19:46:23 looking at py source it should be just `uname -p` output? 2025-10-14 19:46:38 you mean -m?
2025-10-14 19:46:48 ['uname', '-p'], 2025-10-14 19:47:07 that just prints `unknown` everywhere for me 2025-10-14 19:47:44 unknown for me too 2025-10-14 19:48:02 Kind of expected given it's non-portable 2025-10-14 19:48:41 not to mention that cpython doesn't exec `uname` or anything, it does the syscall 2025-10-14 19:48:44 so idk where `-p` came from 2025-10-14 19:49:17 The idea is that it should run with linux32 2025-10-14 19:49:55 ptrc: in class _Processor 2025-10-14 19:51:37 that's just a fallback and it doesn't execute for me 2025-10-14 19:51:53 and not like it matters, since `machine` is the field that we care about 2025-10-14 19:52:13 dunno, all python code is a disaster 2025-10-14 19:52:25 anyway the issue is known 2025-10-14 19:53:03 ikke: so it doesn't run with it now 2025-10-14 19:53:06 ? 2025-10-14 19:53:18 I would expect it to 2025-10-14 19:53:38 https://gitlab.alpinelinux.org/alpine/aports/-/blob/master/.gitlab-ci.yml?ref_type=heads#L93 2025-10-14 19:53:48 But maybe the entrypoint is ignored with the kubernetes executor 2025-10-14 19:55:42 do you have FF_KUBERNETES_HONOR_ENTRYPOINT feature flag enabled? 2025-10-14 19:56:10 Not explicitly 2025-10-14 19:56:12 "When enabled, the Docker entrypoint of an image will be honored if FF_USE_LEGACY_KUBERNETES_EXECUTION_STRATEGY is not set to true" 2025-10-14 19:57:45 uh 2025-10-14 19:58:02 idk if that applies or not 2025-10-14 20:21:38 Me neither 2025-10-14 20:21:57 I wanted to verify first before enabling it 2025-10-14 20:28:04 well, it's verified now :P 2025-10-16 09:16:41 ikke: unfortunately i still didnt get the monitoring mail:/ 2025-10-16 09:17:21 ah wait, i responded to something i had already responded to, sry, ignore me :p 2025-10-16 09:44:02 ikke: how can i remove the announcement of holding back merges to master? 2025-10-16 09:48:53 Broadcast messages in the admin panel 2025-10-16 09:57:21 done. thanks!
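For reference, on Linux `platform.machine()` just reports the machine field of `os.uname()`, so the bitness CI scripts see is decided by the kernel personality of the process, which is what the `linux32` entrypoint wrapper is supposed to flip. A sketch (`linux32` is part of util-linux and may not be installed):

```shell
# platform.machine() mirrors the kernel's uname machine field:
python3 -c 'import platform, os; print(platform.machine(), os.uname().machine)'

# Under a 32-bit personality the same x86_64 kernel reports i686,
# which is what the container entrypoint is expected to arrange:
#   linux32 uname -m
```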
2025-10-16 13:26:40 I asked around in the #gcc channel about the ICE segfaults we have when building clang on build-3-23-riscv64 2025-10-16 13:26:49 they say it is hardware 2025-10-16 13:26:58 so we have unreliable hardware for riscv64 2025-10-16 13:27:39 I am copying over the lxc to the other milkv machine to see if it makes any difference 2025-10-16 13:43:21 i continued the build on both machines and on nld-bld-1 it segfaulted again 2025-10-16 13:43:45 nld-bld-2 it is still running 2025-10-16 13:44:25 i think nld-bld-1 is flaky. not sure what the differences between them are 2025-10-16 13:58:20 this is the diff: https://tpaste.us/e6wV 2025-10-16 13:58:37 i think we should consider replacing the nvme storage 2025-10-16 14:23:10 clandmeter: I wonder if you could order new storage for nld-bld-1 (172.16.30.2) maybe a samsung evo 990 pro 1TB something 2025-10-16 14:23:49 the samsung 970 evo on the nld-bld-2 appears to have been stable 2025-10-16 16:15:57 ncopa: so it's the storage that causes the issue? 2025-10-16 16:56:11 ikke: i dont know. could also be insufficient cooling 2025-10-16 16:56:31 maybe i should have checked the temp sensors if there are any 2025-10-16 16:57:03 but that was the only thing I could see that differs on the hardware https://tpaste.us/e6wV 2025-10-16 16:58:25 Both are below 50° 2025-10-17 16:01:30 ncopa: i have it already 2025-10-17 16:01:37 just didnt have time to copy it over 2025-10-17 16:01:43 we got it sponsored last time 2025-10-17 16:03:13 and yes i think it's ssd related 2025-10-19 09:16:48 nu_: I'm checking one more time what happens when Zabbix tries to send an email to you 2025-10-19 16:21:59 oki 2025-10-19 16:22:07 i didnt get a mail:/ 2025-10-19 22:03:02 could someone please check on the 3.23 x86_64, armv7 and aarch64 builders to see if the build is still progressing?
they may be having a bit of trouble with the same package 2025-10-20 09:32:12 🚪knock knock: ERROR: Job failed: failed to pull image "alpinelinux/gitlab-runner-helper:latest" with specified policies [always]: Error response from daemon: Head "https://registry-1.docker.io/v2/alpinelinux/gitlab-runner-helper/manifests/latest" ;: received unexpected HTTP status: 503 Service Unavailable (manager.go:250:0s) 2025-10-20 09:40:28 achill: known AWS issue 2025-10-20 09:40:37 Seems to be recovering 2025-10-20 09:40:59 ahhhhh i see 🙃 2025-10-20 11:33:43 putting all eggs in one basket as a service 2025-10-20 13:25:58 i have done go bootstrap on build-3-23-x86 2025-10-20 13:26:14 doing go bootstrap on build-3-23-ppc64le now 2025-10-20 13:26:21 👍 2025-10-20 13:43:46 also build-3-23-loongarch64 2025-10-20 14:00:09 bootstrapping openjdk8 on build-3-23-armhf 2025-10-20 14:01:06 rm: can't remove '/home/buildozer/aports/community/go/tmp/go-tool-dist-2567746971/fips-v1.0.0-c2097c7c/golang.org/fips140@v1.0.0-c2097c7c/LICENSE': Permission denied 2025-10-20 14:01:14 im getting several of those on loongarch64 2025-10-20 14:02:06 it fails to delete tmp/ 2025-10-20 14:25:56 Yeah, me too 2025-10-20 15:37:25 i have bootstrapped openjdk8 on build-3-23-aarch64 2025-10-20 15:37:31 thanks meow :3 2025-10-20 19:10:46 nu_: I finally was able to grab the logs, I typoed your email address 2025-10-20 19:22:05 nu_: status=sent (250 2.0.0 Ok: queued as 4cr4yv66Fkz3n 2025-10-21 06:21:00 morning! im stopping builders due to trouble building ocaml 2025-10-21 06:28:57 mornya 2025-10-21 06:29:00 im still so tired aaaaaa 2025-10-21 08:47:38 ikke: nld-bld-4 appears to be stuck. I cant even kill -9. I think I'm gonna upgrade the kernel and reboot? 2025-10-21 08:48:05 build-3-23-x86_64 is stuck.
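The `rm: can't remove ... Permission denied` failures above are consistent with how go populates its module cache: entries are written read-only, so deleting the tmp/ tree needs the write bits restored first (go itself offers `go clean -modcache`, and the `-modcacherw` build flag avoids the problem up front). A sketch with a hypothetical scratch path:

```shell
# Recreate the shape of a module cache entry: read-only dir + file.
mkdir -p /tmp/gocache-demo/fips140@v1.0.0
touch /tmp/gocache-demo/fips140@v1.0.0/LICENSE
chmod -R a-w /tmp/gocache-demo

# A plain rm -rf fails for an unprivileged owner at this point;
# restoring the write bits first makes the removal succeed.
chmod -R u+w /tmp/gocache-demo
rm -rf /tmp/gocache-demo
```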
cannot reboot it 2025-10-21 08:48:09 the container 2025-10-21 08:48:41 ncopa: ok 2025-10-21 10:49:29 There are multiple processes in disk sleep 2025-10-21 10:51:43 rebooting 2025-10-21 11:07:37 hang on a sec 2025-10-21 11:07:59 lets wait for chromium to complete the build 2025-10-21 11:08:15 so it is something with the disks? 2025-10-21 11:26:22 not necessarily 2025-10-21 11:26:32 But it's an uninterruptible sleep 2025-10-21 11:26:41 dmesg does not show anything 2025-10-21 11:27:02 in any case, no reboot command seems to work anyway 2025-10-21 11:33:15 bootstrapping openjdk8 on armv7 2025-10-21 12:58:21 ikke: monitoring received 2025-10-21 12:58:23 thanks! 2025-10-21 13:01:45 Good 2025-10-21 14:13:37 Rebooted ^ 2025-10-21 15:22:47 has gcc ice on stable riscv64 been resolved? getting an ice on u-boot rebuild on 3.20, should the upgrade linked to the rebuild be reverted for now? 2025-10-21 15:23:06 3.20? 2025-10-21 15:23:09 yeah 2025-10-21 15:23:29 Have not heard of one there 2025-10-21 15:23:47 https://build.alpinelinux.org/buildlogs/build-3-20-riscv64/main/u-boot/u-boot-2024.04-r7.log 2025-10-21 15:25:05 Possibly related to the host / disks 2025-10-21 15:25:29 At least, I've heard ncopa mention that some error disappeared once he moved it to another host 2025-10-21 15:26:20 yeah, possibly related 2025-10-21 18:41:38 bootstrapping go on riscv64 2025-10-22 03:45:37 rebooting, https://tpaste.us/ 2025-10-22 05:45:21 running a zpool scrub ^ 2025-10-22 06:34:56 Hmm 2025-10-22 17:39:19 rebooting the x86 builder host 2025-10-22 18:02:58 ncopa: nld-bld-3 had some issues loading iptables, so I've upgraded it to 3.22 now, and everything is working again 2025-10-22 18:44:25 upgrading and rebooting the x86_64 builder host 2025-10-23 06:22:41 nice! thank you! 2025-10-23 10:56:02 I wonder if we should create an email alias moderation@alpinelinux.org or similar for CoC topics 2025-10-23 11:21:36 send and receive emails.
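Processes in uninterruptible (disk) sleep show up with state D and ignore every signal, including kill -9, which is why no reboot command got anywhere; they can at least be enumerated:

```shell
# List processes whose STAT starts with 'D' (uninterruptible sleep).
ps -eo pid,stat,comm | awk 'NR==1 || $2 ~ /^D/'

# With procps, adding the wait channel hints at where in the kernel
# they are stuck:
#   ps -eo pid,stat,wchan:32,comm | awk 'NR==1 || $2 ~ /^D/'
```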
maybe change the help@a.o to moderation@a.o 2025-10-23 11:22:05 we should also try to on-board moderators, who can help us deal with CoC incidents 2025-10-23 11:29:09 Yes, we should 2025-10-23 15:17:43 bootstrapping openjdk21 on aarch64 2025-10-23 19:01:28 bootstrapping openjdk17 on aarch64 2025-10-24 06:31:35 something wrong with build-edge-x86_64? seems like it's stuck on testing/electron-38.4.0-r0, the MR for which was merged 21h ago !91957 2025-10-24 06:48:35 You underestimate chromium 2025-10-24 06:51:27 well, electron's gitlab CI build on x86_64 was finished in 237 minutes, so i would not really expect 21h :) although i don't really know when the electron build started on the builder 2025-10-24 06:53:49 (also none of the other arches are building that anymore) 2025-10-24 06:58:20 the builder was busy before building another chromium 🙃 2025-10-24 06:59:19 i suppose that would explain it 2025-10-25 13:07:27 if the standard isos can be built for riscv64 (https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/90444), what's preventing them from being added to dl-cdn? 2025-10-25 13:10:40 AnIDFB: it would be done at the next release 2025-10-25 13:11:56 ikke: 3.23.0 or 3.22.3? (if the latter's even happening) 2025-10-25 13:13:36 3.23.0 2025-10-25 13:14:41 ikke: is this planned or just a maybe? 2025-10-25 13:15:40 It's part of mkimg.standard, so it would be automatic 2025-10-25 13:16:52 ikke: alright, thanks for the info! 2025-10-26 19:43:39 hmm, i cant reach build.alpinelinux.org 2025-10-26 20:30:25 fossdd: same 2025-10-27 10:28:22 clandmeter: we have issues with the stable riscv64 builders (3.2*) 2025-10-27 10:28:37 they need manual life support to build anything nowadays 2025-10-27 10:48:10 We also have serious issues with the distfiles zfs pool 2025-10-27 10:48:16 I'm making a backup now 2025-10-27 10:49:23 Anyone with decent zfs experience here? 2025-10-27 10:59:09 whats up with the zfs pool?
2025-10-27 11:01:21 https://tpaste.us/avgK 2025-10-27 11:01:38 yesterday the server failed to boot because it panicked on import 2025-10-27 11:02:11 I had to boot the server without the zfs disk, disable the services before I could boot 2025-10-27 11:02:45 then reboot with the disk attached, and then set 2 sys parameters before I could even import it 2025-10-27 11:03:01 echo 1 > /sys/module/zfs/parameters/zil_replay_disable 2025-10-27 11:03:03 echo 1 > /sys/module/zfs/parameters/zfs_recover 2025-10-27 11:10:43 aw 2025-10-27 11:12:48 So curious if this is recoverable or whether we need to recreate it 2025-10-27 12:16:35 hi 2025-10-27 12:17:04 ncopa: what is manual life support? 2025-10-27 12:17:21 manual intervention? 2025-10-27 12:17:34 the disk issue you addressed previously? 2025-10-27 12:18:44 ikke: which server is this? location 2025-10-27 12:36:39 clandmeter: deu5-dev1 linode 2025-10-27 12:36:59 zfs in the cloud 2025-10-27 12:37:02 ? 2025-10-27 12:37:11 Yes, an additional volume 2025-10-27 12:40:14 I was under the impression zfs wants the whole physical disk assigned to it. 2025-10-27 15:15:28 my zfs experience says: if data corruption has been detected, it is best to recreate the entire pool (but you may zfs send|zfs recv backed-up data into it) 2025-10-27 15:16:14 does this pool have redundancy? ie. mirrors or raidz 2025-10-27 15:16:29 ah i guess not based on the tpaste -- just one disk 2025-10-27 15:17:34 that usually means: that one disk is bad, replace it with a new one and restore the data from backup 2025-10-27 15:17:53 didn't know people used zpools in that cloudy way 2025-10-27 15:18:25 (makes sense, but even so, i wonder if cloud disks lie) 2025-10-27 15:18:51 well, why wouldn't they? many features are useful even with virtual disks 2025-10-27 15:22:10 that said... if this is a cloud disk, then it stands to reason it's a zfs bug rather than a problem with the storage device 2025-10-27 15:22:56 yeah.
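For the record, the two module parameters quoted above plus a readonly import are the usual last-resort sequence for a pool that panics during ZIL replay. This is a sketch only (the pool name tank is a placeholder), and a pool salvaged this way should have its data copied off and the pool recreated rather than trusted:

```shell
# Last-resort import of a pool that panics on ZIL replay.
# "tank" is a placeholder pool name; readonly avoids replaying the ZIL.
echo 1 > /sys/module/zfs/parameters/zfs_recover
echo 1 > /sys/module/zfs/parameters/zil_replay_disable
zpool import -o readonly=on tank

# Then copy the data off (rsync, or zfs send of pre-existing snapshots)
# into a freshly created pool, and destroy the damaged one.
```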
we could tee that up on -offtopic but i knew someone who worked at joyent 2025-10-27 15:23:36 i knew a bunch of people from there, but they never cared about "linux bugs", so i think openzfs is the better address :p 2025-10-27 15:25:36 and i think upstream bugs are not offtopic 2025-10-27 15:26:41 we probably know some of the same people 2025-10-27 15:29:44 it's a small world :) 2025-10-27 15:53:12 clandmeter: builds like kernel, firefox etc deadlock. I have to kill something, and manually try to continue the build. repeat until it finishes 2025-10-27 15:58:54 ncopa: reason? 2025-10-27 16:36:04 dont know exactly, but it is hardware related 2025-10-27 16:36:11 since its not consistent 2025-10-27 16:36:52 i did a comparison of the hardware between the two boxes, and the disk is different 2025-10-27 16:37:08 so I suspect it would help to replace the nvme storage 2025-10-28 07:56:37 ncopa: i can insert a second nvme 2025-10-28 07:56:54 need someone to duplicate it and we can try switching 2025-10-28 08:23:53 ok. would be good 2025-10-28 08:43:16 clandmeter: looks like you plugged it in. do you need help with cloning it? 2025-10-28 08:49:22 ncopa: can you do a live clone? 2025-10-28 08:49:33 not sure i have proper boot material 2025-10-28 08:53:32 i can probably live clone it. not sure how the bootloader is set up though 2025-10-28 10:07:45 hum 2025-10-28 10:07:47 [ 5484.120258] nvme1n1: Write(0x1) @ LBA 1347235040, 1016 blocks, Data Transfer Error (sct 0x0 / sc 0x4) 2025-10-28 10:07:47 [ 5484.120278] I/O error, dev nvme1n1, sector 1347235040 op 0x1:(WRITE) flags 0x4000 phys_seg 127 prio class 2 2025-10-28 10:08:37 got io errors on the new disk :-/ 2025-10-28 10:26:25 not enough power? 2025-10-28 10:27:26 are we running the same kernel and BSP?
2025-10-28 10:27:57 wow dmesg is like xmas tree 2025-10-28 10:33:31 i think it is the same kernel 2025-10-28 10:33:54 i did an fsck and continued the rsync 2025-10-28 10:33:59 im checking the temp 2025-10-28 10:34:26 temp appears to be acceptable 2025-10-28 10:34:33 distfiles backup is still ongoing 2025-10-28 10:49:09 I'm gonna stop shared-runner nor-ci-2 (aarch64) for a week or so. I'm gonna need it for other stuff 2025-10-28 11:03:59 Can you pause it in the admin panel? 2025-10-28 11:04:15 i did 2025-10-28 11:09:14 thanks 2025-10-28 11:11:13 We list distfiles again, rebooting 2025-10-28 11:11:49 WARNING: zfs: adding existent segment to range tree (offset=2365157c00 size=800) 2025-10-28 11:11:52 lost* 2025-10-28 11:35:32 so we also have issues with the distfiles storage 2025-10-28 11:46:40 yes 2025-10-28 11:46:50 But it's not storage we manage 2025-10-28 11:47:04 but I have a feeling it has to do with networking 2025-10-28 11:47:18 ok 2025-10-28 12:09:04 hi, just curious, how much ram does this device with new nvme have? 2025-10-28 12:19:56 if its more than 32Gb, a simple disk transfer test can be done with 4Gb to 16Gb img on /tmp, also make sure data transfer rate is noted 2025-10-28 12:20:49 simple tests for bottle-necks in data transfer 2025-10-28 13:45:37 ncopa: sync ok? 2025-10-28 14:38:05 yes. sync is ok 2025-10-28 14:38:57 clandmeter: i assume that boot options are in /boot/extlinux/extlinux.conf? 2025-10-28 14:39:14 i think so yes 2025-10-28 14:39:19 the BSP reads it 2025-10-28 14:39:42 ok. i just change the root partition UUID and we should be good? 
2025-10-28 14:39:59 we may need to remove the old ssd 2025-10-28 14:40:03 nvme 2025-10-28 14:40:18 the bsp will search for the extlinux 2025-10-28 14:40:23 so not sure which one it will find 2025-10-28 14:40:37 or maybe both and add them to the boot screen 2025-10-28 14:40:43 i think removing the old is a good idea anyways 2025-10-28 14:41:22 power it off and let me know 2025-10-28 14:41:27 i have some time now 2025-10-28 14:43:07 are you logged in and have /mnt open? 2025-10-28 14:44:00 im powering it off now 2025-10-28 14:46:43 clandmeter: it is off now 2025-10-28 15:13:25 ncopa: it failed to enable nvme 2025-10-28 15:13:38 i can try another boot 2025-10-28 15:17:18 ok now it passed 2025-10-28 15:25:07 I would still check dmesg ;) 2025-10-28 15:36:22 ncopa: https://wiki.gentoo.org/wiki/Milk-V_Pioneer_Box#NVMe_is_flaky 2025-10-28 15:44:09 i think we could build a mainline kernel, looks like almost everything is already merged 2025-10-28 15:44:17 seems there is also a new pcie driver 2025-10-28 15:44:54 looks like we are only missing some DTS stuff which we could patch in 2025-10-28 15:47:05 https://github.com/sophgo/linux/wiki#sg2042 2025-10-28 15:47:47 not sure if this is a complete list, but we would need 6.18 2025-10-28 16:58:08 network appears to not work in the lxc containers? 
2025-10-28 17:22:33 Probably awall or similar 2025-10-28 22:07:53 I haven't taken the time to fix it properly with awall 2025-10-28 22:07:55 iptables -P FORWARD ACCEPT 2025-10-28 22:29:38 ncopa: we should add a timeout to the builders when they try to upload the buildlogs 2025-10-28 22:46:02 just a thought, instead of pushing buildlogs let build.a.o pull them in at a slow and consistent speed, trigger via mqtt might be needed 2025-10-28 22:46:39 or redis 2025-10-28 22:51:52 People like the buildlogs to be available immediately 2025-10-28 23:01:51 should not make lots of difference, another benefit is, builders can continue with the next build on success 2025-10-28 23:02:18 pull speed can be optimised/throttled via mqtt msgs 2025-10-28 23:05:00 It makes everything a lot more complex 2025-10-28 23:09:42 ok 2025-10-29 08:21:07 morning. looks like the riscv64 builders are online 2025-10-29 08:21:26 thanks ikke and clandmeter 2025-10-29 08:22:29 oh, and now we have a disk error :/ 2025-10-29 08:23:25 and it remounted read-only 2025-10-29 08:23:32 maybe I was too optimistic about this builder 2025-10-29 08:34:42 ncopa: did you read my msg? 2025-10-29 08:35:10 regarding 6.18 and the new pcie driver 2025-10-29 08:35:46 yeah, i havent read the links though 2025-10-29 08:35:48 will do so now 2025-10-29 08:36:44 we could also try to switch to sata 2025-10-29 08:40:58 do you have a sata disk handy available? not sure how big an impact it has on performance 2025-10-29 08:41:08 yeah look like we can use 6.18 kernel 2025-10-29 08:41:13 vanilla 2025-10-29 08:41:23 with the patches for dts 2025-10-29 08:41:27 that would be great 2025-10-29 08:42:36 it already has a ssd 2025-10-29 08:42:52 i need to look if i have some old ones 2025-10-29 08:43:17 i think sata would be the best short time solution 2025-10-29 08:43:33 once 6.18 is out we can try switch to that 2025-10-29 08:44:19 i guess we could try 6.18 rc? i guess the changes are already in?
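The `iptables -P FORWARD ACCEPT` stopgap quoted above opens forwarding for everything; a slightly tighter interim rule set, until a proper awall policy exists, might look like this (the lxcbr0/eth0 interface names are assumptions about the host, not taken from the log):

```shell
# Keep default-deny forwarding; allow only lxc guests out and replies back.
# Interface names are assumptions, adjust to the host.
iptables -P FORWARD DROP
iptables -A FORWARD -i lxcbr0 -o eth0 -j ACCEPT
iptables -A FORWARD -i eth0 -o lxcbr0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
```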
2025-10-29 08:44:21 im currently at a conference so it will be limited what I can do 2025-10-29 08:44:26 yup we could 2025-10-29 08:45:16 there's a new v2.3.4 of openzfs 2025-10-29 08:45:23 backup would be to switch to https://semiconductor.samsung.com/consumer-storage/internal-ssd/870evo/ 2025-10-29 08:46:47 sounds good 2025-10-29 08:47:11 im rebooting it, since it is readonly 2025-10-29 08:47:29 it cant do anything in its current state 2025-10-29 08:48:52 according to the gentoo wiki page, switching to uboot solves the boot issues 2025-10-29 08:52:09 oh, it's already at v2.3.4 2025-10-29 09:01:27 ncopa: looks like the nvme is gone mia 2025-10-29 09:01:38 even the bootloader doesnt see it anymore 2025-10-29 09:09:27 we don't have an nvme adapter, just to check if it's an nvme or nvme slot issue? 2025-10-29 09:27:08 tried a few things, but something is broken now. i hope its the nvme drive... 2025-10-29 09:36:10 if usb adapter v3.1 is available, it can be a temporary solution 2025-10-29 09:36:51 i have one that has a type-c (with usb converter) 2025-10-29 09:44:48 I don't have hot-swap gizmos, opening hardware is sometimes a pain, so i keep a couple of them 2025-10-29 10:01:41 just for info, type-c is now a standard in EU (iirc, since yr 2021) 2025-10-29 11:39:02 inserted the old nvme back, it's shown but had to boot twice. something is not working as it should. not sure what it is. 2025-10-29 11:57:57 is it possible to download specs pdf for both hardware and 'new nvme'? 2025-10-29 12:01:03 read about latest truenas (debian), seems to have implemented new stuffs 2025-10-29 12:18:28 whatta 2025-10-29 12:25:01 if i boot, qemu-system-x86_64 -enable-kvm -m 1024 -cdrom alpine-extended-3.22.0-x86_64.iso -boot d -nographic -display curses 2025-10-29 12:25:14 why do i see boot:boot twice, https://tpaste.us/5Qxy, just curious 2025-10-29 16:50:47 fyi in the previous arm+nvme the samsung ssd produced weird io errors. switching to corsair fixed it.
2025-10-29 16:51:36 *previous arm+nvme case 2025-10-30 11:05:20 rebooting again 2025-10-30 13:20:09 :( 2025-10-30 14:01:47 I'm trying to backup the data