2024-11-01 00:05:42 https://build.alpinelinux.org/buildlogs/build-edge-s390x/main/mariadb/mariadb-11.4.3-r2.log failing with No space left on device. 2024-11-01 08:41:35 This is annoying 2024-11-01 09:19:54 do we know why it happens? 2024-11-01 09:30:11 Not yet, no obvious large amount of requests (there are many coming from ips from Alibaba), but they get a nice 429 response 2024-11-01 09:30:22 So perhaps a specific request that is expensive 2024-11-01 10:49:15 Uztelekom BGP issue maybe 2024-11-01 13:14:09 we are out of diskspace on s390x builder 2024-11-01 13:23:34 Yup 2024-11-01 13:24:52 We could disable and remove community for eol releases 2024-11-01 14:02:33 maybe I should backup some of the older releases 2024-11-01 14:02:44 and maybe we should ask IBM if they can help 2024-11-01 14:03:20 i think we should ask IBM first 2024-11-01 14:04:03 just explain the problem 2024-11-01 18:11:12 ncopa: 80G free now on usa2 2024-11-01 18:11:39 But that's about all we can get atm 2024-11-02 08:24:46 The load on that server increased over the last 7 days 2024-11-02 08:25:18 (and correlates with requests to cgit) 2024-11-02 10:55:05 Looking at zabbix, there are frequent bursts of requests 2024-11-02 10:55:14 periodic 2024-11-02 15:39:05 ikke: a much better channel for this, thanks. 2024-11-02 15:39:12 durrendal: welcome :-) 2024-11-02 15:39:27 FYI, it's currently a bit spammed by issues with git.a.o, which I'm working on 2024-11-02 15:39:32 (figuring out what's causing it) 2024-11-02 15:40:30 No worries at all, I'm used to noisey monitoring systems and reading through the noise 2024-11-02 15:40:58 I'm trying to prevent noisy monitoring, but I don't want to hide any issues :) 2024-11-02 15:42:06 What are your experiences with working on infrastructure? 2024-11-02 15:44:29 Easy peasy, just fix everything right ;) just a heads up, I may be a bit slow to respond, out with the kids at the moment. 2024-11-02 15:46:04 Insofar as experience goes I've been a linux sysadmin professionally for about 10 years. Currently my day job leans more towards devops heavy on the infrastructure automation side. 2024-11-02 15:46:36 right, and no worry 2024-11-02 15:47:26 Our infrastructure is mostly docker compose or lxc containers 2024-11-02 15:47:34 Mostly maintained by hand 2024-11-02 15:47:38 That devops role has been the last 5 years of my career, lots of ansible, saltstack, terraform. Mixed Debian, Alpine, and windows environments. 2024-11-02 15:49:17 I have loads of docker experience. Not a bunch with lxc directly. But Ive used both LXD and now incus in my lab environments and some on prem production environments. 2024-11-02 15:49:53 lxc is fairly simple 2024-11-02 15:50:07 Just a bunch of containers that act like VMs 2024-11-02 15:54:07 That was my understanding of it. Incus and lxd just provide a nice cli on top of lxc which makes it easier to manage in my opinion. 2024-11-02 15:54:32 I would certainly be amenable to learning lxc though, I don't think that'd be a problem. 2024-11-02 15:55:58 What we use lxc for is quite limited though 2024-11-02 15:56:08 Well, all the builders run it 2024-11-02 15:56:16 but that does not take a lot of maintenance 2024-11-02 16:00:50 That makes sense, and tracks the prior conversation in -linux. The CI is just gitlab-runners running docker containers? 2024-11-02 16:01:01 yes 2024-11-02 16:01:02 What's currently used for monitoring? 2024-11-02 16:01:06 zabbix 2024-11-02 16:05:44 Excellent choice! 
Huge fan of Zabbix personally, it's a solid monitoring systematic, especially for typical infrastructure. 2024-11-02 16:06:07 s/systematic/system/ 2024-11-02 16:06:33 I have extensive experience with it :) 2024-11-02 16:08:13 Some graphs I'm currently watching https://imgur.com/a/5CfaxaR 2024-11-02 16:08:37 We're also using terraform a bit: to manage linode and gitlab 2024-11-02 16:09:07 Just some things, not everything (yet, at least) 2024-11-02 16:17:33 Same here, I've done several implementations of Zabbix at each job I've had. It's like a stalwart companion. 2024-11-02 16:18:03 And those are wicked nice dashboards! I'm actually surprised the load on things like cgit isn't higher honestly. 2024-11-02 16:18:10 The mailserver? 2024-11-02 16:21:28 Sorry I might not be following there, the mailserver? 2024-11-02 16:22:34 Stalwart is the name for a mailserver: https://stalw.art/ 2024-11-02 16:22:44 But I guess you refer to something else 2024-11-02 16:28:16 Oh I hadn't heard of that before, I meant that Zabbix was a stalwart companion. 2024-11-02 16:33:27 ah ok, I'm not too familiar with that term, hence the confusion :) 2024-11-02 16:43:02 No worries at all :) 2024-11-02 16:43:58 Do you currently track issues anywhere? Like a ticketing system or something? 2024-11-02 16:43:59 I'm currently trying to implement rate limiting on nginx based on AS 2024-11-02 16:44:18 durrendal: yes, https://gitlab.alpinelinux.org/alpine/infra/infra 2024-11-02 16:54:34 I had to do exactly that recently with my blog and Gitea instance. All of the LLM scraping has caused ridiculous load on my own servers. 2024-11-02 16:56:44 I'm not sure how the nginx configs are set up, I didn't see any of the configs in the gitlab/infra project, but you should be able to define a snippet that has an array of user agent strings and then set a rate limit in an if statement based on whether that user agent matches. 2024-11-02 16:57:09 durrendal: I'm going to use the geo feature (I'm already using that for gitlab) 2024-11-02 16:57:25 It allows me to map the IP address to ASN 2024-11-02 16:57:42 Not trying to be comprehensive, just the ones that I think are causing issues 2024-11-02 16:58:53 That's a good idea, these things tend to be in large ipsets anyways. And unlike my personal site, you probably don't want to just block the traffic at the firewall 2024-11-02 16:59:27 I have been doing that as well, with ipsets 2024-11-02 17:01:30 My latest trick is responding with 444 if trying to register a user from VPS / Cloud IPs 2024-11-02 17:01:41 At least, the ones that have been spamming users 2024-11-02 17:26:18 I've heard about that just hanging around on IRC, it makes sense to block anything strange like that. Even if it breaks one or two people's attempts to register. At least real people can report it as an issue in that scenario. 2024-11-02 17:27:57 ahuh 2024-11-02 17:28:08 The only annoying one remains AT&T 2024-11-02 18:16:33 Is there something specific to AT&T that's different from other LTE providers? Or is it just anything that implements a CGNAT network structure? 2024-11-02 18:16:57 Not sure if there's anything specific except they seem to attract a lot of spammers 2024-11-02 19:52:51 That is very odd. 2024-11-02 19:53:45 Out of curiosity, all of the infrastructure runs Alpine right? Are they on stable releases or are some running on edge? 2024-11-02 19:54:03 All running alpine, all stable releases 2024-11-02 19:55:05 We dogfood our own OS :) 2024-11-02 20:00:57 Haha I was hoping that was the answer!
Barring missing packages for something I can't imagine a reason you wouldn't, and even then that's just impetus to package things. 2024-11-02 20:18:03 How much time would you say the infrastructure team spends on fighting fires, regular maintenance, and innovation? 2024-11-02 20:19:30 Oof, good question. Fighting fires can vary quite a lot 2024-11-02 20:19:54 Generally our infra is quite stable, so it's not that we are constantly fighting fires 2024-11-02 20:20:37 I don't have good numbers though 2024-11-02 20:21:04 I do kind of iterate through each of those activities though 2024-11-02 20:37:26 It absolutely can, but the fact that you don't have an off the cuff answer I think attests to the stability of the infrastructure. 2024-11-02 20:38:07 I mean, I've been around 6ish years I think, I don't personally remember massive outages during that time period. 2024-11-02 20:41:44 I've now added rate limiting based on ASN :) 2024-11-02 20:42:00 Checking if it works 2024-11-02 20:50:35 Yes, it's rate limiting :) 2024-11-02 20:52:03 \o/ nicely done! 2024-11-02 20:52:15 Haha I forgot algitbot does that 2024-11-02 20:52:26 Just a single AS 2024-11-02 20:52:47 Does that rate limit apply to all of the alpinelinux domains? 2024-11-02 20:52:52 No 2024-11-02 20:52:57 Just a single application 2024-11-02 20:53:48 Ah, that makes sense. Gitlab I assume given the user registration 2024-11-02 20:54:04 yes, but there it's not rate limiting 2024-11-02 20:54:14 Here it's about cgit 2024-11-02 20:54:37 lots of scanners use git.a.o 2024-11-02 20:54:44 This is a single AS 2024-11-02 20:54:48 All from TENCENT 2024-11-02 20:56:34 198 IPs receiving 429 responses 2024-11-02 20:57:15 the module provides stats? 2024-11-02 20:58:26 It probably does, but I'm parsing the access log 2024-11-02 20:59:00 now 2563 IPs :-| 2024-11-02 20:59:04 Lots of IPs 2024-11-02 20:59:26 But the most important part, no more 50x responses during a spike 2024-11-02 21:00:16 https://imgur.com/a/qJJkKLn 2024-11-02 21:03:18 I've found a free database with subnets for each AS. I created a small program that parses that, and generates a list of subnet-to-AS mappings that the nginx geo module takes 2024-11-02 21:03:36 https://iptoasn.com/ 2024-11-02 21:08:51 I must say I quite like the geo module from nginx 2024-11-02 21:09:30 can we put it somehow behind fastly? 2024-11-02 21:10:03 Not sure it will help 2024-11-02 21:10:14 Though it could 2024-11-02 21:10:58 But not sure if we should support this abusive traffic 2024-11-02 21:12:25 They're scraping it with thousands of IPs 2024-11-02 21:12:43 maybe scraping with k8s 2024-11-02 21:14:42 Our secdb receives 30 requests/s, but it's just static data 2024-11-02 21:16:05 And many of the requests are 304 not modified 2024-11-02 21:27:56 There are better, more respectful, ways to gather data than scraping information with thousands of IPs.. 2024-11-02 21:29:24 And especially with git, you just clone the repo once, and you can scrape it locally all you want 2024-11-02 21:29:49 And next, you just fetch the updates 2024-11-02 21:30:02 But people would rather just point their webcrawlers at a webpage 2024-11-02 21:31:51 I used lnav to get an idea where the requests came from 2024-11-02 21:32:16 (lnav can parse logs, and allows you to perform sql-like queries on them) 2024-11-02 21:35:57 Lnav is an excellent tool :D you might like promtail, Loki, & grafana for log ingestion and discovery if you like lnav 2024-11-02 21:36:23 Lets you do the same sort of thing out of band.
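A minimal sketch of what the ASN-based rate limiting described above could look like in nginx (the AS number, prefix, zone name, rate and upstream are invented for illustration; the real configuration is not reproduced here):

    # http context
    # Map the client address to an AS number via CIDR prefixes; the list
    # itself would be generated from the iptoasn.com dump mentioned above.
    geo $client_asn {
        default        0;
        43.134.0.0/16  132203;   # example prefix for the offending AS
    }

    # Only requests from the listed AS get a rate-limit key; everything else
    # maps to "" and is therefore not counted against the zone.
    map $client_asn $asn_limit_key {
        default "";
        132203  "as132203";
    }

    limit_req_zone $asn_limit_key zone=per_asn:10m rate=5r/s;

    upstream cgit_backend {
        server 127.0.0.1:8080;   # stand-in for the real cgit backend
    }

    server {
        listen 80;
        server_name git.example.org;   # stand-in for the real vhost

        location / {
            limit_req zone=per_asn burst=20 nodelay;
            limit_req_status 429;
            proxy_pass http://cgit_backend;
        }
    }

The prefix list itself could be produced from the free https://iptoasn.com/ dump; a rough converter, assuming the ip2asn-v4.tsv layout of range_start, range_end, AS number, country and description, might look like:

    import csv
    import ipaddress
    import sys

    WANTED_ASNS = {"132203"}  # purely illustrative

    def main(tsv_path: str) -> None:
        with open(tsv_path, newline="") as fh:
            for start, end, asn, _country, _desc in csv.reader(fh, delimiter="\t"):
                if asn not in WANTED_ASNS:
                    continue
                # The dump uses arbitrary ranges, so expand them into the CIDR
                # blocks that nginx's geo module expects.
                for net in ipaddress.summarize_address_range(
                    ipaddress.ip_address(start), ipaddress.ip_address(end)
                ):
                    print(f"    {net}  {asn};")

    if __name__ == "__main__":
        main(sys.argv[1])

For ad-hoc digging in the access log itself, an lnav query along the lines of ;SELECT c_ip, count(*) AS hits FROM access_log GROUP BY c_ip ORDER BY hits DESC LIMIT 20 (column names per lnav's built-in access_log format) gives a quick per-IP breakdown.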
2024-11-02 21:37:07 In fact, I have an instance running 2024-11-02 21:37:20 But not everything is fed inti it 2024-11-02 21:37:22 into it 2024-11-02 21:37:28 (only loki, no grafana) 2024-11-02 21:37:49 I have been playing a bit with kubernetes 2024-11-02 21:40:57 Great minds think alike :) grafana is just visualization anyways, Loki's where the cool toys are. 2024-11-02 21:41:12 good visualization can help make sense of the data 2024-11-02 21:41:17 just didn't get to deploy it yet 2024-11-02 21:41:19 What application do you have in mind for k8s? 2024-11-02 21:42:02 Oh absolutely, and there are good use cases for granting someone access to grafana while not giving them ssh access to the box hosting it. 2024-11-02 21:42:16 right 2024-11-02 21:42:30 That's one of the reasons I deployed it 2024-11-02 21:43:11 From everything we've talked about today not having gotten to something sounds like it's a matter of bandwidth more than anything. Alpine is a large project with a decent bit of infrastructure and a small team. 2024-11-02 21:43:19 That makes for time all the more precious. 2024-11-02 21:43:32 s/makes for time/makes time/ 2024-11-02 21:44:58 That's also one of the reasons I was looking into k8s, as a way to make it easier to do basically gitops. Allow members to deploy changes via CI/CD 2024-11-02 21:45:21 There are other ways, of course, but just one potential solution 2024-11-02 21:46:38 But one thing I find challenging with k8s is that storage is painful. From what I can tell, most will defer state to some managed cloud solution 2024-11-02 21:48:14 That is one way to do it. Things like terraform/opentofu and ansible in a CI runner is a pretty common pattern. 2024-11-02 21:49:19 You could potentially get some of the same functionality with incus and its clustering functionality, which would probably be easier to manage than k8s, and doesn't suffer from the same storage issue. 2024-11-02 21:49:50 It's also a smaller shift in terms of understanding, since it's still lxc containers under the hood. 2024-11-02 21:51:31 How does Incus deal with storage? 2024-11-02 21:53:46 It can do it a few different ways, but the suggested method is either a ZFS pool. That can be as complicated as multi disk vdevs, or as simple as a ZFS formatted image file. 2024-11-02 21:53:50 https://linuxcontainers.org/incus/docs/main/explanation/storage/ 2024-11-02 21:54:34 All of the containers share space inside of that pool, and can be restricted to specific resource usage (to prevent excessive growth from taking down everything) 2024-11-02 21:54:57 But from what I see, it's still stored on the host? 2024-11-02 21:55:05 Unless you use CEPH 2024-11-02 21:55:25 Yes that's true, unless you use CEPH. 2024-11-02 21:55:42 That's similar to k8s then 2024-11-02 21:56:10 I was just about to ask what the current setup looks like for the builders. From the earlier conversation it sounds like they need to have access locally to all of the built packages for a given arch? Is that just done on the host itself? 2024-11-02 21:56:40 it works exactly the same as you do locally with abuild 2024-11-02 21:56:46 the packages are collected in ~/packages 2024-11-02 21:57:39 And the fact that the builders have the complete repos saved our bacon the other day :-) 2024-11-02 22:05:42 It works exactly the same, that's neat! I didn't realize the process was so similar. 
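As a rough illustration of the incus storage model discussed above (pool name, instance name and size limit are invented; an actual deployment would differ):

    # Create a ZFS-backed storage pool; without a source argument incus
    # creates a loop-backed image file, the "simple" option mentioned above.
    incus storage create tank zfs

    # Launch an Alpine container on that pool.
    incus launch images:alpine/3.20 builder1 --storage tank

    # Cap how much of the shared pool this one instance may consume.
    incus config device override builder1 root size=50GiB

All instances then share the pool's space, with per-instance limits guarding against one container growing unchecked.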
2024-11-02 22:06:21 Yup, it all uses the same tools 2024-11-02 22:06:30 But then that means you need several hundred gigs of space per builder just to keep up with the current packages. And I think that might just be edge. 2024-11-02 22:06:33 aports-build -> buildrepo -> abuild 2024-11-02 22:06:57 My memory is based off of wanting to setup an edge mirror and then realizing I needed far more disc space than I thought :) 2024-11-02 22:07:09 heh 2024-11-02 22:07:15 When I started, a builder was ~30G 2024-11-02 22:07:47 Then people like me joined the project and started dumping every package they could find into aports ;) 2024-11-02 22:08:22 It's 70G per builder now 2024-11-02 22:08:30 And yes, it's giving us issues 2024-11-02 22:15:09 https://gitlab.alpinelinux.org/alpine/aports/-/blob/master/main/aports-build/aports-build 2024-11-02 22:15:22 This script is triggered by mqtt-exec when changes happen on git 2024-11-02 22:33:09 I can imagine. The shift from small and simple to what now feels like a more general distro is interesting. And I dont think the infrastructure implications of adding packages is something people consider 2024-11-02 22:34:04 So all of the build servers are controlled via the MQTT server essentially. That's the only real interconnection between any of them? 2024-11-02 22:34:32 And I'm guessing they're in various locations? Or are they all servers in a central location? 2024-11-02 22:36:48 Various locations 2024-11-02 22:37:59 And yes, mqtt is what binds things together 2024-11-02 22:42:50 clandmeter: ^ 2024-11-02 22:54:44 Hmm so to implement something like an NFS share it'd need to be over a VPN connection. And there are likely performance implications there 2024-11-03 07:10:25 clandmeter: when you have time, can you look on nld-bld-1? Is unresponsive 2024-11-03 08:12:48 ikke: i can powertoggle them 2024-11-03 08:13:07 if that does not work you will need to wait unil tomorrow 2024-11-03 08:13:36 nod 2024-11-03 08:14:33 power off 2024-11-03 08:19:05 power on 2024-11-03 09:47:03 ikke: stuff is back online? 2024-11-03 09:47:08 yes 2024-11-03 09:47:13 ok 2024-11-03 09:47:39 though I have not seen any messages from the builders on irc yet 2024-11-03 09:48:36 Oh, probably haver to fix the fw again 2024-11-03 09:50:16 yup 2024-11-03 13:53:26 The excessive requests stopped btw 2024-11-03 13:53:30 to git.a.o :-) 2024-11-04 09:01:55 I have replace lxc with incus on my local desktop machine 2024-11-04 09:02:04 incus is pretty nice 2024-11-04 10:24:13 the riscv64 machien cannot pull images for some reason? 2024-11-04 10:24:16 https://gitlab.alpinelinux.org/alpine/aports/-/jobs/1590041 2024-11-04 11:38:27 oh, a configuration error 2024-11-04 11:38:43 it pulls from gitlab.alpinelinux.org, but the registry is registry.alpinelinux.org 2024-11-04 11:40:53 It's fixed now 2024-11-04 13:50:12 It is a very slick solution. I've dropped virtmanager and qemu in favor of incus, though it's all the same under the hood, the CLI is just a slicker solution in my opioid 2024-11-04 13:50:27 s/opiod/opinion/ 2024-11-04 13:51:49 The one thing that it really lacks is support for emulating different hardware. I would love to run some arm or riscv containers/VMs with incus, but they just don't support it. 2024-11-04 17:17:43 Is there any particular reason why the algitbot's zabbix notifications are omitted from the irc logs found on irclogs.alpinelinux.org? 
2024-11-04 18:19:57 durrendal: algitbot does not log its own messages 2024-11-04 18:26:53 Ah, that makes sense, I didn't realize it was handling the logging as well. 2024-11-04 18:27:11 yup 2024-11-04 18:43:30 I was looking at the logs expecting it to be there, was hoping I could get a sense for what type of alerts are typical and potentially get a rough sense of their frequency by pulling them from there 2024-11-05 05:44:07 seems like linode frankfurt had connectivity issues 2024-11-05 05:44:51 though nothing on the status page 2024-11-05 07:43:55 their status page gets updates from frankfurt ;-) 2024-11-05 09:13:11 does alpine have a statue page? 2024-11-05 11:28:00 qaqland: not yet, it's something I'm thinking about implementing 2024-11-05 14:03:22 Is one of our aarch64 builders an M1 mac system? 2024-11-05 15:02:07 One of the CI hosts 2024-11-05 15:03:48 The builders run on Ampere Altra 2024-11-05 15:34:31 Ah I had meant CI not builders, that's where I had seen it actually. 2024-11-05 15:36:05 Kind of neat to see consumer hardware in the mix to take the load off of other systems. I like the idea 2024-11-05 15:36:57 also helps paint a clearer picture of how distributed the infrastructure is 2024-11-05 15:37:40 Which speaking of, I was looking at the Infra issues and noticed the wireguard documentation attached to the project. Are all of the builders meshed together over wireguard? 2024-11-05 16:07:04 durrendal: The builders use dmvpn 2024-11-05 16:07:26 which is a homegrown mesh network solution 2024-11-05 16:23:29 We use wireguard for individual connections to the dmvpn network 2024-11-05 16:23:48 (dmvpn delegates complete subnets) 2024-11-05 16:41:00 im bootstrapping go on build-3-21-riscv64 now 2024-11-05 16:41:28 seemsm like it already was bootstrapped 2024-11-05 16:42:43 Right, I bootstrapped it, but forgot to reenable the builder, sorry 2024-11-05 16:45:57 It needs to be bootstrapped on x86_64 now 2024-11-05 16:47:14 ikke: that's a really cool setup. So then when an admin needs to connect to a box they must first connect to the wireguard VPN to access the system? 2024-11-05 16:54:20 ncopa: minor thing, community/gnuradio test should be fixed 4 days, but it hadn't a chance to retry on s390x 2024-11-05 16:54:53 s/4 days/4 days ago/ ... the build log timestamp is from 2024-10 2024-11-05 16:55:33 it was fixed upstream, unless there is something in s390x that causes it to fail on retry 2024-11-05 16:56:39 just mentioning it in relation to the commit disabling it on s390x due to the test 2024-11-05 16:57:11 the s390x builder was offline for a bit for bootstrapping 2024-11-05 17:09:29 durrendal: some parts, but not everything necessarily 2024-11-05 17:11:09 makes sense, you'd be reliant on the VPN being up to administer everything else, which could be a single point of failure. But it is good to know. 2024-11-05 20:11:54 mio: oh ok. seems like the s390x builder was/is down 2024-11-05 20:14:28 ncopa: yeah, no worries. the test did pass in s390x ci. maybe the maintainer will check it next upgrade and reenable if all is well 2024-11-05 20:15:06 the patch can be removed on next release anyway 2024-11-05 20:21:33 that said, i doubt anyone will ever use gnuradio on a s390x machine 2024-11-05 20:22:58 ikke: i got response from docker support. we need fill in a form every year. I dunno if anyone from infra team wants to do it or if I should go ahead 2024-11-05 20:33:28 ncopa: what is the form about? I could take a look? 
2024-11-05 20:51:37 https://www.docker.com/community/open-source/application/ 2024-11-05 20:58:01 Did you list any sponsors last time? 2024-11-05 21:00:24 I already sent it. Sorry 2024-11-05 21:00:39 filled it out and sent it 2024-11-05 21:00:49 and already got accepted 2024-11-05 21:07:13 ah ok, thanks 2024-11-05 21:20:11 ncopa: I suppose we should add a calendar notification for next year 2024-11-05 22:00:54 Yeah 2024-11-06 19:46:40 CI is backlogged again 2024-11-06 19:47:09 Mostly armv7 2024-11-06 20:36:57 Anoying people scraping aports 2024-11-06 21:03:42 is it bots scrape commit by commit on the cgit/gitlab instance? 2024-11-06 21:04:03 APKBUILDs 2024-11-06 21:05:37 https://tpaste.us/W5RZ 2024-11-06 21:06:15 that sounds like the slowest way to get APKBUILDs possible 2024-11-06 21:06:24 But also the lasiest way 2024-11-06 21:06:30 laziest* 2024-11-06 21:07:22 true, it is definitely the path of least resistance for someone who doesn't care about the impact 2024-11-06 21:08:13 I noticed because I was testing something against the gitlab API and got timeouts 2024-11-06 21:09:21 Do you have experience with Azure? 2024-11-06 21:12:51 Not currently. I've done extensive work with AWS, Linode, and Digital Ocean though. 2024-11-06 21:13:14 I have a project at $work that I need to delve into it though, so it's on the road map 2024-11-06 21:13:41 Right. We received sponsorship from Azure, but we haven't really used it yet 2024-11-06 21:13:51 Azure is kinda complicated 2024-11-06 21:14:13 It does have aarch64 VMs though 2024-11-06 21:14:27 So I was trying to deploy some aarch64 hosts for CI 2024-11-06 21:16:13 Nice! Out of curiosity, what does the sponsorship entail? 2024-11-06 21:17:14 I didn't realize Azure had aarch64, I've always gone to AWS for those. I wonder if those are cheaper than the EC2 instances I've been using to rebuild my droid's kernel 2024-11-06 21:17:48 did you get stuck in the deployment process somewhere? 2024-11-06 21:17:52 Yes 2024-11-06 21:18:05 I tried to deploy a VM from our own image 2024-11-06 21:18:32 Azure kept waiting on something, the VM was running, but I could not connect to it 2024-11-06 21:18:49 I had access via a serial console, but no credentials to login (only ssh key) 2024-11-06 21:19:18 To be honest, I did use the tinycloud variant, so maybe that causes issues 2024-11-06 21:19:28 I was actually about to ask 2024-11-06 21:19:47 I think Azure supports Cloud-Init and their own Azure Linux Agent 2024-11-06 21:20:15 We do have an azure tinycloud variant (tinycloud is mostly cloud-init compatible) 2024-11-06 21:20:46 I didn't know that, I actually thought we only had AWS compatible cloud-init images 2024-11-06 21:21:46 Since a while we also have other variants 2024-11-06 21:21:59 Officially still beta 2024-11-06 21:22:03 https://www.alpinelinux.org/cloud/ 2024-11-06 21:23:04 The last time I went to /cloud it just had amis for ec2, this is exciting to see :) 2024-11-06 21:23:42 There are azure images going back to 3.18 2024-11-06 21:24:27 Still 26 pending jobs for armv7 2024-11-06 21:24:44 (chromium pipelines are holding things up 2024-11-06 21:24:47 ) 2024-11-06 21:25:23 I definitely haven't checked in a hot minute, I've just kept launching VMs on AWS. 
On DO I rolled my own image so I don't even think about it there 2024-11-06 21:25:48 I haven't used any of these much either, but want to test them for Azure now 2024-11-06 21:25:52 I would be curious to see if the cloudinit variant operates differently than the tinycloud one, maybe Azure does something funky 2024-11-06 21:26:25 Yeah 2024-11-06 21:26:35 Sadly it takes some time to import 2024-11-06 21:26:54 You have to create some gallery style image to be able to use aarch64 2024-11-06 21:27:53 And you have to fill in all kinds of details which mostly relate to commercial products 2024-11-06 21:28:47 that sounds like a less than stellar experience 2024-11-06 21:29:09 though it's not like any of these hyperscalers are winning awards for UX 2024-11-06 21:29:21 I'd rather spend the time futzing with Terraform from the get go instead 2024-11-06 21:29:40 Yeah, but even that is tricky with Azure 2024-11-06 21:30:28 The provider depends on their cli tools, which consists of countless python dependencies locked to specific versions 2024-11-06 21:30:39 And authentication is a disaster as well 2024-11-06 21:31:10 sort of like terraform depending on the impossibly difficult to maintain aws-cli 2024-11-06 21:31:25 right 2024-11-06 21:31:32 I've packaged several of the aws tools, it's such a mixed bag 2024-11-06 21:31:47 I did privately package the azure-cli, but never submitted it 2024-11-06 21:32:12 well if it's composed of locked python dependencies then I don't blame you for not submitting it 2024-11-06 21:32:24 salt has taught me the joy that comes with that specific headache 2024-11-06 21:32:51 ah yeah 2024-11-06 21:33:16 The whole ecosystem becomes such a mess with everything locking everything so tight that the only solution is containers or flatpack 2024-11-06 21:33:39 (or venvs for python) 2024-11-06 21:41:25 agreed entirely, and unfortunately everything is written in python because "it's what everyone uses" 2024-11-06 21:41:44 To deploy an image I have to do 4 steps for each region: create a storage account, upload the image as a blob, create a vm image, create a compute image 2024-11-06 21:42:04 thank goodness Golang exists and has at least partially fixed that problem as its gotten more popular 2024-11-06 21:42:13 yup 2024-11-06 21:42:55 that seems needlessly complicated 2024-11-06 21:43:40 Hmm, though it's apparently possible to replicate the compute image to other regions 2024-11-06 21:45:20 Now waiting for the compute image to be deployed 2024-11-06 21:45:43 That seems less frustrating, assuming the image works replicating it across all of other regions would be a small lift 2024-11-06 21:52:41 Ok, deployment complete 2024-11-06 21:56:38 Deploying VM 2024-11-06 22:32:29 Ok, succes 2024-11-06 22:32:46 I did a couple of things a bit different, so not sure what was the exact issue, but I have a VM now 2024-11-06 22:35:13 Was cloud init one of them or did you stick with tiny? 2024-11-06 22:36:10 yeah, picked cloud init 2024-11-06 22:40:45 when i last tried tiny-cloud on azure, the UI thought it wasn't instantiated, but it was accessible -- not sure what it was waiting for. 
2024-11-06 22:41:13 right, but for some reason could not access it 2024-11-06 22:42:11 i need to spin one up soonish anyways to see if they provide any sort of "i'm on azure" via DMI info 2024-11-06 22:44:13 i don't recall if it was azure or gcp, but one had some weirdness about specifying the "cloud user" login 2024-11-06 22:44:42 that had previously confounded a couple of people 2024-11-06 22:45:01 Azure has it 2024-11-06 22:47:33 ikke if you have a chance on the azure host, could you check and see if anything in /sys/class/dmi/id/* has anything that indicates it's on azure? 2024-11-06 22:48:01 sure, will check 2024-11-06 22:50:07 Nothing explicitly mentions azure 2024-11-06 22:50:19 Only inderect microsoft / hyper-v, but that could be anywhere 2024-11-06 22:54:14 Nice Azure aarch64 hosts support 32-bit mode 2024-11-06 23:17:42 That's cool, would the preference be to run aarch64 in 32-bit mode and use that solely for armhf/v7 builds? 2024-11-06 23:21:09 The host itself runs aarch64, but the containers run the 32-bits images with linux32 2024-11-06 23:25:21 That makes sense. I haven't tried running non-x86_64 VMs outside of AWS where that's also supported. I just assumed it would work out the box 2024-11-07 07:11:13 i changed the dependencies for alpine-mksite. Do I need to change some CI job as well? 2024-11-07 07:12:34 or let me rephrase: how and where are wwwtest.a.o and a.o sites generated and published? I need to verify that the correct dependencies are installed 2024-11-07 07:29:35 They're lxc containers 2024-11-07 07:29:45 A.o lives on gbr-app-1 2024-11-07 07:29:58 wwwtest on dev.a.o 2024-11-07 07:30:54 found the repo. alpine-www 2024-11-07 07:31:57 That's an empty repo 2024-11-07 07:32:33 I think I used that for testing with Kubernetes 2024-11-07 07:33:00 (there is a branch which contains more) 2024-11-07 08:10:45 ok. so I have to log in to the lxc and install the dips manually then I suppose 2024-11-07 08:10:49 deps 2024-11-07 08:37:07 Yes, until we have a different way to deploy it 2024-11-07 11:26:19 ncopa: can I help with something? 2024-11-07 11:58:11 just replace lua-cjson with lua-rapidjson, or install them both 2024-11-07 11:58:23 sure, will do 2024-11-07 12:07:54 ncopa: done 2024-11-07 12:08:46 thank you! 2024-11-07 12:09:27 can you re-run the trigger for wwwtest? 2024-11-07 12:09:29 curl https://wwwtest.alpinelinux.org/rpi-imager.json 2024-11-07 12:09:37 should not show any \/ only / 2024-11-07 12:16:03 ncopa: looks okay now 2024-11-07 12:16:56 awesome! thanks! 2024-11-07 12:17:02 I'll merge it to master then 2024-11-07 12:17:22 to production I mean 2024-11-07 12:18:16 and now it works. thank you! 2024-11-07 12:23:03 ncopa: fyi, not entirely sure if that was the cause, but there were some issues deploying an Alpine VM on azure with tinycloud 2024-11-07 12:23:36 azure could not confirm the vm was deployed properly 2024-11-07 12:30:05 i heard. I'd like to fix that, but will not have time this week 2024-11-07 12:30:31 Sure, no problem. I have something working with cloud-init now 2024-11-07 12:30:57 i'd heard that the machine is up, you can ssh to it, but azure does not detect it. So I'd like to compare with a working cloud-init to find out what tiny-cloud needs to do 2024-11-07 13:38:29 ncopa: fyi, I've updated the gitlab-runner-alpine-ci project to take the new registrion flow into account 2024-11-07 14:24:42 ikke: when you're setting up new nodes, like you did yesterday on Azure, is the initial configuration all done by hand, or do you have it scripted in some sense? 
2024-11-07 14:46:46 https://gitlab.alpinelinux.org/-/snippets/1152 2024-11-07 14:47:22 We have not automated managing hosts that much yet 2024-11-07 14:50:49 But running that script gives us a runner that is accepting jobs 2024-11-07 15:20:36 Makes sense, setting up runners is probably what you get the most churn with, and a good place to start. I had a similar process when I was using Gitlab CI 2024-11-07 15:21:19 https://krei.lambdacreate.com/durrendal/Verkos <- though I somewhat over-engineered my own personal solution to setting things up 2024-11-07 15:22:58 ansible might not be a bad fit for something like this, it could be tied into terraform/opentofu, and just as easily run one off in an ad hoc fashion 2024-11-07 15:24:05 My background is chef, where you have more continuous desired state 2024-11-07 15:25:02 (nowadays the open source cinc variant) 2024-11-07 15:26:17 I bounce around a lot. I view ansible a lot like scripts. Sort of one off ad hoc state application. Verkos is inspired by it. 2024-11-07 15:27:04 Professionally though I've done a ton of SaltStack work. Sadly that project is in major turmoil right now 2024-11-07 15:27:59 I was actually looking at Chef/Puppet to potentially replace my SaltStack deployment here at work, but my understanding is those are pull based systems (remote agent checks in every X minutes) versus push based like Salt 2024-11-07 15:28:04 Yeah, I've noticed 2024-11-07 15:28:17 salt was also my first orchestration like tool 2024-11-07 15:28:32 yes, correct 2024-11-07 15:28:47 That's also what i mean with more continuous desired state 2024-11-07 15:28:54 Not task based 2024-11-07 15:32:58 I'm a bit conflicted in what I'd like to achieve 2024-11-07 15:33:48 Ideally people would be able to contribute to our infra just by making merge requests 2024-11-07 15:34:11 Honestly, my use of Salt might have been non-standard. I use the push based applications to perform ad hoc maintenance (on workstations specifically) and the reactor, beacon, orchestrator to do automated deployments for servers. 2024-11-07 15:34:39 since most of Alpine's infrastructure is just servers, it feels like Chef/Puppet would fit well 2024-11-07 15:34:47 Yeah 2024-11-07 15:35:14 That would be a nice workflow. High visibility, MR approvals to make changes so you have everything logged. 2024-11-07 15:35:23 Problem is that chef (and cinc as well) are binary distributions 2024-11-07 15:35:29 I suppose CI could apply those changes or trigger them 2024-11-07 15:36:53 I admittedly haven't looked into it enough, I'm guessing it's not as easy as just packaging the project so it could be apk installed 2024-11-07 15:38:09 For the cinc client (what runs on the managed server), it's ruby, so it's doable, but again it's a dependency locked distribution 2024-11-07 15:38:25 Might have a hard time if you use just the dependencies that are available in aports 2024-11-07 15:39:17 The sum total of our Ruby ecosystem is pretty sparse currently, that's probably a decent lift to get packaged. 2024-11-07 15:39:47 I had initial thoughts to try and package Puppet, but I'm ~100 ruby dependencies deep and not even close... 
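Returning to the runner setup linked at the start of this passage: the linked snippet is not reproduced here, but a generic sketch of what registering a docker-executor runner involves with GitLab's current registration flow (an authentication token created in the GitLab UI first) is roughly:

    # Alpine host assumed; the openrc subpackage supplies the service script.
    apk add gitlab-runner

    # Register against the instance using the pre-created runner token.
    gitlab-runner register \
      --non-interactive \
      --url https://gitlab.alpinelinux.org \
      --token "$RUNNER_TOKEN" \
      --executor docker \
      --docker-image alpine:latest   # placeholder default job image

    rc-update add gitlab-runner
    rc-service gitlab-runner start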
2024-11-07 15:39:55 heh 2024-11-07 15:40:19 I used puppet before, but later switched to chef 2024-11-07 15:41:38 The challenge is security 2024-11-07 15:41:45 I might take a closer look at Chef, I've been fishing for something to replace Salt in the event Broadcom kills it, but it's tough to replace 2024-11-07 15:42:03 yes agreed, you have to trust having a remote agent on all of your distributed infrastructure. 2024-11-07 15:42:25 And I have no clue how Chef or Puppet handles secrets, Salt can be less than graceful with it if you're not careful 2024-11-07 15:43:52 You should avoid setting secrets in attributes in chef, which get sent back to the server 2024-11-07 15:44:09 I'm using hashicorp vault to dynamically obtain secrets 2024-11-07 15:46:39 We also btw have netbox to document our infra 2024-11-07 15:49:46 Agreed there, same thing goes for Salt. Unless you're using GPG encrypted pillar data. 2024-11-07 15:50:17 Vault is something I've wanted to really dive into, but I've never had the chance. What's the administrative burden like? 2024-11-07 15:50:19 chef has also something like that 2024-11-07 15:50:51 I have no personal long-time experience with vault, but I think once you have it setup, it's not bad 2024-11-07 15:51:38 also Netbox sounds like an excellent documentation solution for infrastructure. 2024-11-07 15:51:57 It is, though it's not perfect in our situation, but it suffices 2024-11-07 15:52:10 If there's no day to day friction from it then that sounds like it's pretty low burden :) 2024-11-07 15:52:44 I think Netbox maps more closely to physical locations, at least that's the sense I got when I evaluated it a couple years ago 2024-11-07 15:52:51 yes 2024-11-07 15:53:10 It assumes you are owning the datacenters 2024-11-07 15:53:16 We ended up rolling out SnipeIt at $work, but that's really just for tracking who has what hardware, and probably isn't helpful contextually for Alpine 2024-11-07 15:53:30 yeah, that's not that useful for us 2024-11-07 15:53:37 But we can do with netbox 2024-11-07 15:53:38 If someone donates an entire datacenter to Alpine you're ready though! 2024-11-07 15:53:44 :D 2024-11-07 15:53:46 Absolutely 2024-11-07 15:55:36 But again with vault, you are putting all your eggs in one basket security wise 2024-11-07 15:55:52 You have to really protect it 2024-11-07 15:57:16 you really do, it drastically changes your security posture. 2024-11-07 15:57:55 So while I'd love to deploy it, I'm also hessitent 2024-11-07 15:57:59 hessitant* 2024-11-07 15:58:01 Maybe it would be fine if absolutely everything was meshed across a VPN, and you had high assurance of what systems are part of your build cluster 2024-11-07 15:58:56 Almost everything is alreayd meshed with a VPN 2024-11-07 15:59:47 but with things like an M1 macbook in the mix, some of those systems are within physical reach of someone right? 2024-11-07 15:59:56 Not everything is locked inside a cage in a datacenter I mean 2024-11-07 16:00:16 Not everything needs access to secrets though 2024-11-07 16:00:31 Or at least, secrets stored centrally 2024-11-07 16:00:54 A gitlab runner needs a secret to access gitlab, but using vault would be trading one secret for another 2024-11-07 16:45:13 sorry had to step away for a meeting 2024-11-07 16:46:01 I think I'm considering vault/automation in a centralized context. 
If you wanted to automate the deployment of new systems then really only the orchestration system would need to talk with vault 2024-11-07 16:46:42 pushing limited secrets to a runner makes sense as a security trade off. It reduces exposure to the vault 2024-11-07 16:47:54 But that's centralized, and this feels much more distributed in nature 2024-11-08 00:27:34 anyone other than me abusing gitlab? 2024-11-08 11:51:28 ncopa: what storage do you use for your bpi3 board? I'm concerned using the internal mmc for CI may trash it. 2024-11-08 16:27:30 ikke: I'm using an nvme disk 2024-11-08 16:30:27 Ok 2024-11-08 17:44:06 ikke: ima-evm-utils test failed on 3.21 riscv64 builder with some output about "no xattr" in the logs, but previously passed in riscv64 ci. is there any way to find out why the test is failing on the 3.21 riscv64 builder or if it's missing a dependency? https://build.alpinelinux.org/buildlogs/build-3-21-riscv64/community/ima-evm-utils/ima-evm-utils-1.6.2-r0.log 2024-11-08 17:45:31 ~2 days ago attr was already added to checkdepends for the aport 2024-11-08 17:46:16 here's the job log for riscv64 ci where the test passed https://gitlab.alpinelinux.org/mio/aports/-/jobs/1593587 2024-11-08 17:48:30 the aport passed ci as well on the same pkgver back in sep when it was upgraded to current version, and it passed edge riscv64 builder then 2024-11-08 20:00:10 do we have any control over how many chars are shown of the filenames on the CDN? i.e. https://dl-cdn.alpinelinux.org/alpine/v3.20/releases/cloud/ 2024-11-08 20:17:53 I believe that's darkhttpd 2024-11-08 20:18:07 Not sure if we can control that 2024-11-10 18:17:36 :q 2024-11-10 19:54:16 3.21 aarch64 builder might be stuck with ofono 2024-11-10 19:55:19 might also be a good idea to check on 3.21 x86 builder as well 2024-11-10 19:56:17 I can only check later 2024-11-10 19:57:40 okay, thanks, whenever you have a moment ... just mentioning as they seem to be building on the same aport for some time 2024-11-10 21:39:42 ikke: thanks 2024-11-10 21:39:51 np 2024-11-10 21:40:32 is 3.21 armv7 okay as well? 2024-11-10 21:41:00 Building chromium 2024-11-10 21:41:24 well, py3-puppeteer actually 2024-11-10 21:42:01 been a few hours for py3-pyppeteer, not sure 2024-11-10 21:42:22 Checking the logs 2024-11-10 21:42:30 See if there is progress 2024-11-11 03:34:54 I would like it if our https://alpinelinux.org/atom.xml could contain content 2024-11-11 12:50:28 was there progress on py3-pyppeteer? 2024-11-11 16:22:56 No progress 2024-11-11 17:14:25 okay, thanks 2024-11-12 16:14:27 im bootstrapping openjdk11 on s390x 2024-11-12 16:18:08 ncopa: 👍 2024-11-12 16:30:06 and its done 2024-11-12 16:53:10 could someone maybe check on the x86_64 builder? stuck on py3-pyppeteer, like previously with armv7 2024-11-12 16:59:45 done 2024-11-12 17:00:38 oh wow, loongarch64 finished community 2024-11-12 17:00:53 thanks 2024-11-12 17:29:53 Hey folks - I work for DigitalOcean and our recursive resolvers seem to be blocked from resolving anything in "alpinelinux.org" at the moment. We reached out to linode/akamai where the zone seems to be hosted, but we haven't heard anything back yet 2024-11-12 17:30:46 I was hoping maybe somebody here could help. :) We have a workaround, but it's not super pretty. 2024-11-12 17:51:32 egon1024: our dns is hosted by linode 2024-11-12 17:56:12 That's what we figured - we've sent them tickets and tried reaching out through various channels, but have not had any luck hearing back yet. 
We were hoping maybe we could ask you to ping Linode to at least look at our request? We have quite a few users who are unable to download Alpine. :) 2024-11-12 19:30:29 `mkdir: can't create directory '/home/buildozer/aports/community/xfce4-vala/src': No space left on device` from the 3.21 aarch64 builder 2024-11-12 19:53:42 `mkdir: can't create directory '/home/buildozer/aports/community/plasma-dialer/src': No space left on device` for 3.21 s390x builder, though sounds like it is a known issue 2024-11-12 19:54:51 just mentioning in case it is a new incident 2024-11-12 20:39:07 egon1024: sorry, I was AFK, but could you perhaps share some details (in private if you prefer)? 2024-11-12 20:39:31 ikke: No worries. :) Sure - I'll DM 2024-11-12 20:44:18 full disks, full disks everywhere 2024-11-12 21:03:01 che-bld-1 has 215G free now 2024-11-12 21:05:55 thanks 2024-11-12 21:07:20 and usa2-dev1 also has a bit more free space 2024-11-12 21:07:36 But it will become harder and harder to clean up enough space 2024-11-13 05:58:46 bootstrapping openjdk8 on armhf 2024-11-13 05:59:13 and openjdk21 on s390x 2024-11-13 05:59:52 and openjdk17 on aarch64 2024-11-13 06:16:03 ghc on x86_64 2024-11-13 06:30:55 I think there are test failures for GHC on x86_64 2024-11-13 06:31:11 aarch64 has !check 2024-11-13 06:34:03 Not solved by https://gitlab.haskell.org/ghc/ghc/-/merge_requests/12570 https://gitlab.haskell.org/ghc/ghc/-/merge_requests/13218 the last time i tried 2024-11-13 07:37:21 cely: indeed, 2 test failures 2024-11-13 14:41:50 how is openjdk8 armhf bootstrapping going? 2024-11-13 16:03:52 looks like build-3-21-armhf was done bootstrapping so I rebooted it 2024-11-13 16:22:37 Did you remove the bootstrap package? 2024-11-13 17:25:48 apparently, yes 2024-11-13 19:25:40 im bootstrapping openjdk8-coretto on x86_64 2024-11-13 19:29:53 Thank you 2024-11-13 19:59:02 im bootstrapping openjdk8 on ppc64le 2024-11-13 20:52:52 :| 2024-11-13 20:54:16 I just freed 200G, now only 77 is left :/ 2024-11-13 21:10:55 that's pretty rapid expansion 2024-11-13 21:11:18 It hosts 6 active builders, so it's not strange, but still 2024-11-13 21:11:29 ah well that makes a bit more sense 2024-11-13 21:11:45 It's mostly temporary build files 2024-11-13 21:12:22 this feels like something orchestration management would be perfect for 2024-11-13 21:13:47 I already have a script to wipe everything not essential in one go 2024-11-13 21:13:56 though, is there any reason the build process doesn't attempt to clean up build files after it finishes?
2024-11-13 21:14:34 of course :) I would expect nothing less, I can't imagine anyone removing files by hand this frequently haha 2024-11-13 21:15:02 It does clean up everything in the src and pkg dirs 2024-11-13 21:17:20 But many projects like to write in $OIME 2024-11-13 21:17:23 $HOME 2024-11-13 21:20:09 ah that makes sense 2024-11-14 00:16:00 3.21 s390x builder might be stuck, has been on py3-starlette for a few hours 2024-11-14 00:44:58 what if while asking from aports merge request a separate file is attached/updated, which has list of cleanup files with path 2024-11-14 00:45:15 this probably have to done first time manually 2024-11-14 00:45:31 updating should be easy 2024-11-14 00:45:56 infra admins only use it, as it does not go to aports(git) 2024-11-14 00:46:16 should be attached/updated to gitlab 2024-11-14 00:47:44 since its named after aports, builders can pick that file and run cleanup script after build 2024-11-14 00:49:27 initial phase to making that list could be some work, but not difficult 2024-11-14 00:51:21 1. ls -aRh $HOME/path/to/install > (after dep fetch) 2024-11-14 00:52:08 2. ls -aRh $HOME/path/to/install > /tmp/b.txt (after compile/upload) 2024-11-14 00:52:46 3. diff between a.txt,b.txt , manual cleanup, attach to gitlab 2024-11-14 00:57:45 using tool like sfic might help, not sure on it 2024-11-14 01:09:36 if those cleanup files are not attached, infra team can build gradually(initially targetting messy ones) 2024-11-14 03:14:29 3.21 riscv64 builder may also be stuck, on py3-networkx 2024-11-14 06:01:03 thanks ... could someone possibly unstuck the 3.21 s390x builder as well? 2024-11-14 06:01:43 p00f 2024-11-14 06:01:58 thanks! 2024-11-14 06:06:18 the mustach test failure seems odd, it passed in s390x ci and edge builder 2024-11-14 06:07:47 somehow fails on 3.21 builder 2024-11-14 06:08:42 could go back and disable check on s390x, just mentioning in case anyone would like to look into it first 2024-11-14 06:09:22 edge passing: https://build.alpinelinux.org/buildlogs/build-edge-s390x/community/mustach/mustach-1.2.10-r0.log 2024-11-14 06:09:35 3.21 failed: https://build.alpinelinux.org/buildlogs/build-3-21-s390x/community/mustach/mustach-1.2.10-r0.log 2024-11-14 06:11:35 segfault 2024-11-14 06:13:14 https://tpaste.us/9XNV 2024-11-14 06:13:35 Not sure if that's related, but that's whats in dmesg 2024-11-14 06:14:59 thanks ... looks like it may be better to disable check on the arch 2024-11-14 06:16:10 the older version had a similar error, was hoping it was fixed with an upgrade since it passed on the other builder 2024-11-14 06:17:03 maybe a sensitive test 2024-11-14 06:22:55 mio: seems to be related to valgrind 2024-11-14 06:23:20 https://tpaste.us/nNap 2024-11-14 06:25:47 mio: in this light, I would prefer not to just disable tests, but find out why valgrind is complaining 2024-11-14 06:25:53 or disable the package 2024-11-14 06:26:57 would it not fail in ci first? 2024-11-14 06:27:36 I have seen real bugs only exposed on the builder 2024-11-14 06:28:24 not saying it's necessarily the case here, but it's not good to dismiss an issue just because it passed in CI 2024-11-14 06:29:01 yeah. 
wondering why it passed on edge s390x builder 2024-11-14 06:30:02 will check upstream to see if there's anything on the issue 2024-11-14 06:31:03 the paste is helpful, thanks 2024-11-14 06:32:32 `valgrind ../mustache json must` passes indeed on edge 2024-11-14 06:34:07 One difference is bb -r6 vs -r7 2024-11-14 06:34:31 but it still passes on edge 2024-11-14 06:36:10 The assertion failure apparently is in valgrind itself 2024-11-14 06:36:22 m_debuginfo/image.c is a valgrind file 2024-11-14 06:37:33 mio: you could disable valgrind instead 2024-11-14 06:38:59 ikke: okay, thanks 2024-11-14 06:45:28 is there an option to add more storage to some of these machines? I know that just buys you time between clean ups, but there's something to be said for not being under constant fear of something running out of space enough to work on a permanent solution 2024-11-14 07:11:26 further on mustach: valgrind previously segfaulted on armv7, which led to the novalgrind option when the error was reported upstream. not sure yet if there is a consistent issue on s390x as well 2024-11-14 07:16:55 for mustach itself, yeah, the tests will likely pass without valgrind 2024-11-14 07:17:41 ikke: thanks again for your help 2024-11-14 07:25:39 good morning! I'll bootstrap ghc on aarch64 now 2024-11-14 11:22:22 ncopa: ghc on x86_64 has test failures, but apparently tests were disabled on aarch64 2024-11-14 11:36:30 bootstrapping openjdk17 on s390x 2024-11-14 12:22:23 im bootstrapping openjdk8 on x86 2024-11-14 13:56:14 opendj8 is done on x86. i had to revert the $ORIGIN fix in abuild as it broke things 2024-11-14 13:56:24 im bootstrapping openjdk8 on aarch64 now 2024-11-14 14:02:09 When you say you're bootstrapping the openjdk packages, is there something special to the process, or is it manually building the apkbuild for each? 2024-11-14 14:02:26 Also curious if there are other packages that require this type of intervention? 2024-11-14 14:05:58 durrendal: these are mostly for languages that are self-hosted 2024-11-14 14:06:22 They depend on themselves to build 2024-11-14 14:07:13 But when setting up a new builder for a stable release, these decencies are not available 2024-11-14 14:09:15 So we basically need to install the package from edge and then build the specific package 2024-11-14 14:09:58 Durrendal: https://build.alpinelinux.org/buildlogs/build-3-21-ppc64le/community/openjdk21/openjdk21-21.0.5_p11-r0.log 2024-11-14 14:11:39 In this case, openjdk21-bootstrap is provided by openjdk21 2024-11-14 14:18:01 Ah that makes sense! Not a difficult process, but I get why you'd want to clearly communicate when you're bootstrapping these since you're pulling in packages from edge to handle it 2024-11-14 14:18:40 sbcl builds like this, however in that case there's a -stage0 package, so I guess the process is a little different 2024-11-14 14:26:58 yes 2024-11-14 14:27:11 the -stage0 package can build on its own 2024-11-14 14:34:00 that difference is nuanced, but now that you point it out it makes complete sense. 2024-11-14 14:34:22 sbcl-stag0 is built with ecl, which then builds sbcl. So the process is different. 2024-11-14 14:34:59 how did we get the initial openjdk packages ported then? Did we need to use a non-alpine system to build the initial version? 
2024-11-14 14:35:14 At least initially, we used gcc6 2024-11-14 14:36:11 Not sure how the latest iteration was bootstrapped though, because they are now completely self-hosted (not depending on earlier jdk versions) 2024-11-14 15:00:33 makes sense :) thanks for entertaining my curiosity ikke 2024-11-14 16:12:02 bootstrapping openjdk21 on x86_64 2024-11-14 18:01:33 how is openjdk21 going ikke? 2024-11-14 18:02:47 finished 2024-11-14 18:02:56 thanks! 2024-11-14 18:03:08 builder is running again 2024-11-14 18:03:38 awesome great! 2024-11-14 18:03:53 looks like we need to store acme-client on dev.a.o/archive 2024-11-15 07:16:18 i have cleaned some arm dev containers 2024-11-15 07:16:31 i saw my ncopa-edge-aarch64 took 27G something 2024-11-15 07:16:38 llvm and kernel builds 2024-11-15 07:20:21 Thanks ❤️ 2024-11-15 07:20:47 i wonder if we can delete dev containers of people who left the proj 2024-11-15 07:22:20 there are at least 3 2024-11-15 07:22:34 i think I'll just delete them. if they come back we can create new ones 2024-11-15 07:27:14 Yup 2024-11-15 14:45:47 im bootstrapping ghc on x86_64 2024-11-15 14:46:19 Thanks, fyi, last time I tried it failed the tesds 2024-11-15 14:46:23 Tests* 2024-11-15 15:52:33 can someone please check on the 3.21 aarch64 and riscv64 builders? looks like they are stuck on the tests of the py3 aports they were building ... thanks 2024-11-15 16:08:30 aarch64 builder may be OOMing 2024-11-15 16:11:45 or.. 2024-11-15 16:12:12 network outage 2024-11-15 16:12:26 No, ping is responding 2024-11-15 16:12:32 so OOM 2024-11-15 16:16:25 nu_: If it does not recover, would it be possible to reset it? 2024-11-15 16:22:58 the ghc bootstrap failed indeed 2024-11-15 16:23:17 will have to continue with that other time 2024-11-15 16:23:36 those tests failed on my local machine: 2024-11-15 16:23:39 /tmp/ghctest-3guwnxaz/test spaces/testsuite/tests/rts/pause-resume/list_threads_and_misc_roots.run list_threads_and_misc_roots [exit code non-0] (threaded1) 2024-11-15 16:23:39 Unexpected failures: 2024-11-15 16:23:39 /tmp/ghctest-3guwnxaz/test spaces/testsuite/tests/rts/testwsdeque.run testwsdeque [exit code non-0] (threaded1) 2024-11-15 16:26:28 ncopa: fyi, che-bld-1.a.o is currently not responding 2024-11-15 16:26:31 (ping works though) 2024-11-15 16:26:54 1h ago: 2024-11-15 16:26:56 Memory Used Percentage 2024-11-15 16:26:58 1h 10m 39s91.6759 % 2024-11-15 16:27:48 a few chromium builds, dotnet8 build, and qt6-qtwebengine builds at the same time 2024-11-15 16:30:53 times 6 2024-11-15 17:07:38 okay, thanks 2024-11-15 17:43:19 I though I enabled early-oom on che-bld-1, but apparently I haven't 2024-11-15 17:45:22 ptrc: could you perhaps stop your retry script, which may keep the arm builders memory contended? 2024-11-15 17:48:01 ouch, oops 2024-11-15 17:48:02 stopped 2024-11-15 17:49:51 thanks 2024-11-15 18:58:12 ncopa: We could move the dev containers to the CI host 2024-11-15 19:10:38 ok with me 2024-11-15 19:11:13 is the che-bld-1 still down? 2024-11-15 19:11:21 I still cannot login 2024-11-15 19:11:24 ok 2024-11-15 19:11:26 I have ssh running in a loop 2024-11-15 19:12:05 so it is currently full stop til we have it rebooted? 2024-11-15 19:12:15 It might still recover 2024-11-15 19:12:21 the question is how long before that happens 2024-11-15 19:12:23 maybe it will recover after a week or so :) 2024-11-15 19:12:25 yeah 2024-11-15 19:12:38 do you remember if we have swap enabled on it? 
4G or so 2024-11-15 19:13:07 sounds like a good number 2024-11-15 19:13:53 alright. I'll call it a week and hope it's back Monday 2024-11-15 19:14:10 alright, enjoy the weekend 2024-11-15 19:14:26 i suppose there is nothing we can do 2024-11-15 19:14:49 No, nu was still to setup a vpn or something like that for us to get access to the bmc 2024-11-15 19:15:33 have a nice weekend! 2024-11-16 07:37:31 the aarch64 build host is still not responding 2024-11-16 11:02:24 which one is it? 2024-11-16 11:02:43 at nu? 2024-11-16 13:48:21 clandmeter: yes 2024-11-16 14:11:13 We do not have oob? 2024-11-16 14:15:42 Not that I'm aware of 2024-11-16 20:59:37 the dev.a.o/archive is at 99% disk usage 2024-11-16 21:10:08 I suppose we could get rid of the old loongarch repos 2024-11-16 21:12:08 ptrc: I've added some disk space 2024-11-16 21:13:22 thanks ^^ 2024-11-16 21:13:50 ( not that i needed it right now, just wanted to point it out before it becomes an issue ) 2024-11-17 16:49:43 looking into the oob access 2024-11-17 16:51:03 nu_: thanks 2024-11-17 17:08:09 why did i do 95% of the work for oob and now it doesn't work :/ 2024-11-17 17:12:53 Utterly annoying 2024-11-17 19:05:29 i'll restore it by end of tomorrow 2024-11-17 19:05:41 nu_: thanks, appreciate it 2024-11-17 19:06:06 np, happy to help:) 2024-11-18 15:12:55 the arm and aarch64 builders have been stuck for days? 2024-11-18 15:15:12 omni: https://gitlab.alpinelinux.org/alpine/infra/infra/-/issues/10832/ 2024-11-18 15:20:41 oh, thanks 2024-11-18 15:21:14 those three are on the same host? 2024-11-18 15:21:51 apparently 2024-11-18 15:22:36 I just had in my head that arm* was built on one and aarch64 built on another 2024-11-18 15:24:57 No, same host 2024-11-18 15:25:18 One host as builder, the other for CI 2024-11-18 15:25:25 building qt6-qtwebengine (chromium) for all three architectures for edge and 3.21 and at the same time building chromium for 3.20... 2024-11-18 15:26:23 ☠ 2024-11-18 15:28:16 ikke: thanks for clarifying, I often mix things up 2024-11-18 15:29:54 ikke: when it does get up, would it be possible to let it start with just the 3.20 builds or builds for just one of the architectures or something? 2024-11-18 15:30:47 let that complete, enable the other etc 2024-11-18 15:31:00 Yes, that's my plan 2024-11-18 15:31:15 👍 2024-11-18 15:38:11 omni: there was a time where we temporarily had an extra builder, and during transition, we could split things up a bit 2024-11-18 16:09:24 eta <1,5h 2024-11-18 17:18:05 booting 2024-11-18 17:20:32 it's alive! 2024-11-18 17:22:34 \o/ 2024-11-18 17:23:23 (in the voice of Henry Frankenstein in the 1931 movie) 2024-11-18 19:23:21 I'll try to keep this infra knowledge in mind when merging resource heavy things 2024-11-18 19:24:24 I usually do for CI builds, like wait with a qt?-qtwebengine chromium security upgrade for 3.20-stable before it's built for edge 2024-11-18 19:24:53 I enabled earlyoom as well, hopefully it will prevent this from happening in the future 2024-11-18 19:24:53 but I've been more careless when it comes to hitting the builders with work, as they usually seem to handle it 2024-11-18 19:25:22 or it will make large builds fail more easily! :D 2024-11-18 19:25:38 Building chromium (forks) 9 times will break even the most powerful servers 2024-11-18 19:27:13 and they don't seem to get lighter to build 2024-11-18 19:29:54 the source tarball for qtwebengine-chromium is now (122-based) 821M, it's insane!
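For reference, the earlyoom mentioned above is packaged in the community repository; enabling it on a builder is roughly (assuming the packaged OpenRC service; tuning the threshold is optional and the flag named is just an example):

    # earlyoom kills the largest process before the kernel's own OOM handling
    # leaves the whole host unresponsive, as happened to che-bld-1 above.
    apk add earlyoom

    # Optionally raise the free-memory threshold (earlyoom's -m flag) via
    # /etc/conf.d/earlyoom before starting the service.
    rc-update add earlyoom
    rc-service earlyoom start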
2024-11-18 20:44:09 fyi, I currently only have the 3.20 builders enabled for arm* 2024-11-18 20:51:18 thanks! 2024-11-18 21:31:07 community/zanshin tests hanged on build-3-21-x86. i killed it and rebooted the container 2024-11-18 21:32:29 thanks 2024-11-19 00:28:30 is build-edge-armhf stuck on uploading to main? 2024-11-19 06:06:29 omni: no, just did not start it 2024-11-19 06:06:31 yet 2024-11-19 08:18:09 ikke: looks like arm builder has a loadavg on ~80 2024-11-19 08:18:14 79 2024-11-19 08:20:25 i think I'll stop build-edge-aarch64 for a while 2024-11-19 08:20:44 at least til build-3-21-aarcht64 is done with webkit2gtk-4.1 2024-11-19 08:21:04 or i'll just lxc-freeze it 2024-11-19 08:24:01 Thanks 2024-11-19 08:24:38 build-edge-armv7 has not been started yet either 2024-11-19 08:58:51 i have unfrozen build-edge-aarch64 and have lxc-freeze build-edge-armhf 2024-11-19 08:59:09 will unfreez build-edge-armhf once webkit2gtk-4.1 is done on build-edge-aarch64 2024-11-19 12:34:33 lmk if anyone else needs che-bld-1 oob access 2024-11-19 12:35:12 Carlo has access now? 2024-11-19 12:35:25 looks like it 2024-11-19 12:38:40 very nice! thank you nu_! 2024-11-19 12:41:11 ikke: does vpn work for you? 2024-11-19 12:41:24 i cannot reach the bmc interface 2024-11-19 12:42:10 Yes, I can reach it 2024-11-19 12:42:28 me2 now 2024-11-19 12:42:36 ok 2024-11-19 12:44:24 looks good 2024-11-19 12:45:14 Can you check the fan settings? 2024-11-19 12:45:39 i can check :) 2024-11-19 12:46:15 nu_: how noisy is it? 2024-11-19 12:46:21 how much you want it downed? 2024-11-19 12:55:02 starting the edge-arm7 builder now 2024-11-19 13:09:33 ok 2024-11-19 13:09:36 thanks 2024-11-19 13:10:04 build-edge-aarch64 is alsmot done with qt6-qtwebengine 2024-11-19 13:35:28 ikke: its difficult to analyze the fans now 2024-11-19 13:35:33 as the load is very high now 2024-11-19 13:36:09 i cannot lower the fans on high load, want to prevent the melting point :) 2024-11-19 14:49:43 my impression was the the fans were running on higher rpm than it was neccessary 2024-11-19 14:50:21 the base speed is now lower so it should be better, thx clandmeter:) 2024-11-20 15:47:44 im bootstrapping openjdk11 on build-3-21-x86_64 2024-11-20 17:56:28 bootstrapping openjdk21 on ppc64le 2024-11-20 18:25:00 done 2024-11-20 18:38:58 thanks 2024-11-21 09:53:09 ikke: i think we should add linux-firmware-none to the CI docker image. That way we dont install the 1GB firmwmare when building 3rd party kernel modules 2024-11-21 09:54:18 ncopa: could you make a MR against alpine/infra/docker/build-base? 2024-11-21 10:20:50 ok 2024-11-21 10:21:12 I did docker system prune on the CI x86 machine 2024-11-21 10:23:42 Thanks 2024-11-21 10:45:22 ikke: I'd like to test fastly config for cdn.alpinelinux.org: https://gitlab.alpinelinux.org/alpine/infra/infra/-/issues/10811#note_457742 2024-11-21 10:45:37 how can I do that without disrupt prod? 2024-11-21 10:48:42 we have also a couple of *.gliderlabs.com domains. I think we can simply delete those 2024-11-21 10:54:54 oh, we have an dl-cdn-test.alpinelinux.org 2024-11-21 11:22:27 got it working 2024-11-21 11:22:31 http://dl-cdn-test.alpinelinux.org/ 2024-11-21 11:24:26 cool 2024-11-21 11:24:41 And http://dl-cdn-test.alpinelinux.org/alpine/ works as well 2024-11-21 11:26:37 ncopa: wondering, how would it work with cache invalidation, would we need to invalidate /alpine/ separately from /? 
2024-11-21 11:26:37 ncopa: wondering, how would it work with cache invalidation, would we need to invalidate /alpine/ separately from /? 2024-11-21 11:30:37 dont know 2024-11-21 11:31:31 but note the "index of /alpine" 2024-11-21 11:31:38 so backend doesn't know 2024-11-21 11:32:18 so I wonder if we should do that trick in nginx backend instead 2024-11-21 11:38:00 It would be good to do it in a way that at least fastly is aware of it, but not sure if that's possible 2024-11-21 11:38:38 Otherwise cache efficacy decreases 2024-11-21 11:38:57 would be nice if we could tell varnish that it can use same cache for both, yes 2024-11-21 11:46:00 yes. it is possible. the trick is to rewrite the url in the vcl_hash instead 2024-11-21 11:53:47 i wonder if we also want the req.http.host to be unified for the hash (for the cache object). so dl-cdn.alpinelinux.org and cdn.alpinelinux.org share the same cache 2024-11-21 11:53:54 or if we want to separate them 2024-11-21 11:54:40 To me it would make sense to treat them the same 2024-11-21 11:55:55 the config will be somewhat hackish 2024-11-21 11:58:10 i wonder if we should use cdn.alpinelinux.org as the long term goal 2024-11-21 11:58:38 and first phase we add cdn.a.o, without shared cache 2024-11-21 11:58:49 then we slowly warm up the cache for cdn.a.o 2024-11-21 11:59:26 finally we rewrite dl-cdn.a.o to use the cdn.a.o cache 2024-11-21 11:59:43 and then hopefully, some time in the future we can remove dl-cdn.a.o 2024-11-21 11:59:59 Not sure if we can ever remove it 2024-11-21 12:00:27 At least, not any time soon 2024-11-21 12:03:59 the question is if we want to do: if (req.http.host == "cdn.alpinelinux.org") { set req.http.host = "dl-cdn.alpinelinux.org" } 2024-11-21 12:04:05 or the other way around: 2024-11-21 12:04:25 if (req.http.host == "dl-cdn.alpinelinux.org") { set req.http.host = "cdn.alpinelinux.org" } 2024-11-21 12:04:44 this is what will be visible in the backend x-forwarded-for 2024-11-21 12:05:40 i suppose it does not matter 2024-11-21 12:06:09 i also dont know if we want to distinguish them in the backend 2024-11-21 12:06:28 which we can but then I probably need to roll the entire config as custom 2024-11-21 12:06:59 no we dont 2024-11-21 12:07:13 ok i think I have an idea now
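A rough sketch of that vcl_hash idea, extending the fragments quoted above; the exact syntax would need to be checked against fastly's VCL dialect and this is not the actual config:

    sub vcl_hash {
      # hash cdn.alpinelinux.org/<path> to the same cache object as
      # dl-cdn.alpinelinux.org/alpine/<path>, without changing what the backend sees
      declare local var.hash_host STRING;
      declare local var.hash_url STRING;
      set var.hash_host = req.http.host;
      set var.hash_url = req.url;
      if (req.http.host == "cdn.alpinelinux.org") {
        set var.hash_host = "dl-cdn.alpinelinux.org";
        set var.hash_url = "/alpine" req.url;
      }
      set req.hash += var.hash_url;
      set req.hash += var.hash_host;
      return (hash);
    }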
2024-11-21 12:17:48 Migh 2024-11-21 12:19:04 might be interesting to know how much traffic dl-cdn still gets in the future 2024-11-21 12:35:11 yeah 2024-11-21 12:35:15 i have a fix for that 2024-11-21 12:35:18 but 2024-11-21 12:35:40 currently https://cdn.alpinelinux.org is not enabled in fastly 2024-11-21 12:35:43 it errors 2024-11-21 12:35:54 so I can test it on dl-cdn-test first 2024-11-21 12:39:29 Need to fix the certs as well 2024-11-21 13:03:36 I added cdn.alpinelinux.org to the test config, but it does not work for some reason. gives 404 2024-11-21 13:33:09 i have no idea why cdn.alpinelinux.org gives 404 right now 2024-11-21 13:41:47 perhaps because the t1 mirrors do not serve it? 2024-11-21 13:43:00 the nginx config does not mention any domain names 2024-11-21 13:43:13 it works now 2024-11-21 13:43:20 i added the cdn.alpinelinux.org certificate 2024-11-21 13:43:21 It's in the docker config 2024-11-21 13:43:39 the .env file 2024-11-21 13:43:50 did you fix it? 2024-11-21 13:43:58 for nld 2024-11-21 13:44:05 or did it start by itself 2024-11-21 13:44:41 I added cdn.a.o to nld.t1.a.o 2024-11-21 13:44:46 I added the cert but I need to add some validation in DNS 2024-11-21 13:45:23 my linode-tf.git repo seems to be broken 2024-11-21 13:45:43 $ git pull --rebase 2024-11-21 13:45:43 glab auth git-credential: "erase" is an invalid operation. 2024-11-21 13:45:43 remote: HTTP Basic: Access denied. If a password was provided for Git authentication, the password was incorrect or you're required to use a token instead of a password. If a token was provided, it was either incorrect, expired, or improperly scoped. See 2024-11-21 13:45:43 https://gitlab.alpinelinux.org/help/topics/git/troubleshooting_git.md#error-on-git-fetch-http-basic-access-denied 2024-11-21 13:49:48 Do you have some kind of hook installed? 2024-11-21 13:52:42 apparently. i solved it by switching to the ssh url 2024-11-21 13:54:04 how did we validate the dl-cdn domain for the cert? did we create the acme challenge CNAME? 2024-11-21 13:55:05 I think fastly supports LE 2024-11-21 13:55:11 so it should take care of it 2024-11-21 13:55:56 LE? 2024-11-21 13:56:01 Lets Encrypt 2024-11-21 13:56:25 yes, thats what I enabled but we need to verify that we own the domain 2024-11-21 13:56:33 by adding an acme challenge CNAME 2024-11-21 13:56:58 clandmeter took care of that, so not sure what he did 2024-11-21 13:57:56 he added it directly in the linode web interface 2024-11-21 13:58:00 and I did the same now 2024-11-21 13:58:02 ah ok 2024-11-21 14:10:39 ok i think the cert is set up now 2024-11-21 14:10:55 so I just need to add cdn.a.o to traefik on the backends 2024-11-21 14:11:24 i wonder why we use traefik in front of nginx? 2024-11-21 14:20:11 It's how we do it for all docker hosts 2024-11-21 14:21:59 Makes it easier to deploy more applications if we wanted to 2024-11-21 14:37:27 i have added cdn.a.o to the usa.t1.a.o .env 2024-11-21 14:54:05 and I have added cdn.alpinelinux.org to sgp.t1 as well 2024-11-21 15:45:57 ncopa: thanks 2024-11-21 15:47:33 hey, im stressing a bit and messed up 2024-11-21 15:47:43 I wanted to discuss this https://gitlab.alpinelinux.org/alpine/infra/compose/alpine-mirror-sync/-/merge_requests/2 2024-11-21 15:47:52 but hit "merge" by mistake 2024-11-21 15:47:59 i hope it's not automatically deployed 2024-11-21 15:48:28 It's not 2024-11-21 15:48:38 i need to get some food, way overdue for lunch. after that i can revert if we need to do so 2024-11-21 15:49:29 what do you think about the idea? avoid redirect, support short URL, and still be backwards compat 2024-11-21 15:50:00 i don't think we need to do anything on the fastly side, only add support for cdn.a.o 2024-11-21 16:31:08 should I revert it? 2024-11-21 16:39:36 i have some pending changes in fastly as well. would like someone to help me review the changes I'm doing for cdn.alpinelinux.org 2024-11-21 16:40:30 i also realized that on each fastly config activation, we lose the entire cache 2024-11-21 16:44:43 no, i misunderstood. we dont lose the cache on new config 2024-11-21 16:45:02 req.vcl.generation is only incremented when we press "purge all" 2024-11-21 16:56:17 ncopa: i think the changes are fine. We can always change it if it's not behaving as expected 2024-11-21 16:59:20 alright. do you mind if I restart the docker containers? do I need to regenerate the docker images? 2024-11-21 17:04:14 i think I'll also go ahead and delete *.gliderlabs.com and add cdn.alpinelinux.org to prod 2024-11-21 17:38:12 I have enabled the updated nginx config on {nld,sgp,usa}.t1 2024-11-21 18:13:45 I removed the cert created for cdn.a.o and will create a new cert tomorrow 2024-11-21 18:14:39 i also moved cdn.a.o to the fastly production config, and removed *.gliderlabs.com, and activated that 2024-11-21 18:15:08 so now http://cdn.alpinelinux.org/ is working. I'll try to set up another cert later tonight or tomorrow
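A quick way to see which certificate fastly is actually presenting for the new hostname; this is just a generic check, and the -ext option needs a reasonably recent openssl:

    # show subject, SANs and validity of the cert served for cdn.alpinelinux.org
    openssl s_client -connect cdn.alpinelinux.org:443 -servername cdn.alpinelinux.org </dev/null 2>/dev/null \
      | openssl x509 -noout -subject -dates -ext subjectAltName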
2024-11-21 18:17:25 i have a version 93 draft config for prod, which will make cdn.a.o share cache with dl-cdn.a.o/alpine, so we only need to purge one of them in the future 2024-11-21 20:58:27 i dont remember the ip address of the sophgo lxc host for build-3-21-riscv64 2024-11-21 21:00:15 172.16.30.2 2024-11-21 21:00:57 I really need to set up dns for that 2024-11-21 21:01:13 ssh: connect to host 172.16.30.2 port 22: Host is unreachable 2024-11-21 21:01:31 maybe we need to hard reset it 2024-11-21 21:01:38 Yes, I think so 2024-11-21 21:01:47 .3 is reachable 2024-11-21 21:01:51 do we have OOB access? 2024-11-21 21:01:51 clandmeter: ^ 2024-11-21 21:02:08 carlo has remote access to the power supply 2024-11-21 21:03:40 it would reboot both hosts though 2024-11-21 21:04:15 the other is build-edge-riscv64 and ncopa-edge-riscv64 2024-11-21 21:04:22 yes 2024-11-21 21:25:38 Will power toggle 2024-11-21 21:28:44 Done 2024-11-21 21:59:23 im bootstrapping openjdk17 on x86_64 2024-11-21 22:49:10 openjdk17 done on build-3-21-x86_64 2024-11-21 22:49:34 seems like the other pioneer box didnt come back after the power cycle 2024-11-21 22:49:57 so build-edge-riscv64 is down 2024-11-22 06:06:10 Now nld-bld-1 is back but nld-bld-2 is unreachable 2024-11-22 08:38:53 both should be back now 2024-11-22 08:39:07 they regularly do not boot 2024-11-22 08:39:17 i assume some fw issue 2024-11-22 12:20:11 thanks clandmeter! 2024-11-22 12:21:17 clandmeter: i also wonder if you could help me with the cert for https://cdn.alpinelinux.org I have added the SAN to the cert for dl-cdn.alpinelinux.org in fastly, but it still does not work. I have also added the acme challenge cname in linode dns 2024-11-22 12:21:23 it's no hurry though 2024-11-22 12:28:51 im bootstrapping openjdk11 on build-3-21-ppc64le now 2024-11-22 14:52:37 ppc64le should be done bootstrapping openjdk now 2024-11-22 14:53:02 we have a problem with nld-bld-2 2024-11-22 14:53:12 error: object file .git/objects/4d/fd3d673de321e3b8b9109954caad8f54b8c543 is empty 2024-11-22 14:53:21 looks like the filesystem got corrupted 2024-11-22 14:59:10 im re-cloning the aports dir 2024-11-22 15:05:52 Fun. Cloning again is indeed the best option 2024-11-22 15:16:16 run gc on git periodically 2024-11-22 15:16:49 it will club together loose commit objects 2024-11-22 15:29:10 That's not relevant here 2024-11-22 15:29:24 This is an object without any content 2024-11-22 15:30:04 An actual corrupt repo, not some loose objects 2024-11-22 15:34:50 hmm, ok 2024-11-22 15:35:29 ikke: when you have a bit of free time, can these msgs be made a bit more informative, https://m.insteps.net/mqtt/alpine/20241122/rsync/dl-master.alpinelinux.org/v3.21/ pls 2024-11-22 15:36:35 i have signed up for a free redis service, will try to build a POC app first 2024-11-22 18:47:29 build-edge-riscv64 is broken 2024-11-22 18:47:40 ncopa: network issues? 2024-11-22 18:47:49 lots of files installed not owned by any package 2024-11-22 18:47:57 oh, like that 2024-11-22 18:47:59 ouch 2024-11-22 18:48:09 for example bzsomething.h 2024-11-22 18:48:21 so mariadb found bzip2 support during the configure phase 2024-11-22 18:48:39 i think im gonna delete /usr* and apk fix it
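Roughly what that kind of repair looks like with apk, as a sketch; the header and package names below are only stand-ins for the "bzsomething.h" example above, and the exact flag spellings should be double-checked:

    # which package should own the stray header that made mariadb pick up bzip2?
    apk info --who-owns /usr/include/bzlib.h
    # list files in system directories that no longer match what the installed packages shipped
    apk audit --system
    # reinstall a package whose files are damaged or missing
    apk fix --reinstall bzip2-dev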
2024-11-22 18:50:37 do you remember the package that was building when we rebooted it? 2024-11-22 18:52:07 no 2024-11-22 18:55:57 ugh. i deleted the build-edge-riscv64 buildlogs by mistake 2024-11-22 18:56:09 i should not do this stuff when I'm tired 2024-11-22 18:56:49 I think they should've been uploaded to build.a.o 2024-11-22 18:57:14 yeah, no big deal, i just wanted to delete the ones that were older 2024-11-22 18:57:25 see if i could find the package that was building when it was powered off 2024-11-22 18:57:36 make sure the buildlogs dir still exists 2024-11-22 18:58:03 👍 2024-11-22 19:01:20 ncopa: reminds me, we got an email from a developer about rv64. They asked if it matters for the builders to be HW, or whether VMs would also work 2024-11-22 19:01:51 VMs are ok 2024-11-22 19:01:57 was I in the CC? 2024-11-22 19:02:21 yes 2024-11-22 19:02:27 "About Alpine riscv64 builders" 2024-11-22 19:02:55 The original mail was just to me 2024-11-22 19:04:59 found another disturbing email in my inbox. ".... The recommended mitigation is upgrading to Python 3.13.0" ... " what is the timeline ..." 2024-11-22 19:05:04 oh great 2024-11-22 19:05:18 ..? 2024-11-22 19:05:58 some security vuln in python and someone wonders when alpine will upgrade to python 3.13 2024-11-22 19:06:17 https://nvd.nist.gov/vuln/detail/CVE-2024-11168 2024-11-22 19:07:15 https://github.com/python/cpython/pull/103849/files 2024-11-22 19:07:56 yeah, he is probably -- something 2024-11-22 19:08:15 i'll deal with him another day 2024-11-22 19:08:44 Tell him as soon as upgrading will not take half a year 2024-11-22 19:09:23 build-edge-riscv64 should be fixed now 2024-11-22 19:09:32 ncopa: nice, thanks 2024-11-22 19:11:24 found the email 2024-11-22 19:13:01 nice of them to reach out 2024-11-22 19:13:13 for CI, VMs are fine, as long as they are big enough 2024-11-22 19:13:45 And native, not emulated 2024-11-22 19:19:04 yeah 2024-11-22 19:28:59 ncopa: did you receive my last reply? 2024-11-22 19:29:34 i did thanks 2024-11-22 19:30:16 ok good, I got a bounce for clandmeter 2024-11-22 20:07:24 ikke: can I subscribe to the mqtt feed that build.alpinelinux.org uses? Curious if that's publicly available, seems like from recent chatter it is? 2024-11-22 20:07:49 durrendal: msg.alpinelinux.org 2024-11-22 20:10:56 thank you!
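For reference, subscribing to that feed with the mosquitto client tools would look something like this; the port and whether TLS or authentication is required are assumptions about msg.alpinelinux.org, not confirmed details:

    apk add mosquitto-clients
    # subscribe to every topic and print topic + payload; narrow the topic filter once the layout is known
    mosquitto_sub -h msg.alpinelinux.org -p 1883 -t '#' -v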
2024-11-22 22:34:56 is librttopo really fetching the tarball from distfiles.a.o or is it a temporary routing? https://build.alpinelinux.org/buildlogs/build-3-21-riscv64/community/librttopo/librttopo-1.1.0-r5.log 2024-11-22 22:35:40 the issue seems to be that the distfiles.a.o version has a different checksum from the upstream source url 2024-11-22 22:36:37 if I recall correctly, upstream repo tarball downloading was down for a while, not sure if it was temporarily switched to use distfiles.a.o during this time 2024-11-22 22:37:25 just ran a test locally fetching from upstream and the APKBUILD checksum matches the upstream tarball 2024-11-22 22:38:15 and is different from the distfiles.a.o one, so it errors. if there's some way to point it back upstream it should be fine 2024-11-22 22:46:08 never mind, will add an MR to re-fetch 2024-11-23 00:20:27 could someone potentially run the following command on the x86_64 and x86 builders and tell me the result? it's an upstream check for fmemopen. thanks! 2024-11-23 00:20:31 printf "#include <stdio.h>\n%s\n" "int main(void){char *buf; fmemopen(&buf, 0, \"r\");}" | cc -xc - >/dev/null 2>/dev/null && echo 1 || echo 0 2024-11-23 00:20:55 (3.21 x86_64/x86 builders) 2024-11-23 11:00:04 build-3-21-x86_64 [~]# printf "#include <stdio.h>\n%s\n" "int main(void){char *buf; fmemopen(&buf, 0, \"r\");}" | cc -xc - >/dev/null 2>/dev/null && echo 1 || echo 0 2024-11-23 11:00:04 1 2024-11-23 11:11:27 ikke: do you think you will have time to bootstrap openjdk* on aarch64 today? 2024-11-23 11:38:46 Yes 2024-11-23 13:38:26 bootstrapping openjdk11 on aarch64 now 2024-11-23 13:45:39 same for openjdk21 2024-11-23 13:54:21 done 2024-11-23 14:24:21 ncopa: thanks 2024-11-23 20:32:56 cleaned up che-bld-1 again 2024-11-24 14:23:25 ikke: still no idea on why that is happening on build-edge-riscv64? https://build.alpinelinux.org/buildlogs//build-edge-riscv64/community/delfin/delfin-0.4.8-r0.log 2024-11-24 14:26:37 I can fix it, but not sure what causes it to happen 2024-11-24 14:41:47 seems to be happening for every aport 2024-11-24 14:49:27 I suspect it's the trigger 2024-11-24 14:49:32 Executing ca-certificates-20240705-r0.trigger 2024-11-24 14:50:50 run-parts: can't execute '/etc/ca-certificates/update.d/certhash': Exec format error 2024-11-24 14:53:18 There was some file corruption after the host crashed 2024-11-24 15:10:35 omni: should be fixed now 2024-11-24 15:40:55 There were many files that got corrupted. I did a reinstall by uninstalling packages and reinstalling 2024-11-24 15:41:33 But files in /etc/* are protected, so you only get .apk-new files 2024-11-24 15:41:43 I scanned /etc for .apk-new 2024-11-24 15:41:54 and deleted the obvious stuff 2024-11-24 15:42:05 i didn't touch certs 2024-11-24 15:42:08 ca-certificates.conf.apk-new was still there (and the .conf file 0 bytes) 2024-11-24 15:42:21 I missed tags one 2024-11-24 15:42:27 tjat 2024-11-24 15:42:32 that 2024-11-24 15:42:38 and also /etc/ca-certificates/update.d/certhash (binary) 2024-11-24 15:42:49 (or script actually) 2024-11-24 15:43:04 i missed the cert stuff 2024-11-24 15:43:24 No worries, but that's why the bundle file became empty 2024-11-24 15:43:29 probably something else as well 2024-11-24 15:43:39 yeah
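A rough sketch of sweeping for that kind of leftover after such corruption; the file names are the ones mentioned above, and whether the .apk-new copy is the one to keep has to be judged case by case:

    # find config files apk left as .apk-new because the (corrupted) originals looked locally modified
    find /etc -name '*.apk-new'
    # for a file that is clearly garbage (the .conf was 0 bytes), take the packaged version
    mv /etc/ca-certificates.conf.apk-new /etc/ca-certificates.conf
    # reinstall the package and regenerate the bundle
    apk fix --reinstall ca-certificates
    update-ca-certificates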
2024-11-24 15:44:31 py3-networkx tests deadlock on riscv64 2024-11-24 15:50:13 i wonder if we should just disable it for riscv64 + dependees 2024-11-24 21:38:15 is build-3-21-riscv64 stuck? 2024-11-24 21:38:23 oh 2024-11-24 21:38:33 writing before I read 2024-11-24 21:40:02 is it a specific test it is stuck at? 2024-11-24 23:52:56 could build-3-21-riscv64 be restarted anyway? to let it catch up building other aports 2024-11-24 23:55:24 I also have !75811 but it hasn't run on riscv64 yet 2024-11-24 23:58:13 heirloom-doctools !75717 also waiting for riscv64 to finish but should otherwise be ready 2024-11-24 23:58:57 not mine, but is one of the 3.21 blockers 2024-11-25 00:12:29 ... and it's done, cool 2024-11-25 08:59:04 oh 404 on gitlab 2024-11-25 08:59:57 ikke: ^ 2024-11-25 09:01:59 what happened to gitlab? 2024-11-25 09:02:22 nobody knows 2024-11-25 09:02:25 im restarting it 2024-11-25 09:02:50 at least i get bad gateway now 2024-11-25 09:04:15 and she is back 2024-11-25 09:04:50 nice thx 2024-11-25 09:12:34 Yes, was fixing it 2024-11-25 09:14:56 I stopped gitlab to fix load issues 2024-11-25 10:56:42 ikke: ok i didnt know you were on it 2024-11-25 10:56:45 so i just restarted it 2024-11-25 10:56:54 Np, I was distracted 2024-11-25 16:26:02 perhaps of interest: https://about.gitlab.com/releases/2024/11/21/gitlab-17-6-released/#merge-at-a-scheduled-date-and-time 2024-11-25 18:56:18 https://forum.openwrt.org/t/183206 https://lists.openwrt.org/pipermail/openwrt-devel/2024-January/041989.html 2024-11-25 19:22:35 ikke: has something like SeaweedFS been considered to deal with the disk space/locally built package issue? 2024-11-25 19:22:44 https://github.com/seaweedfs/seaweedfs 2024-11-25 19:23:51 It's a distributed FS with FUSE & S3 api capabilities. You can run volume servers on remote hardware (wherever you have the most storage) and automatically replicate data between systems. 2024-11-25 19:24:53 I've been using it for affordable bulk storage for my Loki stack at $work, but it struck me that it may have a use case for the build infrastructure.
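As a concrete picture of what durrendal is describing, a minimal SeaweedFS layout might look roughly like this; the hostnames, ports and paths are made up, the flags should be checked against the SeaweedFS docs, and this is not something the infra actually runs:

    # all-in-one master + volume + filer, with the S3 gateway enabled
    weed server -dir=/srv/seaweed -s3
    # add capacity from another box by pointing a volume server at the master
    weed volume -dir=/srv/seaweed -mserver=seaweed-master.example.org:9333
    # expose the filer as a POSIX-ish filesystem over FUSE
    weed mount -filer=seaweed-master.example.org:8888 -dir=/mnt/packages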
2024-11-25 19:26:02 Not really, but what we have in mind would mean builders would no longer have a complete repository available 2024-11-25 19:27:11 And we don't have servers with enough storage available anyway 2024-11-25 19:27:59 how much storage is there? 2024-11-25 19:30:31 what do you currently have in mind? removing the dependency on a complete repository sounds interesting 2024-11-25 19:32:08 durrendal: nothing too concrete just yet, but the idea is being able to have multiple builders that can pick up build jobs and then upload the packages somewhere 2024-11-25 19:32:17 quite similar to how our CI already works 2024-11-25 19:40:22 that makes sense, so they'd pull down whatever dependencies they needed from the apk repositories, instead of fetching the dependencies locally? 2024-11-25 19:46:37 probably, yes 2024-11-25 19:48:16 omni: we have various builders, each with various amounts of local storage 2024-11-25 19:48:58 But I'm not sure we can rely on the storage of a specific builder 2024-11-25 19:49:23 Does that adjust the disaster recovery plan any? More frequent snapshots of the package repositories? (I think I remember you mentioning there are backups from the rsync snafu, just curious how you think about all of this) 2024-11-25 19:50:15 Yes, that's something that I was thinking of as well 2024-11-25 19:57:17 it makes sense to make the build systems as light and easy to configure as possible. Especially if it meant you could spin up more systems extremely rapidly without worrying about their maintenance 2024-11-25 19:57:28 yup 2024-11-25 19:57:42 but it sounds like if the repository was lost we'd be rebuilding from scratch, which sounds significant. 2024-11-25 19:57:50 suppose that's always a risk regardless 2024-11-25 21:51:33 thankfully we already have noarch for arch-independent packages and can now properly use it 2024-11-25 23:06:59 for pcc-libs, the source and project url have returned 404 for some time, could the cached tarball at https://distfiles.alpinelinux.org/distfiles/v3.20/pcc-libs-20230603.tgz be copied to dev.a.o for now for the APKBUILD to link to until upstream comes back? 2024-11-25 23:07:31 (if APKBUILDs usually don't link to distfiles.a.o directly) 2024-11-26 06:37:14 mio: I've copied it to the 3.21 directory, so the builders should be able to pick it up 2024-11-26 06:37:26 even better, thanks 2024-11-26 08:05:51 sorry, can you copy https://distfiles.alpinelinux.org/distfiles/v3.20/pcc-20230603.tgz as well to 3.21? it came up on the 3.21 x86 builder 2024-11-26 08:06:33 pcc-libs has been rebuilt 2024-11-26 11:21:32 mio: done 2024-11-26 17:31:56 ikke: thanks 2024-11-27 17:14:06 ikke: it seems like it says "ready to merge" when the first CI passes (but the second does not) https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/75998 2024-11-27 17:14:32 looks green here: https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests 2024-11-27 17:16:50 Oh, that's bad 2024-11-27 17:24:03 ncopa: fixed 2024-11-27 19:26:56 durrendal: I was thinking about something you could help with 2024-11-27 19:27:47 Create something to deploy openbao 2024-11-27 19:27:55 (vault) 2024-11-27 19:28:16 Probably in docker + compose 2024-11-27 20:05:10 I'd be more than happy to help with that :) anything particular to take into consideration while designing (either from a nuance of the infrastructure or preference perspective)? 2024-11-27 20:06:39 One thing to figure out is how we are going to take care of unsealing the vault 2024-11-27 20:07:22 And delegating tokens from the root token to use 2024-11-27 20:08:16 For the rest, you could take a look at https://gitlab.alpinelinux.org/alpine/infra/docker and https://gitlab.alpinelinux.org/alpine/infra/compose to get a feeling for some common patterns 2024-11-27 20:21:03 Definitely a fun project to tackle, I'm pulling down the most recently updated docker and compose repos now so I can take a look. Thanks ikke! 2024-11-27 20:22:10 If I have any questions or ideas that need a second set of eyes I know where to ask :) 2024-11-27 20:22:31 :) 2024-11-28 08:15:09 is the gitlab api/v4 service still available? probably got disabled during the upgrade 2024-11-28 08:15:38 Is still available 2024-11-28 21:42:50 im working on openjdk21 on build-3-21-riscv64 2024-11-29 00:46:01 is build-edge-riscv64 on pause to give more resources to build-3-21-riscv64? 2024-11-29 06:07:00 omni: no, it's a diffent host 2024-11-29 06:07:03 different 2024-11-29 08:10:12 im working on ghc on build-3-21-x86_64 now 2024-11-29 14:28:21 Hi, hope you are doing well :). Would I be eligible to get Gitlab permissions to remove status:mr-stale tags (primarily from my own MRs)? 2024-11-29 14:37:48 chereskata: in gitlab itself you need to be at least a developer before you can adjust labels on merge requests 2024-11-29 14:38:12 i think it could be too early to get "lifted" 2024-11-29 14:38:42 as well as i am active more in bursts than regularly 2024-11-29 14:38:50 I was thinking about adding support for commands to aports-qa-bot 2024-11-29 14:39:09 how complex is this in general? 2024-11-29 14:39:15 chereskata: that should in principle not be an issue 2024-11-29 14:39:24 chereskata: how complex is what? 2024-11-29 14:39:36 building the "unblock me" chatbot functionality 2024-11-29 14:40:13 I think not too difficult if you know go 2024-11-29 14:40:27 The bot already receives webhooks 2024-11-29 14:40:39 Currently mostly regarding merge request events 2024-11-29 14:40:46 interesting, never touched that before, but sounds doable 2024-11-29 14:41:06 Only thing is that we should think about permissions 2024-11-29 14:43:31 yes, this has to be well structured in a whitelist approach
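Purely as an illustration of the whitelist idea, and not the actual aports-qa-bot code, a note-webhook handler in Go could gate commands on a per-command allow list along these lines; the command name, usernames and endpoint are made up:

    package main

    import (
        "encoding/json"
        "log"
        "net/http"
        "strings"
    )

    // noteEvent is a trimmed-down view of GitLab's "note" webhook payload.
    type noteEvent struct {
        ObjectKind string `json:"object_kind"`
        User       struct {
            Username string `json:"username"`
        } `json:"user"`
        ObjectAttributes struct {
            Note string `json:"note"`
        } `json:"object_attributes"`
        MergeRequest struct {
            IID int `json:"iid"`
        } `json:"merge_request"`
    }

    // allowed is the whitelist: which users may run which command.
    var allowed = map[string]map[string]bool{
        "unstale": {"maintainer1": true, "maintainer2": true}, // placeholder usernames
    }

    func handleNote(w http.ResponseWriter, r *http.Request) {
        var ev noteEvent
        if err := json.NewDecoder(r.Body).Decode(&ev); err != nil {
            http.Error(w, "bad payload", http.StatusBadRequest)
            return
        }
        if ev.ObjectKind != "note" {
            return // not a comment event
        }
        text := strings.TrimSpace(ev.ObjectAttributes.Note)
        if !strings.HasPrefix(text, "!") {
            return // not a command
        }
        cmd := strings.TrimPrefix(text, "!")
        if !allowed[cmd][ev.User.Username] {
            log.Printf("refusing %q from %s on MR %d", cmd, ev.User.Username, ev.MergeRequest.IID)
            return
        }
        // here the bot would call the GitLab API, e.g. to remove the status:mr-stale label
        log.Printf("would run %q for MR %d", cmd, ev.MergeRequest.IID)
    }

    func main() {
        http.HandleFunc("/webhook", handleNote)
        log.Fatal(http.ListenAndServe(":8080", nil))
    }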