2024-11-01 00:05:42 https://build.alpinelinux.org/buildlogs/build-edge-s390x/main/mariadb/mariadb-11.4.3-r2.log failing with No space left on device. 2024-11-01 08:41:35 This is annoying 2024-11-01 09:19:54 do we know why it happens? 2024-11-01 09:30:11 Not yet, no obvious large amount of requests (there are many coming from ips from Alibaba), but they get a nice 429 response 2024-11-01 09:30:22 So perhaps a specific request that is expensive 2024-11-01 10:49:15 Uztelekom BGP issue maybe 2024-11-01 13:14:09 we are out of diskspace on s390x builder 2024-11-01 13:23:34 Yup 2024-11-01 13:24:52 We could disable and remove community for eol releases 2024-11-01 14:02:33 maybe I should backup some of the older releases 2024-11-01 14:02:44 and maybe we should ask IBM if they can help 2024-11-01 14:03:20 i think we should ask IBM first 2024-11-01 14:04:03 just explain the problem 2024-11-01 18:11:12 ncopa: 80G free now on usa2 2024-11-01 18:11:39 But that's about all we can get atm 2024-11-02 08:24:46 The load on that server increased over the last 7 days 2024-11-02 08:25:18 (and correlates with requests to cgit) 2024-11-02 10:55:05 Looking at zabbix, there are frequent bursts of requests 2024-11-02 10:55:14 periodic 2024-11-02 15:39:05 ikke: a much better channel for this, thanks. 2024-11-02 15:39:12 durrendal: welcome :-) 2024-11-02 15:39:27 FYI, it's currently a bit spammed by issues with git.a.o, which I'm working on 2024-11-02 15:39:32 (figuring out what's causing it) 2024-11-02 15:40:30 No worries at all, I'm used to noisey monitoring systems and reading through the noise 2024-11-02 15:40:58 I'm trying to prevent noisy monitoring, but I don't want to hide any issues :) 2024-11-02 15:42:06 What are your experiences with working on infrastructure? 2024-11-02 15:44:29 Easy peasy, just fix everything right ;) just a heads up, I may be a bit slow to respond, out with the kids at the moment. 2024-11-02 15:46:04 Insofar as experience goes I've been a linux sysadmin professionally for about 10 years. Currently my day job leans more towards devops heavy on the infrastructure automation side. 2024-11-02 15:46:36 right, and no worry 2024-11-02 15:47:26 Our infrastructure is mostly docker compose or lxc containers 2024-11-02 15:47:34 Mostly maintained by hand 2024-11-02 15:47:38 That devops role has been the last 5 years of my career, lots of ansible, saltstack, terraform. Mixed Debian, Alpine, and windows environments. 2024-11-02 15:49:17 I have loads of docker experience. Not a bunch with lxc directly. But Ive used both LXD and now incus in my lab environments and some on prem production environments. 2024-11-02 15:49:53 lxc is fairly simple 2024-11-02 15:50:07 Just a bunch of containers that act like VMs 2024-11-02 15:54:07 That was my understanding of it. Incus and lxd just provide a nice cli on top of lxc which makes it easier to manage in my opinion. 2024-11-02 15:54:32 I would certainly be amenable to learning lxc though, I don't think that'd be a problem. 2024-11-02 15:55:58 What we use lxc for is quite limited though 2024-11-02 15:56:08 Well, all the builders run it 2024-11-02 15:56:16 but that does not take a lot of maintenance 2024-11-02 16:00:50 That makes sense, and tracks the prior conversation in -linux. The CI is just gitlab-runners running docker containers? 2024-11-02 16:01:01 yes 2024-11-02 16:01:02 What's currently used for monitoring? 2024-11-02 16:01:06 zabbix 2024-11-02 16:05:44 Excellent choice! 
Huge fan of Zabbix personally, it's a solid monitoring systematic, especially for typical infrastructure. 2024-11-02 16:06:07 s/systematic/system/ 2024-11-02 16:06:33 I have extensive experience with it :) 2024-11-02 16:08:13 Some graphs I'm currently watching https://imgur.com/a/5CfaxaR 2024-11-02 16:08:37 We're also using terraform a bit: to manage linode and gitlab 2024-11-02 16:09:07 Just some things, not everything (yet, at least) 2024-11-02 16:17:33 Same here, I've done several implementations of Zabbix at each job I've had. It's like a stalwart companion. 2024-11-02 16:18:03 And those are wicked nice dashboards! I'm actually surprised the load on things like cgit isn't higher honestly. 2024-11-02 16:18:10 The mailserver? 2024-11-02 16:21:28 Sorry I might not be following there, the mailserver? 2024-11-02 16:22:34 Stalwart is the name for a mailserver: https://stalw.art/ 2024-11-02 16:22:44 But I guess you refer to something else 2024-11-02 16:28:16 Oh I hadn't heard of that before, I meant that Zabbix was a stalwart companion. 2024-11-02 16:33:27 ah ok, I'm not too familiar with that term, hence the confusion :) 2024-11-02 16:43:02 No worries at all :) 2024-11-02 16:43:58 Do you currently track issues anywhere? Like a ticketing system or something? 2024-11-02 16:43:59 I'm currently trying to implement rate limiting on nginx based on AS 2024-11-02 16:44:18 durrendal: yes, https://gitlab.alpinelinux.org/alpine/infra/infra 2024-11-02 16:54:34 I had to do exactly that recently with my blog and Gitea instance. All of the LLM scraping has caused ridiculous load on my own servers. 2024-11-02 16:56:44 I'm not sure how the nginx configs are set up, I didn't see any of the configs in the gitlab/infra project, but you should be able to define a snippet that has an array of user agent strings and then set a rate limit in an if statement based on whether that user agent matches. 2024-11-02 16:57:09 durrendal: I'm going to use the geo feature (I'm already using that for gitlab) 2024-11-02 16:57:25 It allows me to map the IP address to ASN 2024-11-02 16:57:42 Not trying to be comprehensive, just the ones that I think are causing issues 2024-11-02 16:58:53 That's a good idea, these things tend to be in large ipsets anyways. And unlike my personal site, you probably don't want to just block the traffic at the firewall 2024-11-02 16:59:27 I have been doing that as well, with ipsets 2024-11-02 17:01:30 My latest trick is responding with 444 if trying to register a user from VPS / Cloud IPs 2024-11-02 17:01:41 At least, the ones that have been spamming users 2024-11-02 17:26:18 I've heard about that just hanging around on IRC, it makes sense to block anything strange like that. Even if it breaks one or two people's attempts to register. At least real people can report it as an issue in that scenario. 2024-11-02 17:27:57 ahuh 2024-11-02 17:28:08 The only annoying one remains AT&T 2024-11-02 18:16:33 Is there something specific to AT&T that's different from other LTE providers? Or is it just anything that implements a CGNAT network structure? 2024-11-02 18:16:57 Not sure if there's anything specific except they seem to attract a lot of spammers 2024-11-02 19:52:51 That is very odd. 2024-11-02 19:53:45 Out of curiosity, all of the infrastructure runs Alpine right? Are they on stable releases or are some running on edge? 2024-11-02 19:54:03 All running alpine, all stable releases 2024-11-02 19:55:05 We dogfood our own OS :) 2024-11-02 20:00:57 Haha I was hoping that was the answer!
Barring missing packages for something I can't imagine a reason you wouldn't, and even then that's just impetus to package things. 2024-11-02 20:18:03 How much time would you say the infrastructure team spends on fighting fires, regular maintenance, and innovation? 2024-11-02 20:19:30 Oof, good question. Fighting fires can vary quite a lot 2024-11-02 20:19:54 Generally our infra is quite stable, so it's not that we are constantly fighting fires 2024-11-02 20:20:37 I don't have good numbers though 2024-11-02 20:21:04 I do kind of iterate through each of those activities though 2024-11-02 20:37:26 It absolutely can, but the fact that you don't have an off the cuff answer I think attests to the stability of the infrastructure. 2024-11-02 20:38:07 I mean, I've been around 6ish years I think, I don't personally remember massive outages during that time period. 2024-11-02 20:41:44 I've now added rate limiting based on ASN :) 2024-11-02 20:42:00 Checking if it works 2024-11-02 20:50:35 Yes, it's rate limiting :) 2024-11-02 20:52:03 \o/ nicely done! 2024-11-02 20:52:15 Haha I forgot algitbot does that 2024-11-02 20:52:26 Just a single AS 2024-11-02 20:52:47 Does that rate limit apply to all of the alpinelinux domains? 2024-11-02 20:52:52 No 2024-11-02 20:52:57 Just a single application 2024-11-02 20:53:48 Ah, that makes sense. Gitlab I assume given the user registration 2024-11-02 20:54:04 yes, but there it's not rate limiting 2024-11-02 20:54:14 Here it's about cgit 2024-11-02 20:54:37 lots of scanners use git.a.o 2024-11-02 20:54:44 This is a single AS 2024-11-02 20:54:48 All from TENCENT 2024-11-02 20:56:34 198 IPs receiving 429 responses 2024-11-02 20:57:15 the module provides stats? 2024-11-02 20:58:26 It probably does, but I'm parsing the access log 2024-11-02 20:59:00 now 2563 IPs :-| 2024-11-02 20:59:04 Lots of IPs 2024-11-02 20:59:26 But the most important part, no more 50x responses during a spike 2024-11-02 21:00:16 https://imgur.com/a/qJJkKLn 2024-11-02 21:03:18 I've found a free database with subnets for each AS. I created a small program that parses that, and generates a list of subnet-to-AS mappings that the nginx geo module takes 2024-11-02 21:03:36 https://iptoasn.com/ 2024-11-02 21:08:51 I must say I quite like the geo module from nginx 2024-11-02 21:09:30 can we put it somehow behind fastly? 2024-11-02 21:10:03 Not sure it will help 2024-11-02 21:10:14 Though it could 2024-11-02 21:10:58 But not sure if we should support this abusive traffic 2024-11-02 21:12:25 They're scraping it with thousands of IPs 2024-11-02 21:12:43 maybe scraping with k8s 2024-11-02 21:14:42 Our secdb receives 30 requests/s, but it's just static data 2024-11-02 21:16:05 And many of the requests are 304 not modified 2024-11-02 21:27:56 There are better, more respectful, ways to gather data than scraping information with thousands of IPs.. 2024-11-02 21:29:24 And especially with git, you just clone the repo once, and you can scrape it locally all you want 2024-11-02 21:29:49 And next, you just fetch the updates 2024-11-02 21:30:02 But people would rather just point their webcrawlers at a webpage 2024-11-02 21:31:51 I used lnav to get an idea where the requests came from 2024-11-02 21:32:16 (lnav can parse logs, and allows you to perform sql-like queries on them) 2024-11-02 21:35:57 Lnav is an excellent tool :D you might like promtail, Loki, & grafana for log ingestion and discovery if you like lnav 2024-11-02 21:36:23 Lets you do the same sort of thing out of band.
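A minimal sketch of what the ASN-based rate limiting described above could look like in nginx (the AS number, prefix, zone name, rate and upstream are invented for illustration; the real configuration is not reproduced here):

    # http context
    # Map the client address to an AS number via CIDR prefixes; the list
    # itself would be generated from the iptoasn.com dump mentioned above.
    geo $client_asn {
        default        0;
        43.134.0.0/16  132203;   # example prefix for the offending AS
    }

    # Only requests from the listed AS get a rate-limit key; everything else
    # maps to "" and is therefore not counted against the zone.
    map $client_asn $asn_limit_key {
        default "";
        132203  "as132203";
    }

    limit_req_zone $asn_limit_key zone=per_asn:10m rate=5r/s;

    upstream cgit_backend {
        server 127.0.0.1:8080;   # stand-in for the real cgit backend
    }

    server {
        listen 80;
        server_name git.example.org;   # stand-in for the real vhost

        location / {
            limit_req zone=per_asn burst=20 nodelay;
            limit_req_status 429;
            proxy_pass http://cgit_backend;
        }
    }

The prefix list itself could be produced from the free https://iptoasn.com/ dump; a rough converter, assuming the ip2asn-v4.tsv layout of range_start, range_end, AS number, country and description, might look like:

    import csv
    import ipaddress
    import sys

    WANTED_ASNS = {"132203"}  # purely illustrative

    def main(tsv_path: str) -> None:
        with open(tsv_path, newline="") as fh:
            for start, end, asn, _country, _desc in csv.reader(fh, delimiter="\t"):
                if asn not in WANTED_ASNS:
                    continue
                # The dump uses arbitrary ranges, so expand them into the CIDR
                # blocks that nginx's geo module expects.
                for net in ipaddress.summarize_address_range(
                    ipaddress.ip_address(start), ipaddress.ip_address(end)
                ):
                    print(f"    {net}  {asn};")

    if __name__ == "__main__":
        main(sys.argv[1])

For ad-hoc digging in the access log itself, an lnav query along the lines of ;SELECT c_ip, count(*) AS hits FROM access_log GROUP BY c_ip ORDER BY hits DESC LIMIT 20 (column names per lnav's built-in access_log format) gives a quick per-IP breakdown.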
2024-11-02 21:37:07 In fact, I have an instance running 2024-11-02 21:37:20 But not everything is fed inti it 2024-11-02 21:37:22 into it 2024-11-02 21:37:28 (only loki, no grafana) 2024-11-02 21:37:49 I have been playing a bit with kubernetes 2024-11-02 21:40:57 Great minds think alike :) grafana is just visualization anyways, Loki's where the cool toys are. 2024-11-02 21:41:12 good visualization can help make sense of the data 2024-11-02 21:41:17 just didn't get to deploy it yet 2024-11-02 21:41:19 What application do you have in mind for k8s? 2024-11-02 21:42:02 Oh absolutely, and there are good use cases for granting someone access to grafana while not giving them ssh access to the box hosting it. 2024-11-02 21:42:16 right 2024-11-02 21:42:30 That's one of the reasons I deployed it 2024-11-02 21:43:11 From everything we've talked about today not having gotten to something sounds like it's a matter of bandwidth more than anything. Alpine is a large project with a decent bit of infrastructure and a small team. 2024-11-02 21:43:19 That makes for time all the more precious. 2024-11-02 21:43:32 s/makes for time/makes time/ 2024-11-02 21:44:58 That's also one of the reasons I was looking into k8s, as a way to make it easier to do basically gitops. Allow members to deploy changes via CI/CD 2024-11-02 21:45:21 There are other ways, of course, but just one potential solution 2024-11-02 21:46:38 But one thing I find challenging with k8s is that storage is painful. From what I can tell, most will defer state to some managed cloud solution 2024-11-02 21:48:14 That is one way to do it. Things like terraform/opentofu and ansible in a CI runner is a pretty common pattern. 2024-11-02 21:49:19 You could potentially get some of the same functionality with incus and its clustering functionality, which would probably be easier to manage than k8s, and doesn't suffer from the same storage issue. 2024-11-02 21:49:50 It's also a smaller shift in terms of understanding, since it's still lxc containers under the hood. 2024-11-02 21:51:31 How does Incus deal with storage? 2024-11-02 21:53:46 It can do it a few different ways, but the suggested method is either a ZFS pool. That can be as complicated as multi disk vdevs, or as simple as a ZFS formatted image file. 2024-11-02 21:53:50 https://linuxcontainers.org/incus/docs/main/explanation/storage/ 2024-11-02 21:54:34 All of the containers share space inside of that pool, and can be restricted to specific resource usage (to prevent excessive growth from taking down everything) 2024-11-02 21:54:57 But from what I see, it's still stored on the host? 2024-11-02 21:55:05 Unless you use CEPH 2024-11-02 21:55:25 Yes that's true, unless you use CEPH. 2024-11-02 21:55:42 That's similar to k8s then 2024-11-02 21:56:10 I was just about to ask what the current setup looks like for the builders. From the earlier conversation it sounds like they need to have access locally to all of the built packages for a given arch? Is that just done on the host itself? 2024-11-02 21:56:40 it works exactly the same as you do locally with abuild 2024-11-02 21:56:46 the packages are collected in ~/packages 2024-11-02 21:57:39 And the fact that the builders have the complete repos saved our bacon the other day :-) 2024-11-02 22:05:42 It works exactly the same, that's neat! I didn't realize the process was so similar. 
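As a rough illustration of the incus storage model discussed above (pool name, instance name and size limit are invented; an actual deployment would differ):

    # Create a ZFS-backed storage pool; without a source argument incus
    # creates a loop-backed image file, the "simple" option mentioned above.
    incus storage create tank zfs

    # Launch an Alpine container on that pool.
    incus launch images:alpine/3.20 builder1 --storage tank

    # Cap how much of the shared pool this one instance may consume.
    incus config device override builder1 root size=50GiB

All instances then share the pool's space, with per-instance limits guarding against one container growing unchecked.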
2024-11-02 22:06:21 Yup, it all uses the same tools 2024-11-02 22:06:30 But then that means you need several hundred gigs of space per builder just to keep up with the current packages. And I think that might just be edge. 2024-11-02 22:06:33 aports-build -> buildrepo -> abuild 2024-11-02 22:06:57 My memory is based off of wanting to setup an edge mirror and then realizing I needed far more disc space than I thought :) 2024-11-02 22:07:09 heh 2024-11-02 22:07:15 When I started, a builder was ~30G 2024-11-02 22:07:47 Then people like me joined the project and started dumping every package they could find into aports ;) 2024-11-02 22:08:22 It's 70G per builder now 2024-11-02 22:08:30 And yes, it's giving us issues 2024-11-02 22:15:09 https://gitlab.alpinelinux.org/alpine/aports/-/blob/master/main/aports-build/aports-build 2024-11-02 22:15:22 This script is triggered by mqtt-exec when changes happen on git 2024-11-02 22:33:09 I can imagine. The shift from small and simple to what now feels like a more general distro is interesting. And I dont think the infrastructure implications of adding packages is something people consider 2024-11-02 22:34:04 So all of the build servers are controlled via the MQTT server essentially. That's the only real interconnection between any of them? 2024-11-02 22:34:32 And I'm guessing they're in various locations? Or are they all servers in a central location? 2024-11-02 22:36:48 Various locations 2024-11-02 22:37:59 And yes, mqtt is what binds things together 2024-11-02 22:42:50 clandmeter: ^ 2024-11-02 22:54:44 Hmm so to implement something like an NFS share it'd need to be over a VPN connection. And there are likely performance implications there 2024-11-03 07:10:25 clandmeter: when you have time, can you look on nld-bld-1? Is unresponsive 2024-11-03 08:12:48 ikke: i can powertoggle them 2024-11-03 08:13:07 if that does not work you will need to wait unil tomorrow 2024-11-03 08:13:36 nod 2024-11-03 08:14:33 power off 2024-11-03 08:19:05 power on 2024-11-03 09:47:03 ikke: stuff is back online? 2024-11-03 09:47:08 yes 2024-11-03 09:47:13 ok 2024-11-03 09:47:39 though I have not seen any messages from the builders on irc yet 2024-11-03 09:48:36 Oh, probably haver to fix the fw again 2024-11-03 09:50:16 yup 2024-11-03 13:53:26 The excessive requests stopped btw 2024-11-03 13:53:30 to git.a.o :-) 2024-11-04 09:01:55 I have replace lxc with incus on my local desktop machine 2024-11-04 09:02:04 incus is pretty nice 2024-11-04 10:24:13 the riscv64 machien cannot pull images for some reason? 2024-11-04 10:24:16 https://gitlab.alpinelinux.org/alpine/aports/-/jobs/1590041 2024-11-04 11:38:27 oh, a configuration error 2024-11-04 11:38:43 it pulls from gitlab.alpinelinux.org, but the registry is registry.alpinelinux.org 2024-11-04 11:40:53 It's fixed now 2024-11-04 13:50:12 It is a very slick solution. I've dropped virtmanager and qemu in favor of incus, though it's all the same under the hood, the CLI is just a slicker solution in my opioid 2024-11-04 13:50:27 s/opiod/opinion/ 2024-11-04 13:51:49 The one thing that it really lacks is support for emulating different hardware. I would love to run some arm or riscv containers/VMs with incus, but they just don't support it. 2024-11-04 17:17:43 Is there any particular reason why the algitbot's zabbix notifications are omitted from the irc logs found on irclogs.alpinelinux.org? 
2024-11-04 18:19:57 durrendal: algitbot does not log its own messages 2024-11-04 18:26:53 Ah, that makes sense, I didn't realize it was handling the logging as well. 2024-11-04 18:27:11 yup 2024-11-04 18:43:30 I was looking at the logs expecting it to be there, was hoping I could get a sense for what type of alerts are typical and potentially get a rough sense of their frequency by pulling them from there 2024-11-05 05:44:07 seems like linode frankfurt had connectivity issues 2024-11-05 05:44:51 though nothing on the status page 2024-11-05 07:43:55 their status page gets updates from frankfurt ;-) 2024-11-05 09:13:11 does alpine have a statue page? 2024-11-05 11:28:00 qaqland: not yet, it's something I'm thinking about implementing 2024-11-05 14:03:22 Is one of our aarch64 builders an M1 mac system? 2024-11-05 15:02:07 One of the CI hosts 2024-11-05 15:03:48 The builders run on Ampere Altra 2024-11-05 15:34:31 Ah I had meant CI not builders, that's where I had seen it actually. 2024-11-05 15:36:05 Kind of neat to see consumer hardware in the mix to take the load off of other systems. I like the idea 2024-11-05 15:36:57 also helps paint a clearer picture of how distributed the infrastructure is 2024-11-05 15:37:40 Which speaking of, I was looking at the Infra issues and noticed the wireguard documentation attached to the project. Are all of the builders meshed together over wireguard? 2024-11-05 16:07:04 durrendal: The builders use dmvpn 2024-11-05 16:07:26 which is a homegrown mesh network solution 2024-11-05 16:23:29 We use wireguard for individual connections to the dmvpn network 2024-11-05 16:23:48 (dmvpn delegates complete subnets) 2024-11-05 16:41:00 im bootstrapping go on build-3-21-riscv64 now 2024-11-05 16:41:28 seemsm like it already was bootstrapped 2024-11-05 16:42:43 Right, I bootstrapped it, but forgot to reenable the builder, sorry 2024-11-05 16:45:57 It needs to be bootstrapped on x86_64 now 2024-11-05 16:47:14 ikke: that's a really cool setup. So then when an admin needs to connect to a box they must first connect to the wireguard VPN to access the system? 2024-11-05 16:54:20 ncopa: minor thing, community/gnuradio test should be fixed 4 days, but it hadn't a chance to retry on s390x 2024-11-05 16:54:53 s/4 days/4 days ago/ ... the build log timestamp is from 2024-10 2024-11-05 16:55:33 it was fixed upstream, unless there is something in s390x that causes it to fail on retry 2024-11-05 16:56:39 just mentioning it in relation to the commit disabling it on s390x due to the test 2024-11-05 16:57:11 the s390x builder was offline for a bit for bootstrapping 2024-11-05 17:09:29 durrendal: some parts, but not everything necessarily 2024-11-05 17:11:09 makes sense, you'd be reliant on the VPN being up to administer everything else, which could be a single point of failure. But it is good to know. 2024-11-05 20:11:54 mio: oh ok. seems like the s390x builder was/is down 2024-11-05 20:14:28 ncopa: yeah, no worries. the test did pass in s390x ci. maybe the maintainer will check it next upgrade and reenable if all is well 2024-11-05 20:15:06 the patch can be removed on next release anyway 2024-11-05 20:21:33 that said, i doubt anyone will ever use gnuradio on a s390x machine 2024-11-05 20:22:58 ikke: i got response from docker support. we need fill in a form every year. I dunno if anyone from infra team wants to do it or if I should go ahead 2024-11-05 20:33:28 ncopa: what is the form about? I could take a look? 
2024-11-05 20:51:37 https://www.docker.com/community/open-source/application/ 2024-11-05 20:58:01 Did you list any sponsors last time? 2024-11-05 21:00:24 I already sent it. Sorry 2024-11-05 21:00:39 filled it out and sent it 2024-11-05 21:00:49 and already got accepted 2024-11-05 21:07:13 ah ok, thanks 2024-11-05 21:20:11 ncopa: I suppose we should add a calendar notification for next year 2024-11-05 22:00:54 Yeah 2024-11-06 19:46:40 CI is backlogged again 2024-11-06 19:47:09 Mostly armv7 2024-11-06 20:36:57 Anoying people scraping aports 2024-11-06 21:03:42 is it bots scrape commit by commit on the cgit/gitlab instance? 2024-11-06 21:04:03 APKBUILDs 2024-11-06 21:05:37 https://tpaste.us/W5RZ 2024-11-06 21:06:15 that sounds like the slowest way to get APKBUILDs possible 2024-11-06 21:06:24 But also the lasiest way 2024-11-06 21:06:30 laziest* 2024-11-06 21:07:22 true, it is definitely the path of least resistance for someone who doesn't care about the impact 2024-11-06 21:08:13 I noticed because I was testing something against the gitlab API and got timeouts 2024-11-06 21:09:21 Do you have experience with Azure? 2024-11-06 21:12:51 Not currently. I've done extensive work with AWS, Linode, and Digital Ocean though. 2024-11-06 21:13:14 I have a project at $work that I need to delve into it though, so it's on the road map 2024-11-06 21:13:41 Right. We received sponsorship from Azure, but we haven't really used it yet 2024-11-06 21:13:51 Azure is kinda complicated 2024-11-06 21:14:13 It does have aarch64 VMs though 2024-11-06 21:14:27 So I was trying to deploy some aarch64 hosts for CI 2024-11-06 21:16:13 Nice! Out of curiosity, what does the sponsorship entail? 2024-11-06 21:17:14 I didn't realize Azure had aarch64, I've always gone to AWS for those. I wonder if those are cheaper than the EC2 instances I've been using to rebuild my droid's kernel 2024-11-06 21:17:48 did you get stuck in the deployment process somewhere? 2024-11-06 21:17:52 Yes 2024-11-06 21:18:05 I tried to deploy a VM from our own image 2024-11-06 21:18:32 Azure kept waiting on something, the VM was running, but I could not connect to it 2024-11-06 21:18:49 I had access via a serial console, but no credentials to login (only ssh key) 2024-11-06 21:19:18 To be honest, I did use the tinycloud variant, so maybe that causes issues 2024-11-06 21:19:28 I was actually about to ask 2024-11-06 21:19:47 I think Azure supports Cloud-Init and their own Azure Linux Agent 2024-11-06 21:20:15 We do have an azure tinycloud variant (tinycloud is mostly cloud-init compatible) 2024-11-06 21:20:46 I didn't know that, I actually thought we only had AWS compatible cloud-init images 2024-11-06 21:21:46 Since a while we also have other variants 2024-11-06 21:21:59 Officially still beta 2024-11-06 21:22:03 https://www.alpinelinux.org/cloud/ 2024-11-06 21:23:04 The last time I went to /cloud it just had amis for ec2, this is exciting to see :) 2024-11-06 21:23:42 There are azure images going back to 3.18 2024-11-06 21:24:27 Still 26 pending jobs for armv7 2024-11-06 21:24:44 (chromium pipelines are holding things up 2024-11-06 21:24:47 ) 2024-11-06 21:25:23 I definitely haven't checked in a hot minute, I've just kept launching VMs on AWS. 
On DO I rolled my own image so I don't even think about it there 2024-11-06 21:25:48 I haven't used any of these much either, but want to test them for Azure now 2024-11-06 21:25:52 I would be curious to see if the cloudinit variant operates differently than the tinycloud one, maybe Azure does something funky 2024-11-06 21:26:25 Yeah 2024-11-06 21:26:35 Sadly it takes some time to import 2024-11-06 21:26:54 You have to create some gallery style image to be able to use aarch64 2024-11-06 21:27:53 And you have to fill in all kinds of details which mostly relate to commercial products 2024-11-06 21:28:47 that sounds like a less than stellar experience 2024-11-06 21:29:09 though it's not like any of these hyperscalers are winning awards for UX 2024-11-06 21:29:21 I'd rather spend the time futzing with Terraform from the get go instead 2024-11-06 21:29:40 Yeah, but even that is tricky with Azure 2024-11-06 21:30:28 The provider depends on their cli tools, which consists of countless python dependencies locked to specific versions 2024-11-06 21:30:39 And authentication is a disaster as well 2024-11-06 21:31:10 sort of like terraform depending on the impossibly difficult to maintain aws-cli 2024-11-06 21:31:25 right 2024-11-06 21:31:32 I've packaged several of the aws tools, it's such a mixed bag 2024-11-06 21:31:47 I did privately package the azure-cli, but never submitted it 2024-11-06 21:32:12 well if it's composed of locked python dependencies then I don't blame you for not submitting it 2024-11-06 21:32:24 salt has taught me the joy that comes with that specific headache 2024-11-06 21:32:51 ah yeah 2024-11-06 21:33:16 The whole ecosystem becomes such a mess with everything locking everything so tight that the only solution is containers or flatpack 2024-11-06 21:33:39 (or venvs for python) 2024-11-06 21:41:25 agreed entirely, and unfortunately everything is written in python because "it's what everyone uses" 2024-11-06 21:41:44 To deploy an image I have to do 4 steps for each region: create a storage account, upload the image as a blob, create a vm image, create a compute image 2024-11-06 21:42:04 thank goodness Golang exists and has at least partially fixed that problem as its gotten more popular 2024-11-06 21:42:13 yup 2024-11-06 21:42:55 that seems needlessly complicated 2024-11-06 21:43:40 Hmm, though it's apparently possible to replicate the compute image to other regions 2024-11-06 21:45:20 Now waiting for the compute image to be deployed 2024-11-06 21:45:43 That seems less frustrating, assuming the image works replicating it across all of other regions would be a small lift 2024-11-06 21:52:41 Ok, deployment complete 2024-11-06 21:56:38 Deploying VM 2024-11-06 22:32:29 Ok, succes 2024-11-06 22:32:46 I did a couple of things a bit different, so not sure what was the exact issue, but I have a VM now 2024-11-06 22:35:13 Was cloud init one of them or did you stick with tiny? 2024-11-06 22:36:10 yeah, picked cloud init 2024-11-06 22:40:45 when i last tried tiny-cloud on azure, the UI thought it wasn't instantiated, but it was accessible -- not sure what it was waiting for. 
2024-11-06 22:41:13 right, but for some reason could not access it 2024-11-06 22:42:11 i need to spin one up soonish anyways to see if they provide any sort of "i'm on azure" via DMI info 2024-11-06 22:44:13 i don't recall if it was azure or gcp, but one had some weirdness about specifying the "cloud user" login 2024-11-06 22:44:42 that had previously confounded a couple of people 2024-11-06 22:45:01 Azure has it 2024-11-06 22:47:33 ikke if you have a chance on the azure host, could you check and see if anything in /sys/class/dmi/id/* has anything that indicates it's on azure? 2024-11-06 22:48:01 sure, will check 2024-11-06 22:50:07 Nothing explicitly mentions azure 2024-11-06 22:50:19 Only inderect microsoft / hyper-v, but that could be anywhere 2024-11-06 22:54:14 Nice Azure aarch64 hosts support 32-bit mode 2024-11-06 23:17:42 That's cool, would the preference be to run aarch64 in 32-bit mode and use that solely for armhf/v7 builds? 2024-11-06 23:21:09 The host itself runs aarch64, but the containers run the 32-bits images with linux32 2024-11-06 23:25:21 That makes sense. I haven't tried running non-x86_64 VMs outside of AWS where that's also supported. I just assumed it would work out the box 2024-11-07 07:11:13 i changed the dependencies for alpine-mksite. Do I need to change some CI job as well? 2024-11-07 07:12:34 or let me rephrase: how and where are wwwtest.a.o and a.o sites generated and published? I need to verify that the correct dependencies are installed 2024-11-07 07:29:35 They're lxc containers 2024-11-07 07:29:45 A.o lives on gbr-app-1 2024-11-07 07:29:58 wwwtest on dev.a.o 2024-11-07 07:30:54 found the repo. alpine-www 2024-11-07 07:31:57 That's an empty repo 2024-11-07 07:32:33 I think I used that for testing with Kubernetes 2024-11-07 07:33:00 (there is a branch which contains more) 2024-11-07 08:10:45 ok. so I have to log in to the lxc and install the dips manually then I suppose 2024-11-07 08:10:49 deps 2024-11-07 08:37:07 Yes, until we have a different way to deploy it 2024-11-07 11:26:19 ncopa: can I help with something? 2024-11-07 11:58:11 just replace lua-cjson with lua-rapidjson, or install them both 2024-11-07 11:58:23 sure, will do 2024-11-07 12:07:54 ncopa: done 2024-11-07 12:08:46 thank you! 2024-11-07 12:09:27 can you re-run the trigger for wwwtest? 2024-11-07 12:09:29 curl https://wwwtest.alpinelinux.org/rpi-imager.json 2024-11-07 12:09:37 should not show any \/ only / 2024-11-07 12:16:03 ncopa: looks okay now 2024-11-07 12:16:56 awesome! thanks! 2024-11-07 12:17:02 I'll merge it to master then 2024-11-07 12:17:22 to production I mean 2024-11-07 12:18:16 and now it works. thank you! 2024-11-07 12:23:03 ncopa: fyi, not entirely sure if that was the cause, but there were some issues deploying an Alpine VM on azure with tinycloud 2024-11-07 12:23:36 azure could not confirm the vm was deployed properly 2024-11-07 12:30:05 i heard. I'd like to fix that, but will not have time this week 2024-11-07 12:30:31 Sure, no problem. I have something working with cloud-init now 2024-11-07 12:30:57 i'd heard that the machine is up, you can ssh to it, but azure does not detect it. So I'd like to compare with a working cloud-init to find out what tiny-cloud needs to do 2024-11-07 13:38:29 ncopa: fyi, I've updated the gitlab-runner-alpine-ci project to take the new registrion flow into account 2024-11-07 14:24:42 ikke: when you're setting up new nodes, like you did yesterday on Azure, is the initial configuration all done by hand, or do you have it scripted in some sense? 
2024-11-07 14:46:46 https://gitlab.alpinelinux.org/-/snippets/1152 2024-11-07 14:47:22 We have not automated managing hosts that much yet 2024-11-07 14:50:49 But running that script gives us a runner that is accepting jobs 2024-11-07 15:20:36 Makes sense, setting up runners is probably what you get the most churn with, and a good place to start. I had a similar process when I was using Gitlab CI 2024-11-07 15:21:19 https://krei.lambdacreate.com/durrendal/Verkos <- though I somewhat over-engineered my own personal solution to setting things up 2024-11-07 15:22:58 ansible might not be a bad fit for something like this, it could be tied into terraform/opentofu, and just as easily run one off in an ad hoc fashion 2024-11-07 15:24:05 My background is chef, where you have more continuous desired state 2024-11-07 15:25:02 (nowadays the open source cinc variant) 2024-11-07 15:26:17 I bounce around a lot. I view ansible a lot like scripts. Sort of one off ad hoc state application. Verkos is inspired by it. 2024-11-07 15:27:04 Professionally though I've done a ton of SaltStack work. Sadly that project is in major turmoil right now 2024-11-07 15:27:59 I was actually looking at Chef/Puppet to potentially replace my SaltStack deployment here at work, but my understanding is those are pull based systems (remote agent checks in every X minutes) versus push based like Salt 2024-11-07 15:28:04 Yeah, I've noticed 2024-11-07 15:28:17 salt was also my first orchestration like tool 2024-11-07 15:28:32 yes, correct 2024-11-07 15:28:47 That's also what i mean with more continuous desired state 2024-11-07 15:28:54 Not task based 2024-11-07 15:32:58 I'm a bit conflicted in what I'd like to achieve 2024-11-07 15:33:48 Ideally people would be able to contribute to our infra just by making merge requests 2024-11-07 15:34:11 Honestly, my use of Salt might have been non-standard. I use the push based applications to perform ad hoc maintenance (on workstations specifically) and the reactor, beacon, orchestrator to do automated deployments for servers. 2024-11-07 15:34:39 since most of Alpine's infrastructure is just servers, it feels like Chef/Puppet would fit well 2024-11-07 15:34:47 Yeah 2024-11-07 15:35:14 That would be a nice workflow. High visibility, MR approvals to make changes so you have everything logged. 2024-11-07 15:35:23 Problem is that chef (and cinc as well) are binary distributions 2024-11-07 15:35:29 I suppose CI could apply those changes or trigger them 2024-11-07 15:36:53 I admittedly haven't looked into it enough, I'm guessing it's not as easy as just packaging the project so it could be apk installed 2024-11-07 15:38:09 For the cinc client (what runs on the managed server), it's ruby, so it's doable, but again it's a dependency locked distribution 2024-11-07 15:38:25 Might have a hard time if you use just the dependencies that are available in aports 2024-11-07 15:39:17 The sum total of our Ruby ecosystem is pretty sparse currently, that's probably a decent lift to get packaged. 2024-11-07 15:39:47 I had initial thoughts to try and package Puppet, but I'm ~100 ruby dependencies deep and not even close... 
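Returning to the runner setup linked at the start of this passage: the linked snippet is not reproduced here, but a generic sketch of what registering a docker-executor runner involves with GitLab's current registration flow (an authentication token created in the GitLab UI first) is roughly:

    # Alpine host assumed; the openrc subpackage supplies the service script.
    apk add gitlab-runner

    # Register against the instance using the pre-created runner token.
    gitlab-runner register \
      --non-interactive \
      --url https://gitlab.alpinelinux.org \
      --token "$RUNNER_TOKEN" \
      --executor docker \
      --docker-image alpine:latest   # placeholder default job image

    rc-update add gitlab-runner
    rc-service gitlab-runner start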
2024-11-07 15:39:55 heh 2024-11-07 15:40:19 I used puppet before, but later switched to chef 2024-11-07 15:41:38 The challenge is security 2024-11-07 15:41:45 I might take a closer look at Chef, I've been fishing for something to replace Salt in the event Broadcom kills it, but it's tough to replace 2024-11-07 15:42:03 yes agreed, you have to trust having a remote agent on all of your distributed infrastructure. 2024-11-07 15:42:25 And I have no clue how Chef or Puppet handles secrets, Salt can be less than graceful with it if you're not careful 2024-11-07 15:43:52 You should avoid setting secrets in attributes in chef, which get sent back to the server 2024-11-07 15:44:09 I'm using hashicorp vault to dynamically obtain secrets 2024-11-07 15:46:39 We also btw have netbox to document our infra 2024-11-07 15:49:46 Agreed there, same thing goes for Salt. Unless you're using GPG encrypted pillar data. 2024-11-07 15:50:17 Vault is something I've wanted to really dive into, but I've never had the chance. What's the administrative burden like? 2024-11-07 15:50:19 chef has also something like that 2024-11-07 15:50:51 I have no personal long-time experience with vault, but I think once you have it setup, it's not bad 2024-11-07 15:51:38 also Netbox sounds like an excellent documentation solution for infrastructure. 2024-11-07 15:51:57 It is, though it's not perfect in our situation, but it suffices 2024-11-07 15:52:10 If there's no day to day friction from it then that sounds like it's pretty low burden :) 2024-11-07 15:52:44 I think Netbox maps more closely to physical locations, at least that's the sense I got when I evaluated it a couple years ago 2024-11-07 15:52:51 yes 2024-11-07 15:53:10 It assumes you are owning the datacenters 2024-11-07 15:53:16 We ended up rolling out SnipeIt at $work, but that's really just for tracking who has what hardware, and probably isn't helpful contextually for Alpine 2024-11-07 15:53:30 yeah, that's not that useful for us 2024-11-07 15:53:37 But we can do with netbox 2024-11-07 15:53:38 If someone donates an entire datacenter to Alpine you're ready though! 2024-11-07 15:53:44 :D 2024-11-07 15:53:46 Absolutely 2024-11-07 15:55:36 But again with vault, you are putting all your eggs in one basket security wise 2024-11-07 15:55:52 You have to really protect it 2024-11-07 15:57:16 you really do, it drastically changes your security posture. 2024-11-07 15:57:55 So while I'd love to deploy it, I'm also hessitent 2024-11-07 15:57:59 hessitant* 2024-11-07 15:58:01 Maybe it would be fine if absolutely everything was meshed across a VPN, and you had high assurance of what systems are part of your build cluster 2024-11-07 15:58:56 Almost everything is alreayd meshed with a VPN 2024-11-07 15:59:47 but with things like an M1 macbook in the mix, some of those systems are within physical reach of someone right? 2024-11-07 15:59:56 Not everything is locked inside a cage in a datacenter I mean 2024-11-07 16:00:16 Not everything needs access to secrets though 2024-11-07 16:00:31 Or at least, secrets stored centrally 2024-11-07 16:00:54 A gitlab runner needs a secret to access gitlab, but using vault would be trading one secret for another 2024-11-07 16:45:13 sorry had to step away for a meeting 2024-11-07 16:46:01 I think I'm considering vault/automation in a centralized context. 
If you wanted to automate the deployment of new systems then really only the orchestration system would need to talk with vault 2024-11-07 16:46:42 pushing limited secrets to a runner makes sense as a security trade off. It reduces exposure to the vault 2024-11-07 16:47:54 But that's centralized, and this feels much more distributed in nature 2024-11-08 00:27:34 anyone other than me abusing gitlab? 2024-11-08 11:51:28 ncopa: what storage do you use for your bpi3 board? I'm concerned using the internal mmc for CI may trash it. 2024-11-08 16:27:30 ikke: I'm using an nvme disk 2024-11-08 16:30:27 Ok 2024-11-08 17:44:06 ikke: ima-evm-utils test failed on 3.21 riscv64 builder with some output about "no xattr" in the logs, but previously passed in riscv64 ci. is there any way to find out why the test is failing on the 3.21 riscv64 builder or if it's missing a dependency? https://build.alpinelinux.org/buildlogs/build-3-21-riscv64/community/ima-evm-utils/ima-evm-utils-1.6.2-r0.log 2024-11-08 17:45:31 ~2 days ago attr was already added to checkdepends for the aport 2024-11-08 17:46:16 here's the job log for riscv64 ci where the test passed https://gitlab.alpinelinux.org/mio/aports/-/jobs/1593587 2024-11-08 17:48:30 the aport passed ci as well on the same pkgver back in sep when it was upgraded to current version, and it passed edge riscv64 builder then 2024-11-08 20:00:10 do we have any control over how many chars are shown of the filenames on the CDN? i.e. https://dl-cdn.alpinelinux.org/alpine/v3.20/releases/cloud/ 2024-11-08 20:17:53 I believe that's darkhttpd 2024-11-08 20:18:07 Not sure if we can control that 2024-11-10 18:17:36 :q 2024-11-10 19:54:16 3.21 aarch64 builder might be stuck with ofono 2024-11-10 19:55:19 might also be a good idea to check on 3.21 x86 builder as well 2024-11-10 19:56:17 I can only check later 2024-11-10 19:57:40 okay, thanks, whenever you have a moment ... just mentioning as they seem to be building on the same aport for some time 2024-11-10 21:39:42 ikke: thanks 2024-11-10 21:39:51 np 2024-11-10 21:40:32 is 3.21 armv7 okay as well? 2024-11-10 21:41:00 Building chromium 2024-11-10 21:41:24 well, py3-puppeteer actually 2024-11-10 21:42:01 been a few hours for py3-pyppeteer, not sure 2024-11-10 21:42:22 Checking the logs 2024-11-10 21:42:30 See if there is progress 2024-11-11 03:34:54 I would like it if our https://alpinelinux.org/atom.xml could contain content 2024-11-11 12:50:28 was there progress on py3-pyppeteer? 2024-11-11 16:22:56 No progress 2024-11-11 17:14:25 okay, thanks 2024-11-12 16:14:27 im bootstrapping openjdk11 on s390x 2024-11-12 16:18:08 ncopa: 👍 2024-11-12 16:30:06 and its done 2024-11-12 16:53:10 could someone maybe check on the x86_64 builder? stuck on py3-pyppeteer, like previously with armv7 2024-11-12 16:59:45 done 2024-11-12 17:00:38 oh wow, loongarch64 finished community 2024-11-12 17:00:53 thanks 2024-11-12 17:29:53 Hey folks - I work for DigitalOcean and our recursive resolvers seem to be blocked from resolving anything in "alpinelinux.org" at the moment. We reached out to linode/akamai where the zone seems to be hosted, but we haven't heard anything back yet 2024-11-12 17:30:46 I was hoping maybe somebody here could help. :) We have a workaround, but it's not super pretty. 2024-11-12 17:51:32 egon1024: our dns is hosted by linode 2024-11-12 17:56:12 That's what we figured - we've sent them tickets and tried reaching out through various channels, but have not had any luck hearing back yet. 
We were hoping maybe we could ask you to ping Linode to at least look at our request? We have quite a few users who are unable to download Alpine. :) 2024-11-12 19:30:29 `mkdir: can't create directory '/home/buildozer/aports/community/xfce4-vala/src': No space left on device` from the 3.21 aarch64 builder 2024-11-12 19:53:42 `mkdir: can't create directory '/home/buildozer/aports/community/plasma-dialer/src': No space left on device` for 3.21 s390x builder, though sounds like it is a known issue 2024-11-12 19:54:51 just mentioning in case it is a new incident 2024-11-12 20:39:07 egon1024: sorry, I was AFK, but could you perhaps share some details (in private if you prefer)? 2024-11-12 20:39:31 ikke: No worries. :) Sure - I'll DM 2024-11-12 20:44:18 full disks, full disks everywhere 2024-11-12 21:03:01 che-bld-1 has 215G free now 2024-11-12 21:05:55 thanks 2024-11-12 21:07:20 and usa2-dev1 also has a bit more free space 2024-11-12 21:07:36 But it will become harder and harder to clean up enough space 2024-11-13 05:58:46 bootstrapping openjdk8 on armhf 2024-11-13 05:59:13 and openjdk21 on s390x 2024-11-13 05:59:52 and openjdk17 on aarch64 2024-11-13 06:16:03 ghc on x86_64 2024-11-13 06:30:55 I think there are test failures for GHC on x86_64 2024-11-13 06:31:11 aarch64 has !check 2024-11-13 06:34:03 Not solved by https://gitlab.haskell.org/ghc/ghc/-/merge_requests/12570 https://gitlab.haskell.org/ghc/ghc/-/merge_requests/13218 the last time i tried 2024-11-13 07:37:21 cely: indeed, 2 test failures 2024-11-13 14:41:50 how is openjdk8 armhf bootstrapping going? 2024-11-13 16:03:52 looks like build-3-21-armhf was done bootstrapping so I rebooted it 2024-11-13 16:22:37 Did you remove the bootstrap package? 2024-11-13 17:25:48 apparently, yes 2024-11-13 19:25:40 im bootstrapping openjdk8-coretto on x86_64 2024-11-13 19:29:53 Thank you 2024-11-13 19:59:02 im bootstrapping openjdk8 on ppc64le 2024-11-13 20:52:52 :| 2024-11-13 20:54:16 I just freed 200G, now only 77 is left :/ 2024-11-13 21:10:55 that's pretty rapid expansion 2024-11-13 21:11:18 It hosts 6 active builders, so it's not strange, but still 2024-11-13 21:11:29 ah well that makes a bit more sense 2024-11-13 21:11:45 It's mostly temporary build files 2024-11-13 21:12:22 this feels like something orchestration management would be perfect for 2024-11-13 21:13:47 I already have a script to wipe everything not essential in one go 2024-11-13 21:13:56 though, is there any reason the build process doesn't attempt to clean up build files after it finishes?
2024-11-13 21:14:34 of course :) I would expect nothing less, I can't imagine anyone removing files by hand this frequently haha 2024-11-13 21:15:02 It does clean up everything in the src and pkg dirs 2024-11-13 21:17:20 But many projects like to write in $OIME 2024-11-13 21:17:23 $HOME 2024-11-13 21:20:09 ah that makes sense 2024-11-14 00:16:00 3.21 s390x builder might be stuck, has been on py3-starlette for a few hours 2024-11-14 00:44:58 what if while asking from aports merge request a separate file is attached/updated, which has list of cleanup files with path 2024-11-14 00:45:15 this probably have to done first time manually 2024-11-14 00:45:31 updating should be easy 2024-11-14 00:45:56 infra admins only use it, as it does not go to aports(git) 2024-11-14 00:46:16 should be attached/updated to gitlab 2024-11-14 00:47:44 since its named after aports, builders can pick that file and run cleanup script after build 2024-11-14 00:49:27 initial phase to making that list could be some work, but not difficult 2024-11-14 00:51:21 1. ls -aRh $HOME/path/to/install > (after dep fetch) 2024-11-14 00:52:08 2. ls -aRh $HOME/path/to/install > /tmp/b.txt (after compile/upload) 2024-11-14 00:52:46 3. diff between a.txt,b.txt , manual cleanup, attach to gitlab 2024-11-14 00:57:45 using tool like sfic might help, not sure on it 2024-11-14 01:09:36 if those cleanup files are not attached, infra team can build gradually(initially targetting messy ones) 2024-11-14 03:14:29 3.21 riscv64 builder may also be stuck, on py3-networkx 2024-11-14 06:01:03 thanks ... could someone possibly unstuck the 3.21 s390x builder as well? 2024-11-14 06:01:43 p00f 2024-11-14 06:01:58 thanks! 2024-11-14 06:06:18 the mustach test failure seems odd, it passed in s390x ci and edge builder 2024-11-14 06:07:47 somehow fails on 3.21 builder 2024-11-14 06:08:42 could go back and disable check on s390x, just mentioning in case anyone would like to look into it first 2024-11-14 06:09:22 edge passing: https://build.alpinelinux.org/buildlogs/build-edge-s390x/community/mustach/mustach-1.2.10-r0.log 2024-11-14 06:09:35 3.21 failed: https://build.alpinelinux.org/buildlogs/build-3-21-s390x/community/mustach/mustach-1.2.10-r0.log 2024-11-14 06:11:35 segfault 2024-11-14 06:13:14 https://tpaste.us/9XNV 2024-11-14 06:13:35 Not sure if that's related, but that's whats in dmesg 2024-11-14 06:14:59 thanks ... looks like it may be better to disable check on the arch 2024-11-14 06:16:10 the older version had a similar error, was hoping it was fixed with an upgrade since it passed on the other builder 2024-11-14 06:17:03 maybe a sensitive test 2024-11-14 06:22:55 mio: seems to be related to valgrind 2024-11-14 06:23:20 https://tpaste.us/nNap 2024-11-14 06:25:47 mio: in this light, I would prefer not to just disable tests, but find out why valgrind is complaining 2024-11-14 06:25:53 or disable the package 2024-11-14 06:26:57 would it not fail in ci first? 2024-11-14 06:27:36 I have seen real bugs only exposed on the builder 2024-11-14 06:28:24 not saying it's necessarily the case here, but it's not good to dismiss an issue just because it passed in CI 2024-11-14 06:29:01 yeah. 
wondering why it passed on edge s390x builder 2024-11-14 06:30:02 will check upstream to see if there's anything on the issue 2024-11-14 06:31:03 the paste is helpful, thanks 2024-11-14 06:32:32 `valgrind ../mustache json must` passes indeed on edge 2024-11-14 06:34:07 One difference is bb -r6 vs -r7 2024-11-14 06:34:31 but it still passes on edge 2024-11-14 06:36:10 The assertion failure apparently is in valgrind itself 2024-11-14 06:36:22 m_debuginfo/image.c is a valgrind file 2024-11-14 06:37:33 mio: you could disable valgrind instead 2024-11-14 06:38:59 ikke: okay, thanks 2024-11-14 06:45:28 is there an option to add more storage to some of these machines? I know that just buys you time between clean ups, but there's something to be said for not being under constant fear of something running out of space enough to work on a permanent solution 2024-11-14 07:11:26 further on mustach: valgrind previously segfaulted on armv7, which led to the novalgrind option when the error was reported upstream. not sure yet if there is a consistent issue on s390x as well 2024-11-14 07:16:55 for mustach itself, yeah, the tests will likely pass without valgrind 2024-11-14 07:17:41 ikke: thanks again for your help 2024-11-14 07:25:39 good morning! I'll bootstrap ghc on aarch64 now 2024-11-14 11:22:22 ncopa: ghc on x86_64 has test failures, but apparently tests were disabled on aarch64 2024-11-14 11:36:30 bootstrapping openjdk17 on s390x 2024-11-14 12:22:23 im bootstrapping openjdk8 on x86 2024-11-14 13:56:14 opendj8 is done on x86. i had to revert the $ORIGIN fix in abuild as it broke things 2024-11-14 13:56:24 im bootstrapping openjdk8 on aarch64 now 2024-11-14 14:02:09 When you say you're bootstrapping the openjdk packages, is there something special to the process, or is it manually building the apkbuild for each? 2024-11-14 14:02:26 Also curious if there are other packages that require this type of intervention? 2024-11-14 14:05:58 durrendal: these are mostly for languages that are self-hosted 2024-11-14 14:06:22 They depend on themselves to build 2024-11-14 14:07:13 But when setting up a new builder for a stable release, these decencies are not available 2024-11-14 14:09:15 So we basically need to install the package from edge and then build the specific package 2024-11-14 14:09:58 Durrendal: https://build.alpinelinux.org/buildlogs/build-3-21-ppc64le/community/openjdk21/openjdk21-21.0.5_p11-r0.log 2024-11-14 14:11:39 In this case, openjdk21-bootstrap is provided by openjdk21 2024-11-14 14:18:01 Ah that makes sense! Not a difficult process, but I get why you'd want to clearly communicate when you're bootstrapping these since you're pulling in packages from edge to handle it 2024-11-14 14:18:40 sbcl builds like this, however in that case there's a -stage0 package, so I guess the process is a little different 2024-11-14 14:26:58 yes 2024-11-14 14:27:11 the -stage0 package can build on its own 2024-11-14 14:34:00 that difference is nuanced, but now that you point it out it makes complete sense. 2024-11-14 14:34:22 sbcl-stag0 is built with ecl, which then builds sbcl. So the process is different. 2024-11-14 14:34:59 how did we get the initial openjdk packages ported then? Did we need to use a non-alpine system to build the initial version? 
2024-11-14 14:35:14 At least initially, we used gcc6 2024-11-14 14:36:11 Not sure how the latest iteration was bootstrapped though, because they are now completely self-hosted (not depending on earlier jdk versions) 2024-11-14 15:00:33 makes sense :) thanks for entertaining my curiosity ikke 2024-11-14 16:12:02 bootstrapping openjdk21 on x86_64 2024-11-14 18:01:33 how is openjdk21 going ikke? 2024-11-14 18:02:47 finished 2024-11-14 18:02:56 thanks! 2024-11-14 18:03:08 builder is running again 2024-11-14 18:03:38 awesome great! 2024-11-14 18:03:53 looks like we need to store acme-client on dev.a.o/archive 2024-11-15 07:16:18 i have cleaned some arm dev containers 2024-11-15 07:16:31 i saw my ncopa-edge-aarch64 took 27G something 2024-11-15 07:16:38 llvm and kernel builds 2024-11-15 07:20:21 Thanks ❤️ 2024-11-15 07:20:47 i wonder if we can delete dev containers of people who left the proj 2024-11-15 07:22:20 there are at least 3 2024-11-15 07:22:34 i think I'll just delete them. if they come back we can create new ones 2024-11-15 07:27:14 Yup 2024-11-15 14:45:47 im bootstrapping ghc on x86_64 2024-11-15 14:46:19 Thanks, fyi, last time I tried it failed the tesds 2024-11-15 14:46:23 Tests* 2024-11-15 15:52:33 can someone please check on the 3.21 aarch64 and riscv64 builders? looks like they are stuck on the tests of the py3 aports they were building ... thanks 2024-11-15 16:08:30 aarch64 builder may be OOMing 2024-11-15 16:11:45 or.. 2024-11-15 16:12:12 network outage 2024-11-15 16:12:26 No, ping is responding 2024-11-15 16:12:32 so OOM 2024-11-15 16:16:25 nu_: If it does not recover, would it be possible to reset it? 2024-11-15 16:22:58 the ghc bootstrap failed indeed 2024-11-15 16:23:17 will have to continue with that other time 2024-11-15 16:23:36 those tests failed on my local machine: 2024-11-15 16:23:39 /tmp/ghctest-3guwnxaz/test spaces/testsuite/tests/rts/pause-resume/list_threads_and_misc_roots.run list_threads_and_misc_roots [exit code non-0] (threaded1) 2024-11-15 16:23:39 Unexpected failures: 2024-11-15 16:23:39 /tmp/ghctest-3guwnxaz/test spaces/testsuite/tests/rts/testwsdeque.run testwsdeque [exit code non-0] (threaded1) 2024-11-15 16:26:28 ncopa: fyi, che-bld-1.a.o is currently not responding 2024-11-15 16:26:31 (ping works though) 2024-11-15 16:26:54 1h ago: 2024-11-15 16:26:56 Memory Used Percentage 2024-11-15 16:26:58 1h 10m 39s91.6759 % 2024-11-15 16:27:48 a few chromium builds, dotnet8 build, and qt6-qtwebengine builds at the same time 2024-11-15 16:30:53 times 6 2024-11-15 17:07:38 okay, thanks 2024-11-15 17:43:19 I though I enabled early-oom on che-bld-1, but apparently I haven't 2024-11-15 17:45:22 ptrc: could you perhaps stop your retry script, which may keep the arm builders memory contended? 2024-11-15 17:48:01 ouch, oops 2024-11-15 17:48:02 stopped 2024-11-15 17:49:51 thanks 2024-11-15 18:58:12 ncopa: We could move the dev containers to the CI host 2024-11-15 19:10:38 ok with me 2024-11-15 19:11:13 is the che-bld-1 still down? 2024-11-15 19:11:21 I still cannot login 2024-11-15 19:11:24 ok 2024-11-15 19:11:26 I have ssh running in a loop 2024-11-15 19:12:05 so it is currently full stop til we have it rebooted? 2024-11-15 19:12:15 It might still recover 2024-11-15 19:12:21 the question is how long before that happens 2024-11-15 19:12:23 maybe it will recover after a week or so :) 2024-11-15 19:12:25 yeah 2024-11-15 19:12:38 do you remember if we have swap enabled on it? 
4G or so 2024-11-15 19:13:07 sounds like a good number 2024-11-15 19:13:53 alright. I'll call it a week and hope it's back Monday 2024-11-15 19:14:10 alright, enjoy the weekend 2024-11-15 19:14:26 i suppose there is nothing we can do 2024-11-15 19:14:49 No, nu was still to setup a vpn or something like that for us to get access to the bmc 2024-11-15 19:15:33 have a nice weekend! 2024-11-16 07:37:31 the aarch64 build host is still not responding 2024-11-16 11:02:24 which one is it? 2024-11-16 11:02:43 at nu? 2024-11-16 13:48:21 clandmeter: yes 2024-11-16 14:11:13 We do not have oob? 2024-11-16 14:15:42 Not that I'm aware of 2024-11-16 20:59:37 the dev.a.o/archive is at 99% disk usage 2024-11-16 21:10:08 I suppose we could get rid of the old loongarch repos 2024-11-16 21:12:08 ptrc: I've added some disk space 2024-11-16 21:13:22 thanks ^^ 2024-11-16 21:13:50 ( not that i needed it right now, just wanted to point it out before it becomes an issue ) 2024-11-17 16:49:43 looking into the oob access 2024-11-17 16:51:03 nu_: thanks 2024-11-17 17:08:09 why did i do 95% of the work for oob and now it doesn't work :/ 2024-11-17 17:12:53 Utterly annoying 2024-11-17 19:05:29 i'll restore it by end of tomorrow 2024-11-17 19:05:41 nu_: thanks, appreciate it 2024-11-17 19:06:06 np, happy to help:) 2024-11-18 15:12:55 the arm and aarch64 builders have been stuck for days? 2024-11-18 15:15:12 omni: https://gitlab.alpinelinux.org/alpine/infra/infra/-/issues/10832/ 2024-11-18 15:20:41 oh, thanks 2024-11-18 15:21:14 those three are on the same host? 2024-11-18 15:21:51 apparently 2024-11-18 15:22:36 I just had in my head that arm* was built on one and aarch64 built on another 2024-11-18 15:24:57 No, same host 2024-11-18 15:25:18 One host as builder, the other for CI 2024-11-18 15:25:25 building qt6-qtwebengine (chromium) for all three architectures for edge and 3.21 and at the same time building chromium for 3.20... 2024-11-18 15:26:23 ☠ 2024-11-18 15:28:16 ikke: thanks for clarifying, I often mix things up 2024-11-18 15:29:54 ikke: when it does get up, would it be possible to let it start with just the 3.20 builds or builds for just one of the architectures or something? 2024-11-18 15:30:47 let that complete, enable the other etc 2024-11-18 15:31:00 Yes, that's my plan 2024-11-18 15:31:15 👍 2024-11-18 15:38:11 omni: there was a time where we temporarily had an extra builder, and during transition, we could split things up a bit 2024-11-18 16:09:24 eta <1,5h 2024-11-18 17:18:05 booting 2024-11-18 17:20:32 it's alive! 2024-11-18 17:22:34 \o/ 2024-11-18 17:23:23 (in the voice of Henry Frankenstein in the 1931 movie) 2024-11-18 19:23:21 I'll try to keep this infra knowledge in mind when merging resource heavy things 2024-11-18 19:24:24 I usually do for CI builds, like wait with a qt?-qtwebengine chromium security upgrade for 3.20-stable before it's built for edge 2024-11-18 19:24:53 I enabled earlyoom as well, hopefully it will prevent this from happening in the future 2024-11-18 19:24:53 but I've been more careless when it comes to hitting the builders with work, as they usually seem to handle it 2024-11-18 19:25:22 or it will make large builds fail more easily! :D 2024-11-18 19:25:38 Building chromium (forks) 9 times will break even the most powerful servers 2024-11-18 19:27:13 and they don't seem to get lighter to build 2024-11-18 19:29:54 the source tarball for qtwebengine-chromium is now (122-based) 821M, it's insane!
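For reference, the earlyoom mentioned above is packaged in the community repository; enabling it on a builder is roughly (assuming the packaged OpenRC service; tuning the threshold is optional and the flag named is just an example):

    # earlyoom kills the largest process before the kernel's own OOM handling
    # leaves the whole host unresponsive, as happened to che-bld-1 above.
    apk add earlyoom

    # Optionally raise the free-memory threshold (earlyoom's -m flag) via
    # /etc/conf.d/earlyoom before starting the service.
    rc-update add earlyoom
    rc-service earlyoom start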
2024-11-18 20:44:09 fyi, I currently only have the 3.20 builders enabled for arm* 2024-11-18 20:51:18 thanks! 2024-11-18 21:31:07 community/zanshin tests hanged on build-3-21-x86. i killed it and rebooted the container 2024-11-18 21:32:29 thanks 2024-11-19 00:28:30 is build-edge-armhf stuck on uploading to main? 2024-11-19 06:06:29 omni: no, just did not start it 2024-11-19 06:06:31 yet 2024-11-19 08:18:09 ikke: looks like arm builder has a loadavg on ~80 2024-11-19 08:18:14 79 2024-11-19 08:20:25 i think I'll stop build-edge-aarch64 for a while 2024-11-19 08:20:44 at least til build-3-21-aarcht64 is done with webkit2gtk-4.1 2024-11-19 08:21:04 or i'll just lxc-freeze it 2024-11-19 08:24:01 Thanks 2024-11-19 08:24:38 build-edge-armv7 has not been started yet either 2024-11-19 08:58:51 i have unfrozen build-edge-aarch64 and have lxc-freeze build-edge-armhf 2024-11-19 08:59:09 will unfreez build-edge-armhf once webkit2gtk-4.1 is done on build-edge-aarch64 2024-11-19 12:34:33 lmk if anyone else needs che-bld-1 oob access 2024-11-19 12:35:12 Carlo has access now? 2024-11-19 12:35:25 looks like it 2024-11-19 12:38:40 very nice! thank you nu_! 2024-11-19 12:41:11 ikke: does vpn work for you? 2024-11-19 12:41:24 i cannot reach the bmc interface 2024-11-19 12:42:10 Yes, I can reach it 2024-11-19 12:42:28 me2 now 2024-11-19 12:42:36 ok 2024-11-19 12:44:24 looks good 2024-11-19 12:45:14 Can you check the fan settings? 2024-11-19 12:45:39 i can check :) 2024-11-19 12:46:15 nu_: how noisy is it? 2024-11-19 12:46:21 how much you want it downed? 2024-11-19 12:55:02 starting the edge-arm7 builder now 2024-11-19 13:09:33 ok 2024-11-19 13:09:36 thanks 2024-11-19 13:10:04 build-edge-aarch64 is alsmot done with qt6-qtwebengine 2024-11-19 13:35:28 ikke: its difficult to analyze the fans now 2024-11-19 13:35:33 as the load is very high now 2024-11-19 13:36:09 i cannot lower the fans on high load, want to prevent the melting point :) 2024-11-19 14:49:43 my impression was the the fans were running on higher rpm than it was neccessary 2024-11-19 14:50:21 the base speed is now lower so it should be better, thx clandmeter:) 2024-11-20 15:47:44 im bootstrapping openjdk11 on build-3-21-x86_64 2024-11-20 17:56:28 bootstrapping openjdk21 on ppc64le 2024-11-20 18:25:00 done 2024-11-20 18:38:58 thanks 2024-11-21 09:53:09 ikke: i think we should add linux-firmware-none to the CI docker image. That way we dont install the 1GB firmwmare when building 3rd party kernel modules 2024-11-21 09:54:18 ncopa: could you make a MR against alpine/infra/docker/build-base? 2024-11-21 10:20:50 ok 2024-11-21 10:21:12 I did docker system prune on the CI x86 machine 2024-11-21 10:23:42 Thanks 2024-11-21 10:45:22 ikke: I'd like to test fastly config for cdn.alpinelinux.org: https://gitlab.alpinelinux.org/alpine/infra/infra/-/issues/10811#note_457742 2024-11-21 10:45:37 how can I do that without disrupt prod? 2024-11-21 10:48:42 we have also a couple of *.gliderlabs.com domains. I think we can simply delete those 2024-11-21 10:54:54 oh, we have an dl-cdn-test.alpinelinux.org 2024-11-21 11:22:27 got it working 2024-11-21 11:22:31 http://dl-cdn-test.alpinelinux.org/ 2024-11-21 11:24:26 cool 2024-11-21 11:24:41 And http://dl-cdn-test.alpinelinux.org/alpine/ works as well 2024-11-21 11:26:37 ncopa: wondering, how would it work with cache invalidation, would we need to invalidate /alpine/ separately from /? 
2024-11-21 11:26:37 ncopa: wondering, how would it work with cache invalidation, would we need to invalidate /alpine/ separately from /? 2024-11-21 11:30:37 dont know 2024-11-21 11:31:31 but note the "index of /alpine" 2024-11-21 11:31:38 so backend doesn't know 2024-11-21 11:32:18 so I wonder if we should do that trick in nginx backend instead 2024-11-21 11:38:00 It would be good to do it in a way that at least fastly is aware of it, but not sure if that's possible 2024-11-21 11:38:38 Otherwise cache efficacy decreases 2024-11-21 11:38:57 would be nice if we could tell varnish that it can use same cache for both, yes 2024-11-21 11:46:00 yes. it is possible. the trick is to rewrite the url in the vcl_hash instead 2024-11-21 11:53:47 i wonder if we also want the req.http.host to be unified for the hash (for the cache object). so dl-cdn.alpinelinux.org and cdn.alpinelinux.org share the same cache 2024-11-21 11:53:54 or if we want to separate them 2024-11-21 11:54:40 To me it would make sense to treat them the same 2024-11-21 11:55:55 the config will be somewhat hackish 2024-11-21 11:58:10 i wonder if we should use cdn.alpinelinux.org as the long term goal 2024-11-21 11:58:38 and first phase we add cdn.a.o, without shared cache 2024-11-21 11:58:49 then we slowly warm up the cache for cdn.a.o 2024-11-21 11:59:26 finally we rewrite dl-cdn.a.o to use the cdn.a.o cache 2024-11-21 11:59:43 and then hopefully, some time in the future we can remove dl-cdn.a.o 2024-11-21 11:59:59 Not sure if we can ever remove it 2024-11-21 12:00:27 At least, not any time soon 2024-11-21 12:03:59 the question is if we want to do: if (req.http.host == "cdn.alpinelinux.org") { set req.http.host = "dl-cdn.alpinelinux.org" } 2024-11-21 12:04:05 or the other way around: 2024-11-21 12:04:25 if (req.http.host == "dl-cdn.alpinelinux.org") { set req.http.host = "cdn.alpinelinux.org" } 2024-11-21 12:04:44 this is what will be visible in the backend x-forwarded-for 2024-11-21 12:05:40 i suppose it does not matter 2024-11-21 12:06:09 i also dont know if we want to distinguish them in the backend 2024-11-21 12:06:28 which we can but then I probably need to roll the entire config as custom 2024-11-21 12:06:59 no we dont 2024-11-21 12:07:13 ok i think I have an idea now
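A rough sketch of that vcl_hash idea, extending the fragments quoted above; the exact syntax would need to be checked against fastly's VCL dialect and this is not the actual config:

    sub vcl_hash {
      # hash cdn.alpinelinux.org/<path> to the same cache object as
      # dl-cdn.alpinelinux.org/alpine/<path>, without changing what the backend sees
      declare local var.hash_host STRING;
      declare local var.hash_url STRING;
      set var.hash_host = req.http.host;
      set var.hash_url = req.url;
      if (req.http.host == "cdn.alpinelinux.org") {
        set var.hash_host = "dl-cdn.alpinelinux.org";
        set var.hash_url = "/alpine" req.url;
      }
      set req.hash += var.hash_url;
      set req.hash += var.hash_host;
      return (hash);
    }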
2024-11-21 12:17:48 Migh 2024-11-21 12:19:04 might be interesting to know how much traffic dl-cdn still gets in the future 2024-11-21 12:35:11 yeah 2024-11-21 12:35:15 i have a fix for that 2024-11-21 12:35:18 but 2024-11-21 12:35:40 currently https://cdn.alpinelinux.org is not enabled in fastly 2024-11-21 12:35:43 it errors 2024-11-21 12:35:54 so I can test it on dl-cdn-test first 2024-11-21 12:39:29 Need to fix the certs as well 2024-11-21 13:03:36 I added cdn.alpinelinux.org to the test config, but it does not work for some reason. gives 404 2024-11-21 13:33:09 i have no idea why cdn.alpinelinux.org gives 404 right now 2024-11-21 13:41:47 perhaps because the t1 mirrors do not serve it? 2024-11-21 13:43:00 the nginx config does not mention any domain names 2024-11-21 13:43:13 it works now 2024-11-21 13:43:20 i added the cdn.alpinelinux.org certificate 2024-11-21 13:43:21 It's in the docker config 2024-11-21 13:43:39 the .env file 2024-11-21 13:43:50 did you fix it? 2024-11-21 13:43:58 for nld 2024-11-21 13:44:05 or did it start by itself 2024-11-21 13:44:41 I added cdn.a.o to nld.t1.a.o 2024-11-21 13:44:46 I added the cert but I need to add some validation in DNS 2024-11-21 13:45:23 my linode-tf.git repo seems to be broken 2024-11-21 13:45:43 $ git pull --rebase 2024-11-21 13:45:43 glab auth git-credential: "erase" is an invalid operation. 2024-11-21 13:45:43 remote: HTTP Basic: Access denied. If a password was provided for Git authentication, the password was incorrect or you're required to use a token instead of a password. If a token was provided, it was either incorrect, expired, or improperly scoped. See 2024-11-21 13:45:43 https://gitlab.alpinelinux.org/help/topics/git/troubleshooting_git.md#error-on-git-fetch-http-basic-access-denied 2024-11-21 13:49:48 Do you have some kind of hook installed? 2024-11-21 13:52:42 apparently. i solved it by switching to the ssh url 2024-11-21 13:54:04 how did we validate the dl-cdn domain for the cert? did we create the acme challenge CNAME? 2024-11-21 13:55:05 I think fastly supports LE 2024-11-21 13:55:11 so it should take care of it 2024-11-21 13:55:56 LE? 2024-11-21 13:56:01 Lets Encrypt 2024-11-21 13:56:25 yes, thats what I enabled but we need to verify that we own the domain 2024-11-21 13:56:33 by adding an acme challenge CNAME 2024-11-21 13:56:58 clandmeter took care of that, so not sure what he did 2024-11-21 13:57:56 he added it directly in the linode web interface 2024-11-21 13:58:00 and I did the same now 2024-11-21 13:58:02 ah ok 2024-11-21 14:10:39 ok i think the cert is set up now 2024-11-21 14:10:55 so I just need to add cdn.a.o to traefik on the backends 2024-11-21 14:11:24 i wonder why we use traefik in front of nginx? 2024-11-21 14:20:11 It's how we do it for all docker hosts 2024-11-21 14:21:59 Makes it easier to deploy more applications if we wanted to 2024-11-21 14:37:27 i have added cdn.a.o to the usa.t1.a.o .env 2024-11-21 14:54:05 and I have added cdn.alpinelinux.org to sgp.t1 as well 2024-11-21 15:45:57 ncopa: thanks 2024-11-21 15:47:33 hey, im stressing a bit and messed up 2024-11-21 15:47:43 I wanted to discuss this https://gitlab.alpinelinux.org/alpine/infra/compose/alpine-mirror-sync/-/merge_requests/2 2024-11-21 15:47:52 but hit "merge" by mistake 2024-11-21 15:47:59 i hope it's not automatically deployed 2024-11-21 15:48:28 It's not 2024-11-21 15:48:38 i need to get some food, way overdue for lunch. after that i can revert if we need to do so 2024-11-21 15:49:29 what do you think about the idea? avoid redirect, support short URL, and still be backwards compat 2024-11-21 15:50:00 i don't think we need to do anything on the fastly side, only add support for cdn.a.o 2024-11-21 16:31:08 should I revert it? 2024-11-21 16:39:36 i have some pending changes in fastly as well. would like someone to help me review the changes I'm doing for cdn.alpinelinux.org 2024-11-21 16:40:30 i also realized that on each fastly config activation, we lose the entire cache 2024-11-21 16:44:43 no, i misunderstood. we dont lose the cache on new config 2024-11-21 16:45:02 req.vcl.generation is only incremented when we press "purge all" 2024-11-21 16:56:17 ncopa: i think the changes are fine. We can always change it if it's not behaving as expected 2024-11-21 16:59:20 alright. do you mind if I restart the docker containers? do I need to regenerate the docker images? 2024-11-21 17:04:14 i think I'll also go ahead and delete *.gliderlabs.com and add cdn.alpinelinux.org to prod 2024-11-21 17:38:12 I have enabled the updated nginx config on {nld,sgp,usa}.t1 2024-11-21 18:13:45 I removed the cert created for cdn.a.o and will create a new cert tomorrow 2024-11-21 18:14:39 i also moved cdn.a.o to the fastly production config, and removed *.gliderlabs.com, and activated that 2024-11-21 18:15:08 so now http://cdn.alpinelinux.org/ is working. I'll try to set up another cert later tonight or tomorrow
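A quick way to see which certificate fastly is actually presenting for the new hostname; this is just a generic check, and the -ext option needs a reasonably recent openssl:

    # show subject, SANs and validity of the cert served for cdn.alpinelinux.org
    openssl s_client -connect cdn.alpinelinux.org:443 -servername cdn.alpinelinux.org </dev/null 2>/dev/null \
      | openssl x509 -noout -subject -dates -ext subjectAltName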
2024-11-21 18:17:25 i have a version 93 draft config for prod, which will make cdn.a.o share cache with dl-cdn.a.o/alpine, so we only need to purge one of them in the future 2024-11-21 20:58:27 i dont remember the ip address of the sophgo lxc host for build-3-21-riscv64 2024-11-21 21:00:15 172.16.30.2 2024-11-21 21:00:57 I really need to set up dns for that 2024-11-21 21:01:13 ssh: connect to host 172.16.30.2 port 22: Host is unreachable 2024-11-21 21:01:31 maybe we need to hard reset it 2024-11-21 21:01:38 Yes, I think so 2024-11-21 21:01:47 .3 is reachable 2024-11-21 21:01:51 do we have OOB access? 2024-11-21 21:01:51 clandmeter: ^ 2024-11-21 21:02:08 carlo has remote access to the power supply 2024-11-21 21:03:40 it would reboot both hosts though 2024-11-21 21:04:15 the other is build-edge-riscv64 and ncopa-edge-riscv64 2024-11-21 21:04:22 yes 2024-11-21 21:25:38 Will power toggle 2024-11-21 21:28:44 Done 2024-11-21 21:59:23 im bootstrapping openjdk17 on x86_64 2024-11-21 22:49:10 openjdk17 done on build-3-21-x86_64 2024-11-21 22:49:34 seems like the other pioneer box didnt come back after the power cycle 2024-11-21 22:49:57 so build-edge-riscv64 is down 2024-11-22 06:06:10 Now nld-bld-1 is back but nld-bld-2 is unreachable 2024-11-22 08:38:53 both should be back now 2024-11-22 08:39:07 they regularly do not boot 2024-11-22 08:39:17 i assume some fw issue 2024-11-22 12:20:11 thanks clandmeter! 2024-11-22 12:21:17 clandmeter: i also wonder if you could help me with the cert for https://cdn.alpinelinux.org I have added the SAN to the cert for dl-cdn.alpinelinux.org in fastly, but it still does not work. I have also added the acme challenge cname in linode dns 2024-11-22 12:21:23 it's no hurry though 2024-11-22 12:28:51 im bootstrapping openjdk11 on build-3-21-ppc64le now 2024-11-22 14:52:37 ppc64le should be done bootstrapping openjdk now 2024-11-22 14:53:02 we have a problem with nld-bld-2 2024-11-22 14:53:12 error: object file .git/objects/4d/fd3d673de321e3b8b9109954caad8f54b8c543 is empty 2024-11-22 14:53:21 looks like the filesystem got corrupted 2024-11-22 14:59:10 im re-cloning the aports dir 2024-11-22 15:05:52 Fun. Cloning again is indeed the best option 2024-11-22 15:16:16 run gc on git periodically 2024-11-22 15:16:49 it will club together loose commit objects 2024-11-22 15:29:10 That's not relevant here 2024-11-22 15:29:24 This is an object without any content 2024-11-22 15:30:04 An actual corrupt repo, not some loose objects 2024-11-22 15:34:50 hmm, ok 2024-11-22 15:35:29 ikke: when you have a bit of free time, can these msgs be made a bit more informative, https://m.insteps.net/mqtt/alpine/20241122/rsync/dl-master.alpinelinux.org/v3.21/ pls 2024-11-22 15:36:35 i have signed up for a free redis service, will try to build a POC app first 2024-11-22 18:47:29 build-edge-riscv64 is broken 2024-11-22 18:47:40 ncopa: network issues? 2024-11-22 18:47:49 lots of files installed not owned by any package 2024-11-22 18:47:57 oh, like that 2024-11-22 18:47:59 ouch 2024-11-22 18:48:09 for example bzsomething.h 2024-11-22 18:48:21 so mariadb found bzip2 support during the configure phase 2024-11-22 18:48:39 i think im gonna delete /usr* and apk fix it
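Roughly what that kind of repair looks like with apk, as a sketch; the header and package names below are only stand-ins for the "bzsomething.h" example above, and the exact flag spellings should be double-checked:

    # which package should own the stray header that made mariadb pick up bzip2?
    apk info --who-owns /usr/include/bzlib.h
    # list files in system directories that no longer match what the installed packages shipped
    apk audit --system
    # reinstall a package whose files are damaged or missing
    apk fix --reinstall bzip2-dev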
2024-11-22 18:50:37 do you remember the package that was building when we rebooted it? 2024-11-22 18:52:07 no 2024-11-22 18:55:57 ugh. i deleted the build-edge-riscv64 buildlogs by mistake 2024-11-22 18:56:09 i should not do this stuff when I'm tired 2024-11-22 18:56:49 I think they should've been uploaded to build.a.o 2024-11-22 18:57:14 yeah, no big deal, i just wanted to delete the ones that were older 2024-11-22 18:57:25 see if i could find the package that was building when it was powered off 2024-11-22 18:57:36 make sure the buildlogs dir still exists 2024-11-22 18:58:03 👍 2024-11-22 19:01:20 ncopa: reminds me, we got an email from a developer about rv64. They asked if it matters for the builders to be HW, or whether VMs would also work 2024-11-22 19:01:51 VMs are ok 2024-11-22 19:01:57 was I in the CC? 2024-11-22 19:02:21 yes 2024-11-22 19:02:27 "About Alpine riscv64 builders" 2024-11-22 19:02:55 The original mail was just to me 2024-11-22 19:04:59 found another disturbing email in my inbox. ".... The recommended mitigation is upgrading to Python 3.13.0" ... " what is the timeline ..." 2024-11-22 19:05:04 oh great 2024-11-22 19:05:18 ..? 2024-11-22 19:05:58 some security vuln in python and someone wonders when alpine will upgrade to python 3.13 2024-11-22 19:06:17 https://nvd.nist.gov/vuln/detail/CVE-2024-11168 2024-11-22 19:07:15 https://github.com/python/cpython/pull/103849/files 2024-11-22 19:07:56 yeah, he is probably -- something 2024-11-22 19:08:15 i'll deal with him another day 2024-11-22 19:08:44 Tell him as soon as upgrading will not take half a year 2024-11-22 19:09:23 build-edge-riscv64 should be fixed now 2024-11-22 19:09:32 ncopa: nice, thanks 2024-11-22 19:11:24 found the email 2024-11-22 19:13:01 nice of them to reach out 2024-11-22 19:13:13 for CI, VMs are fine, as long as they are big enough 2024-11-22 19:13:45 And native, not emulated 2024-11-22 19:19:04 yeah 2024-11-22 19:28:59 ncopa: did you receive my last reply? 2024-11-22 19:29:34 i did thanks 2024-11-22 19:30:16 ok good, I got a bounce for clandmeter 2024-11-22 20:07:24 ikke: can I subscribe to the mqtt feed that build.alpinelinux.org uses? Curious if that's publicly available, seems like from recent chatter it is? 2024-11-22 20:07:49 durrendal: msg.alpinelinux.org 2024-11-22 20:10:56 thank you!
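For reference, subscribing to that feed with the mosquitto client tools would look something like this; the port and whether TLS or authentication is required are assumptions about msg.alpinelinux.org, not confirmed details:

    apk add mosquitto-clients
    # subscribe to every topic and print topic + payload; narrow the topic filter once the layout is known
    mosquitto_sub -h msg.alpinelinux.org -p 1883 -t '#' -v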
2024-11-22 22:34:56 is librttopo really fetching the tarball from distfiles.a.o or is it a temporary routing? https://build.alpinelinux.org/buildlogs/build-3-21-riscv64/community/librttopo/librttopo-1.1.0-r5.log 2024-11-22 22:35:40 the issue seems to be that the distfiles.a.o version has a different checksum from the upstream source url 2024-11-22 22:36:37 if I recall correctly, upstream repo tarball downloading was down for a while, not sure if it was temporarily switched to use distfiles.a.o during this time 2024-11-22 22:37:25 just ran a test locally fetching from upstream and the APKBUILD checksum matches the upstream tarball 2024-11-22 22:38:15 and is different from the distfiles.a.o one, so it errors. if there's some way to point it back upstream it should be fine 2024-11-22 22:46:08 never mind, will add an MR to re-fetch 2024-11-23 00:20:27 could someone potentially run the following command on the x86_64 and x86 builders and tell me the result? it's an upstream check for fmemopen. thanks! 2024-11-23 00:20:31 printf "#include <stdio.h>\n%s\n" "int main(void){char *buf; fmemopen(&buf, 0, \"r\");}" | cc -xc - >/dev/null 2>/dev/null && echo 1 || echo 0 2024-11-23 00:20:55 (3.21 x86_64/x86 builders) 2024-11-23 11:00:04 build-3-21-x86_64 [~]# printf "#include <stdio.h>\n%s\n" "int main(void){char *buf; fmemopen(&buf, 0, \"r\");}" | cc -xc - >/dev/null 2>/dev/null && echo 1 || echo 0 2024-11-23 11:00:04 1 2024-11-23 11:11:27 ikke: do you think you will have time to bootstrap openjdk* on aarch64 today? 2024-11-23 11:38:46 Yes 2024-11-23 13:38:26 bootstrapping openjdk11 on aarch64 now 2024-11-23 13:45:39 same for openjdk21 2024-11-23 13:54:21 done 2024-11-23 14:24:21 ncopa: thanks 2024-11-23 20:32:56 cleaned up che-bld-1 again 2024-11-24 14:23:25 ikke: still no idea on why that is happening on build-edge-riscv64? https://build.alpinelinux.org/buildlogs//build-edge-riscv64/community/delfin/delfin-0.4.8-r0.log 2024-11-24 14:26:37 I can fix it, but not sure what causes it to happen 2024-11-24 14:41:47 seems to be happening for every aport 2024-11-24 14:49:27 I suspect it's the trigger 2024-11-24 14:49:32 Executing ca-certificates-20240705-r0.trigger 2024-11-24 14:50:50 run-parts: can't execute '/etc/ca-certificates/update.d/certhash': Exec format error 2024-11-24 14:53:18 There was some file corruption after the host crashed 2024-11-24 15:10:35 omni: should be fixed now 2024-11-24 15:40:55 There were many files that got corrupted. I did a reinstall by uninstalling packages and reinstalling 2024-11-24 15:41:33 But files in /etc/* are protected, so you only get .apk-new files 2024-11-24 15:41:43 I scanned /etc for .apk-new 2024-11-24 15:41:54 and deleted the obvious stuff 2024-11-24 15:42:05 i didn't touch certs 2024-11-24 15:42:08 ca-certificates.conf.apk-new was still there (and the .conf file 0 bytes) 2024-11-24 15:42:21 I missed tags one 2024-11-24 15:42:27 tjat 2024-11-24 15:42:32 that 2024-11-24 15:42:38 and also /etc/ca-certificates/update.d/certhash (binary) 2024-11-24 15:42:49 (or script actually) 2024-11-24 15:43:04 i missed the cert stuff 2024-11-24 15:43:24 No worries, but that's why the bundle file became empty 2024-11-24 15:43:29 probably something else as well 2024-11-24 15:43:39 yeah
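A rough sketch of sweeping for that kind of leftover after such corruption; the file names are the ones mentioned above, and whether the .apk-new copy is the one to keep has to be judged case by case:

    # find config files apk left as .apk-new because the (corrupted) originals looked locally modified
    find /etc -name '*.apk-new'
    # for a file that is clearly garbage (the .conf was 0 bytes), take the packaged version
    mv /etc/ca-certificates.conf.apk-new /etc/ca-certificates.conf
    # reinstall the package and regenerate the bundle
    apk fix --reinstall ca-certificates
    update-ca-certificates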
2024-11-24 15:44:31 py3-networkx tests deadlock on riscv64 2024-11-24 15:50:13 i wonder if we should just disable it for riscv64 + dependees 2024-11-24 21:38:15 is build-3-21-riscv64 stuck? 2024-11-24 21:38:23 oh 2024-11-24 21:38:33 writing before I read 2024-11-24 21:40:02 is it a specific test it is stuck at? 2024-11-24 23:52:56 could build-3-21-riscv64 be restarted anyway? to let it catch up building other aports 2024-11-24 23:55:24 I also have !75811 but it hasn't run on riscv64 yet 2024-11-24 23:58:13 heirloom-doctools !75717 also waiting for riscv64 to finish but should otherwise be ready 2024-11-24 23:58:57 not mine, but is one of the 3.21 blockers 2024-11-25 00:12:29 ... and it's done, cool 2024-11-25 08:59:04 oh 404 on gitlab 2024-11-25 08:59:57 ikke: ^ 2024-11-25 09:01:59 what happened to gitlab? 2024-11-25 09:02:22 nobody knows 2024-11-25 09:02:25 im restarting it 2024-11-25 09:02:50 at least i get bad gateway now 2024-11-25 09:04:15 and she is back 2024-11-25 09:04:50 nice thx 2024-11-25 09:12:34 Yes, was fixing it 2024-11-25 09:14:56 I stopped gitlab to fix load issues 2024-11-25 10:56:42 ikke: ok i didnt know you were on it 2024-11-25 10:56:45 so i just restarted it 2024-11-25 10:56:54 Np, I was distracted 2024-11-25 16:26:02 perhaps of interest: https://about.gitlab.com/releases/2024/11/21/gitlab-17-6-released/#merge-at-a-scheduled-date-and-time 2024-11-25 18:56:18 https://forum.openwrt.org/t/183206 https://lists.openwrt.org/pipermail/openwrt-devel/2024-January/041989.html 2024-11-25 19:22:35 ikke: has something like SeaweedFS been considered to deal with the disk space/locally built package issue? 2024-11-25 19:22:44 https://github.com/seaweedfs/seaweedfs 2024-11-25 19:23:51 It's a distributed FS with FUSE & S3 api capabilities. You can run volume servers on remote hardware (wherever you have the most storage) and automatically replicate data between systems. 2024-11-25 19:24:53 I've been using it for affordable bulk storage for my Loki stack at $work, but it struck me that it may have a use case for the build infrastructure.
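As a concrete picture of what durrendal is describing, a minimal SeaweedFS layout might look roughly like this; the hostnames, ports and paths are made up, the flags should be checked against the SeaweedFS docs, and this is not something the infra actually runs:

    # all-in-one master + volume + filer, with the S3 gateway enabled
    weed server -dir=/srv/seaweed -s3
    # add capacity from another box by pointing a volume server at the master
    weed volume -dir=/srv/seaweed -mserver=seaweed-master.example.org:9333
    # expose the filer as a POSIX-ish filesystem over FUSE
    weed mount -filer=seaweed-master.example.org:8888 -dir=/mnt/packages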
2024-11-25 19:26:02 Not really, but what we have in mind would mean builders would no longer have a complete repository available 2024-11-25 19:27:11 And we don't have servers with enough storage available anyway 2024-11-25 19:27:59 how much storage is there? 2024-11-25 19:30:31 what do you currently have in mind? removing the dependency on a complete repository sounds interesting 2024-11-25 19:32:08 durrendal: nothing too concrete just yet, but the idea is being able to have multiple builders that can pick up build jobs and then upload the packages somewhere 2024-11-25 19:32:17 quite similar to how our CI already works 2024-11-25 19:40:22 that makes sense, so they'd pull down whatever dependencies they needed from the apk repositories, instead of fetching the dependencies locally? 2024-11-25 19:46:37 probably, yes 2024-11-25 19:48:16 omni: we have various builders, each with various amounts of local storage 2024-11-25 19:48:58 But I'm not sure we can rely on the storage of a specific builder 2024-11-25 19:49:23 Does that adjust the disaster recovery plan any? More frequent snapshots of the package repositories? (I think I remember you mentioning there are backups from the rsync snafu, just curious how you think about all of this) 2024-11-25 19:50:15 Yes, that's something that I was thinking of as well 2024-11-25 19:57:17 it makes sense to make the build systems as light and easy to configure as possible. Especially if it meant you could spin up more systems extremely rapidly without worrying about their maintenance 2024-11-25 19:57:28 yup 2024-11-25 19:57:42 but it sounds like if the repository was lost we'd be rebuilding from scratch, which sounds significant. 2024-11-25 19:57:50 suppose that's always a risk regardless 2024-11-25 21:51:33 thankfully we already have noarch for arch-independent packages and can now properly use it 2024-11-25 23:06:59 for pcc-libs, the source and project url have returned 404 for some time, could the cached tarball at https://distfiles.alpinelinux.org/distfiles/v3.20/pcc-libs-20230603.tgz be copied to dev.a.o for now for the APKBUILD to link to until upstream comes back? 2024-11-25 23:07:31 (if APKBUILDs usually don't link to distfiles.a.o directly) 2024-11-26 06:37:14 mio: I've copied it to the 3.21 directory, so the builders should be able to pick it up 2024-11-26 06:37:26 even better, thanks 2024-11-26 08:05:51 sorry, can you copy https://distfiles.alpinelinux.org/distfiles/v3.20/pcc-20230603.tgz as well to 3.21? it came up on the 3.21 x86 builder 2024-11-26 08:06:33 pcc-libs has been rebuilt 2024-11-26 11:21:32 mio: done 2024-11-26 17:31:56 ikke: thanks 2024-11-27 17:14:06 ikke: it seems like it says "ready to merge" when the first CI passes (but the second does not) https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/75998 2024-11-27 17:14:32 looks green here: https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests 2024-11-27 17:16:50 Oh, that's bad 2024-11-27 17:24:03 ncopa: fixed 2024-11-27 19:26:56 durrendal: I was thinking about something you could help with 2024-11-27 19:27:47 Create something to deploy openbao 2024-11-27 19:27:55 (vault) 2024-11-27 19:28:16 Probably in docker + compose 2024-11-27 20:05:10 I'd be more than happy to help with that :) anything particular to take into consideration while designing (either from a nuance of the infrastructure or preference perspective)? 2024-11-27 20:06:39 One thing to figure out is how we are going to take care of unsealing the vault 2024-11-27 20:07:22 And delegating tokens from the root token to use 2024-11-27 20:08:16 For the rest, you could take a look at https://gitlab.alpinelinux.org/alpine/infra/docker and https://gitlab.alpinelinux.org/alpine/infra/compose to get a feeling for some common patterns 2024-11-27 20:21:03 Definitely a fun project to tackle, I'm pulling down the most recently updated docker and compose repos now so I can take a look. Thanks ikke! 2024-11-27 20:22:10 If I have any questions or ideas that need a second set of eyes I know where to ask :) 2024-11-27 20:22:31 :) 2024-11-28 08:15:09 is the gitlab api/v4 service still available? probably got disabled during the upgrade 2024-11-28 08:15:38 Is still available 2024-11-28 21:42:50 im working on openjdk21 on build-3-21-riscv64 2024-11-29 00:46:01 is build-edge-riscv64 on pause to give more resources to build-3-21-riscv64? 2024-11-29 06:07:00 omni: no, it's a diffent host 2024-11-29 06:07:03 different 2024-11-29 08:10:12 im working on ghc on build-3-21-x86_64 now 2024-11-29 14:28:21 Hi, hope you are doing well :). Would I be eligible to get Gitlab permissions to remove status:mr-stale tags (primarily from my own MRs)? 2024-11-29 14:37:48 chereskata: in gitlab itself you need to be at least a developer before you can adjust labels on merge requests 2024-11-29 14:38:12 i think it could be too early to get "lifted" 2024-11-29 14:38:42 as well as i am active more in bursts than regularly 2024-11-29 14:38:50 I was thinking about adding support for commands to aports-qa-bot 2024-11-29 14:39:09 how complex is this in general? 2024-11-29 14:39:15 chereskata: that should in principle not be an issue 2024-11-29 14:39:24 chereskata: how complex is what? 2024-11-29 14:39:36 building the "unblock me" chatbot functionality 2024-11-29 14:40:13 I think not too difficult if you know go 2024-11-29 14:40:27 The bot already receives webhooks 2024-11-29 14:40:39 Currently mostly regarding merge request events 2024-11-29 14:40:46 interesting, never touched that before, but sounds doable 2024-11-29 14:41:06 Only thing is that we should think about permissions 2024-11-29 14:43:31 yes, this has to be well structured in a whitelist approach
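Purely as an illustration of the whitelist idea, and not the actual aports-qa-bot code, a note-webhook handler in Go could gate commands on a per-command allow list along these lines; the command name, usernames and endpoint are made up:

    package main

    import (
        "encoding/json"
        "log"
        "net/http"
        "strings"
    )

    // noteEvent is a trimmed-down view of GitLab's "note" webhook payload.
    type noteEvent struct {
        ObjectKind string `json:"object_kind"`
        User       struct {
            Username string `json:"username"`
        } `json:"user"`
        ObjectAttributes struct {
            Note string `json:"note"`
        } `json:"object_attributes"`
        MergeRequest struct {
            IID int `json:"iid"`
        } `json:"merge_request"`
    }

    // allowed is the whitelist: which users may run which command.
    var allowed = map[string]map[string]bool{
        "unstale": {"maintainer1": true, "maintainer2": true}, // placeholder usernames
    }

    func handleNote(w http.ResponseWriter, r *http.Request) {
        var ev noteEvent
        if err := json.NewDecoder(r.Body).Decode(&ev); err != nil {
            http.Error(w, "bad payload", http.StatusBadRequest)
            return
        }
        if ev.ObjectKind != "note" {
            return // not a comment event
        }
        text := strings.TrimSpace(ev.ObjectAttributes.Note)
        if !strings.HasPrefix(text, "!") {
            return // not a command
        }
        cmd := strings.TrimPrefix(text, "!")
        if !allowed[cmd][ev.User.Username] {
            log.Printf("refusing %q from %s on MR %d", cmd, ev.User.Username, ev.MergeRequest.IID)
            return
        }
        // here the bot would call the GitLab API, e.g. to remove the status:mr-stale label
        log.Printf("would run %q for MR %d", cmd, ev.MergeRequest.IID)
    }

    func main() {
        http.HandleFunc("/webhook", handleNote)
        log.Fatal(http.ListenAndServe(":8080", nil))
    }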