One of the troubles we have faced as a validator for Celo is keeping up with all the information that comes up in Celo's Discord discussions. This is especially true for smaller validators whose portfolios include several networks. To help everyone stay in touch with what is going on in the Celo validator scene, and to contribute to the validator and broader Celo community, we have decided to publish the Celo Discord Validator Digest. Here are the notes for the period of 23 November - 6 December 2020.
Discussions
Celostats improvement
@pranay invited everyone to share their views on how Celostats could be improved:
@pranay: ... Celostats (https://stats.celo.org/) has been having some issues lately, but in the past (eg. during Stake Off), it was quite valuable in understanding the health of the network. Question for y'all -- if you use celostats, what features are most valuable for you, and is there any information or features that could be surfaced that you would find useful when operating your nodes?
@Dee | Usopp.club: The real-time view is valuable as it helps us to know if other validators are facing an issue when we get an alert on our validator.
@mbay2002 | Qoor: Honestly "Celo Cauldron" is my go-to. But I do have a tab in my browser open to Celostats at all time as a secondary source. I use it for viewing the validator name, geth version, (Celostats "loses" these two pieces of info over time), affiliation, and score.
@daluxx | StakesStone: I use it to get a quick overview about all elected validators incl. the names, health status (with the bar showing the blocks validated) and their score. But also if I do maintenance I check Celo stats for the latest block to see if my node is in sync again.
@Thylacine | PretoriaResearchLab: Personally, I've always found the colouring/naming system and what actually appears at any given time very confusing. In the early stages of TGCSO I think every node appeared, regardless of its purpose, and this was super useful for me when configuring backups, full nodes for other purposes, etc. Some random thoughts:
More consistency in the colouring and registered names. Seems like recently the registered names are being ignored and everything is just appearing in white as some unknown node.
Why do we even need to register validators and proxies via the celostats attribute? Why can't they just be crawled and use the registered name?
Alternatively, allow full nodes to self-register. Would be great to see my transaction and archive nodes, plus nodes used for attestation and other tools appear with human-readable names I provide. This will be more relevant when fullnode rewards are turned on.
Add filters like: "Show only" [ ] Proxy [ ] Validator [ ] Full Node [ ] cLabs [ ] All
Tighten up the size of the header div. The info div with block number and stats takes up nearly 50% of the screen on a normal 1080p monitor. I would prefer to see that information condensed and more of the list.
Useful info
Mariano | cLabs published the results of cLabs research on the OOM issues some nodes faced several weeks ago:
Initially, @victor | cLabs did some investigation into it, and we ended up with an Incident Report (https://docs.google.com/document/d/1ShECNeXiRp5-WDi6J4vxTFnFISZU94bKNHsibIwE4jQ/edit#) and some additional notes (https://gist.github.com/nategraf/2fc2ef0e1a11da09da00cd7694c86da6#timeline-of-validator-recoveries). We got the intuition that nodes that first received the 4k tx didn't crash, but the ones that received it later crashed. During this milestone, we assigned more time to get to the bottom of it, and I wanted to share the results. You can find the research done by @pastoh here: https://github.com/celo-org/celo-blockchain/issues/1169
TLDR: p2p message explosion, no limit on incoming messages, only 1 thread for processing p2p messages, same thread emptying the p2p message queue validates txs = OOM and/or node stuck for a long time.
This won't be an issue in the future, since go-ethereum EIP-2464 (https://eips.ethereum.org/EIPS/eip-2464) specifies a new way to propagate transactions, which solves this issue and is much more efficient. That EIP is already on celo-blockchain master (https://github.com/celo-org/celo-blockchain/pull/1209) as we slowly but surely advance in merging upstream changes. Expect this to be in the next release (after 1.2.2)! Hope this helps shed some light on the event!
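The announce-then-fetch idea behind EIP-2464 can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not celo-blockchain code: the class `AnnouncementTracker` and its method names are hypothetical, and SHA-256 stands in for the Keccak-256 hash that real nodes use. The point it tries to show is that a node hearing about the same transaction from many peers requests the body only once, instead of queueing one full copy per peer.

```python
import hashlib
from typing import Dict, Iterable, List, Set


def tx_hash(tx: bytes) -> bytes:
    """Stand-in transaction hash (assumption: SHA-256 instead of Keccak-256)."""
    return hashlib.sha256(tx).digest()


class AnnouncementTracker:
    """Hypothetical receiver-side bookkeeping for hash announcements."""

    def __init__(self) -> None:
        self.known: Dict[bytes, bytes] = {}  # hash -> transaction body
        self.in_flight: Set[bytes] = set()   # hashes already requested

    def on_announce(self, hashes: Iterable[bytes]) -> List[bytes]:
        """Return only the hashes whose bodies still need to be requested."""
        wanted: List[bytes] = []
        for h in hashes:
            if h in self.known or h in self.in_flight:
                continue  # already have it, or a request is pending
            self.in_flight.add(h)
            wanted.append(h)
        return wanted

    def on_transactions(self, txs: Iterable[bytes]) -> None:
        """Store delivered bodies and clear their pending requests."""
        for tx in txs:
            h = tx_hash(tx)
            self.in_flight.discard(h)
            self.known[h] = tx
```

With this bookkeeping, a second peer announcing an already requested hash costs the node one 32-byte lookup rather than another full transaction to validate.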
Reminder from Mariano | cLabs to those who used certbot for TLS support on their attestation nodes:
If you followed https://forum.celo.org/t/tls-for-attestation-service/649, you're using certbot to get a certificate.
Those certificates expire after 90 days.
Probably there's nothing to worry about, but it's worth checking that the way you installed certbot also installed a cronjob/timer that automatically tries to renew certificates that are about to expire. In the instructions, we're using Ubuntu with snaps; that actually installs a systemd timer that runs twice a day. So all should be fine, but anyhow, I thought it worth mentioning this just in case!
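For operators who want an independent check on top of the renewal timer, the remaining lifetime of a certificate can be read with Python's standard `ssl` module. The sketch below is only an illustration: the function names are ours, and the host name you pass in is whatever your attestation service is served from.

```python
import socket
import ssl
import time
from typing import Optional


def fetch_not_after(host: str, port: int = 443) -> str:
    """Connect over TLS and return the certificate's notAfter string."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Convert a notAfter string such as 'Jan  5 09:34:43 2018 GMT' to days left."""
    expires_at = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires_at - now) / 86400
```

Calling `days_until_expiry(fetch_not_after("attestations.example.com"))` (a placeholder host name) from a monitoring script gives a number you can alert on if it drops lower than expected, which would suggest the renewal timer is not doing its job.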
The Churrito fork went smoothly on Baklava:
@Or | cLabs: We're past the Churrito activation block, blocks are still coming in, and both the explorer and celo cauldron are still working. From now on, unupgraded full nodes will be unable to peer with upgraded nodes (there is a fork id check during the handshake). But it doesn't disconnect peers it was already connected to. So I expect unupgraded validators (and other unupgraded full nodes) could see their connectivity degrade over time if upgraded peers disconnect (no idea how often that happens) because they won't be able to reconnect to them.
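To give a feel for what a fork id check does, here is a simplified Python sketch modelled on the checksum defined in Ethereum's EIP-2124: a CRC32 over the genesis hash followed by the block numbers of the forks a node has already passed. We are assuming celo-blockchain's handshake works along these lines; the genesis hash and fork block below are made-up values, and the real validation rules are more permissive than the plain equality test used here.

```python
import struct
import zlib
from typing import Iterable


def fork_hash(genesis_hash: bytes, passed_fork_blocks: Iterable[int]) -> int:
    """CRC32 of the genesis hash, updated with each passed fork block (uint64, big-endian)."""
    checksum = zlib.crc32(genesis_hash)
    for block in passed_fork_blocks:
        checksum = zlib.crc32(struct.pack(">Q", block), checksum)
    return checksum


def peers_compatible(local_hash: int, remote_hash: int) -> bool:
    """Simplified check (assumption): peers must agree on the fork hash exactly."""
    return local_hash == remote_hash


# Hypothetical values for illustration only.
GENESIS = bytes(32)
FORK_BLOCK = 1_000_000

upgraded = fork_hash(GENESIS, [FORK_BLOCK])  # node that knows about the fork
unupgraded = fork_hash(GENESIS, [])          # node running the old release
```

After the activation block, an upgraded node advertises a checksum that includes the fork block while an unupgraded node does not, so a new handshake between the two is rejected. Connections that were established earlier are not re-checked, which matches the gradual loss of connectivity described above.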
Community
zviad | WOTrust | celovote.com published the November issue of Validator round up: https://wotrust.substack.com/p/celo-validator-roundup-november-2020
Cody | cLabs shared an attestation dashboard to observe the reliability of attestation services:
As a community, we've made some great progress on improving the reliability of our Attestation Services. This is incredibly important for the usability of Valora and the Celo-ecosystem as we work on growing adoption globally. To continue the improvement of the service, we'll be proposing a new CIP that provides stronger incentives to ensure high availability of the Attestation Service (stay tuned!). To complement the existing community driven tooling, we've built a public dashboard for Attestation Service operators to observe the reliability of their service and make improvements. We wanted to do this in advance before we introduce stronger incentives to give everyone an opportunity to improve their node's reliability as well as help shape our metrics.
The first thing you should do when opening the dashboard is to input your Validator address in the top right filters.
Verify that your node is on the latest version (and upgrade if not!)
Check your service's FailureRate and AttributableFailureRate. Attributable failures are ones where the user submitted SMS codes from all other attestation nodes except from yours. This number is critical to get as close to zero as possible as these are very likely blocking real users from onboarding. If your node's success rate looks poor, you can use the remaining charts to help identify the issue if it's regional or provider related. This is the first version of the dashboard and feedback is highly encouraged. If something is unclear or there are certain charts you'd like to see, please let us know!
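To make the difference between the two metrics concrete, here is a small Python sketch of how they could be computed from per-request records. The record layout and the choice of denominator (all requests handled by the node) are our assumptions for illustration; the dashboard's actual formulas are not given in the announcement.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AttestationRequest:
    """Hypothetical record of one attestation request handled by our node."""
    completed: bool             # the user submitted the code our node sent
    all_others_completed: bool  # the user submitted codes from every other node


def failure_rates(requests: List[AttestationRequest]) -> Tuple[float, float]:
    """Return (failure rate, attributable failure rate) as fractions of all requests."""
    total = len(requests)
    if total == 0:
        return 0.0, 0.0
    failures = [r for r in requests if not r.completed]
    # Attributable: every other node succeeded, so ours was the only blocker.
    attributable = [r for r in failures if r.all_others_completed]
    return len(failures) / total, len(attributable) / total
```

A failure where other nodes also failed may point to the user or the carrier, whereas an attributable failure isolates the node itself, which is why that number is the one to push towards zero.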
Moola's CELO and cUSD money market contracts went live on mainnet:
@Patrick|Validator.Capital|Moola: Users who are familiar with CLI can use these commands to deposit, borrow, repay, and redeem as well as view reserve data. Please understand that Moola is experimental and may have unknown vulnerabilities. https://github.com/moolamarket/moola
Like what we do? Support our validator group by voting for it!