Celo Discord Validator Digest #20
Migration of nodes to 1.1.0 and 1.2.0, recent attestation failures, useful info and community updates.
One of the challenges we have faced as a Celo validator is keeping up with all the information that comes up in the Celo Discord discussions. This is especially true for smaller validators whose portfolios include several networks. To help everyone stay in touch with what is going on in the Celo validator scene, and to contribute to the validator and broader Celo community, we have decided to publish the Celo Discord Validator Digest. Here are the notes for the period of 26 October - 8 November 2020.
Discussions
Migration of nodes to 1.1.0 and 1.2.0
The new 1.1.0 docker image for mainnet was released. A few days later, it was announced that celo-blockchain v1.2.0 is out (the latest release adds support for multi-proxy setup and hot-swapping of validator nodes). Validators were advised to upgrade their nodes on Baklava and expect a mainnet image on 18 November.
Meanwhile, it seems that in version 1.1.0, nodes are unable to report stats to stats.celo.org:
@zviad | WOTrust | celovote.com: I am setting ethstats on both proxy and validator as:
--ethstats=wotrust2@stats-server.celo.org
is this what could be causing the issue? (i.e. should I have set it only for proxy?)...
Also, this was all working fine before upgrading to 1.1.0, and I have done a few key rotations before with the previous version and it continued to work fine. This doesn't really affect anything for us since our monitoring doesn't depend on anything from https://stats.celo.org/. I am just not sure what the overall priority is to make sure stats get reported there (there are already a ton of validators that don't report stats there anyway).
Several validators reported issues with the latest version:
@keefertaylor: 1.2.0 crashed on baklava last night and didn't restart on its own. Interested if other people see this behavior. I also have this behavior on a 1.1.0 mainnet node.
@Dee | Usopp.club: I'm not able to get my validator running with 1.2.0. It connects to the proxy, I get a synchronisation message, and then it shuts down with this error:
INFO [11-05|23:01:21.046] Block synchronisation started
INFO [11-05|23:01:21.047] Stopping istanbul.Engine validating
INFO [11-05|23:01:21.048] Mining aborted due to sync
INFO [11-05|23:01:21.104] Imported new chain segment blocks=2 txs=1 mgas=0.196 elapsed=32.563ms mgasps=6.021 number=2284960 hash=c473c1…5f0b9d age=5m16s dirty=28.89KiB
WARN [11-05|23:01:21.109] Elected but didn't sign block number=2284959 address=0x516e0CFFF40B1506d10F060Af803524d5E82E577 missed in a row=0
INFO [11-05|23:01:21.987] Imported new chain segment blocks=63 txs=15 mgas=2.820 elapsed=879.544ms mgasps=3.206 number=2285023 hash=219bcb…0f1d25 dirty=657.14KiB
WARN [11-05|23:01:21.990] Elected but didn't sign block number=2285022 address=0x516e0CFFF40B1506d10F060Af803524d5E82E577 missed in a row=0
panic: event: wrong type in Send got istanbul.MessageEvent, want istanbul.MessageWithPeerIDEvent
goroutine 539 [running]:
github.com/ethereum/go-ethereum/event.(*Feed).Send(0xc0000d26c8, 0x14e4880, 0xc0093af620, 0x34cef8031)
  /go-ethereum/event/feed.go:141 +0xd1a
created by github.com/ethereum/go-ethereum/consensus/istanbul/backend.(*Backend).HandleMsg
  /go-ethereum/consensus/istanbul/backend/handler.go:137 +0x1400
The issue turned out to be linked to the `celostats` flag, but another issue emerged, with validator nodes not signing blocks:
@mbay2002 | Qoor: Ok. On 1.2.0, it's the damn `celostats` flag (`ethstats`) that causes the validator to crash. I remember now that a quorum of us declared some time ago that this flag caused other types of issues, so I don't know why I still bother to use it. In any case, removing the flag from the proxy side allows everything to run (you can leave it on the validator, but I'm just going to get rid of it on both ends). So, now I'm running on 1.2.0, but still getting `Elected but didn't sign block ... missed in a row=0`
The first issue was fixed in the 1.2.1 release:
@Joshua | cLabs: FYI if anyone's playing with this over the weekend. Version 1.2.1 is out to fix the `--celostats` issue (https://github.com/celo-org/celo-blockchain/releases/tag/v1.2.1)
The second one seems to have been linked to the use of the `--baklava` flag:
@mbay2002 | Qoor: I'm signing again. After upgrading to 1.2.0 and getting the validator "panic", I tried several different combinations of flags on validator and proxy, and various backup copies of chain data. As I pointed out, the `--celostats` flag on the proxy side caused the validator to panic. But I also left out the `--baklava` flag, and that appears to be why I wasn't signing (though even log level 5 doesn't make it clear why this is the case --- it seems that flag is supposed to only be important for initializing the genesis block and pointing to a datadir).
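The working flag combination mbay2002 landed on can be sketched as a node start command. This is a hedged illustration only: the `--celostats` and `--baklava` flags are the ones discussed in the thread, while the image name, tag, volume mount, and remaining flags are placeholders — check the official node-operator docs for a complete command.

```shell
# Illustrative Baklava validator start. Image name/tag and most flags are
# placeholders, NOT an official command.
# --baklava is kept: without it the node was elected but not signing.
# --celostats is omitted: on 1.2.0 it made the validator panic (fixed in 1.2.1).
docker run -d --name celo-validator \
  -v $HOME/.celo:/root/.celo \
  us.gcr.io/celo-org/geth:baklava \
  --baklava \
  --mine \
  --syncmode full \
  --proxy.proxied
```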
There was also a question of whether it makes sense to upgrade to version 1.1.0 if the next upgrade is just around the corner:
@y3v63n | moonli.me: Is there any point in upgrading mainnet validators to 1.1.0 now if we are expecting a 1.2.0 release on mainnet on 18 November?
@yaz: if you are upgrading from a pre-1.1.0 image to 1.2.0 by the 18th, when the Mainnet docker image is released, you need to keep in mind that there are changes in 1.1.0 that you need to apply when upgrading to 1.2.0
https://github.com/celo-org/celo-blockchain/releases/tag/v1.2.0
If you are upgrading from a version earlier than 1.1.0, see 1.1.0 release notes for changes made to remove the geth init step.
Basically 1.1.0 removed the need to have `geth init` as a step, as well as removed the `--bootnodes` flag. Also, in our docs we added a `--datadir` flag for testnets, but it's not needed for mainnet. I'd recommend upgrading to 1.1.0 on mainnet now so you can go over those changes, and won't have to be looking at both 1.1.0 and 1.2.0 changes by the 18th.
...
So it'd still be weeks after Nov 18th before we need to coordinate upgrading to 1.2.0 on mainnet.
For now, stay tuned for 1.1.0 updates.
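The upgrade path yaz describes can be sketched roughly as follows. Everything here is illustrative — image tag, container name, and data directory are placeholders; the substantive points, taken from the release notes, are that the separate `geth init` step and the `--bootnodes` flag are gone as of 1.1.0.

```shell
# Rough post-1.1.0 upgrade sketch (placeholder names/paths/tags,
# not an official procedure).
docker stop celo-validator && docker rm celo-validator
docker pull us.gcr.io/celo-org/geth:1.1.0

# As of 1.1.0 there is no separate `geth init` step and no --bootnodes
# flag; the client handles genesis setup and bootnode discovery itself.
docker run -d --name celo-validator \
  -v $HOME/.celo:/root/.celo \
  us.gcr.io/celo-org/geth:1.1.0 \
  --syncmode full \
  --mine
```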
Recent attestation failures
A number of validators have reported an influx of failed attestations in recent days:
@Liviu | Easy2Stake.com: Yesterday we had a fail on a Baklava attestation with a 30008 Unknown error on Twilio to a UK number. The strange thing is that we bought different numbers to ensure coverage and it was sent from a UK number to a UK number... and it was still a fail. Anyone have an idea what could be the problem?
@syncnode (George Bunea): I already said this multiple times. When that error is encountered there is not much we can do. For example the routing policies may be configured to send through a same network termination route but if you are sending to a number that is identified as being in Vodafone network but it was ported to another network the connection will fail. This is just an example. Normally a second attempt would try through a different route but we don’t have the possibility of a second attempt for now.
@timmoreton | cLabs: Unfortunately you can't do much with a `30008 Unknown error`. 1.0.5 and later versions will "stripe" retries between providers, so if you get this from Twilio and have Nexmo configured, it would try Nexmo next. Now, Nexmo has not been helpful despite initial promises with getting a number of you onboarded, but if you haven't got it configured yet, it's worth just checking in to see if you've had the account open long enough that the fraud check allows you to top up with a credit card. I have also been working with MessageBird and the next release will have support for that provider, which seems to work really well. We've also seen a number of cases where delivery status reports "delivered" but the SMS never arrives -- 1.0.5 allows the client to just say "it didn't get delivered" and trigger a retry -- Valora now supports this through a "Resend messages" button.
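The provider striping timmoreton describes comes down to attestation-service configuration. As a hedged sketch (variable names follow common attestation-service conventions but may differ in your version — verify against the official docs before relying on them):

```shell
# Hypothetical attestation-service settings: with two providers listed,
# 1.0.5+ can "stripe" retries, so a Twilio 30008 failure is retried via Nexmo.
export SMS_PROVIDERS=twilio,nexmo
export TWILIO_ACCOUNT_SID=...   # your Twilio credentials
export TWILIO_AUTH_TOKEN=...
export NEXMO_KEY=...            # your Nexmo credentials
export NEXMO_SECRET=...
```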
Useful info
Slashing is now enabled and has been tested on Baklava:
@yorhodes | cLabs: I slashed a bunch of validators on baklava just now and will share the tooling I developed. Here's the list of tx hashes and captured slash log for posterity:
(attachment: SPOILER_downtime-slash-history.txt)
@ag: beautiful
Community
Thylacine | PretoriaResearchLab updated the attestations section of Cauldron:
Hi folks, I've upgraded https://cauldron.pretoriaresearchlab.io/attestations:
- Now includes Baklava @ https://cauldron.pretoriaresearchlab.io/baklava-attestations
- Fixed issue with favourites not storing
- Fixed sort-by-column-header issues with ascending/descending
- Fixed some CORS issues for most attestation endpoints
Don't forget to vote on the Core Contracts Release 1 governance proposal:
@claire | cLabs: Heads up that there’s the Core Contracts Release 1 governance proposal due to go out next week, anticipated timeline is as follows (dates in US format month/day):
- Proposal: 11/10 (as of morning PT) - UPVOTE
- Approval: 11/11
- Referendum: 11/12 - 11/16 - VOTE (YES/NO/ABSTAIN)
- Execution: as of 11/17
Ref: https://forum.celo.org/t/governance-proposals-to-make-the-protocol-safe-and-easy-to-upgrade/615
zviad | WOTrust | celovote.com published the Celo Validator Round-Up for October: https://wotrust.substack.com/p/celo-validator-roundup-october-2020
Like what we do? Support our validator group by voting for it!