For the conclusion of our 4-part article series on the Eternal Draft Project, Micah and I will talk about the different factions’ WRM (Win Rate Modifiers), highlight cards for which our evaluation changed significantly and wrap up with some closing thoughts. If you guys haven’t checked out the previous 3 articles, I strongly recommend checking them out here, here and here. As always, I would love to hear your thoughts in the reddit thread!
While it’s hard to come up with an accurate and direct implication of an individual card’s WRM, a faction’s WRM is much more straightforward to comprehend. Imagine the following scenario: before even opening your first pack, you decide that you really want to play Praxis, and decide to force it. By doing so, your expected winrate will be 49.48% (50-0.52) on average. Similarly, if you decide to force Hooru in a vacuum, your expected winrate will be 57.13% (50+7.13).
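The arithmetic above can be sketched in a few lines. This is only an illustration of how to read a faction's WRM, assuming (as the 50-0.52 example implies) that the WRM is simply added to a 50% baseline. The function name and constant are made up for this sketch and are not part of the actual model.

```python
BASELINE_WIN_RATE = 50.0  # percent; assumed baseline, taken from the "50-0.52" example


def expected_win_rate(wrm: float, baseline: float = BASELINE_WIN_RATE) -> float:
    """Expected win rate (in percent) when forcing a faction with the given WRM."""
    return round(baseline + wrm, 2)


# WRM values quoted in this article
praxis = expected_win_rate(-0.52)
hooru = expected_win_rate(7.13)
print(f"Praxis: {praxis}%  Hooru: {hooru}%")
```

The gap between the two results is the practical takeaway: forcing Hooru is worth roughly 7.65 percentage points more than forcing Praxis, on average.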
This works because the WRM increases based on the card strength of the faction, and decreases based on the number of people that tend to play it. So even though a faction may not have the strongest cards, you can still easily draft a strong deck if it is a relatively unpopular faction (e.g. Hooru). There are some mathematical manipulations to normalize the distribution, but I shall not bore you with the details!
This is an amazing list (if I do say so myself) and a lot of the results make sense to me. Over the past 3 articles, we have repeatedly seen data suggesting that people are too greedy with their splashes and the abysmal WRM of 4F/5F piles backs up my beliefs. Note, this is not to say that 4F/5F is a bad strategy, but it could indicate that players aren’t playing enough fixing or not building their power bases correctly to support the deck.
Praxis has a surprising showing at -0.52 WRM and I think this arises because there seems to be a consensus that it is the strongest faction. Thus, people tend to draft it heavily, leading to it being cut more frequently than other factions. Being played 191 times also lends more credence to this theory.
Hooru (which I’m a huge fan of) has an amazing WRM of 7.13. Again, I have to emphasize that this does not mean that Hooru has the strongest cards. This simply means that it is often the faction of the strongest cards left in the pack, due to being a decent faction that is underdrafted. What this DOES mean is that if you are ever unsure which faction to be in, #forcehooru!
Individual Faction Highlights
For this section, Micah and I decided to highlight a card from each faction (Time, Justice, etc) that we are now more/less likely to play post-analysis.
More likely to play
Micah: Amber Ring. Previously I would actively try not to play Amber Ring, but now I would be happy to have 1 in many of my Time decks. Runner-up: Envelop
Flash: Praxis Displacer. This is a card that I definitely underrated previously. It turns out slow speed teleport is still great, especially if it’s tagged onto an acceptable body. The emphasis towards a more Voltron/weapons-based style of gameplay highly benefits all bounce effects.
Micah: Backlash. The card has blowout potential and I can see fitting this in as my 26th or 27th card more often now. It’s possible this card is just underrated and that affects its WRM. If that is the case it’s even better because no one is expecting it! Runner-up: Icebow (in Feln and Elysian only)
Flash: Icebow. Another card that I’m guilty of underrating. I was always aware of the synergistic aspects, but I thought it was too rare to be of note. However, the data suggests that I was mistaken and I’m definitely keen to give this card another go. It could even be a good speculative pick if I’m heavily in primal p1 since it’s going to be great 50% of the time.
Micah: Furnace Mage. It has always been a solid card, but it appears that dealing with attachments can be extremely useful in this limited format. Runner-up: Pit Fighter
Flash: Furnace Mage. Spot the theme yet? I seem to systematically underrate cards that can deal with weapons, something I attribute towards set 1 draft being less weapons-orientated.
Micah: Valkyrie Aspirant. This little dude does it all and I’ve been playing him more often as a result. Runner-up: Crownwatch Squire (the numbers are high enough to try him out for myself to see if it’s a lack of sample size or if it’s actually good).
Flash: Tranquil Scholar. RNGesus doesn’t favor me much (despite me writing for their subsidiary xP) and hence, I often felt that Tranquil Scholar was only marginally better than a random Stranger. The numbers suggest otherwise, and I’m keen to try rolling a few more battle skills.
Micah: Cabal Slasher. I thought this was stone-cold unplayable. Now I believe it is sometimes playable. Runner-up: Sorrow’s Shroud
Flash: Inspire Obedience. A 7-cost Slay seems so unplayable at first glance, but again, the numbers disagree. Given the weaker removal in this draft format, it might often be worth picking up an Inspire Obedience if your deck is slow enough.
Micah: Amaran Camel! Not just for lifeforce memes any more. This card seems to be great in Elysian Fliers. Runner-up: Shield Bash
Flash: Aerialist Trainer. I was never fully on board the Aerialist Trainer hype train, as I was not sold on the situational card draw being worth the cost. I know multiple top drafters disagree with this, and with the stats to back them up, I’ll have to give this card another go.
Less likely to play
Micah: Initiate of the Sands. I never considered this a good card in general, but I would sometimes play it as my 27th card. Now I’d rather have basically any 2-drop in that spot. Runner-up: Predator’s Instinct
Flash: Sauropod Wrangler. I still think this is a fine card, but without multiple hits in the deck, it often ends up being one of the weaker 2 drops. Thus, it might be worth picking up other higher impact Time cards unless I’m short on 2 drops.
Micah: Skycrag Wyvarch. As insane as it sounds, this card lives in the bottom of the WRM dumpster. I will play this if I am already in Primal, but I wouldn’t read its signal as strong during the draft. Runner-up: Dragonbreath. Much like Icebow, I’ll continue to play this in Feln and Elysian, but I won’t be taking it highly or splashing for it.
Flash: Yeti Troublemaker. I thought this card wasn’t great, but it was at least a decent filler. However, it seems like paying 4 power for a 3/2 is too much of a drawback and is not outweighed by the card advantage granted by echo. That being said, you can bet I’m still playing this card if I manage to pick up multiple Ageless Mentors.
Micah: Oni Ronin. This poor dude is still good, but requires a VERY aggressive deck. Much like On the Hunt, it’s great on turn 1 and pretty mediocre afterwards. Runner-up: Cloud of Ash. *sniff* Sadly I will submit to the data and not play this unless I’m in dire need of playables.
Flash: Piercing Shot. I originally picked this extremely highly, but given the increased density of mediocre damage-based removal, it seems to be no longer as good as it was in Set 1.
Micah: Minotaur Oathkeeper. I realize this is heresy, but I do think it has been moderately overrated by the community. I will still play this, but I don’t believe it’s the best Justice common anymore. Runner-up: Mithril Mace
Flash: Flight Lieutenant. I used to run this whenever I was in a Justice shell and lacked good finishers. However, it seems like 4 health is way too low for a 7-drop. Being vulnerable to most removal is definitely not something that you want for your top end.
Micah: Longshot Marksmen. This card has just performed really poorly and I’ll try to avoid it if possible. Runner-up: Desperado
Flash: Lethrai Ranger. Oh, how the mighty have fallen. From one of the best 2-drops in set 1 draft, Lethrai Ranger seems to be barely better than a completely off-faction Stranger now. In the current format, there is a lower density of cards that grant evasion, and thus, Lethrai Ranger often ends up as a vanilla 2/2.
Micah: Accelerated Evolution. This card is very good, but I will no longer attempt to warp my manabase just to play it. If I’m in Elysian, I’ll still be windmill slamming this. Runner-up: Smuggler’s Stash
Flash: Xenan Cupbearer. This is probably the only card with lifesteal that I am less likely to play. The loss of 2 stats (compared to the standard Striped Araktodon) is way too costly for the lifesteal keyword.
Closing Thoughts (Rant Warning!)
On New Friends and Healthy Debates
First and foremost, I want to thank Micah for putting in the enormous effort in collecting and compiling this gigantic dataset. None of this would be possible without his hard work! This series of articles has been amazing to work on and it was definitely a fun experience.
I am also grateful to the numerous players who took time to debate and comment on the articles (be it on reddit or discord). Regardless of whether I managed to convince you or not, talking about it helps me ensure that I’m doing the right thing and not just fluffing some numbers. Also, there were some really helpful suggestions and plausible alternate explanations that were brought up and I really enjoyed reading through them!
On Expectations and Preconceived Notions
When I first started to analyse this data, I had next to no expectations. I remember my initial conversation with Micah where I was even hesitant to comment on whether we would be able to put out articles based on this analysis. After some preliminary results, we initially estimated 2 articles’ worth of content, and today, exceeding my wildest expectations, we are concluding a 4-part series and I still have so much more that I want to talk about and do with the data! I’m really glad that I started this venture despite my initial scepticism.
As such, I do understand why there seems to be a general reluctance to accept the results of this analysis (besides the issue of noise which I will address in the next section). The results were often contradictory to some of our assumptions as well and thus, as humans, it’s very tempting to simply dismiss results that you don’t agree with. However, that is not a good reason to dismiss results, nor is it helpful in the long run. I think it’s always important to go into something new with no expectations, and no preconceived notions.
On Noise and Babies
I have no doubt that this data set is crazy noisy. I also have no doubt that there are reporting biases and other effects from not knowing what is left in the pool or what is being passed in the picking stage. That being said, the many comments arguing that we shouldn’t read too much into the data, or dismissing every controversial data point as noise, really annoyed me, because that stance is not only unhelpful, it is flat out wrong.
Again, let me go back to my field of work. My work revolves around trying to understand how the human brain works. However, because we can’t just cut open a living human’s skull as and when we wish (I know right? Ethics getting in the way of science xP), we use fMRI instead. The problem though? Because of field inhomogeneities, imperfect gradients, subject motion and whatnot, the signal from the brain is 0.5x that of the noise. To make things worse, we aren’t just interested in the signal, we are interested in the change of the signal in response to certain stimulation, which is often in the region of 5% of the signal. All in all, what this means is that if the change you are looking for is +1, your noise will be approximately +/-40. Despite this huge uncertainty, fMRI has proven to be an extremely valuable technique for brain studies and is responsible for ~60% of the knowledge on the human brain (Of course, this is only achievable by careful analysis, good modelling of the noise function, generating group level models, and so on). Now, could you imagine what would happen if the first researchers working on fMRI saw the error of +/-40 (probably closer to +/-400 in their time), and decided, screw this, we can’t read anything into the data because it’s so noisy? Neuroscience as a field would be so much worse off.
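For anyone who wants to check the +/-40 figure, here is a rough sketch of the arithmetic using only the two ratios quoted above. The second function is an added assumption that is not from the fMRI example: it uses the textbook result that the standard error of an average of n independent measurements is noise / sqrt(n), to estimate how many repeats it takes before the noise shrinks to a given size. All names are made up for this sketch.

```python
import math

SIGNAL_TO_NOISE = 0.5   # from the text: the signal is 0.5x the noise
EFFECT_FRACTION = 0.05  # from the text: the change is about 5% of the signal


def noise_for_effect(effect: float) -> float:
    """Noise amplitude implied by the two ratios for a given effect size."""
    signal = effect / EFFECT_FRACTION   # effect is 5% of the signal
    return signal / SIGNAL_TO_NOISE     # signal is half of the noise


def repeats_needed(noise: float, target_error: float) -> int:
    """Repeats needed for noise / sqrt(n) to drop to target_error.

    Assumes every repeat has independent noise, which real data rarely
    satisfies exactly, so treat the result as a rough lower bound.
    """
    return math.ceil((noise / target_error) ** 2)


noise = noise_for_effect(1.0)
n = repeats_needed(noise, target_error=1.0)
print(f"noise: +/-{noise:g}, repeats needed: {n}")
```

Under these assumptions, an effect of +1 sits under noise of about +/-40, and it takes on the order of 1,600 independent repeats just to bring the uncertainty down to the size of the effect. That is a lot of repeats, but it is a finite number, which is the whole point.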
I am very familiar with the inherent mental block that tells us to only trust data with tiny errors (I remember measuring g to 5 decimal places, computing pi to the 50th decimal place and other fun stuff for physics labs in college). However, that isn’t strictly true. Tiny errors mean that we can draw clear cut, distinct conclusions directly, with no room for negotiation (unless the whole experimental paradigm is wrong). Having larger errors just means that there is more uncertainty in the data, but the data itself is still useable. One of my favorite sayings, when I first entered the world of fMRI, was “Don’t throw the baby out with the bathwater”. Yes, the data is noisy and skewed, but that doesn’t mean it is completely unusable. Rather, what it means is that we should proceed with caution and always check if there are alternative explanations.
On Treading lightly and Constant Re-evaluation
Knowing that we are on thin ice with our shaky data, we have to always be careful about making statements, and be aware that there is always a possibility of a better explanation. We should also try to generalize and test our theories. For example, instead of using a single data point to back up our statements, it’s often worth considering if there is a generalization that we can apply and see whether it holds for other cases. (e.g. looking at Predator’s Instinct and Xenan Initiation, I suspect that if two cards fill very similar roles, the weaker card’s WRM will be artificially suppressed by the model due to a negative correlation with the frequency of the stronger card. Looking at Strength of Many vs Finest Hour, Storm Glider vs Stormcrasher, Scaly Gruan vs East-Wind Herald, the same trend holds, which lends further evidence to this theory.)
Similarly, when we see strange anomalies in the dataset, we can theorize about possible explanations. For example, Icebow has a huge WR modifier and I think this arises from the crazy amount of synergy it gets from being played alongside Sand Viper, Gorgon Swiftblade, Memory Dredger and so on. In decks with no synergy, Icebow is often one of the first cuts. From this, even though we know the estimate for Icebow’s WR modifier is off, we learnt that players are probably pretty good at evaluating Icebow’s effectiveness in decks, and also that when Icebow is good, it is probably freaking good.
It is also important to test these theories. By comparing the WRM of Icebow across different factions, we can see a significant increase in Icebow’s WRM in Elysian (deadly) and Feln decks (infiltrate/deadly). This substantiates our initial explanation and also tells us that if our deck has multiple synergistic units, it might be correct to pick Icebow over something like East-Wind Herald.
Most importantly, there will also often be times when the data run contrary to our instincts (e.g. Iceknuckle Jotun) and there isn’t a clear explanation as to why it is so. At that point, we should entertain the possibility that our instincts are wrong. Since running this analysis, I have played Iceknuckle Jotun twice, and it has definitely outperformed my expectations. The 6 health is a significant roadblock for most decks and it can often trade 2-for-1 if your opponent is trying to be the beatdown (Note: I am not saying to first pick Iceknuckle Jotun, just that he is not as unplayable as I thought). And this, right here, is the whole point of the analysis: to identify cards that are good but undervalued so that we can play them more often, and to identify cards that are bad, or “traps”, so that we can avoid stepping head first into them.
On tl;dr and wrapping up
tl;dr: Noisy data doesn’t mean it’s unusable. It means that we need to proceed with caution and see whether alternative explanations make sense (note: noise is NOT a good alternative explanation). We can also incorporate this data into our drafting, and see whether the data is showing true effects. Furthermore, we can run additional studies and/or analysis to test whether hypotheses formed from these results hold true, or whether we are just being led astray by random chance.
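If you want to try the "is it just random chance?" check on your own drafts, one simple option is a confidence interval on your observed win rate. The sketch below uses the standard normal-approximation interval for a proportion. It is not the model used for the WRM numbers in this series, and the win/game counts are made-up examples, not figures from our dataset.

```python
import math
from typing import Tuple


def win_rate_interval(wins: int, games: int, z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% interval (in percent) for a win rate.

    Uses the normal approximation p +/- z * sqrt(p * (1 - p) / games),
    which is only reasonable when both wins and losses are fairly numerous.
    """
    if games <= 0:
        raise ValueError("games must be positive")
    p = wins / games
    half_width = z * math.sqrt(p * (1 - p) / games)
    return (100 * (p - half_width), 100 * (p + half_width))


# Hypothetical records: the same 55% win rate over 100 and over 1000 games
small_low, small_high = win_rate_interval(wins=55, games=100)
large_low, large_high = win_rate_interval(wins=550, games=1000)
print(f"100 games:  {small_low:.1f}% to {small_high:.1f}%")
print(f"1000 games: {large_low:.1f}% to {large_high:.1f}%")
```

With the small sample, the interval still includes 50%, so a 55% record could plausibly be luck. With the larger sample, the whole interval sits above 50%. Same win rate, very different levels of confidence, and neither one is "unusable".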
This segment turned out much longer than I expected, but I do hope I put forth my main point clearly enough. As always, let me know if you agree/disagree and feel free to hit me up for an in-depth discussion in the reddit thread or on discord!
There are lies, damned lies, and then there are statistics,