[OC] [MiC] Veepstakes part 2: Predicting Harris’ VP nominee based on who’s scrubbing their Wikipedia page the most




Posted by EdridgeD

14 comments
  1. An update to my last [post](https://www.reddit.com/r/dataisbeautiful/comments/1dod6la/oc_veepstakes_predicting_the_vp_nominee_based_on/), where I predicted Trump would pick Burgum as his running mate, which was in turn an update to a post where I [predicted](https://www.reddit.com/r/dataisbeautiful/comments/i20o4j/oc_veepstakes_using_wikipedia_edits_to_predict/) Harris would be the VP nominee in 2020. I scraped Wikipedia edits and visualized the edit frequency in Python/matplotlib.

    I still stand by my prior analysis for the GOP pick; reportedly, the choice was changed to Vance relatively last-minute and Trump had favored Burgum. And in my other analysis, Vance was a pretty close #2.

    In my Dem analysis, I started counting edits from the day before Biden announced that he’d drop out. It looks like the pick will be either Shapiro or Walz; the betting markets seem to favor Shapiro, and Harris is set to hold a rally on Tuesday in Philadelphia. It’s still uncertain, and there’s a chance it may end up being Walz! Since last time, I’ve learned to hedge my bets a little. But I believe it is *slightly* more likely to be Shapiro.

    You can see more breakdowns on GitHub. Feel free to submit PRs with better ways to visualize this data.
    https://github.com/edridgedsouza/Veepstakes/blob/master/Veepstakes.ipynb

    Here’s a version showing number of edits per day: https://i.imgur.com/SEPSbZf.png

  2. Nice data. I think the colors could use some help; I had to do far too much work jumping back and forth to the legend, since there are light and dark shades of the same colors. Maybe use the lighter shades for the people closer to the bottom (less relevant), or label the endpoints.

    You also don’t need the year in the axis label; it just adds a bit of unnecessary clutter. I’d also argue you don’t need the numbers on the left axis (or gridlines on either) since we don’t really care about the specific number of edits, just the total amount and how they accumulated over time.

  3. Genuine question: why would they “scrub” their wiki pages and what are they scrubbing?

  4. But how is the quality vs. quantity?

    How many can field dress a deer in a few minutes while signing a law to feed children while speaking Mandarin? 🙂

  5. I think it will be Shapiro because Pennsylvania will be a lock for the Dems. Not my first choice though, I’d prefer Whitmer.

  6. The bottom 3 are all semi-protected, and Walz just got extended protected status. Considering it’s the first time the others have gotten this level of national attention, I’m not too surprised we’re seeing edits come in.

  7. Hey nice colour choices bro. Really, you’ve put in a lot of effort with this

  8. Some of these names were not well-known nationally before the past few weeks, which probably has more to do with it.

    Someone like Buttigieg who’s already been in the national spotlight has a well-documented Wiki that will naturally have fewer edits.

    I’d still bet on Shapiro, though. Love Walz, but PA is just too valuable. 

  9. I suspect that those bottom 3 (Whitmer, Newsom, and Buttigieg) started their scrubbing years ago. So I wouldn’t rule them out, but this is an interesting graph and an even more interesting insight.

  10. I’m here for Harri Butt! This is also an interesting graphic. Not sure if I would consider edits as scrubs. I would assume this is akin to updating your resume.

  11. There are few enough categories to label the lines directly; you could then remove the legend. Otherwise, an interesting data set.

  12. I wonder if the data could have been skewed if someone didn’t actually need to scrub any socials. Mark Kelly is pretty squeaky clean.

  13. Is a higher number associated with high, medium, or low scrubbing? If there are too many things to put a polish on, perhaps it’s not a great candidate. Low edits could show a matured page that has the ‘right stuff’. :shrug:

  14. This looks neat, but too often this sub is r/hatethecolorblind. This is really hard for me to read. 
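
For anyone who wants to try the approach from the original post (counting Wikipedia edits per candidate from a start date and plotting the running total), here is a minimal sketch. It is not the code from the linked notebook. It assumes the public MediaWiki Action API (`prop=revisions`) as the data source, and the function names, User-Agent string, and plotting choices are all made up for illustration. Lines are labeled at their endpoints rather than through a legend, as a couple of commenters suggested.

```python
import json
import urllib.parse
import urllib.request
from collections import Counter
from datetime import date, datetime, timedelta

# Assumption: English Wikipedia's Action API, not necessarily what the notebook uses.
API_URL = "https://en.wikipedia.org/w/api.php"


def fetch_edit_timestamps(title, start, end):
    """Return the UTC timestamp of every revision of `title` from start to end."""
    params = {
        "action": "query", "format": "json", "formatversion": "2",
        "prop": "revisions", "titles": title, "rvprop": "timestamp",
        "rvlimit": "max", "rvdir": "newer",  # oldest first, so rvstart is the earlier bound
        "rvstart": start.strftime("%Y-%m-%dT00:00:00Z"),
        "rvend": end.strftime("%Y-%m-%dT23:59:59Z"),
    }
    stamps = []
    while True:
        request = urllib.request.Request(
            API_URL + "?" + urllib.parse.urlencode(params),
            headers={"User-Agent": "veepstakes-sketch/0.1 (example script)"},
        )
        with urllib.request.urlopen(request) as response:
            data = json.load(response)
        for page in data.get("query", {}).get("pages", []):
            for rev in page.get("revisions", []):
                stamps.append(datetime.strptime(rev["timestamp"], "%Y-%m-%dT%H:%M:%SZ"))
        if "continue" not in data:
            return stamps
        params.update(data["continue"])  # follow the API's pagination tokens


def cumulative_edits(stamps, start, end):
    """Return [(day, running_total)] for every day from start to end inclusive."""
    per_day = Counter(ts.date() for ts in stamps)
    total, series = 0, []
    day = start
    while day <= end:
        total += per_day[day]  # a Counter returns 0 for days with no edits
        series.append((day, total))
        day += timedelta(days=1)
    return series


def plot_cumulative(series_by_name):
    """Plot one cumulative line per candidate, labeled at its endpoint."""
    import matplotlib.pyplot as plt  # imported lazily so the counting code runs without it

    fig, ax = plt.subplots()
    for name, series in series_by_name.items():
        days, totals = zip(*series)
        ax.plot(days, totals)
        ax.text(days[-1], totals[-1], " " + name, va="center")
    ax.set_ylabel("Cumulative edits")
    fig.autofmt_xdate()
    return fig
```

To reproduce something like the chart, call `fetch_edit_timestamps("Josh Shapiro", start, end)` for each candidate's article title, pass the result through `cumulative_edits`, and hand a `{name: series}` dict to `plot_cumulative`. The post counts from the day before Biden announced he would drop out, so `start` would be `date(2024, 7, 20)`. Per-day counts (as in the second image) are just the `Counter` inside `cumulative_edits` without the running sum.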
