**Source**: FiveThirtyEight. They provide both a nice csv of polls as well as pollster ratings
**Tools Used**: Plotly for the visualization
—-
I created my own average (just a simple Exponential Moving Average) since I want to do further analysis later on. What this is measuring is a given poll’s deviation from the polling average at a given point in time.
Then ofc we look at the distribution of deviations by the pollster rating (from FiveThirtyEight).
The actual polls I considered were national polls as well as polls from the 7 swing states for Trump’s numbers specifically. I only looked at polls from August onwards since that’s when Kamala joined the race
The expected variance was derived from the sample sizes of each poll as well as some random effects in a mixed model to account for variance between polls not accounted for in sample size.
The blue dotted bell curve is what we would “expect” to see if the pollsters were telling the truth and not herding, while the black bell curve is the distribution we actually got.
Basically all this is to say that herding probably did occur. It seems that good pollsters were honest and were perfectly willing to release outliers, while bad pollsters seemed to engage in herding behavior
Most surprisingly perhaps is the fact that it doesn’t seem to be straight up “worse the pollster, harder they heard”. Rather the 2nd quartile of pollsters by quality are responsible for the worst herding behavior, while the bottom quartile herded much more mildly
—-
Also plan on releasing a full article with an interactive version with a more indepth post mortem, so stay tuned
>“Herding” specifically refers to the possibility that pollsters use existing poll results to help adjust the presentation of their own poll results. “Herding” strategies can range from making statistical adjustments to ensure that the released results appear similar to existing polls to deciding whether or not to release the poll depending on how the results compare to existing polls
By drawing upon information from previous polls, herding may appear to increase the perceived accuracy of an individual survey estimate. A troublesome potential consequence of “herding” is that survey researchers who practice herding will produce artificially consistent results with one another that may not accurately reflect public attitudes. This perceived consistency of public opinion could instill a false confidence about who will win an election, thereby impacting how the race is covered by the media, whether parties devote resources to a campaign, and even if voters think it is worthwhile to turn out to vote.
What does left vs right on the x axis mean here? Why is the second quartile shifted to the right?
Unless Musk found a way to cheat. Always a possibility when your shadow running mate controls a military grade satellite array and a shitload of technology capability.
Nope, not the bad ones. All of them. Stop paying pollsters and they’ll go away.
So the “amplitude” of the two crests implies herding, but the “phase” does not?
On the fivethirtyeight sub reddit told me Ann Selzer’s +3 Iowa poll was completely legit.
I got downvoted to oblivion for calling it out just based on common sense. The echo chamber is still real here.
7 comments
**Source**: FiveThirtyEight. They provide both a nice csv of polls as well as pollster ratings
**Tools Used**: Plotly for the visualization
—-
I created my own average (just a simple Exponential Moving Average) since I want to do further analysis later on. What this is measuring is a given poll’s deviation from the polling average at a given point in time.
Then ofc we look at the distribution of deviations by the pollster rating (from FiveThirtyEight).
The actual polls I considered were national polls as well as polls from the 7 swing states for Trump’s numbers specifically. I only looked at polls from August onwards since that’s when Kamala joined the race
The expected variance was derived from the sample sizes of each poll as well as some random effects in a mixed model to account for variance between polls not accounted for in sample size.
The blue dotted bell curve is what we would “expect” to see if the pollsters were telling the truth and not herding, while the black bell curve is the distribution we actually got.
Basically all this is to say that herding probably did occur. It seems that good pollsters were honest and were perfectly willing to release outliers, while bad pollsters seemed to engage in herding behavior
Most surprisingly perhaps is the fact that it doesn’t seem to be straight up “worse the pollster, harder they heard”. Rather the 2nd quartile of pollsters by quality are responsible for the worst herding behavior, while the bottom quartile herded much more mildly
—-
Also plan on releasing a full article with an interactive version with a more indepth post mortem, so stay tuned
>“Herding” specifically refers to the possibility that pollsters use existing poll results to help adjust the presentation of their own poll results. “Herding” strategies can range from making statistical adjustments to ensure that the released results appear similar to existing polls to deciding whether or not to release the poll depending on how the results compare to existing polls
By drawing upon information from previous polls, herding may appear to increase the perceived accuracy of an individual survey estimate. A troublesome potential consequence of “herding” is that survey researchers who practice herding will produce artificially consistent results with one another that may not accurately reflect public attitudes. This perceived consistency of public opinion could instill a false confidence about who will win an election, thereby impacting how the race is covered by the media, whether parties devote resources to a campaign, and even if voters think it is worthwhile to turn out to vote.
What does left vs right on the x axis mean here? Why is the second quartile shifted to the right?
Unless Musk found a way to cheat. Always a possibility when your shadow running mate controls a military grade satellite array and a shitload of technology capability.
Nope, not the bad ones. All of them. Stop paying pollsters and they’ll go away.
So the “amplitude” of the two crests implies herding, but the “phase” does not?
On the fivethirtyeight sub reddit told me Ann Selzer’s +3 Iowa poll was completely legit.
I got downvoted to oblivion for calling it out just based on common sense. The echo chamber is still real here.
Comments are closed.