Tuesday, November 13, 2007 |
Somebody sent me the link to this website devoted to the Venezuelan electoral system. It is well done and there are over 20 presentations on the problems of our electoral system, including evidence of fraud, problems with the electoral registry and the like. A lot of the material complements, or explains with voice and slides, some of the material found in my RR Models section. Unfortunately, it is only in Spanish.
8:29:04 PM
|
|
 |
Tuesday, October 03, 2006 |
In four earlier posts, which you can find here, here, here and here, I described the work of Delfino and Salas, and the complementary work of Medina, on the evidence for fraud in the recall referendum. In the second of those posts I discussed the parameter "k", the ratio of Si (Yes) votes to recall Hugo Chavez at a center to the number of people at the same center who signed the petition to recall him:
$$k = \frac{\text{Yes (Si) votes}}{\text{Signatures}}$$
as a function of another “normalized” parameter s:
$$s = \frac{\text{Signatures}}{\text{Total votes}}$$
In the latest version of the paper, now in English, Delfino and Salas have added a compelling graph of k seen here:

Fig. 1. k as a function of the total number of votes on equal scales at each center for manual (left) and automated (right) centers. (Open circles are centers abroad)
What this plot does is show k as a function of the total number of votes, on equal scales for both types of center, so that no distortions are introduced by the absolute number of voters. The manual centers show a lot of fluctuation, or scatter, at the smaller centers, which is what you expect as the number of voters becomes small. This is because k in some sense measures how good a predictor the signatures were of the actual Si vote to recall Chavez, and as centers become smaller that accuracy diminishes because the statistics are "worse". That is why you see scatter for small numbers of votes at the manual centers.
The problem is that, since the sizes of these centers are the same, one should see the same behavior whether the centers are automated or not. But this simply does not happen, as seen in the plot on the right for automated centers. In fact, for smaller centers the automated case curiously seems to show even fewer fluctuations, which is absolutely counterintuitive. This is further evidence that the automated vote was manipulated at the recall referendum.
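The expectation that small centers should scatter more can be illustrated with a toy simulation. This is only a sketch under invented assumptions: every voter signs with probability 0.3 and votes Si with probability 0.4, independently of each other, and the function names are mine. None of these numbers come from Delfino and Salas; the point is just that the spread of k shrinks as centers get bigger.

```python
import random
import statistics

def simulate_k(n_voters, p_sign=0.3, p_yes=0.4, n_centers=300, seed=0):
    """Simulate k = yes votes / signatures for many centers of equal size.

    p_sign and p_yes are made-up probabilities, used only for illustration.
    """
    rng = random.Random(seed)
    ks = []
    for _ in range(n_centers):
        signatures = sum(rng.random() < p_sign for _ in range(n_voters))
        yes_votes = sum(rng.random() < p_yes for _ in range(n_voters))
        if signatures > 0:  # k is undefined without signatures
            ks.append(yes_votes / signatures)
    return ks

small_spread = statistics.pstdev(simulate_k(100))
large_spread = statistics.pstdev(simulate_k(2000))
print(f"spread of k with 100 voters per center:  {small_spread:.3f}")
print(f"spread of k with 2000 voters per center: {large_spread:.3f}")
```

In a model like this the spread falls roughly as one over the square root of the number of voters, which is the kind of behavior the manual centers show.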
Some people have argued that the problem is that manual centers tend to occur in more rural or sparsely populated areas, so that the data above simply reflect socioeconomic or sociocultural differences between the manual and automated centers.
What Delfino and Salas did then was to select centers which are classified by the CNE as "Mixed Townships" and "hamlets" and plot the data for these two specific cases separately. This is shown below in Fig. 2:

Fig. 2. k as a function of s (defined above) for manual centers (left) and automated centers (right) in "Mixed Townships" (top) and "hamlets" (bottom)
As can be seen, the "strange" absence of scatter or fluctuations still occurs in the automated centers when these two types of population centers are considered, while the manual centers in both cases show the expected scatter. This is once again evidence that the automated results were somehow manipulated, and that the data from the recall petition was used mathematically to generate the results, rather than the actual vote.
If you are not mathematically inclined, Delfino and Salas have posted a presentation called "The ABC of the Referendum" (in Spanish, soon in English), where they try to make it simple to understand. You can think of k as a measure of how good a predictor of the recall vote the signatures were. If k at an automated center is of the order of one, there were as many votes to recall as signatures in the petition to have the recall take place. As k increases, there were more and more votes than you would have thought just from the signatures.
Below is a Google Maps image taken from the Delfino and Salas presentation. It represents a parish of the municipality of Valencia, so nearby centers are similar in socioeconomic profile. But as you can see, while the automated centers (blue) have k near 1, the manual centers (green) have k's as large as 4.3 in one case, despite the fact that that particular center is very close to automated centers where k barely moved above 1. This makes absolutely no sense unless the data was faked.
Could it be clearer than that?

7:21:22 PM
|
|
 |
Thursday, September 14, 2006 |
The abstract below is from a paper by the Center for Information Technology Policy at Princeton University on breaking the security of a Diebold voting machine. Should we take up a collection and send them a Smartmatic machine? You can also watch the video showing the hacking. Who was it that said Venezuela had the safest voting system in the world? Ignorance is bliss indeed!
Security Analysis of the Diebold AccuVote-TS Voting Machine
Ariel J. Feldman, J. Alex Halderman, and Edward W. Felten
Abstract This paper presents a fully independent
security study of a Diebold AccuVote-TS voting machine, including its
hardware and software. We obtained the machine from a private party.
Analysis of the machine, in light of real election procedures, shows
that it is vulnerable to extremely serious attacks. For example, an
attacker who gets physical access to a machine or its removable memory
card for as little as one minute could install malicious code;
malicious code on a machine could steal votes undetectably, modifying
all records, logs, and counters to be consistent with the fraudulent
vote count it creates. An attacker could also create malicious code
that spreads automatically and silently from machine to machine during
normal election activities — a voting-machine virus. We have
constructed working demonstrations of these attacks in our
lab. Mitigating these threats will require changes to the voting
machine's hardware and software and the adoption of more rigorous
election procedures.
5:37:09 PM
|
|
 |
Thursday, July 27, 2006 |
While it
does not deal with Venezuela, maybe some readers would be
interested in a paper on Benford’s Law and elections written by Prof.
Mebane of the Department of Government at Cornell University.
The paper looks
at Benford’s law in the context of elections and the detection of fraud. It
looks at the effects of manipulations of data on the results and shows that
various simulated manipulations can have a strong impact on the expected results
from Benford’s Law. The author
then looks at data from the 2004 Florida
election and the recent Mexican election. He concludes that the second digit
Benford test worked well in Dade, Broward and Pasco counties, although there
were some exceptions where questionable results were obtained. In
contrast, the results from the Mexican election imply that there are problems
in many Mexican states with the results although not in most of them. Prof. Mebane
suggests that a manual recount of the vote would clarify these discrepancies, and that
based on a sampled recount one could decide whether or not to carry out
a complete recount. By the
way, the Mexican election shows a lot of cynicism on the part of both the
Venezuelan Government and the opposition. The Venezuelan Government because
while suggesting that they thought there had been fraud in the election, they
never came out and said that there should be a recount. This would be exactly the opposite of their position in local elections. The opposition, because
their apparent sympathy towards Calderon or antipathy towards Lopez Obrador,
stopped them from calling for a full recount in Mexico, which would have been completely
consistent with their positions on the Venezuelan elections. Shame on both
groups!
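For readers who want to see what a second-digit Benford test looks like in practice, here is a minimal sketch. The expected second-digit frequencies follow directly from Benford's law; the chi-square comparison is a common way to measure the mismatch, though I am not claiming it reproduces every detail of Prof. Mebane's procedure. The function names are my own.

```python
import math

def second_digit_benford():
    """Expected frequency of each second digit 0-9 under Benford's law."""
    return [
        sum(math.log10(1 + 1 / (10 * first + d)) for first in range(1, 10))
        for d in range(10)
    ]

def second_digit_chi2(vote_counts):
    """Pearson chi-square of observed second digits against Benford's law.

    Counts below 10 have no second digit and are skipped.
    """
    observed = [0] * 10
    for count in vote_counts:
        if count >= 10:
            observed[int(str(count)[1])] += 1
    total = sum(observed)
    if total == 0:
        raise ValueError("no counts with at least two digits")
    expected = [p * total for p in second_digit_benford()]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

for digit, p in enumerate(second_digit_benford()):
    print(digit, round(p, 4))
```

With ten digit categories the statistic has nine degrees of freedom, so values well above about 16.9 (the usual 5% critical value) would flag a set of vote counts as suspicious.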
8:02:28 PM
|
|
 |
Tuesday, July 11, 2006 |
Hats off to the Mexican
scientists who, in just a week,
are producing data and short papers and posts on the analysis of the elections
in Mexico.
If Venezuelan scientists had responded with similar speed and flexibility, maybe
some part of the history of the 2004 RR might have been changed. Maybe the precedent
helped!
The paper
on Benford’s law by Mansilla finds some discrepancies with the 8th
and 9th digits, which may be statistically significant, but nothing like those
found in the No vote for the RR. It will be interesting to see
further detailed statistical analysis in the future, like what was done in Venezuela, and what it may
tell us. There is also a long
analysis and discussion by Mochan on the
evolution of the reported vote on the night of the election and the recount. There are lots
of data and comments. I have not digested it sufficiently to understand
even the graphs he is showing, but it looks quite interesting.
I guess those Governments that
want to cheat in future elections in LA are going to have to hire scientists so
that whatever they do to change the outcome passes all of these tests for the adequacy of their
fudged results. Maybe one day UNEFA will honor them? I find the whole thing very cool!
6:08:36 PM
|
|
 |
Tuesday, May 23, 2006 |
This series of four posts on Delfino, Salas and Medina is dedicated to the upcoming visitor from the Carter Center, hoping someone there will read it and will try to get an honest academic opinion on them.
I will close my posting on the work of Delfino, Salas and Medina by showing how curious the results of the failed audit on the night of the referendum were. This is probably the least striking of the four, but it certainly gives you food for thought.
The CNE had promised the country to audit 1% of the voting machines or 196 of them. Unfortunately only 26 of them were audited on the fateful night of the RR. Curiously, the Si (Yes) vote obtained 63.47% in these 26 machines, compared to the 40.9% that it obtained nationwide.
What Delfino and Salas did was to order the centers that were supposed to be audited, from low f to high f, according to the fraction of signatures to voters at each center, f=Signatures/Voters, as shown in Fig. 1. The sample of centers generated by the CNE had an average value of f=0.37, that is, 37% of the registered voters in these 196 centers had signed to have the recall vote against Hugo Chavez. In contrast, the average f for those centers that were eventually audited that evening was a much higher f=0.54, or 54% of the voters in those centers had signed to have a recall vote. This can easily be seen in the plot below, which shows all of the centers and where those that were effectively audited that night fell on the curve.

Fig. 1 Plot of the value of f at each of the centers that were supposed to be audited on the night of the recall vote, ordered from low f to high f. The crosses indicate the 26 centers that were effectively audited.
(The cross point with the low f around 0.17 that was audited curiously corresponds to the military hospital in Caracas)
Now, one can ask a very simple question: what was the probability of choosing 26 centers with an average value of f of 0.54 or above? Medina calculated it theoretically and also simulated it numerically, and the probability comes out to an extremely small $3 \times 10^{-8}$, as shown below in the probability curve for the average f exceeding a given value:

Fig. 2 Probability plot of the value of f being above a certain value when you choose 26 centers at random from the 196 centers that were chosen on the night of the recall referendum to be audited.
A probability of order $10^{-8}$ is extremely unlikely… as are so many things related to the recall vote.
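To give a feeling for how such a probability can be estimated, here is a rough sketch of both approaches: a brute-force simulation that draws 26 centers at random many times, and a normal approximation for the mean of a sample drawn without replacement. I do not have the actual list of 196 f values, so the list below is synthetic (average near 0.37, with a spread of 0.15 that I simply made up), and the result should not be read as a reproduction of Medina's number.

```python
import math
import random
import statistics

def mc_tail_probability(f_values, sample_size, threshold, trials=20000, seed=0):
    """Fraction of random samples whose average f reaches the threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = rng.sample(f_values, sample_size)  # without replacement
        if sum(sample) / sample_size >= threshold:
            hits += 1
    return hits / trials

def normal_tail_probability(f_values, sample_size, threshold):
    """Normal approximation with finite-population correction (rough in the far tail)."""
    n_total = len(f_values)
    mu = statistics.fmean(f_values)
    sigma = statistics.pstdev(f_values)
    fpc = (n_total - sample_size) / (n_total - 1)
    std_error = sigma * math.sqrt(fpc / sample_size)
    z = (threshold - mu) / std_error
    return 0.5 * math.erfc(z / math.sqrt(2))

# Synthetic stand-in for the 196 centers: NOT the real data.
_rng = random.Random(1)
f_values = [min(max(_rng.gauss(0.37, 0.15), 0.0), 1.0) for _ in range(196)]

print("simulated:", mc_tail_probability(f_values, 26, 0.54))
print("normal approximation:", normal_tail_probability(f_values, 26, 0.54))
```

Note that a simulation with 20,000 trials cannot resolve probabilities anywhere near one in a hundred million; it will simply report zero, which is why a theoretical calculation is also needed.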
What is intriguing is that centers with high f concentrate only a small fraction of all the voters as can be seen in the following figure, where you can see that the largest number of voters is concentrated around f=0.3, precisely where audits were not performed.

Fig. 3 Distribution of the number of votes as a function of the value of f for all automatic and manual centers, showing where the largest concentration of votes was.
Curious, no?
7:34:20 PM
|
|
 |
Saturday, May 20, 2006 |
In contrast with parts I and II of this series, this part requires some knowledge of statistics. I will try to explain things as much as possible, but it does require a little knowledge. Sorry!
Based on the Delfino and Salas hypothesis, Medina asked himself: is there anything in the pro-Chavez versus anti-Chavez votes from each election or the recall vote that can reveal whether there is any difference between them? The answer is yes: you can look at the symmetry of the distributions and they will tell you whether there is one or two random variables.
Suppose you have two variables, let’s say the 1998 anti-Chavez vote and the 2000 anti-Chavez vote at each voting center. If you plot one versus the other, such as the automated 2000 anti-Chavez vote versus the automated 1998 anti-Chavez vote, you get a plot that looks like this:

Fig. 1 Plot of anti-Chavez votes in 2000 versus anti-Chavez votes in 1998 at automated centers.
You now will measure what is called the vertical and transversal deviations of a graph like this. Let me explain this a little better:
For the graph above you would have an “expected value” which comes from doing a least squares fit to the line y=ax that best fits the data. Now, for each point in the voting data you measure the “vertical” deviation, that is how far is the point vertically from the “expected” or mean line y=ax and the “transverse” deviation, that is how far is each point from the mean line in the direction perpendicular to the line. (See Figure 3)
You now plot these two deviations in a histogram, where as you go away from deviation “zero” you will have fewer points in both the positive and negative directions. For the graph above of the anti-Chavez vote in the automated centers in 2000 and 1998, you get something that looks like this:

Fig. 2 Distribution of transverse deviations for the automated votes of the RR
Now, the interesting thing is that there is a mathematical test to determine whether the two variables are random or not. If the two variables were random, which is what you expect from two consecutive elections at the same automated centers, then you get, schematically, an asymmetric distribution from the vertical deviations and a symmetric one from the transverse deviations.

Fig. 3 Only one variable is random. The other depends on it.
But if only one of the variables is random, i.e. in our case, if the two elections are not “independent” of each other but one set of results was obtained from the other, then you expect the opposite: a symmetric distribution from the vertical deviations and an asymmetric one from the transverse:

Fig. 4 Both variables are random

Well, what Medina did was to plot these distributions for the RR versus the signatures, and also for the manual and automated centers, and what he finds is that EXCEPT for the case of the data from the automated centers of the RR versus the signatures, everything else follows what you expect from two random variables. That is, in all cases but the RR, the vertical deviations show a positive asymmetry, while the transverse deviations are symmetrical. This suggests that both variables were independently random.
In contrast, the data for the automated centers of the recall vote versus the signatures shows the opposite: the vertical deviations are symmetrical, while the transverse ones are asymmetrical.
Now, for those of you that are not too mathematically inclined, this means that there is a mathematical test that shows exactly opposite behavior in the two cases.
In fact, Medina performed three mathematical calculations that showed that in the following cases there was only a single random variable:
--The total number of votes versus voters in the RR
--The total number of signatures versus voters in the RR
--The total number of automated votes versus the signatures in the RR
While he performed four others that showed that in other cases there were two independent variables:
--Total votes at the RR versus signatures
--Manual votes in 2000 versus manual votes in 1998
--Automated votes in 2000 versus 1998
--Manual votes in the RR versus signatures
Mathematically, there is no other conclusion than that the Si votes at the automated centers of the RR were obtained from the number of people who signed the petition to recall Chavez, using some form of equation with a distribution.
How about that!
9:40:08 PM
|
|
 |
Sunday, May 14, 2006 |
Sometime in the next few hours, I will get the one millionth "page read"
according to the salon.com ranking system. It is remarkable that what started
as a curiosity on my part in August 2002 has had so many visitors and,
despite its somewhat restricted topic, has managed to stay up there in
the salon.com rankings. To tell you the truth, it has not only had many
more visitors than I expected, but I have made more posts than I ever
imagined. It certainly beats the school newspaper I started in high
school called "Se dice..." (People say...), a weekly rag which was
banned by the school authorities after only three weeks of very
successful printing.
Obviously I thank you all for your attention and participation.
While it is not easy writing this regularly, I have to say that the satisfaction of having posted on topics like the Chascon (Chavez/Tascon) list/database and the referendum studies in a timely manner has been sufficient compensation for my effort.
Perhaps the thrill of writing a blog like this can be best summarized by something that happened last night. Two nights ago I posted
part II of the recall studies by Delfino, Salas and Medina and was particularly
taken by the results of the regional election in October 2004. To me,
seeing that data was the strongest and most compelling proof, one that can be understood by
anyone, that the results of the 2004 recall referendum were fabricated
by the CNE. Then I began exchanging emails with a good friend on how
strange those results were, and amieres in the comments pointed out a single case that was truly amazing. In his own words:
"How about this one example: Escuela Raul Leoni, Parroquia Santa
Apolonia, Municipio La Ceiba, Estado Trujillo. Signatures=762,
Referendum 2004=616/938 (40%/60%); Regionals 2004=1247/530 (70%/30%);
Presidentials 1998=689/318 (68%/32%); Presidentials 2000=597/466
(56%/44%) In this center they have been pro opposition in 1998(68%),
2000(56%) but amazingly in August 2004(40%) the completely flipped and
in Octber 2004(70%) they flipped again and became the most pro
opposition they have ever been!!!"
Think
about it. At this voting machine the opposition has always had at least
56% of the vote, but miraculously, in the recall vote, the
opposition only got 40% of the vote in the form of 616 Si (Yes) votes
and then, as abstention went up and the opposition was demoralized, twice as many people came out in the October 2004 regional elections to vote, giving the opposition 70% of the vote in the form of 1247 votes for it!
I
asked the same reader if he could check in how many voting machines the
number of pro-opposition votes was larger than the Si (Yes) votes in
the referendum and he quickly answered:
"There are 2181 cases (out 8228 centers, a full 27%) where there were
more votes in the regionals for the opposition than SIs in the
referendum!!! And that considering that many people didn't vote in the
regionals because of the disapointment because of the Referendum result."
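The percentages quoted above are easy to check for yourself. The short script below just redoes the arithmetic from the vote counts in the two comments, taking the first number of each pair as the opposition vote (which is how I read the quote, since 616 is the number of Si votes).

```python
def shares(opposition, government):
    """Percentage split between the two sides, rounded to one decimal."""
    total = opposition + government
    return round(100 * opposition / total, 1), round(100 * government / total, 1)

# (opposition votes, pro-government votes) at Escuela Raul Leoni, as quoted
results = {
    "Presidential 1998": (689, 318),
    "Presidential 2000": (597, 466),
    "Referendum 2004": (616, 938),
    "Regionals 2004": (1247, 530),
}
for election, (opposition, government) in results.items():
    print(election, shares(opposition, government))

# Centers with more opposition votes in the regionals than Si votes in the RR
centers_fraction = round(100 * 2181 / 8228, 1)
print("share of centers:", centers_fraction)
```

The splits come out at about 68/32, 56/44, 40/60 and 70/30, and the share of centers is 26.5%, which the comment rounds to 27%.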
This
is by far the clearest and most convincing proof of the fraud that took
place at the recall referendum. It does not require mathematical
knowledge to understand how implausible it would be that a demoralized
opposition, with abstention increasing from 30% to 60%, would increase
its absolute number of votes in 27% of the voting machines. Take that,
Carter Center and Jorge Rodriguez! Dare to explain it! Or even try!
I
did not require this to believe that there was fraud; the mathematical
studies were convincing enough for me. But this information should be
useful in convincing many that still think there was no fraud on that
fateful August day.
In contrast to these fake numbers, and we don't even know how many of those there were in the recall referendum, my visitors are all real and they seem to like coming here
searching for the truth and helping to find it. That in
itself is satisfaction enough for all of the work that goes into writing this.
9:26:33 PM
|
|
 |
Friday, May 12, 2006 |
While in part I of my presentation of the Delfino, Salas and Medina results, I emphasized the correlation between the signatures collected to call for the referendum to recall Hugo Chavez and the number of actual Si (Yes) votes to recall at the recall referendum, I only did that in order to use as simple a language as possible as an introduction to the topic.
What Delfino and Salas did was to plot the data in a different manner in order to bring the anomalies out better in the data.
What they actually plotted was a “normalized” parameter k equal to
$$k = \frac{\text{Yes (Si) votes}}{\text{Signatures}}$$
as a function of another “normalized” parameter f:
$$f = \frac{\text{Signatures}}{\text{Total votes}}$$
The reason for plotting the data this way is that it magnifies those voting centers in which the number of Yes (Si) votes is much larger than the number of signatures at that center. Think about it. First of all, f is limited to be between zero and one, since the number of signatures at a center can be at most the number of voters at the same center. On the other hand, given the difficulties, limitations and methods for obtaining the signatures, as discussed in part I of these articles on Delfino, Salas and Medina, there should be a number of centers with a low number of signatures but a high number of Yes (Si) votes, where people did go out and vote but could not sign the petition. This effect would be strongest in those centers with low f, since f measures the fraction of signatures. In centers where it was difficult to gather signatures, the number of people signing should be small, but you would expect the number of people voting Yes (Si) to vary significantly, to fluctuate!
Well, remarkably this does not happen in the automated centers as shown in Fig. 1 (left) but does in the manual centers shown in Fig. 1 (right):

Fig. 1 (Left) k versus f for automated centers (Right) The same k versus f but for the manual centers separating the centers abroad from the data set, because there were special difficulties for gathering signatures for those living abroad.
What is most remarkable about Fig. 1 (Left) is that, despite the difficulties in obtaining the signatures, the data for the automated centers is quite uniform and there are very few centers where the number of actual Si (Yes) votes significantly exceeds the number of signatures. Only seven automated centers have k > 2, and none of them exceeds it by much, which is remarkable given that there were forms for only 30% of the people to sign, while everyone could go and vote.
In contrast with the result for the automated centers, in the manual centers the number of points falling above k = 2 is large and you can see points as high as k close to 10, as would be expected from a process as difficult as that of the signatures. This is what you would expect, as only 30% of the people could sign, while close to 70% of them actually voted in the recall referendum. This should generate the type of fluctuations you see in the manual centers but that are absent in the automated centers. This is very strange and makes no sense!
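For anyone who wants to repeat this kind of count with their own copy of the data, the calculation is very simple. The sketch below computes k and f for each center and counts how many centers have k above 2. The three centers listed are invented purely to show the mechanics; they are not from the CNE data.

```python
def k_and_f(yes_votes, signatures, total_votes):
    """Return (k, f) for one center, following the definitions above."""
    k = yes_votes / signatures
    f = signatures / total_votes
    return k, f

# Hypothetical centers: (Si votes, signatures, total votes)
centers = [
    (420, 400, 1000),  # k close to 1: votes track signatures
    (150, 60, 500),    # few signatures, many more Si votes
    (90, 20, 300),     # very few signatures, k well above 2
]

points = [k_and_f(*center) for center in centers]
above_two = sum(1 for k, _ in points if k > 2)

for k, f in points:
    print(f"f = {f:.2f}, k = {k:.2f}")
print("centers with k > 2:", above_two)
```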
If the result above is strange, in my own mind it is its inconsistency with the next graph that proves the fraud. The opposition came out of the recall vote absolutely demoralized, and three months later, in October, there were regional elections. There was not only a campaign to promote abstention, but abstention more than doubled, going from 30% in the recall referendum to over 70% in the regional elections in October 2004. Despite this, take a look at what happens if we plot the pro-opposition votes as a function of the recall signatures, below on the left, in the same centers that were automated for the recall vote:
 
Figure 2. Left: Opposition votes normalized to the number of signatures k, at each center as a function of f the fraction of signatures to voters at each center for the regional election in October 2004. Right: The automated centers once again just for direct one to one comparison.
To me this graph is absolutely compelling: there are more than three dozen points above k=2, in contrast to the seven at the recall vote. There are points as high as k=6, despite the fact that abstention in the regional election was double what it was in the recall and that it was mostly the opposition that abstained. Nevertheless, the opposition actually increased its number of votes with respect to the signatures in dozens of voting centers, all at once! In fact, I repeat the plot for the automated centers in the recall next to that of the regional election just so that you can see how different the two results are.
Personally, I would like to challenge the Carter Center, or whomever they designate, to even attempt to explain how the results of the regional elections could be what they were compared to the recall vote in Figure 2, and what was the mysterious mechanism by which opposition voters in so many centers came out in larger numbers that October to give those results, despite the higher abstention and the demoralized opposition. Where were these people the day of the recall? Why didn’t they go vote then, and why did all of them show up in synch in October 2004? This simply has no other explanation than the Delfino-Salas hypothesis, which I advanced in my conclusions of Part I on the correlations:
The official results of the recall vote in the automated centers were forced to follow a linear relation with respect to the number of signatures obtained at each of those centers in the recall petition.
For the sake of completeness, I also include below the graphs of the votes against Chavez in the 1998 and 2000 elections, both at the peak of Chavez’ popularity. Despite this, values as high as k=4 or even above can be seen in both cases. These are magically and mysteriously missing from the automated centers in the recall vote:

Fig. 3 Results for the 1998 and 2000 opposition votes normalized to the number of signatures in the recall petition, k, as a function of f, the number of signatures collected in each center normalized to the total number of voters at each center.
Next, part III: We get a little dense to show that the statistical characteristics of the result of the recall vote show mathematically that the data came from a single set of numbers and not two as expected, indicating the results were obtained from the signatures used to petition the recall.
10:19:06 PM
|
|
 |
Sunday, May 07, 2006 |
I have had a debt with this blog for quite a while, in not presenting the results of Delfino and Salas, a very interesting paper (Spanish version here, English version here) that has taken a look at the recall vote from a different angle than previous studies. In some sense the delay has actually been good, because now an old friend and colleague of mine, Rodrigo Medina, has expanded the work of Delfino and Salas, showing that it is indeed quite difficult to explain away some of the surprising results from the recall vote. I will jump back and forth between the two papers in my discussion and presentation.
As was the case with other studies of the recall vote, I will try to explain some of these results in as simple a manner as possible. I will do it in sections, so as not to make it too long. Today I will talk about the correlations between the number of signatures for the petition to recall Hugo Chavez and the number of Si (Yes) votes (Vote to recall) at the same voting center, separating the centers into whether they were manual or automated in how the votes were processed and counted.
The first thing to look at is the correlation between the Si vote in the recall referendum and the signatures gathered in order to call for the referendum. One would think that, given the difficulties and limitations in gathering the signatures, as well as the rejection of many signatures by the CNE, the signature values at each center represent a floor. The Carter Center actually cited in its reports the strong correlation obtained between these two variables in the centers which were automated, but said nothing about the manual ones. Indeed, when one calculates the correlation between these two variables for automated centers, one obtains a very strong correlation, as shown by Delfino and Salas in their Table I or Medina in his Figure 1. The correlation coefficient is a remarkably high 0.989, as shown in the following figure (left side) from Medina's paper:

Figure 1: Left: Yes votes versus signatures at the automated centers. Right: The same for the manual centers, excluding points from abroad.
For those not too familiar with the concept of correlation, the “cloud” of points in Fig. 1 (left) would be a straight line if the coefficient were 1.00 and would be a circle of points if it were zero. That the correlation is so high is somewhat surprising. First of all, the number of signature centers was restricted; there were only 2600 centers for the signatures versus 8300 voting centers. Moreover, the number of signatures that could be collected in the process was only 30% of the voters, limiting the total possible, while more than 80% of Venezuelans participated in the recall vote. On top of that, the forms were distributed uniformly throughout Venezuela, rather than according to the distribution of voters. Additionally, there were many reasons why some of the signatures were missing or not taken into account: the CNE invalidated a lot of them, the signatures were public, forms were lost and there were pressures for people to withdraw their signatures. The vote in the recall process, on the other hand, was supposedly secret. In contrast to this result, in the centers where the voting process was manual, shown on the right of Figure 1, the correlation was much weaker, being only 0.9264. In the figure, the votes abroad were plotted as squares and not taken into account in the calculation because, as can be seen, they were much different from the other manual centers for reasons that do not have much to do with the study. They were simply excluded.
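If you want to see how a correlation coefficient is actually calculated, here is a minimal sketch of the standard (Pearson) formula. I am assuming this is the coefficient used in the papers, since it is the usual one. The tiny data sets are invented just to show the two extremes described above: points on a straight line and points arranged in a circle.

```python
import math

def pearson_r(xs, ys):
    """Standard Pearson correlation coefficient between two lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Points exactly on a straight line: coefficient of 1
line_r = pearson_r([100, 200, 300, 400], [150, 300, 450, 600])

# Four points arranged on a circle: coefficient of 0
circle_r = pearson_r([-1, 0, 1, 0], [0, 1, 0, -1])

print("straight line:", line_r)
print("circle:", circle_r)
```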
If you think about what these correlations mean, there is no reason a priori for such a big difference between automated and manual centers. What the correlation is simply telling us is that in centers with few signatures, few people voted against Chavez, and in those with lots of people signing, lots of people voted against Chavez. In fact, what determines whether a center was automated or not is largely the total number of voters at that center, so there is no reason why centers in similar areas in terms of socio-economic conditions would behave differently, but they do, as we will see later.
The surprising difference between manual and automated centers can be shown better by making the scales similar in the two plots above, as was done by Medina in his Fig. 2, to show the behavior when the number of voters and signatures was small in both cases:

Fig. 2. Plots of the number of yes votes as a function of the number of signatures when the number of signatures is less than 600 for both manual (left) and automated centers (right).

Note how different the two are. In the manual case, the dispersion is larger, broadening out as the number of signatures increases. In contrast, in the automated centers it actually narrows down as the number of signatures reaches zero. This is truly unusual, as you would expect fluctuations to be larger as the number of signatures becomes smaller (as the number of signatures goes to zero, there is a higher possibility that a few people will show up and vote against Chavez in some centers). In fact, the manual centers behave the way you would expect: the smaller the number of signatures, the larger the variations one would expect in the total number of anti-Chavez votes in that same center. In technical terms: fluctuations should be larger when the number of people that signed is smaller.
There is another way of showing how anomalous this is, as done by Delfino and Salas. You order the centers according to the fraction of people that signed the petition to recall Chavez, from the smallest number of signatures to the largest number, in both manual centers and automated centers. Now, you calculate the correlation for only the 150 centers with the smallest number of signatures, that is, you calculate the correlation for the centers 1 through 150 and that is your first point for which you calcualte the correlation. Then, you do the same between numbers 2 through 151, then 3 through 152, then 4 through 154 etc. First of all, since it is a matter of numbers, you would expect the same qualitative behavior in both the manual and automated centers. Second, you would expect more fluctuations at the lower end of the graph since you are calculating the correlations only a range, thus the centers with the lowest number of signatures should show the largest fluctuations. However, this is not what happens as shown in the figures below: The manual centers show the expected behavior, but the automated centers show practically no change in the correlation as the size increases. This certainly makes absolutely no sense, as the number gets smaller in both cases the correlations should definitely fluctuate.

Fig. 3. Correlations calculated for 150 centers as the number of signatures in each center increases, that is, first the correlation is calculated for the 150 centers with the lowest number of signatures, then the smallest center is dropped and the next one with more signatures is included in the sample and so on. Note how in the manual centers (top left) the calculated correlation fluctuates significantly, dropping even below 0.5, and then increases to a fairly constant value above sample #1400. In contrast, the automated centers show practically the same value for the correlation throughout. The behavior of the automated centers in Figure 3 is simply absurd.
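The sliding-window procedure described above can be sketched in a few lines of Python. This is not Delfino and Salas's code: the function names are mine, the centers are sorted by raw signature count, and the data are synthetic (1000 hypothetical centers with Yes votes roughly proportional to signatures plus random noise), so the output says nothing about the real centers.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rolling_correlation(signatures, yes_votes, window=150):
    """Sort centers by signatures, then correlate each run of `window`
    consecutive centers: 1..150, 2..151, 3..152, and so on."""
    pairs = sorted(zip(signatures, yes_votes))
    sig = [p[0] for p in pairs]
    yes = [p[1] for p in pairs]
    return [
        pearson(sig[i:i + window], yes[i:i + window])
        for i in range(len(pairs) - window + 1)
    ]


# Synthetic centers for illustration only (not data from the study).
rng = random.Random(2)
signatures = [rng.randint(1, 3000) for _ in range(1000)]
yes_votes = [1.2 * s + rng.gauss(0, 30) for s in signatures]

rolling = rolling_correlation(signatures, yes_votes, window=150)
print(f"{len(rolling)} windows, "
      f"min r = {min(rolling):.3f}, max r = {max(rolling):.3f}")
```

Each entry of `rolling` corresponds to one point on a curve like those in Fig. 3, so plotting the list against the window index gives the same kind of graph for whatever data are fed in.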
Finally, Medina looked at some interesting correlations in municipalities that you would expect to be quite similar:
 
Figure 4: Three municipalities that should have the same proportionality between the number of signatures and the Si (Yes) vote against Chavez, from left to right: Naguanagua (left), Duaca (right).
Let us look first at the graph on the left of Figure 4, corresponding to Naguanagua. There are two very clear lines. In one, the one marked with small crosses, the number of Yes (Si) votes in the recall is almost perfectly proportional to the number of signatures to hold the recall vote against Chavez, with all points practically falling on a straight line. In contrast, in the manual centers of the same municipality, the line has a much larger slope. Thus, in these manual centers, the number of people voting to recall Chavez is larger than the number of signatures, while in the automated centers it is roughly the same and follows the same proportionality. Curious, no?
In the middle figure, corresponding to the Duaca municipality, the automated centers once again follow proportionality with the number of signatures. But those centers in which automation failed curiously fall all over the place.
What this all suggests, and will be explained in future articles, is that the number of votes in the automated centers was somehow interfered with and the final outcome was simply a number generated in such a way that it would be proportional to the number of signatures at that center. Meanwhile, the numbers of votes in the manual centers were the real ones. In the next post on the subject, the correlations will be looked at in a different way that brings out better the significant differences between the automated and manual centers.
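To make "proportionality" and "slope" concrete, one simple way to compare two groups of centers is to fit a straight line through the origin to each group and compare the slopes. The sketch below does that with invented numbers; they are not Medina's data, and the choice of a fit through the origin is my assumption, not necessarily how his figure was produced.

```python
def slope_through_origin(signatures, yes_votes):
    """Least-squares slope b for the model yes_votes = b * signatures."""
    sxy = sum(s * v for s, v in zip(signatures, yes_votes))
    sxx = sum(s * s for s in signatures)
    return sxy / sxx


# Hypothetical numbers for illustration only, not data from the study.
group_a_sig = [100, 200, 300, 400]
group_a_yes = [101, 198, 303, 399]   # about one Yes vote per signature
group_b_sig = [50, 80, 120, 150]
group_b_yes = [90, 150, 210, 280]    # clearly more Yes votes than signatures

slope_a = slope_through_origin(group_a_sig, group_a_yes)
slope_b = slope_through_origin(group_b_sig, group_b_yes)
print(f"group A slope: {slope_a:.3f}")
print(f"group B slope: {slope_b:.3f}")
```

A slope near 1 means the Yes vote simply tracks the signatures, while a larger slope means more people voted Yes than signed, which is the kind of contrast described for the two lines in Naguanagua.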
7:58:13 PM
|
|
|