Talk:CHSH inequality/Archive 1

Whether or not CHSH actually backed the "CHSH" test

(Caroline Thompson 23:28, 30 Jun 2004 (UTC)) See also a comparison of CHSH and CH74 inequalities. Note that CHSH did not in fact endorse use of the inequality in real optical experiments in the manner that has become customary. The page as it stands does not cover the important matter of how the test statistic is to be calculated from the data. The formula in general use involves normalisation, dividing by the sum of the coincidences. Clauser et al. recognised that this was not a safe practice -- it is liable to bias the test -- and would have divided instead by the total number of pairs emitted. This, however, is impractical on two counts: (a) the number emitted is not known, and (b) to use even a crude estimate of this number would mean all the terms in the test being so small that there would be no chance of violation, since with the known low detection efficiencies it is clear that the number emitted is much larger than the total number of observed coincidences.
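
The difference between the two normalisations can be made concrete with a small calculation. The sketch below is only an illustration: the coincidence counts and the number of emitted pairs are invented, the function names are hypothetical, and the sign placement S = E(a,b) + E(a,b') + E(a',b) - E(a',b') is one common convention, assumed here.

```python
def correlation(n_pp, n_mm, n_pm, n_mp, denominator=None):
    """Correlation estimate from the four coincidence counts.

    If `denominator` is None, normalise by the sum of coincidences
    (the customary practice); otherwise divide by the given number,
    e.g. an assumed number of emitted pairs N.
    """
    same = n_pp + n_mm
    diff = n_pm + n_mp
    if denominator is None:
        denominator = same + diff
    return (same - diff) / denominator


def chsh_statistic(counts, denominator=None):
    """S = E(a,b) + E(a,b') + E(a',b) - E(a',b') for four sets of counts."""
    e_ab, e_abp, e_apb, e_apbp = (correlation(*c, denominator) for c in counts)
    return e_ab + e_abp + e_apb - e_apbp


# Invented counts (n_pp, n_mm, n_pm, n_mp) for the setting pairs
# (a,b), (a,b'), (a',b), (a',b'); not data from any real experiment.
counts = [
    (425, 425, 75, 75),
    (425, 425, 75, 75),
    (425, 425, 75, 75),
    (75, 75, 425, 425),
]
N_EMITTED = 100_000  # assumed number of pairs emitted per setting pair

s_coincidences = chsh_statistic(counts)
s_emitted = chsh_statistic(counts, denominator=N_EMITTED)
print(s_coincidences, s_emitted)
```

With these invented numbers the coincidence-normalised statistic is about 2.8, while the same counts divided by the assumed N give about 0.028, which is the sense in which the terms become too small for any violation.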

They therefore recommended a different test, explained more fully in Clauser and Horne's 1974 paper, Clauser, J F and Horne, M A, “Experimental consequences of objective local theories”, Physical Review D, 10, 526-35 (1974). The test assumes a slightly different experimental procedure. Alain Aspect considered this test inferior, but he did not use the most elegant derivation. His derivation (see quant-ph/0402001) relied heavily on the "fair sampling" assumption, so he assumed the test itself did. Using Clauser and Horne's own derivation it is clear that it does not, and is generally superior.

On second thoughts, readers would be better off consulting the original paper than this page -- I find it both confusing and misleading! They would be even better off reading the Clauser and Horne 1974 paper, which makes it clear that in real experiments the hidden variables that are of logical importance control only the probability of detection. They talk of an "Objective Local Theory" rather than a local hidden variable one. Their approach is, it seems to me, exactly what is needed in optical experiments. Rather than thinking in terms of photons, we need to think in terms of short light pulses. They are emitted with a certain polarisation direction and pass through polarisers to emerge with intensity reduced according to Malus' Law. The intensity at this stage determines the probability of detection. (Caroline Thompson 19:57, 1 Jul 2004 (UTC))

(Caroline Thompson 21:40, 8 Jul 2004 (UTC)) again. Is nobody else interested?

This whole subject of Bell inequalities is a mess that could have been avoided had people (including Alain Aspect) taken more notice of Clauser and Horne's 1974 ideas. The notation there is what is needed. Earlier work is just confusing, with P sometimes standing for a probability, sometimes for a quantum correlation estimate.

I've just been re-reading the CHSH 1969 paper, CH 74 and Bell's original papers. The key thing is that Clauser and Horne did not endorse the use of the "CHSH" inequality in the form in which it is now used. They said emphatically that it could only be used if N, the number of pairs emitted, was known. The derivation they give in 1974 of what I call the CH74 test shows that it does not demand knowledge of N. Incidentally, I now realise why Aspect used a clumsy derivation of the CH74 test in his 2002 talk and thought it required the fair sampling assumption (the standard manoeuvre to bypass the problem over N). He was following the 1969 derivation.

Where the current wiki article goes wrong is in blandly assuming the correspondence:

ρ( j ) = 1 / ∫Γ dλ → 1/({ j = first of J Σ last of J} 1).

I don't see how one can proceed logically working this way around. You have to work, as Bell and CHSH did, just with the hidden variable approach. You then see (and in fact this is clear in Bell's papers) that you are dealing with probability estimates. You must use N as denominator, not this sum over j. This is because you must, in real optical experiments, allow for non-detections and you cannot assume that these will not depend on the detector setting. (See the Bell test loopholes page and my Chaotic Ball paper. The latter, incidentally, includes the diagram that shows that the "Bell inequality" introduced in the Bell's Theorem page is simply not valid if there are non-detections, even if you do use N.)
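
The role of setting-dependent non-detections can be illustrated with a toy calculation. This is not the Chaotic Ball model and is not taken from any of the cited papers. It assumes a hypothetical deterministic rule: both pulses share a polarisation angle lam, and a channel fires only when lam lies within an assumed half-width of that channel's axis (equivalently, when the Malus' Law intensity exceeds a threshold). The settings and the sign placement S = E(a,b) + E(a,b') + E(a',b) - E(a',b') are also assumptions.

```python
import math

HALF_WIDTH = 3 * math.pi / 16  # assumed acceptance half-width (hypothetical)
N_EMITTED = 1600               # one emitted pair per grid value of lam


def outcome(setting, lam, half_width=HALF_WIDTH):
    """Return +1, -1, or 0 (no detection) for analyser `setting` and angle `lam`."""
    d = (lam - setting) % math.pi
    d = min(d, math.pi - d)            # fold the angular distance into [0, pi/2]
    if d < half_width:
        return 1                       # near the analyser axis: '+' channel fires
    if math.pi / 2 - d < half_width:
        return -1                      # near the orthogonal axis: '-' channel fires
    return 0                           # in between: neither channel fires


def correlations(a, b, n=N_EMITTED):
    """Return (E divided by coincidences, E divided by emitted pairs)."""
    same = diff = 0
    for k in range(n):
        lam = (k + 0.5) * math.pi / n  # uniform grid over [0, pi)
        product = outcome(a, lam) * outcome(b, lam)
        if product > 0:
            same += 1
        elif product < 0:
            diff += 1
    return (same - diff) / (same + diff), (same - diff) / n


def chsh(a, a_prime, b, b_prime):
    """Return S under both normalisations."""
    terms = [correlations(a, b), correlations(a, b_prime),
             correlations(a_prime, b), correlations(a_prime, b_prime)]
    signs = [1, 1, 1, -1]
    s_coinc = sum(s * t[0] for s, t in zip(signs, terms))
    s_emit = sum(s * t[1] for s, t in zip(signs, terms))
    return s_coinc, s_emit


s_coinc, s_emit = chsh(0.0, math.pi / 4, math.pi / 8, -math.pi / 8)
print(s_coinc, s_emit)
```

For these assumed settings the statistic normalised by coincidences reaches 4, while the same local model normalised by the number of emitted pairs gives 2 and so stays within the bound. The toy rule is deliberately extreme; it only shows that dividing by the coincidence sum does not by itself guarantee the inequality for a local model with non-detections.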

Returning to the wiki article, it does seem very professional, even if I do grumble! I found it copied in a couple of other free encyclopaedias. Does anyone know who wrote it? I'd better not simply scrap it and start again!


(Frank W ~@) R 05:32, 20 Jul 2004 (UTC)): Hallo Caroline Thompson,

I just noticed your comments, and I'd like to give at least a rapid "heads-up".

Where the current wiki article goes wrong is in blandly assuming the correspondence:
[ ρ( j ) = 1 / ∫Γ dλ → 1/({ j = first of J Σ last of J} 1) ]

Rather than being wrong, that's where the present article avoids the discussion of possible additional assumptions and considerations, which might broaden the applicability of the derived CHSH inequality.
E.g. the consideration: "what if we didn't merely try to express and summarize the actual counts Σ nj themselves, but instead some generally unknowable numbers N >= Σ nj, of which the counts Σ nj are merely a more or less representative sample ?" ...

you must, in real optical experiments, allow for non-detections

No, not unconditionally: it depends on the quantity which is to be measured. One might for instance be strictly and only concerned with (the counts of) actual "detections". As the simplest and experimentally immediate case, that's the approach taken in the article. Considerations of whether and how to count possible "non-detections", and their implications for the inequality and/or its applicability, are necessarily in addition to dealing with "detections".
However, the present article would surely benefit from outlining, or linking to, such additional implications.

you cannot assume that these will not depend on the detector setting.

At least: all assumptions must be made explicit. Which assumptions one or the other can or cannot follow may depend on purpose and rigour.
Myself, I'd try to avoid any assumptions concerning "non-detections"; as (incidentally? &) does the present article.

I'd better not simply scrap it and start again!

I appreciate your being patient and communicative. Of course I'd rather see the article contents enhanced, than diminished;
most of all the continued accounting of the assumption (made in order to derive this and related inequalities) of a constant total domain Γ in all sets of trials, independent and in spite of different "detector settings" in various subsets of trials. Frank W ~@) R 05:32, 20 Jul 2004 (UTC)


Thanks for the feedback. You are right in a way -- the article makes this assumption about the equivalence of the sum over j and the integral that CHSH really requires. It may not be a reasonable assumption but they've stated it and develop the consequences.
Do they do this logically, though?
The inequality they land up with is true, so long as you remember that the terms represent integrals over the whole of the hidden variable space. What is not true is the next sentence:
"For certain settings, the corresponding experimentally determined correlation numbers which are necessarily obtained from counts in four disjoint sets of trials, can be found to fail the CHSH inequality with considerable significance; as demonstrated for instance by Alain Aspect et al.."
The estimated correlation numbers used in the tests are the ones using the sum over j as denominator, but the inequality does not apply to these -- unless, of course, the infamous "fair sampling" assumption is correct, and the samples in each set are between them scattered evenly over the whole of the space Γ. (We then still don't have unbiased estimates of the terms as originally defined but this does not matter so long as all points in Γ have the same chance of being detected. We could have derived the same inequality taking this constant chance (the detector efficiency) into account. (I think. Haven't quite checked this. I find the notation of the article very off-putting compared to Clauser and Horne's 1974 derivation.))
Whether or not the next sentence is true is debatable. The article says:
"Therefore the assumptions based on which the CHSH inequality is derived are collectively unsuitable to represent all experimental results; namely the assumption of local hidden variables from one and the same constant total domain Γ in all sets of trials."
What are they implying? That the inequality as applied in practice is not valid (with which I agree) or that local hidden variable theories are incompatible with the observations? Are they saying that it is a necessary requirement of local HV theories that the detections should come evenly from all points in the domain Γ? If so they are wrong. Hidden variable theories are, as the Clauser and Horne 1974 paper clearly expresses, more general than this, and can perfectly well result in "variable detection probabilities" -- as has been known since Pearle, 1970.
So I'm still in a quandary as to how to improve the page. I'll have a try some day.
Caroline Thompson 21:33, 20 Jul 2004 (UTC)

What the present edition of the article says/means

(Frank W ~@) R 04:10, 23 Jul 2004 (UTC)) Hallo Caroline Thompson,

the article makes this assumption about the equivalence of the sum over j and the integral that CHSH really requires

Rather than summing or integrating counts of "detections", some seem to entertain assumptions such as the infamous "fair sampling" in order to sum or integrate numbers of "non-detections" as well (or instead). Such assumptions are those to which I objected above; not least because the CHSH inequality can be derived and related experiments be discussed in terms of "detections" alone.

The estimated correlation numbers used in the tests are the ones using the sum over j as denominator

Yes -- however: not estimated, but plainly observationally counted (at least in the simplest case, without contamination by any more or less objectionable assumptions about "sampling of nondetections").

but the inequality does not apply to these -- unless, of course, the infamous "fair sampling" assumption is [invoked]

On the contrary: If the inequality is strictly expressed in terms of counts of "detections", then "sampling" is trivial: we count and consider "detections", all "detections" and nothing but the "detections".
Only by making additional assumptions about numbers of "non-detections" can the inequality be expressed in terms of such numbers as well.

[... the assumption of local hidden variables from one and the same constant total domain ...] What are they implying?

Your above reference, PRD10, 526 (1974), spells out the crucial additional assumption most directly in note (13):
"We should emphasize that an assumption is made here. By writing the density as ρ( λ ), instead of a more general conditional ρ( λ | a, b ) we deny such objective and local possibilities as ..."
This assumption is far stronger than the ubiquitous
"p12( λ, a, b ) = p1( λ, a ) p2( λ, b ) (2')"
which (per definition of "objective and local possibility") must hold (merely) and separately for any one particular value λ, with a corresponding one particular pair (a, b).
A particularly plain "objective and local possibility" which is being denied through this additional assumption (or rather: requirement) would be:
to identify Γ with the entire set of trials in which the "detections" were made, and (thereby) to identify any one value of the variable λ as one particular trial (index) in which one particular pair of "detections" were made (one by A, and one by B); i. e. generally for one particular pair (a, b), but not for any other, distinct pairs such as (a, b') or (a', b) or (a', b').
For this "possibility", for instance, the inequality cannot be derived. Therefore this "possibility" remains consistent even with experimental results which are not summarized by the inequality.
Perhaps the article should give this point more emphasis ...
Regards, Frank W ~@) R 04:10, 23 Jul 2004 (UTC)
I just prepared a careful response and accidentally lost it! The main point I want to make is that, as far as I know, a derivation of the CHSH inequality using only the observed detections that is valid for cases in which there are in fact some non-detections does not exist. If you think otherwise, can you tell me where to find such a derivation?
But in point of fact I can prove that the inequality is not true in such cases. My Chaotic Ball model is a counter-example, sufficient to refute any such inequality.
Clauser and Horne's note 13 concerns possibilities that are not relevant in experiments conducted in the spirit Bell intended. They are talking of effects of the detectors on the source, which, with distant detectors, can be assumed not to happen. Your example is not relevant unless the inequality can be proved when restricted to detected events -- which my model proves it can't!
Incidentally, you might be interested in my new little paper, comparing the CHSH and CH74 inequalities. The CH74 inequality does not need any assumptions about fair sampling. It is designed for use using only observed detections. It was used for all experiments up to 1980 and was, I think, dropped in favour of the (mis-interpreted) CHSH test as a result of misunderstandings.
Caroline Thompson 18:18, 23 Jul 2004 (UTC)

Clauser and Horne's note 13, PRD10, 526 (1974)

(Frank W ~@) R 04:12, 26 Jul 2004 (UTC)) Hallo Caroline Thompson,

Clauser and Horne's note 13 concerns possibilities that are not relevant in experiments conducted in the spirit Bell intended.

The possibilities which Clauser and Horne describe in note 13 appear considerably less general and far-reaching than what is implied by the exact mathematical condition, which Clauser and Horne require in their derivation, and which they state quite unambiguously (also in note 13). They don't seem to address the (to any experimentalist presumably very plain) possibility at all, that the trial index presents the "hidden variable". Yet their mathematical condition clearly rules out this possibility, because the required four different "pairs of settings" inevitably refer to four distinct (though not necessarily disjoint) sets of trials. Consequently trials must have been considered that belong to one particular set, described by one particular "pair of settings" but not another. For the corresponding trial indices (as values of the "hidden variable") the condition stipulated by Clauser and Horne is therefore not satisfied, and the inequality cannot be derived. Accordingly, finding the inequality experimentally violated cannot conflict with this particularly plain possibility; regardless of experimental sensitivity or theoretical intentions.
While this is already quite obvious in the present article, perhaps this and other suitable "possibilities" may be spelled out in more detail in Objective local theories which are not experimentally refuted.

[...] a derivation of the CHSH inequality using only the observed detections that is valid for cases in which there are in fact some non-detections does not exist

I don't presume to judge the validity of any statement concerning "non-detections"; except to note that it is surely one valid possibility to strictly ignore any "non-detections", in definitions and derivations as well as in corresponding experimental counts ... [Continued in Chaotic Ball section]

The Chaotic Ball model

(Frank W ~@) R 04:12, 26 Jul 2004 (UTC)) -- continued My Chaotic Ball model [...]

What seems to me most questionable and puzzling in this model is that you don't consider the orientation angle φ defined (according to Malus, in terms of the described counts) as
"ArcCos[ (NN + SS - NS - SN) / (NN + SS + NS + SN) ]",
but instead some apparently geometric "angle φ between directions", presumably
"ArcCos[ (ab^2 - 1/2 (na^2 + sa^2 + nb^2 + sb^2 - ns^2)) / Sqrt[ (na^2 + sa^2 - 1/2 ns^2) (nb^2 + sb^2 - 1/2 ns^2) ] ]"
where "ab" is the distance between Anne and Bob, "na" is the distance between Anne and point N on the ball, etc.
Those are two differently defined quantities, which must be distinguished carefully. (Usually one considers only the former, i. e. the orientation angle between detector pairs; especially in order to evaluate their respective "settings" in any set of trials in the first place.) Frank W ~@) R 04:12, 26 Jul 2004 (UTC)
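
The count-based definition quoted above can be written out directly. The sketch below simply evaluates the expression as quoted in this discussion; the function name and the example counts are hypothetical, and nothing here settles whether this quantity should be identified with the detector setting difference.

```python
import math


def orientation_angle_from_counts(nn, ss, ns, sn):
    """Evaluate ArcCos[(NN + SS - NS - SN) / (NN + SS + NS + SN)] in radians."""
    total = nn + ss + ns + sn
    if total == 0:
        raise ValueError("no coincidences recorded, the ratio is undefined")
    # Non-negative counts keep the ratio inside [-1, 1], so acos is defined.
    return math.acos((nn + ss - ns - sn) / total)


# Invented counts for illustration only.
phi = orientation_angle_from_counts(425, 425, 75, 75)
print(phi)
```

Equal numbers of like and unlike coincidences give π/2, and counts with no unlike coincidences give 0.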
Frank, we seem to be talking entirely at cross-purposes. Before we go any further could you please tell me where you have seen a derivation of the inequality that you think is a Bell one but that is somehow derived directly from the kind of approach used on the first page of the wiki article? As far as I know, such an inequality does not exist! If it did I might be able to make sense of your concerns re Clauser and Horne's note 13, but as it is, the "trial index" simply is not the kind of thing that can usefully be used as a hidden variable. It is not the kind of thing considered by Bell, Clauser, Horne, Shimony or anybody else actually concerned with Bell tests. It contains no useful information that could contribute to the correlations. Bell's idea is concerned with hidden variables that actually *do* something in the experiment.
Of course the characteristics of the source could vary with trial index, but if it did the experimenter would take measures to allow for it, such as randomising the order of the four different subexperiments and repeating them many times. In practice, they monitor carefully the rate of production of photon pairs and do a certain amount of randomisation and replication. There is no reason why the trial index per se should affect the result. There is also no reason why the detector settings should affect the actual pairs produced. In the version of Bell test you are considering (which is not, I think, a valid one), by restricting your attention to detected pairs only you land up with a sample that can depend on the detector settings. But this does not give rise to a valid Bell test. This is the reason that Clauser, Horne et al insisted that, in the estimation of terms for the CHSH inequality, the whole hidden variable space be used.
Re my Chaotic Ball model, I don't understand how you can fail to understand a diagram that shows precisely what φ is: the angle between the directions in which the two observers are looking. As explained in the text, it corresponds to the difference between the two detector settings. It has absolutely nothing to do with the algebraic expression you've invented. And the other definition that you take as standard is also an invented one. It is probably not invented by you but by whoever you are following. The angle between detector settings is not something to be derived from the observed counts! The experimenter sets the two detector orientations. It is simply the difference between them.
The model shows immediately what is involved if you don't look at the whole HV space: the shaded areas on the diagram (fig. 5) don't add up to the total area.
Caroline Thompson 10:18, 26 Jul 2004 (UTC)



(Frank W ~@) R 05:47, 27 Jul 2004 (UTC)) Hallo Caroline Thompson,

please tell me where you have seen a derivation of the inequality

In the original reference given in the article, PRL23, 880 (1969); supplemented by the definitions detailed in PRD10, 526 (1974), in particular the equations (1) and (2), p. 527. (Clauser and Horne were authors of both these references, of course ...)

the "trial index" simply is not the kind of thing that can usefully be used as a hidden variable

I beg to differ. According to the comprehensive definitions of PRD10, on p. 527, any one value "λ", a.k.a. the "hidden variable", is (not more and not less than) to
"[...] Denote [...] the state specification of the above system at a time intermediate between its emission and its impingement on either apparatus. [Where] we do not necessarily make a commitment to the completeness of this state specification [...]"
The index of any one trial certainly *denotes* the state of the system in this trial unambiguously; of course without much claim of completeness of state specification (but that's permitted).
(Indeed, "one emission and one subsequent pair of impingements on either apparatus" seems a suitable definition of what constitutes one "trial" in this context in the first place.)

Bell's idea is concerned with hidden variables that actually *do* something in the experiment.

No -- the various instances of "the system" actually *do* something in the various experimental trials. Variables only serve to either characterize the variations between the instances or states (quantitatively, as "exposed variables"); or else, if such characteristics remain hidden, then merely to distinctly denote the instances or states.

randomising the order of the four different subexperiments and repeating them many times

Does this have any bearing on the conditions (such as "13" above) under which the inequality either can or cannot be derived ??
After all, distinct trials remain distinct regardless of their order, and regardless of how many more trials are being observed.

I don't understand how you can fail to understand a diagram that shows precisely what φ is: the angle between the directions in which the two observers are looking

We may have different interpretations of "precision". Precisely *towards where* did you suppose should either one of the two have directed ... their looks ?
I presumed (effectively): each "towards the center C of the ball"
(whereby
"nc = sc = 1/2 ns", i. e.
"the distance between point N on the ball and the center of the ball equals the distance between point S on the ball and the center of the ball, which in turn equals half the distance between point N on the ball and point S on the ball").
Aren't these the (geometric) "directions" you're considering ?

It has absolutely nothing to do with the algebraic expression you've invented.

Rather than invent I merely applied -- the "Law of cosines":
"(ca^2 + ns^2/4 - na^2) / (ca ns) = (sa^2 - ca^2 - ns^2/4) / (ca ns)", and
(to eliminate any reference to the presumed "C", because this doesn't even appear in your diagram)
"φ =?= ArcCos[ (ab^2 - 1/2 (na^2 + sa^2 + nb^2 + sb^2 - ns^2)) / Sqrt[ (na^2 + sa^2 - 1/2 ns^2) (nb^2 + sb^2 - 1/2 ns^2) ] ]"
How precisely do you suppose to define φ, the "angle between the directions" (discussed above) instead ?
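
The distance-based expression can be checked numerically against an angle computed directly from coordinates. The sketch below assumes hypothetical planar positions for N, S, Anne (A) and Bob (B), with C the midpoint of N and S as presumed above. By the law of cosines and the median relation ca^2 = (na^2 + sa^2)/2 - ns^2/4, the expression as written evaluates to the arccosine of minus the cosine of the angle ACB, that is, to π minus the angle at C between the lines CA and CB. Whether that supplementary angle is the intended convention is not stated in the discussion.

```python
import math


def angle_from_distances(ab, na, sa, nb, sb, ns):
    """Evaluate the quoted expression in terms of pairwise distances (radians)."""
    numerator = ab**2 - 0.5 * (na**2 + sa**2 + nb**2 + sb**2 - ns**2)
    denominator = math.sqrt((na**2 + sa**2 - 0.5 * ns**2)
                            * (nb**2 + sb**2 - 0.5 * ns**2))
    # Clamp against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, numerator / denominator)))


def angle_at_centre(a, b, c):
    """Angle ACB computed from coordinates via the dot product (radians)."""
    va = (a[0] - c[0], a[1] - c[1])
    vb = (b[0] - c[0], b[1] - c[1])
    cosine = (va[0] * vb[0] + va[1] * vb[1]) / (math.hypot(*va) * math.hypot(*vb))
    return math.acos(max(-1.0, min(1.0, cosine)))


# Hypothetical coordinates: ball of radius 1 centred at the origin.
N, S, C = (0.0, 1.0), (0.0, -1.0), (0.0, 0.0)
A = (3.0, 0.0)
B = (3.0 * math.cos(1.0), 3.0 * math.sin(1.0))  # angle ACB is 1 radian

quoted = angle_from_distances(math.dist(A, B), math.dist(N, A), math.dist(S, A),
                              math.dist(N, B), math.dist(S, B), math.dist(N, S))
direct = angle_at_centre(A, B, C)
print(quoted, direct, math.pi - direct)
```

In this example the direct angle is 1 radian and the quoted expression returns π - 1.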

The angle between detector settings is not something to be derived from the observed counts!

How do you suppose to measure values of such angles instead ?!
(if not in application of Malus' definition, as I usually do, i. e. in this context as
"ArcCos[ (NN + SS - NS - SN) / (NN + SS + NS + SN) ]").

The experimenter sets the two detector orientations.

Yes, the experimenter(s) may leave or shake up (and most especially: denote) the apparatus parts as they may.
But how else would the/any experimenter(s) *measure* a numerical value of the corresponding orientation angle ?

It is simply the difference between them.

"Difference" (as result of Subtraction) requires *numbers* ("minuend" and "subtrahend") as arguments.
How do you suppose that any particular suitable numerical values could be assigned to the "detector orientations" under consideration ? Geometrically ??
Regards, Frank W ~@) R 05:47, 27 Jul 2004 (UTC)

Caroline Thompson 08:40, 27 Jul 2004 (UTC) Hello Frank W

Perhaps we should continue this conversation privately?

(Frank W ~@) R 03:35, 29 Jul 2004 (UTC)) Perhaps our exchange may be of interest to future editors of this article ?

I think you are gravely misinterpreting Clauser and Horne. They (in PRD10, 526 (1974)) are talking about the physical state of the system.

Yes; to be denoted by "λ".

Unless "the system" is assumed to be something that remains essentially unchanged from trial to trial we can't usefully conduct Bell test experiments.

Unless you deny outright that, over the course of a sequence of experimental trials, "the system state" could change at all, to be appropriately denoted by distinct values of the "hidden variable" (from the set Γ) --
under which circumstances do you suppose "the system state" could undergo change, if not in general from one trial of the sequence to the next trial ?

But what is it we are really trying to reach agreement on? The major issue, I thought, was whether or not the CHSH test as currently applied, using the sum of coincidence rates as denominator, is valid?

I tried to address this already above: that's certainly valid, especially as being free of any particular assumptions which would have to be made in order to consider "non-detections".

I maintain that nowhere in Clauser and Horne's paper, or the CHSH 1969 one, will you find them saying anything other than that the test should be used with N as denominator, where N is the number of emitted pairs. OK, so they don't say that explicitly but they do say the test must not be used unless N is known. Their equation (1) (p 527 of the CH74 paper) is in terms of ratios with N as denominator. Their derivation would not make sense if anything else was used.

All right, agreed. But neither do they appear decisive about how to obtain the number "N", the "number of emissions" itself.
Is "N" meant to be defined as "{ j = first of J Σ last of J}( nj (A↑) + nj (A↓) )" =(per definition)= "{ j = first of J Σ last of J}( nj (B «) - nj (B ») )",
or is consideration of "third options" permitted, e. g. "A_no_detection" and "B_no_detection", with the additional corresponding numbers "nj ( A_no_detection )" and "nj ( B_no_detection )" ?
(The present article uses such elaborate notation to allow these cases to be distinguished and intelligently compared in the first place.)
The former case is based on experimental observations and the corresponding counts alone, and is thereby self-evidently valid. The latter requires additional assumptions to evaluate "nj ( A_no_detection )" and "nj ( B_no_detection )"; assumptions which may be considered objectionable.

It does not make sense if we try and restrict our ensemble to the set of detected pairs.

Your claim and my rejoinder obviously depend on what's considered "making sense". In experimental physics especially, sense is being made from counts (of what has been directly observed/sensed). Accordingly it is at least one sensible approach, as well as the simplest, to define and count the "number of emissions", "N" as "number of detections" which is directly available.


At this stage perhaps we should consider some particular application of their test? If you do so, I think your whole problem with the angles of the detectors will be cleared up automatically. Yes, they are real geometrical angles.

I very much doubt that; after all, in comparisons to the so-called "predictions of QM" in the various references the numbers "a - b", "a' - b" etc. are employed as if they were orientation angles (in the sense of Malus). Of course my doubts could be alleviated if we'd consider some particular application:

We can measure them with a protractor!

Well, at least a protractor is supposed to show marks labelled with numbers corresponding to values of geometrical angles. These would have had to be measured beforehand, in order to put the *appropriate* number to any particular mark on any particular protractor in the first place; and quite equally so on all "good" protractors.
I've found a web-page that pictures a protractor and (how friendly! :) points out one of its conspicuous parts by name: as the "index" (a.k.a. "origin" or "apex"). A procedure is provided as well: first one is supposed to "place the index" ... where ??, or on what ?? ...
(Also, the actual initial measurement of geometrical angles is of course based on the Law of cosines, which in turn requires measured values of distance ratios. Merely painting the integers "0" through "180" somewhere and anywhere "around the index" perhaps doesn't necessarily make a meaningful protractor.)

There is no need for any of your geometry in relation to my Chaotic Ball model (http://arXiv.org/abs/quant-ph/9611037), which you seem determined to misunderstand.


Could you for starters please point out just where the "index" of a protractor would have to be placed, in the presumably "precise diagram" (fig. 2)? My above guess, "(on) the center C of the ball" may have been a misunderstanding, of course. (I also note that the "Vectors a and b" of fig. 2 have no intersection drawn at all ...)

But perhaps more importantly, if you think you can estimate the angle from the observed counts (assuming Malus' Law) this is entirely begging the issue!

Well, it certainly underscores the importance of CHSH's suggestions as thought experiments; on par with considerations by EPR, Bohm, Bell, Wigner and d'Espagnat, Tsirelson, as well as GHZ. Preparing corresponding actual set-ups, actually recording "detections" and analyzing these data as described has largely pedagogical value.

You will have assumed everything the tests are intended to prove.

The proof that the CHSH inequality can be experimentally found violated has indeed been stated a priori: per eq. (4) of PRL23, 880 (1969).
The only remaining experimental tests are to test whether or not suitably decisive values of the orientation angles, e. g.
      b - a = a - b' = a' - b = π/8,
were actually attained in particular sets of trials.
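
For reference, the angles just listed can be put into the usual ideal prediction. The sketch below assumes the textbook quantum-mechanical correlation E(x, y) = cos 2(x - y) for polarisation-entangled photon pairs and the sign placement S = E(a,b) + E(a,b') + E(a',b) - E(a',b'); neither is spelled out in this discussion, so both are assumptions, as is the choice a = 0.

```python
import math


def e_qm(x, y):
    """Assumed ideal correlation for analyser angles x and y (radians)."""
    return math.cos(2.0 * (x - y))


# Settings satisfying b - a = a - b' = a' - b = pi/8, with a = 0 chosen arbitrarily.
a = 0.0
b = a + math.pi / 8
b_prime = a - math.pi / 8
a_prime = b + math.pi / 8

s_qm = e_qm(a, b) + e_qm(a, b_prime) + e_qm(a_prime, b) - e_qm(a_prime, b_prime)
print(s_qm, 2 * math.sqrt(2))
```

Under these assumptions the combination equals 2√2, above the bound of 2 that the inequality places on the same combination.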

I have seen the formula you use ("ArcCos[ (NN + SS - NS - SN) / (NN + SS + NS + SN) ]") elsewhere, but that does not make it meaningful.

The fact that this formula represents a mathematically unambiguous expression in terms of actual experimental counts (based on your own definition) does make it meaningful; a physical quantity to be measured. Since the days of Malus, this quantity even has a name: the "orientation angle", e. g. between a pair of "polarizer axes" (or in the later version: between a pair of "analyzer axes").

Caroline Thompson 08:40, 27 Jul 2004 (UTC)

Regards, Frank W ~@) R 03:35, 29 Jul 2004 (UTC)