John Dee's Private Passion

Shielding of Electrosmog: Some Observations (part 8)

Today I consider effect size as well as statistical significance in the assessment of orgonite as a method of protecting against cell phone RF EMR.

John Dee
Aug 29, 2025

Well then, part 7 proved to be a trifle exciting, even if rigorous statistical testing using Kruskal-Wallis clobbered the notion that we were seeing a bona fide (albeit modest) reduction in the mean RF EMR once orgonite buttons were attached to my Motorola G23 Android. The problem with statistical testing of real-world ‘noisy’ data is that if we don’t have a big enough sample size then no mean difference will be declared statistically significant. On the other hand, if we collect thousands of observations then pretty much any mean difference will be declared statistically significant, even if it carries no real-world meaning. This opens up a can of worms that we ought to cogitate on without delay, though I am going to suggest that normal people skip this entire article if they wish to remain sane...
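
For the statistically inclined, here is a little Python sketch of the first half of that problem. The numbers are simulated stand-ins rather than the actual kitchen readings, but they show how a genuine (if modest) reduction can sail straight past Kruskal-Wallis when the sample is small and the noise is large.

```python
# Illustrative sketch (simulated data, not the part 7 readings): a real ~5%
# reduction buried in measurement noise, with only 20 samples per condition,
# will typically fail to reach p < 0.05 under Kruskal-Wallis.
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(42)

baseline = rng.normal(loc=100.0, scale=15.0, size=20)   # phone alone
orgonite = rng.normal(loc=95.0,  scale=15.0, size=20)   # phone + buttons

stat, p = kruskal(baseline, orgonite)
print(f"Kruskal-Wallis H = {stat:.2f}, p = {p:.3f}")     # usually p > 0.05 here
```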

Muchos Ouchos

Relying solely on p-values for statistical testing of mean differences is inadequate for several interconnected reasons. A p-value only indicates the compatibility of the observed data with the null hypothesis, not the probability that the null hypothesis is true, which is a common misinterpretation. The p-value is not a measure of effect size or the practical importance of a result, which means a statistically significant difference (p<0.05) can be trivial in real-world terms, whilst a non-significant result (p>0.05) can actually represent a scientifically meaningful effect, especially if the study lacks sufficient power. Ouch!

The arbitrary p=0.05 threshold for statistical significance is a convention without inherent scientific justification and leads to simplistic thinking patterns that discard valuable information. This practice is particularly problematic with very large sample sizes where even minuscule differences can easily yield very small p-values, making the result statistically significant but not meaningful. Double ouch!
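
To see that flip side in action, here is another simulated sketch: with millions of observations, a mean shift of just 0.1 units against a noise standard deviation of 15 still earns a vanishingly small p-value, despite being utterly trivial in practical terms.

```python
# Illustrative sketch, simulated data only: with enormous samples even a
# practically meaningless mean shift (0.1 units against a noise SD of 15,
# i.e. well under 0.01 standard deviations) is flagged as highly 'significant'.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
a = rng.normal(loc=100.0, scale=15.0, size=2_000_000)
b = rng.normal(loc=100.1, scale=15.0, size=2_000_000)   # shifted by a trivial 0.1

res = ttest_ind(a, b)
print(f"p = {res.pvalue:.2e}")   # tiny p-value, yet the effect is practically nil
```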

Furthermore, the widespread practice of conducting multiple statistical tests on the same dataset and reporting only those with p<=0.05 (known as ‘p-hacking’ or ‘data dredging’) leads to a high rate of false positives and undermines the reproducibility of research. The American Statistical Association (ASA) has explicitly stated that scientific conclusions should not be based solely on whether a p-value crosses an arbitrary threshold, and that proper inference requires complete transparency about all analyses conducted. A more robust approach involves reporting the actual p-value attained alongside confidence intervals, whilst considering data quality and the practical significance of the findings. Triple ouch!
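
And here is the dredging problem in miniature, again with simulated numbers: run enough tests on pure noise and ‘significant’ results will pop up by chance alone, which is why reporting the actual p-value alongside a confidence interval beats waving a p<0.05 flag.

```python
# Illustrative sketch of 'data dredging' on simulated numbers: run 20 tests on
# pure noise and count how often p < 0.05 turns up by chance, then report a 95%
# confidence interval for the mean difference as a more informative summary.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(1)
false_positives = 0
for _ in range(20):
    a = rng.normal(100.0, 15.0, size=30)     # identical populations:
    b = rng.normal(100.0, 15.0, size=30)     # any 'hit' is a false positive
    if ttest_ind(a, b).pvalue < 0.05:
        false_positives += 1
print(f"'Significant' results from pure noise: {false_positives} out of 20")

# 95% CI for the mean difference (normal approximation, using the last pair above)
diff = a.mean() - b.mean()
se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
print(f"Mean difference {diff:.2f}, 95% CI ({diff - 1.96*se:.2f}, {diff + 1.96*se:.2f})")
```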

If all this nerd speak floats your boat then this paper might be of interest, along with this paper and this paper. Meanwhile, we’ve somehow got to get back to Earth in a meaningful manner when dealing with RF EMR signal means as measured in my kitchen. H’mmm…

Effect Size

Something else we can do to save us from a sea of chaos is to calculate what is known as the effect size.

Effect size is a quantitative measure of the magnitude of a relationship between variables; that is to say it attempts to quantify the difference between sub-groups in a population. It provides numerical information about the ‘strength’ or practical significance of a finding, indicating how meaningful the observed effect is in real-world terms. Unlike statistical significance (p-values), which only tells us whether a genuine effect likely exists, effect size tells us how large or important the effect is. A statistically significant result (p<0.05) with a very small effect size might not be practically meaningful.

Effect size can be expressed in different ways depending on the context. Standardised effect sizes are particularly useful because they allow for comparison across studies with different units of measurement or sample sizes. For instance, Cohen's d expresses the difference between two means in terms of standard deviation units, with values of 0.2, 0.5, and 0.8 generally considered ‘small’, ‘medium’, and ‘large’ effects, respectively. Effect size is crucial for understanding the practical importance of research findings, as a statistically significant result does not guarantee a large or meaningful effect.

So, then, Cohen’s d is our saviour from a sea of chaos, eh? We may well ask what the heck this is when it’s at home. Well, it’s a pretty darn basic concept, being the absolute difference in the means between two sub-groups divided by their pooled standard deviation. If we think of the mean difference as ‘muscle’ and the standard deviation as ‘wobble’, then Cohen’s d is a way of expressing muscle per wobble. Since we take the absolute difference it can never be negative, and there is no upper limit of 1: a d of 1 simply means the two means sit a whole pooled standard deviation apart.
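
For anyone who would rather let Python do the arithmetic, here is a minimal sketch of that calculation; the readings below are made-up placeholders rather than the kitchen measurements, so treat the output purely as an illustration of the formula.

```python
# Minimal sketch of Cohen's d: absolute mean difference divided by the pooled
# standard deviation. The sample readings are hypothetical placeholders.
import numpy as np

def cohens_d(x, y):
    """Absolute difference in means divided by the pooled standard deviation."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    return abs(x.mean() - y.mean()) / np.sqrt(pooled_var)

baseline = [98, 102, 97, 105, 100, 99, 103, 101]   # hypothetical RF readings
orgonite = [95, 100, 96, 101, 97, 98, 99, 96]      # hypothetical readings with buttons
print(f"Cohen's d = {cohens_d(baseline, orgonite):.2f}")
```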

Yes indeedy, I am going to derive Cohen’s d right now using my trusty hand-held vintage calculator so we may see what those orgonite buttons were capable of in terms of effect size.
