Looks like he put forth a great effort in this experiment, for which I truly commend him. However...
Even in the linked blind survey, it does not appear that the tests were actually done blind: the player knew which cap was engaged while playing. That is a potential source of influence which I believe should not be underestimated.
The survey is also complicated, and seems structured to seek feedback on differences which are assumed from the outset to exist. Rather than jump directly to semantic differentials (which is the type of survey I think Kernel's test most closely resembles), I prefer to first establish where the difference limen (or "just noticeable difference") truly lies.
For this purpose it is important to eliminate ideas such as better or worse, and to avoid metaphorical descriptions such as warm, bright, smooth, creamy, etc. Before proceeding with these subjective surveys, I believe it best to carefully test for the very existence of noticeable effects once all opportunity for subjective influence on tone (or the perception thereof) is effectively ruled out. Common tests for this could employ the up-down method, or the odd-one-out, which I felt was better suited to this case.
Either way, simplicity is key (two samples, no descriptions, only whether any difference can be detected), as are control methods to keep the tests blind. Without following these simple rules, a test can be easily (even if unintentionally) corrupted.
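For anyone who wants to try this at home, here is a rough sketch of how an odd-one-out run could be randomized and scored. It is only an illustration under my own assumptions: the cap labels "A" and "B", the function names make_trials and p_value, and the three-clip trial layout (two clips of one cap, one of the other, so guessing is right one time in three) are all made up for the example and are not taken from Kernel's test. The schedule should be generated and kept by someone other than the listener so the test stays blind.

```python
import random
from math import comb


def make_trials(n_trials, seed=None):
    """Build a randomized odd-one-out schedule (hypothetical helper).

    Each trial is (clips, odd_index): three clip labels, two from one cap
    and one from the other, plus the position of the odd clip.
    """
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        # Randomly decide which cap is the odd one out in this trial.
        odd, pair = rng.sample(["A", "B"], 2)
        clips = [pair, pair, odd]
        rng.shuffle(clips)  # randomize the playback order
        trials.append((clips, clips.index(odd)))
    return trials


def p_value(correct, n_trials, chance=1 / 3):
    """Probability of scoring at least `correct` hits by pure guessing."""
    return sum(
        comb(n_trials, k) * chance**k * (1 - chance) ** (n_trials - k)
        for k in range(correct, n_trials + 1)
    )


if __name__ == "__main__":
    schedule = make_trials(20, seed=1)
    for number, (clips, odd_index) in enumerate(schedule, start=1):
        print(number, clips, "odd clip is position", odd_index + 1)
    # Example scoring: the listener picked the odd clip in 12 of 20 trials.
    print("chance of doing this well by guessing:", p_value(12, 20))
```

A small p_value means the score is unlikely to come from guessing alone, which is the only question this kind of test tries to answer. It says nothing about which cap sounds better.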
I do very much like his format for the listening tests though, and may use that as something of a model.
What I would really love is a multitrack web playback system, where four tracks could run synced side by side, and the listener could click A, B, C, or D buttons to bounce back and forth between them. I doubt such a thing is available in that exact form, but perhaps it would not be too hard for a web designer to whip something like this up? Four tracks, repeating in a loop, able to bounce instantly from one track to another, in a high quality audio format - could such a thing be easily done on the web?
Better is highly subjective, so let's ignore the idea.
Most people should be able to hear a difference, IMO. Do I believe someone could pick out cap values? No. I'm simply stating that the two types (mylar/oil) sound so different that someone really paying attention should be able to tell them apart.
Who knows, though? I know a dude who doesn't feel there's any difference between pickups.