Re: Why am I no longer an anarchist

Posted: Sun Sep 08, 2024 10:11 pm
by brimstoneSalad
teo123 wrote: Sun Aug 25, 2024 12:02 pm If that paper is critically flawed, then it would only be ethical to contact all of those people I misled and explain that to them. That would definitely destroy my reputation.
You could publish a retraction if you can find that it's flawed, or you could do more work to support it. Admitting you were wrong due to an honest mistake will not destroy your reputation, particularly given your young age. Rather, it may cement it as somebody who values integrity.

Do more work on this before you jump to the conclusion that it's flawed. I have not spent that much time on it.
teo123 wrote: Mon Sep 02, 2024 6:03 pm
brimstoneSalad wrote:Chance is the null hypothesis, so you have the burden of proof to show it's something else.
I am not sure that's a good principle.
Doesn't really matter what you think here Teo. You're likely misunderstanding it. I'm not getting drawn into a discussion on this. Disproving the null hypothesis is the whole point of P value and statistical analysis of results. First, show there even is something there not easily explained by chance.

Re: Why am I no longer an anarchist

Posted: Tue Sep 10, 2024 4:06 am
by teo123
brimstoneSalad wrote:Do more work on this before you jump to the conclusion that it's flawed.
And you told me in 2016 that I can safely reject any idea I come up with as being as dumb as a rock. Does that still apply?
brimstoneSalad wrote:Doesn't really matter what you think here Teo. You're likely misunderstanding it. I'm not getting drawn into a discussion on this. Disproving the null hypothesis is the whole point of P value and statistical analysis of results. First, show there even is something there not easily explained by chance.
It's simply not obvious to me that one who is, in the absence of a proper statistical analysis, claiming that some pattern is coincidental is more likely to be right than one who is claiming it's a real pattern.
Levy claimed that Semmelweis's discovery, that mortality rates from puerperal fever dropped significantly after the introduction of handwashing, was coincidental (that puerperal fever is seasonal). He was obviously wrong.
On the other hand, yeah, one of the main arguments for Greenberg's Amerind hypothesis (that all Native American languages except the Inuit languages share a recent common ancestor) was the n-m pattern in the personal pronouns (that the word for "I" tends to start with 'n' and the word for "you" with 'm'). That pattern turned out not to actually be statistically significant once you control for the fact that many languages around the world tend to have nasal sounds in their pronouns.
And that's kind of irrelevant here since I actually have a statistical analysis: basic information theory suggesting that the probability of that k-r pattern in the Croatian river names occurring by chance is between 1/300 and 1/17. One who is claiming that some pattern is coincidental in spite of a statistical analysis showing it's statistically significant is... probably wrong, right?
The problem is that, at least in soft sciences (I am not sure how it is in the hard sciences, but I suppose it's similar.), you can always say "But maybe some more appropriate model would suggest that pattern is not actually statistically significant.". Like Neuralbeans says, maybe a model that takes into account the supposed fact that different types of words (nouns, verbs, adjectives...) have different collision entropy would suggest that k-r pattern is not actually statistically significant. Statistical analyses are never absolute proofs that some pattern is statistically significant.
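
For readers unfamiliar with the term, collision entropy (Rényi entropy of order 2) is the negative base-2 logarithm of the chance that two independently drawn items coincide. The sketch below only illustrates that definition on two-consonant prefixes. It is not the method used in the paper under discussion; the word list is made up, and the `first_two_consonants` helper is a hypothetical simplification that ignores digraphs and syllabic 'r'.

```python
import math
from collections import Counter


def collision_probability(items):
    """Chance that two independent draws from `items` give the same value."""
    counts = Counter(items)
    total = sum(counts.values())
    return sum((c / total) ** 2 for c in counts.values())


def collision_entropy(items):
    """Renyi entropy of order 2, in bits: -log2 of the collision probability."""
    return -math.log2(collision_probability(items))


def first_two_consonants(word, vowels="aeiou"):
    """Hypothetical helper: first two non-vowel letters of a word."""
    consonants = [ch for ch in word.lower() if ch.isalpha() and ch not in vowels]
    return "".join(consonants[:2])


# Made-up toy words for illustration only, not real data.
toy_words = ["kara", "kora", "mila", "tona"]
prefixes = [first_two_consonants(w) for w in toy_words]

print("prefixes:", prefixes)
print("collision probability:", collision_probability(prefixes))
print("collision entropy (bits):", collision_entropy(prefixes))
```

The lower the collision entropy of a class of words, the more often two of them share a prefix by chance alone, which is why the objection about different word classes having different collision entropy matters for the estimate.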

Re: Why am I no longer an anarchist

Posted: Tue Sep 10, 2024 12:56 pm
by brimstoneSalad
teo123 wrote: Tue Sep 10, 2024 4:06 am
brimstoneSalad wrote:Do more work on this before you jump to the conclusion that it's flawed.
And you told me in 2016 that I can safely reject any idea I come up with as being as dumb as a rock. Does that still apply?
I don't know. It has been eight years, but you're also still pulling the same stuff you were in your flat Earth days.
teo123 wrote: Tue Sep 10, 2024 4:06 am It's simply not obvious to me that one who is, in the absence of a proper statistical analysis, claiming that some pattern is coincidental is more likely to be right than one who is claiming it's a real pattern.
The pattern must be defined more clearly. If it is merely the appearance of a pattern, dug up out of a huge amount of data by selecting ad hoc a slice of data that fits that apparent pattern, then apparent patterns likely vastly outnumber actual patterns. So yes, we should assume it's coincidence.

As you demonstrate a better P value relative to the slice that ad hoc selection holds in that original data pool, you increase the chances of it not being coincidence. P value standards for papers are actually arbitrary; in some cases they should be far stricter to match the circumstances of the pattern's "discovery". Experimental P values in the harder sciences are different as long as we force all study results to be shared.

teo123 wrote: Tue Sep 10, 2024 4:06 am And that's kind of irrelevant here since I actually have a statistical analysis: basic information theory suggesting that the probability of that k-r pattern in the Croatian river names occurring by chance is between 1/300 and 1/17. One who is claiming that some pattern is coincidental in spite of a statistical analysis showing it's statistically significant is... probably wrong, right?
In a soft science where the only thing done was data analysis and no actual experiments, no. I explained why previously.
If it's only 1/17, it's probably coincidence.

It's not hard to look at a large data set and select a subset of that data which fits a pattern that had a 1/17 chance of arising by coincidence.

For instance, if I roll a quarter million dice of various colors, I will find different patterns among the different colors. What are the odds of the fifty six-sided yellowish green dice with silvery flecks not rolling any 2s? It seems improbable at first (about one in ten thousand), but once multiplied by the number of possible sets I cherry-picked from, I find the odds are extremely good, and a "coincidence" of this specific nature (no 2s) somewhere in the data set is about 50-50 odds. Add to that looking for other similar coincidences (no 1s, no 3s, etc.) and the odds increase linearly.

You have looked at an entire language (and you have done so perhaps unknowingly just by being a native speaker and intuitively looking for apparent patterns), and you've found one niche category which appears to fit a specific pattern with a very underwhelming 1/17 odds of being coincidence. I think you will find many more patterns out of coincidence with those odds or better.

If it's 1/300 you have a better argument, but I still don't put your odds very high due to the large number of words in Croatian and the small set you're dealing with (river names, right?).

teo123 wrote: Tue Sep 10, 2024 4:06 am The problem is that, at least in soft sciences (I am not sure how it is in the hard sciences, but I suppose it's similar.), you can always say "But maybe some more appropriate model would suggest that pattern is not actually statistically significant.".
If you want to be in the soft sciences, that's something you have to deal with. In the hard sciences, that's more difficult and usually involves pointing out a flaw with the experimental setup.

In case it's not obvious, here's the equation for the odds of not rolling 2s in the colored dice analysis, so you can apply it to your own field:

(5/6)^50 * (250,000 / 50)

Here 5/6 is the odds of one die not rolling a 2, 50 is the number of rolls in the subset chosen, 250,000 is the total number of dice, and 250,000 / 50 is the number of subsets of that size in the total.

This of course doesn't fully represent the entire set, because there's overlap. You could also form a set of all silver flecked dice, all green dice regardless of flecks, etc. which makes the odds of a pattern with this P value occurring even higher since dice can be double-counted for inclusion in other mixed sets. It must also be compounded by every other hypothesis of the same general type (like all cases of not rolling a specific number). Doing that makes the odds of not finding MANY coincidences at the 1/10,000 level astronomically small.
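
The dice arithmetic above can be checked with a short sketch. It assumes the 250,000 dice split into 5,000 non-overlapping color groups of 50, which is the simplest reading of the equation, and that all dice are fair and independent. The product in the equation is the expected number of groups with no 2s; the chance that at least one such group exists is the related quantity 1 - (1 - p)^n. The function names are illustrative only.

```python
def no_face_probability(rolls: int, sides: int = 6) -> float:
    """Chance that one specific face never appears in `rolls` fair rolls."""
    return ((sides - 1) / sides) ** rolls


def chance_of_at_least_one(p_single: float, groups: int) -> float:
    """Chance that at least one of `groups` independent groups shows the pattern."""
    return 1.0 - (1.0 - p_single) ** groups


TOTAL_DICE = 250_000  # from the post
SUBSET_SIZE = 50      # from the post
# Assumption: the dice split into equal, non-overlapping color groups.
groups = TOTAL_DICE // SUBSET_SIZE

p_single = no_face_probability(SUBSET_SIZE)
expected_groups = p_single * groups  # the product given in the post
p_any = chance_of_at_least_one(p_single, groups)

print(f"one group of {SUBSET_SIZE} with no 2s: {p_single:.6f}")
print(f"number of groups: {groups}")
print(f"expected groups with no 2s: {expected_groups:.3f}")
print(f"chance of at least one such group: {p_any:.3f}")
```

Under these assumptions the expected count is a little over one half and the chance of at least one such group is a little over 40%, which is in line with the "about 50-50" figure. Allowing overlapping sets or other target faces only pushes both numbers up.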

Re: Why am I no longer an anarchist

Posted: Wed Sep 11, 2024 5:08 am
by teo123
brimstoneSalad wrote:you're also still pulling the same stuff you were in your flat Earth days.
Are you seriously comparing my paper to Flat-Earthism? To a complete rejection of the quantitative methods? Do you think that I would be able to publish a paper propagating Flat-Earthism in two peer-reviewed journals? And that it would convince so many experts I know in real life?
brimstoneSalad wrote:For instance, if I roll a quarter million dice of various colors, I will find different patterns among the different colors. What are the odds of the fifty six-sided yellowish green dice with silvery flecks not rolling any 2s? It seems improbable at first (about one in ten thousand), but once multiplied by the number of possible sets I cherry-picked from, I find the odds are extremely good, and a "coincidence" of this specific nature (no 2s) somewhere in the data set is about 50-50 odds. Add to that looking for other similar coincidences (no 1s, no 3s, etc.) and the odds increase linearly.
I think you are drastically over-estimating the number of linguistically plausible patterns, that is, of patterns that it's plausible they aren't coincidences. Had I claimed that I found a pattern that many river names contain the vowel sequence a-a-i and claimed that it proves that the river names come from the same language, that would be implausible because that's not how languages work. Vowels are unstable (Croatian was undergoing a massive vowel shift around the 7th century), and no known language uses vowels as roots and consonants as transfixes (the opposite of what the Semitic languages do). But some two-consonant prefix appearing at a rate much higher than chance in the river names is exactly what you would expect if river names come from the same language. And that's what this k-r pattern is.
What do you think about Krahe's Old European Hydronymy? To me it seems that if Krahe's methodology is good, then mine is even better, since the only major difference is that I am calculating the p-values while Krahe isn't. Krahe even came to the same conclusion I did that Illyrian was a centum language (while mainstream linguistics claims it was satem).

Re: Why am I no longer an anarchist

Posted: Fri Sep 13, 2024 1:56 am
by brimstoneSalad
teo123 wrote: Wed Sep 11, 2024 5:08 am
brimstoneSalad wrote:you're also still pulling the same stuff you were in your flat Earth days.
Are you seriously comparing my paper to Flat-Earthism? To a complete rejection of the quantitative methods? Do you think that I would be able to publish a paper propagating Flat-Earthism in two peer-reviewed journals? And that it would convince so many experts I know in real life?
https://xkcd.com/451/

This should answer your question.
teo123 wrote: Wed Sep 11, 2024 5:08 am I think you are drastically over-estimating the number of linguistically plausible patterns, that is, of patterns that it's plausible they aren't coincidences.
I may be, but the thing is that's in keeping with the null hypothesis. You would need to show that in order to rule out that kind of cherry picking a random pattern out.
teo123 wrote: Wed Sep 11, 2024 5:08 am Had I claimed that I found a pattern that many river names contain the vowel sequence a-a-i and claimed that it proves that the river names come from the same language, that would be implausible because that's not how languages work.
And a standard six sided die can't roll a 7. You need to quantify and assess all possible patterns to make the argument you're making. I don't know what to tell you other than to get started. Seems like the makings of a new paper.
teo123 wrote: Wed Sep 11, 2024 5:08 am What do you think about Krahe's Old European Hydronymy?
I think it's not my field, so I don't have anything to say about it or the time to get acquainted.
I do understand probability, though, and if you want to do a good job of establishing a real P value for your observations, you have to account for unconscious post-hoc cherry picking. That's a lot of possible patterns to lay out and assess.
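
One way to make the cherry-picking point concrete is a family-wise calculation. If m comparably plausible patterns were available to be noticed, each with chance p of appearing by coincidence, and they are treated as independent, then the chance that at least one appears is 1 - (1 - p)^m. In the sketch below, the two raw values come from the thread (1/17 and 1/300). The counts of candidate patterns are purely hypothetical, since nothing here establishes how many there are, and independence is a simplifying assumption.

```python
def familywise_probability(p: float, m: int) -> float:
    """Chance that at least one of m independent patterns appears by coincidence."""
    return 1.0 - (1.0 - p) ** m


def sidak_threshold(alpha: float, m: int) -> float:
    """Per-pattern threshold that keeps the family-wise rate at alpha (Sidak)."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)


RAW_P_VALUES = {"1/17": 1 / 17, "1/300": 1 / 300}  # bounds quoted in the thread
HYPOTHETICAL_PATTERN_COUNTS = [1, 10, 100, 1000]   # assumed, for illustration

for label, p in RAW_P_VALUES.items():
    for m in HYPOTHETICAL_PATTERN_COUNTS:
        fw = familywise_probability(p, m)
        print(f"raw p = {label:>5}, {m:>4} candidate patterns -> "
              f"chance of at least one coincidence: {fw:.3f}")

for m in HYPOTHETICAL_PATTERN_COUNTS:
    print(f"{m:>4} candidate patterns -> per-pattern threshold for "
          f"family-wise 0.05: {sidak_threshold(0.05, m):.6f}")
```

The practical difficulty raised in the thread is estimating m, the number of patterns that could have caught the eye. The formula itself is the easy part.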

Re: Why am I no longer an anarchist

Posted: Sat Sep 14, 2024 2:05 pm
by teo123
brimstoneSalad wrote:This should answer your question.
I must admit I don't understand what you mean. We are primarily talking about informatics here. Informatics is an engineering field, it's not a soft science.
Now, admittedly, regarding my idea that Karašica was called *Kurrurrissia in antiquity (borrowed into Proto-Slavic as *Kъrъrьsьja which would regularly give *Karaša in modern Croatian), the problem is that the early historical phonology of Croatian tends not to be well-known, even among linguistically-educated people, and it's the early historical phonology of Croatian that's necessary to evaluate my linguistic claims. In my experience, even linguistically educated people tend not to know that in early Proto-Slavic (that Croatians were speaking in the 7th century and that the toponyms were borrowed into), before the yers were schwa-like sounds, front yer is reconstructed to have been pronounced as short 'i' and back yer as short 'u'. So you can perhaps dismiss the fact that linguistically-educated people I know in real life say my arguments seem compelling to them as not meaning much. You might even say that the educated guesses about how Croatian was pronounced between the 7th and the 11th century (when it was not attested) are too speculative to be considered science.
But you cannot dismiss the fact that people educated in informatics I know in real life say that my arguments (that the basic information theory suggests that the probability of the k-r pattern in the Croatian river names is between 1/300 and 1/17) seem compelling to them the same way.
Seriously, you are comparing my paper to Flat-Earthism? Where are the massive conspiracies? Where are the endless ad-hoc hypotheses? Where are the not-even-wrong arguments?
brimstoneSalad wrote:I do understand probability, though, and if you want to do a good job of establishing a real P value for your observations, you have to account for unconscious post-hoc cherry picking.
You didn't answer me: Do you think that Levy was being reasonable when he objected to Semmelweis that perhaps puerperal fever is seasonal and that Semmelweis's observations are coincidences? You could also object to Semmelweis that he tried many hypotheses: that puerperal fever was caused by overcrowding, that puerperal fever was caused by climate, that puerperal fever was caused by priests scaring mothers to death...

Re: Why am I no longer an anarchist

Posted: Mon Sep 16, 2024 1:04 pm
by brimstoneSalad
teo123 wrote: Sat Sep 14, 2024 2:05 pm We are primarily talking about informatics here. Informatics is an engineering field, it's not a soft science.
Many of the premises come from linguistics, not informatics. The use of informatics is what makes some linguistic pursuits harder than others, but people poorly understand informatics generally in the same way they poorly understand statistics specifically, and that includes students of science, so don't be surprised if bullshitting makes it pretty far into the field without detection.
Even many physicists poorly understand statistics; the benefit there is that statistics are less critical to the hardness of experimental data. Even just one successful experiment (outside quantum effects) is unlikely to have issues with a low P-value.
Take even the oil drop experiment to determine electron charge, for instance, which is unique for the very small number of molecules involved -- if you measure enough droplets to find a common denominator even once, there's not an issue with random effects on your measurements that could push that out of statistical significance (e.g. isotopic contamination of the oil or something) because you're still already dealing with so many particles. Most physics students don't *really* even need to learn about statistics if they understand significant figures. The benefit physics has is largely just being more foolproof, not necessarily having more informatically competent practitioners.
teo123 wrote: Sat Sep 14, 2024 2:05 pm Seriously, you are comparing my paper to Flat-Earthism? Where are the massive conspiracies? Where are the endless ad-hoc hypotheses? Where are the not-even-wrong arguments?
I believe I was comparing your psychology, Teo. You're so hungry for some original idea or discovery about the world that you're still willing to push the boundaries of what is reasonable to have confidence in.

You may some day discover such a thing, but you shouldn't assume it or be so hasty in your defense of it.

Linguistics as a whole is too soft of a science for confidence in things that are less obvious, and obvious things (the low hanging fruit where we can be confident) have mostly already been discovered as far as I know. The barriers to entry are just too low for such discoveries.

There are fields with low hanging fruit, but they mostly deal in engineering, manufacturing, and practical applications like in some material sciences.
teo123 wrote: Sat Sep 14, 2024 2:05 pm You didn't answer me: Do you think that Levy was being reasonable when he objected to Semmelweis that perhaps puerperal fever is seasonal and that Semmelweis's observations are coincidences? You could also object to Semmelweis that he tried many hypotheses: that puerperal fever was caused by overcrowding, that puerperal fever was caused by climate, that puerperal fever was caused by priests scaring mothers to death...
It is the job of the scientific community to challenge, even as devil's advocates, any new observations. Every remotely reasonable alternative hypothesis should be explored as a criticism.
Unless "It was caused by demons" was on the list, he was doing his job as a critic.
There is a point where criticism becomes obsession and is no longer scientific (like the current generation of critics of anthropogenic climate change). I'm not going to assess whether that criticism crossed that line or not.
Look at modern day climate denialists to find key features of that mindset and bad behavior. Broadly, there's a lot of lying and outright fraud and unscientific manipulation of data, and that may be an inevitable hallmark of unscientific criticism.

Re: Why am I no longer an anarchist

Posted: Wed Sep 18, 2024 11:19 am
by teo123
brimstoneSalad wrote:Many of the premises come from linguistics, not informatics.
Well, yes, I assumed there is no magic in Croatian grammar that would make nouns have significantly lower collision entropy than other words in the Aspell word-list. But that is fundamentally no different from when you argue for veganism and assume that the Allan Savory pseudoscience is pseudoscience, that is, as you say, that you cannot use magical cows to get around thermodynamics. Similarly, I am assuming you cannot use magical languages to get around basic information theory.
brimstoneSalad wrote:You may some day discover such a thing, but you shouldn't assume it or be so hasty in your defense of it.
What do you mean by "hasty"?
brimstoneSalad wrote:There are fields with low hanging fruit, but they mostly deal in engineering, manufacturing, and practical applications like in some material sciences.
And I would assume the low-hanging fruit lies in the interdisciplinary fields. Toponyms seem to be an obvious example of that. People who know about toponyms tend to know zilch about information theory (as they often explicitly admit), and people who know about information theory tend to know zilch about toponyms. So it seems possible that information theory actually has something important to say about toponyms, but that nobody noticed that.
Furthermore, I think it's probable that many people who know both the basics of information theory and about toponyms are misinterpreting the Birthday Paradox as if it is saying that there is a large probability of that k-r pattern occurring by chance.
brimstoneSalad wrote:Broadly, there's a lot of lying and outright fraud and unscientific manipulation of data, and that may be an inevitable hallmark of unscientific criticism.
Well, climate change deniers are often claiming it's the climate scientists who manipulate the data unscientifically. Continental data shows little or no global warming (many of the stations even showing a cooling trend), but the data from ships shows huge warming (around a third more than the warming accepted to have actually occurred). According to climate change deniers, the right thing to do is to admit we have no high-quality data about temperature in the first decades of the 20th century, or at least to add huge error bars to the reconstructions. Climate scientists are claiming to be able to adjust the data for the errors by controlling for the time-of-observation bias (which increases the warming in the continental data), controlling for the differences between the ships which measured the temperature at the beginning of the 20th century and those which measure it now (which decreases the warming in the ocean data), and various other supposed biases, until they arrive at some seemingly-consistent dataset. Climate change deniers are claiming such manipulation of the data is unscientific.

Re: Why am I no longer an anarchist

Posted: Sat Sep 28, 2024 1:13 pm
by teo123
brimstoneSalad wrote:I'm not interested in discussing Havlik's Law, which is a post-hoc observational claim with exceptions. That's not a scientific theory, it's too specific to a case and makes no generalized predictions of the evolution of languages which is what you'd need to consider it.
But Havlik's Law does make some predictions. At least in the sense that it says what couldn't have happened. For example, it says that the Illyrian name for Karašica couldn't have been *Kurrissia, because that would give *Kraša in Modern Croatian. It also says that it couldn't have been *Kurrurrišša, because that would give *Krarša in Modern Croatian. But, given what we know, it could have been *Kurrurrissia (and I argue that it probably was).

Re: Why am I no longer an anarchist

Posted: Fri Oct 11, 2024 1:42 pm
by teo123
Anyway, I was browsing a Croatian historical revisionist website and I stumbled upon a new argument against the Vukovar Massacre: "Why is it reasonable to believe the Vukovar Massacre of 1991 really happened? We are living in the most peaceful time in human history, so there recently being a large massacre is a pretty extraordinary claim, isn't it?". It sounds rather compelling to me. What do you think about that argument? I've asked a Quora question about it.