• Cattail@lemmy.world
    14 days ago

    I told a guy that a wide variance in data essentially means that the results were random, then he proceeded to explain p-values and I’m like “yeah, I’m sure the random values came from nature”.

    Moral is p-values are kinda worth less than variance.

    • FishFace@piefed.social
      14 days ago

      Are you saying you can’t determine a difference in aggregate statistics by performing more trials if the variance is high?

      • Cattail@lemmy.world
        13 days ago

        You can. The issue with an A-B comparison is that it’s kinda expected for one group to be higher on average simply based on the data points you selected.

        If the averages are kinda similar and the variance is high in both group A and group B, then I’d say both groups are statistically the same even if the statistical values are different.

        • FishFace@piefed.social
          13 days ago

          So you have two groups of ten experiments: the mean of group A is 100, the mean of group B is 105, and the variance is 25 (for both groups). Obviously we are not confident that these groups differ.

          Now suppose we repeat the experiment two billion times. The group A average is now 99, and the group B average is now 103. The variance is still 25. Are you still not confident that the groups are different?
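A rough way to put numbers on the two scenarios above is to ask how many standard errors separate the group means. This is only a sketch: it assumes equal group sizes and a shared variance, and the function name `standard_errors_apart` is made up for illustration.

```python
import math

def standard_errors_apart(mean_a, mean_b, variance, n):
    """Difference in means divided by its standard error.

    Assumes both groups have n observations and the same variance.
    """
    standard_error = math.sqrt(variance / n + variance / n)
    return (mean_b - mean_a) / standard_error

small = standard_errors_apart(100, 105, 25, 10)
large = standard_errors_apart(99, 103, 25, 2_000_000_000)

print(f"10 trials per group:        {small:.2f} standard errors apart")
print(f"2 billion trials per group: {large:.0f} standard errors apart")
```

The variance is the same in both cases; only the number of trials changes, and that alone shrinks the standard error enough to make a smaller gap in means far harder to explain by chance.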

          • Cattail@lemmy.world
            12 days ago

            I’d be curious what constitutes a group A versus a group B and how you get 2 billion of them.

            But I’ll interpret it as sperm on a race track since you can get 2 billion runs with one nut.

            I have to say that after a billion trials the averages you calculate did come from a random sample, but each would be an indicative average for its group, since it can’t move far from that calculated average.

            I’m visualizing the 2 billion points of both groups and seeing a bell curve with a lot of overlap. I guess they would be different, but overall very similar since the variance is pretty wide.

            • FishFace@piefed.social
              12 days ago

              Right. But that’s what p-values quantify: given the number of trials and the observed variance and means, how likely would a difference at least this large be if the two groups were actually drawn from the same distribution?

              So variance isn’t “more important” than p-values; high variance means that, all else being equal, your p-value is higher (less confident) than it otherwise would be.
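To see how variance feeds into the p-value, here is a small sketch that holds the means (100 vs 105) and the group size (10) fixed and only changes the variance. It uses a normal approximation (a z-test) rather than a t-test, so the numbers are rough for groups this small; the variance values and the name `approx_p_value` are chosen just for illustration.

```python
import math
from statistics import NormalDist

def approx_p_value(mean_a, mean_b, variance, n):
    """Two-sided p-value for a difference in means, normal approximation.

    Assumes equal group sizes and a shared variance.
    """
    standard_error = math.sqrt(2 * variance / n)
    z = abs(mean_b - mean_a) / standard_error
    return 2 * (1 - NormalDist().cdf(z))

variances = [4, 25, 100, 400]
p_values = [approx_p_value(100, 105, v, 10) for v in variances]

for v, p in zip(variances, p_values):
    print(f"variance {v:>3}: p is roughly {p:.4f}")
```

Same means, same number of trials: the p-value climbs as the variance grows, so the variance is already baked into it.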

              • Cattail@lemmy.world
                12 days ago

                So I looked into the definition of p, and it can depend on variance if you assume a Gaussian distribution.

                I wouldn’t know how you would get a p-value for 2 different distributions with similar means. I can come up with the null hypothesis being that group A and group B are the same, but then idk how to relate that to a probability given the mean and variance of A and B.
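One concrete way to turn "group A and group B are the same" into a probability, without assuming any particular distribution, is a permutation test: if the labels don't matter, shuffling them should often produce a gap in means as big as the one observed. The sketch below is a swapped-in illustration, not something anyone in this thread actually ran, and the data points are made up.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, shuffles=10_000, seed=0):
    """Share of label shuffles whose mean gap is at least the observed gap."""
    rng = random.Random(seed)
    observed = abs(mean(group_b) - mean(group_a))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(shuffles):
        rng.shuffle(pooled)  # pretend the group labels are arbitrary
        gap = abs(mean(pooled[n_a:]) - mean(pooled[:n_a]))
        if gap >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / shuffles

# Hypothetical measurements, invented for this example.
group_a = [96, 103, 99, 108, 94, 101, 97, 105, 92, 104]
group_b = [101, 110, 99, 107, 112, 98, 104, 109, 103, 106]
print(permutation_p_value(group_a, group_b))
```

The returned fraction is the p-value under the null hypothesis that the labels are interchangeable.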

                • FishFace@piefed.social
                  12 days ago

                  In general you need to know the distribution in order to calculate p values, though there are statistical methods for deciding - with some confidence level - whether a sample conforms to some distribution.

                  • Cattail@lemmy.world
                    11 days ago

                    I did ask ChatGPT 5.2 how to calculate the p-value from the sets of means and variances, set the null hypothesis as the means being the same, then used a pooled t-test. The AI determined that if both samples had more than 13 points, then p is less than 5%.
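For anyone wanting to check a figure like that by hand, here is a stdlib-only sketch of a pooled t-test from summary statistics. It assumes the 99 vs 103 means and variance 25 from the example upthread (a guess about which numbers were fed in), equal group sizes, and a two-sided test; the t-distribution tail is integrated numerically with Simpson's rule rather than taken from a stats library, and all function names are made up.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_t_p_value(t, df, steps=2000):
    """Two-sided p-value: 1 minus the probability mass between -|t| and |t|."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = 2 * total * h / 3  # density is symmetric around 0
    return 1 - central_mass

def pooled_t_p_value(mean_a, mean_b, variance, n):
    """Pooled t-test assuming both groups have size n and the same variance."""
    standard_error = math.sqrt(2 * variance / n)
    t = (mean_b - mean_a) / standard_error
    return two_sided_t_p_value(t, df=2 * n - 2)

def smallest_significant_n(mean_a, mean_b, variance, alpha=0.05):
    """Smallest per-group sample size at which p drops below alpha."""
    n = 2
    while pooled_t_p_value(mean_a, mean_b, variance, n) >= alpha:
        n += 1
    return n

print(smallest_significant_n(99, 103, 25))
```

Under those assumptions the threshold comes out at 14 per group, which lines up with "more than 13".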

                    The p-value seems like a concept with a mathematical description, but then I run into a wall when it comes to how you figure out the probability of group A having the values it has given group B’s values. I would need to see how people actually calculate their p-values and null hypotheses to get concrete examples.

                    I do like how the Wikipedia page shows that a set of 20 coin flips having 14 heads would have a p value above .05
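The coin-flip case is small enough to compute exactly. A minimal sketch, assuming the null hypothesis is a fair coin and "at least 14 heads" is the event of interest (`binomial_tail` is a made-up helper name):

```python
from math import comb

def binomial_tail(n, k_min, p=0.5):
    """Probability of at least k_min successes in n independent trials."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))

one_sided = binomial_tail(20, 14)  # 14 or more heads
two_sided = 2 * one_sided          # also count 14 or more tails (fair coin is symmetric)

print(f"one-sided p = {one_sided:.4f}")
print(f"two-sided p = {two_sided:.4f}")
```

Both come out above .05 (about 0.058 one-sided and 0.115 two-sided), so 14 heads in 20 flips isn't enough to call the coin unfair at the usual threshold.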