There is a basic flaw in your tally of write-ups per group, and you probably don't have sufficient data to fix it.

You have taken the total number of posts by someone who is now a "saint", and added that to the sum of "posts created by saints". But you don't know how many of this person's posts were submitted before he or she became a saint.

To summarize the proportion of nodes from non-saints "in general", you'd need to do extra work on each person who is not an initiate, to determine how many nodes they wrote at each of the levels they passed through, and distribute those numbers properly among the various levels.
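A minimal sketch of the difference between the two tallies, assuming you had what the stats page does not give you: the author's level at the moment each node was posted. The records and the names `posts`, `current_level`, `tally_by_current_level` and `tally_by_level_at_post` are hypothetical, made up purely for illustration.

```python
from collections import Counter

# Hypothetical records: (author, level held when the node was posted).
# This per-post history is exactly the data the summary stats lack.
posts = [
    ("alice", "initiate"), ("alice", "novice"), ("alice", "monk"),
    ("alice", "saint"),
    ("bob", "initiate"), ("bob", "initiate"),
]

# Hypothetical current level of each author.
current_level = {"alice": "saint", "bob": "initiate"}

def tally_by_current_level(posts, current_level):
    """Naive tally: credit every post to the author's present level."""
    return Counter(current_level[author] for author, _ in posts)

def tally_by_level_at_post(posts):
    """Corrected tally: credit each post to the level held when it was written."""
    return Counter(level for _, level in posts)

naive = tally_by_current_level(posts, current_level)
corrected = tally_by_level_at_post(posts)
print("naive:    ", dict(naive))
print("corrected:", dict(corrected))
```

In this toy data the naive tally credits all four of alice's nodes to "saint", although only one was written at that level.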

Of course, another factor in the "imbalance" is the "graduated" scaling of the XP thresholds. The trip from Initiate to Monk involves steps of 20, 50, 100 and 200 XP. If an average non-clueless node yields about 5 XP, Initiates don't get to post more than 4 nodes or so before they cease to be Initiates, and with just another 10 nodes or so (not to mention XP derived from voting), they cease to be Novices. This tends to limit the total node contribution from these groups. The stats page shows about 25K initiates with about 33K posts among them, which is probably close to the limit of how many nodes can be owned by that many initiates at any one time.
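The arithmetic behind those estimates can be sketched as follows. This assumes the 5 XP per node average guessed above and ignores XP from voting; it simply divides each quoted figure by that average, without deciding whether a figure is a cumulative threshold or a step size. The names `XP_PER_NODE`, `QUOTED_XP` and `nodes_needed` are illustrative only.

```python
import math

XP_PER_NODE = 5                  # assumed average yield of a node
QUOTED_XP = [20, 50, 100, 200]   # figures quoted for Initiate -> Monk

def nodes_needed(xp, xp_per_node=XP_PER_NODE):
    """Nodes required to earn `xp` from posting alone (no voting XP)."""
    return math.ceil(xp / xp_per_node)

nodes = [nodes_needed(xp) for xp in QUOTED_XP]
for xp, n in zip(QUOTED_XP, nodes):
    print(f"{xp:>3} XP ~ {n} nodes")

# Stats-page figures quoted above: about 25K initiates, about 33K posts.
posts_per_initiate = 33_000 / 25_000
print(f"posts per initiate: {posts_per_initiate:.2f}")
```

On those rounded figures the initiates average a little over one post each, well under the 4 or so they could own before leaving the group.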

And frankly, I think the coded pyramid layout, while eye-catching and portentous, gives a misleading sense of proportion when two groups of roughly equal size dominate the distribution. In the write-ups picture, it seems like the A's have a vast dominance over the lowly B's. It takes some time to count all those letters and realize it's a difference of 42% vs. 37%, which isn't nearly as big a difference as it appears to be in the diagram.

Suppose you had just two groups of 50% each arranged in this sort of pyramid. Whichever group you put on top would occupy 7 full rows plus one cell in the eighth row, while the other group would occupy just the three bottom rows (minus the one cell taken by the top group). Show that picture to any casual observer and ask "Do you think there are more letters in one group than the other? If so, which group has more letters?"
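Here is a rough sketch of that two-group picture. It assumes a 10-row pyramid whose rows hold 1, 3, 5, ..., 19 cells (100 cells, one per percentage point), which is the layout consistent with the row counts just described; the original diagram's exact layout may differ. The function names are made up for this example.

```python
def pyramid_rows(n_rows=10):
    """Row widths 1, 3, 5, ...: an assumed layout with n_rows**2 cells."""
    return [2 * r + 1 for r in range(n_rows)]

def fill_pyramid(counts, n_rows=10):
    """Fill the pyramid top to bottom, left to right, group by group.

    counts is a list of (letter, number_of_cells) pairs.
    """
    cells = "".join(letter * n for letter, n in counts)
    widths = pyramid_rows(n_rows)
    if len(cells) != sum(widths):
        raise ValueError("cell counts must fill the pyramid exactly")
    rows, start = [], 0
    for w in widths:
        rows.append(cells[start:start + w])
        start += w
    return rows

def rows_touched(rows, letter):
    """Number of rows in which the given letter appears at least once."""
    return sum(1 for row in rows if letter in row)

rows = fill_pyramid([("A", 50), ("B", 50)])
width = len(rows[-1])
for row in rows:
    print(row.center(width))
print("rows containing A:", rows_touched(rows, "A"))
print("rows containing B:", rows_touched(rows, "B"))
```

Both letters fill exactly 50 cells, yet A spans eight rows and B only three, which is the visual distortion at issue.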

With display techniques like this, it's no wonder that the phrase "Lies, Damn Lies, and Statistics" is so well known.

In reply to Re: Distribution of Levels and Writeups by graff
in thread Distribution of Levels and Writeups by jZed
