Exactly how this works is complicated.
No, it isn't. How it works is as follows: we read a file, line by line. For the kth line read, we decide with probability 1/k whether we remember the line, forgetting any previous one.
That's fairly trivial. It takes just a little analysis to see it's fair. Fair means that any line will be the final pick. So, what's the probability of line k to be the final pick? First what needs to be happen is that it gets remembered when the line is read, which has a probability of 1/k. Then, any next lines must *not* be picked. For each of the next lines, this happens with probability (j -1)/j, where j is the line number. Since all those probabilities are independent of each other, we can multiply them:
n
-------
1 | | j-1 1 k 1
--- | | --- == --- * --- == ---
k | | j k n n
j=k+1
Too bad we can't really do math in HTML.