Specifically, most proofs use the fact that (nCr)(r!) = nPr, defining nCr to be the number of r-combinations in n objects, and nPr to be the number of r-permutations, then solving for nCr. It's not clear to me how to understand why the left hand side of this equality makes sense in terms of the product rule though
I know the first factor says that there are nCr ways to choose an r-combination from n choices, but I'm not sure what to make of the other factor (again, thinking in terms of the product rule).
(a+b)^n = (a+b)(a+b)...(a+b)
into a polynomial. If you just expanded it out, without grouping terms, there would be 2^n terms. Each of these terms would be obtained by choosing either a or b from the first factor; a or b from the second factor; ...; a or b from the nth factor, and then multiplying those a's and b's together.If you then group the terms, you'll find that the coefficient of the a^r b^(n-r) term will be nCr, the binomial coefficient.
Thinking back to the a-or-b selection process, note that each a^r b^(n-r) term corresponds to a path where you chose r a's and (n-r) b's. Now imagine that the n factors correspond to forming a subset of a set of n objects, where choosing a at position i means "include the i'th object in the subset" and choosing b at position i means "exclude the i'th object from the subset". Thus, each distinct choice of a's and b's defines a distinct subset of the n objects.
Choosing r a's and (n-r) b's means forming a subset of r out of the n elements. Hence the interpretation of nCr as the number of combinations of r elements out of a total of n elements.
I feel the situation is much clearer if you actually write nCr and nPr out.
Edit: the Wikipedia articles on combination and permutation are also helpful.