Comments: 
Well, here is an approximation, and it might be right, but I am pretty confident I am making invalid independence assumptions.
So, for a given player we can find the expected number that match with them = E(qaMatches) + E(aqMatches)
E(qaMatches) = E(aqMatches) since it's symmetric, so let's just do one and call it E(matches).
E[matches] = P(topMatches) * P(bottomMatches) * numberOfOtherPlayers = 1/8 * 1/8 * 135 = 2.11
so overall expectation is 4.22, which sounds vaguely plausible.
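A quick arithmetic check of the figures above (a minimal sketch that takes the independence simplification at face value, with 136 players and 8 card types):

```python
# Back-of-envelope check of the expectation above, using the
# (admittedly shaky) independence assumption as stated.
p_top = 1 / 8      # chance the top card lines up
p_bottom = 1 / 8   # chance the bottom card lines up
other_players = 135

e_matches = p_top * p_bottom * other_players  # one direction only
print(round(e_matches, 2))      # 2.11
print(round(2 * e_matches, 2))  # 4.22, counting both directions
```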
I think you should be able to have a group of size 1, but the explanation is too long to fit in the margin of this comment. And the largest group is worst case gonna be 34 (make all people with question A get answer B and vice versa).
Oh, I guess P(topMatches) and P(bottomMatches) aren't actually 1/8, because we have to take ourselves out, so it's like 16/136, but in the flipped case, it's actually one bigger, since we're taking out a non-matched card.
Group of size 1 is pretty easy to construct. Start with everybody sorted into 4 groups: AB, CD, EF, GH. Take one member each from the first three groups: {AB, CD, EF}, and have them each pass their answer cards to the right. Now you have one member each for groups AF, CB, and ED.
It seems like your grouping method (A/D groups with both A/D and D/A) means that you don't have to consider the question and answer cards as distinct: everybody gets two non-duplicating cards and matches anyone with the same two cards. Yes?
There are also 8 choose 2 = 28 groups, so the average group has 4.86 members.
The most members any group could have, obviously, is 34, which means the fewest groups you could have is 4. The fewest members that any group could have, then, is 0.
How often will any given group be empty? For group AB, 34 players will have an A. In order for AB to be empty, none of them can draw B as their second card. For the first A-holder, there are 7*34 cards that can be drawn, and 6*34 of them are not Bs. For the second, there are 7*34 - 1 cards, and 6*34 - 1 non-Bs. So on to the last A-holder, who has a mere (6*34 - 33)/(7*34 - 33) chance. Multiply all these chances together and you have the likelihood of a group being empty.
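That product can be evaluated directly; a minimal sketch, assuming the 34 A-holders and the 7*34-card pool as set up above:

```python
from math import prod

# Each of the 34 A-holders in turn must draw a non-B card:
# (6*34 - i) non-Bs out of (7*34 - i) remaining cards.
p_empty = prod((6 * 34 - i) / (7 * 34 - i) for i in range(34))
print(p_empty)  # roughly 0.003: any given group is rarely empty
```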
If I keep writing, it will become very clear that I've taken discrete math and thought plenty about combinatorial math, but have never studied statistics. So I'll stop here.
Edited at 2012-02-01 03:58 am (UTC)
I've taken discrete math and thought plenty about combinatorial math, but have never studied statistics.
Same here!
Oh, I guess mine is the expectation for non-empty groups, and I also need to add one to include the person whose perspective I was using.
From: cos 2012-02-01 06:55 am (UTC)

Would it make any difference if there were no question and answer cards, just 16 different cards (with one of 16 symbols or words or colors or numbers or anything), and you were guaranteed not to get two of the same, and had to group with everyone who had the same two cards as you have? I'm trying to convince myself that there's a meaningful difference between this and your scenario, but haven't managed to yet.
From: (Anonymous) 2012-02-01 03:35 pm (UTC)
Example

"For example, if I have question A and answer D, ... I don't group up with players who have ..., say, answer A and answer H."
How could a player have answer A and answer H? I thought each player was supposed to get one question card and one answer card.
Sorry, that was a typo. I meant to say "answer A and question H". Fixed!
From: pmb 2012-02-01 04:22 pm (UTC)
Simulation says

I think the question-and-answer dichotomy is a red herring. I am pretty sure the situation you describe is the same as "there are 8 different cards, and 34 copies of each; everyone gets two non-equal cards." It seems like we might be requiring the dealer to have godlike (NP-hard-solving) abilities. Time for a simulation: http://pastie.org/3296593
About half the time, the deal goes wrong, and my simulation crashes with:
Traceback (most recent call last):
  File "deal.py", line 8, in <module>
    l1, l2 = random.sample(cards, 2)
  File "/usr/lib/python3.2/random.py", line 303, in sample
    raise ValueError("Sample larger than population")
ValueError: Sample larger than population
But the other half of the time, we get output like:
(2, 5) 9
(4, 6) 8
(3, 5) 7
(1, 7) 7
(0, 7) 7
(0, 3) 7
(3, 4) 6
(2, 6) 6
(1, 2) 6
(6, 7) 5
(5, 7) 5
(4, 7) 5
(1, 5) 5
(0, 2) 5
(0, 1) 5
(3, 7) 4
(3, 6) 4
(2, 4) 4
(1, 6) 4
(1, 4) 4
(0, 6) 4
(0, 4) 4
(5, 6) 3
(4, 5) 3
(2, 3) 3
(1, 3) 3
(0, 5) 2
(2, 7) 1
I have seen groups as small as size 1, and as large as 11.
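Since the pastie links may rot, here is a hedged reconstruction of that kind of simulation (not pmb's actual code): 8 card types, 34 copies each, everyone draws two cards of different types, and players group by their unordered pair of types. A deal that strands a player with no legal second card is simply retried.

```python
import random
from collections import Counter

def deal(n_types=8, copies=34):
    """Deal everyone two cards of different types; return None if the deal fails."""
    cards = [t for t in range(n_types) for _ in range(copies)]
    random.shuffle(cards)
    hands = []
    for _ in range(n_types * copies // 2):
        first = cards.pop()
        legal = [i for i, c in enumerate(cards) if c != first]
        if not legal:
            return None  # only the first card's type remains: deal went wrong
        second = cards.pop(random.choice(legal))
        hands.append(tuple(sorted((first, second))))
    return hands

hands = None
while hands is None:  # retry until a deal succeeds
    hands = deal()

# group players by their unordered pair of card types, largest first
for pair, size in Counter(hands).most_common():
    print(pair, size)
```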
From: pmb 2012-02-01 04:33 pm (UTC)
Re: Simulation says

Just in case my intuition was wrong, because I kept thinking it might be, I rewrote this to exactly simulate the question you are asking: http://pastie.org/3296662
Again, about half the time it doesn't work because the deal goes wrong. When it does work, however, its output is the same as above. I have seen groups as small as 1, and as large as 12.
It seems like we might be requiring the dealer to have godlike (NP-hard-solving) abilities.
I don't see why. Suppose the dealer deals the 136 question cards first. They have to remember which card was dealt to which player (which takes linear space in the number of players). Then, to deal the answer cards, before dealing to a player, they can just look up what question card that player has, which would take constant time, and then remove the 17 matching answer cards from the deck before dealing that player an answer card (which would take linear time in the number of cards). So, dealing all the answer cards might be kinda slow (dealing each one takes linear time, so dealing them all takes quadratic time), but totally still polynomial time.
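A sketch of that two-stage procedure (hypothetical code, not from the thread; as later comments in the thread point out, it can back itself into a corner, and this version just reports that as a failure rather than fixing it):

```python
import random

def two_stage_deal(n_types=8, copies=17):
    """Deal all question cards first, then a non-matching answer per player."""
    questions = [t for t in range(n_types) for _ in range(copies)]
    answers = [t for t in range(n_types) for _ in range(copies)]
    random.shuffle(questions)
    hands = []
    for q in questions:
        # "remove the matching answer cards before dealing" is equivalent
        # to drawing uniformly from the non-matching answers still in the deck
        legal = [i for i, a in enumerate(answers) if a != q]
        if not legal:
            raise RuntimeError("backed into a corner: only matching answers left")
        hands.append((q, answers.pop(random.choice(legal))))
    return hands
```

With 8 types and 17 copies the dead end seems to be rare, but it is possible, which is exactly the flaw the later comments discuss.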
(But I'm glossing over the time complexity of, e.g., shuffling the deck before dealing. Is that where the 'N' is coming from?)
From: pmb 2012-02-05 03:43 pm (UTC)
Re: Simulation says

I always model things as a graph, and so, if you think of a card combo dealt to a person as being an edge in a degree-34 hypergraph, then the deal needs to construct a random regular degree-34 hypergraph. Constructing graphs with a given degree distribution, particularly a given random degree distribution, is a problem in NP, but does not have a known poly-time algorithm for all graph sizes and degrees (it may be NPC, I forget).
So, if you had fewer people, or more cards, then it is quite possible that almost all of the deals would fail, instead of merely half, like they do in simulation.
What does it mean for a deal to fail? If a failed deal is one in which someone gets two cards that aren't supposed to go together, why isn't that fixable using a two-stage dealing strategy like the one I described?
From: pmb 2012-02-05 04:34 pm (UTC)
Re: Simulation says

You can end up in the situation where you have only answers to A left to give out (you have been dealing randomly, so this is quite possible) and at least one of the people who has not yet received a card has an A question.
Basically, "the dealer makes sure that nobody gets a question and answer card that go with each other" is an operation that is surprisingly tricky. How do they do that? After they have done that, how can they be sure that their method of ensuring this property holds has kept it so that the resulting distribution of cards is a uniform sample of the possibilities?
Oops! Yeah, Alex just pointed out the flaw in my algorithm: you can back yourself into a corner. You already know this, but for the benefit of anyone else reading, here's a simple case: Suppose there are only 3 players. You deal the question cards first, and they get question cards A, B, and C, respectively. Now it's time to deal the answer cards. You remove answer card A from the deck and randomly deal one of B or C to player 1. Let's say it's B. Now, you put answer card A back in, take out answer card B, and randomly deal one of A or C to player 2. Let's say it's A. Now you're left with answer card C, but player 3 has question card C, so you're backed into a corner. (You can backtrack, but I think that would mean crossing over into exponential time.)
I think it's cool how everyone is applying their own favorite hammer to this problem. Alex sees it as a constraint satisfaction problem, and from that angle, it's easier for me to see the NP-ness.
Edited at 2012-02-05 05:16 pm (UTC)
From: pmb 2012-02-05 05:17 pm (UTC)
Re: Simulation says

NP-ness! Tee hee hee...
My sense of humor should be more mature, but sometimes I fail my saving roll against immature homonyms.
From: (Anonymous) 2012-02-01 06:41 pm (UTC)

My solution is now up at
http://www.cs.grinnell.edu/~stone/misc/nerdsnipe.rkt
Short answer: The mean number of other members in your group is about 4.6, so the mean group size is about 5.6.
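A Monte Carlo sanity check of that figure, using the simplified "8 types, 34 copies, two different cards each" model from elsewhere in the thread (an assumption: the thread suggests it is equivalent to the question/answer version). Note that this per-player average is size-biased, so it should exceed the plain per-group average of 136/28, which is about 4.86:

```python
import random

def deal(n_types=8, copies=34):
    """Deal everyone two cards of different types; return None if the deal fails."""
    cards = [t for t in range(n_types) for _ in range(copies)]
    random.shuffle(cards)
    hands = []
    for _ in range(n_types * copies // 2):
        first = cards.pop()
        legal = [i for i, c in enumerate(cards) if c != first]
        if not legal:
            return None  # deal went wrong; caller retries
        hands.append(tuple(sorted((first, cards.pop(random.choice(legal))))))
    return hands

def mean_group_size_seen_by_a_player(trials=100):
    """Average, over players and deals, of the size of that player's own group."""
    total = n = 0
    for _ in range(trials):
        hands = None
        while hands is None:
            hands = deal()
        sizes = {}
        for h in hands:
            sizes[h] = sizes.get(h, 0) + 1
        total += sum(sizes[h] for h in hands)  # each player counts their group's size
        n += len(hands)
    return total / n

print(mean_group_size_seen_by_a_player())  # expect something near 5.6
```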
From: (Anonymous) 2012-02-01 08:51 pm (UTC)

'Course, if you wanted to distribute the combinations equally, you could just print N cards with each distinct question-answer combo (N*56 of them) and throw out the extras. Each group would end up being 4 to 5 people.