mauifan wrote:So a real-world BJT transistor is an "ideal" transistor with tiny capacitors (perhaps 1-2pF?) across the junctions?
Spot on.. that's even a good guess for the capacitor values. You'll find exact values listed in the 'Small Signal Characteristics' section of a transistor's datasheet. I use 2N3904s a lot, and the values there are 8pF at the input and 4pF at the output.
mauifan wrote:If so, it makes perfect sense to me why gain falls off at higher frequencies: Higher frequencies would bypass the ideal transistor through the caps.
Your mental model is almost perfect, but you haven't included the resistors. The full spread looks like this:

and isn't nearly as bad as it seems at first glance.
R.b, R.c, and R.e you already know.. they're the base, collector, and emitter resistors around the transistor. R.pi is the effective resistance between the base and emitter, R.o is the effective resistance between the collector and the emitter. R.l is the load resistance.
C.mu, C.pi, and C.l are the capacitors you postulated. C.mu is the capacitance between the base and collector, C.pi is the capacitance between the base and emitter. There is some capacitance between the collector and emitter, but the transistor can't tell the difference between that and the capacitance of the load, so we lump it in with the load and call the whole thing C.l.
Whenever you put resistors and capacitors together, you get RC time constants. The RC time constant of R.b feeding current into C.pi and C.mu is what causes high-frequency attenuation.
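To put rough numbers on that, here's a minimal sketch of the time constant and the corner frequency it implies. The 10k and 10pF values are made up for illustration (they aren't from any datasheet), and the function name is just something I picked:

```python
import math

def rc_cutoff_hz(resistance_ohms, capacitance_farads):
    """-3 dB corner frequency of a single-pole RC low-pass filter."""
    return 1.0 / (2.0 * math.pi * resistance_ohms * capacitance_farads)

# Illustrative values only: a 10k base resistor feeding 10pF of
# combined input capacitance. Neither number comes from a datasheet.
r_b = 10e3       # ohms
c_in = 10e-12    # farads

tau = r_b * c_in                # RC time constant, in seconds
f_c = rc_cutoff_hz(r_b, c_in)   # corner frequency, in hertz

print(f"time constant: {tau * 1e9:.0f} ns")
print(f"corner frequency: {f_c / 1e6:.2f} MHz")
```

Signals well below the corner frequency pass through more or less untouched, and signals well above it get attenuated.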
For capacitive bypassing to happen, charge would have to go from C.mu to C.l. That path crosses the transistor's collector though, and the collector doesn't stand still. When the voltage at the transistor's base rises, the voltage at the collector falls. That leads to a phenomenon called 'Miller capacitance'.
Conceptually, the Miller effect is kind of like a seesaw. If we assume the circuit is arranged for a gain of 9 and raise the base voltage by 1mV, the voltage at the collector will fall by 9mV. C.mu sits between the base and collector, so we can imagine a point 1/10th of the way through it (measured from the base side) where the voltage never changes.. roughly like the fulcrum of a lever:

Fixed points like that create information barriers. Circuits can only see changes that can be measured, so if a point never changes, the circuits on opposite sides of that point can't see each other. That's not an exotic idea BTW.. we treat Vcc and GND that way all the time. Putting one of those points in the middle of a component has interesting effects though.
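The seesaw arithmetic is easy to check. This is only a sketch, and it assumes the voltage change varies linearly from one end of C.mu to the other, which is an idealization to match the mental picture and not a statement about the physics inside the part. The function names are hypothetical:

```python
def fixed_point_fraction(gain):
    """Position of the unchanging point, as a fraction of the way
    from the base end (0.0) to the collector end (1.0)."""
    return 1.0 / (gain + 1.0)

def change_along_cap(delta_base, gain, fraction):
    """Voltage change at a point `fraction` of the way from base to
    collector, assuming the change varies linearly between the ends."""
    delta_collector = -gain * delta_base   # inverting amplifier
    return delta_base + fraction * (delta_collector - delta_base)

gain = 9
delta_base = 1e-3   # base rises by 1mV

pivot = fixed_point_fraction(gain)
print(f"pivot sits {pivot:.2f} of the way from the base")
print(f"change at the base end:      {change_along_cap(delta_base, gain, 0.0) * 1e3:+.1f} mV")
print(f"change at the pivot:         {change_along_cap(delta_base, gain, pivot) * 1e3:+.1f} mV")
print(f"change at the collector end: {change_along_cap(delta_base, gain, 1.0) * 1e3:+.1f} mV")
```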
Operationally, a capacitor implements the idea "if we send current in, the voltage rises". We measure the size of the capacitor in terms of the ratio between 'charge that went in' (current multiplied by time) and 'amount the voltage rose'. 'One microfarad' means 'one microamp of current changes the voltage by 1v per second.'
If we put a 1uF cap in the position of C.mu then send 1uA of current into it for a second, the voltage across C.mu will indeed change by 1v. 0.9v of that change will happen on the side that R.b can't see, though. As far as R.b is concerned, it sent 1uA of current into C.mu, and only saw C.mu's voltage rise by 0.1v. Plugging those numbers into the equation for capacitance gives us 10uF.
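Here's the same arithmetic as a short sketch, using the numbers from the example above (1uF, 1uA, one second, gain of 9). The variable and function names are mine:

```python
def apparent_capacitance(current_amps, seconds, delta_v_seen):
    """Capacitance implied by a measurement: charge delivered divided
    by the voltage change the observer actually saw."""
    return current_amps * seconds / delta_v_seen

gain = 9
c_mu = 1e-6       # 1uF in the C.mu position
current = 1e-6    # 1uA
duration = 1.0    # one second

# Total change across the capacitor, from V = Q / C
delta_v_total = current * duration / c_mu

# The base side only moves 1/(gain+1) of the total. The rest of the
# change happens on the collector side, where R.b can't see it.
delta_v_base = delta_v_total / (gain + 1)
delta_v_collector = delta_v_total - delta_v_base

c_seen = apparent_capacitance(current, duration, delta_v_base)

print(f"total change across C.mu: {delta_v_total:.1f} V")
print(f"visible from R.b:         {delta_v_base:.1f} V")
print(f"hidden on collector side: {delta_v_collector:.1f} V")
print(f"capacitance R.b measures: {c_seen * 1e6:.0f} uF")
```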
More generally, every test we can do on the R.b side of the circuit will tell us that C.mu behaves exactly like a 10uF capacitor connected between the transistor's base and GND.
That's the Miller effect.. as far as the input is concerned, you can replace a capacitor at C.mu with one (gain+1) times larger going to GND:

Doing that removes any capacitive path to the output, so we can't get capacitive bypassing that way.
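As a sketch, the substitution boils down to one multiplication. The 4pF, 8pF, and gain-of-9 values below are placeholders chosen for illustration, and the function names are hypothetical:

```python
def miller_capacitance(c_mu, gain):
    """Equivalent capacitor from base to GND that stands in for C.mu,
    as seen from the input side."""
    return c_mu * (gain + 1)

def total_input_capacitance(c_pi, c_mu, gain):
    """C.pi and the Miller equivalent both run from base to GND,
    so they sit in parallel and simply add."""
    return c_pi + miller_capacitance(c_mu, gain)

# Placeholder values for illustration only
c_mu = 4e-12   # farads
c_pi = 8e-12   # farads
gain = 9

c_m = miller_capacitance(c_mu, gain)
c_total = total_input_capacitance(c_pi, c_mu, gain)

print(f"C.M = {c_m * 1e12:.0f} pF")
print(f"total capacitance at the base = {c_total * 1e12:.0f} pF")
```

Note how quickly the gain term takes over: in this example the small capacitor ends up contributing most of the input capacitance.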
In theory, we could get capacitive bypassing through the new capacitor to GND (or the effective equivalent of it), but that would only happen for fast-moving signals on the transistor side of the input resistor (R.b). The fast-moving signal is on the other end of R.b though. Only the slow-moving parts make it through to the top of the capacitor, and the transistor only sees what happens at the top of the cap.
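To see how much of a signal makes it to the top of the capacitor, here's a minimal sketch of the standard single-pole RC low-pass response. The 10k resistor and 48pF of total input capacitance are assumed values for illustration, and the function name is mine:

```python
import math

def lowpass_gain(freq_hz, r_ohms, c_farads):
    """Fraction of the input amplitude that appears at the top of the
    capacitor in a single-pole RC low-pass filter."""
    omega_rc = 2.0 * math.pi * freq_hz * r_ohms * c_farads
    return 1.0 / math.sqrt(1.0 + omega_rc ** 2)

# Assumed values for illustration only
r_b = 10e3       # ohms
c_in = 48e-12    # farads, total capacitance from base to GND

f_corner = 1.0 / (2.0 * math.pi * r_b * c_in)

for label, f in [("1/100 of corner", f_corner / 100),
                 ("at the corner", f_corner),
                 ("100x the corner", f_corner * 100)]:
    print(f"{label:>16}: {lowpass_gain(f, r_b, c_in):.4f} of the input gets through")
```

Slow signals arrive at the base nearly intact, while fast ones are mostly dropped across R.b before the transistor ever sees them.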
Taking that into account, the model ends up looking like this:

where C.M is the Miller equivalent to C.mu and the diamond in the middle controls the current through R.c based on the current through R.pi. With all the other pieces in place, that's all that's left for the transistor to do.