I would say that they are inherently “somewhat” different, even though they are both used for similar purposes and obey the same mathematical axioms (those of a countably additive, non-negative measure on some space of “outcomes”, in which the measure of the set of all outcomes is 1).
Of course there are many probability measures in the mathematical sense which have nothing to do with any actual probabilities in the real world, so having the same mathematical structure does NOT preclude their being “completely different”.
But probabilities in the real world are supposed to have something to do with our expectations regarding what will happen when we perform an experiment which could in principle be repeated an unlimited number of times. And since any concept of probability is intended as a measure of how “likely” we expect physical events to be, if both kinds are useful they can’t really be all that different in what they lead us to expect.
The difference, as I see it, is as follows:
The frequentist probability of an “Event” (which is just the jargon for a subset of the space of all possible outcomes) is assumed to be the proportion of “all” experimental trials in which the outcome is consistent with the event. Despite its rather fuzzy definition (in terms of whatever we mean by “all” trials), this is something that is assumed to be an actual property of the experiment, albeit one that can never be determined by experimental tests (because any finite set of trials might fail to be representative). Frequentist statisticians often try to choose between different possible sets of assumptions by using each set of assumptions for computing the probability of what they have actually seen and choosing the assumptions that lead to higher probability, but they generally do so with the mindset that there is one set of assumptions that is really true.
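As a rough illustration of that "choose the assumptions that make what we saw most probable" procedure, here is a minimal sketch for a coin. The data (7 heads in 10 tosses), the three candidate biases, and the function name `binomial_likelihood` are all made up for the example; only `math.comb` comes from the Python standard library.

```python
from math import comb

def binomial_likelihood(p, heads, trials):
    """Probability of exactly `heads` heads in `trials` tosses if P(heads) = p."""
    return comb(trials, heads) * p**heads * (1 - p)**(trials - heads)

# Hypothetical observation: 7 heads in 10 tosses.
heads, trials = 7, 10

# Hypothetical competing assumptions about the coin's chance of heads.
candidates = [0.3, 0.5, 0.7]

# Probability of the observed data under each assumption.
likelihoods = {p: binomial_likelihood(p, heads, trials) for p in candidates}

# Pick the assumption under which the observed data was most probable.
best = max(likelihoods, key=likelihoods.get)

for p, value in likelihoods.items():
    print(f"assumed P(heads) = {p}: probability of the data = {value:.4f}")
print("preferred assumption:", best)
```

Note that nothing here assigns a probability to the assumptions themselves. Each candidate is treated as either true or false, and the data are only used to choose between them.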
Bayesian probability, on the other hand, is something that is always acknowledged to depend on the observer (through both initial choices and subsequently acquired knowledge). Given those choices and experience, the Bayesian probabilities are perfectly well known because the Bayesian approach provides an explicit rule for modifying one’s assumptions in the light of experiments. But because it depends on the observer’s assumptions and experience, the Bayesian probability is not a well-defined property of the experiment and outcome alone.
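The explicit rule in question is Bayes' rule: multiply each prior probability by the probability of the observed data under that hypothesis, then renormalise. The sketch below is only an illustration under assumed inputs. The three hypotheses, the two priors, and the helper name `bayes_update` are hypothetical, chosen to show two observers seeing the same toss and still ending up with different probabilities.

```python
def bayes_update(prior, likelihood_of_data):
    """Return the posterior: prior times likelihood, renormalised to sum to 1."""
    unnormalised = {h: prior[h] * likelihood_of_data[h] for h in prior}
    total = sum(unnormalised.values())
    return {h: weight / total for h, weight in unnormalised.items()}

# Hypothetical hypotheses: the coin's chance of heads is 0.3, 0.5 or 0.7.
# Two observers start from different priors (their "initial choices").
prior_a = {0.3: 1 / 3, 0.5: 1 / 3, 0.7: 1 / 3}
prior_b = {0.3: 0.8, 0.5: 0.1, 0.7: 0.1}

# Both observe the same single toss, which comes up heads.
# Under the hypothesis "chance of heads is p", that has probability p.
likelihood_heads = {p: p for p in prior_a}

posterior_a = bayes_update(prior_a, likelihood_heads)
posterior_b = bayes_update(prior_b, likelihood_heads)

print("observer A:", {h: round(v, 3) for h, v in posterior_a.items()})
print("observer B:", {h: round(v, 3) for h, v in posterior_b.items()})
```

Each observer's posterior is fully determined by their prior and the data, but the two posteriors differ, which is the sense in which the result is not a property of the experiment alone.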
The difference is subtle though, because one cannot do any frequentist analysis without making an assumption (usually about what constitutes a set of “equally likely” outcomes). But though one can make and change the assumptions, the frequentist attitude remains that they are assumptions about something that exists and are either true or false, while the Bayesian does not necessarily ever claim that there is one set that is really “true”.
However, there are theorems which tell us (roughly speaking) that if some frequentist assumption is in fact correct, then no matter how bad the initial assumptions of the Bayesian are (so long as they do not rule out the correct answer entirely), after a lot of experiments it is very probable (in the frequentist sense) that the Bayesian probabilities become close to the “correct” ones.
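A small simulation can make this convergence concrete, though it is only a sketch of one simple case and not a statement of the theorems themselves. It assumes a coin whose "correct" chance of heads is 0.3 and two hypothetical observers with Beta priors, one of them badly wrong. The names `TRUE_P`, `N_TOSSES` and `posterior_mean` are invented for the example, and the standard conjugate Beta update for coin tosses is used in place of anything more general.

```python
import random

def posterior_mean(alpha, beta, heads, tails):
    """Mean of the Beta(alpha + heads, beta + tails) posterior for the chance of heads."""
    return (alpha + heads) / (alpha + beta + heads + tails)

TRUE_P = 0.3            # hypothetical "correct" frequentist probability of heads
N_TOSSES = 10_000       # hypothetical number of experiments
rng = random.Random(0)  # fixed seed so the run is repeatable

# Simulate the tosses once; both observers see the same data.
heads = sum(rng.random() < TRUE_P for _ in range(N_TOSSES))
tails = N_TOSSES - heads

# Two hypothetical observers, each described by a Beta(alpha, beta) prior.
priors = {
    "open-minded": (1, 1),   # prior mean 0.5
    "badly wrong": (50, 2),  # prior mean about 0.96
}

results = {}
for name, (alpha, beta) in priors.items():
    results[name] = posterior_mean(alpha, beta, heads, tails)
    print(f"{name}: prior mean {alpha / (alpha + beta):.3f}, "
          f"posterior mean {results[name]:.3f}")
```

Both priors here give some weight to every possible bias, so neither observer has ruled out the true value in advance. With that many tosses the data swamp the prior and the two posterior means end up close to each other and to 0.3.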
So in the end both approaches end up giving the same advice about what one “should” expect (though neither gives anything I can understand as a non-circular definition of what that means!) and whether the difference in attitude is a “false dichotomy” is something I think we each have to decide for ourselves.