This sentence confused me: "For example, Sinh[ArcCosh[-2 + 0.001 I]] returns 11.214 + 2.89845 I but Sinh[ArcCosh[-2 + 0.001 I]] returns 11.214 - 2.89845 I," not least because the two input expressions are identical, but also because we started out by saying Sinh[ArcCosh[-2]] = -Sqrt[3], which is nowhere near 11.214 +/- 2.89845 I.
I think the author meant to say, "ArcCosh[-2 + 0.001 I] returns 1.31696 + 3.14102 I but ArcCosh[-2 - 0.001 I] returns 1.31696 - 3.14102 I," because we are talking about defining ArcCosh[] on the branch cut discontinuity, so there is no need to bring Sinh[] into it (and if we do, we find the limits are the same: the imaginary component goes to zero and Sinh[ArcCosh[-2 +/- t*I]] approaches -Sqrt[3] as t goes to zero from above or below). I am not sure what went wrong to get what they wrote.
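A quick numeric check of the corrected statement, sketched with Python's standard `cmath` module instead of Mathematica (an assumption on my part: `cmath.acosh` puts its branch cut on the real axis to the left of 1, the same place, but the printed digits may differ slightly from Mathematica's):

```python
import cmath
import math

above = cmath.acosh(-2 + 0.001j)   # just above the branch cut
below = cmath.acosh(-2 - 0.001j)   # just below the branch cut

# The imaginary parts sit near +pi and -pi: that jump is the discontinuity.
print(above)
print(below)

# Composing with sinh removes the jump: both sides land near -sqrt(3).
print(cmath.sinh(above), cmath.sinh(below), -math.sqrt(3))
```

So the discontinuity is in ArcCosh itself, while Sinh[ArcCosh[...]] is continuous across that part of the cut.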
Thanks. That was a mess. Don't know what happened, but I fixed it this morning.
This is a general pattern in CAS. For a more basic case, it’s not obvious sqrt(square(x)) will simplify to x without any further assumptions on x.
I think you would get sqrt(x^2) = x, if x belonged to the natural domain of sqrt, which is a Riemann surface, that may also be defined using the language of "sheaves". I don't know how to connect this to the article or Mathematica.
it's literally the prototypical example for `Assuming`
https://reference.wolfram.com/language/ref/Assuming.html
That's not what it simplifies to when x ranges over the real or complex numbers; it's abs(x). CASes need type-inference assumptions and/or type qualifiers to be more powerful.
Edit: Fixed stuff.
For x = -i, square(x) = -1, sqrt(square(x)) = i. Meanwhile, abs(x) = 1. You're right that it simplifies to abs(x) for real x, but that no longer holds for arbitrary complex values.
for arbitrary complex values sqrt() gives 2 answers with +- signs
so sqrt(square(-i)) = +-i, one of which is x
I've never seen a CAS that gives two answers for sqrt. Mathematica doesn't, sympy doesn't, and IIRC Maxima also doesn't.
The sqrt function returns the principal square root, not both. That’s true for all numbers, positive, negative, and complex alike.
It's abs(x) only over the reals, for complex numbers it's more complicated.
That abs(x) (or |x| as we wrote it) used to catch out so many of us in HS trig and algebra.
Right, that's why you need further assumptions on x in order for that simplification to hold.
It's not a simplification, it's wrong. Sqrt(square(x)) equals abs(x).
It also equals x with appropriate assumptions (x > 0).
Well, then sin(x) = x if x is infinitely small
so there's an unconditionally correct answer (it's also equal to abs(x) for x>0), and then there is an answer that is only correct for half the domain, which requires an additional assumption.
sqrt(square(i)) != abs(i)
So no, it’s not unconditionally correct either.
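For anyone who wants to poke at this, a minimal sketch with Python's `cmath` (not Mathematica; `sqrt_of_square` is just a name I made up) showing that the principal square root of x*x matches x for some inputs, abs(x) for others, and neither for others:

```python
import cmath

def sqrt_of_square(x):
    """Principal square root of x*x, as cmath computes it."""
    return cmath.sqrt(x * x)

# Columns: x, sqrt(x*x), abs(x)
for x in (2, -2, 1j, -1 - 1j):
    print(x, sqrt_of_square(x), abs(x))
```

With x = 2 all three agree; with x = -2 the result is abs(x) but not x; with x = i it is x but not abs(x); with x = -1 - i it is -x, which is neither.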
Not in general. As people have pointed out elsewhere, it's true if x is real. That isn't always a helpful assumption. (When x is real you can plug that assumption into Mathematica. Then Mathematica should agree with you.)
But consider sqrt(i) = sqrt(exp(i\pi/2)). That's exp(i\pi/4). Your rule would give 1 as the answer. It's not helpful for a serious math system to give that answer to this problem.
When I square 1 I don't get i.
I really wish Mathematica would open-source the heuristics behind these core functions (including common mathematical functions, Simplify, Integrate, etc.). The documentation is good, but it still lags behind the actual implementation. It would be much easier if we could peek inside the black box.
That blackbox being their entire moat, I would assume they'd never want to open-source any function. Mathematica as a front-end has innumerable frustrating bugs, but its CAS is top-notch. Especially combined with something like Rubi for integration, for me nothing comes close to Mathematica for algebraic computations.
The source of many functions is viewable. Use https://resources.wolframcloud.com/FunctionRepository/resour...
Many built-in functions are open source too. Use the "PrintDefinitions" ResourceFunction to see the code of functions that are implemented in Wolfram Language itself.
Source available? The license is still proprietary, right?
Yes, it is all proprietary, but there are still ways to inspect most of the WL-implemented functions since the system does not go to extreme pains to keep them hidden from introspection. It is not unlike Maple in that sense.
Yeah, hiding the recipes for how the math is really done would make the whole system kinda guess-and-hope for serious users.
For Simplify, I expect it's a black, or at least gray, box to Mathematica maintainers, too.
It will have simple rules such as constant folding, “replace x - x by zero”, “replace zero times something with the conditions under which ‘something’ has a value”, etc, lots of more complex but still easy to understand rules with conditionals such as “√x² = |x| if x is real”, and some weird logic that decides the order in which to try simplification rules.
There’s an analogy with compilers. In LLVM, most optimization passes are easy to understand, but if you look at the set of default optimization passes, there’s no clear reason for why it looks like it looks, other than “in our tests, that’s what performed best in a reasonable time frame”.
A lot of problems look like this. A while ago I was working on a calendar event optimization (think optimizing “every Monday from Jan 1, 2026 to March 10, 2026” + “every Monday from March 15, 2026 to March 31, 2026” to simply “every Monday from Jan 1, 2026 to March 31, 2026”). I wrote a number of intuitive and simple optimization passes as well as some unit tests. To my horror, some passes need to be repeated twice in different parts of the pipeline to get the tests to pass.
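To make the example concrete, here is a toy version of one such pass in Python. Everything in it is hypothetical (the `(weekday, start, end)` rule shape, `occurrences`, `merge_rules`); it is a sketch of the idea, not the pipeline described above. Two rules for the same weekday are merged when no occurrence of that weekday falls in the gap between them:

```python
from datetime import date, timedelta

def occurrences(weekday, start, end):
    """All dates in [start, end] that fall on `weekday` (Monday == 0)."""
    d = start + timedelta(days=(weekday - start.weekday()) % 7)
    found = []
    while d <= end:
        found.append(d)
        d += timedelta(days=7)
    return found

def merge_rules(rules):
    """Merge same-weekday rules whose gap contains no occurrence of that weekday."""
    merged = []
    for weekday, start, end in sorted(rules):
        if merged:
            prev_weekday, prev_start, prev_end = merged[-1]
            gap_start = prev_end + timedelta(days=1)
            gap_end = start - timedelta(days=1)
            if prev_weekday == weekday and not occurrences(weekday, gap_start, gap_end):
                merged[-1] = (weekday, prev_start, max(prev_end, end))
                continue
        merged.append((weekday, start, end))
    return merged

jan_to_mar = (0, date(2026, 1, 1), date(2026, 3, 10))
late_mar = (0, date(2026, 3, 15), date(2026, 3, 31))
print(merge_rules([jan_to_mar, late_mar]))
```

March 11 to 14, 2026 contains no Monday, so the two rules collapse into one. A single pass like this is easy to reason about; the ordering trouble starts once several such passes feed each other.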
You can simplify most of those problems by writing the constraints down in conjunctive/disjunctive normal forms and applying standard simplification on them, like you're back in school. That also eliminates things like repetition, since doing so also makes the problem declarative. If you need the recursive loops, you're guaranteed to be able to stratify them for any reasonable problem. If you wanted, you could solve the problem optimally from this point by finding the prime implicants, or just accept a suboptimal solution that runs faster than you have any reason to care about, like Datalog and SQL do.
That doesn't work in general for Mathematica because it's too powerful.
Even for boolean logic problems, a minimum-size CNF or DNF will not necessarily be the cheapest solution in terms of gates. As far as I know, hardly anyone has even attempted automatic minimization in terms of general binary operators.
As a term-rewriting system the rule x-x=0 presumably won’t be in Simplify, it’ll be inside - (or Plus, actually). Instead I’d expect there to be strategies. Pick a strategy using a heuristic, push evaluation as far as it’ll go, pick a strategy, etc. But a lot of the work will be normal evaluation, not Simplify-specific.
> For example, Sinh[ArcCosh[2]] returns −√3 but √(2² − 1) = √3. The expression Mathematica returns for Sinh[ArcCosh[x]] correctly evaluates to −√3
but the expression given is sqrt((x-1)/(x+1))(x+1), which for x=2 would be sqrt(1/3)*3 = sqrt(3)
did you mean Sinh[ArcCosh[-2]]?
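A numeric comparison at x = 2 and x = -2, sketched in Python with `cmath` rather than Mathematica (the function names are mine; `returned_form` is the expression quoted above):

```python
import cmath

def composed(x):
    """sinh(arccosh(x)) evaluated directly on the principal branch."""
    return cmath.sinh(cmath.acosh(x))

def naive(x):
    """The tempting simplification sqrt(x^2 - 1)."""
    return cmath.sqrt(x * x - 1)

def returned_form(x):
    """sqrt((x - 1)/(x + 1)) * (x + 1), the form quoted in this comment."""
    return cmath.sqrt((x - 1) / (x + 1)) * (x + 1)

for x in (2, -2):
    print(x, composed(x), naive(x), returned_form(x))
```

At x = 2 all three are about sqrt(3). At x = -2 the direct evaluation and the returned form are about -sqrt(3) (up to rounding noise in the imaginary part), while sqrt(x^2 - 1) is still +sqrt(3), which is consistent with -2 being the intended input.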
More generally it's not at all clear what 'simplify' means.
Is x*x simpler than x^2? Probably? Is sqrt(5)^3 simpler than 5^(3/2)? I don't know.
It entirely depends on what you're going to be doing with the expression later.
While some comments do point out the general opaqueness of Mathematica, the goal of Simplify is actually documented in Mathematica and something which can be changed: https://reference.wolfram.com/language/ref/ComplexityFunctio...
The default is a balance between leaf count and number of digits. But the documentation page above gives an example of how to nudge the cost function away from specialised functions.
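As a rough illustration of how a cost function steers the result, here is a toy in Python. It is not Mathematica's implementation: the nested-tuple expression format, `SPECIAL_HEADS`, the penalty of 10 and all function names are assumptions made up for this sketch, and real LeafCount values will differ because Mathematica represents Sqrt and division differently.

```python
SPECIAL_HEADS = {"Sinh", "Cosh", "ArcSinh", "ArcCosh"}  # hypothetical choice

def leaf_count(expr):
    """Count every head and every atom in a nested-tuple expression."""
    if isinstance(expr, tuple):
        head, *args = expr
        return 1 + sum(leaf_count(a) for a in args)
    return 1

def penalized(expr, penalty=10):
    """leaf_count plus an extra cost for each special-function head."""
    if isinstance(expr, tuple):
        head, *args = expr
        extra = penalty if head in SPECIAL_HEADS else 0
        return 1 + extra + sum(penalized(a, penalty) for a in args)
    return 1

def simplest(candidates, cost):
    """Pick the candidate the cost function likes best."""
    return min(candidates, key=cost)

hyperbolic = ("Sinh", ("ArcCosh", "x"))
algebraic = ("Times",
             ("Sqrt", ("Times", ("Plus", -1, "x"), ("Power", ("Plus", 1, "x"), -1))),
             ("Plus", 1, "x"))

print(simplest([hyperbolic, algebraic], leaf_count))
print(simplest([hyperbolic, algebraic], penalized))
```

Under plain leaf counting the two-function form wins (3 against 14); once special functions are penalized the algebraic form wins (14 against 23). Which one counts as "simpler" is entirely down to the cost function.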
I think "simplify" is pretty clear here. For trigonometric functions you would expect a trig function and an inverse trig function to be simplified. We all know what we'd expect if we saw sin(arcsin(x)) (ie x). If we saw cos(arcsin(x)) I'll spoil it for you: it simplifies to sqrt(1-x^2).
Hyperbolic functions aren't used as much but the same principle applies. Here the core identity is cosh^2(x) - sinh^2(x) = 1 so:
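Written out, and assuming real x >= 1 only (my assumption; outside that range the sign of the square root is exactly what the article is about), the step would presumably go:

```latex
\begin{align*}
\cosh^2 t - \sinh^2 t &= 1 \\
\sinh^2 t &= \cosh^2 t - 1 \\
\sinh(\operatorname{arccosh} x) &= \sqrt{x^2 - 1}
  \qquad \text{with } t = \operatorname{arccosh} x \ge 0,\ x \ge 1
\end{align*}
```

Taking the positive root in the last line is justified because t >= 0 makes sinh t >= 0.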
You should absolutely expect that from "simplify".
OK, something weird is going on with HN here.
The first time I looked at the comment above, there was a reply, a reply to that reply, and a reply to the reply to the reply.
Later I came back and this time there were no replies. Since HN won't let you delete a comment that has a reply the only ways a comment chain should be able to go away are (1) the participants delete them in reverse order, or (2) a moderator intervenes.
I came back again and the comments are back!
I wonder if this is related to another comment problem I've seen many times in the past few weeks? I'll be using the "next" or "prev" links on top level comments to move through the comments and will come to a point where that breaks. Next reaches a comment that it will not go past. Coming from below, prev will also not go past that point. Examining the links, next and prev are pointing to a nonexistent comment.
How is going from two functions with one variable to three functions with a variable and a constant a simplification?
If you can't recognize how much simpler the simplified version is, I'm not sure exactly what to tell you. But let's think about it in terms of assembly steps:
1. Multiply the input by itself
2. Subtract 1
3. Take the square root. There is often a fast square root function available.
The above is a fairly simple sequence of SIMD instructions. You can even do it without SIMD if you want.
Compare this to sinh being (e^x - e^-x) / 2 (you can reduce this to one exponentiation in terms of e^2x but I digress) and arccosh being ln(x + sqrt(x^2 - 1)) and you have an exponentiation, subtraction, division, logarithm, addition, square root and a subtraction. Computers generally implement e^x and logarithm using numerical approximations (e.g. a Taylor series expansion).
This is sometimes helpful. But more often it has very little overlap with what I need when I "simplify" some math.
"Simplify" is a very old term (>50y) in computer algebra. Its meaning has become kind of layered in that time.
If simplify means make it fast for a computer to run we might as well make division illegal.
In this case, a heuristic like "fewer parameters, fewer operators and fewer function calls" covers all the cases.
It doesn't, because we might consider different outputs "simple" depending on what we're going to do next.
This is what differentiates (pun intended) between Complex Algebra and Complex Analysis: complex functions in analysis are multivalued (or path dependent in some schools). Even a simple concept of value of F at complex point x becomes a topic of several lectures.
I’m an algebraist at heart and by training, but I remember beautiful many-layered surfaces of ordinary complex functions in books and on blackboards.
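To make "path dependent" concrete, here is a small Python sketch (my own illustration, not from the article; `continue_sqrt` is a made-up helper) that follows the square root continuously once around the unit circle. At every step it keeps whichever of the two roots is closer to the previous value:

```python
import cmath
import math

def continue_sqrt(path):
    """Follow sqrt continuously along `path`, starting on the principal branch."""
    w = cmath.sqrt(path[0])
    for z in path[1:]:
        r = cmath.sqrt(z)
        # Of the two roots r and -r, keep the one nearer the previous value.
        w = r if abs(r - w) <= abs(-r - w) else -r
    return w

# One full counterclockwise loop around 0, from 1 back to 1, in 1-degree steps.
loop = [cmath.exp(2j * math.pi * k / 360) for k in range(361)]

print(continue_sqrt(loop))    # value reached by continuation along the loop
print(cmath.sqrt(loop[-1]))   # principal value at the same endpoint
```

The endpoint is the same number (1, up to rounding), but continuation along the loop arrives at about -1 while the principal branch says about +1. That sign flip is the second sheet of the surface.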
I've only an A-Level in Further Maths from 1997, but understand complex numbers and have come across complex inverse trig functions before.
My takeaway for other people like me from this is "computer is correct" because the proof shows that we can't define arccosh using a single proof across the entire complex plane (specifically imaginary, including infinity).
The representation of this means we have both complex functions that are defined as having coverage of infinity, and arccosh, that a proof exists in only one direction at a time during evaluation.
This distinction is a quirk in mathematics but means that the equation won't be simplified because although it looks like it can, the underlying proof is "one sided" (-ve or +ve) which means the variables are fundamentally not the same at evaluation time unless 2 approaches to the range definition are combined.
The QED is that this distinction won't be shown in the result's representation, leading to the confusion that it should have been simplified.
Simple rule to keep in mind that even math savvy people seem to forget about is that: sqrt(x²) = |x| with bars for absolute value.
For a programmer, it's clear that we have lost the sign information but not the magnitude.
Simple. Makes most sign and solution reasoning explicit instead of implicit when solving quadratics or otherwise working with square roots.
> Simple rule to keep in mind that even math savvy people seem to forget about is that: sqrt(x²) = |x| with bars for absolute value.
i would disagree with that (pun intended).
And yet it incorrectly simplifies f(x) = x/x to f(x) = 1
I believe this is correct: x/x = 1 everywhere except 0, where it has a removable singularity. So you can extend x/x holomorphically to full C.
This is completely different from the phenomenon described in the article: the arccosh discontinuity can’t be dealt with the same way. In fact complex analysis prefers to deal with it by making functions path-dependent (multi-valued).
PLEASE explain "So you can extend x/x holomorphically to full C" to someone with only a BSc in math/cs; something about this thread is giving me an existential crisis right now.
- function extension is defining a function where it is not defined
- <Adj> function extension is an extension that keeps (or gives) Adj property
- the extended function is usually treated as the original if the extension is good enough. Real analysis starts with defining real numbers and extending familiar functions onto them
- in this particular case we do not need C - even continuous extension on R works and agrees with x/x = 1 at 0
- holomorphic (analytic) extension makes function infinitely differentiable at every point of C
- because of the nature of discontinuity you can’t extend the simple arccosh in any reasonable way on C without introducing multivalued or path-dependent functions
- this continuity makes x/x=1 a reasonable simplification for CAS imo but not for complex functions as in the OP
- many things with point singularities in R have more structure in C, but x/x is not one of them. Even 1/x is of a different nature.
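A tiny Python sketch of the real-line part of that list, under the obvious caveat that floating-point sampling illustrates continuity and proves nothing (`f` and `g` are names I picked):

```python
def f(x):
    """x/x as written: undefined at 0 (Python raises ZeroDivisionError there)."""
    return x / x

def g(x):
    """The extension: equal to f wherever f is defined, with g(0) = 1 filled in."""
    return 1.0

# Sample f on both sides of 0, closer and closer.
samples = [10.0 ** -k for k in range(1, 16)]
near_zero = [f(s) for s in samples] + [f(-s) for s in samples]
print(set(near_zero))   # the only value that ever shows up

try:
    f(0.0)
except ZeroDivisionError:
    print("f(0) is undefined, g(0) =", g(0.0))
```

Every sample of f is exactly 1, so setting g(0) = 1 is the only choice that keeps the function continuous. For 1/x no such choice exists, which is the "different nature" mentioned in the last bullet.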
“You do not divide by zero” that forces you to carry x != 0 is more of a high-school construct than a real thing. Physicists ignore even more important stuff, and in the end their formulas work “just fine”.
As for existential crisis, you probably have missed this one: https://news.ycombinator.com/item?id=46962402
It was really fun
Does anyone else think that the latest LLMs - some of which can be used locally for free - combined with proof-verifying software like Coq or Lean for mistake-detection, might make many uses of Computer Algebra Systems like Mathematica obsolete?
Certainly, people don't need Wolfram Alpha as much.
On another point, it sucks to know what this means for Algebraic Geometry (the computational variant), which you could partly motivate, until now, for its use in constructing CASes.
For me Mathematica is much more akin to numpy+sympy+matplotlib+... with absolutely crazy amount of batteries included in a single coherent package with IDE and fantastic documentation. In a way numpy ecosystem already "won" industry users over, yet Wolfram stack is still appealing to me personally for small experiments.
Coq/Lean target very different use cases.
Do you want your LLM to generate one hundred lines of code to do things using open source libraries, or five lines in Mathematica?
This is actually subjective. For the vibe coding folks, they don’t care if the code is long winded and verbose. For others, the conciseness is part of the point; see APL and Notation as a Tool of Thought.