- Two Facebook researchers based in Paris have built a new neural network for Facebook capable of solving complex mathematical equations, even those dealing with calculus.
- Their work is outlined in a December 2 paper made public on arXiv, a repository of scientific research run by Cornell University.
- It represents a major leap for neural networks, which traditionally are great for use in pattern recognition, but could only complete rudimentary arithmetic until now.

If today's college students could find a way to get their hands on a copy of Facebook's latest neural network, they could cheat all the way through Calc 3. They could even solve the differential equation pictured above in under 30 seconds.

Okay, so maybe this isn't going to be a replacement for Wolfram Alpha anytime soon, but Facebook really did build a neural net that can complete complex mathematical problems for the first time, rather than the plain old arithmetic in which these AI models usually wheel and deal. The work represents a huge leap forward in computers' abilities to understand mathematical logic.

The research is outlined in a new paper, "Deep Learning for Symbolic Mathematics," published on arXiv, a repository of scientific research in areas like math, computer science, and physics, run by Cornell University. Two Facebook scientists in Paris, Guillaume Lample and François Charton, led the charge.

In their abstract, the scientists note that neural networks don't exactly have a whiz-kid reputation when it comes to performing calculations or working with symbolic data—basically numbers you can't add, multiply, or conduct operations on in any other way.

"All the data used by computers are numbers," Lample and Charton tell *Popular Mechanics*. "Most of the time, they represent quantities, such as the intensity of a color in an image, or the amount of sales of a product. But sometimes the numbers are used as symbols to represent objects or classes. For instance, the age group of an individual might be represented by a number."

"This is what complicates the task of neural networks on symbolic data," the scientists say. "They need to learn both the data, and symbolic rules."

Through a unique approach to breaking down mathematical logic for computers, Lample and Charton have tackled this problem, allowing their neural network to process and solve calculus problems in about one second. They say in their paper that their neural net can outperform commercial computer algebra packages like Matlab or Wolfram Mathematica, which industry professionals commonly use to crunch numbers.

## Why Differential Equations?

Just in case it's been a little while since you've taken calculus, here's a refresher: A differential equation relates an unknown function to one or more of its derivatives, and solving it means finding that function. These equations can be used to calculate the rate of change of a rapidly increasing rabbit population, for example, or perhaps the rate of decline in demand for a particular brand of running shoes.
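To make the rabbit example concrete, here is a standard textbook case (an illustration chosen for this refresher, not an equation from the paper). It assumes the population $P(t)$ grows at a rate proportional to its current size, with an assumed growth constant $k$:

$$
\frac{dP}{dt} = kP \quad\Longrightarrow\quad P(t) = P(0)\,e^{kt}
$$

Differentiating the right-hand solution gives $k\,P(0)\,e^{kt} = kP$, which is how you check that it satisfies the equation.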

An integral equation deals with some unknown function that lies beneath an integral sign. If you've ever seen *Mean Girls* and remember when Lindsay Lohan yelled out that "the limit does not exist!" during a Mathletes competition, you're on the right path.

Lample and Charton say they decided to focus on differential equations and integrals for three main reasons:

1) They're complex tasks, often taught at universities, that are also hard for humans to solve, let alone machines.

2) These kinds of problems involve symbol manipulation, like what's performed in language. "So we thought that this was a good target for the Natural Language Processing models we use," they say.

3) Lastly, these problems made it easy for the two researchers to produce a large set of problems and solutions to train their model on and to ensure that the solutions were correct.
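The third point rests on a handy asymmetry: checking a calculus answer is much easier than finding it, because you can differentiate the proposed answer and compare. The sketch below illustrates that idea numerically in plain Python. It is a stand-in for illustration only, not the researchers' verification procedure, and the helper names `derivative` and `is_antiderivative` are hypothetical.

```python
import math

def derivative(f, x, h=1e-5):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def is_antiderivative(F, f, points, tol=1e-6):
    """Return True if F'(x) matches f(x) at every sample point."""
    return all(abs(derivative(F, x) - f(x)) <= tol for x in points)

# Problem: integrate f(x) = x * cos(x).
f = lambda x: x * math.cos(x)

# Correct candidate: F(x) = x * sin(x) + cos(x).
F = lambda x: x * math.sin(x) + math.cos(x)

# Incorrect candidate: its derivative is off by sin(x).
wrong = lambda x: x * math.sin(x)

samples = [0.1 * i for i in range(1, 30)]
print(is_antiderivative(F, f, samples))
print(is_antiderivative(wrong, f, samples))
```

A numerical check like this can only sample a few points, so it can miss errors that a symbolic check would catch; it is meant to show why correctness is cheap to test, nothing more.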

## The Problem with Neural Networks

Neural nets take a biological approach to computation: they borrow concepts from the way the human brain solves problems. Our minds build up connections between parts of the brain as we learn new associations, or patterns. When you see a cat, for example, you'll notice it has fur, two eyes, and four legs. On further inspection, you'll note that it's a small animal. Your brain makes the connection that this is a cat, not a dog. The whole time, tiny brain cells called neurons are shooting electrical signals to one another, building up the connections between them. This is how we learn to recognize patterns.

Similarly, neural networks rely on layers and layers of artificial "neurons" that mirror the ones in our own brain, except that these so-called neurons perform basic calculations. With enough of these working together, the entire network has the power to solve more complex problems, even though individual layers of the neural network may only be equipped to complete one kind of calculation.
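For a sense of how basic those calculations are, here is a minimal sketch of one artificial neuron and a tiny two-layer network in plain Python. The weights, biases, and sigmoid activation are arbitrary assumptions picked for illustration; this is not the architecture used in the paper.

```python
import math

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs, squashed by a sigmoid into (0, 1)."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-total))

def layer(inputs, weight_rows, biases):
    """A layer is a list of neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

# Two hidden neurons feed a single output neuron.
hidden = layer([1.0, 0.0], [[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0])
output = neuron(hidden, [1.0, -1.0], 0.0)
print(hidden, output)
```

Training a network means adjusting those weights and biases until the outputs match known examples, which is why the amount and quality of training data matters so much.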

With this in mind, neural networks are fantastic for use in image recognition (like the tiny square on Facebook that identifies friends' faces for tagging), beating humans at strategy games like chess or Go, or even helping autonomous vehicles identify potential road hazards or predict the behavior of nearby obstacles.

What they're not known for, however, is their ability to complete complex mathematical equations, like calculus problems. That's because the way we know and write down expressions goes right over machines' theoretical heads.

## Breaking Apart Expressions

Neural networks have difficulty solving differential equations because expressions rely on shorthand that makes sense to humans but becomes cumbersome for computers. For example, we write the expression *x*^3, but what we really mean by that is x multiplied by x multiplied by x. So even something as simple as an exponent breaks down into smaller, simpler operations. This logic is apparent to humans, but it must be taught to machines.

The same is true in differential and integral calculus problems, which also use shorthand for simpler equations contained inside an expression. These problems do have patterns that can be picked up by computer systems, but it just so happens that it hasn't been done reliably until now.

Lample and Charton's new method involves breaking down complicated expressions into their central parts. Next, they teach the neural net to find patterns of mathematical logic that are equivalent to integration and differentiation, allowing the software to complete the problem in a uniquely machine way. Then, they let the neural net loose on novel expressions that it has not been trained on, comparing results with other software like Wolfram Mathematica and Matlab.

To do this, the duo unpacked the equations into smaller parts through tree-like structures. The leaves of the trees are numbers, constants, or variables, while the nodes are operator symbols like addition, multiplication, differentiation-with-respect-to, and so on.

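For example, the expression 2 + 3 × (5 + 2) can be represented as a tree whose root is +, with the leaf 2 on one branch and a × node on the other; that × node has the leaf 3 and a second + node, with leaves 5 and 2, as its children.

And the expression 3x^2 + cos(2x) - 1 is broken down into a tree whose root is +, joining a × node (3 times x raised to the power 2) and a - node (the cosine of 2 × x, minus 1).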
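The tree idea can be sketched in a few lines of Python. The nested-tuple encoding, the operator names, and the helpers `to_prefix` and `evaluate` below are assumptions made for this illustration, not the researchers' code. The sketch builds the tree for 3x^2 + cos(2x) - 1, flattens it into a token sequence with each operator written before its operands (one common way to hand a tree to a sequence-based language model), and evaluates it.

```python
import math

# A node is either a leaf (a number or a variable name) or a tuple
# of the form (operator, child, ...).
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "pow": lambda a, b: a ** b,
    "cos": math.cos,
}

# 3x^2 + cos(2x) - 1
expr = ("+",
        ("*", 3, ("pow", "x", 2)),
        ("-", ("cos", ("*", 2, "x")), 1))

def to_prefix(node):
    """Flatten a tree into a list of tokens, operator first."""
    if not isinstance(node, tuple):
        return [str(node)]
    op, *children = node
    tokens = [op]
    for child in children:
        tokens.extend(to_prefix(child))
    return tokens

def evaluate(node, env):
    """Compute the value of a tree given values for its variables."""
    if isinstance(node, tuple):
        op, *children = node
        return OPS[op](*(evaluate(c, env) for c in children))
    if isinstance(node, str):
        return env[node]
    return node

print(" ".join(to_prefix(expr)))   # + * 3 pow x 2 - cos * 2 x 1
print(evaluate(expr, {"x": 0.0}))  # 0.0
```

Because every operator takes a fixed number of operands, the flat token sequence needs no parentheses and can be turned back into exactly one tree.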

## Training the Neural Net

Next, Lample and Charton had to figure out a way to train their neural net, which must consume a huge amount of data to establish rich enough connections between its "neurons." These relationships, if built correctly, allow the neural net to "think" through a differential equation.

The two randomly generated a dataset of differential and integral problems paired with their solutions. They focused on first- and second-order equations, and kept the expressions from growing too large. After the neural net crunched this data, it learned how to compute derivatives and integrals for given mathematical expressions, like the one at the top of this story.
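As a rough sketch of what random generation with a size limit could look like, the Python below builds random expression trees up to a maximum depth, then differentiates each one with the usual calculus rules. Pairing each derivative (the problem) with the function it came from (the solution) yields integration examples whose answers are correct by construction. The operator set, probabilities, depth limit, and function names are all assumptions for illustration; the researchers' actual generator is described in their paper.

```python
import math
import random

BINARY = ["+", "-", "*"]
UNARY = {"sin": math.sin, "cos": math.cos}
LEAVES = ["x", 1, 2, 3]

def random_expr(rng, max_depth):
    """Build a random expression tree no deeper than max_depth."""
    if max_depth == 0 or rng.random() < 0.3:
        return rng.choice(LEAVES)
    if rng.random() < 0.5:
        return (rng.choice(sorted(UNARY)), random_expr(rng, max_depth - 1))
    return (rng.choice(BINARY),
            random_expr(rng, max_depth - 1),
            random_expr(rng, max_depth - 1))

def depth(node):
    """Number of operator levels in a tree (0 for a bare leaf)."""
    if not isinstance(node, tuple):
        return 0
    return 1 + max(depth(child) for child in node[1:])

def diff(node):
    """Differentiate a tree with respect to x (result is not simplified)."""
    if node == "x":
        return 1
    if not isinstance(node, tuple):
        return 0
    op, *args = node
    if op in ("+", "-"):
        return (op, diff(args[0]), diff(args[1]))
    if op == "*":  # product rule
        a, b = args
        return ("+", ("*", diff(a), b), ("*", a, diff(b)))
    if op == "sin":  # chain rule
        return ("*", ("cos", args[0]), diff(args[0]))
    if op == "cos":
        return ("*", ("-", 0, ("sin", args[0])), diff(args[0]))
    raise ValueError(f"unknown operator: {op}")

def evaluate(node, x):
    """Compute the numeric value of a tree at a given x."""
    if node == "x":
        return x
    if not isinstance(node, tuple):
        return node
    op, *args = node
    vals = [evaluate(a, x) for a in args]
    if op == "+":
        return vals[0] + vals[1]
    if op == "-":
        return vals[0] - vals[1]
    if op == "*":
        return vals[0] * vals[1]
    return UNARY[op](vals[0])

rng = random.Random(0)
solutions = [random_expr(rng, max_depth=4) for _ in range(1000)]
# Each pair is (integration problem, known solution).
dataset = [(diff(F), F) for F in solutions]
print(dataset[0])
```

A real pipeline would also simplify the generated expressions and discard duplicates; this sketch skips both steps.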

To complete the whole process, Lample and Charton put the neural net through new tests by feeding it 5,000 new expressions that it had never before seen.

The results were impressive.

"On all tasks, we observe that our model significantly outperforms Mathematica," they write in the paper. "On function integration, our model obtains close to 100 percent accuracy, while Mathematica barely reaches 85 percent."

In many of the tasks, Matlab and Mathematica found no solution to their problems in the 30 seconds allotted. The Facebook neural net, though, only takes about a second to figure out solutions. Our example up top is one of those.

## Okay, Now What?

Nowhere in the paper do Lample or Charton give us any hints at what Facebook plans to do with this neural net, unfortunately. Based on their continual mentions of natural language processing in an interview with *Popular Mechanics*, though, it's possible Facebook is working on a better language processing technique that could be used for a myriad of things. Or maybe the social media giant just wants to help you with your homework? Who knows.

Lample and Charton did tell *Popular Mechanics* that although their model is only a proof of concept, it shows that neural networks can handle symbolic mathematics, and that practical applications will come in due time.

"Differential equations are very common in science, notably in physics, chemistry, biology and engineering, so there is a lot of possible applications," they say.

And finally, in case you wanted to check your math, here is the final answer you should have gotten when solving for y in that very first example:

Don't feel bad if you got it wrong or it took you way longer than the 30 seconds it took the Facebook neural network. You're only human after all.